The wondrous evolution of the archive continues...
Having a massive wall of 88x31 buttons will always be hip and cool, but something I've learned is people actually do use the archive to find sites to read and follow. A single 88x31 button does not convey a lot of information, so I wanted to figure out ways to helpfully categorize these links with the caveat I must be able to automate this process. Any web task that requires more than one command line argument will surely wither and die.
This archive primarily functions as a historical record, so I will continue to archive buttons that are either outdated or point to sites that no longer exist, but in the case of dead sites I will note they are deceased and omit the link for now.
First I chose to break up the buttons alphabetically by username. The template takes all the images in a folder and organizes them into a dictionary with the format { subdomain: [list of images]}. It may make the archive easier to navigate if you want to confirm your button is there.
import string
alphabet = list(string.ascii_lowercase)
{% for letter in alphabet %}
{{letter.upper()}}
{% for citizen, button_list in buttons.items() %}
{% if citizen.startswith(letter) %}
{% for button in button_list %}
<a href="//{{citizen}}.neocities.org"><img src="/images/buttons/neocitizens/{{button}}" alt="{{citizen}}"/></a>
{% endfor %}
{% endif %}
{% endfor %}
{% endfor %}
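For context, the `buttons` dictionary the template loops over might be built with something like the sketch below. The filename convention (`username.png`, `username-2.gif` for extra buttons) and the helper names `group_buttons` and `buttons_from_folder` are assumptions for illustration; the real archive may name its files differently.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumption: files are named "<subdomain>.<ext>" or "<subdomain>-<n>.<ext>".
# Limitation: a subdomain that itself ends in "-<digits>" would be misread.
BUTTON_NAME = re.compile(
    r"^(?P<subdomain>.+?)(?:-\d+)?\.(?:png|gif|jpe?g)$", re.IGNORECASE
)


def group_buttons(filenames):
    """Group image filenames into {subdomain: [filenames]}."""
    buttons = defaultdict(list)
    for name in sorted(filenames):
        match = BUTTON_NAME.match(name)
        if match:  # silently skip anything that is not a button image
            buttons[match.group("subdomain").lower()].append(name)
    return dict(buttons)


def buttons_from_folder(folder):
    """Read a folder and group the image files found directly inside it."""
    return group_buttons(p.name for p in Path(folder).iterdir() if p.is_file())
```

The resulting dictionary can be passed straight to the template as `buttons`.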
Next, I wanted to identify dead sites, so we're gonna pull site info using python-neocities. Something like:
for subdomain in buttons:
    url = "https://{0}.neocities.org".format(subdomain)
    try:
        site_data = nc.info(subdomain)
    except neocities.neocities.NeoCities.InvalidRequestError as e:
        citizen_data[subdomain]["tags"] = []
        citizen_data[subdomain]["status"] = False
    else:
        citizen_data[subdomain]["tags"] = site_data["info"]["tags"]
        citizen_data[subdomain]["status"] = True
All this is stored in a JSON file I maintain. JSTool is a handy JSON plugin for Notepad++ that makes it easy to review and manage. Each time I generate a list, I iterate over it and add entries for any new buttons.
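The merge step could look roughly like this. The lookup loop earlier assigns into `citizen_data[subdomain]`, so an entry has to exist for every subdomain before it runs. The function names and the use of `None` for "status not checked yet" are assumptions; the `buttons`, `tags`, and `status` keys mirror the snippets in this post.

```python
import json


def merge_new_buttons(citizen_data, buttons):
    """Add entries for unseen subdomains and refresh every button list."""
    for subdomain, button_list in buttons.items():
        # Existing entries keep their tags and status; new ones start empty.
        entry = citizen_data.setdefault(subdomain, {"tags": [], "status": None})
        entry["buttons"] = sorted(button_list)
    return citizen_data


def load_json(path):
    """Load the archive file, or start fresh if it does not exist yet."""
    try:
        with open(path, encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        return {}


def save_json(path, data):
    """Write the archive back out in a stable, diff-friendly layout."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, indent=2, sort_keys=True)
```

Sorting keys on save keeps the file readable when reviewing it by hand.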
python-neocities returns us a nifty little dictionary of site data and I'll use 1kb as the example since it's the first site on my list:
{'result': 'success', 'info': {'sitename': '1kb', 'views': 26744, 'hits': 75727, 'created_at': 'Thu, 18 Jun 2020 06:05:24 -0000', 'last_updated': 'Tue, 02 Nov 2021 04:26:38 -0000', 'domain': None, 'tags': ['blog', 'graphics', 'resources', 'personal'], 'latest_ipfs_hash': None}}
The problem with user-submitted data is consistency. Some users list tags, some don't. Some tags are more useful than others. If I want to do more (say, include the site title), I'll need to use a library like beautifulsoup, but again, consistency will always be an issue. (How many users use a title like "home" for their root index? It's just as many as you think!) I ended up giving each button a title attribute with the username and tags, if available. Dead sites have little skulls. I ended up with a template block like this:
{% for letter in alphanumeric %}
<div id="{{letter}}"><h3>{{letter[0].upper()}}</h3>
{% for citizen in citizen_data %}
{% if citizen|first in letter %}
{% if citizen_data[citizen]['buttons']|length > 1 %}
<div class="citizen-multi">
{% else %}
<div class="citizen">
{% endif %}
{% for button in citizen_data[citizen]['buttons'] %}
{% if citizen_data[citizen]['status'] %}
{% if citizen_data[citizen]['tags'] %}
<a href="//{{citizen}}.neocities.org"><img src="/images/buttons/neocitizens/{{button}}" alt="{{citizen}}" title="{{citizen}}: {{', '.join(citizen_data[citizen]['tags'])}}"/></a>
{% else %}
<a href="//{{citizen}}.neocities.org"><img src="/images/buttons/neocitizens/{{button}}" alt="{{citizen}}" title="{{citizen}}"/></a>
{% endif %}
{% else %}
<img src="/images/buttons/neocitizens/{{button}}" alt="{{citizen}}" title=" 💀 Rest in Peace, {{citizen}} 💀"/>
{% endif %}
{% endfor %}
</div>
{% endif %}
{% endfor %}
</div>
{% endfor %}
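The template above expects an `alphanumeric` list that isn't shown. Since each entry is tested with `citizen|first in letter` and headed with `letter[0]`, one guess is that all the digits are lumped into a single group. The sketch below builds such a list and mirrors the tooltip logic in plain Python, which is handy for checking the output without rendering the page. Both names are illustrative, not the archive's actual code.

```python
import string

# Assumption: usernames starting with a digit share one section headed "0".
alphanumeric = [string.digits] + list(string.ascii_lowercase)


def button_title(citizen, entry):
    """Return the title text the template would give a citizen's buttons."""
    if not entry["status"]:
        # Dead site: no link, just a memorial tooltip.
        return " 💀 Rest in Peace, {0} 💀".format(citizen)
    if entry["tags"]:
        return "{0}: {1}".format(citizen, ", ".join(entry["tags"]))
    return citizen
```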
Finally, one last thought... Some users have multiple buttons, which is why the population count is less than the number of buttons. I would like to visually group these in some way, and the easiest idea that sprang to mind was a colored DIV.
Beyond this, we start getting into potentially time-consuming stuff I don't want to deal with right now. I need to output the tag lists so I can look at them and decide if certain tags are worth using to group sites. Tags like "poetry" are probably informative, but tags like "personal" maybe aren't. Probably the best solution is to sort and count the tags and see which ones are the most useful. Eventually, an alphabetical topical listing might replace the current alphabetical username listing.
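Sorting and counting is a few lines with `collections.Counter`. This is a minimal sketch that assumes the data is shaped like the `citizen_data` dictionary above; `count_tags` is a made-up name.

```python
from collections import Counter


def count_tags(citizen_data):
    """Count how many sites use each tag, most common first."""
    counts = Counter()
    for entry in citizen_data.values():
        # set() so a site that lists the same tag twice is counted once
        counts.update(set(entry.get("tags", [])))
    return counts.most_common()
```

The result is a list of `(tag, count)` tuples, which is easy to eyeball or dump to a file.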
Another thought was checking to see if dead sites are archived at the Wayback Machine or elsewhere and providing a link to the archived site, but here's the thing. 1) wow that's extra work absolutely no one asked for am I stupid or what? 2) Everyone has the right to delete their personal cyber slag heap, and if someone REALLY wants to dig up your bones, maybe that's on them, you know? Maybe they should have to do the legwork if they're really that interested in examining the corpse. Seems fair.
I've been meaning to build a NeoCities site directory and leveraging the 2,500+ buttons I've gathered seemed like a good starting point. The presence of an 88x31 button alone doesn't mean a site is interesting, but it does signal a user may be more invested in NeoCities as a community and therefore more likely to spend time making an interesting site. In my previous post I mentioned using the NeoCities API to pull the self-assigned tags some users add to their sites. I dumped these into a .json file.
A few numbers:
The top tags roughly correlate with NeoCities' popular tags cloud. Here's a small sample:
('personal', 832)
('art', 788)
('videogames', 336)
('music', 307)
('blog', 233)
('anime', 183)
('writing', 150)
('programming', 104)
('90s', 100)
('2000s', 84)
('games', 83)
('cute', 67)
('nostalgia', 61)
('comics', 57)
('graphics', 55)
('retro', 54)
('furry', 52)
('ocs', 50)
Of note, 8 of you use "" as a tag (bless). 5 of you use the tag "neocities," a tag I immediately applied to my own site.
There are many tags that can be combined (for instance: "oc", "ocs", "originalcharacters", and maybe even "characters" and "characterdesign"). Some tags present interesting ambiguity ("languages," for instance) and some tags are perhaps used a bit too liberally (I assumed sites tagged "links" would be link directories, but not all are). Tags like "art" and "personal" are so widely used as to be meaningless for our purposes. Additionally, some users may be tagging personal interests rather than subjects their site is about.
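Combining variants can be handled with a hand-written alias table. The table below is a hypothetical starting point using the variants named above, and `normalize_tags` is an illustrative name; which tags really belong together is still a judgment call.

```python
# Hypothetical alias table: each variant maps to one canonical tag.
TAG_ALIASES = {
    "oc": "ocs",
    "originalcharacters": "ocs",
}


def normalize_tags(tags, aliases=TAG_ALIASES):
    """Map variants to canonical tags, dropping empties and duplicates."""
    cleaned = []
    for tag in tags:
        tag = tag.strip().lower()
        tag = aliases.get(tag, tag)
        if tag and tag not in cleaned:  # skips "" and repeats
            cleaned.append(tag)
    return cleaned
```

As a side effect, this also drops the empty `""` tag.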
I make a few assumptions:
For example, the NeoCities Zelda tag is small and still pulls an unreasonable amount of not-Zelda. The archive list pulls 4 sites: clubnintendoarchives, eligood, raylin-shire, rubyfire77. Three of these have Zelda content. 75%. I'll take it!
Okay, so let's look at clubnintendoarchives. It is tagged "nintendo", "videogames", "clubnintendo", "mario", "zelda". The most relevant of these tags is nintendo. Video games is a large enough tag to be irrelevant (336) but the nintendo tag, which pulls 23 sites, is about the size I'm looking for. If you cruise the nintendo tag, however, you'll find many of these sites might be better associated with more specific tags. garyland (tagged 90s, nintendo, anime, pokemon, tv) seems best represented as a Pokemon site, for example.
I wanted to see if I could come up with some code to determine tags by association, but when I grabbed all the tags used by users who also use the "nintendo" tag I got:
['pixelart', 'asian', 'hylics', 'crustacean', 'archive', 'vocaloid', 'reviews', 'finalfantasy', 'fancomic', 'mario', 'videogames', 'animalcrossing', 'zelda', 'concerts', 'splatoon', 'gaming', 'fromsoftware', 'anime', 'tcg', 'kirby', 'sega', 'nes', 'blog', 'music', 'art', 'gameboy', 'tamagotchi', '2000s', 'retro', 'pokemon', '90saesthetic', 'midi', 'games', 'writing', 'marinelife', 'memes', 'yoshi', 'personal', 'glitches', 'pixel', 'skruffy64', 'nintendo', 'robots', 'tv', 'thrifting', '90s', 'pikmin', 'clubnintendo', 'kawaii']
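Grabbing that list takes only a few lines. Here is a rough sketch, again assuming the `citizen_data` layout from earlier; `related_tags` is a name I made up for illustration.

```python
def related_tags(citizen_data, tag):
    """Collect every tag used by sites that also use `tag`."""
    related = set()
    for entry in citizen_data.values():
        tags = entry.get("tags", [])
        if tag in tags:
            # Includes `tag` itself, like the list above includes 'nintendo'.
            related.update(tags)
    return sorted(related)
```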
As a human, I can easily manually select the relevant subtags (finalfantasy, mario, animalcrossing, zelda, splatoon, kirby, pokemon, yoshi, pikmin), but automating this chaos doesn't seem easy. To use another video game example, 4 users have the tag "finalfantasy," and if you look at their tag lists:
['kingdomhearts', 'finalfantasy', 'videogames']
['finalfantasy', 'writing', 'art', 'gaming', 'videogames']
['ffxiv', 'finalfantasy', 'games', 'art', 'videogames']
['gaming', 'games', 'nintendo', 'videogames', 'finalfantasy']
They all use variations of videogames, games, and gaming in their tags. These users mean "games" to refer to video games explicitly, but Boardgame NeoCities uses "games" to mean something different. Next I isolated "splatoon":
['pixelart', 'pixel', 'nintendo', 'art', 'splatoon']
['dollz', 'splatoon', 'personal', 'webcore']
['nintendo', 'splatoon', 'art', 'blog', 'videogames']
['splatoon', 'nintendo', 'memes', 'videogames']
['personal', 'art', 'videogames', 'splatoon']
['pastel', 'kidcore', 'splatoon', 'oldweb']
Some users tag with "videogames", but some don't. Can I write code that infers, from this list, that splatoon is related to nintendo, and nintendo is a subtag of gaming? I'm sure someone smarter than me can, but this feels outside the scope of a simple directory generation project. I concluded tags need to be hand-selected. Fair enough; a lack of curation reduces a directory's usefulness. I can still use the button archive data as a starting place.
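To show why inference is shaky, here is a quick sketch that measures how often other tags appear alongside "splatoon", using the six tag lists quoted above. It is a simple co-occurrence ratio, not a real taxonomy algorithm, and `cooccurrence_share` is an illustrative name.

```python
from collections import Counter

# The six "splatoon" tag lists quoted earlier in this post.
SPLATOON_SITES = [
    ["pixelart", "pixel", "nintendo", "art", "splatoon"],
    ["dollz", "splatoon", "personal", "webcore"],
    ["nintendo", "splatoon", "art", "blog", "videogames"],
    ["splatoon", "nintendo", "memes", "videogames"],
    ["personal", "art", "videogames", "splatoon"],
    ["pastel", "kidcore", "splatoon", "oldweb"],
]


def cooccurrence_share(tag_lists, tag):
    """For sites using `tag`, map each other tag to the share of sites using it."""
    matching = [set(tags) for tags in tag_lists if tag in tags]
    counts = Counter(
        other for tags in matching for other in tags if other != tag
    )
    return {other: n / len(matching) for other, n in counts.items()}


shares = cooccurrence_share(SPLATOON_SITES, "splatoon")
```

"nintendo" and "videogames" each appear on half of the splatoon sites, but so does "art". Raw co-occurrence can't tell a parent topic from a tag that's simply popular.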
Narrow tags for consideration to any that have at least 4 uses and the list shrinks from 1,417 to a manageable 265. I went through and picked tags that I felt represented interesting communities (or cliques or niches) at NeoCities. Some I had to look up (tokipona and egl, for instance). dolls is not the same as dollz, and so on. I grudgingly acknowledged tags like "zine" and "zines" would need to be combined somehow.
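The narrowing step might look like the sketch below. It takes `(tag, count)` pairs shaped like the sample list earlier in this post; the function name and the alphabetical sort (easier to skim by hand) are my own choices.

```python
def frequent_tags(tag_counts, min_uses=4):
    """Keep tags used at least `min_uses` times, sorted for manual review."""
    return sorted(tag for tag, uses in tag_counts if uses >= min_uses)
```

Near-duplicates like "zine" and "zines" land next to each other in the sorted output, which makes them easier to spot.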
I learned some interesting things. Users almost always use "comic" to mean "webcomics," for instance, but "comics" has a broader range of meaning. A tag like "insects" reliably means the site is substantively about insects, but some users use the tag "cats" just because... idk, they like cats? They might not have any cats on their site at all.
Curating a basic list of tags and outputting a basic directory using a jinja2 template is simple. Finding ways to combine tags or further curate the listings was harder. I wanted to set up a system where I could just dump .json data in and reliably generate a directory without getting needlessly complicated with special cases or things that needed to be adjusted by hand.
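The "simple" part, grouping sites under a curated tag list before handing them to a template, could be sketched like this. It assumes the `citizen_data` layout from earlier and assumes dead sites are left out of the directory; `build_directory` is a hypothetical name.

```python
def build_directory(citizen_data, curated_tags):
    """Return {tag: [subdomains]} for live sites that use a curated tag."""
    directory = {tag: [] for tag in curated_tags}
    for subdomain, entry in sorted(citizen_data.items()):
        if not entry.get("status"):
            continue  # assumption: dead or unchecked sites are not listed
        for tag in entry.get("tags", []):
            if tag in directory and subdomain not in directory[tag]:
                directory[tag].append(subdomain)
    return directory
```

A jinja2 template can then loop over `directory.items()` the same way the button template loops over `buttons.items()`.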
I decided to maintain a .json that contains the following:
The last two are necessary because sometimes people don't update their tags when the site focus changes and sometimes people use tags arbitrarily. I also want to start adding sites that don't have buttons, and those will have to be added manually anyway.
This was originally going to be a series of blog posts, but then I realized... why? Vertical webspace is infinite, and one HTML file is easier to maintain. File under Design Philosophy. This page references the following: