Archive Pagination and Diggin' Up Graves

Mood: working

Archive Pagination Get

Several thousand 88x31 buttons ago I understood I needed to divide the archive into pages, and I did not do that because It Felt Like Work. Well, I am once again dragging myself into a depraved nest of code to slay the dragons I myself birthed into this world. (Translation: finally gonna update it.) Fact is, the archive is already over 50MB and NeoCities isn't getting any smaller. Hopefully this update will make it less overwhelming.

It's best to divide the pages by letter, rather than # of buttons. That means 27 pages. Yay. We'll keep things simple, and name them consistently and logically: 88x31-a, 88x31-b, etc.

Extremely tempting to use frames for this, just to drive people on Hacker News insane, but no. These need to be individual pages. I'll keep the 88x31 explanation header for context, and plorp the buttons below it.

The current code works by taking a big ol .json of neocitizen data and tossing it into a single template:

rendered_template = template.render(citizen_data=citizen_data,population=population,today_date=today_date,alphanumeric=alphanumeric)

The fix is to iterate over the alphabet, and render a page for each letter. Which is actually pretty easy (pardon the 🍝):

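for letter in alphanumeric:
    # Render one page per letter; the template gets the current letter.
    page = template.render(citizen_data=citizen_data, population=population, today_date=today_date, alphanumeric=alphanumeric, letter=letter)
    # The "#" page gets the plain 88x31.html filename.
    if letter[0] == "#":
        page_filename = '88x31.html'
    else:
        page_filename = '88x31-' + str(letter) + ".html"

    # Write inside the loop so every letter gets its own file.
    with open(page_filename, 'w') as f:
        f.write(page)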
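For the curious, here is a rough sketch of how the buttons could be bucketed by first character before rendering. The helper names (bucket_for, group_by_letter, page_filename) are hypothetical, and it assumes "#" is the catch-all page for any subdomain that doesn't start with a-z.

```python
import string

# Assumption: "#" plus the 26 letters gives the 27 pages.
ALPHANUMERIC = ["#"] + list(string.ascii_lowercase)


def bucket_for(subdomain: str) -> str:
    """Return the page a subdomain belongs on: its first letter, or "#"."""
    first = subdomain[0].lower()
    return first if first in string.ascii_lowercase else "#"


def group_by_letter(citizen_data: dict) -> dict:
    """Split the big citizen dict into one smaller dict per page."""
    pages = {letter: {} for letter in ALPHANUMERIC}
    for subdomain, entry in citizen_data.items():
        pages[bucket_for(subdomain)][subdomain] = entry
    return pages


def page_filename(letter: str) -> str:
    """Same naming scheme as the post: 88x31.html, 88x31-a.html, ..."""
    return "88x31.html" if letter == "#" else "88x31-" + letter + ".html"
```

With something like this, each render call could be handed only the slice for its letter instead of the whole .json.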

Now I'm not sure why I dragged my feet for so long, and I feel kinda silly about that.

💀 💀 💀 DEATH BECOMES THEM 💀 💀 💀

The other thing I needed to do is a deadlink check. Why does it matter? Because people actually cruise the archive looking for sites to follow, and there's no point wasting anyone's time with dead links when we have an easy way to check. Right? Right?? Right.

Currently, Crawlie only verifies that a site exists when the site button is added to the archive. Just clicking around, I landed on multiple dead sites (RIP). Please stop deleting your sites, I beg you.

Anyway I didn't feel... super great... about doing like 7,500 API calls to verify all these links. But according to 44Nifty's blog post on special sauce, 500k API calls gives a sauce penalty. So it seems 500k is what Kyle Drake deems excessive.

Therefore, the plan was to do this quickly under the cover of darkness, while Kyle was trapped in chatbot hell, and hope the site overlords didn't perceive me.

The tasks: pull a list of living neocitizens, check the url to see if they're still alive, then update the corresponding .json entry. This info is stored under a status key.

Active site:

{"000010": {"buttons": ["000010.gif"], "tags": ["klonoa", "blog"], "status": true}}

DED site:

{"3fetid": {"buttons": ["3fetid.png"], "tags": [], "status": false}}

Maaaan this feels like work 😭 but ☝️☝️☝️☝️ we've heard that excuse before so onward!

Digging for Graves...

Once again, this really was not that much work. It took far longer to actually check howevermany thousands of urls.

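import json
import neocities

# JSON_FILE (path to the archive's .json) and nc (an authenticated
# NeoCities API client) are assumed to be defined earlier in the script.
graves = list()

with open(JSON_FILE) as f:
    citizen_data = json.load(f)

for subdomain in citizen_data.keys():
    # Only re-check sites that were still alive as of the last run.
    if citizen_data[subdomain]["status"]:
        url = "https://{0}.neocities.org".format(subdomain)
        try:
            # One API call per living site.
            site_data = nc.info(subdomain)
        except neocities.neocities.NeoCities.InvalidRequestError:
            # The API rejected the lookup, so treat the site as dead.
            citizen_data[subdomain]["status"] = False
            print("{} is dead".format(url))
            graves.append(url)

with open(JSON_FILE, "w") as outfile:
    json.dump(citizen_data, outfile)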
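Since the whole worry was hammering the API, here is a sketch of the same loop with a pause between calls. Everything here is hypothetical: find_graves is not part of the script above, is_alive stands in for whatever wraps the API lookup, and the one-second default is a guess at "polite", not a documented NeoCities limit.

```python
import time


def find_graves(citizen_data, is_alive, delay_seconds=1.0):
    """Mark dead sites in citizen_data and return their subdomains.

    is_alive is any callable that takes a subdomain and returns True/False.
    Passing it in keeps the loop testable without touching the network.
    """
    graves = []
    for subdomain, entry in citizen_data.items():
        if not entry.get("status"):
            continue  # already marked dead: skip the API call entirely
        if not is_alive(subdomain):
            entry["status"] = False
            graves.append(subdomain)
        # Pause after every real check to spread the calls out.
        time.sleep(delay_seconds)
    return graves
```

At one second per check, roughly 7,500 sites would take a bit over two hours, so this trades speed for not being noticed.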

This resulted in 392 corpses, assuming none of the names are typos. I am saddened by some of the names I recognize from my web wanderings. Let us have a moment of silence, to pay respect to our fallen neighbors:

🪦 4ivilo, 🪦 angelgirl134123, 🪦 artists-of, 🪦 atohacya, 🪦 avertedvision, 🪦 ayu-mi-x, 🪦 betah, 🪦 bojanthelibrarian, 🪦 cafemika, 🪦 caors, 🪦 cappyy, 🪦 conduit-7, 🪦 cosmopopcomics, 🪦 deathslingxr, 🪦 drhelvetica, 🪦 edn, 🪦 eightbriitt, 🪦 everlastingcontrast, 🪦 freesha2002, 🪦 fretnoize, 🪦 fundamentallyunloveable, 🪦 geniphony, 🪦 geno7, 🪦 gh0stprince, 🪦 goodgoobgooby, 🪦 greyasashe, 🪦 halssite, 🪦 hatto, 🪦 heart-soda, 🪦 helianthuspetal, 🪦 hermione-art, 🪦 herocore, 🪦 hydratriangle, 🪦 icg, 🪦 ihearttheundead, 🪦 incelperspective, 🪦 isfim, 🪦 jaysmicezone, 🪦 jonniecore, 🪦 junothesilly, 🪦 katzentrys, 🪦 kirby677, 🪦 kittydoll, 🪦 konbie, 🪦 konfetti, 🪦 lambdafun, 🪦 leanneu, 🪦 lightninglove, 🪦 limineonal, 🪦 line-space, 🪦 lisathepainful, 🪦 logicalwillow, 🪦 lukuak, 🪦 manjaruntu, 🪦 medulladeath, 🪦 megalomaniac, 🪦 milk2008, 🪦 miserabledolly, 🪦 mollysdndjournal, 🪦 mordzine, 🪦 mortalboy, 🪦 natolmo, 🪦 naxdot, 🪦 ninjabou, 🪦 omarlego, 🪦 otamadachi, 🪦 owlcollective, 🪦 peachpum, 🪦 perish-x, 🪦 pleurodelinae, 🪦 prosequitur, 🪦 quoderatdemonstrandum, 🪦 rebdotexe, 🪦 rh0mbus0fruin, 🪦 rozariosanguinem, 🪦 scheherazades-niche, 🪦 shitlist, 🪦 sightseer, 🪦 sillylittledog, 🪦 skeletalcomrade, 🪦 stefwithanf, 🪦 strawbearie, 🪦 superstarmations, 🪦 t0by-toxxik, 🪦 takkyuudou2003fan, 🪦 techrakatt, 🪦 techramancer, 🪦 terminalrot, 🪦 thatpupi, 🪦 theglittersalamango, 🪦 theisadoragroove, 🪦 theran, 🪦 they-walk-among-us, 🪦 thoughtcrimes, 🪦 timesmice, 🪦 toothman, 🪦 treefingerfilms, 🪦 untitledblog, 🪦 vendetta2525, 🪦 vivus, 🪦 vodkabinereb, 🪦 vomitboyz, 🪦 wardof, 🪦 wastelandimperiatrix, 🪦 wavecave, 🪦 webpage1990colourised, 🪦 weedpizza, 🪦 wicked-forest, 🪦 widepop, 🪦 wtrclover, 🪦 wyrmwhisper, 🪦 xxbunnyratxx, 🪦 xxxemobunnyxxx, 🪦 ziggycore, 🪦 1roomsurvival, 🪦 chuyas, 🪦 cloutsandrec, 🪦 coffin-corner, 🪦 cubicsimulation, 🪦 dfbw, 🪦 headcaze, 🪦 niconiconapster, 🪦 xenonical, 🪦 16-no-solitude, 🪦 angelnite, 🪦 br2k5, 🪦 eigenvoid, 🪦 kitz, 🪦 marzka, 🪦 schooltrenchcoat, 🪦 trench, 🪦 zaks-slaughterhouse, 🪦 cyberblank, 🪦 softmoon, 🪦 stardustdreamz, 🪦 tai7kmusic, 🪦 erythronium, 🪦 cyberangeldust, 🪦 dc-blog, 🪦 dunkingtruth, 🪦 fartmother, 🪦 galerexia, 🪦 goat-online, 🪦 h0pey0ng, 🪦 i-me-and-myself, 🪦 lauragrave, 🪦 luv-ghoul, 🪦 vasterror, 🪦 vampireautopsy, 🪦 barneysmind, 🪦 elsenn, 🪦 francess, 🪦 its11pmwhatamidoing, 🪦 kmartparkinglot, 🪦 sentimentality, 🪦 sunnysystem, 🪦 vilxdryad, 🪦 andrymeda, 🪦 chmuryverse, 🪦 fille-de-pierrot, 🪦 rindustrial, 🪦 angelpuppyclub, 🪦 angelscake, 🪦 awhe, 🪦 babybellcheese, 🪦 blooperblog, 🪦 cardcaptors, 🪦 cherrychocochan, 🪦 codecrusader, 🪦 deathbycats, 🪦 evemarie, 🪦 glittertown, 🪦 gnomescourt, 🪦 janpan, 🪦 jt802t, 🪦 juriettoo, 🪦 kojou, 🪦 ladytron, 🪦 lorekeeping, 🪦 lyricaltokarev, 🪦 majickss, 🪦 momolover, 🪦 much-ado-about-everything, 🪦 n64bug, 🪦 nyatchi, 🪦 oswalds, 🪦 pinocchiop, 🪦 princesslunagameblog, 🪦 qso404, 🪦 radiationcat, 🪦 radpage, 🪦 raspergine, 🪦 realitv, 🪦 sealparty, 🪦 snowenti, 🪦 sockspace, 🪦 sodagirl, 🪦 stickmanishere, 🪦 strangled, 🪦 teshief, 🪦 theariaeliot, 🪦 thenoxwitch, 🪦 tlaylor, 🪦 uglygirlswag, 🪦 uwuboa, 🪦 vashti, 🪦 vocaland, 🪦 waxynwane, 🪦 wiverscuomo, 🪦 xn--0s9h, 🪦 ygg, 🪦 andrewsstuff, 🪦 aomuroguna, 🪦 artiostudio, 🪦 axetrax, 🪦 berry-playground, 🪦 bluenight, 🪦 cineparaidiotas, 🪦 divsel, 🪦 federiefederi, 🪦 ghoulish-cinnamon-site, 🪦 glytch, 🪦 gone-girl, 🪦 heirofslime, 🪦 hypercomplex, 🪦 kengo, 🪦 ketraline, 🪦 ketsuu, 🪦 lovelttr, 🪦 lunatic-writing-projects, 🪦 nekatina, 🪦 nekhnona, 🪦 nicolasperez, 🪦 omnis-unfair-mind, 🪦 pawzbutton, 🪦 primaballerina, 🪦 randomzigzag, 🪦 sillivis, 🪦 skellsparty, 🪦 somecaninething, 🪦 spacetroll, 🪦 squareroot, 🪦 sunnyapples, 🪦 tipheret, 🪦 typoism, 🪦 uglypsyche, 🪦 vampirology, 🪦 webcatz, 🪦 yuahouse, 🪦 2ainnet, 🪦 beezlebabe, 🪦 chocov4mp, 🪦 circustiger, 🪦 cornpop397, 🪦 dansdungeon, 🪦 doompy, 🪦 emoalien, 🪦 exodusfleet, 🪦 extensioncord, 🪦 funkydealer, 🪦 funwplushtrap, 🪦 g00nsgarbage, 🪦 goofysillygoober, 🪦 izabellaofcastile, 🪦 jackalopes, 🪦 jlon, 🪦 jubiland, 🪦 koiitann, 🪦 linksfem, 🪦 madewithrealsugar, 🪦 mikumyung, 🪦 nancer, 🪦 neogeist, 🪦 rosie-eclairs, 🪦 rymin, 🪦 schooltown, 🪦 silversquid, 🪦 soapfriendo, 🪦 sodiecake, 🪦 thecohort, 🪦 trenty-brenty, 🪦 vinylwolf, 🪦 walpurgizt, 🪦 xxxiwudnvrstopuxxx, 🪦 07151129, 🪦 aetherie-99, 🪦 angelmorningstar, 🪦 baileylockheart, 🪦 bibliohound, 🪦 biocatzard, 🪦 bisexualism, 🪦 chaoticdreamz, 🪦 cybertomboy, 🪦 dariusaur, 🪦 duskspirals, 🪦 epicvgmusic, 🪦 f-t-p, 🪦 fishstew, 🪦 frogobongo, 🪦 futen, 🪦 g0shiki, 🪦 gammagoop, 🪦 ghostnoire, 🪦 liliyoa, 🪦 mayfl0wer, 🪦 mazetthew, 🪦 menheraaudino, 🪦 minnowpond, 🪦 mrsaturn, 🪦 obsession-central, 🪦 pankines-zone, 🪦 parhelictriangle, 🪦 podminton, 🪦 polybiuzz, 🪦 savior-god, 🪦 swoesight, 🪦 techrahouse, 🪦 terminationevent, 🪦 thesolarstudio, 🪦 vetulicolia, 🪦 vomitdistrict, 🪦 waloeders, 🪦 whitepeach, 🪦 3dsangel, 🪦 captain-goofball-kandi, 🪦 christinee, 🪦 dannkestreet, 🪦 dykeism, 🪦 exhibitd, 🪦 germpills, 🪦 hgari, 🪦 jayzzztv, 🪦 jonathn, 🪦 juneish, 🪦 kaiserpug, 🪦 kekkon2puro, 🪦 loosesocks, 🪦 luciuscorvus, 🪦 mehlancholia, 🪦 metallicscorner, 🪦 networkneighbourhood, 🪦 rotten-egghead, 🪦 slank1800, 🪦 snowibunni, 🪦 sparklewiziz, 🪦 sunkissed-feathers, 🪦 thegportal, 🪦 thejesusof, 🪦 toxic-revolution, 🪦 traversetown, 🪦 trendgender, 🪦 xxcrazyrainbow02xx, 🪦 zanefolder, 🪦 alysrealm, 🪦 bloodthirstymasquerade, 🪦 cupidspostalservice, 🪦 everybodysleeps, 🪦 nethazard, 🪦 sacred-gate, 🪦 thatwitecat, 🪦 drabunwolfcat, 🪦 winteryear, 🪦 applejar, 🪦 happylittlecomputerfag, 🪦 kiittypawbz, 🪦 kwason, 🪦 meowlimit, 🪦 polyphylactery, 🪦 sacredchoral, 🪦 tranceclickheart, 🪦 anarkrist, 🪦 griferssiters, 🪦 fairy-hermit, 🪦 ratteta, 🪦 webforever, 🪦 aspdestos, 🪦 keatongamer1248, 🪦 sweet-pea, 🪦 game-boy, 🪦 inwealorwoe, 🪦 bughug, 🪦 farikz, 🪦 greytext, 🪦 lovealone, 🪦 mshellfire, 🪦 tauruspawz, 🪦 bananaslurpee, 🪦 hellojoy, 🪦 nachowinnie, 🪦 netmagicalgirl, 🪦 hohi, and 🪦 olistormarts.

Rest in peace, beloved citizens.

Now. The weirdness. You can't go digging up a bunch of dead bodies without finding a little weirdness.

WISE FROM YOUR GWAVE

1979 appears to be a dead site, but when I check the site info, it does not throw the expected Invalid Request Error. The profile throws a 404 error. So what gives? 1979, are you dead or just sleeping? Or is this NeoCities weirdness under the hood?
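One way to catch cases like this would be a second opinion over plain HTTP whenever the API says a site is fine. A rough sketch using only the standard library; http_status and verdict are hypothetical helpers, and "suspicious" is just a label I made up for "the API says alive but the page 404s".

```python
import urllib.error
import urllib.request


def http_status(url, opener=urllib.request.urlopen, timeout=10):
    """Return the HTTP status code for url, or None if it can't be reached.

    opener is injectable so the function can be tested without a network.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses arrive as exceptions; the code is on the error.
        return err.code
    except urllib.error.URLError:
        # DNS failure, refused connection, and similar.
        return None


def verdict(api_says_alive, status):
    """Combine the API result with the HTTP status into one label."""
    if not api_says_alive:
        return "dead"
    if status == 404:
        return "suspicious"  # the 1979 case: API is happy, page is not
    return "alive"
```

This doesn't explain what is going on with 1979, it only flags the mismatch so a human can go look.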

What now

God, I hope nothing.

We've got a scraper. We've got pagination. We've got a dead link finder thing. We've got a scraper. Did I already say that?

No more updates until next year!!! ✊