Finally - More Technology Bullshit · 1:23pm Nov 30th, 2013
Finally.
If anyone actually reads these things, you may recall that a long time ago I wrote a script to download all 130,000+ stories (however many there are) on fimfiction, brute-force style. That fucking sucked. It took about 3 days for the script to run from start to finish, I'm sure knighty wouldn't appreciate all the bandwidth eaten by a massive crawler like wget in recursive mode, and it honestly crashed my file manager when I tried to open the folder, because there were so many files. (I'm willing to bet even knighty doesn't store these stories as files, but rather as database entries that just get presented to the web browser as if they were files.)
BUT, I've revised my scripts. I've now got a setup that will completely archive every story on your favorites list, and even create followable links. Everything displays 100% the way it should, but it's all on my local machine. The good news is, it only grabs your favorited stories, so you don't have all these extra files lying around. I'm kinda tired, so I don't wanna put the script up today, but if anyone wants it, send me a PM and I'll upload it, or send it to you, or we'll figure something out.
I just... can't believe it's finally done. And I'm sure knighty probably wouldn't mind this revision as much. The initial crawl doesn't take much bandwidth, and every time you go to your localhost (or "*.internal") copy of fimfiction, you're requesting data from your own computer, not knighty's servers, so it potentially SAVES him some bandwidth.
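For anyone wondering what "a localhost copy" means in practice, here's a minimal sketch in Python of serving a folder of archived pages from your own machine. This is NOT my actual setup, just an illustration using Python's built-in web server. The function name make_archive_server and the default port are made up for the example.

```python
import functools
import http.server


def make_archive_server(archive_dir: str, port: int = 8000) -> http.server.ThreadingHTTPServer:
    """Build a web server that serves the static files in archive_dir.

    Hypothetical helper for illustration only. It binds to 127.0.0.1, so
    only this computer can reach it, and every request is answered from
    the local disk instead of the original site.
    """
    # SimpleHTTPRequestHandler maps each URL path to a file under `directory`.
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=archive_dir
    )
    return http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
```

Calling make_archive_server("path/to/your/archive").serve_forever() and then opening http://127.0.0.1:8000/ in a browser would show whatever pages are in that folder, with no traffic leaving your machine.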
No pic, no proof, and I intend to prove myself.
http://www.mediafire.com/convkey/315e/pddutdwgfhfx8hdfg.jpg?size_id=e
http://www.mediafire.com/convkey/e12c/h8b5clcqucj0h48fg.jpg?size_id=e
http://www.mediafire.com/convkey/85f5/lm7u3qmrcg46h8wfg.jpg?size_id=e
If anyone has any questions or anything, feel free to PM me, I really do like to talk about technology and shit. If you want this and you're using a Windows or Mac OS X computer (I'm using GNU/Linux, CentOS in the picture) then... well, you'll probably have to wait a bit for me to convert the scripts to a compatible format, but it shouldn't take too long, as I have a much better idea of what I'm doing now.
Oh, but uh... one other little point of interest is... you'll always be logged in on your localhost copy. You see, fimfiction.net is a dynamic, database-driven website, but this whole archiving process... you COULD make a dynamic, database-driven archive of fimfiction, but that'd be... fucking hell, because you'd have to reconstruct the whole structure. What I've done is archive everything into a static form. (Notice on the screenshot with the story, it has a "%3F" in the url instead of a "?". That's because the files got archived with the question mark in the filename, and I'm referencing actual static files, whereas on fimfiction you're not actually accessing static files, you're sending a query (everything after the "?") to the page indicated by everything before it.) One page can serve 100,000 different requests if he wanted it to, just by having you query the page and having the page respond with data that depends on the query. But my archive is all static: no queries, each page has a separate file, and all the queried pages are referenced explicitly.
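To make the "%3F" thing concrete, here's a rough Python sketch of the idea. It's an illustration, not the script I actually used: the function names static_filename and local_link are invented, and so is the example url (story.php?id=123 is just a stand-in, not a real fimfiction address).

```python
from urllib.parse import quote, urlsplit


def static_filename(url: str) -> str:
    """Filename a static mirror would save this page under.

    The "?" and the query are kept literally as part of the name, so
    every distinct query becomes its own file on disk.
    """
    parts = urlsplit(url)
    name = parts.path.lstrip("/")
    if parts.query:
        name += "?" + parts.query
    return name


def local_link(url: str) -> str:
    """Link that points at the saved file.

    The "?" has to be written as %3F. Otherwise the browser would treat
    everything after it as a query instead of as part of the filename.
    """
    # "/", "=" and "&" are left alone; "?" is escaped to %3F.
    return "/" + quote(static_filename(url), safe="/=&")


# Made-up url, purely for illustration.
example = "http://example.com/story.php?id=123"
print(static_filename(example))  # story.php?id=123
print(local_link(example))       # /story.php%3Fid=123
```

On the live site, story.php would be one script answering every id. In the static copy, "story.php?id=123" is just the name of one file, and the link has to spell the "?" as "%3F" to reach it.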
If you have a hard time understanding any of this, don't worry, I'm more than happy to teach the prerequisite knowledge. If you feel that everything here is beneath you, and you're wondering why this little nerd is acting like he's actually done something with his little "kiddie-script" project, just remember that we all start somewhere. (I'm at the point where some people perceive my computer knowledge to be great while others perceive it to be very little, but truly, it's all relative. Of course there will be people who haven't gotten this far yet and will be lost in what I did, and likewise there will be people so far above where I am that my projects are "child's play" to them.)