I'm doing some mirroring/archiving, and I would certainly encourage anyone else who wants to do the same. I don't have what I downloaded available online anywhere yet, though.
I have three archives:
The first archive is everything except the forums. That includes pretty much anything at *.cityofheroes.com that can be reached by recursively following links from the main website, as well as content on a few related domains. So this covers domains such as ftp.coh.com, goingrogue.na.cityofheroes.com, and ftp.ncsoft.com. I last took a snapshot around September 10, but I may take a new one later this week. This archive is more than 20 GB in size because of the media files.
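For anyone scripting their own crawl, here is a rough sketch of the "what counts as in scope" rule described above. It is not how HTTrack does it internally; the function name `in_scope` and the `excluded_hosts` parameter are made up for illustration, and since the forum hostname isn't given here, it is left for you to pass in.

```python
from urllib.parse import urlsplit

# Hosts named in the post. Treating "*.cityofheroes.com" as a suffix match
# is an assumption about how the wildcard should behave.
DOMAIN_SUFFIXES = ("cityofheroes.com",)
EXTRA_HOSTS = {"ftp.coh.com", "ftp.ncsoft.com"}


def in_scope(url, excluded_hosts=frozenset()):
    """Return True if the URL belongs in the 'everything except the forums' archive.

    excluded_hosts is a hypothetical parameter: pass the forum hostname(s)
    you want to leave out of this archive.
    """
    host = (urlsplit(url).hostname or "").lower()
    if host in excluded_hosts:
        return False
    if host in EXTRA_HOSTS:
        return True
    # Match the bare domain or any subdomain, but not look-alike names.
    return any(host == s or host.endswith("." + s) for s in DOMAIN_SUFFIXES)
```

You would call `in_scope(link)` on each link the crawler finds and only follow the ones that return True.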
The second archive is just the forums. My initial attempts to archive them ran into some difficulties. Because it's a dynamic site with many ways of viewing the same information, you end up with many, many copies of the same content. And because it's all "flat", it was making my computer cry to have so many files in a single directory. However, this thread got me investigating again, and I discovered that there's an archive-friendly version of the forums that strips out links, images, formatting, etc. Now that is great for archiving! So last night I kicked off an archive job and it downloaded successfully, though it stopped at, I think, 1,000,000 links (I didn't realize that was the default limit). I now have it set to download with a much higher threshold. I don't know how long it will take to finish, but I'll try to make sure I get at least one full snapshot. I'm not sure how big it will end up being, but the partial download was 1 GB.
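On the "too many files in one directory" problem: one common workaround (not something I've set up here, just a sketch) is to spread the files over subdirectories picked from a hash of the filename. The function name `sharded_path` and the two-level, two-character layout are assumptions for illustration.

```python
import hashlib
from pathlib import Path


def sharded_path(root, filename, levels=2, width=2):
    """Map a flat filename to root/xx/yy/filename using a hash of the name.

    levels and width are illustrative defaults: two levels of two hex
    characters each gives 65,536 possible leaf directories.
    """
    digest = hashlib.sha1(filename.encode("utf-8")).hexdigest()
    # Take consecutive slices of the hex digest as directory names.
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return Path(root).joinpath(*parts, filename)
```

The same filename always maps to the same place, so you can find a file again without keeping an index. To actually move a file you would create `sharded_path(...).parent` with `mkdir(parents=True, exist_ok=True)` first.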
The third archive is all the videos from their Twitch.TV, Ustream, and YouTube accounts. This is nearly 60 GB of data.
I'm also keeping the archives I make backed up on SpiderOak, so I'm reasonably confident I won't lose them to drive failure or anything like that. However, as I said, I wouldn't want to discourage anyone from making their own archives; more copies are safer. I'm using HTTrack to make my website archives.