An article I published back in 2013 in the Journal of the Canadian Historical Association, entitled “Mining the Internet Graveyard: Rethinking the Historians’ Toolkit,” is now out from behind the paywall and accessible online. It grew out of some of my first postdoctoral research and represents my first thoughts on, and engagement with, web archives. It ended up getting quite a bit of traction: it won the Canadian Historical Association’s award for the best article in the Journal of the Canadian Historical Association, and it seems to have generated a good bit of discussion, judging by where it’s been cited and used in courses.
It’s a bit dated now: the Canada’s Digital Collections example I use wouldn’t be the one I’d pick today, but for somebody who was really just starting down the programming rabbit hole, I think it’s a good reflection of where I was then. I’m still pretty happy with the first half of the piece, where I situate web archiving within the historical profession and the digital humanities more generally.
Anyways, the abstract:
“Mining the Internet Graveyard” argues that the advent of massive quantity of born-digital historical sources necessitates a rethinking of the historians’ toolkit. The contours of a third wave of computational history are outlined, a trend marked by ever-increasing amounts of digitized information (especially web based), falling digital storage costs, a move to the cloud, and a corresponding increase in computational power to process these sources. Following this, the article uses a case study of an early born-digital archive at Library and Archives Canada – Canada’s Digital Collections project (CDC) – to bring some of these problems into view. An array of off-the-shelf data analysis solutions, coupled with code written in Mathematica, helps us bring context and retrieve information from a digital collection on a previously inaccessible scale. The article concludes with an illustration of the various computational tools available, as well as a call for greater digital literacy in history curricula and professional development.
One final note: it’s too bad it took so long for this to go open access, but options were limited in Canadian historiography when I was publishing this. The new Tri-Agency Open Access policy means, however, that I now have a very real and legal obligation to make my work accessible, and editors need to play along or they don’t get to publish federally funded scholars.