New Article: “If These Crawls Could Talk: Studying and Documenting Web Archives Provenance”

I’m part of a team that’s just published a new article, “If These Crawls Could Talk: Studying and Documenting Web Archives Provenance,” in the Journal of the Association for Information Science and Technology. If your institution subscribes, you can find the article here. Alternatively, we have a preprint here.

Our abstract, I hope, does a good job of explaining what the article is about. Read on if you’re curious:

The increasing use and prominence of web archives raises the urgency of establishing mechanisms for transparency in the making of web archives to facilitate the process of evaluating a web archive’s provenance, scoping, and absences. Some choices and process events are captured automatically, but their interactions are not currently well understood or documented. This study examined the decision space of web archives and its role in shaping what is and what is not captured in the web archiving process. By comparing how three different web archives collections were created and documented, we investigate how curatorial decisions interact with technical and external factors and we compare commonalities and differences. The findings reveal the need to understand both the social and technical context that shapes those decisions and the ways in which these individual decisions interact. Based on the study, we propose a framework for documenting key dimensions of a collection that addresses the situated nature of the organizational context, technical specificities, and unique characteristics of web materials that are the focus of a collection. The framework enables future researchers to undertake empirical work studying the process of creating web archives collections in different contexts.

This was the product of research I did with Christoph Becker, Emily Maemura, and Nicholas Worby at the University of Toronto’s Digital Curation Institute and was supported by the Marshall McLuhan Centenary Fellowship at the Faculty of Information. It was a very rewarding position and I’m so glad to see this article come out.
