Whenever I even think about archival trips, my back pre-emptively aches: hours of sitting or standing over documents, taking digital photographs. And I know that if I looked around the archive, chances are nine out of ten of my colleagues would be doing something similar (and yes, because the plural of anecdote is not data, Ithaka S+R has reported on this widespread trend in historical research).
When we all travel home to our universities, those historians who are travelling on SSHRC’s dime will surely deposit their research data (photos?) after a reasonable amount of time in their organization’s institutional repository or on some other sharing site, right, to make sure their publicly funded research is accessible? SSHRC just wants “qualitative information in digital format,” so maybe our photos, or just our notes, right? </sarcasm>
I wager – unscientifically, based only on anecdotal conversations at the Canadian Historical Association, on Twitter, and in hallways – that the vast majority of historians in Canada would be opposed to the very idea, even if their work were generously funded. The value of our work is too wrapped up in the scarcity of the sources themselves, rather than just the narratives we weave with them.
I did a short interview on CBC Radio’s The Current with Anna Maria Tremonti, which aired this morning. I was responding to some of the utopian arguments made by Christian Rudder’s book Dataclysm, noting that while the historical record is going to be enriched by digital sources, we’ve got to consider issues of access, preservation, and funding. I was nervous, but I think I got my main points across pretty decently.
The talk is available here: “Historians want Canada to give them access to Big Data.”
It was a fantastic experience, and it really got me thinking about how rewarding it would be to build the political capital needed to establish website legal deposit in Canada – or, at the very least, to put the preservation of digital resources a little more firmly on the table. Maybe I’ll try my hand at some popular writing.
X-Posted with ActiveHistory.ca
By Ian Milligan
Yesterday afternoon, in the atrium of the University of Waterloo’s Stratford Campus, a packed room forewent what was likely the last nice weekend of summer to join Peter Mansbridge and guests for a discussion of “What’s the future of the library in the age of Google?” It was aired on CBC’s Cross Country Checkup on CBC Radio One, available here. It was an interesting discussion, tackling major issues such as what local libraries should do in the digital age, questions of universal accessibility, and whether we should start shifting away from a model of physically acquiring sources (notably books) towards new models for the 21st century. Historians, and those who care about history, have much to contribute to these sorts of conversations. Those who know me or have read my writings over the last three years know that I’m not a luddite. But I came away worried about some of the assumptions made in the conversation, and what they mean for those of us who write about the past.
A big crowd of folks who care enough about libraries to spend a beautiful Sunday afternoon in a university building lobby.
I don’t want to rehash the conversation, as you can listen to it yourself, but a brief summary of some of the main themes might help. The broadcast began with Peter Mansbridge asking the major question: “Digital technology is changing the way we store information, and how we learn from it. Does it make sense to stack printed books in costly buildings when virtual libraries are just a mouse-click away?” Mansbridge was joined by Christine McWebb, director of academic programs at the Waterloo Stratford Campus, and Ken Roberts, former chief librarian of the Hamilton Public Library and a member of the Royal Society of Canada’s Expert Panel on the Future of Libraries and Archives in Canada.
I think even my colleagues are surprised when they’re reminded that I’m beginning my third year at the University of Waterloo. Last year was a fantastic one: great students, fun colleagues, and getting more involved in the life of the university (from sitting on doctoral and Master’s committees to attending fun events).
I’m teaching four courses this year, two in the Fall and two in the Winter. This term, I’m teaching two classes: the second-year historical methodology course for our majors, minors, and lovers-of-history, and a fourth-year honours seminar on Canadian social movements. If you’re curious, feel free to click through the syllabus thumbnails below to read them in full. As with every class, some material had to be left out, but that’s the way of things!
My digital history course, which a few folks on Twitter have asked about, will be offered again in Winter 2015.
History 250 – The Art and Craft of History
History 403A – Canadian Honours Seminar
A low-resolution version of the EnchantedForest visualization. Read on for higher-resolution downloads.
ImagePlot, developed by Lev Manovich’s Software Studies Initiative, promises to help you “explore patterns in large image collections.” It doesn’t disappoint. In this short post, I want to demonstrate what we can learn by visualizing the 243,520 images of all formats that make up the child-focused EnchantedForest neighbourhood of the GeoCities web archive.
Setting it Up
Loading web-archived images into ImagePlot (a set of macros for the open-source program ImageJ) requires an extra step, which works for both the Wide Web Scrape and the GeoCities data. Images need to be 24-bit RGB to work. My experience was that weird file formats broke the macros (e.g. an .ico file, or other junk that you get in a web archive), so I used ImageMagick to convert the files.
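As a rough sketch of that conversion step: the helper below routes files with unusual extensions through ImageMagick’s `convert` tool, forcing 24-bit RGB output. The extension list and directory layout are illustrative assumptions, not the exact set-up described above.

```python
import subprocess
from pathlib import Path

# Extensions that loaded cleanly in my runs (illustrative); anything else
# (e.g. .ico files) gets routed through ImageMagick first.
SAFE = {".jpg", ".jpeg", ".png", ".gif", ".bmp"}

def conversion_command(src: Path, out_dir: Path):
    """Return an ImageMagick command rewriting `src` as a 24-bit RGB PNG,
    or None if the file already has a well-behaved extension."""
    if src.suffix.lower() in SAFE:
        return None
    dest = out_dir / (src.stem + ".png")
    # `-type TrueColor` forces 24-bit RGB output.
    return ["convert", str(src), "-type", "TrueColor", str(dest)]

def convert_all(in_dir: Path, out_dir: Path):
    """Convert every awkward file in `in_dir`, leaving the rest alone."""
    for src in in_dir.iterdir():
        cmd = conversion_command(src, out_dir)
        if cmd:
            subprocess.run(cmd, check=False)
```

In practice you would point `convert_all` at the directory of extracted web-archive images before loading anything into ImagePlot.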
These popular colours from the children-focused area of GeoCities would look great on a paint swatch, right?
I have been tinkering with images in web archives – an idea I had was to pull out the colours and see how frequent they were in a given web archive. With some longitudinal data, we could see how colour evolves. In any event, by running this analysis on enough test corpora and web archives, we could begin to get a sense of which colours characterize which kinds of image collections: another method to ‘distantly’ read images.
This is easier said than done, of course! I did a few things. First, I began by getting average RGB values from each image using ImageMagick, as discussed in my previous post. This was potentially interesting, but I had difficulty extracting meaningful information from it (average enough RGB values and you end up comparing different shades of grey). It is, however, the least computationally intensive route, and I might return to it by binning the various colour values and then tallying them.
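The binning-and-tallying idea can be sketched in a few lines of Python. This is a hypothetical helper, not the ImageMagick pipeline itself: it snaps each RGB triple to the corner of a coarse bin so that near-identical shades count together, then tallies the bins.

```python
from collections import Counter

def bin_rgb(rgb, bin_size=32):
    """Snap an (r, g, b) triple to the corner of its bin, so that
    near-identical shades tally together."""
    return tuple((channel // bin_size) * bin_size for channel in rgb)

def tally_colours(pixels, bin_size=32):
    """Count how often each binned colour appears across a collection."""
    return Counter(bin_rgb(p, bin_size) for p in pixels)

# e.g. three near-white pixels and one near-black one all land
# in just two bins:
counts = tally_colours([(250, 250, 250), (255, 255, 255),
                        (248, 252, 250), (3, 1, 2)])
# counts == {(224, 224, 224): 3, (0, 0, 0): 1}
```

The bin size is a tunable trade-off: larger bins merge more shades (avoiding the everything-averages-to-grey problem), smaller bins preserve more distinctions.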
The second was to turn to an interesting blog post from 2009, “Flag Analysis with Mathematica.” In the post, Jon McLoone took all the flags of the world, calculated their most frequent colours, and used them to play with fun questions like what the fashionable choices for flag colours would be if he ever set up his own country. That functionality has actually been incorporated, to a limited degree, into Mathematica 10, but the blog post offers some additional capability that isn’t in the new, fancy integrated code.
As I’ll note below, I’m not thrilled with what I’ve come up with. But we can’t always just blog about the incredible discoveries we make, right?
Do you like animated GIFs? Curious what this is? Then read on… :)
Full confession: I rely too heavily on text. It was a common refrain in the talks I gave this summer on my work, which focused on the workflow I used for taking WARC files, implementing full-text search, and extracting meaning from it all. What could we do if we decided to extract images from WARC files (or other forms of web archives) and began to distantly read them? I think we could learn a few things.
A montage of images from the GeoCities EnchantedForest neighbourhood. Click for a higher-resolution version.
This continues from my work discussed in my “Image File Extensions in the Wide Web Scrape” post, which provides the basics of what I did to begin playing with images. It also touches on work I’ve done creating montages of images in both GeoCities and the Wide Web Scrape.
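For the montage step, ImageMagick’s `montage` tool does the heavy lifting. The helper below just assembles the command line; the tile layout and thumbnail geometry are illustrative assumptions, not the exact settings used for the GeoCities images.

```python
from pathlib import Path

def montage_command(image_dir: Path, output: str, tile="25x", thumb="64x64"):
    """Build an ImageMagick `montage` invocation that tiles every PNG in
    `image_dir` into a single contact sheet (geometry is illustrative)."""
    images = sorted(str(p) for p in image_dir.glob("*.png"))
    # `-tile 25x` lays out 25 columns; `-geometry 64x64+1+1` sizes each
    # thumbnail and adds a 1-pixel border of spacing.
    return ["montage", *images, "-tile", tile,
            "-geometry", thumb + "+1+1", output]
```

Passing the result to `subprocess.run` would produce one large image from the whole directory, which is exactly the kind of output shown above.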
While creating montages was fun, it didn’t necessarily scale: beyond a certain point, you find yourself clicking and searching around. I like creating them – they make wonderful images and are useful on several levels – but it’s hardly harnessing the power of the computer. So I’ve been increasingly playing with various image analysis tools to distantly read these collections.