Ian Milligan

A Digital, Public, and Youth Historian of 20th-Century Canada



An Aside: Blog Post Picked up in Condensed Format by Nature

Put this in the “another reason why it’s good to blog” pile. My blog post from last month, which argued that Yahoo! Messages was consciously destroying a fifteen-year swath of history, led Nature magazine to ask if I would submit it in condensed format to their Correspondence section (behind a paywall, unfortunately, but you’ll be able to read it if you’re on an institutional IP connection). Click here to view it.

I submitted it, and after further pruning by their staff (an aside to this aside: great editorial staff, from the contact people to the copy editors), a very short version appeared in today’s issue.

It’s almost certainly the shortest thing I’ve ever written, but I like to think that it’ll get some readers and hopefully encourage more researchers to hop on the digital preservation bandwagon.

Thanks to all my readers who reblogged and retweeted my earlier post, which helped spread it around the Internet.



A WARC to Topic Model Visualization Workflow

Time to share some tinkering, in the hopes that somebody, somewhere, might find it helpful. :)

Regular readers will know that I’m interested in how historians can approach web archives, as discussed in a three-part series in late 2012 (see part one, two, and three). As I’ve stressed, in both tweets and in some draft writing: Historians need to understand web archives, however, as we will be professional end users of these archives. We played a critical role in shaping the modern practice of traditional archiving. Let us make sure that historians are present for the next step. There’s a conversation, but it’s largely amongst people involved in web archiving as creators rather than as users.

[if you want to skip to my code, it's here]

So here’s me positing a problem: Some web archives do not have descriptions, so you aren’t sure what you’re going to find inside. This includes some just-in-time web saves, like this mirror of the Montreal Mirror’s website. There’s always an item listing, automatically generated, that lets you know exactly what is in the website. When dealing with wide web crawl data, part of that massive 80TB dataset, this is a lifesaver. Very briefly: Web ARChive files are complicated containers of multiple files – ActiveHistory.ca, for example, is made up of over 18,000 files. That’d choke a file system, but you can turn it into one Web ARChive that you can play with later.
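To make the "container of multiple files" idea a bit more concrete, here is a minimal sketch of walking through the records in a WARC and pulling out each record's target URI, which is roughly what an item listing gives you. It is only an illustration, not the code from my workflow: the function names (iter_warc_records, list_targets, make_record) are made up for this example, it assumes an uncompressed WARC (real ones are usually gzipped), and a dedicated WARC library would be far more robust.

```python
import io


def iter_warc_records(stream):
    """Yield (headers, payload) for each record in an uncompressed WARC stream."""
    while True:
        line = stream.readline()
        if not line:
            return  # end of the stream
        if not line.strip():
            continue  # blank lines separate one record from the next
        if not line.startswith(b"WARC/"):
            raise ValueError(f"expected a WARC version line, got {line!r}")
        headers = {}
        while True:
            line = stream.readline()
            if line in (b"\r\n", b"\n", b""):
                break  # an empty line ends the header block
            name, _, value = line.decode("utf-8").partition(":")
            headers[name.strip()] = value.strip()
        # The payload is exactly Content-Length bytes long.
        payload = stream.read(int(headers.get("Content-Length", 0)))
        yield headers, payload


def list_targets(stream):
    """Return the target URI of every record that declares one."""
    return [
        headers["WARC-Target-URI"]
        for headers, _ in iter_warc_records(stream)
        if "WARC-Target-URI" in headers
    ]


def make_record(uri, body):
    """Build a tiny, hypothetical WARC record for demonstration purposes."""
    head = (
        "WARC/1.0\r\n"
        "WARC-Type: resource\r\n"
        f"WARC-Target-URI: {uri}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    ).encode("utf-8")
    return head + body + b"\r\n\r\n"


# Two made-up records standing in for a real archive.
sample = make_record("http://example.org/", b"<html>hello</html>") + make_record(
    "http://example.org/about", b"<html>about</html>"
)
targets = list_targets(io.BytesIO(sample))
print(targets)
```

The same loop would also hand you each payload, which is the text you would eventually want to feed into a topic model.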

Furthermore, these WARC files are too big. Wouldn’t it be nice if you could, at a glance, see what ianmilligan.ca is about without having to read what I’m writing here? (yes, but also imagine if you were just looking at a bunch of visualizations – would be invaluable in a research project)

But for the historian, it’s not terribly useful in and of itself. What if we had a lot of these files, how could we quickly see what related to our topic, and what didn’t? Would we be able to automate it?

Here’s my idea, which I cooked up as a way to learn some more technical skills and invent a tool to help me in my workflow. What if we could take a Web ARChive file (WARC) and then hook it up to a topic modelling visualizer, like Stanford’s Termite [see their paper here]?



It’s Getting Harder to Commit History: SSHRC Postdocs, 1995-2013

One of the most popular blog posts on this site is my SSHRC Postdocs: What’s Going On post, which plots applications, successful awards, and the success rate. Back in February, I tweeted a revised version of the chart, but wanted to put it here with some additional commentary:

SSHRC Postdoctoral Fellowship Success Rates, 1995-2013.


The above includes the 2013-14 results. We’re seeing a blip upwards in the success rate: there are more awards and a slight decline in the number of applicants. We’ll need a few more years to see whether this is a statistical hiccup or the beginning of a new trend. There might be some chill factor with the impending tightening of restrictions around the number of applications, announced last summer. In any case, this success rate is still low: although it is an uptick, it is still lower than at any time between the 1996-97 competition year and the 2008-09 competition year.

An additional thing to keep in mind: The “History Wars” have been getting a lot of attention lately, most recently due to the federal government’s decision to begin a “comprehensive review of significant aspects in Canadian history.” The specifics are best left to others, notably Active History co-editor Thomas Peace. I don’t want to get into it, but certainly there’s a tendency for the Conservatives to want to encourage scholarship that they find politically agreeable (military history, Diefenbaker) – just as there was for the Liberals (peacekeeping, multiculturalism). But keep in mind, as we discuss historians, that it’s quite frankly getting harder to do any kind of history as a professional undertaking. 

There’s a war on junior scholars in Canada. That’s being a bit provocative, I know, and the federal government doesn’t deserve all the blame. We have seen declines in total postdoctoral fellowships awarded in several years as seen above, which we can attribute to funding issues. Universities and provincial governments also deserve some blame, as they’ve flooded the market. But whatever the root causes, we’re seeing a major issue of human capital in the data above.

This affects all of us: whether you’re a military historian, a social historian, a Conservative, a Liberal, or a New Democrat. If we really care about history, let’s put our money where our mouth is and help fund the next generation of historians. It affects junior scholars most of all, but also senior scholars. This is the future of the profession.

A return to the 2004-05 success rate would have meant 62 more postdoctoral fellowships awarded this year (244 out of 903 would have been a 27.02% success rate, which we had in 2004-05).
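For anyone who wants to check the arithmetic, here is a quick sketch. The 903, 244, and 62 figures come from the paragraph above; the number of awards actually made this year is not stated there, so the script infers it as 244 minus 62, and the variable names are just for illustration.

```python
applicants = 903            # applications this year (from the post)
awards_at_2004_rate = 244   # awards needed to match the 2004-05 rate
extra_awards = 62           # the gap quoted in the post

# Assumption: actual awards this year, inferred as 244 - 62.
actual_awards = awards_at_2004_rate - extra_awards

target_rate = awards_at_2004_rate / applicants
actual_rate = actual_awards / applicants

print(f"2004-05 success rate: {target_rate:.2%}")
print(f"Inferred awards this year: {actual_awards}")
print(f"Inferred success rate this year: {actual_rate:.2%}")
```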

It’s getting harder to commit history.



An Aside: An Ode to Exploratory Research

With the semester done, the search committee I was on having wrapped up, and – finally – my two article drafts (one on moral panics on the early Canadian Internet and one on web archiving) completed, I had an entire afternoon of guilt-free exploratory research.

I love exploratory research. A forthcoming article grew out of exploratory research (blogged about here), when I was messing around with dissertations and citation counts. Now, on the other hand, that obscures the days and days that I’ve literally spent running into dead ends, banging my head against bad data, technical limitations, or sources that never really went anywhere. I have terabytes of external hard drives, filled up with datasets, some of them representing a few days of work that won’t soon see the light of day.

So it’s nice to have these guilt-free days to just play, in a constructive way. I think of it as akin to the Google 20% time. To check out what’s new at the Internet Archive. To take an abstract problem and play with it in a programming language. To listen to a CBC debate on Canadian history.

So what have I discovered today?



NCPH Panel: “Reaching the Public through the Web: The Practice of Digital Active History”

On Friday morning at 8:30am (!), I’m looking forward to giving a brief presentation alongside several of my colleagues at the NCPH annual meeting. In our panel, “Reaching the Public through the Web: The Practice of Digital Active History,” we will be providing several different perspectives on, well, how to reach the public through the Internet. Our session abstract puts it eloquently (I didn’t write it):

Active history is history that listens, is responsive and encourages a broad range of forms of public engagement. As the accessibility and volume of digital content increases, so do possibilities for digital outreach activities. These opportunities bring challenges, benefits, and new methods of approaching the past. This panel focuses on the intersection of active history and digital technologies; with an emphasis on community involvement, alternate reality games, digital vs. physical engagement, and the possibilities of engaging disparate audiences.

What will we be talking about?



An Aside: “An Academic with Impostor Syndrome”

In academia, we talk a lot about the stresses of the job. A lot of this, for me, comes down to these two must-reads: Joseph Kasper’s “An Academic With Impostor Syndrome” in the Chronicle of Higher Education and Aimée Morrison’s “Academic Impostor Syndrome” in the always great blog Hook and Eye.

It’s funny, as we often tend to put forward imposing, confident faces on social media: the researcher, teacher, and administrator with his/her stuff together, progressing forward. Even in a few months where I’ve designed and taught two new courses while moving a pretty aggressive research agenda forward, that voice is always there: “Is this enough?” “Shouldn’t you work just a bit later on the weekend?” “How dare you take a break when others fight for full-time work?”

These articles helped remind me that I’m not alone. Hopefully others find them helpful too.

Back to work, though! Tomorrow, I’m off to Ottawa for NCPH 2013, and am looking forward to a glass of wine on the train.



Topic Modelling Review Up

The new issue of the Journal of Digital Humanities is out, and I’m really happy to be – along with Shawn Graham of Carleton University – a contributor. In this issue, we’re reviewing the Java-based topic modelling tool MALLET (the MAchine Learning for LanguagE Toolkit). I’m really flattered to be part of the issue, and glad to see several links to our free, open access guide to how to use MALLET in the Programming Historian 2 throughout.

Topic modelling, as the review and the Programming Historian 2 piece note, has been a critical part of my digital toolkit over the last year or so. It’s enabled me to make cursory explorations of very large datasets, from musical lyrics to Canadian history dissertations, dead websites, and more. Some of the initial euphoria around it being a ‘magic bullet’ for everything has worn off (honestly, the first time you try it, you’ll be bouncing off the walls) but it’s an incredible program to use.

If you ever want to chat topic modelling, shoot me an e-mail.
