Extracting Weather Data for All Dates in a Document: Adding Context to a Document

The weather in Ottawa for Wednesday, November 25th, 1981 (date extracted from Trudeau-Lévesque correspondence). What if we could get this data for every date in a document?

Building on my theme of using Wolfram|Alpha to figure out things about the past, I wondered if I could write a program that could take a document or secondary source, extract all of the dates, and then let you know about that day’s weather. I envision this as eventually being a reader that pops up weather data as you’re reading something, giving you some added context about the date being mentioned. It would have to be automated, though, otherwise you might not want to use it!

Luckily, Mathematica has date recognition functions built in. As a proof of concept, I decided to test it out on the Wikipedia page for Stephen Harper (Canada’s current Prime Minister). We’ll try it out on primary documents below. Some jigging is required depending on the date format, but I could easily write a function that could get all dates.
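A minimal sketch of what such a "get all dates" function might look like, assuming only two written formats ("February 6, 2006" and "6 February 2006") matter. The name `extractDates` and the helper patterns are hypothetical, and this uses plain string patterns rather than the built-in date recognition:

```mathematica
(* Hypothetical helper, not the original code: collects dates written in
   two common long formats. The month list and both formats are assumptions. *)
monthName = Alternatives @@ {"January", "February", "March", "April",
    "May", "June", "July", "August", "September", "October",
    "November", "December"};
dayDigits = Repeated[DigitCharacter, {1, 2}];   (* one or two digits *)
yearDigits = Repeated[DigitCharacter, {4}];     (* exactly four digits *)

(* "February 6, 2006" *)
monthDayYear = WordBoundary ~~ monthName ~~ " " ~~ dayDigits ~~ ", " ~~
   yearDigits ~~ WordBoundary;
(* "6 February 2006" *)
dayMonthYear = WordBoundary ~~ dayDigits ~~ " " ~~ monthName ~~ " " ~~
   yearDigits ~~ WordBoundary;

(* Return every matching date string, in document order *)
extractDates[text_String] := StringCases[text, monthDayYear | dayMonthYear]
```

Calling `extractDates` on an imported page or letter would return a list of date strings, duplicates included, ready to be sent on to Wolfram|Alpha. Other formats (abbreviated months, numeric dates) would need their own patterns.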

First, we import the page, then extract the dates in the standard Wikipedia date format, and then send the extracted dates to Wolfram|Alpha. I’ve decided here to limit myself to the first ten dates:

(* Import the page as plain text *)
scrape = Import["http://en.wikipedia.org/wiki/Stephen_Harper", "Plaintext"];
(* Extract dates written like "February 6, 2006" and keep the first ten *)
harperdates = StringCases[scrape, DatePattern[{"MonthName", " ", "Day", ", ", "Year"}]];
tendates = Take[harperdates, 10];
(* Ask Wolfram|Alpha for the average temperature in Ottawa on each date *)
alltemps = Table[
  WolframAlpha["average temperature in Ottawa (Ontario), " <> date,
   {{"Result", 1}, "Plaintext"}],
  {date, tendates}];

And our result:

"-7 °C (degrees Celsius) (Monday, February 6, 2006)"
"-1 °C (degree Celsius) (Saturday, March 20, 2004)"
"-7 °C (degrees Celsius) (Monday, February 6, 2006)"
"9 °C (degrees Celsius) (Tuesday, May 21, 2002)"
"-23 °C (degrees Celsius) (Thursday, January 8, 2004)"
"21 °C (degrees Celsius) (Friday, June 28, 2002)"
"(no data available for Ottawa, Ontario, Canada on Monday, October 25, 1993)"
"(no data available for Ottawa, Ontario, Canada on Monday, June 2, 1997)"
"12 °C (degrees Celsius) (Thursday, April 30, 1959)"
"12 °C (degrees Celsius) (Thursday, April 30, 1959)"
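
Several dates come back twice because they appear more than once on the page, and each repeat costs another Wolfram|Alpha call. One possible way around that, sketched here with two hypothetical helpers (`queryTemp` and `cachedTemp` are not in the original code), is to memoize the lookup so each distinct date string is only sent once:

```mathematica
(* Hypothetical helper: one Wolfram|Alpha lookup, using the same query
   wording and result format as the code above *)
queryTemp[date_String] :=
  WolframAlpha["average temperature in Ottawa (Ontario), " <> date,
   {{"Result", 1}, "Plaintext"}];

(* Hypothetical helper: standard memoization idiom. The first call for a
   given date stores its result as a new definition, so later calls with
   the same string reuse it instead of querying again. *)
cachedTemp[date_String] := cachedTemp[date] = queryTemp[date];
```

With the `tendates` list from earlier, `alltemps = cachedTemp /@ tendates` should give the same list of results while contacting Wolfram|Alpha only once per distinct date.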

Let’s now try it on a letter from Pierre Trudeau to Quebec Premier Lévesque, dated 1981. The date pattern is a bit different, but the code is basically the same. We extract the dates, run them through Wolfram|Alpha, and receive the following:

"-10 °C (degrees Celsius) (Tuesday, December 1, 1981)"
"-10 °C (degrees Celsius) (Tuesday, December 1, 1981)"
"-10 °C (degrees Celsius) (Wednesday, November 25, 1981)"
"14 °C (degrees Celsius) (Monday, September 28, 1981)"
"(no data available for Ottawa, Ontario, Canada on Saturday, April 19, 1975)"
"(no data available for Ottawa, Ontario, Canada on Thursday, October 14, 1976)"
"(no data available for Ottawa, Ontario, Canada on Wednesday, January 19, 1977)"
"7 °C (degrees Celsius) (Wednesday, October 22, 1980)"
"-4 °C (degrees Celsius) (Thursday, November 5, 1981)"
"10 °C (degrees Celsius) (Thursday, September 11, 1980)"
"-4 °C (degrees Celsius) (Thursday, November 5, 1981)"
"-4 °C (degrees Celsius) (Thursday, November 5, 1981)"
"10 °C (degrees Celsius) (Thursday, September 11, 1980)"
"-4 °C (degrees Celsius) (Thursday, November 5, 1981)"
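
These results are plain strings, and some of them are "no data available" messages rather than temperatures. If the numbers are needed for anything beyond display, a small parser could separate the two cases. This is only a sketch that assumes every result has the shape shown above; `parseTemp` is a hypothetical helper:

```mathematica
(* Hypothetical helper: pull the leading Celsius value out of a result
   string, or return Missing["NotAvailable"] when the string does not
   start with a temperature (e.g. the "no data available" messages). *)
parseTemp[result_String] := Module[{hits},
  hits = StringCases[result,
    StartOfString ~~
      (num : (("-" ~~ DigitCharacter ..) | (DigitCharacter ..))) ~~
      Whitespace ~~ "°C" :> ToExpression[num],
    1];
  If[hits === {}, Missing["NotAvailable"], First[hits]]]

(* Example: two result strings copied from the output above *)
parseTemp /@ {
  "-10 °C (degrees Celsius) (Tuesday, December 1, 1981)",
  "(no data available for Ottawa, Ontario, Canada on Saturday, April 19, 1975)"}
```

The missing entries can then be filtered out, for instance with `Cases[parsed, _Integer]`, before averaging or plotting.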

I could see this being a fun complement to a document reader, since it’s always interesting to imagine what the weather was like on a given date. With the lookup automated, I’d be more likely to check than if I had to plug dates in individually.

And who knows, maybe we could learn something more about the past this way.

Incidentally, I’ve been blogging a lot lately – the late summer burst of energy. Starting next week, I’ll be turning my attention to teaching matters and will have a bit less tinkering time!

2 thoughts on “Extracting Weather Data for All Dates in a Document: Adding Context to a Document”

  1. Jonathan Goodwin (@joncgoodwin) says:

    I once had a test license for Mathematica, but this was before the Wolfram Alpha days. I don’t think I can afford it now, sadly enough, though I haven’t checked to see if my university has a license. I’m thinking about how I could do this in R, as it would be fun to gather some data about the effect of barometric pressure on human behavior, an idea that Stanley Kubrick was apparently very interested in.

  2. Ian Milligan says:

    That’s the big downside for Mathematica: the cost and the issue that there seem to be fewer site licenses for the product, at least in Canada. I’m not too familiar with R, but it should be possible. I see that Wolfram|Alpha has an API that allows for free non-commercial calls (http://products.wolframalpha.com/api/), so that could be one way forward?

    Good luck – I’d love to hear how this goes in another language!
