(and here my series continues… I’m blogging through August mainly to keep the work going when it can be so easy to sneak away, and this is more of an internal diary than anything else!)
Mathematica is made by the same company, Wolfram Research, that brings us Wolfram Alpha – the computational knowledge engine that powers parts of Siri, as well as being an overall fun resource to use as historians, tinkerers, or well, anybody (I’ve written about it before). As a diversion, I thought I would start comparing economic data to the topics that I am finding through MALLET.
Using free-form input, let’s get annual figures for unemployment in the US, 1964-1989.
With that done, we can then manipulate our data – getting them into comparable datasets – and begin to run correlations. Let’s see if we can find correlations in topic occurrences against the unemployment rate…
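For readers following along without Mathematica, here is a minimal Python sketch of the correlation step. It is not the code I ran; it assumes each topic has already been reduced to one number per year (for example, its average share of that year's lyrics), lined up against one unemployment figure per year. The function name `pearson` and all of the numbers are placeholders made up for illustration, not the real data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and of squared deviations from each mean
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Placeholder values for illustration only, NOT the real unemployment or topic figures.
unemployment = [5.0, 4.0, 6.0, 8.0, 9.0]       # one value per year
topic_share = [0.10, 0.08, 0.12, 0.16, 0.18]   # same years, one hypothetical topic

print(pearson(unemployment, topic_share))
```

Running this once per topic gives one coefficient per topic, which is what the next step compares.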
Where do we find the Max and Min correlations?
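As a rough Python sketch of that lookup (again, not the Mathematica I used), assume the coefficients have been collected into a mapping from topic to value. The topic names and numbers below are invented placeholders, and `extremes` is a hypothetical helper. It also separates the minimum coefficient, which marks the strongest inverse relationship, from the coefficient nearest zero, which marks the weakest relationship of all.

```python
# Placeholder coefficients for illustration only, NOT results from the analysis.
correlations = {
    "topic_a": 0.62,
    "topic_b": -0.15,
    "topic_c": -0.71,
    "topic_d": 0.05,
}


def extremes(corr_by_topic):
    """Return (most positive, most negative, nearest-to-zero) topic keys."""
    most_positive = max(corr_by_topic, key=corr_by_topic.get)
    most_negative = min(corr_by_topic, key=corr_by_topic.get)
    # The weakest relationship is the coefficient with the smallest magnitude
    nearest_zero = min(corr_by_topic, key=lambda t: abs(corr_by_topic[t]))
    return most_positive, most_negative, nearest_zero


top, bottom, flat = extremes(correlations)
print(top, bottom, flat)
```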
The most correlated is topic 17, “don tonight night give wanna fire rock inside lover good lose start body light burning crazy dream hot waiting” and the least correlated (the minimum, meaning it moves most strongly against the unemployment rate) is topic 39 “baby girl time world soul ya goodbye boy hear goin don girls fool man ring lies waiting make tears.”
So when the US unemployment rate is high, song lyrics contain references to fire, rock, lover, burning, crazy, dream, hot, waiting; and when it’s lowest, the topic about baby, girls, world, goodbye, boy, fool, waiting, tears is the most common.
What is happening here?
Well, we actually end up getting a sense of the “golden age” of low American unemployment versus the more dour days of the 1980s. Turns out topic 39 is the most popular topic for the late 1960s: a time of hope, yes, but also low unemployment. Topic 17, in turn, is the most popular topic in the 1980s, especially during Reagan’s first term.
The swings in the unemployment rate are so wide, with the 1980s at the high end and the 1960s at the low end, that the correlation is mostly picking up the difference between those two decades.
More digging will have to be done, I guess.