Creating Link Graphs with Warcbase

I was at the Columbia Web Archiving Collaboration: New Tools and Models conference this Thursday and Friday, and gave a quick demo. Here’s a bit more detail on it.

If you install Warcbase (this handy guide covers installing it on OS X) and follow the scripts, you will eventually end up with a data file that looks a bit like this.

You can download your own sample file here, which draws on the Canadian Political Party and Political Interest Groups collection.

But how do you turn this into a beautiful Gephi visualization?


Step One: Convert Link Output into GDF Format

You could do this manually by creating a CSV file, adding ‘TimeInt Source Target Weight’ as the first line of the file (tabs separating each value), and so forth, or you could use our handy script.

Usage is as follows:

$ ./ <file> > <output file>


$ cat <file> | ./


$ ./ < <file>
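
If you are curious what a conversion like this involves, here is a rough Python sketch. It is not the script mentioned above, just an illustration under some assumptions: that each input line holds four tab-separated values (TimeInt, Source, Target, Weight), that node names contain no commas, and that the edge time column is called `timeint` (a name made up here). The `nodedef>` and `edgedef>` header lines follow Gephi's GDF format, and the sample rows are invented.

```python
import io


def links_to_gdf(in_stream, out_stream):
    """Write a GDF graph built from tab-separated TimeInt/Source/Target/Weight rows."""
    nodes = {}   # dict used as an ordered set: keeps first-seen order
    edges = []
    for raw in in_stream:
        fields = raw.rstrip("\n").split("\t")
        # Skip blank or malformed lines, and the header row if there is one.
        if len(fields) != 4 or fields[0] == "TimeInt":
            continue
        time_int, source, target, weight = (f.strip() for f in fields)
        nodes[source] = True
        nodes[target] = True
        edges.append((source, target, weight, time_int))

    # Node section: one node name per line.
    out_stream.write("nodedef>name VARCHAR\n")
    for name in nodes:
        out_stream.write(name + "\n")

    # Edge section: 'timeint' is a hypothetical column name for this sketch.
    out_stream.write(
        "edgedef>node1 VARCHAR,node2 VARCHAR,weight DOUBLE,timeint VARCHAR\n"
    )
    for source, target, weight, time_int in edges:
        out_stream.write(f"{source},{target},{weight},{time_int}\n")


# Invented sample rows, only to show the shape of the input and output.
sample = (
    "TimeInt\tSource\tTarget\tWeight\n"
    "201201\ta.example\tb.example\t3\n"
    "201202\tb.example\tc.example\t1\n"
)
result = io.StringIO()
links_to_gdf(io.StringIO(sample), result)
print(result.getvalue(), end="")
```

Passing `sys.stdin` and `sys.stdout` instead of the `StringIO` objects would give the same pipe-friendly behaviour as the usage lines above.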

Step Two: Import into Gephi

You now want to take it into Gephi. Start Gephi (if you have trouble running it, this tutorial might help). Open the GDF file that you just generated. Click OK.

Now visit the ‘Data Laboratory’ panel and do the following. Select the ‘edges’ table so it looks like this.

Click on the ‘Merge Columns’ button and do this:

Make sure to parse dates as yyyymm.
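
To make the yyyymm pattern concrete: a TimeInt value such as 201506 reads as June 2015. Gephi does its own parsing, so the following Python snippet is only an illustrative check you could run on your values beforehand; the function name `parse_yyyymm` is made up for this example.

```python
from datetime import datetime


def parse_yyyymm(value):
    """Return the first day of the month encoded by a six-digit yyyymm string."""
    if len(value) != 6 or not value.isdigit():
        raise ValueError(f"not a yyyymm value: {value!r}")
    # %Y is the four-digit year, %m the two-digit month.
    return datetime.strptime(value, "%Y%m")


print(parse_yyyymm("201506").date())  # 2015-06-01
```

A value with an impossible month, such as 201513, raises a `ValueError` rather than being silently accepted.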

The final step is to click on ‘nodes’ in the upper left, click ‘copy data to other column,’ select ‘id,’ and copy to ‘label.’

Bob’s your uncle! You’ll notice that you’ve got the option to enable a dynamic timeslider at the bottom.

Select ‘Overview,’ apply the ‘Force Atlas’ layout, tinker with the settings, and you have a dynamic web graph!

