I was at the Columbia Web Archiving Collaboration: New Tools and Models conference this Thursday and Friday, and gave a quick demo. Here’s a bit more detail on it.
If you install Warcbase (this handy guide covers installing it on OS X) and follow the scripts, you will eventually end up with a data file that looks a bit like this:
200510 acq.osd.mil acq.osd.mil 96
200510 acq.osd.mil akss.dau.mil 12
200510 agoracosmopolite.com agorabookcafe.com 325
200510 agoracosmopolite.com agoracosmopolitan.com 271
200510 agoracosmopolite.com agoracosmopolite.com 8319
200510 agoracosmopolite.com genesmedia.com 325
200510 bloc.org go.microsoft.com 22
200510 blocpot.qc.ca blocpot.qc.ca 104
200510 blocpot.qc.ca marijuanaparty.org 16
200510 blocpot.qc.ca norml.org 16
200510 blocquebecois.org bernardbigras.qc.ca 16
200510 blocquebecois.org bloc.org 1069
200510 blocquebecois.org blocquebecois.org 276682
You can download your own sample file here, which draws on the Canadian Political Party and Political Interest Groups collection.
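If you want to poke at the link file before visualizing it, here is a minimal Python sketch. It assumes each record is four whitespace-separated fields in the order shown above (crawl month as yyyymm, source domain, target domain, link count). The names Link and parse_links are my own for illustration and are not part of Warcbase.

```python
from typing import Iterable, List, NamedTuple


class Link(NamedTuple):
    crawl_date: str  # crawl month as yyyymm, e.g. "200510"
    source: str      # domain the links come from
    target: str      # domain the links point to
    weight: int      # number of links from source to target


def parse_links(lines: Iterable[str]) -> List[Link]:
    """Parse whitespace-separated link records, skipping malformed lines."""
    links = []
    for line in lines:
        fields = line.split()
        if len(fields) != 4:
            continue  # blank line or unexpected number of columns
        date, source, target, weight = fields
        links.append(Link(date, source, target, int(weight)))
    return links


# A few rows taken from the sample data above.
sample = """\
200510 acq.osd.mil acq.osd.mil 96
200510 acq.osd.mil akss.dau.mil 12
200510 blocquebecois.org bloc.org 1069
"""

links = parse_links(sample.splitlines())

# Self-links (a domain linking to itself) tend to dominate the counts,
# so it can be useful to look at links between different domains only.
external = [link for link in links if link.source != link.target]
for link in external:
    print(link.crawl_date, link.source, "->", link.target, link.weight)
```

Filtering out self-links here is only a convenience for inspection; the rest of this walkthrough keeps every row.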
But how do you turn this into a beautiful Gephi visualization?
Step One: Convert Link Output into GDF Format
You could do this manually by creating a CSV file, adding ‘TimeInt Source Target Weight’ in the first line of the file (tabs separating each value), and so forth, or you could use our handy script pig2gdf.py.
Usage is as follows:
pig2gdf.py usage:
$ ./pig2gdf.py <file> > <output file>
OR
$ cat <file> | ./pig2gdf.py
OR
$ ./pig2gdf.py < <file>
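If you are curious what a conversion like this involves, here is a rough sketch of the idea in Python. It is not the pig2gdf.py script itself, and its output may differ from what that script produces. It assumes the four-column input shown earlier and writes a GDF file with a node section (nodedef>) followed by an edge section (edgedef>). The edge column name timeint and the function name links_to_gdf are my own choices for illustration.

```python
def links_to_gdf(lines):
    """Build GDF text from 'yyyymm source target weight' records."""
    nodes = []    # node names in first-seen order
    seen = set()  # fast membership check for node names
    edges = []
    for line in lines:
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        date, source, target, weight = fields
        for name in (source, target):
            if name not in seen:
                seen.add(name)
                nodes.append(name)
        edges.append((source, target, weight, date))

    out = ["nodedef>name VARCHAR"]
    out.extend(nodes)
    # 'timeint' is a custom column holding the crawl month (assumed name).
    out.append(
        "edgedef>node1 VARCHAR,node2 VARCHAR,"
        "weight DOUBLE,directed BOOLEAN,timeint VARCHAR"
    )
    out.extend(f"{s},{t},{w},true,{d}" for s, t, w, d in edges)
    return "\n".join(out) + "\n"


# A few rows taken from the sample data above.
sample = """\
200510 acq.osd.mil acq.osd.mil 96
200510 acq.osd.mil akss.dau.mil 12
200510 blocquebecois.org bloc.org 1069
"""

gdf = links_to_gdf(sample.splitlines())
print(gdf)
```

The sketch only defines a name for each node, which matches the later step of copying ‘id’ to ‘label’ in Gephi. Edges are marked as directed because a link from one domain to another does not imply a link back.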
Step Two: Import into Gephi
You now want to take it into Gephi. Start Gephi (if you have trouble running it, this tutorial might help). Open the GDF file that you just generated. Click OK.
Now visit the ‘Data Laboratory’ panel and do the following. Select the ‘edges’ table so it looks like this.
Click on the ‘Merge Columns’ button and do this:
Make sure to parse dates as yyyymm.
The final step is to click on ‘nodes’ in the upper left, click ‘copy data to other column,’ select ‘id,’ and copy to ‘label.’
Bob’s your uncle! You’ll notice that you’ve got the option to enable a dynamic timeslider at the bottom.
Select ‘Overview,’ apply the ‘Force Atlas’ layout, tinker with the settings, and you have a dynamic web graph!