Using Google Spreadsheet as a database

// UPDATE: Great addition in the comment field. It turns out the Javascript library tabletop.js helps you do the same thing as I walk you through in this blog post. Check it out.

For almost a year now I have been working with datajournalism at the Swedish public broadcaster Sveriges Radio. One of the things that I have learnt is that a good datavisualization should not be a disposable product. It should be possible to effortlessly updated it with new data. Ideally by someone that has never written a line of code.

In other words, I have come to understand the importance of easily accessible datasets in the newsroom. Enter Google Spreadsheet.

Setting up the spreadsheet

So I’m working with notice data here, that is the number of jobs that are given notice in Sweden per month. The task is to make a graph that shows the development of this number and I want the graph to be automatically updated when make changes in my spreadsheet.

Here is my data.

We want to communicate this data to our HTML file. To publish the spreadsheet choose File > Publish to the web. Choose the sheet you want to publish, make sure the “Automatically republish when changes are made” box is checked and click Start publishing. Instead of Web page, chose ATOM and click Cells.

If you open given url you can see that the spreadsheet has been transformed to an XML file. However, I would prefer the javascript native JSON format. To do that simply add ?alt=json-in-script to the url.

Transforming the data

Now lets create an HTML document and import the data. First make sure you include jQuery.

<script class="hiddenSpellError" type="text/javascript">// <![CDATA[
src</span>="//ajax.googleapis.com/ajax/libs/jquery/1.8.3/jquery.min.js">
// ]]></script>

Next I call the JSON file with an AJAX query and log the response.

url = "https://spreadsheets.google.com/feeds/cells/0AojPWc7pGzlMdDNhSXd1MW8wSmxiSkt5LXJkZDZWNEE/od6/public/basic?alt=json-in-script&callback=?";
$.getJSON(url, function(resp) {
	console.log(resp)
});

This is the object that is returned.

As you can see the structure is a bit messy. You will find the interesting parts if you dig down to feed > entry. That is where the cells are hidden. They are returned as a set of objects (0 to 21 in this case).

I have written a function that transforms our data to one well structured object. Here is the code (and feel free to tell me how to improve it – I’m a journalist, not a hacker).

function getData(resp) {
	//Functions to get row and column from cell reference
	var get_row = function(str) { return /[0-9]{1,10}/.exec(str)*1 };
	var get_col = function(str) { return /[A-Z]{1,2}/.exec(str) };

	var data = [];
	var cols = ["A"];
	var cells = resp.feed.entry;
	var column_headers = {};
	var number_of_cols = 0;
	var number_of_rows = get_row(cells[cells.length-1].title.$t);
	var row = 1;
	var	_row = {};

	//Get column headers
	for (i=0; i
		var _cell_ref = cells[i].title.$t;
		var _column = get_col(_cell_ref);
		if (_cell_ref == "A2") {
			break;
		}
		else {
			column_headers[_column] = cells[i].content.$t;
			number_of_cols = 0;
		};
	};

	//Iterate cells and build data object
	for (i=number_of_cols; i
		var cell_ref = cells[i].title.$t,
			col = get_col(cell_ref);
		if (row < get_row(cell_ref)) {
			//new row
			data.push(_row);
			_row = {};
			row++;
		}
		_row[column_headers[col]] = cells[i].content.$t;
	}
	data.push(_row);
	data.shift();
	return data;
}

This function returns an array of objects containing the data from each row. In other words everything I need to build my visualization.

Putting a visualization on top of it all

Now lets see what the data looks like. I split the data array into an array of categories (months) and an array of data points, feed them to a basic Highcharts graph and – bam! We have a dynamic datavisualization. And – most importantly – updating the graph is now super easy. Just add another row of data. Check out the final result here.

January-August.

January-October

Advertisements