Tutorial: Choropleth map in CartoDB using QGIS

The 2.0 version of the Fusion Tables challenger CartoDB was released last week. This is great news for all those of you that, for one reason or another, want to break free of Google Maps. CartoDB lets you geocode and plot addresses and other geographical points on Open Street Map tiles,  just like Fusion Tables does in Google Maps. With a little help from Quantum GIS you will also be able to make beautiful choropleth maps.

In this tutorial I will walk you through this process. It will hopefully be a good intro to QGIS and geospatial data if you are new to these concepts.

Step 1: Getting the geo data

I’m going to map unemployment data in Swedish muncipalities. So I obviously need a municipality map of Sweden.

Maps come in different formats. For this purpose a basic vector map (in svg format for example) is not robust enough. We need a .shp-, .geojson- or .kml-file. The difference between those formats and svg is that they define data points in geographical coordinates, whereas the svg is defined in pixels. Fortunately the Statistics Bureu of Sweden is generous enough to provide shape-maps of the municipalities.


Now go to QGIS (I think you need to be at version 1.7 to be able to follow this tutorial) and import the downloaded map (Layer >Add vector layer or Ctrl+Shift+V). Sweden_municipality07.shp is the file you want to select. Set the encoding to ISO-8859-1 to get all the Swedish charachters right.



Step 2: Appending data

I got some basic unemployment data from SCB and changed the municipality names to official ids. Here is the whole set.

To join the unemployment data with the map we need to have a column that we can use as a key. A way to patch the two datasets together. In QGIS, right-click the map layer and choose Open Attribute Table to see the embedded metadata.

attribute table

As you kan see the map contains both names and ids for all the municipalities. We’ll use these ids as our key.

Go back to your Google spreadsheet and save it dataset as a csv-file (File > Download as) and import it in QGIS (again: Layer >Add vector layer or Ctrl+Shift+V).


You will now see a new layer in the left-hand panel.


Confirm that the data looks okay my right-clicking the unemployment layer and selecting attribute table.

attribute table2

To join the two datasets right-click the map layer and choose Properties. Open the tab Joins.


Make sure that you select kommunid and KNKOD (the key columns). To make sure that the merge was successful open the attribute table of the map layer.

join completeAs you can see a column with the unemployment data has now been added.The map now contains all the information we need and we are ready to export it. Select the map layer and click Layer > Save as.


Two important things to note here. Choose GeoJSON as your format and make sure you set the coordinate system to WGS 84. That is the global standard used on Open Street Map and Google Maps. On this map the coordinates are defined in the Swedish RT90 format, which you can see in the statusbar if you hover the map.


Step 3: Making the actual map

Congratulations you’ve now completed all the difficult parts. From here on it’s a walk in the park.

Go to CartoDB, login and click Create new table. Now upload your newly created GeoJSON file.

CartoDB import

You can double-click the GeoJSON cells to see the embedded coordinates.

Skärmavbild 2012-12-14 kl. 18.34.01

Move all the way to the right of the spreadsheet and locate the unemployment column. If it has been coded as a string you have to change it to number.


If you go to the Map view tab your map should now look something like this:


To colorize the map click the Style icon in the right-side panel, choose choropleth map and set unemployment as your variable. And, easy as that, we have a map showing the unemployment in Sweden:

final map

Click to open map in new window.

Concluding remarks

I have played around with CartoDB for a couple of days now and really come to like it. It still feels a little buggy and small flaws quickly start to bug you (for exemple the lack of possibilities to customize the infowindows). And the biggest downside of CartoDB is obviously that it is not free of charge (after the first five maps). But apart from that it feels like a worthy competitor to Google Fusion Tables. Once you get a hold of the interface you can go from idea to map in literally a few minutes.

Pimp your Google Map

Today I’m having my first visualization published at Helsingin Sanomat. A map of all the discharges in Helsinki during one night (21-22.12) showing reported crimes and accidents. Check it out:

Go to HS.fi.

It’s a pretty basic visualizations. I got a bunch of addresses that I geocoded using the Yahoos place finder API and projected on Google Maps. I’m a huge fan of a lot of the Google tools out there. Most notably the Docs platform, which step by step is phasing out my decency on MS Office (at least MS Word).

However, I have been quite sceptical towards Google Maps. Mostly because of its – in lack of a better word – ugliness. We’ve seen enough of that pale blue-green-yellow layout.

The good news is that Google has made it possible to easily style your maps using a simple online interface. In this example I’ve just inverted the lightness, added a pink hue and reduced the saturation of the water.

The options are endless. You can easily spend hours playing around with the different settings.

Not too many developers and designers seem to have found this tool yet, but my prediction for 2012 is that there will be a lot more styled Google Maps. And why not a portal with open map skins for anyone to use? I can’t find anything like that at the moment.

Tutorial: How to extract street coordinates from Open Street Map geodata

I’ve spent almost a year learning about data-driven journalism and tools for analysis and visualization of data. I have now become confident enough to think that I might even be able to teach someone else something. So here it goes: my first tutorial.

The task

Earlier this fall Helsingin Sanomat published a huge dump of price data from Oikotie, a Finnish market place for apartments. I had an idea to build a kind of heat map where every street would be colored based on the average price of the apartments.

With the JavaScript library Polymaps you can easily make stylish web maps. The problem is that you need an overlay GeoJSON layer with the colored streets. Finnish authorities do not – yet! – provide open street-level geodata. Fortunately Open Street Map does.

From .shp to .geojson

The raw data from Open Street Map is downloadable in shape-format. So in my case I download the shapefile package of Finland and opened it in Quantum GIS (Layer > Add vector layer). This is what the finland_highway.shp file looks like.

This is A LOT of geodata, but in this case I’m only interested in the Helsinnki region. So I zoom in Helsinki an and select, roughly, the streets that I’m interested in using the lasso tool (select object tool ).

To export the selected part of the map to the GeoJSON format that Polymaps can read, chose Layer > Save Selection as vector file and GeoJSON as your format. Save! Done!

Filtering the GeoJSON file

We got a our GeoJSON-file. Now there is just one problem: it is huge, 18 MB! But there are a lot of streets here that we don’t need. We want to filter these streets. This will require some programming skills. I turn to Ruby.

This is the structure of an object in the GeoJSON file:

{ "type": "Feature", "properties": { "TYPE": "cycleway", "NAME": "", "ONEWAY": "", "LANES": 0.000000 }, "geometry": { "type": "LineString", "coordinates": [ [ 24.773350, 60.203288 ], [ 24.774540, 60.203008 ], [ 24.777840, 60.202300 ], [ 24.781013, 60.201565 ], [ 24.781098, 60.201546 ], [ 24.782735, 60.201199 ], [ 24.784300, 60.201045 ], [ 24.785846, 60.201085 ], [ 24.787381, 60.201133 ], [ 24.787812, 60.201169 ], [ 24.788101, 60.201207 ], [ 24.797454, 60.201623 ], [ 24.797636, 60.201620 ], [ 24.799625, 60.201405 ], [ 24.801848, 60.201089 ] ] } }

This street does apparently not have a name, but the others do, which means I can extract that streets that I’m interested in based on their name.

In another array I list the streets that I want to be included in the visualization. Like this:

streets = [
# and so on...

I now want to tell the computer to iterate through the GeoJSON file and extract the streets that are included in the streets array. Or in practice I approach it the other way around: I check what streets in the GeoJSON file that are not included in the array and remove them.

This is is the code:

def process(data)
json = JSON.parse(data)

#-- STEP 1. Go through the geojson file and add the index numbers ("i") of the street names that are not found in the array "streets" to a new array ("del")
i = 0
del = []

json["features"].each do |a|

unless $streets.include? a["properties"]["NAME"]


i += 1


#-- STEP 2: Iterate through the del array from the back and remove the streets with the corresponding index numbers in the geojson data ---
del.reverse.each do |d|



#-- Open a new json file and save the filtered geojson ---

File.open("hki.json", 'a'){ |f| f.write(JSON.generate(json))}

In this case data is the GeoJSON file and $streets the array of the selected streets. And voilà: you got yourself a new GeoJSON file. In my case I managed to shrink it down to 1.6 MB.

The visualization

I now got what I wanted in the beginning: the geographical coordinates for the streets that I want to plot, which means I’m halfway to making my visualization.

I won’t go in to details on how the actual visualization was put together. The short version is that I used this pavement quality example as base script and made some small modifications. The price data is then picked from a separate file. This is the result, the housing prices in Helsinki, street by street:

Open the full map in new window.

Not too shabby, right? I managed to sell this visualization to Hufvudstadsbladet which now runs it on their website.



Animation: World terrorism 2004-2011

After the terror attacks of nine-eleven the USA set out to fight terrorism. It has been a succesful quest in the sense that the Americans themselves have not been hit by terrorist since – but others have. According to statistics from the American Worldwide Incident Tracking System 37,798 lethal attacks have been carried out since 2004 killing 174,547. That’s a lot of nine-elevens.

Since the WITS provides such easily accessible data it would be a shame not to do something with it. So I did and this is what I ended up with (click to open in new window):

A few words about how I did this visualization.

The data

The basic data was really easy to gather here. I just filtered the attacks with ten or more casualties and downloaded the spreadsheet from WITS. The challenge was to geocode the places. I hadn’t done this before.

I wrote a Ruby script that called the Yahoo Place Finder API to transform the place names to longitudes and latitudes. For some reason a few locations got completely wrong coordinates (I started to wonder when the USA was suddenly hit by major attacks that I had never heard of). These were filtered away.

The visualization

This job provided two new challenges. One, working with dates. Two, working with maps. Just as the last time I used the JavaScript library d3.js to put the visualization together.

For the map I used the provided Albers example as a base script. With some assistance from this thread on Google groups I managed to figure out how to make a map in d3 (my heureka moment was when I realized that you can modify d3.geo.js to center the world map wherever you want).

Getting a hold of the dates in JavaScript became much easier with the date.js library. Highly recommended.

Final thoughts

A lot could have been done to polish the animation. One could have added some sort of timeline with key events, graphs and so on. But I think this is a pretty neat base for visualizing, lets says, earthquakes of other catastrophes. And you gotta like a viz on black.