Rapid heatmaps from large scale AIS shipping data

James Thompson

This map shows the density of shipping taking place around the UK in 2017, using different scales for the red, green and blue channels to increase the effective resolution in the extreme high and low density areas. It's based on a raw AIS dataset kindly shared by Cefas and Defra - about 300 million messages.

If you look closely you can see the shape of the sea floor, making the heatmap a rudimentary bathymetry map. This works because fish cluster in areas where the depth is dropping rapidly, and we pick up the fishing vessels that follow them.

This kind of image is really easy to make if you're willing to pull out some binary formats. Rendering time is just a few seconds, even with an output many thousands of pixels wide.
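
As a rough illustration of the per-channel scaling mentioned above, here is a minimal sketch that maps a per-pixel message count to an RGB triple, with each channel saturating at a different count. The function names and the three full-scale values (10, 1000, 100000) are assumptions chosen for illustration; they are not the scales used for the map.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>

// Scale a hit count linearly onto 0..255, saturating once count >= fullScale.
// fullScale must be non-zero.
inline std::uint8_t scaleChannel(std::uint32_t count, std::uint32_t fullScale) {
    const std::uint64_t v = std::uint64_t{count} * 255 / fullScale;
    return static_cast<std::uint8_t>(std::min<std::uint64_t>(v, 255));
}

// Hypothetical colour mapping: red resolves the sparse areas, green the
// mid-range, and blue only saturates in the very busiest pixels.
// The thresholds are illustrative assumptions, not the author's values.
inline std::array<std::uint8_t, 3> countToRgb(std::uint32_t count) {
    return {scaleChannel(count, 10),
            scaleChannel(count, 1000),
            scaleChannel(count, 100000)};
}
```

Because each channel covers a different range, a pixel that has already saturated in red can still show detail in green or blue.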

Preprocessing the data

First decode your AIS data, either with my aisutil project or directly with Kurt Schwehr's libais, and preprocess the output into the 'aismmf' format using one of the helper programs built in this project. This stores the positional data as a memory-mapped array of structs, organised into chunks according to which vessel sent the message. It can be traversed by standard pointer following, which is very fast. You can access the data from C++ using the C++ client.
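
To give a feel for why a memory-mapped array of structs is so quick to traverse, here is a minimal POSIX sketch. The `PositionRecord` layout and the `MappedRecords` class are hypothetical and are not the real aismmf format or its C++ client; the sketch also ignores the per-vessel chunking and only shows the mapping idea.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical fixed-size record; NOT the real aismmf layout.
struct PositionRecord {
    std::int32_t mmsi;
    std::int32_t timestamp;
    float        lat;
    float        lon;
};

// Read-only view over a file holding a flat array of PositionRecord.
class MappedRecords {
public:
    explicit MappedRecords(const std::string &path) {
        const int fd = ::open(path.c_str(), O_RDONLY);
        if (fd < 0) throw std::runtime_error("cannot open " + path);

        struct stat st {};
        if (::fstat(fd, &st) != 0) {
            ::close(fd);
            throw std::runtime_error("cannot stat " + path);
        }
        const std::size_t bytes = static_cast<std::size_t>(st.st_size);
        if (bytes == 0 || bytes % sizeof(PositionRecord) != 0) {
            ::close(fd);
            throw std::runtime_error("unexpected file size for " + path);
        }

        void *p = ::mmap(nullptr, bytes, PROT_READ, MAP_PRIVATE, fd, 0);
        ::close(fd);  // the mapping stays valid after the descriptor is closed
        if (p == MAP_FAILED) throw std::runtime_error("mmap failed for " + path);

        data_  = static_cast<const PositionRecord *>(p);
        bytes_ = bytes;
    }

    ~MappedRecords() { ::munmap(const_cast<PositionRecord *>(data_), bytes_); }

    MappedRecords(const MappedRecords &) = delete;
    MappedRecords &operator=(const MappedRecords &) = delete;

    std::size_t size() const { return bytes_ / sizeof(PositionRecord); }
    const PositionRecord *begin() const { return data_; }
    const PositionRecord *end() const { return data_ + size(); }

private:
    const PositionRecord *data_  = nullptr;
    std::size_t           bytes_ = 0;
};
```

With `begin()` and `end()` in place, a range-based `for` loop walks the records directly out of the page cache with no parsing step.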

Generating the PNG

It's crazily easy to make PNGs using Sean Barrett's stb_image_write.h single-file C library. The preamble docs make things pretty obvious: just allocate an array of uint8_t-holding structs and pass it to the stbi_write_png() function. I suggest using the described #define STBIWDEF static inline trick to avoid the need for a separate 'impl' translation unit. You might also want to write a trivial RAII-friendly wrapper for the array, with some bounds checking and a more composable interface.
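
One possible shape for that wrapper is sketched below. The `Rgb` and `Image` names and their interface are assumptions made for illustration, not code from this project.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Three tightly packed bytes per pixel, matching a 3-component PNG.
struct Rgb {
    std::uint8_t r, g, b;
};
static_assert(sizeof(Rgb) == 3, "Rgb must be tightly packed");

// Hypothetical bounds-checked pixel buffer; std::vector provides the RAII.
class Image {
public:
    Image(int width, int height) : width_(width), height_(height) {
        if (width <= 0 || height <= 0)
            throw std::invalid_argument("image dimensions must be positive");
        pixels_.assign(static_cast<std::size_t>(width) * height, Rgb{0, 0, 0});
    }

    // Bounds-checked pixel access; row 0 is the top of the image.
    Rgb &at(int x, int y) {
        if (x < 0 || x >= width_ || y < 0 || y >= height_)
            throw std::out_of_range("pixel outside image");
        return pixels_[static_cast<std::size_t>(y) * width_ + x];
    }

    int width() const { return width_; }
    int height() const { return height_; }
    int strideBytes() const { return width_ * static_cast<int>(sizeof(Rgb)); }
    const Rgb *data() const { return pixels_.data(); }

private:
    int              width_;
    int              height_;
    std::vector<Rgb> pixels_;
};
```

Writing the file is then a single call along the lines of `stbi_write_png("out.png", img.width(), img.height(), 3, img.data(), img.strideBytes())`, which returns non-zero on success.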

We don't need a clever projection to show the results, as this map isn't being used for navigation and people are quite used to the distortions, so we can use the dead simple equirectangular projection. This maps latitude and longitude linearly onto x, y positions on the page. From our perspective it's just parameterised on how many pixels you want per degree: convert lat/lon to y/x by adding 90/180 and multiplying. You can implement this as a trivial wrapper around your earlier image class. PNG compression handles large areas of constant colour well, so you don't even have to trim to a subset of the world when building your first version.
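
The projection described above can be sketched in a few lines. The `Equirectangular` and `PixelPos` names are hypothetical. Note one deliberate difference: the y axis is flipped here (90 minus latitude rather than latitude plus 90) on the assumption that row 0 is the top of the image, so north ends up at the top.

```cpp
#include <algorithm>
#include <cmath>
#include <optional>

struct PixelPos {
    int x;
    int y;
};

// Whole-world equirectangular projection, parameterised only on pixels per degree.
class Equirectangular {
public:
    explicit Equirectangular(double pixelsPerDegree) : ppd_(pixelsPerDegree) {}

    int width() const { return static_cast<int>(std::lround(360.0 * ppd_)); }
    int height() const { return static_cast<int>(std::lround(180.0 * ppd_)); }

    // Returns no value for out-of-range input. This also rejects NaN and the
    // AIS "not available" defaults of latitude 91 and longitude 181.
    std::optional<PixelPos> project(double lat, double lon) const {
        if (!(lat >= -90.0 && lat <= 90.0 && lon >= -180.0 && lon <= 180.0))
            return std::nullopt;
        const int x = static_cast<int>((lon + 180.0) * ppd_);
        const int y = static_cast<int>((90.0 - lat) * ppd_);  // north at the top
        // Clamp so that lon == 180 and lat == -90 land on the last column/row.
        return PixelPos{std::min(x, width() - 1), std::min(y, height() - 1)};
    }

private:
    double ppd_;
};
```

At 10 pixels per degree the whole world is 3600 by 1800 pixels, and each valid position becomes one pixel whose counter you increment.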

Who's fishing?

I've made a couple more maps to show the effect of fishing in the high gradient regions; they both use the same scale. The first shows only broadcasts from vessels self-describing as fishing vessels, and as expected is much more focussed on those regions. The second shows only vessels broadcasting as another type. Interestingly, there is still quite a lot of activity in the high gradient regions in the latter group, so it would be worth delving into who's spending all that time there.

Keeping the code simple

This work is much easier with a ranges library to hand. I usually avoid range-v3 since it's such a big dependency, so I wrote my own 'micro range' library that's tiny enough to drop into another project, even by copy-paste if the project using it is a single-file, header-only library. You can see it in use here, traversing a collection of vessel identity data and finding all the MMSIs that have broadcast as fishing type vessels (we also ignore MMSI 0 as tons of rubbish ends up there):

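    // Sorted, de-duplicated MMSIs of every vessel that has broadcast as a fishing vessel
    std::vector<int32_t> fishingMmsis = urange::make (identityDats)
        .filter ([&](const auto &e) {return e.mmsi != 0;})       // MMSI 0 collects rubbish
        .filter ([&](const auto &e) {return e.shiptype == 30;})  // AIS ship type 30 = fishing
        .map    ([&](const auto &e) {return e.mmsi;})
        .sort   ()
        .uniq   ()
        .vector ();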
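
For comparison, here is roughly the same query written against the standard library only. It is a sketch: `IdentityRecord` is a hypothetical stand-in for the real identity data type, and `collectFishingMmsis` is a name invented for this example.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the vessel identity records used above.
struct IdentityRecord {
    std::int32_t mmsi;
    int          shiptype;
};

// Returns the sorted, unique MMSIs that have broadcast AIS ship type 30 (fishing).
std::vector<std::int32_t> collectFishingMmsis(const std::vector<IdentityRecord> &records) {
    std::vector<std::int32_t> out;
    for (const auto &e : records) {
        if (e.mmsi != 0 && e.shiptype == 30)  // skip the junk MMSI 0 bucket
            out.push_back(e.mmsi);
    }
    std::sort(out.begin(), out.end());
    out.erase(std::unique(out.begin(), out.end()), out.end());  // needs sorted input
    return out;
}
```

The logic is identical, but the chained form reads top to bottom as a pipeline, which is the main appeal of a ranges-style interface.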