Beyond shape files…

[This is one of those moments when you realise you haven’t been seeing the big picture. Digging around the edges of a new concept you suddenly see the foundations are much deeper than you thought. So – hats off to our wonderful dev team for being several steps ahead…]

I finally had a few moments of spare time the other day, so I got to watching some internal training videos for Tableau 10.2. These particular videos are what we call WINK (what I need to know) training and are deep dive sessions on new features we have released. One of them immediately caught my eye with the following abstract:

“Extract API supports geospatial data”

Wait… what!?!

Sure enough – when I went looking I found one of the new features in 10.2 is that the extract API now supports the spatial data type. You can find more about this feature in the Tableau SDK Reference. The really cool part of this is that it’s super simple to use – all you have to do is insert spatial data in WKT format. This means you can easily fabricate your own spatial data or import it from a spatial database using a function like ST_AsText().
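
For example, a rough sketch of pulling WKT straight out of PostGIS with ST_AsText() and writing it into an extract might look like the following. This is not part of the SDK sample – the PostGIS table, column names and connection string are all hypothetical – but it shows the basic shape of the database route:

import psycopg2
from tableausdk.Types import Type
from tableausdk.Extract import ExtractAPI, Extract, TableDefinition, Row

ExtractAPI.initialize()
extract = Extract( 'regions.tde' )

# A simple two-column schema: region name plus its geometry
schema = TableDefinition()
schema.addColumn( 'Name', Type.CHAR_STRING )
schema.addColumn( 'Geometry', Type.SPATIAL )
table = extract.addTable( 'Extract', schema )

conn = psycopg2.connect( 'dbname=gis user=tableau' )            # hypothetical connection
cur = conn.cursor()
cur.execute( 'SELECT name, ST_AsText(geom) FROM regions' )      # geometry comes back as WKT text

for name, wkt in cur:
    row = Row( schema )
    row.setCharString( 0, name )
    row.setSpatial( 1, wkt )                                    # the WKT string goes straight in
    table.insert( row )

conn.close()
extract.close()
ExtractAPI.cleanup()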

It’s been a long time since I flexed my coding muscles but my Google-fu is mighty, so without too much hassle I was able to install Python, install the Tableau SDK and fiddle with the SDK sample. The code was easy (once I realised that indentation is apparently important in Python 🙂) and the relevant lines are the setSpatial() calls below:

# Insert Data
row = Row( schema )
row.setDateTime( 0, 2012, 7, 3, 11, 40, 12, 4550 )    # Purchased
row.setCharString( 1, 'Beans' )                        # Product
row.setString( 2, u'uniBeans' )                        # Unicode Product
row.setDouble( 3, 1.08 )                               # Price
row.setDate( 6, 2029, 1, 1 )                           # Expiration Date
row.setCharString( 7, 'Bohnen' )                       # Produkt
for i in range( 10 ):
    row.setInteger( 4, i * 10 )                        # Quantity
    row.setBoolean( 5, i % 2 == 1 )                    # Taxed
    inner = str( i * 3 )
    outer = str( i * 5 )
    # Spatial value in WKT: a closed square ring from (inner, inner) to (outer, outer)
    row.setSpatial( 8, "POLYGON ((" + inner + " " + inner + ", " + inner + " " +
                    outer + ", " + outer + " " + outer + ", " + outer + " " +
                    inner + ", " + inner + " " + inner + "))" )
    table.insert( row )

The result was this:

I was also able to load the file with mixed spatial types:

# Insert Data
row = Row( schema )
row.setDateTime( 0, 2012, 7, 3, 11, 40, 12, 4550 )    # Purchased
row.setCharString( 1, 'Beans' )                        # Product
row.setString( 2, u'uniBeans' )                        # Unicode Product
row.setDouble( 3, 1.08 )                               # Price
row.setDate( 6, 2029, 1, 1 )                           # Expiration Date
row.setCharString( 7, 'Bohnen' )                       # Produkt
for i in range( 10 ):
    row.setInteger( 4, i * 10 )                        # Quantity
    row.setBoolean( 5, i % 2 == 1 )                    # Taxed
    inner = str( i * 3 )
    outer = str( i * 5 )
    if i % 2 == 0:
        # Even rows get a point...
        row.setSpatial( 8, 'POINT(' + inner + ' ' + inner + ')' )
    else:
        # ...odd rows get a closed square polygon
        row.setSpatial( 8, 'POLYGON ((' + inner + ' ' + inner + ', ' + inner + ' ' +
                        outer + ', ' + outer + ' ' + outer + ', ' + outer + ' ' +
                        inner + ', ' + inner + ' ' + inner + '))' )
    table.insert( row )

Note that mixing spatial types isn’t fully supported in Tableau, but you can use them if you are careful not to mix types within the same mark. Get it wrong and you’ll see this:

Get it right and you’ll see this:

The result of all this is that we are not limited to just bringing in spatial data from spatial files – we can bring it from anywhere with a little bit of effort. This is very exciting and I look forward to seeing what you all create.


How to visualize polar projection data in Tableau

[This ice cap data is the gift that just keeps giving. Today’s guest post is courtesy of Sarah Battersby – Chief Crazy Map Lady in Residence – in which she explains how she weaves her dark magic.

And then she gloats. Fair enough – she earned it.]

The first step in attempting something like this is to wait for someone else to find a nice dataset for you to work with and call you out in their blog lamenting the challenges of working with polar data in a Web Mercator projected base map.

Then get to work.

The National Snow and Ice Data Center data is delivered in a polar stereographic projection. For the southern hemisphere data that I used, the projection was the NSIDC Sea Ice Polar Stereographic South. This projection takes our latitude and longitude coordinates out of angular units and puts them onto a nice, flat Cartesian plane, with the center of the projection (the South Pole) as our new 0,0 coordinate, and all locations plotted on a grid measured in meters away from the center.
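
To make that concrete, here is a tiny pyproj sketch (mine, not Sarah’s) showing where a couple of points land in EPSG:3412:

from pyproj import Transformer

to_polar = Transformer.from_crs( 'EPSG:4326', 'EPSG:3412', always_xy=True )

print( to_polar.transform( 0, -90 ) )   # the South Pole -> (0, 0), the new origin
print( to_polar.transform( 0, -70 ) )   # 70°S on the Greenwich meridian -> roughly 2.2 million metres from the pole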

Here is what that projection looks like (graphic from NSIDC)


And with this great dataset, founded on solid sea and ice data science, and represented in a carefully selected projection that is mathematically appropriate for the data…we will start doing some serious lying with data to bend it to our will.

1.   Projections are just mathematical transformations from angular coordinates to planar coordinates.  So, using an open source GIS (QGIS) we’ll tell our first lie: Set the coordinate system to Web Mercator.  Do not re-project into Web Mercator!  We just want the dataset to think it is in Web Mercator coordinates.


That essentially shifts the center of our projection from (0°, -90°) to (0°, 0°).  That’s right, we just moved the south pole to a spot right off the west coast of Africa.   I am already a little ashamed of myself, but, now I can show polar data in a system that uses Web Mercator.


But, there is a problem…my coordinates are still in “Web Mercator” meters.  While Tableau can work with shapefiles in non-latitude and longitude coordinates if there is a projection defined for the dataset, I still wanted to force the data back into latitude and longitude, so I then reprojected (or, perhaps un-projected) back to latitude and longitude using QGIS.
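
For anyone who would rather script these two steps than click through QGIS, here is a rough geopandas equivalent (Sarah used QGIS; the file names here are hypothetical):

import geopandas as gpd

ice = gpd.read_file( 'sea_ice_extent.shp' )              # delivered in EPSG:3412 (polar stereographic south)

# Lie #1: overwrite the CRS definition with Web Mercator WITHOUT touching the coordinates
ice = ice.set_crs( epsg=3857, allow_override=True )

# Then un-project those pretend Web Mercator metres back to plain lat/lon (WGS84)
ice = ice.to_crs( epsg=4326 )

ice.to_file( 'sea_ice_extent_relabelled.shp' )           # ready for Tableau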

2.   And there is an even bigger problem – I don’t want a base map that shows Antarctica at the equator!  I need a new custom base map…off to Mapbox I go.  With Mapbox I can style a new set of basemap tiles starting from a blank canvas.  That means I can lie about the coordinate system of all sorts of spatial files and have them show up in the same wrong location in the world!  I am totally going to lose my cartographic license for this…

I grabbed a dataset with boundaries of the world countries (in latitude and longitude) -> used QGIS to project to Polar Stereographic to match the real sea ice data (to get it in the right coordinate system) -> changed the projection definition to Web Mercator (introducing the lie to make it think it was really located at the equator) -> reprojected back to latitude and longitude.
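
The same round trip for the country boundaries looks something like this (again just a sketch, with hypothetical file names):

import geopandas as gpd

countries = gpd.read_file( 'world_countries.shp' )                 # plain lat/lon (EPSG:4326)
countries = countries.to_crs( epsg=3412 )                          # genuinely project to polar stereographic south
countries = countries.set_crs( epsg=3857, allow_override=True )    # the lie: pretend those metres are Web Mercator
countries = countries.to_crs( epsg=4326 )                          # un-project back to lat/lon for Mapbox
countries.to_file( 'world_countries_relabelled.shp' )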

I spent way too much time searching for some imagery to bling up the base map, and eventually found a nice geotiff (a tiff with geographic coordinates attached to it) from IBCSO. I then jumped through the same hoops – lie about the projection (it was originally polar stereographic, just like the sea ice data) and then reproject back to lat/lon.

Using Mapbox Studio I put all of the data onto a blank basemap and published it for use in Tableau.


3.   Load up the new tiles in Tableau, and then lock down the pan/zoom to hide the fact that this is a polar stereographic wearing Web Mercator clothing (where did the rest of those continents go???  That’s right, I deleted them because I found them inconvenient and ugly in my map…cartographic license at work!).


4.   In the battle of polar data:  Sarah – 1, Alan – 0


I am not worthy…

This is why it’s fun working with people who are smarter than me. Way, way smarter…

Less than 24 hours after I post about my issues with polar data, Crazy Map Lady Extraordinaire Sarah Battersby tears it up and produces this:

[Video: Little Polar]

In her own words:

NOAA polar ice files come in using a Polar Stereographic projection. I (ahem) just modified the definition to make it think it was Web Mercator. I reprojected into WGS84 to make it think it was lat/lon (which then places Antarctica roughly over the equator). First step down – data is in Tableau, ready to be analyzed.

To get a bit of context, I used the same projection trick with some continent data that I had lying around – the data round tripped from WGS84 data -> Polar Stereographic -> tell the data that it’s in Web Mercator (but it is really in Polar Stereographic) -> WGS84

I threw the bastardized projection version of the continents into Mapbox to run off some quick tiles.  Added them to Tableau, and… see video.

On Tableau Public here.

So not worthy…


The Importance of Projections

A few days ago I found a wonderful story about polar ice caps melting that led me to some wonderful data. I thought I could potentially make a viz that showed the changing extent of the sea ice at the poles – some line charts for the temporal view and a map with the shape files. I figured some animation would be even sexier.

I pulled the shape files down from the web and used Alteryx to union them together into a single file (+1 vote for shape file union). I loaded it into Tableau and drew my map but I got this:

WTF!?! Something is very broken. I decided to take a closer look at a single shape file:

Yep – definitely borked somewhere. My initial thought was that maybe there was a problem with the polygon data – that perhaps the polygons weren’t being closed properly. Because look here…

But surely not. I mean, these people are professionals. I’m sure their data is used all the time and an error like this would certainly be flagged. I downloaded and installed QGIS, pointed it at the file (and the Tableau tile service) and voila! One sea ice polygon:

BTW – check out how QGIS takes our Tile Service and projects it nicely. Very cool!

Anyhow – it turns out the problem is with the projection of the data. The shape file has the data in EPSG:3412 (NSIDC Sea Ice Polar Stereographic South). However, Tableau only understands WGS 84 latitude/longitude (which it then renders using Web Mercator), and so it is doing on-the-fly transformations. Here’s what happens when I transform the data into a Mercator projection in QGIS:

BOOM! Also borked. I’m no GIS expert (looks around for Sarah Battersby) but it looks like Mercator can’t handle polygons that cross the +/- 180 degree meridian (and the pole itself gets pushed out to infinity in Mercator). So until Tableau can support more projections, I’m going to have to park this project. Or, as Sarah suggested, map it to completely different coordinates somewhere else on the globe.

Learnin’ every day, folks.


Using GEOMETRY Fields in Calculations

In Tableau 10.2, we have a new data type that is read from spatial files – the GEOMETRY field. Right now, it would seem there is not much we can do directly with these fields other than display them.


The GEOMETRY field is presented as a measure object with a single aggregation function COLLECT(). This aggregation makes a group of polygons and/or points – GEOMETRY fields can contain both – act together as a collection (hence the name) based on the dimensions included in the viz. This means they are coloured, labelled, selected, highlighted, etc. as a single mark.

Right now there are no other built-in functions for GEOMETRY fields but we can use them in calculated fields. Here’s a simple, yet interesting application allowing us to dynamically select different levels of detail.

The Australian Bureau of Statistics (ABS) reports its data spatially via Statistical Areas (SA). There are multiple levels of detail in this model, from SA4 down to SA1 (and further down to mesh blocks). The boundary definitions for these areas are available from the ABS website in ESRI and MapInfo formats.

To bring this data together, we can download and join the shape files together in a single data source:


After some cleanup, the result is as follows – a GEOMETRY field sourced from each file, containing the boundaries of the associated SA level:


We can create a parameter that allows the user to select which level they would like to display:


We can use this parameter in a calculated field, returning a different GEOMETRY field based on the parameter value:

CASE [Select Level]
  WHEN "SA1" THEN [SA1 Geometry]
  WHEN "SA2" THEN [SA2 Geometry]
  WHEN "SA3" THEN [SA3 Geometry]
  WHEN "SA4" THEN [SA4 Geometry]
END

Double-clicking on this calculated GEOMETRY field and exposing the parameter allows the end user to display the required SA level dynamically. However, because there is no dimension in the viz, the COLLECT() aggregation makes all the polygons act as a single mark. To have each area act as a separate mark, we can use the parameter again to create a dynamic code dimension:

CASE [Select Level]
  WHEN "SA1" THEN [Sa1 7Dig16]
  WHEN "SA2" THEN [Sa2 Name16]
  WHEN "SA3" THEN [Sa3 Name16]
  WHEN "SA4" THEN [Sa4 Name16]
END

I look forward to finding more cool things to do with this new spatial capability in Tableau 10.2, and reading about your tricks as well.

Enjoy!


Points and Polygons in Tableau 10.2

I have previously written about how, as we continue to add new features to Tableau, we can develop better solutions to previously difficult problems. I’d like to present another example of this – plotting points and polygons together on a single Tableau map. In an earlier blog post I showed how we could present points and polygons together in Tableau; however, in Tableau 10.1 and earlier it was a complex process that required significant data preparation.

In Tableau 10.2 we are introducing a spatial file connector that will make it much easier to plot polygons and points together. You can now simply present your polygon data to Tableau in an ESRI shapefile, KML or MapInfo file format:


and Tableau can directly plot the polygons:


Note that this is a much less complex viz structure than in earlier versions of Tableau. We are not using a vertex list anymore – we simply double-click on the new Geometry object (the globe field in the Measures section) and Tableau takes care of the rest. In the viz above we have the (generated) latitude and longitude fields on the row/column shelves and the COLLECT(Geometry) measure on the detail shelf.

When we want to overlay points on this polygon map, we simply need to have our point data also presented in a spatial file format. For this example, I had raw lat/lon data in a CSV and using Alteryx I converted it to a spatial object. I also used some of Alteryx’s spatial matching features to tag each location with the name of the SA3 region in which it is contained (it will become clear later why this was done):

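As an aside, if you don’t have Alteryx handy, the same preparation can be sketched with geopandas – the file and column names below are hypothetical (SA3_NAME16 is the ABS attribute the SA3 name ends up in):

import pandas as pd
import geopandas as gpd

pts = pd.read_csv( 'locations.csv' )                         # raw data with 'lat' and 'lon' columns
points = gpd.GeoDataFrame(
    pts,
    geometry=gpd.points_from_xy( pts.lon, pts.lat ),
    crs='EPSG:4326' )

sa3 = gpd.read_file( 'SA3_2016_AUST.shp' )                   # ABS SA3 boundary file

# Spatial match: tag each point with the name of the SA3 region that contains it
points = gpd.sjoin( points, sa3[[ 'SA3_NAME16', 'geometry' ]],
                    how='left', predicate='within' )

points.to_file( 'locations_sa3.shp' )                        # a shapefile Tableau 10.2 can read directly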

Once again we can simply connect to the resulting shapefile:


and plot the points on the map – note the layout of the viz is the same as for the polygon map above with the (generated) lat/lon fields and the COLLECT(Geometry) field on details:


To combine the two spatial data sets we can use the new cross-data-source join feature introduced in Tableau 10 to join the two shapefiles:


Starting with the previous polygon map, we can CTRL-drag the longitude field on the column shelf to duplicate the map and swap the COLLECT(Geometry) field on the detail shelf of the second map to show the Geometry field from the point location data source. Set the mark type to a red circle and you will see the following:


If we make the map a dual axis map, we now have points and polygons shown together. Yay! This is a much simpler (and more useful) solution compared to the previous approach.


Finally, because both data sets have the SA3 name we can use this to highlight both a polygon and the set of points therein, allowing for interactions that previously were not easy to do:


So thanks, Tableau development team, for adding these new capabilities and making old, complex techniques obsolete.


Filled maps and low bandwidth connections

I regularly work from my home office which is where my demo server is also located. Having a gigabit Ethernet connection to the server means network bandwidth/latency isn’t something I pay any attention to most of the time. However, the other day I was doing a demo at a customer’s office (my laptop connected to the internet via my phone’s hotspot) and noticed that some of my workbooks were considerably slower to open than I remembered them to be. Not all, but certainly some – and with further investigation I narrowed it down to a couple of dashboards:


Notice anything in common? Yep – they both have filled maps. Could the filled maps have something to do with the slow performance?

Once I got home, I opened up the offending dashboards again and the response time was nice and quick. Not as quick as other workbooks (which can be almost instant) but still fast enough not to concern me. Thinking this was something to do with the network I used Chrome’s developer tools (press CTRL-SHIFT-I) to look at the network traffic for the first dashboard above. There was the answer:

Because we are using client-side rendering, the bootstrap package has to pass all the data to the browser so it can draw it locally. This includes the marks which for filled maps are complex polygons requiring much more data to describe than a simple line or bar chart. Consequently, the bootstrap payload for a dashboard with filled maps can be ~1000x the size of an equivalent bar chart or even symbol map!

To really test this, I created a simple workbook showing all 2647 postcodes in Australia. One view displayed the data as a simple tabular list, another used a symbol map and the third used a filled map. I published this workbook to my demo server and used the Chrome developer tools to record the bootstrap package size and the total data transfer to the browser. I also used the developer tools to throttle the network to simulate a 4G connection (a really neat tool!) and recorded the time to last byte. The results were:

  • Tabular list: bootstrap package = 2.6 KB; total transfer = 38 KB; total time = ~1.5 s
  • Symbol map: bootstrap package = 61.9 KB; total transfer = 182 KB; total time = ~3.5 s (including fetching map tiles)
  • Filled map: bootstrap package = a whopping 2.2 MB; total transfer = 2.3 MB; total time = ~10 s (including fetching map tiles)

Clearly, the use of filled maps requires a much larger data transfer to the client and therefore should be used sparingly in deployments where you have limited bandwidth. Or you could switch to server-side rendering – here are the results for the same workbook using the ?:render=no URL parameter – much faster!

  • Tabular list: bootstrap package = 2.6 KB; total transfer = 38 KB; total time = ~1.5 s
  • Symbol map: bootstrap package = 3 KB; total transfer = 111 KB; total time = ~2 s
  • Filled map: bootstrap package = 3 KB; total transfer = 139 KB; total time = ~2 s
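
For reference, the parameter just gets appended to the view URL – something like the line below (the server and view names are made up; only the ?:render=no part comes from the test above):

http://tableau.example.com/views/PostcodeMaps/FilledMap?:render=no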

Enjoy!
