You Can Never Be Too Hexy…

Some time ago I wrote a post about how to create hexagonal heat maps. Central to the technique is the need for a “path” field to plot the 6 vertices around the polygon marks. As this is a synthetic value it won’t exist in the source data, so we need to create it. The simplest way would be to just read the data multiple times, once for each PointID value:

SELECT …fields…, 1 as [PointID]
FROM Table
UNION
SELECT …fields…, 2 as [PointID]
FROM Table
UNION
SELECT …fields…, 3 as [PointID]
FROM Table
UNION
SELECT …fields…, 4 as [PointID]
FROM Table
UNION
SELECT …fields…, 5 as [PointID]
FROM Table
UNION
SELECT …fields…, 6 as [PointID]
FROM Table

Clearly this isn’t a great approach as it requires us to read the data 6 times, and for anything other than trivial volumes this becomes an issue. Another way is to simply generate the PointID value endpoints (1 and 6) and use a feature in Tableau called data densification to populate the interpolated values:

SELECT …fields…, 1 as [PointID]
FROM Table
UNION
SELECT …fields…, 6 as [PointID]
FROM Table

This is better as now we only need to read the data twice, but we’re still duplicating data. What if we could do it with just a single pass over the data? That would be even better…

One of my colleagues, Mary Solbrig, commented on the original blog post that if you have a field in your data that has at least 2 values within each hexbin, then you can synthesise the PointID field without multiple passes. At the time I nodded and thought to myself “yeah, that makes sense” but it wasn’t until someone asked me about this in more detail that I actually looked to see how it would be done.

With this approach, the PointID field is a calculation (not an output of custom SQL) and the secret is to use a LOD expression:

IF [ID] = {FIXED [Hex Lat],[Hex Lon] : MIN([ID])} THEN 1 ELSE 6 END

What this does is for each hexbin (uniquely identified by the calculated hexbin centroid) it produces two values – a 1 and a 6. It does this without needing to duplicate the data so it’s much faster and more efficient – both of which are highly desirable characteristics.

We can then build the densification scaffolding by creating a 1 unit wide bin over the PointID calculation, and then it’s the same approach as before to plot the viz:

2016-05-10_10-49-58.png

I’m surprised that it’s taken me this long to internalise this approach and to recognise the brilliance. So a belated thank you, Mary – it might be late but it is heartfelt.

You can download an example workbook for this here.

About Alan Eldridge

Hi. I'm Alan. By day I manage the APAC sales engineering team for Snowflake Computing. By night, I'm a caped crusader. Or sleeping. Most often it's sleeping.
This entry was posted in Uncategorized. Bookmark the permalink.

11 Responses to You Can Never Be Too Hexy…

  1. Mary Solbrig says:

    Thanks! And thanks again for all your great content as well, especially the great-arcs post. Absolutely excellent example for explaining densification strategies.

  2. cw226blog says:

    Thank you so much! This completely solved the issues I had: incremental extracts are now viable and as you said, it definitely takes much less time for it to calculate the viz. Your blog is amazingly helpful, and I look forward to learning more from you!

  3. ericksondata says:

    Great Job Alan! I just gave a presentation on this before your post. I will have to update them since Union is no longer needed!

  4. Bora Beran says:

    I used this trick in my January 2014 blog post “Using R forecasting packages from Tableau” but at the time we had not yet introduced LOD expressions so had to add max date using custom sql 😦

  5. Pingback: Use Tableau to Create Census Tract Map – Data & Web

  6. Pingback: Custom Map – Data & Web

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s