Some time ago I wrote a post about how to create hexagonal heat maps. Central to the technique is the need for a “path” field to plot the 6 vertices around the polygon marks. As this is a synthetic value it won’t exist in the source data, so we need to create it. The simplest way would be to just read the data multiple times, once for each PointID value:
SELECT …fields…, 1 as [PointID]
FROM Table
UNION
SELECT …fields…, 2 as [PointID]
FROM Table
UNION
SELECT …fields…, 3 as [PointID]
FROM Table
UNION
SELECT …fields…, 4 as [PointID]
FROM Table
UNION
SELECT …fields…, 5 as [PointID]
FROM Table
UNION
SELECT …fields…, 6 as [PointID]
FROM Table
Clearly this isn’t a great approach as it requires us to read the data 6 times, and for anything other than trivial volumes this becomes an issue. Another way is to simply generate the PointID value endpoints (1 and 6) and use a feature in Tableau called data densification to populate the interpolated values:
SELECT …fields…, 1 as [PointID]
FROM Table
UNION
SELECT …fields…, 6 as [PointID]
FROM Table
This is better as now we only need to read the data twice, but we’re still duplicating data. What if we could do it with just a single pass over the data? That would be even better…
One of my colleagues, Mary Solbrig, commented on the original blog post that if you have a field in your data that has at least 2 values within each hexbin, then you can synthesise the PointID field without multiple passes. At the time I nodded and thought to myself “yeah, that makes sense” but it wasn’t until someone asked me about this in more detail that I actually looked to see how it would be done.
With this approach, the PointID field is a calculation (not an output of custom SQL) and the secret is to use a LOD expression:
IF [ID] = {FIXED [Hex Lat],[Hex Lon] : MIN([ID])} THEN 1 ELSE 6 END
What this does is for each hexbin (uniquely identified by the calculated hexbin centroid) it produces two values – a 1 and a 6. It does this without needing to duplicate the data so it’s much faster and more efficient – both of which are highly desirable characteristics.
We can then build the densification scaffolding by creating a 1 unit wide bin over the PointID calculation, and then it’s the same approach as before to plot the viz:
I’m surprised that it’s taken me this long to internalise this approach and to recognise the brilliance. So a belated thank you, Mary – it might be late but it is heartfelt.
You can download an example workbook for this here.
Thanks! And thanks again for all your great content as well, especially the great-arcs post. Absolutely excellent example for explaining densification strategies.
Thanks to you, Mary! I’m just kicking myself that it took me so long to realise how important that LOD was!
Thank you so much! This completely solved the issues I had: incremental extracts are now viable and as you said, it definitely takes much less time for it to calculate the viz. Your blog is amazingly helpful, and I look forward to learning more from you!
Great Job Alan! I just gave a presentation on this before your post. I will have to update them since Union is no longer needed!
But you still need at least 2 points right? This is not as much of a problem really. I had a low-hex filter on my dashboard anyways. I can just force 2 to be the minimum.
Yes – you must have a minimum of two points to hold the start and end markers. You then use densification to create the intermediate points.
Alan, This is cool. If I add something like “total(count(id))” onto the shelf and compute across it does populate the missing hexagons. However if I tweak the calculation to do it by the hexagon (to color by the total within that hexagon) the missing hexagons disappear. You can see at least 3 hexes (1 top left and bottom left there are 2 on the outskirts)
https://public.tableau.com/views/DataDense_93_singlepass/ThreeMethods?:embed=y&:display_count=yes&:showTabs=y
Yeah, I just verified it full. The union method is no longer needed. Totaling across generates the needed ‘6’ somehow. Dual axis that with a total at the point ID colors it correctly.
https://public.tableau.com/views/PolygonHexSinglePass/Dashboard1
I used this trick in my January 2014 blog post “Using R forecasting packages from Tableau” but at the time we had not yet introduced LOD expressions so had to add max date using custom sql 😦
Pingback: Use Tableau to Create Census Tract Map – Data & Web
Pingback: Custom Map – Data & Web