Update 18 July 2018:
The whitepaper has now been released as an official Snowflake whitepaper so I’ve updated the download link to point to the document on the Snowflake website. The new whitepaper is much prettier and has been through editing to clean up all my bad writing habits. Thanks to Vincent Morello and Marta Bright in our content marketing team for all their help in making this happen.
As announced in my last post, since joining Snowflake I’ve been working on a whitepaper that provides best practice guidance for using Tableau with our built-for-the-cloud data warehouse.
Well, I’m pleased to report that it’s done. Or at least, done enough to release. You can download it from here:
https://resources.snowflake.net/ecosystem/best-practices-for-using-tableau-with-snowflake
I hope you find it useful, and please let me know if you have any feedback or corrections.
I made a minor update to the whitepaper based on a suggestion from Dan Cory at Tableau. There’s a new section on accessing semi-structured data elements via RAWSQL functions, which makes for a very agile approach. Note that there is currently a bug if you don’t pass a field to the RAWSQL function. The simple workaround is to pass in the VARIANT field even though the SQL fragment doesn’t reference it, so you would write:
RAWSQL_STR("V:city.name::string", V)
instead of just
RAWSQL_STR("V:city.name::string")
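If the path syntax in that SQL fragment is unfamiliar, here is a rough Python sketch of what V:city.name::string asks for: walk from the VARIANT value down through the city key to the name key, then cast the result to a string. The extract_path helper and the sample record are made up for illustration. They are not part of Tableau or Snowflake, and the real evaluation happens inside the warehouse.

```python
import json

def extract_path(variant, path):
    """Walk a dotted path through nested dicts (hypothetical helper).

    Returns None when any key along the path is missing, which is how
    this sketch models a NULL result.
    """
    current = variant
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current

# Made-up sample record standing in for one row of the VARIANT column V.
row = json.loads('{"city": {"name": "Sydney", "id": 1}}')

# Rough equivalent of V:city.name::string for this row.
value = extract_path(row, "city.name")
name = str(value) if value is not None else None

# A path that does not exist in the record yields None in this sketch.
missing = extract_path(row, "city.postcode")
```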
Enjoy!
Another minor update – the document originally suggested that you should keep VARIANT fields to ~8MB for performance reasons. This guidance has changed and is being updated in our online documentation. The document now reads:
Note that the maximum number of key-value pairs that will be columnarised for a single VARIANT column is 1000. If your semi-structured data has more than 1000 key-value pairs, you may benefit from spreading the data across multiple VARIANT columns. Additionally, each VARIANT entry is limited to a maximum size of 16MB of compressed data.
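One way to spread a wide record across several VARIANT columns is to chunk its keys before loading. The sketch below is only a minimal illustration under some assumptions: it counts top-level keys only (nested structures are not unpacked), the function name split_for_variant_columns is hypothetical, and the sample record is generated purely for the example.

```python
from itertools import islice

MAX_KEYS_PER_VARIANT = 1000  # the columnarisation limit quoted above

def split_for_variant_columns(record, max_keys=MAX_KEYS_PER_VARIANT):
    """Split a flat dict into a list of dicts with at most max_keys keys each.

    Assumption: only top-level keys are counted; nested objects stay intact.
    """
    items = iter(record.items())
    chunks = []
    while True:
        chunk = dict(islice(items, max_keys))
        if not chunk:
            break
        chunks.append(chunk)
    return chunks

# Generated sample record with 2500 top-level keys.
wide = {f"k{i}": i for i in range(2500)}

parts = split_for_variant_columns(wide)
sizes = [len(p) for p in parts]
```

Each resulting dict could then be serialised (for example with json.dumps) and loaded into its own VARIANT column.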