Tableau and Snowflake Boolean fields

Hello, gentle reader. It’s been a while. I hope you are well. 🙂

Today, I was helping a colleague with a problem that stems from the fact that the current version of the Snowflake driver in Tableau doesn’t correctly support Boolean fields and therefore returns data from Snowflake BOOLEAN fields as a 1/0 integer value. This means that the field cannot be directly used in a Boolean expression like:

A AND B

where A and B are both themselves Booleans. So I thought I’d do some digging to understand what was happening under the covers.

For this particular problem, I came up with four different solutions… combinations of using RAWSQL_BOOL to force the result to be interpreted as a Boolean, and simple integer comparisons that return a Boolean result:

  • Create individual Boolean fields for A and B using RAWSQL_BOOL(), then create a third field that ANDs them together.
    • Formula:
      • [A_BOOL] = RAWSQL_BOOL(“A”)
        [B_BOOL] = RAWSQL_BOOL(“B”)
        [A&B] = [A_BOOL] and [B_BOOL]
    • Produced the following SQL:
      • (CASE WHEN ((A) AND (B)) THEN 1 WHEN NOT ((A) AND (B)) THEN 0 ELSE NULL END)
  • Create a single field that embeds the AND into the RAWSQL_BOOL function. Less flexible than option 1 but maybe better performing?
    • Formula:
      • [A&B] = RAWSQL_BOOL(“A AND B”)
    • Produced the following SQL:
      • (CASE WHEN (A AND B) THEN 1 WHEN NOT (A AND B) THEN 0 ELSE NULL END)
  • Create individual Boolean fields for A and B using an integer comparison, then creating a third field that ANDs them together. My thought here was that integer queries are generally very fast so this might be faster than the RAWSQL approach…
    • Formula:
      • [A_INT] = [A] = 1
        [B_INT] = [B] = 1
        [A&B] = [A_INT] and [B_INT]
    • Produced the following SQL:
      • (CASE WHEN ((“BOOL_TEST”.”A” = 1) AND (“BOOL_TEST”.”B” = 1)) THEN 1 WHEN NOT ((“BOOL_TEST”.”A” = 1) AND (“BOOL_TEST”.”B” = 1)) THEN 0 ELSE NULL END)
  • Create a single field that does an integer comparison for each field and ANDs the results. To be honest, I didn’t expect much difference between this approach and option 3…
    • Formula:
      • [A&B] = [A]=1 AND [B]=1
    • Produced the following SQL:
      • (CASE WHEN ((“BOOL_TEST”.”A” = 1) AND (“BOOL_TEST”.”B” = 1)) THEN 1 WHEN NOT ((“BOOL_TEST”.”A” = 1) AND (“BOOL_TEST”.”B” = 1)) THEN 0 ELSE NULL END)

I put the four examples above into a Tableau worksheet and the following query was run in Snowflake:

SELECT (CASE WHEN (A AND B) THEN 1 WHEN NOT (A AND B) THEN 0 ELSE NULL END) AS “Calculation_ 1822761623506259971”,
(CASE WHEN ((A) AND (B)) THEN 1 WHEN NOT ((A) AND (B)) THEN 0 ELSE NULL END) AS “Calculation_ 1822761623506604036”,
(CASE WHEN ((“BOOL_TEST”.”A” = 1) AND (“BOOL_TEST”.”B” = 1)) THEN 1 WHEN NOT ((“BOOL_TEST”.”A” = 1) AND (“BOOL_TEST”.”B” = 1)) THEN 0 ELSE NULL END) AS “Calculation_ 1822761623507091461”,
(CASE WHEN ((“BOOL_TEST”.”A” = 1) AND (“BOOL_TEST”.”B” = 1)) THEN 1 WHEN NOT ((“BOOL_TEST”.”A” = 1) AND (“BOOL_TEST”.”B” = 1)) THEN 0 ELSE NULL END) AS “Calculation_ 1822761623510069255”,
SUM(“BOOL_TEST”.”C”) AS “sum:C:ok”
FROM “PUBLIC”.”BOOL_TEST” “BOOL_TEST”
GROUP BY 1,
2,
4,
3

Looking at the query profile, the two Boolean approaches produced the same expression in the query profile:

IFF(BOOL_TEST.A AND BOOL_TEST.B, 1, IFF(NOT(BOOL_TEST.A AND BOOL_TEST.B), 0, null ))

As did the two integer approaches:

IFF((BOOL_TEST.A = TRUE) AND (BOOL_TEST.B = TRUE), 1, IFF(NOT((BOOL_TEST.A = TRUE) AND (BOOL_TEST.B = TRUE)), 0, null))

Finally, I timed a series of tests against an admittedly small data volume (~100K rows) and the integer logic seemed to come out slightly faster on average – 140ms vs. 166ms over 12 iterations. I’m not convinced that the result is statistically valid, but read into it what you will.

If you want to have a play around with this yourself, here’s the create table statement:

CREATE OR REPLACE TABLE BOOL_TEST (
A BOOLEAN
, B BOOLEAN
, C INTEGER
);

The data for the table and the Tableau workbook can be found in my Dropbox.

Overall, I’m not sure if that helps but it certainly explains what was going on under the covers. It also gives some workaround options until the problem is fixed by Tableau.


Addendum:

The objective of this piece was to discuss how you could use Boolean values in Boolean logic formulae, however, there is one additional piece worth talking about… how to get the field to display as a Boolean true/false value? While the approaches above create fields that are true Booleans in Tableau, there is another approach that doesn’t require creating a calculation, and that is using an alias on the field. You can apply an alias of “true” to the numeric value of “1” and “false” to the numeric value of “0”. This would then display the alias value but the field is still a numeric for the purposes of calculations and such.

I just thought it was worth noting this additional option.

About Alan Eldridge

Hi. I'm Alan. By day I manage the APAC sales engineering team for Snowflake Computing. By night, I'm a caped crusader. Or sleeping. Most often it's sleeping.
This entry was posted in Uncategorized. Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s