How to unnest an array column in Snowflake database into multiple columns

Question

How can I unnest a VARIANT (array) column in Snowflake into multiple columns?

The table is named event and the column is named user; the column is of type VARIANT.


    [
      {
        "key": "user_id",
        "value": {
          "set_timestamp_micros": 1621804433449213,
          "string_value": "auth0|6094a88b602505006f20fc0e"
        }
      },
      {
        "key": "env",
        "value": {
          "set_timestamp_micros": 1621804433445213,
          "string_value": "staging"
        }
      },
      {
        "key": "first_open_time",
        "value": {
          "int_value": 1620248400000,
          "set_timestamp_micros": 1620245124142213
        }
      }
    ]

My objective is to transpose it like this:

    user_id                           env
    auth0|6094a88b602505006f20fc0e    staging

I tried the FLATTEN function, but it does not work as I expected.

Snowflake is not BigQuery so I fixed the tags.

– Gordon Linoff, Commented May 24, 2021 at 0:43

Simeon Pilgrim · Accepted Answer · 2021-05-24 01:16:33Z


FLATTEN on your JSON gives you access to the three sub-objects of the array, but you want to access two of them by name. If you have sets of these values/objects in your data, and they are all related via set_timestamp_micros, you could PIVOT after FLATTEN, or you could use MAX, like this:

-- data_table and its json column stand in for your own table and VARIANT column.
SELECT f.value:value:set_timestamp_micros::number as set_timestamp_micros
    ,max(iff(f.value:key = 'env', f.value:value:string_value::text, null)) as env
    ,max(iff(f.value:key = 'user_id', f.value:value:string_value::text, null)) as user_id
    ,max(iff(f.value:key = 'first_open_time', f.value:value:int_value::number, null)) as first_open_time
FROM data_table AS dt,
 TABLE(FLATTEN(input=> dt.json)) f
GROUP BY set_timestamp_micros
ORDER BY set_timestamp_micros;
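
If the goal is one output row per source row, as in the question, grouping by set_timestamp_micros may produce several rows, because each key in the sample carries its own timestamp. A minimal sketch of one alternative groups on the SEQ column that FLATTEN emits for each input record. The CTE here is a hypothetical stand-in for data_table, with the sample JSON trimmed to the two string-valued keys:

```sql
-- Hypothetical stand-in for data_table; the JSON is the question's sample,
-- trimmed to the two string-valued keys.
WITH data_table AS (
    SELECT PARSE_JSON('[
      {"key": "user_id", "value": {"string_value": "auth0|6094a88b602505006f20fc0e"}},
      {"key": "env",     "value": {"string_value": "staging"}}
    ]') AS json
)
SELECT
    -- For each key, keep its string_value and ignore every other array element.
    MAX(IFF(f.value:key::string = 'user_id', f.value:value:string_value::string, NULL)) AS user_id,
    MAX(IFF(f.value:key::string = 'env',     f.value:value:string_value::string, NULL)) AS env
FROM data_table AS dt,
     LATERAL FLATTEN(input => dt.json) AS f
-- f.seq is shared by all elements flattened from the same input row.
GROUP BY f.seq;
```

On a real table, replace the CTE with your own table and VARIANT column. Grouping by a primary key, if the table has one, is an equally reasonable choice.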


Mike Walton · Accepted Answer · 2021-05-24 01:32:21Z

Flatten just gives you access to the elements of the array. Since the form of the JSON is key-value as separate attributes, you'll need to pivot after you flatten:

WITH x AS (
    SELECT parse_json('    [
      {
        "key": "user_id",
        "value": {
          "set_timestamp_micros": 1621804433449213,
          "string_value": "auth0|6094a88b602505006f20fc0e"
        }
      },
      {
        "key": "env",
        "value": {
          "set_timestamp_micros": 1621804433445213,
          "string_value": "staging"
        }
      },
      {
        "key": "first_open_time",
        "value": {
          "int_value": 1620248400000,
          "set_timestamp_micros": 1620245124142213
        }
      }
    ]') as var
  ), z AS (
   SELECT y.value:key::string as key, y.value:value:string_value::string as value
   from x,
   lateral flatten(input=>var) y
  )
SELECT "'user_id'" as user_id, "'env'" as env
FROM z
PIVOT (MAX(value) FOR key IN ('user_id','env')) AS TMP;
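
The z CTE in this query reads only string_value, so a key such as first_open_time, which stores its data in int_value, would pivot to NULL. A sketch of one way to include it, assuming it is acceptable to return every value as a string, is to COALESCE the two fields before pivoting (JSON trimmed for brevity):

```sql
WITH x AS (
    SELECT PARSE_JSON('[
      {"key": "user_id",         "value": {"string_value": "auth0|6094a88b602505006f20fc0e"}},
      {"key": "env",             "value": {"string_value": "staging"}},
      {"key": "first_open_time", "value": {"int_value": 1620248400000}}
    ]') AS var
), z AS (
    SELECT y.value:key::string AS key,
           -- Take string_value when present, otherwise int_value cast to a string.
           COALESCE(y.value:value:string_value::string,
                    y.value:value:int_value::string) AS value
    FROM x,
         LATERAL FLATTEN(input => var) y
)
SELECT "'user_id'" AS user_id, "'env'" AS env, "'first_open_time'" AS first_open_time
FROM z
PIVOT (MAX(value) FOR key IN ('user_id', 'env', 'first_open_time')) AS tmp;
```

If the numeric type matters downstream, cast first_open_time back to a number in the final SELECT.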
