
I have a table with data structured like this. Each product ID has an element_id column holding a dictionary that maps each element name to a list of the IDs assigned to it. Not every element will have an ID on every product.

product_id element_id
product_1 {"FIRE": ["1630808"],"WATER": ["188028","234"],"SHADOW": ["213181"]}

For each product, I'd like to count how many IDs appear under each element, in a table like this:

product_id fire_count water_count shadow_count forest_count
product_1 1 2 1 0

I've tried using LATERAL FLATTEN with KEY and VALUE, but I'm getting duplicate results. Is there a cleaner way to write this type of query, especially since I also need a count of 0 where an element does not appear for a product?

My data is stored in Snowflake and I query it using Snowflake SQL.

Any advice? Thank you!

2 Answers

It can be achieved without flattening and aggregation:

CREATE OR REPLACE TABLE TAB(PRODUCT_ID, ELEMENT_ID) AS SELECT 'product_1',
{'FIRE':['1630808'],'WATER':['188028','234'],'SHADOW':['213181']};

SELECT
  PRODUCT_ID,
  ARRAY_SIZE(ELEMENT_ID:FIRE) AS FIRE_COUNT,
  ARRAY_SIZE(ELEMENT_ID:WATER) AS WATER_COUNT,
  ARRAY_SIZE(ELEMENT_ID:SHADOW) AS SHADOW_COUNT
FROM TAB;
/*
+------------+------------+-------------+--------------+
| PRODUCT_ID | FIRE_COUNT | WATER_COUNT | SHADOW_COUNT |
+------------+------------+-------------+--------------+
| product_1  |          1 |           2 |            1 |
+------------+------------+-------------+--------------+
*/
1 Comment

I'd also suggest wrapping each value with COALESCE(..., 0)
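
A minimal sketch of that suggestion, assuming the TAB table created in the answer above and assuming FOREST is the key behind the forest_count column the question asks for. ARRAY_SIZE returns NULL when the key is missing from the object, so COALESCE turns that into 0:

```sql
-- Sketch only: FOREST is an assumed key name; TAB comes from the answer above.
SELECT
  PRODUCT_ID,
  COALESCE(ARRAY_SIZE(ELEMENT_ID:FIRE),   0) AS FIRE_COUNT,
  COALESCE(ARRAY_SIZE(ELEMENT_ID:WATER),  0) AS WATER_COUNT,
  COALESCE(ARRAY_SIZE(ELEMENT_ID:SHADOW), 0) AS SHADOW_COUNT,
  -- The sample row has no FOREST key, so this path is NULL and falls back to 0.
  COALESCE(ARRAY_SIZE(ELEMENT_ID:FOREST), 0) AS FOREST_COUNT
FROM TAB;
```

For the sample row this should give 1, 2, 1 and 0. Note that the key after the colon is case-sensitive, so it has to match the stored key exactly.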

I'd do the following...

  • flatten just one level
  • use ARRAY_SIZE() to count the elements in the arrays
  • use conditional aggregation to pivot the results

SELECT
  src.product_id,
  -- f.key comparisons are case-sensitive, so match the keys as stored ('FIRE', not 'fire')
  SUM(CASE WHEN f.key = 'FIRE'   THEN ARRAY_SIZE(f.value) ELSE 0 END)   AS fire_count,
  SUM(CASE WHEN f.key = 'WATER'  THEN ARRAY_SIZE(f.value) ELSE 0 END)   AS water_count,
  SUM(CASE WHEN f.key = 'SHADOW' THEN ARRAY_SIZE(f.value) ELSE 0 END)   AS shadow_count
FROM
  your_table   AS src
CROSS JOIN LATERAL
  FLATTEN (
    INPUT => src.element_id,
    OUTER => TRUE 
  )
    AS f
GROUP BY
  src.product_id
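
To see why one level of flattening is enough, here is a rough sketch of the intermediate result before the pivot. It assumes the sample row from the question, built inline with PARSE_JSON so it runs without an existing table; the id_count alias is only for illustration.

```sql
-- Sketch only: the CTE stands in for your real table.
WITH your_table AS (
  SELECT
    'product_1' AS product_id,
    PARSE_JSON('{"FIRE": ["1630808"], "WATER": ["188028", "234"], "SHADOW": ["213181"]}') AS element_id
)
SELECT
  src.product_id,
  f.key,                            -- element name, e.g. 'FIRE'
  ARRAY_SIZE(f.value) AS id_count   -- number of IDs in that element's array
FROM
  your_table AS src,
  LATERAL FLATTEN(INPUT => src.element_id, OUTER => TRUE) AS f
ORDER BY
  f.key;
```

This returns one row per element key (FIRE 1, SHADOW 1, WATER 2) rather than one row per ID. Flattening a second level, down into the arrays, is what multiplies the rows and tends to produce the duplicates mentioned in the question.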
