4

I have a BigQuery table that includes an String array field. For some records, the array can hold duplicate string values.

Is is possible in a BigQuery UNNEST clause to filter out the duplicates so that the UNNEST only returns distinct array string values?

2 Answers 2

10

There are many ways to do this. Since you didn't specify a desired input and output, I'll arbitrarily choose one.

Usings ARRAY_AGG(DISTINCT):

WITH data AS (
  SELECT 1 id, ["a", "a", "b", "e", "a", "c", "b", "a"] strings
)


SELECT id, ARRAY_AGG(DISTINCT string) strings
FROM data, UNNEST(strings) string
GROUP BY id

enter image description here

Sign up to request clarification or add additional context in comments.

1 Comment

What about if I want the id shows in each row?
6

Below is for BigQuery Standard SQL

#standardSQL
SELECT * REPLACE(ARRAY(SELECT DISTINCT el FROM t.arr AS el) AS arr)
FROM `project.dataset.table` t    

You can test, play with above using dummy data as in below example

#standardSQL
WITH `project.dataset.table` AS (
  SELECT 1 col1, 2 col2, ["a", "a", "b", "e", "a", "c", "b", "a"] arr, 3 col3 UNION ALL
  SELECT 4, 5, ["x", "y", "z"], 5
)
SELECT * REPLACE(ARRAY(SELECT DISTINCT el FROM t.arr AS el) AS arr)
FROM `project.dataset.table` t   

with output

Row col1    col2    arr col3     
1   1       2       a   3    
                    b        
                    e        
                    c        
2   4       5       x   5    
                    y        
                    z        

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.