1

Standard SQL

Hey everyone,

I have a big query table that approximately look like this

team_id, channel_id, value_1, value_2,...
A        1111        xxx      yyy
B        1111        xxx      yyy
C        2222        uuu      vvv
D        2222        uuu      vvv
E        3333        grr      sss

So as you can see, currently, everything is indexed by team_id while all the data that has the same channel_id is same. I need the data to be indexed by channel_id. The result should look like this:

team_id, channel_id, value_1, value_2,...
[A,B]    1111        xxx      yyy
[C,D]    2222        uuu      vvv
[E]      3333        grr      sss

Such that rows with same channel ID are reduced to one row, but the team_id is merged into an array. Is this possible? I am still quite new to SQL/BigQuery and don't know how to do this. I would be happy for some help!

Big thank in advance!

2 Answers 2

1

Below is for BigQuery Standard SQL

... all the data that has the same channel_id is same ...

So, below eliminates needs in explicitly listing all those columns so code the same for any number of those columns and their names

#standardSQL
SELECT ARRAY_AGG(team_id) AS team_id, ANY_VALUE(str).*
FROM (
  SELECT team_id,
    (SELECT AS STRUCT * EXCEPT(team_id) FROM UNNEST([t])) AS str
  FROM `project.dataset.table` t
)
GROUP BY TO_JSON_STRING(str)    

Depends on your coding preferences - you can use refactored version of above

#standardSQL
SELECT ARRAY_AGG(team_id) AS team_id, 
  ANY_VALUE((SELECT AS STRUCT * EXCEPT(team_id) FROM UNNEST([t]))).*
FROM `project.dataset.table` t
GROUP BY TO_JSON_STRING((SELECT AS STRUCT * EXCEPT(team_id) FROM UNNEST([t])))

In both cases, if to apply to sample data from your question - the output as below

Row team_id channel_id  value_1 value_2  
1   A       1111        xxx     yyy  
    B                
2   C       2222        uuu     vvv  
    D                
3   E       3333        grr     sss  
Sign up to request clarification or add additional context in comments.

Comments

1

I thin you just want aggregation:

select array_agg(team_id) as team_ids, channel_id, value_1, value_2
from t
group by channel_id, value_1, value_2;

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.