I'm facing a challenge with Redshift: I'm trying to dynamically pivot rows into columns and aggregate by count. However, I noticed that the pivot table feature is only available from PostgreSQL 9 onward.

Any idea about how to do the following?

 index   fruit     color 
 1       apple     red           
 2       apple     yellow           
 2       banana    blue           
 2       banana    blue           
 3       banana    blue     
 3       banana    green     
 3       pear      green     
 3       pear      red           

to:

 index   red       yellow    blue    green 
 1       1         0         0       0
 2       0         1         2       0
 3       1         0         1       2

Essentially, I want to group by index and count the occurrences of each color (fruit is not so important, although I'll use it as a filter later).

Note: I might also want to do a binary transformation later on (i.e., 0 for 0 and 1 if > 0).

Edit: If the above is not possible, is there a way to do this instead?

 index   color     count     
 1       red       1        
 1       yellow    0           
 1       blue      0
 1       green     0 
 2       red       0        
 2       yellow    1           
 2       blue      2
 2       green     0
 3       red       1         
 3       yellow    0           
 3       blue      1
 3       green     2

(again, red, yellow, blue and green should be dynamic)

2 Answers

For the alternative layout in your edit, you could cross join the distinct indexes with the distinct colors and then left join the table back on:

-- Every (index, color) combination, including those with no rows
select i.index, c.color, count(t.index) as count
from (select distinct index from [table]) i
cross join (select distinct color from [table]) c
-- Unmatched combinations get a NULL t.index, which count() ignores, giving 0
left outer join [table] t
on t.index = i.index
and t.color = c.color
group by i.index, c.color
order by i.index, c.color

If PIVOT is not available in Redshift, then you could always just use a standard pivot query:

SELECT
    index,
    SUM(CASE WHEN color = 'red'    THEN 1 ELSE 0 END) AS red,
    SUM(CASE WHEN color = 'yellow' THEN 1 ELSE 0 END) AS yellow,
    SUM(CASE WHEN color = 'blue'   THEN 1 ELSE 0 END) AS blue,
    SUM(CASE WHEN color = 'green'  THEN 1 ELSE 0 END) AS green
FROM yourTable
GROUP BY index
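
The question also mentions a binary transformation (0 for 0, 1 if > 0). One way to get it from the same pattern is to replace SUM with MAX, since the maximum of a 0/1 flag is 1 as soon as one matching row exists. The sketch below is only an illustration: it runs the query against an in-memory SQLite database as a stand-in for Redshift, loads the sample rows from the question, and quotes "index" because SQLite treats it as a keyword. The function name `binary_pivot` is hypothetical.

```python
import sqlite3

# Sample rows from the question: (index, fruit, color).
ROWS = [
    (1, "apple", "red"),
    (2, "apple", "yellow"),
    (2, "banana", "blue"),
    (2, "banana", "blue"),
    (3, "banana", "blue"),
    (3, "banana", "green"),
    (3, "pear", "green"),
    (3, "pear", "red"),
]

# Same shape as the pivot query above, with MAX instead of SUM:
# each column is 1 if at least one row of that color exists, else 0.
BINARY_PIVOT = """
SELECT
    "index",
    MAX(CASE WHEN color = 'red'    THEN 1 ELSE 0 END) AS red,
    MAX(CASE WHEN color = 'yellow' THEN 1 ELSE 0 END) AS yellow,
    MAX(CASE WHEN color = 'blue'   THEN 1 ELSE 0 END) AS blue,
    MAX(CASE WHEN color = 'green'  THEN 1 ELSE 0 END) AS green
FROM yourTable
GROUP BY "index"
ORDER BY "index"
"""


def binary_pivot(rows):
    """Load rows into a throwaway SQLite table and run the binary pivot."""
    conn = sqlite3.connect(":memory:")  # stand-in for a Redshift connection
    conn.execute('CREATE TABLE yourTable ("index" INTEGER, fruit TEXT, color TEXT)')
    conn.executemany("INSERT INTO yourTable VALUES (?, ?, ?)", rows)
    result = conn.execute(BINARY_PIVOT).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in binary_pivot(ROWS):
        print(row)
```

Keeping SUM and wrapping each column in a `CASE WHEN ... > 0` test would work as well; MAX is just the shorter form.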

3 Comments

Thanks! But that means I have to know 'red', 'yellow', 'blue', etc. in advance? What if I don't?
Then you would need to use dynamic SQL for that, which does not appear to be supported in Redshift.
So dynamic SQL is not available in Redshift?
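
If dynamic SQL is not an option on the server side, a possible workaround is to build the query in client code: first fetch the distinct colors, then generate one SUM(CASE ...) column per color and run the result. The following is a minimal sketch of that two-step idea, not a tested Redshift recipe. The helper names `build_pivot_sql` and `pivot_counts` are hypothetical, the connection is assumed to behave like Python's `sqlite3` connection (used here only as a stand-in), and the table name is assumed to come from trusted code rather than user input.

```python
import sqlite3


def build_pivot_sql(table, colors):
    """Return a pivot query with one SUM(CASE ...) column per color."""
    select_list = ['    "index"']
    for color in colors:
        literal = color.replace("'", "''")  # escape quotes in the string literal
        alias = color.replace('"', '""')    # escape quotes in the column alias
        select_list.append(
            f"    SUM(CASE WHEN color = '{literal}' THEN 1 ELSE 0 END) AS \"{alias}\""
        )
    # The table name is interpolated as-is, so it must be trusted.
    return (
        "SELECT\n"
        + ",\n".join(select_list)
        + f'\nFROM {table}\nGROUP BY "index"\nORDER BY "index"'
    )


def pivot_counts(conn, table="yourTable"):
    """Discover the colors, then run the generated pivot query."""
    # Step 1: find out which colors exist right now.
    color_rows = conn.execute(
        f"SELECT DISTINCT color FROM {table} "
        "WHERE color IS NOT NULL ORDER BY color"
    ).fetchall()
    colors = [row[0] for row in color_rows]
    # Step 2: generate and execute the pivot for exactly those colors.
    cursor = conn.execute(build_pivot_sql(table, colors))
    header = [column[0] for column in cursor.description]
    return header, cursor.fetchall()


if __name__ == "__main__":
    # Demo against an in-memory SQLite table holding the question's rows.
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE yourTable ("index" INTEGER, fruit TEXT, color TEXT)')
    conn.executemany(
        "INSERT INTO yourTable VALUES (?, ?, ?)",
        [(1, "apple", "red"), (2, "apple", "yellow"), (2, "banana", "blue"),
         (2, "banana", "blue"), (3, "banana", "blue"), (3, "banana", "green"),
         (3, "pear", "green"), (3, "pear", "red")],
    )
    header, rows = pivot_counts(conn)
    print(header)
    for row in rows:
        print(row)
```

The columns come out in alphabetical order of color because of the ORDER BY in the first step. The trade-off is two round trips, and the set of columns can change between runs as new colors appear.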
