0

I have a input redshift dataset like the one below:

car_id | driver_id | salesman | tyre_id | price_min | price_max | price_avg
-------+-----------+----------+---------+-----------+-----------+----------
A      |         1 |        1 |       9 |         1 |         1 |         1
A      |         2 |        1 |       9 |         1 |         1 |         1
A      |         3 |        1 |       9 |         1 |         1 |         1
A      |         4 |        1 |       9 |         1 |         1 |         1
A      |         1 |        2 |       7 |         0 |         1 |         1
A      |         2 |        2 |       7 |         0 |         0 |         1
A      |         3 |        2 |       7 |         0 |         1 |         1
A      |         4 |        2 |       7 |         0 |         0 |         0

I would like to pivot multiple columns whose values may be dynamic in nature. I want to pivot the columns salesman, tyre_id with the values price_min, price_avg, price_max. Basically, each driver_id will have 1 single row.

I can achieve it in python but I need a query as my requirement.

The is the expected output: enter image description here

I tried concatenating salesman and tyreId. It partially works but I need help from the expert community for this to achieve fully. Can anyone help me and guide me through this?

This is the query I tried:

CREATE TABLE public.temp_car_id 
(
  car_id      VARCHAR(10),
  driver_id   INT,
  salesman    INT,
  tyre_id     INT,
  price_min   INT,
  price_max   INT,
  price_avg   INT
);

INSERT INTO public.temp_car_id
VALUES
('A',1,1,9,0.61,0.89,0.91),
('A',2,1,9,0.63,0.93,0.58),
('A',3,1,9,0.91,0.83,0.99),
('A',4,1,9,0.53,0.84,0.79),
('A',1,2,7,0.12,0.97,0.87),
('A',2,2,7,0.13,0.30,0.84),
('A',3,2,7,0.17,0.62,0.63),
('A',4,2,7,0.09,0.01,0.19);
     

SELECT car_id,
       driver_id,
       CONCAT('.',CONCAT(CONCAT(salesman,'.'), tyre_id)) AS sales_tyre,
       price_min,
       price_max,
       price_avg INTO #temp_car_id_unpivot
FROM public.temp_car_id;

SELECT *
FROM #temp_car_id_unpivot PIVOT (MAX(price_min) 
     FOR
     sales_tyre IN ('.1.9','.2.7'));

1 Answer 1

1

For the example you give I have some more general SQL below but this is for the static case you put in your example. Not sure if you are looking for a general / dynamic pivot functionality but if you are you will need to create some sort of SQL generator that takes the column value you want to pivot as input. This generator can be Redshift external code or a stored procedure but cannot be done in basic SQL on Redshift. The code below can be used as a template for use with such a SQL generator.

Set up (you had some decimal values defined as INT):

CREATE TABLE public.temp_car_id 
(
  car_id      VARCHAR(10),
  driver_id   INT,
  salesman    INT,
  tyre_id     INT,
  price_min   decimal(8,2),
  price_max   decimal(8,2),
  price_avg   decimal(8,2)
);

INSERT INTO public.temp_car_id
VALUES
('A',1,1,9,0.61,0.89,0.91),
('A',2,1,9,0.63,0.93,0.58),
('A',3,1,9,0.91,0.83,0.99),
('A',4,1,9,0.53,0.84,0.79),
('A',1,2,7,0.12,0.97,0.87),
('A',2,2,7,0.13,0.30,0.84),
('A',3,2,7,0.17,0.62,0.63),
('A',4,2,7,0.09,0.01,0.19);

SQL code:

select car_id, driver_id,
  min(avg_price_min_s1_t9) as avg_price_min_s1_t9,
  min(avg_price_max_s1_t9) as avg_price_max_s1_t9,
  min(avg_price_avg_s1_t9) as avg_price_mavg_s1_t9,
  min(avg_price_min_s2_t7) as avg_price_min_s2_t7,
  min(avg_price_max_s2_t7) as avg_price_max_s2_t7,
  min(avg_price_avg_s2_t7) as avg_price_mavg_s2_t7
from (
  SELECT car_id,
    driver_id,
    avg(case when tyre_id = 9 and salesman = 1 then price_min end) over( partition by car_id, driver_id, salesman, tyre_id)
        as avg_price_min_s1_t9,
    avg(case when tyre_id = 9 and salesman = 1 then price_max end) over( partition by car_id, driver_id, salesman, tyre_id)
        as avg_price_max_s1_t9,
    avg(case when tyre_id = 9 and salesman = 1 then price_avg end) over( partition by car_id, driver_id, salesman, tyre_id)
        as avg_price_avg_s1_t9,
    avg(case when tyre_id = 7 and salesman = 2 then price_min end) over( partition by car_id, driver_id, salesman, tyre_id)
        as avg_price_min_s2_t7,
    avg(case when tyre_id = 7 and salesman = 2 then price_max end) over( partition by car_id, driver_id, salesman, tyre_id)
        as avg_price_max_s2_t7,
    avg(case when tyre_id = 7 and salesman = 2 then price_avg end) over( partition by car_id, driver_id, salesman, tyre_id)
        as avg_price_avg_s2_t7
  FROM public.temp_car_id ) a
group by 1,2
order by 1,2;

It is unclear what aggregate function you want to use for combining these values as you only have 1 value per type so I used avg().

Sign up to request clarification or add additional context in comments.

2 Comments

Awesome. Thanks however, I am looking for dynamic column values. might need to find some other way. Anyways, thanks a lot
For dynamic pivoting you will need some non-SQL code. I like to use external code (like a Lambda function) to do this but a stored procedure can work. The above code is a fine template for expanding inside of this code layer

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.