Question

How can I add a specific value from a column when using a GROUP BY statement and an aggregate function in my query?

Overview

This is a sample of my table:

id  | year | quarter | wage | comp_id | comp_industry |
123 | 2012 | 1       | 1000 | 456     | abc           |
123 | 2012 | 1       | 2000 | 789     | def           |
123 | 2012 | 2       | 1500 | 789     | def           |
456 | 2012 | 1       | 2000 | 321     | ghi           |
456 | 2012 | 2       | 2000 | 321     | ghi           |

To calculate the sum of each person's wages by year and quarter, I ran the following query:

SELECT id, year, quarter, SUM(wage) AS sum_wage
FROM t1
GROUP BY id, year, quarter;

This results in:

id  | year | quarter | sum_wage | 
123 | 2012 | 1       | 3000     |
123 | 2012 | 2       | 1500     |
456 | 2012 | 1       | 2000     |
456 | 2012 | 2       | 2000     |

Desired Output

I would like to update my query to include the comp_industry column where the individual's wage is highest for each quarter and year. I'm unsure where to start so that I only return the industry where folks made the most money for each quarter and year.

id  | year | quarter | sum_wage | comp_industry
123 | 2012 | 1       | 3000     | def
123 | 2012 | 2       | 1500     | def
456 | 2012 | 1       | 2000     | ghi
456 | 2012 | 2       | 2000     | ghi

I've taken a look at "Get value based on max of a different column grouped by another column" and "Fetch the row which has the Max value for a column", but I'm unsure where to go from there.

Any help or advice would be greatly appreciated!

  • Given your sample data, your query does not produce the result you show under "Results in". Commented Sep 7, 2018 at 17:23
  • @OtoShavadze - ah you are right! Thank you for catching that. Commented Sep 7, 2018 at 18:00

2 Answers

You can use the SUM and ROW_NUMBER window functions.

Number the rows within each (id, year, quarter) partition in descending order of wage, then keep only the rows where rn = 1.

Schema (PostgreSQL v9.6)

CREATE TABLE T (
  id INT,
  year INT,
  quarter INT,
  wage INT,
  comp_id INT,
  comp_industry VARCHAR(50)
);


INSERT INTO T VALUES (123 , 2012 , 1 , 1000 , 456    ,'abc');
INSERT INTO T VALUES (123 , 2012 , 1 , 2000 , 789    ,'def');
INSERT INTO T VALUES (123 , 2012 , 2 , 1500 , 789    ,'def');
INSERT INTO T VALUES (456 , 2012 , 1 , 2000 , 321    ,'ghi');
INSERT INTO T VALUES (456 , 2012 , 2 , 2000 , 321    ,'ghi');

Query #1

SELECT id, year, quarter, sum_wage, comp_industry
FROM (
  SELECT *,
         -- total wage per person, year, and quarter
         SUM(wage) OVER (PARTITION BY id, year, quarter) AS sum_wage,
         -- number the rows in each group, highest wage first
         ROW_NUMBER() OVER (PARTITION BY id, year, quarter ORDER BY wage DESC) AS rn
  FROM T
) t1
WHERE rn = 1;

| id  | year | quarter | sum_wage | comp_industry |
| --- | ---- | ------- | -------- | ------------- |
| 123 | 2012 | 1       | 3000     | def           |
| 123 | 2012 | 2       | 1500     | def           |
| 456 | 2012 | 1       | 2000     | ghi           |
| 456 | 2012 | 2       | 2000     | ghi           |
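Since the question asks specifically about GROUP BY, here is a hedged alternative that keeps a plain GROUP BY instead of window functions. It is only a sketch: it relies on PostgreSQL's ARRAY_AGG with an ORDER BY inside the aggregate, which is a swapped-in technique and not part of the query above, and it inlines the sample rows in a CTE named t (an illustrative name) so that it runs on its own.

```sql
-- Sample rows from the question, inlined so the sketch is self-contained.
WITH t (id, year, quarter, wage, comp_id, comp_industry) AS (
  VALUES (123, 2012, 1, 1000, 456, 'abc'),
         (123, 2012, 1, 2000, 789, 'def'),
         (123, 2012, 2, 1500, 789, 'def'),
         (456, 2012, 1, 2000, 321, 'ghi'),
         (456, 2012, 2, 2000, 321, 'ghi')
)
SELECT id,
       year,
       quarter,
       SUM(wage) AS sum_wage,
       -- collect the group's industries ordered by wage, highest first,
       -- and keep the first element
       (ARRAY_AGG(comp_industry ORDER BY wage DESC))[1] AS comp_industry
FROM t
GROUP BY id, year, quarter
ORDER BY id, year, quarter;
```

On the sample data this should return the same four rows as the table above. As with ROW_NUMBER, if two rows in a group tie on wage, which industry is returned is arbitrary unless a tie-breaker column is added to the ORDER BY.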

View on DB Fiddle

2 Comments

thank you for including the DB Fiddle example! This worked out great. I really appreciate you taking the time to help me.
@CristianE.Nuno No problem glad to help :)
I'm not 100% sure I understand the question, but does this do the trick for you?

SELECT id, 
       year, 
       quarter, 
       comp_industry, 
       SUM(wage)
  FROM (SELECT id, 
               year, 
               quarter,
               comp_industry, 
               wage
          FROM (SELECT TMP.*,
                       RANK() OVER
                         ( PARTITION BY id, 
                                        year, 
                                        quarter
                               ORDER BY wage_sum DESC         
                         ) wage_rnk
                  FROM (SELECT t1.*,
                               -- total wage per person, year, quarter, and industry
                               SUM(wage) OVER
                                 ( PARTITION BY id, 
                                                year, 
                                                quarter,
                                                comp_industry
                                 ) wage_sum
                        FROM t1
                       ) TMP
               ) TMP2
         WHERE wage_rnk = 1
       ) TMP3
 GROUP  
    BY id, 
       year, 
       quarter, 
       comp_industry;
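The question can be read two ways: "highest wage" may mean the single largest wage row, or the industry with the largest total wage in that quarter. A minimal sketch of the second reading follows, assuming the reported sum should still be the person's overall quarterly total. The CTE names t and per_industry and the alias industry_wage are illustrative, not from the original post, and the sample rows are inlined so the query runs on its own in PostgreSQL.

```sql
-- Sample rows from the question, inlined so the sketch is self-contained.
WITH t (id, year, quarter, wage, comp_id, comp_industry) AS (
  VALUES (123, 2012, 1, 1000, 456, 'abc'),
         (123, 2012, 1, 2000, 789, 'def'),
         (123, 2012, 2, 1500, 789, 'def'),
         (456, 2012, 1, 2000, 321, 'ghi'),
         (456, 2012, 2, 2000, 321, 'ghi')
),
-- Step 1: total wage per person, year, quarter, and industry.
per_industry AS (
  SELECT id, year, quarter, comp_industry, SUM(wage) AS industry_wage
  FROM t
  GROUP BY id, year, quarter, comp_industry
)
SELECT id, year, quarter, sum_wage, comp_industry
FROM (
  SELECT id, year, quarter, comp_industry,
         -- overall total for the person in that year and quarter
         SUM(industry_wage) OVER (PARTITION BY id, year, quarter) AS sum_wage,
         -- rank industries by their total; RANK keeps ties
         RANK() OVER (PARTITION BY id, year, quarter
                      ORDER BY industry_wage DESC) AS wage_rnk
  FROM per_industry
) ranked
WHERE wage_rnk = 1
ORDER BY id, year, quarter;
```

Because RANK assigns the same rank to ties, a person with two industries tied for the top total in a quarter would get one row per tied industry. Swapping in ROW_NUMBER would force a single row, chosen arbitrarily among the ties.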

1 Comment

Thank you for your answer! I'm very new to writing queries so I appreciate you helping me.
