0

Question I've got a Redshift table which contains hierarchal data by month. I am trying to calculate sum for manager based on people under them but struggling to get the SQL working. Actual data contains multiple levels.

Example Sum of Bob's 'Person_actual' is some of all 'Person_actual' column for all the people reporting into him. Reading online I can see i've to use recursive query but can't get it working. Any guidance

Sample Data | Person_owner_id | Person_id | Person_target | Person_actual | Date_description |-----------------|-----------|---------------|---------------|----------------- | |Bob | | |Jan-23
| |Bob | | |Feb-23 | |Bob | | |Mar-23
| |Bob | | |Apr-23
| |Bob | | |May-23
| |Bob | | |Jun-23
| |Bob | | |Jul-23
| |Bob | | |Aug-23
| |Bob | | |Sep-23
| |Bob | | |Oct-23
| |Bob | | |Nov-23
| |Bob | | |Dec-23
|Bob |Chris |10 |9 |Jan-23
|Bob |Chris |10 |9 |Feb-23
|Bob |Chris |10 |7 |Mar-23
|Bob |Chris |10 |5 |Apr-23
|Bob |Chris |10 |6 |May-23
|Bob |Chris |10 |3 |Jun-23
|Bob |Chris |10 |2 |Jul-23
|Bob |Chris |10 | |Aug-23
|Bob |Chris |10 | |Sep-23
|Bob |Chris |10 | |Oct-23
|Bob |Chris |10 | |Nov-23 |Bob |Chris |10 | |Dec-23
|Bob |Ali |10 |9 |Jan-23
|Bob |Ali |10 |9 |Feb-23
|Bob |Ali |10 |7 |Mar-23
|Bob |Ali |10 |5 |Apr-23
|Bob |Ali |10 |6 |May-23
|Bob |Ali |10 |3 |Jun-23
|Bob |Ali |10 |2 |Jul-23
|Bob |Ali |10 | |Aug-23
|Bob |Ali |10 | |Sep-23
|Bob |Ali |10 | |Oct-23
|Bob |Ali |10 | |Nov-23 |Bob |Ali |10 | |Dec-23

Expected Output | Person_owner_id | Person_id | Person_target | Person_actual | Date_description |-----------------|-----------|---------------|---------------|----------------- | |Bob | |18 |Jan-23
| |Bob | |19 |Feb-23 | |Bob | |8 |Mar-23
| |Bob | |10 |Apr-23
| |Bob | |12 |May-23
| |Bob | |6 |Jun-23
| |Bob | |4 |Jul-23
| |Bob | | |Aug-23
| |Bob | | |Sep-23
| |Bob | | |Oct-23
| |Bob | | |Nov-23
| |Bob | | |Dec-23
|Bob |Chris |10 |9 |Jan-23
|Bob |Chris |10 |9 |Feb-23
|Bob |Chris |10 |7 |Mar-23
|Bob |Chris |10 |5 |Apr-23
|Bob |Chris |10 |6 |May-23
|Bob |Chris |10 |3 |Jun-23
|Bob |Chris |10 |2 |Jul-23
|Bob |Chris |10 | |Aug-23
|Bob |Chris |10 | |Sep-23
|Bob |Chris |10 | |Oct-23
|Bob |Chris |10 | |Nov-23 |Bob |Chris |10 | |Dec-23
|Bob |Ali |10 |9 |Jan-23
|Bob |Ali |10 |10 |Feb-23
|Bob |Ali |10 |1 |Mar-23
|Bob |Ali |10 |5 |Apr-23
|Bob |Ali |10 |6 |May-23
|Bob |Ali |10 |3 |Jun-23
|Bob |Ali |10 |2 |Jul-23
|Bob |Ali |10 | |Aug-23
|Bob |Ali |10 | |Sep-23
|Bob |Ali |10 | |Oct-23
|Bob |Ali |10 | |Nov-23 |Bob |Ali |10 | |Dec-23

2 Answers 2

0
--- Table definition
CREATE TABLE rstable (
  Person_owner_id TEXT,
  Person_id TEXT,
  Person_target INTEGER,
  Person_actual INTEGER,
  Date_description TEXT
);

--- Recreating data as insert statement
INSERT INTO rstable (Person_owner_id, Person_id, Person_target, Person_actual, Date_description) VALUES
('Bob', NULL, 18, 0, 'Jan-23'),
('Bob', NULL, 19, 0, 'Feb-23'),
('Bob', NULL, 8, 0, 'Mar-23'),
('Bob', NULL, 10, 0, 'Apr-23'),
('Bob', NULL, 12, 0, 'May-23'),
('Bob', NULL, 6, 0, 'Jun-23'),
('Bob', NULL, 4, 0, 'Jul-23'),
('Bob', NULL, NULL, 0, 'Aug-23'),
('Bob', NULL, NULL, 0, 'Sep-23'),
('Bob', NULL, NULL, 0, 'Oct-23'),
('Bob', NULL, NULL, 0, 'Nov-23'),
('Bob', NULL, NULL, 0, 'Dec-23'),
('Bob', 'Chris', 10, 9, 'Jan-23'),
('Bob', 'Chris', 10, 9, 'Feb-23'),
('Bob', 'Chris', 10, 7, 'Mar-23'),
('Bob', 'Chris', 10, 5, 'Apr-23'),
('Bob', 'Chris', 10, 6, 'May-23'),
('Bob', 'Chris', 10, 3, 'Jun-23'),
('Bob', 'Chris', 10, 2, 'Jul-23'),
('Bob', 'Chris', 10,NULL,'Aug-23'),
('Bob', 'Chris', 10,NULL,'Sep-23'),
('Bob', 'Chris', 10,NULL,'Oct-23'),
('Bob', 'Chris', 10,NULL,'Nov-23'),
('Bob', 'Chris', 10,NULL,'Dec-23'),
('Bob','Ali' ,10 ,9 ,'Jan-23' ),
('Bob','Ali' ,10 ,9 ,'Feb-23' ),
('Bob','Ali' ,10 ,7 ,'Mar-23' ),
('Bob','Ali' ,10 ,5 ,'Apr-23' ),
('Bob','Ali' ,10 ,6 ,'May-23' ),
('Bob','Ali' ,10 ,3 ,'Jun-23' ),
('Bob','Ali' ,10 ,2 ,'Jul-23' ),
('Bob','Ali' ,10,NULL,'Aug-23' ),
('Bob','Ali' ,10,NULL,'Sep-23' ),
('Bob','Ali' ,10,NULL,'Oct-23' ),
('Bob','Ali' ,10,NULL,'Nov-23' ),
('Bob','Ali' ,10,NULL,'Dec-23');

The query

WITH RECURSIVE t(
  Person_owner_id,
  Person_id,
  Person_target,
  Person_actual,
  Date_description,
  SUM_ACTUAL
) AS (
  SELECT
    Person_owner_id,
    Person_id,
    Person_target,
    Person_actual,
    Date_description,
    SUM(Person_actual) OVER (
      PARTITION BY Person_owner_id
      ORDER BY Date_description
      ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS SUM_ACTUAL
  FROM
    (
      SELECT
        Person_owner_id,
        Person_id,
        Person_target,
        Person_actual,
        Date_description
      FROM
        rstable
    ) t
  WHERE
    Person_owner_id IS NOT NULL
)
SELECT
  t.Person_owner_id,
  t.Person_id,
  t.Person_target,
  t.Person_actual,
  t.Date_description,
  t.SUM_ACTUAL
FROM
  t;

We first create a recursive CTE called t. It starts with the rows where the Person_owner_id is NULL. These rows represent the top-level managers. The query then calculates the cumulative sum of the Person_actual column for each manager and their subordinates. The query continues to iterate through the t CTE until there are no more rows and here's the output Query result

Sign up to request clarification or add additional context in comments.

1 Comment

Hi @PGzlan, thanks for the answer. However, it is not what I am after. As it shows in your output, you are summing Ali based on Chris as well. In our output, Ali & Chris should remain as it is as they do not have any subordinates but Bob should have a sum based on Ali & Chris. Example (for sum_actual): Bob | Jan 23 | 18 Ali | Jan 23 | 9 Chris Jan 23 | 9 Bob | Feb 23 | 18 Ali | Feb 23 | 9 Chris | Feb 23 | 9 Logic: For each month, sum the actual value for all the subordinates just for that month
0

I was able to get the results I wanted with following query:

SELECT
    t1.person_owner_id,
    t1.date_from,
    COALESCE(SUM(t2.person_actual), 0) AS sum_person_actual
FROM (
    SELECT DISTINCT person_owner_id, date_from
    FROM staging.rstable
) AS t1
CROSS JOIN (
    SELECT DISTINCT person_id
    FROM staging.rstable
    WHERE person_id IS NOT NULL
) AS subordinates
LEFT JOIN staging.rstable AS t2
    ON t1.person_owner_id = t2.person_owner_id
    AND t1.date_from = t2.date_from
    AND t2.person_id = subordinates.person_id
GROUP BY t1.person_owner_id, t1.date_from
ORDER BY t1.person_owner_id, t1.date_from;

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.