We have one table that stores the total presence time for each (period, volunteer) pair and another that logs actions. For each action we know which presence it was taken in, but we do not track the time spent on it. It is not a great data structure, but I cannot change it.

CREATE TABLE volunteer_presence (id integer, volunteer_id integer, minutes integer);
CREATE TABLE logged_actions (id integer, presence_id integer, section varchar);

What we would like to do is approximate the time spent on a given action by assuming time was split equally among all actions in a given presence. It's not great, but we are happy to at least have that. For example, let us say we have the following data:

INSERT INTO volunteer_presence (id, volunteer_id, minutes) values (1, 333, 400);
INSERT INTO volunteer_presence (id, volunteer_id, minutes) values (2, 444, 90);
INSERT INTO volunteer_presence (id, volunteer_id, minutes) values (3, 555, 80);

INSERT INTO logged_actions (id, presence_id, section) values (10, 1, 'Cats');
INSERT INTO logged_actions (id, presence_id, section) values (11, 1, 'Dogs');
INSERT INTO logged_actions (id, presence_id, section) values (13, 1, 'Cats');
INSERT INTO logged_actions (id, presence_id, section) values (14, 1, 'Cats');
INSERT INTO logged_actions (id, presence_id, section) values (15, 1, 'Front');
INSERT INTO logged_actions (id, presence_id, section) values (16, 2, 'Dogs');

We would expect the following result:

section presence_id estimated_minutes
Cats 1 240.0
Dogs 1 80.0
Front 1 80.0
Dogs 2 90.0

For example, Cats accounts for 3 of the 5 actions in presence 1, so its estimate is (3 / 5) * 400 = 240 minutes.
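
As a sanity check on the expected numbers, here is a minimal Python sketch of the equal-split estimate computed outside the database. The names `presence_minutes`, `actions`, and `estimate_minutes` are made up for illustration; the data is copied from the INSERT statements above.

```python
from collections import Counter

# Sample data copied from the INSERT statements above.
presence_minutes = {1: 400, 2: 90, 3: 80}  # presence id -> minutes
actions = [
    (1, "Cats"), (1, "Dogs"), (1, "Cats"),
    (1, "Cats"), (1, "Front"), (2, "Dogs"),
]  # (presence_id, section), one tuple per logged action


def estimate_minutes(presence_minutes, actions):
    """Split each presence's minutes equally among its logged actions."""
    per_presence = Counter(pid for pid, _ in actions)  # actions per presence
    per_section = Counter(actions)                     # actions per (presence, section)
    return {
        (section, pid): presence_minutes[pid] * count / per_presence[pid]
        for (pid, section), count in per_section.items()
    }


estimates = estimate_minutes(presence_minutes, actions)
# Print ordered by presence id, then section.
for (section, pid), minutes in sorted(estimates.items(), key=lambda kv: (kv[0][1], kv[0][0])):
    print(section, pid, minutes)
```

Presence 3 has no logged actions, so it produces no rows, which matches the expected result.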

The solution I came up with feels wrong, with two subqueries in the join clauses. I suspect it would not scale well.

SELECT  action_count_table.section,
        action_count_table.presence_id,
        ((action_count_table.total_action_count / presence_total_actions.action_count) * time_table.minutes) AS estimated_minutes
FROM (
  SELECT event.section AS section,
         volunteer_presence.id AS presence_id,
         SUM(event.action_count) AS total_action_count

  FROM (
    SELECT logged_actions.presence_id,
           logged_actions.section,
           COUNT(logged_actions.id) AS action_count
    FROM logged_actions
    GROUP BY logged_actions.presence_id, logged_actions.section
    ORDER BY logged_actions.presence_id, logged_actions.section
    ) AS event
  INNER JOIN volunteer_presence
  ON volunteer_presence.id = event.presence_id
  GROUP BY event.section, volunteer_presence.id
  ) AS action_count_table
INNER JOIN (
  SELECT DISTINCT logged_actions.presence_id AS presence_id,
                  volunteer_presence.minutes AS minutes
  FROM logged_actions
  INNER JOIN volunteer_presence
  ON volunteer_presence.id = logged_actions.presence_id
  ) AS time_table
ON action_count_table.presence_id = time_table.presence_id
INNER JOIN (
  SELECT logged_actions.presence_id,
         COUNT(logged_actions.id) AS action_count
  FROM logged_actions
  GROUP BY logged_actions.presence_id
  ORDER BY logged_actions.presence_id
  ) AS presence_total_actions
ON presence_total_actions.presence_id = time_table.presence_id

Is there a better solution, one that would scale better or at least be simpler?

1 Answer

Step-by-step demo: db<>fiddle

SELECT
    la.presence_id,
    la.section,
    1.0 * vp.minutes * COUNT(*) / la.total AS estimated_minutes  -- 4
FROM (
    SELECT 
        *,
        COUNT(*) OVER (PARTITION BY presence_id) as total    -- 1
    FROM logged_actions
) la
JOIN volunteer_presence vp ON vp.id = la.presence_id         -- 2
GROUP BY la.presence_id, la.section, la.total, vp.minutes    -- 3
ORDER BY la.presence_id, la.section
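
  1. The COUNT() window function adds the total number of actions per presence_id to each record.
  2. Join volunteer_presence to get the minutes for each presence.
  3. Group by presence_id and section. total and minutes are constant within a presence_id, so listing them in the GROUP BY does not split the groups further.
  4. Count the actions per section within each presence_id, divide by the total count per presence_id from step 1, and multiply by the minutes.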
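
To try the query without a database server, the sketch below replays the question's sample data in an in-memory SQLite database through Python's `sqlite3` module. It assumes an SQLite build with window-function support (3.25 or later); the question does not say which database is in use, so behaviour there may differ. The `1.0 *` factor and the `estimated_minutes` alias are there so that the division does not truncate when every operand is an integer and the column name matches the expected output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Schema and sample rows from the question.
conn.executescript("""
CREATE TABLE volunteer_presence (id integer, volunteer_id integer, minutes integer);
CREATE TABLE logged_actions (id integer, presence_id integer, section varchar);
INSERT INTO volunteer_presence VALUES (1, 333, 400), (2, 444, 90), (3, 555, 80);
INSERT INTO logged_actions VALUES
    (10, 1, 'Cats'), (11, 1, 'Dogs'), (13, 1, 'Cats'),
    (14, 1, 'Cats'), (15, 1, 'Front'), (16, 2, 'Dogs');
""")

QUERY = """
SELECT la.section,
       la.presence_id,
       1.0 * vp.minutes * COUNT(*) / la.total AS estimated_minutes
FROM (
    SELECT *, COUNT(*) OVER (PARTITION BY presence_id) AS total
    FROM logged_actions
) la
JOIN volunteer_presence vp ON vp.id = la.presence_id
GROUP BY la.presence_id, la.section, la.total, vp.minutes
ORDER BY la.presence_id, la.section
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)

# Why the 1.0 factor matters: with integer operands only, the division truncates.
truncated, exact = conn.execute("SELECT 100 * 1 / 3, 1.0 * 100 * 1 / 3").fetchone()
print(truncated, exact)
```

With the sample data every estimate happens to be a whole number, so the truncation would go unnoticed; it only shows up once a presence's minutes are not evenly divisible by its action count.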