0

I have multiple tables and I need to aggregate the data from all of them, but it seems that I always get the wrong results for the sums. What am I doing wrong?

customers
ID      Name   
1       c1
2       c2
3       c3

budget
ID      Cust_ID     Value
1       1           100
2       1           300
3       2           600
4       3           450

forecast
ID      Cust_ID     Value
1       1           200
2       1           500
3       2           100
4       2           700
5       3           550

orders
ID      Cust_ID     Net_Sales
1       1           100
2       1           200
3       1           300
4       2           400
5       3           500

Here is the expected result:

ID      Name        sum(budget.Value)       sum(forecast.Value)     sum(orders.Net_Sales)       count(orders.ID)
1       c1          400                     700                     600                         3
2       c2          600                     800                     400                         1
3       c3          450                     550                     500                         1

And here's what I've tried so far:

SELECT customers.ID, customers.Name, sum(budget.Value), sum(forecast.Value), sum(orders.Net_Sales), count(orders.ID)
FROM customers
INNER JOIN budget ON budget.Cust_ID = customers.ID
INNER JOIN forecast ON forecast.Cust_ID = customers.ID
INNER JOIN orders ON orders.Cust_ID = customers.ID
GROUP BY customers.ID
ORDER BY customers.ID ASC
2
  • You can add sample data in this sqlfiddle.com fiddle. Commented Jul 7, 2021 at 15:19
  • 1
    Unless your cardinality is 1:1 for each of the tables, your sums are going to be multiplied by the cardinality of the other records. You either need to 1) Sum before you do the joins or 2) sum using an analytic. There are other options but those 2 are the most common. Commented Jul 7, 2021 at 15:21

1 Answer 1

3

You are joining along multiple dimensions, which multiplies the results.

A simple solution is correlated subqueries:

SELECT c.ID, c.Name, 
       (SELECT SUM(b.Value)
        FROM budget b
        WHERE b.Cust_ID = c.ID
       ) as budget,
       (SELECT SUM(f.Value)
        FROM forecast f
        WHERE f.Cust_ID = c.ID
       ) as forecast,
       (SELECT SUM(o.Net_Sales)
        FROM orders o
        WHERE o.Cust_ID = c.ID
       ) as net_sales
FROM customers c
ORDER BY c.ID ASC;

With the right indexes in the second tables (budget(cust_id, value), and so on), this may actually be faster than a JOIN approach.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.