
I have a DB2 table containing rows of customer data. This is used to determine if a customer saved money using our service. The older record is the purchase that triggered our process and the newer record is after they purchase the same product again. The desired result is to see one row containing their oldest paid amount, their newest amount and the difference between the two rows to verify they saved money.

The data is laid out like this

ID       Name       Product ID    Sale ID      First Paid     Last Paid
1        Mary       15            195          8              NULL
2        Mary       15            195          NULL           3
3        Bob        8             283          16             NULL
4        Bob        8             283          NULL           11

The desired result is this

Name     Sale ID    Product ID     First Paid  Last Paid    Savings     
Mary     195        15             8           3            5
Bob      283        8              16          11           5

This is what I get instead

Name    Sale ID    Product ID     First Paid     Last Paid     Savings   
Mary    195        15             8              NULL          8
Mary    195        15             NULL           3             -3
Bob     283        8              16             NULL          16
Bob     283        8              NULL           11            -11

The results of this query are used to drive a larger report so this is being generated as part of a subquery.

SELECT cost.name, cost.saleid, cost.productid, cost.saleid,
cost.firstpaid, cost.lastpaid, sum(cost.firstpaid - cost.lastpaid) as savings 
from (
    select distinct saleid, max(name) as name, max(productid) as productid, 
    max(firstpaid) as firstpaid, max(lastpaid) as lastpaid) as cost

I have found that my larger query works as intended, but the multiple rows returned by this innermost query have a negative impact on the results: customers are counted twice when they should be counted only once. Is there a way in DB2 to get these values into the same row, or will I need to pull back the results and filter them in PHP code rather than in the SQL query?

  • Did you intentionally mean to have 'cost.saleid' twice? Could that be causing the duplicate entries? Commented Apr 25, 2017 at 14:20

1 Answer


Assuming two rows per customer, then aggregation seems like the right approach:

select Name, SaleID, ProductID,
       sum(firstpaid) as firstpaid, sum(lastpaid) as lastpaid,
       sum(firstpaid) - sum(lastpaid) as savings
from t
group by Name, SaleID, ProductID;

This works for more than two rows. I'm not sure if you want sum() or min() or max() or avg() when there are additional rows.
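
Since the question says the result feeds a larger report as a subquery, here is a minimal sketch of the same aggregation written as a derived table. It assumes the table is named t (as above), that each sale ID belongs to exactly one customer and product, and that each group has a single non-NULL firstpaid and a single non-NULL lastpaid, so max() simply picks that value. None of this is confirmed by the question.

```sql
-- Sketch only: t and the column names are assumptions taken from the question.
-- The inner query collapses the two rows per sale into one; the outer query
-- computes the difference per row, so no aggregate is needed there.
select cost.name, cost.saleid, cost.productid,
       cost.firstpaid, cost.lastpaid,
       cost.firstpaid - cost.lastpaid as savings
from (
    select saleid,
           max(name)      as name,
           max(productid) as productid,
           max(firstpaid) as firstpaid,  -- aggregates skip NULLs
           max(lastpaid)  as lastpaid
    from t
    group by saleid
) as cost;
```

The key difference from the query in the question is the group by in the inner select; select distinct alone does not merge the two rows, because they differ in firstpaid and lastpaid.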


1 Comment

Be careful: sum(firstpaid) - sum(lastpaid) returns NULL if either sum(firstpaid) or sum(lastpaid) is NULL. You may want to use sum(ifnull(firstpaid, 0)), and the same for the other sum, instead ;)
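
A minimal sketch of that NULL-safe variant, assuming the table is named t as in the answer and that treating a missing amount as 0 is acceptable for the report. It uses coalesce, the standard SQL function, in place of ifnull, and wraps the sum rather than each value, which gives the same result here.

```sql
-- Sketch only: t and the column names are assumptions taken from the answer.
-- sum() skips NULLs, so it returns NULL only when every value in the group
-- is NULL; coalesce turns that case into 0.
select Name, SaleID, ProductID,
       coalesce(sum(firstpaid), 0) as firstpaid,
       coalesce(sum(lastpaid), 0)  as lastpaid,
       coalesce(sum(firstpaid), 0) - coalesce(sum(lastpaid), 0) as savings
from t
group by Name, SaleID, ProductID;
```

Note that with this substitution a customer who has no second purchase shows the full first payment as savings, so it may be preferable to exclude such groups, for example with having count(lastpaid) > 0.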
