I have two data frames. df1 holds a 'grouped inventory' of items, grouped by the numerical values A, B and C. For each group there is a SUM column, which should reflect the total price of all the items I have of that particular type. Initially I have set the SUM column to zero.

df2 is a list of items I have with A, B, C and the price of the item.

df1 (Initial Inventory):

A     B     C   SUM
1     1     1    0
1     1     2    0
1     2     2    0
2     2     2    0

df2 (List of items):

A     B     C   PRICE
2     2     2    30
1     1     2    100
1     1     2    110
1     1     2    105

So my code should convert df1 into:

df1 (expected output):

A     B     C   SUM
1     1     1    0
1     1     2    315
1     2     2    0
2     2     2    30

Explanation: My list of items (df2) contains one item coded as 2,2,2, which has a value of 30, and three items coded as 1,1,2, which have values of 100 + 110 + 105 = 315. So I update the inventory table df1 to reflect that I have a total value of 30 for items coded 2,2,2 and a total value of 315 for items coded 1,1,2. I have 0 in value for items coded 1,1,1 and 1,2,2, since they aren't found in my items list.

What would be the most efficient way to do this?

I would rather not use loops since df1 is 720 rows and df2 is 10,000 rows.

1 Answer

You can merge on columns "A", "B", and "C" with how="left". The (A, B, C) keys in df2_sum below are a subset of those in df1, so a left merge keeps every row of df1, and fillna(0) fills in the groups that have no items.

df2_sum = df2.groupby(["A", "B", "C"])["PRICE"].sum().reset_index()

df1.merge(df2_sum, on=["A","B","C"], how="left").fillna(0)
    A   B   C   SUM PRICE
0   1   1   1   0   0.0
1   1   1   2   0   315.0
2   1   2   2   0   0.0
3   2   2   2   0   30.0

You can then add PRICE to your SUM column and drop the PRICE column.
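For completeness, here is a minimal end-to-end sketch of the steps above, using the sample data from the question. The names keys, merged and result are only illustrative, and casting back to int assumes the prices are whole numbers, as they are in the sample.

```python
import pandas as pd

# Inventory with SUM initialised to zero (sample data from the question).
df1 = pd.DataFrame({
    "A": [1, 1, 1, 2],
    "B": [1, 1, 2, 2],
    "C": [1, 2, 2, 2],
    "SUM": [0, 0, 0, 0],
})

# Individual items with their prices (sample data from the question).
df2 = pd.DataFrame({
    "A": [2, 1, 1, 1],
    "B": [2, 1, 1, 1],
    "C": [2, 2, 2, 2],
    "PRICE": [30, 100, 110, 105],
})

keys = ["A", "B", "C"]  # illustrative name for the grouping columns

# Total price per (A, B, C) group.
df2_sum = df2.groupby(keys, as_index=False)["PRICE"].sum()

# Left merge keeps every inventory row; groups with no items get NaN,
# which is replaced by 0 in the PRICE column only.
merged = df1.merge(df2_sum, on=keys, how="left").fillna({"PRICE": 0})

# Add the totals to SUM, cast back to int (assumes whole-number prices),
# and drop the helper column.
merged["SUM"] = (merged["SUM"] + merged["PRICE"]).astype(int)
result = merged.drop(columns="PRICE")

print(result)
```

Both the groupby and the merge are vectorised, so no explicit Python loop over the rows of df1 or df2 is needed.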
