2

I'm having a problem with pandas. I have a dataframe with three columns: 'id1','id2','amount'.

From this, I would like to create another dataframe which index is 'id1', which columns is 'id2', and the cells contain the corresponding 'amount'.

Let's go for an example:

import pandas as pd
df = pd.DataFrame([['first_person','first_item',10],['first_person','second_item',6],['second_person','first_item',18],['second_person','second_item',36]],columns = ['id1','id2','amount'])

which yields:

     id1              id2             amount
0    first_person     first_item      10
1    first_person     second_item     6
2    second_person    first_item      18
3    second_person    second_item     36

And from this I would like to create a second dataframe which is:

                 first_item    second_item
first_person     10            6
second_person    18            36

Of course, before posting I've worked on it for a time, but all I've managed to do for this is a double 'for loop'... Which for the size of my dataframes is nowhere to be computable. Would you know how to do this in a more pythonic way? (which would obviously be far more efficient than 'for' loops!)

1 Answer 1

4

I think you can use pivot with rename_axis (new in pandas 0.18.0):

print df
             id1          id2  amount
0   first_person   first_item      10
1   first_person  second_item       6
2  second_person   first_item      18
3  second_person  second_item      36

print df.pivot(index='id1', columns='id2', values='amount')
        .rename_axis(None)
        .rename_axis(None, axis=1)

               first_item  second_item
first_person           10            6
second_person          18           36
Sign up to request clarification or add additional context in comments.

5 Comments

It seems to fit perfectly what I need! Execpt for that renaming index problem. Indeed, the '.rename_axis(None)' method returns the following error: 'must pass an index to rename' (I am under Python 2.7 if it matters).
And what is version of pandas? print pd.show_versions()
0.17.1 which explains why the rename_axis doesn't work.
If you can update, use df.index.name = None and df.columns.names = None
Without updating I found on another post: stackoverflow.com/questions/29765548/… This works perfectly for erasing index and columns names. Thank you for the pivot trick, it helps me a lot!

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.