
I'm new to Python. I have two pandas DataFrames that are indexed differently, and I want to copy a column from one to the other. Dataframe 1 holds the id and the class that each image belongs to:

      ID  index  class
0  10472  10472      0
1   7655   7655      0
2   6197   6197      0
3   9741   9741      0
4   9169   9169      0

Dataframe 2 holds the image id in its index and the image data in the data column:

                                                    data
index                                                   
5882   [[[255, 255, 255, 0], [255, 255, 255, 0], [255...
360    [[[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0...
1906   [[[255, 255, 255, 0], [255, 255, 255, 0], [255...
3598   [[[255, 255, 255, 0], [232, 232, 247, 25], [34...
231    [[[255, 255, 255, 0], [234, 234, 234, 0], [57,...

I want to iterate through dataframe 1, pick up each image id, look up the matching id in the index of dataframe 2, and copy the 'data' column over to dataframe 1. How could I do this in a performant way?

1 Answer


First, matching requires the same dtype on both sides. If the dtypes differ, as here:

print (df1['index'].dtype)    
int64
print (df2.index.dtype)   
object

there are two possible solutions. Either convert the index to integers:

df2.index = df2.index.astype(int)

Or convert the column to strings:

df1['index'] = df1['index'].astype(str)
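As a quick check of why the dtypes matter, here is a minimal sketch on made-up frames. The ids are taken from the question, but the string payloads are placeholders I invented, not the real image data. It only shows that integer ids never match a string index until one side is converted:

```python
import pandas as pd

# Toy stand-ins for the frames in the question (payloads are hypothetical).
df1 = pd.DataFrame({"index": [10472, 7655], "class": [0, 0]})
df2 = pd.DataFrame(
    {"data": ["pixels_7655", "pixels_10472"]},
    index=pd.Index(["7655", "10472"], name="index"),  # string ids, dtype object
)

# Integer ids compared against string labels: nothing matches.
before = df1["index"].isin(df2.index)

# Convert the index to integers so both sides share a dtype.
df2.index = df2.index.astype(int)
after = df1["index"].isin(df2.index)

print(before.tolist(), after.tolist())
```

If the first check comes back all False on real data, a dtype mismatch like this is a likely cause.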

Then use map with the data column of df2, which looks up each value of df1['index'] in the index of df2:

df1['data'] = df1['index'].map(df2['data']) 
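For illustration, a small self-contained sketch of the map lookup. The frames and the placeholder strings standing in for the image arrays are assumptions of mine, and one id is deliberately missing from df2 to show what happens when there is no match:

```python
import pandas as pd

# Hypothetical miniature versions of the two frames.
df1 = pd.DataFrame(
    {"ID": [10472, 7655, 6197], "index": [10472, 7655, 6197], "class": [0, 0, 0]}
)
df2 = pd.DataFrame(
    {"data": ["pixels_7655", "pixels_10472"]},  # placeholders for image arrays
    index=pd.Index([7655, 10472], name="index"),
)

# Each value in df1['index'] is looked up in the index of df2['data'].
df1["data"] = df1["index"].map(df2["data"])

print(df1)
```

Rows whose id is absent from df2 (6197 here) end up with NaN in the new column, so an all-NaN result usually points back to the dtype mismatch described above.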

Or, if you need to add multiple columns from df2 (e.g. in real data), use join:

df1 = df1.join(df2, on=['index'])
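A minimal sketch of the join variant, assuming df2 carries a second column. The `width` column and the placeholder strings are hypothetical and only there to show several columns arriving at once:

```python
import pandas as pd

df1 = pd.DataFrame({"index": [10472, 7655], "class": [0, 0]})
df2 = pd.DataFrame(
    {
        "data": ["pixels_7655", "pixels_10472"],  # placeholders for image arrays
        "width": [32, 64],                        # hypothetical extra column
    },
    index=pd.Index([7655, 10472], name="index"),
)

# Match the 'index' column of df1 against the index of df2 (left join by default).
joined = df1.join(df2, on="index")

print(joined)
```

The row order and index of df1 are kept, and every column of df2 is appended in one call.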

6 Comments

I tried your first suggestion, but it returns NaN for all values.
What do print (df1.dtypes) and print (df2.index) return? Both sides need the same type, either integers or strings.
This is what is returned (in that order):

ID        int64
index     int64
class     int64
data     object
dtype: object

Index([u'0', u'1', u'10', u'100', u'1000', u'10000', u'10001', u'10002', u'10003', u'10004', ... u'9990', u'9991', u'9992', u'9993', u'9994', u'9995', u'9996', u'9997', u'9998', u'9999'], dtype='object', name=u'index', length=12533)
Sorry about the formatting, but it looks like the df2 index is of type u'index'
@Praveen - Added to answer also :) Thank you.