I have two CSV files, each with more than 50,000 rows, and I want to find only the records that match between them. I have tried several pandas functions, but they all return only True/False values instead of the matching records.

How do I get only the matching column values between the two CSVs?

 df1 = id externalcode
       1   00
       2   00

 df2 = id externalcode
       1   00
       2   00


Any help would be appreciated. The code which I have tried is given below:

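import pandas as pd

data_frame1 = pd.read_csv("one.csv")
data_frame2 = pd.read_csv("two.csv")
print(type(data_frame1), type(data_frame2))
# Element-wise comparison: raises ValueError unless both Series are identically labeled.
result = data_frame1[data_frame1['id'] == data_frame2['id']]
# isin() returns a boolean Series (True/False per row), not the matching rows.
df1 = data_frame1['id'].isin(data_frame2['id'])
df2 = data_frame1['values_externalCode'].isin(data_frame2['values_externalCode'])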
  • Do you want matching rows or matching columns? Can you also add your desired output df? Commented Mar 12, 2020 at 10:42
  • I only want to match column values between the two data frames. Commented Mar 12, 2020 at 10:59

1 Answer

df1 = data_frame1[data_frame1['id'].isin(list(data_frame2['id']))]

I modified one of your attempts; please let me know if it works. The isin() call produces a True/False mask, and indexing data_frame1 with that mask keeps only the rows whose id also appears in data_frame2.
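For anyone who wants to try this without the original files, here is a minimal sketch. The sample data is made up and simply stands in for one.csv and two.csv; the column names are taken from the question's code. It shows the isin() mask approach, and, as an alternative that is not part of the answer above, an inner merge for the case where both id and values_externalCode must match.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for one.csv and two.csv.
csv_one = io.StringIO("id,values_externalCode\n1,A\n2,B\n3,C\n")
csv_two = io.StringIO("id,values_externalCode\n2,B\n3,X\n4,D\n")

data_frame1 = pd.read_csv(csv_one)
data_frame2 = pd.read_csv(csv_two)

# isin() returns a boolean Series; indexing with it keeps only the True rows.
mask = data_frame1["id"].isin(data_frame2["id"])
matching_ids = data_frame1[mask]

# An inner merge keeps rows where id AND values_externalCode both match.
matching_rows = data_frame1.merge(
    data_frame2, on=["id", "values_externalCode"], how="inner"
)

print(matching_ids)
print(matching_rows)
```

Note that the merge repeats a row once for every matching row in the other frame, so if data_frame2 contains duplicate key pairs you may want to call drop_duplicates() on it first.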
