-1

I have the two tables below: One table is Orders and the other table is a Master table. Both are dataframes in Pandas, Python.

Orders
-------------
Date  | item_id
--------------
672022| 123
672022| 976
672022| 532
--------------

Master Table
----------------------------------------------
item_id | Description | Supplier | Ship_From
----------------------------------------------
234     | Oranges     | CWF      | NY
341     | Nuts        | DVR      | NJ
532     | Grapes      | ETT      | CT
123     | Apples      | ERH      | CT
976     | Raspberry   | HKQ      | NY
731     | Bread       | FBE      | NJ
-----------------------------------------------

I want my final table to look like the below:

-------------------------------
Date  | item_id | Description
-------------------------------
672022| 123     | Apples
672022| 976     | Raspberry 
672022| 532     | Grapes
-------------------------------

I am unable to get the desired output in a simple way What is the best way to write joins code in Python? I am using the below code:

new = pd.merge(orders, master, on="item_id", how="left")
new = new[['Date', 'item_id', 'description']]
new.drop_duplicates(subset=None, keep='first', inplace=True)
2
  • I am getting many rows when I do a right or left join because my master table is huge. I need a simple table with item_id and description Commented Jun 7, 2022 at 16:55
  • I get exactly the result you want with exactly your code. I am not sure what the problem is. Commented Jun 7, 2022 at 17:10

1 Answer 1

1

This should work:

import pandas as pd
import io

data1 = '''Date  | item_id
--------------
672022| 123
672022| 976
672022| 532'''

data2 = '''item_id | Description | Supplier | Ship_From
234     | Oranges     | CWF      | NY
341     | Nuts        | DVR      | NJ
532     | Grapes      | ETT      | CT
123     | Apples      | ERH      | CT
976     | Raspberry   | HKQ      | NY
731     | Bread       | FBE      | NJ'''

df1 = pd.read_csv(io.StringIO(data1), sep='\s*\|\s*', engine='python')
df2 = pd.read_csv(io.StringIO(data2), sep='\s*\|\s*', engine='python')
df1.merge(df2[['item_id', 'Description']], on='item_id', how='inner')

output:

Date item_id Description
0 672022 123 Apples
1 672022 976 Raspberry
2 672022 532 Grapes
Sign up to request clarification or add additional context in comments.

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.