I have 4 CSV files with thousands of lines each, so here is a small replica of them.

'zip','econ_risk_score'
'22011','5'

'zip','food_risk_score'
'22011','2'

'zip','healthlit_risk_score'
'22011','4'

'zip','housing_risk_score'
'22011','5'

My result table should look like this:

'zip','econ_risk_score','food_risk_score','healthlit_risk_score','housing_risk_score'
'22011','5','2','4','5'

My code so far is below, but I keep getting this error and can't seem to fix it:

merge() missing 1 required positional argument: 'right'

Please let me know your thoughts, thanks.

import pandas as pd

df1= pd.read_csv('econ_risk_zip.csv')
df2= pd.read_csv('food_risk_zip.csv')
df3= pd.read_csv('health_risk_zip.csv')
df4= pd.read_csv('housing_risk_zip.csv')

df = pd.merge([df1,df2,df3,df4], right_on = 'zip')
df.to_csv('risk_combined.csv')

2 Answers

You can cut down on the repetition with functools.reduce. Put the file names in a list, read each one, and merge the resulting DataFrames pairwise on 'zip'. If you need to add files later, you only have to extend the list.

from functools import reduce
import pandas as pd

files = ['econ_risk_zip.csv', 'food_risk_zip.csv',
         'health_risk_zip.csv', 'housing_risk_zip.csv']

df = reduce(lambda l,r: l.merge(r, on='zip'), [pd.read_csv(f) for f in files])
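
To check the approach without the real files, it can be run on the sample rows from the question. This is only a minimal sketch: the `samples` strings stand in for the four CSV files, `quotechar="'"` assumes the real files are single-quoted the way the question shows them, and `dtype=str` is an extra choice to keep zip codes as text.

```python
from functools import reduce
from io import StringIO

import pandas as pd

# Stand-ins for the four CSV files, copied from the sample rows in the question.
samples = [
    "'zip','econ_risk_score'\n'22011','5'\n",
    "'zip','food_risk_score'\n'22011','2'\n",
    "'zip','healthlit_risk_score'\n'22011','4'\n",
    "'zip','housing_risk_score'\n'22011','5'\n",
]

# quotechar="'" strips the single quotes; dtype=str keeps zip codes as text.
frames = [pd.read_csv(StringIO(s), quotechar="'", dtype=str) for s in samples]

# Fold the list into one frame, merging pairwise on the shared 'zip' column.
combined = reduce(lambda left, right: left.merge(right, on="zip"), frames)

# index=False leaves out the row index, matching the table in the question.
print(combined.to_csv(index=False))
```

With the real files, `combined.to_csv('risk_combined.csv', index=False)` would write the same layout to disk. Note that `merge` defaults to an inner join, so a zip missing from any one file is dropped from the result.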

You can merge the dataframes one by one:

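import pandas as pd

# The sample files wrap every field in single quotes, so pass quotechar="'".
df1 = pd.read_csv('econ_risk_zip.csv', quotechar="'")
df2 = pd.read_csv('food_risk_zip.csv', quotechar="'")
df3 = pd.read_csv('health_risk_zip.csv', quotechar="'")
df4 = pd.read_csv('housing_risk_zip.csv', quotechar="'")

# Each merge joins two frames on 'zip'; chaining the calls combines all four.
df = df1.merge(df2, on="zip").merge(df3, on="zip").merge(df4, on="zip")
df.to_csv('risk_combined.csv')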
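
As for the error itself: `pd.merge` takes exactly two frames, `left` and `right`, so a list passed as the first argument fills only `left` and Python complains that `right` is missing. A small sketch of the difference, using hypothetical one-row frames (`econ`, `food`) in place of the CSV files:

```python
import pandas as pd

# Hypothetical one-row frames standing in for two of the CSV files.
econ = pd.DataFrame({"zip": ["22011"], "econ_risk_score": ["5"]})
food = pd.DataFrame({"zip": ["22011"], "food_risk_score": ["2"]})

# Passing a list supplies only `left`, so Python raises TypeError for `right`.
try:
    pd.merge([econ, food], right_on="zip")
    failed = False
except TypeError:
    failed = True

# The two-frame form works; `on` names the key column shared by both frames.
pair = pd.merge(econ, food, on="zip")
print(failed, list(pair.columns))
```

Since both frames use the same key name, `on="zip"` is enough; `right_on` is only needed together with `left_on` when the key columns are named differently.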
