I am trying to combine two CSV files: I want to update the rows in the file of older data (old.csv) whenever a row with the same id exists in the file of newer data (new.csv). Both files have the same columns (headings), and each row is identified by a unique id.
old.csv
id,description,listing,url,default
2471582,spacex,536,www.spacex.com,0
3257236,alibaba,875,www.alibaba.com,0
3539697,ethihad,344,www.etihad.com,0
2324566,pretzel,188,www.example.com,1
new.csv
id,description,listing,url,default
2471582,spacex,888,www.spacex.com,0
3539697,ethihad,348,www.etihad.com,0
2324566,pretzel,396,www.pretzelshopexample12345.com,1
Here is what I have tried so far in Python with pandas:
import pandas as pd
f1 = pd.read_csv('old.csv', delimiter=',')
f2 = pd.read_csv('new.csv', delimiter=',')
with open('final.csv', 'w', encoding='utf-8', newline='') as out:
    pd.merge(f1, f2, on='id', how='left').to_csv(out, sep=',', index=False)
Current output:
id,description_x,listing_x,url_x,default_x,description_y,listing_y,url_y,default_y
2471582,spacex,536,www.spacex.com,0,spacex,888.0,www.spacex.com,0.0
3257236,alibaba,875,www.alibaba.com,0,,,,
3539697,ethihad,344,www.etihad.com,0,ethihad,348.0,www.etihad.com,0.0
2324566,pretzel,188,www.example.com,1,pretzel,396.0,www.pretzelshopexample12345.com,1.0
What I am trying to achieve:
id,description,listing,url,default
2471582,spacex,888,www.spacex.com,0
3257236,alibaba,875,www.alibaba.com,0
3539697,ethihad,348,www.etihad.com,0
2324566,pretzel,396,www.pretzelshopexample12345.com,1
So I was wondering how I can use pandas to merge the two CSV files on id so that a whole row is replaced when newer data for that id exists in new.csv, while the remaining rows from old.csv are kept. Thank you in advance for any help with this.
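For reference, here is a minimal sketch of one way this might be done, not necessarily the best one. It stacks the two frames with `pd.concat`, keeps the newer row for each id with `drop_duplicates`, and then restores the row order of old.csv with `reindex`. The sample data is inlined with `io.StringIO` so the snippet runs on its own, and the names `OLD_CSV`, `NEW_CSV`, `combined`, and `result` are illustrative only. It assumes ids are unique within each file, and that ids found only in new.csv should be dropped, since that case is not specified above. Under the replace-the-whole-row rule, id 3539697 takes listing 348 from new.csv.

```python
import io

import pandas as pd

# Inline copies of the two sample files, so the sketch is self-contained.
OLD_CSV = """id,description,listing,url,default
2471582,spacex,536,www.spacex.com,0
3257236,alibaba,875,www.alibaba.com,0
3539697,ethihad,344,www.etihad.com,0
2324566,pretzel,188,www.example.com,1
"""
NEW_CSV = """id,description,listing,url,default
2471582,spacex,888,www.spacex.com,0
3539697,ethihad,348,www.etihad.com,0
2324566,pretzel,396,www.pretzelshopexample12345.com,1
"""

old = pd.read_csv(io.StringIO(OLD_CSV))
new = pd.read_csv(io.StringIO(NEW_CSV))

# Stack new on top of old; for a duplicated id, keep="first" keeps the new row.
combined = pd.concat([new, old]).drop_duplicates(subset="id", keep="first")

# Restore the row order of old.csv.
# Assumption: ids that appear only in new.csv are dropped by this reindex.
result = combined.set_index("id").reindex(old["id"].tolist()).reset_index()

print(result.to_csv(index=False))
```

To write the file instead of printing it, the last line could be replaced with `result.to_csv('final.csv', index=False)`.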