I am trying to merge two CSV files that share a common id column and write the merged result to a new file. I have tried the following, but it gives me an error:
import csv
from collections import OrderedDict

filenames = "stops.csv", "stops2.csv"
data = OrderedDict()
fieldnames = []
for filename in filenames:
    with open(filename, "rb") as fp:  # python 2
        reader = csv.DictReader(fp)
        fieldnames.extend(reader.fieldnames)
        for row in reader:
            data.setdefault(row["stop_id"], {}).update(row)

fieldnames = list(OrderedDict.fromkeys(fieldnames))
with open("merged.csv", "wb") as fp:
    writer = csv.writer(fp)
    writer.writerow(fieldnames)
    for row in data.itervalues():
        writer.writerow([row.get(field, '') for field in fieldnames])
Both files have a "stop_id" column, but I get this error back: KeyError: 'stop_id'
Any help would be much appreciated.
Thanks
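One possible cause, which the question does not confirm, is that a header in one of the files is not exactly "stop_id" as the csv module sees it, for example because of a leading UTF-8 byte order mark (BOM) or stray whitespace. The sketch below is written for Python 3 (unlike the Python 2 code above) and uses an in-memory sample instead of the real files; the helper names read_header and clean_fieldnames are hypothetical. It shows how to print the raw header and normalise it.

```python
import csv
import io


def read_header(fp):
    """Return the first row of a CSV file object exactly as csv parses it."""
    return next(csv.reader(fp))


def clean_fieldnames(names):
    """Strip a leading BOM character and surrounding whitespace from each name."""
    return [name.lstrip("\ufeff").strip() for name in names]


# Simulated file whose header starts with a BOM (an assumption, for illustration)
sample = io.StringIO("\ufeffstop_id,stop_name\n1,Main St\n")
raw = read_header(sample)

print(repr(raw[0]))           # repr() makes the hidden character visible
print(clean_fieldnames(raw))  # header names after normalisation
```

If the repr() of the first header shows extra characters, then row["stop_id"] cannot match that key. In Python 3, opening the file with encoding="utf-8-sig" removes a UTF-8 BOM while reading; in Python 2 the BOM would show up as the bytes '\xef\xbb\xbf' at the start of the first field name.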
data.setdefault(row["stop_id"], {}).update(row) - why so complex? pandas.merge, see here: pandas.pydata.org/pandas-docs/stable/…
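The pandas.merge suggestion could look roughly like the sketch below. It assumes pandas is installed and uses small in-memory samples as stand-ins for stops.csv and stops2.csv; the column names other than stop_id are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical stand-ins for stops.csv and stops2.csv
stops = pd.read_csv(io.StringIO("stop_id,stop_name\n1,Main St\n2,Oak Ave\n"))
stops2 = pd.read_csv(io.StringIO("stop_id,zone\n1,A\n3,C\n"))

# how="outer" keeps ids that appear in either file, like the dict-based loop
merged = pd.merge(stops, stops2, on="stop_id", how="outer")

# With no path argument, to_csv returns the CSV text; missing values
# are written as empty fields by default
csv_text = merged.to_csv(index=False)
print(csv_text)
```

With real files, pd.read_csv("stops.csv") and merged.to_csv("merged.csv", index=False) would replace the in-memory pieces. One difference from the loop in the question: if both files share a non-key column, pandas.merge keeps both copies with _x and _y suffixes, whereas dict.update overwrites the earlier value.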