I have multiple CSV files in a folder. I want to loop over every file ending in .csv and merge them all into one Excel file. Here are two example files:
first.csv
     date    a    b
0  2019.1  1.0  NaN
1  2019.2  NaN  2.0
2  2019.3  3.0  2.0
3  2019.4  3.0  NaN
second.csv
     date    c     d
0  2019.1  1.0   NaN
1  2019.2  5.0   2.0
2  2019.3  3.0   7.0
3  2019.4  6.0   NaN
4  2019.5  NaN  10.0
...
My desired output looks like this, with the files merged on date:
        date    a    b    c     d
0  2019/1/31  1.0  NaN  1.0   NaN
1  2019/2/28  NaN  2.0  5.0   2.0
2  2019/3/31  3.0  2.0  3.0   7.0
3  2019/4/30  3.0  NaN  6.0   NaN
4  2019/5/31  NaN  NaN  NaN  10.0
I have written the following code, but the parts that convert the date and merge the DataFrames are clearly incorrect:
import numpy as np
import pandas as pd
import glob
dfs = pd.DataFrame()
for file_name in glob.glob("*.csv"):
    # print(file_name)
    df = pd.read_csv(file_name, engine='python', skiprows=2, encoding='utf-8')
    df = df.dropna()
    df = df.dropna(axis=1)
    df['date'] = pd.to_datetime(df['date'], format='%Y.%m')
...
dfs = pd.merge(df1, df2, on = 'date', how= "outer")
# save the data frame
writer = pd.ExcelWriter('output.xlsx')
dfs.to_excel(writer,'sheet1')
writer.save()
Please help me. Thank you.
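For reference, one possible approach is sketched below. It is not a definitive solution: it assumes each file is a plain comma-separated CSV with a header row containing a date column (no rows to skip, no saved index column) and that value column names do not overlap between files. The helper names load_monthly_csv and merge_csv_folder are hypothetical. The date column is read as text so that a value like 2019.10 is not parsed as the float 2019.1, and MonthEnd(0) moves each date to the last day of its month.

```python
import glob
import os
from functools import reduce

import pandas as pd
from pandas.tseries.offsets import MonthEnd


def load_monthly_csv(path):
    """Read one CSV and convert its 'date' column (e.g. 2019.1) to month-end dates."""
    # Assumption: a plain CSV whose header row includes a 'date' column.
    # Read 'date' as text so that 2019.10 is not collapsed to the float 2019.1.
    df = pd.read_csv(path, dtype={"date": str})
    # Parse year.month, then roll forward to the last day of that month.
    df["date"] = pd.to_datetime(df["date"], format="%Y.%m") + MonthEnd(0)
    return df


def merge_csv_folder(folder="."):
    """Outer-merge every *.csv in `folder` on 'date' and sort by date."""
    paths = sorted(glob.glob(os.path.join(folder, "*.csv")))
    if not paths:
        raise FileNotFoundError("no CSV files found in " + folder)
    frames = [load_monthly_csv(p) for p in paths]
    # Fold the list pairwise; how="outer" keeps dates present in only some files.
    merged = reduce(
        lambda left, right: pd.merge(left, right, on="date", how="outer"),
        frames,
    )
    return merged.sort_values("date").reset_index(drop=True)
```

The result could then be written with merge_csv_folder(".").to_excel("output.xlsx", index=False), which needs an Excel engine such as openpyxl installed. In recent pandas versions ExcelWriter no longer has a save() method, so calling to_excel with a file name (or using ExcelWriter as a context manager) avoids that call.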
merge, so no result so far. For concatenation, this link is useful: stackoverflow.com/questions/56033013/…