I want to concatenate two earthquake catalogs stored as pandas dataframes.
import pandas as pd
ISC = {'my_index': [0,2,3], 'date': ['2001-03-06', '2001-03-20', '2001-03-30'], 'magnitude': [4.7,4.7,4.9]}
df1 = pd.DataFrame(data=ISC).set_index('my_index')
USGS = {'my_index': [1,4],'date': ['2001-03-20', '2001-03-30'], 'magnitude': [4.8,5]}
df2 = pd.DataFrame(data=USGS).set_index('my_index')
Here is catalog 1 (df1):
my_index date magnitude
0 2001-03-06 4.7
2 2001-03-20 4.7
3 2001-03-30 4.9
And catalog 2 (df2):
my_index date magnitude
1 2001-03-20 4.8
4 2001-03-30 5.0
When I concatenate both dataframes with df3 = pd.concat([df1, df2], axis=1, join='outer'), this is what I get:
my_index date magnitude date magnitude
0 2001-03-06 4.7 NaN NaN
1 NaN NaN 2001-03-20 4.8
2 2001-03-20 4.7 NaN NaN
3 2001-03-30 4.9 NaN NaN
4 NaN NaN 2001-03-30 5.0
However, after concatenation, I would like quakes that happened on the same day to show up on the same row. This is my desired output:
index date magnitude date magnitude
0 2001-03-06 4.7 NaN NaN
1 2001-03-20 4.7 2001-03-20 4.8
2 2001-03-30 4.9 2001-03-30 5.0
Any idea how I can achieve this result?
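
For reference, pd.concat with axis=1 aligns rows on the index (my_index here), not on date, which is why events from the same day end up on different rows. One possible direction, offered as a sketch rather than a definitive answer, is an outer merge on the date column. The suffixes _isc and _usgs are names I picked for illustration, and the sketch assumes each catalog has at most one event per date. Note that it yields a single shared date column instead of the two shown in the desired output.

```python
import pandas as pd

ISC = {'my_index': [0, 2, 3],
       'date': ['2001-03-06', '2001-03-20', '2001-03-30'],
       'magnitude': [4.7, 4.7, 4.9]}
USGS = {'my_index': [1, 4],
        'date': ['2001-03-20', '2001-03-30'],
        'magnitude': [4.8, 5]}
df1 = pd.DataFrame(data=ISC).set_index('my_index')
df2 = pd.DataFrame(data=USGS).set_index('my_index')

# Outer merge on 'date': rows are matched by date, not by index.
# Dates present in only one catalog get NaN in the other catalog's column.
# The suffixes are illustrative names, not part of the original data.
df3 = pd.merge(df1, df2, on='date', how='outer', suffixes=('_isc', '_usgs'))

# Sort by date and rebuild a clean 0..n-1 index.
df3 = df3.sort_values('date').reset_index(drop=True)

print(df3)
```

If a catalog can contain several quakes on the same day, this merge would pair every ISC event with every USGS event for that day, so an extra matching criterion (for example, origin time) would be needed.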