I have a problem loading a dataframe from CSV when its MultiIndex contains more than one date level.
I am running the following code:
import pandas as pd
import datetime
date1 = datetime.date.today()
date2 = datetime.date.today().replace(month=1)
date_cols=['date1', 'date2']
index = pd.MultiIndex.from_product([[date1],[date2]])
#create dataframe with a single row
df= pd.DataFrame([{'date1':date1, 'date2':date2, 'a':1, 'b':2}])
df.set_index(date_cols, inplace=True)
# print the single row -> correct
print(df.loc[index])
# write to csv and load it again
df.to_csv('df.csv')
dfr = pd.read_csv('df.csv', parse_dates=date_cols, dayfirst=True)
dfr.set_index(date_cols, inplace=True)
# print the single row -> incorrect, shows NaN
print(dfr.loc[index])
I expected to get the same output, i.e. the single row in the dataframe, but the second print statement prints NaN because the lookup index is not found in the reloaded dataframe. When I inspect dfr.index, I see that the MultiIndex contains the two dates, but each value now also holds time information, with the time set to 00:00:00.
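A smaller check of the mismatch described above, as a sketch only. It assumes the cause is that the original index holds datetime.date objects while read_csv with parse_dates produces pandas Timestamp values. The fixed dates, the in-memory io.StringIO buffer (used instead of df.csv), and the names ts1, ts2, orig_key and reloaded_key are illustrative and not part of the code above.

```python
import datetime
import io

import pandas as pd

# Fixed dates so the example is reproducible (illustrative values).
date1 = datetime.date(2015, 6, 15)
date2 = datetime.date(2015, 1, 15)
date_cols = ['date1', 'date2']

df = pd.DataFrame([{'date1': date1, 'date2': date2, 'a': 1, 'b': 2}])
df = df.set_index(date_cols)

# Round-trip through CSV using an in-memory buffer instead of a file.
buf = io.StringIO()
df.to_csv(buf)
buf.seek(0)
dfr = pd.read_csv(buf, parse_dates=date_cols).set_index(date_cols)

# Compare the type of the index values before and after the round trip.
orig_key = df.index[0]
reloaded_key = dfr.index[0]
print(type(orig_key[0]), type(reloaded_key[0]))

# Look up the reloaded row with Timestamp keys instead of datetime.date keys.
ts1, ts2 = pd.Timestamp(date1), pd.Timestamp(date2)
row = dfr.loc[(ts1, ts2), :]
print(row)
```

If the assumption holds, building the lookup keys from pd.Timestamp rather than datetime.date should find the row in the reloaded dataframe.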
Is this a bug?