Pandas - convert dataframe multi-index to datetime object

Question

Consider an input file, b.dat:

string,date,number
a string,2/5/11 9:16am,1.0
a string,3/5/11 10:44pm,2.0
a string,4/22/11 12:07pm,3.0
a string,4/22/11 12:10pm,4.0
a string,4/29/11 11:59am,1.0
a string,5/2/11 1:41pm,2.0
a string,5/2/11 2:02pm,3.0
a string,5/2/11 2:56pm,4.0
a string,5/2/11 3:00pm,5.0
a string,5/2/14 3:02pm,6.0
a string,5/2/14 3:18pm,7.0

I can group monthly totals like so:

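import pandas as pd

b = pd.read_csv('b.dat')
b['date'] = pd.to_datetime(b['date'], format='%m/%d/%y %I:%M%p')
b.index = b['date']
# group on the year and month of the DatetimeIndex; sum only the numeric column
bg = b.groupby([b.index.year, b.index.month])
bgs = bg[['number']].sum()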

The index of the grouped totals looks like:

bgs

            number
2011 2       1
     3       2
     4       8
     5      14
2014 5      13

bgs.index

MultiIndex(levels=[[2011, 2014], [2, 3, 4, 5]],
       labels=[[0, 0, 0, 0, 1], [0, 1, 2, 3, 3]])

I'd like to convert the index to a datetime format (the day can be the first of the month).

I've tried the following:

bgs.index = pd.to_datetime(bgs.index)

and

bgs.index = pd.DatetimeIndex(bgs.index)

Both fail. Does anyone know how I can do this?

I get an error if I use this code directly with Pandas 0.13. It breaks on the pd.to_datetime call, claiming that the use of %p is incorrect, via KeyError: 'p' in /pandas/tslib.so in pandas.tslib.array_strptime (pandas/tslib.c:20989).
– ely, Commented Jun 6, 2014 at 21:27
In fact, I can reproduce the pandas error with any string that needs 'am' or 'pm' parsed. There must be a bug in how that gets passed to strptime or whatever.
– ely, Commented Jun 6, 2014 at 21:33

Andy Hayden · Accepted Answer · 2014-06-06 21:21:49Z

5

Consider resampling by 'M' rather than grouping by attributes of the DatetimeIndex:

In [11]: b.resample('M', how='sum').dropna()
Out[11]:
            number
date
2011-02-28       1
2011-03-31       2
2011-04-30       8
2011-05-31      14
2014-05-31      13

Note: you have to drop the NaN if you don't want the months in between.
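
The `how=` argument used above was deprecated and later removed from `resample`. As a rough sketch for newer pandas versions, the same monthly totals can be computed as follows. It assumes the question's data is inlined as a string instead of being read from b.dat, that only the `number` column should be summed, and it uses the `pd.offsets.MonthEnd()` offset object in place of the 'M' alias. `min_count=1` is passed so that months without rows come out as NaN and can still be dropped:

```python
import io
import pandas as pd

# The question's data, inlined here (an assumption) so the sketch runs on its own.
csv_text = """string,date,number
a string,2/5/11 9:16am,1.0
a string,3/5/11 10:44pm,2.0
a string,4/22/11 12:07pm,3.0
a string,4/22/11 12:10pm,4.0
a string,4/29/11 11:59am,1.0
a string,5/2/11 1:41pm,2.0
a string,5/2/11 2:02pm,3.0
a string,5/2/11 2:56pm,4.0
a string,5/2/11 3:00pm,5.0
a string,5/2/14 3:02pm,6.0
a string,5/2/14 3:18pm,7.0
"""

b = pd.read_csv(io.StringIO(csv_text))
b['date'] = pd.to_datetime(b['date'], format='%m/%d/%y %I:%M%p')
b = b.set_index('date')

# Resample to month-end bins and sum the numeric column.
monthly = (
    b[['number']]
    .resample(pd.offsets.MonthEnd())
    .sum(min_count=1)  # empty months become NaN rather than 0
    .dropna()          # drop the months in between
)
print(monthly)
```

The resulting index is a DatetimeIndex labelled with month-end dates, as in the output above.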


2 Comments

Lee Over a year ago

That's great, thanks. I'm trying to find more info on the 'rule' parameter. How do you know that 'M' groups by month? I'd like to know what else it can do. Is there possibly a search term I don't know that would find it in the man pages?

Andy Hayden Over a year ago

The keyword is "offset" pandas.pydata.org/pandas-docs/stable/… :)
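
To give a feel for what other rules look like, here is a small sketch that passes offset objects from `pd.offsets` to `resample` instead of alias strings (the spelling of some aliases has changed between pandas releases). The toy series `s` and its values are made up purely for illustration:

```python
import pandas as pd

# Hypothetical daily series of ones covering the first quarter of 2011.
s = pd.Series(1, index=pd.date_range('2011-01-01', '2011-03-31', freq='D'))

# Monthly bins labelled with the first day of each month.
by_month_start = s.resample(pd.offsets.MonthBegin()).sum()

# Quarterly bins labelled with the quarter-end date.
by_quarter_end = s.resample(pd.offsets.QuarterEnd()).sum()

# Weekly bins ending on Sunday (weekday=6).
by_week = s.resample(pd.offsets.Week(weekday=6)).sum()

print(by_month_start)
print(by_quarter_end)
print(by_week.head())
```

The month-start variant is one way to get first-of-month labels directly from `resample`.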

ely · Accepted Answer · 2014-06-06 21:20:14Z

4

You can create a column from the index via the date calculation you want, then set that as the index:

import datetime
# each index entry is a (year, month) tuple
bgs['expanded_date'] = bgs.index.map(lambda x: datetime.date(x[0], x[1], 1))
bgs = bgs.set_index('expanded_date')
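
If a real DatetimeIndex is wanted rather than an index of `datetime.date` objects, one possible sketch is to assemble the dates from the two index levels with `pd.to_datetime`, which accepts a frame with `year`, `month`, and `day` columns. The small `bgs` frame below is rebuilt by hand from the question's output (an assumption made so the example runs on its own), and `parts` is just an illustrative helper name:

```python
import pandas as pd

# Rebuild the grouped totals from the question (values copied from its output).
idx = pd.MultiIndex.from_tuples(
    [(2011, 2), (2011, 3), (2011, 4), (2011, 5), (2014, 5)]
)
bgs = pd.DataFrame({'number': [1.0, 2.0, 8.0, 14.0, 13.0]}, index=idx)

# Take year and month from the index levels; use day 1 for every row.
parts = pd.DataFrame({
    'year': bgs.index.get_level_values(0),
    'month': bgs.index.get_level_values(1),
    'day': 1,
})

# to_datetime assembles one timestamp per row from the year/month/day columns.
bgs.index = pd.DatetimeIndex(pd.to_datetime(parts))
print(bgs)
```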
