83

I have a datetime column Date_Time that I want to group by date without creating a new column. Is this possible? I tried the following, but it does not work.

df = pd.groupby(df,by=[df['Date_Time'].date()])

3 Answers

107

You can group by the dates of the Date_Time column using dt.date:

df = df.groupby([df['Date_Time'].dt.date]).mean()

Sample:

df = pd.DataFrame({'Date_Time': pd.date_range('10/1/2001 10:00:00', periods=3, freq='10H'),
                   'B':[4,5,6]})

print (df)
   B           Date_Time
0  4 2001-10-01 10:00:00
1  5 2001-10-01 20:00:00
2  6 2001-10-02 06:00:00

print (df['Date_Time'].dt.date)
0    2001-10-01
1    2001-10-01
2    2001-10-02
Name: Date_Time, dtype: object

# Keep df intact so the resample example that follows can reuse it
out = df.groupby([df['Date_Time'].dt.date])['B'].mean()
print(out)
Date_Time
2001-10-01    4.5
2001-10-02    6.0
Name: B, dtype: float64

Another solution with resample:

out = df.set_index('Date_Time').resample('D')['B'].mean()

print(out)
Date_Time
2001-10-01    4.5
2001-10-02    6.0
Freq: D, Name: B, dtype: float64
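
The two approaches differ when some days have no rows at all. The sketch below uses made-up sample data with a gap on 2001-09-03 (the values are assumptions, not from the question): grouping by dt.date only returns dates that occur in the data, while resample('D') builds a regular daily index and reports the missing day as NaN.

```python
import pandas as pd

# Hypothetical sample data: readings on 1, 2 and 4 September, nothing on the 3rd.
df = pd.DataFrame({
    'Date_Time': pd.to_datetime(['2001-09-01 10:00', '2001-09-01 20:00',
                                 '2001-09-02 06:00', '2001-09-04 08:00']),
    'B': [4, 5, 6, 7],
})

# Grouping by dt.date only produces groups for dates present in the data.
by_date = df.groupby(df['Date_Time'].dt.date)['B'].mean()

# resample('D') creates one bin per calendar day, so the empty day shows up as NaN.
by_day = df.set_index('Date_Time').resample('D')['B'].mean()

# Dropping the NaN rows leaves only the days that had data.
by_day_present = by_day.dropna()

print(by_date)
print(by_day)
print(by_day_present)
```

Which one fits depends on whether the empty days should be visible in the result.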

92

resample

df.resample('D', on='Date_Time').mean()

              B
Date_Time      
2001-10-01  4.5
2001-10-02  6.0

Grouper

As suggested by @JosephCottam

df.set_index('Date_Time').groupby(pd.Grouper(freq='D')).mean()

              B
Date_Time      
2001-10-01  4.5
2001-10-02  6.0
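
If the existing index should stay untouched, pd.Grouper also takes a key argument naming the column to bin on, so set_index is not needed. A minimal sketch, assuming the same three-row sample as in the answers (rebuilt here so the block runs on its own):

```python
import pandas as pd

# Same sample values as above, written out explicitly.
df = pd.DataFrame({
    'Date_Time': pd.to_datetime(['2001-10-01 10:00', '2001-10-01 20:00',
                                 '2001-10-02 06:00']),
    'B': [4, 5, 6],
})

# key= names the datetime column; the original integer index is left as it is.
daily = df.groupby(pd.Grouper(key='Date_Time', freq='D'))['B'].mean()

print(daily)
```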

Deprecated uses of TimeGrouper

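In older pandas versions you could set the index to 'Date_Time' and use pd.TimeGrouper. It has since been deprecated in favor of pd.Grouper(freq=...), shown above, so the following is kept only for reference: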

df.set_index('Date_Time').groupby(pd.TimeGrouper('D')).mean().dropna()

              B
Date_Time      
2001-10-01  4.5
2001-10-02  6.0

4 Comments

This is great! How do I prevent it from adding dates for which there is no data? For example, if I had data for days 9/1, 9/2, and 9/4, the result still includes 9/3 with NaN values.
@GoBlue_MathMan Use .dropna()
Here, when grouping by 'hour', it adds hours that did not exist in the source file, with zero values.
You can avoid .set_index('Date_Time') by using pd.Grouper(key='Date_Time', freq='D'). This can be useful if the index is significant.
7
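df.groupby(pd.Grouper(key='Date_Time', freq='M')).sum()

The freq alias sets the bin size:
  • M for month (month end; spelled ME in pandas 2.2 and later)
  • Y for year (year end; spelled YE in pandas 2.2 and later)
  • D for day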
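
Because the spelling of the month alias for pd.Grouper depends on the pandas version, one alternative is to group on monthly periods instead. This is a sketch of a different technique, not the method in the answer above, and the sample data is hypothetical: dt.to_period('M') converts each timestamp to its calendar month, and the result is indexed by period rather than by month-end date.

```python
import pandas as pd

# Hypothetical sample data spanning two months.
df = pd.DataFrame({
    'Date_Time': pd.to_datetime(['2001-10-01 10:00', '2001-10-20 20:00',
                                 '2001-11-02 06:00']),
    'B': [4, 5, 6],
})

# Convert each timestamp to its calendar month and group on that.
monthly = df.groupby(df['Date_Time'].dt.to_period('M'))['B'].sum()

print(monthly)
```

Unlike a frequency-based Grouper, this only yields months that actually occur in the data.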

