
I have a use case where:

Data is of the form: Col1, Col2, Col3 and Timestamp.

Now I just want the row counts per timestamp bin.

That is, for every half-hour bucket (including the ones with no corresponding rows), I need the number of rows that fall in it.

The timestamps are spread over a one-year period, so I can't simply divide them into 24 buckets.

I have to bin them at 30-minute intervals.

1 Answer


groupby via pd.Grouper

# Optionally, if Timestamp is not already a datetime column:
# df['Timestamp'] = pd.to_datetime(df['Timestamp'], errors='coerce')
df.groupby(pd.Grouper(key='Timestamp', freq='30min')).count()

resample

df.set_index('Timestamp').resample('30min').count()
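
For illustration, here is a minimal sketch of the groupby approach on made-up data (the sample rows and the `full_index` range are assumptions, not from the question). It uses `size()` rather than `count()`: `count()` returns non-null counts per column, while `size()` returns one row count per bin. Bins between the first and last timestamp that contain no rows appear with a count of 0. If buckets outside that span are also needed, reindexing against an explicit `pd.date_range` is one way to add them.

```python
import pandas as pd

# Hypothetical sample data: two rows in the first half hour, none in the
# second, one in the third.
df = pd.DataFrame({
    "Col1": [1, 2, 3],
    "Timestamp": ["2018-01-01 00:05", "2018-01-01 00:20", "2018-01-01 01:10"],
})
df["Timestamp"] = pd.to_datetime(df["Timestamp"], errors="coerce")

# One row count per 30-minute bin; empty bins inside the data's span get 0.
counts = df.groupby(pd.Grouper(key="Timestamp", freq="30min")).size()

# Optional: extend to a fixed range (assumed here) so that buckets before
# the first row or after the last row are also present, filled with 0.
full_index = pd.date_range("2018-01-01 00:00", "2018-01-01 02:00", freq="30min")
counts_full = counts.reindex(full_index, fill_value=0)

print(counts_full)
```

For a full year of data, `full_index` would run from the start to the end of that year at the same `30min` frequency.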

2 Comments

@COLDSPEED thanks a lot! it works! what does errors=coerce do? And one more question: resample does it sample all the rows?
@davidnadal It converts invalid datetime strings to NaT (instead of raising a parser error). Resample will sample all rows.
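
A small sketch of what `errors='coerce'` does, using made-up strings (the sample values are assumptions). The unparseable entry becomes NaT. Because a NaT timestamp cannot belong to any time bin, this sketch drops those rows explicitly before resampling so the behavior is unambiguous.

```python
import pandas as pd

# Hypothetical input: one valid timestamp and one invalid string.
raw = pd.Series(["2018-01-01 00:05", "not a date"])

# errors="coerce" turns the invalid string into NaT instead of raising.
parsed = pd.to_datetime(raw, errors="coerce")

df = pd.DataFrame({"Timestamp": parsed, "Col1": [1, 2]})

# Drop rows whose timestamp could not be parsed, then count per 30-minute bin.
valid = df.dropna(subset=["Timestamp"])
counts = valid.set_index("Timestamp").resample("30min").size()

print(parsed.isna().tolist())
print(counts)
```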
