2

I have a pandas dataframe like this (with timestamp converted to datetime object):

    id  timestamp
0   221 2020-11-07 12:02:00
1   223 2020-11-08 13:21:00
2   224 2020-11-09 12:50:00
3   225 2020-11-10 14:23:00
4   226 2020-11-11 12:25:00
5   227 2020-11-14 14:26:00

I want to find number of rows between a time interval. e.g 12:00-13:00, here it would be 3 (entries 0, 2 and 4)

5
  • 2
    Looks like between_time will be helpfull Commented Mar 26, 2020 at 21:07
  • 1
    Here's an example of between_time from SO stackoverflow.com/questions/35052691/… Commented Mar 26, 2020 at 21:12
  • Have you tried anything, done any research? Commented Mar 26, 2020 at 21:27
  • Does this answer your question? Select DataFrame rows between two dates Commented Mar 26, 2020 at 21:28
  • between_time would work. Just set timestamp as index. Commented Mar 26, 2020 at 21:30

4 Answers 4

3

So as I commented, I think you can just use between_time function:

CountRows = df.set_index('timestamp').between_time('12:00','13:00').shape[0]

This, in your case, sets column timestamp as index and then returns the count of rows between the two timevalues. Where:

"The first element of the tuple returned by Dataframe.shape contains the number of items in index in a dataframe i.e. basically the number of rows in the dataframe. Source

Sign up to request clarification or add additional context in comments.

1 Comment

and if you need the results, just remove .shape[0]
1

Please try

Coerce timestamp to datetime and sort ascending

  df['timestamp']=pd.to_datetime(df['timestamp']).sort_values(ascending=True)

Reset index but do not drop to keep id

df.reset_index(drop=False, inplace=True)

Set timestamp as new index to allow use of df.betweeen time

df.set_index(df['timestamp'], inplace=True)
df.between_time('12:00', '13:00')

Comments

1

Since I believe that suggested between_time works only for DatetimeIndex, you can either set your DataFrame index to 'timestamp' and then use between_time or, alternatively, first, use sorting

df.sort_values(by='timestamp',axis='columns',inplace=True)

Then use sorted search

start = df['timestamp'].searchsorted(pd.Timestamp('2020-11-07 12:00:00'), side='left')
end = df['timestamp'].searchsorted(pd.Timestamp('2020-11-07 13:00:00'), side='right')

Then find number of rows

count = start - end

Comments

0

As suggested by posts in Comments section between_time works well. One need to make timestamp (which is a datetime64 object here) a index first, then use the between_time function.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.