I have two columns, StartTime and EndTime, and I need to select the events occurring between hours 7-9 and 18-20. My data looks like this:

+------------+--------------------------------+-------------------------------+
|            |                StartTime       |            EndTime            |
+------------+--------------------------------+-------------------------------+
|        25  | 2018-05-17 11:52:21.769491600  | 2018-05-17 23:08:35.731376400 |
|        32  | 2018-05-19 14:22:24.141359000  | 2018-05-19 18:37:04.003643800 |
|        42  | 2018-05-22 08:25:01.015975500  | 2018-05-22 22:32:34.249869500 |
|        43  | 2018-05-22 08:46:06.187427200  | 2018-05-22 21:29:17.397438000 |
|        44  | 2018-05-22 13:38:37.289871700  | 2018-05-22 18:38:36.498623500 |
+------------+--------------------------------+-------------------------------+

I extracted the hours from the data and used them in the following filter:

df = df[((df['start_hr']<=7) & (df['end_hr']>=9)) | ((df['start_hr']<=18) & (df['end_hr']>=20))]

Is there a more accurate and faster alternative?

2 Answers

This will temporarily increase memory consumption, but you can create two temporary columns and use "df.query" on them. Make sure to delete the columns afterwards.

# Extract the hour of day from the datetime columns into temporary columns
df = df.assign(start_hr=df.StartTime.dt.hour, end_hr=df.EndTime.dt.hour)

df.query('(start_hr <= 7 and end_hr >= 9) or (start_hr <= 18 and end_hr >= 20)')
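To make the approach above easy to try, here is a minimal self-contained sketch. It assumes the column names StartTime and EndTime from the question, rebuilds the five sample rows with the timestamps truncated to whole seconds, and drops the temporary columns at the end. The names with_hours, selected and result are only illustrative.

```python
import pandas as pd

# Sample rows from the question, truncated to whole seconds (an assumption
# made for brevity; the fractional part does not affect the hour).
df = pd.DataFrame(
    {
        "StartTime": pd.to_datetime([
            "2018-05-17 11:52:21", "2018-05-19 14:22:24", "2018-05-22 08:25:01",
            "2018-05-22 08:46:06", "2018-05-22 13:38:37",
        ]),
        "EndTime": pd.to_datetime([
            "2018-05-17 23:08:35", "2018-05-19 18:37:04", "2018-05-22 22:32:34",
            "2018-05-22 21:29:17", "2018-05-22 18:38:36",
        ]),
    },
    index=[25, 32, 42, 43, 44],
)

# Temporary hour-of-day columns used only for filtering.
with_hours = df.assign(
    start_hr=df["StartTime"].dt.hour,
    end_hr=df["EndTime"].dt.hour,
)

# Keep events that span the whole 7-9 window or the whole 18-20 window.
selected = with_hours.query(
    "(start_hr <= 7 and end_hr >= 9) or (start_hr <= 18 and end_hr >= 20)"
)

# Remove the temporary columns again.
result = selected.drop(columns=["start_hr", "end_hr"])
print(result)
```

On this sample the filter keeps rows 25, 42 and 43, because they cover the full 18-20 window. Note that comparing whole hours treats an event starting at 07:59 as starting at 7.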

2 Comments

Good one. Since you are here, can you tell me whether this logic will work if I want all the events that are active during these windows, whether they start before 7 or later (e.g. at 8) and end after 9? I modified my query as df.query('((start_hr <= 7 and end_hr >=9) or start_hr==8) or ((start_hr <= 18 and end_hr >=20) or start_hr==19) ')
@M_S_N Yeah, I don't see why not.
You can use this:


import pandas as pd

# Make sure both columns are datetimes
df['StartTime'] = pd.to_datetime(df['StartTime'])
df['EndTime'] = pd.to_datetime(df['EndTime'])

# Extract the hour of day (not the day of month) from each column
df['start_hr'] = df['StartTime'].dt.hour
df['end_hr'] = df['EndTime'].dt.hour

df.loc[((df['start_hr']<=7) & (df['end_hr']>=9))|((df['start_hr']<=18) & (df['end_hr']>=20))]

1 Comment

I think it will skip events that occurred inside the given ranges, i.e. for 7-9, an event at 8 will not be included.
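One possible way to address the case raised in the comments (events that are active at any point inside a window, not only those spanning all of it) is a standard interval-overlap test: an event overlaps a window when it starts before the window ends and ends after the window starts. The sketch below is not taken from either answer. The helper active_during is hypothetical, the rows are made up for illustration, and each window is assumed to be anchored on the event's start day.

```python
import pandas as pd

def active_during(df, windows, start_col="StartTime", end_col="EndTime"):
    """Return a boolean mask that is True where an event overlaps any window.

    windows is a list of (start_hour, end_hour) pairs, e.g. [(7, 9), (18, 20)].
    Assumption: each window is placed on the calendar day the event starts.
    """
    day = df[start_col].dt.normalize()  # midnight of the start day
    mask = pd.Series(False, index=df.index)
    for lo, hi in windows:
        win_start = day + pd.Timedelta(hours=lo)
        win_end = day + pd.Timedelta(hours=hi)
        # Overlap: starts before the window ends AND ends after it starts.
        mask |= (df[start_col] < win_end) & (df[end_col] > win_start)
    return mask

# Made-up rows: one event inside 7-9, one outside both windows,
# and one that ends inside 18-20.
events = pd.DataFrame(
    {
        "StartTime": pd.to_datetime([
            "2018-05-22 08:25:01", "2018-05-22 10:00:00", "2018-05-22 13:38:37",
        ]),
        "EndTime": pd.to_datetime([
            "2018-05-22 08:50:00", "2018-05-22 12:00:00", "2018-05-22 18:38:36",
        ]),
    }
)

mask = active_during(events, [(7, 9), (18, 20)])
active = events[mask]
print(active)
```

Because this compares full timestamps rather than whole hours, it needs no temporary hour columns and includes an event that both starts and ends inside a window.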
