I'm working with 5-minute data whose timestamps fall only between 09:30 and 16:00 (Datetime is stored as a column, not as the index).
After applying an aggregation to the groups, I get extra bins past 16:00,
even though there is no data beyond 16:00. These extra groups appear with empty values (NaN).
Here is my code:
filtered = self.df.groupby(pd.Grouper(key='Datetime', freq='30min', origin='start'))
self.other_df['Avg_Volatility'] = filtered['Volatility'].mean()
(The original data doesn't even include Datetime values beyond 16:00 on any date. I suspect it has to do with freq='30min'?)
Here's the original data frame (df):
Datetime Open Close High Low Volume VWAP_Price Volatility
0 2025-06-10 09:30:00-04:00 200.600006 202.136993 203.149994 200.570007 4374637 201.614250 2.579987
1 2025-06-10 09:35:00-04:00 202.134995 202.139999 202.389999 201.695007 1077512 202.090000 0.694992
2 2025-06-10 09:40:00-04:00 202.139999 201.798996 202.324997 201.434998 897000 201.924747 0.889999
.. ... ... ... ... ... ... ... ...
78 2025-06-10 15:55:00-04:00 201.804993 201.934998 202.020004 201.380005 805672 201.785000 0.639999
79 2025-06-11 09:30:00-04:00 201.927200 202.869995 203.110001 201.865005 1176969 202.443050 1.244995
this is my output of other_df:
Avg_Volatility
Datetime
2025-06-10 09:30:00-04:00 1.146612
2025-06-10 10:00:00-04:00 0.556870
...
2025-06-10 15:00:00-04:00 0.259351
2025-06-10 15:30:00-04:00 0.317085
2025-06-10 16:00:00-04:00 NaN
2025-06-10 16:30:00-04:00 NaN
2025-06-10 17:00:00-04:00 NaN
2025-06-10 17:30:00-04:00 NaN
2025-06-10 18:00:00-04:00 NaN
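Here's a self-contained snippet that reproduces the behavior with synthetic data (not my real feed; the constant Volatility values are just placeholders):

```python
import pandas as pd

# Two days of 5-minute bars, 09:30-15:55 each day, like my real data.
idx = pd.date_range("2025-06-10 09:30", "2025-06-10 15:55",
                    freq="5min", tz="America/New_York")
idx = idx.append(pd.date_range("2025-06-11 09:30", "2025-06-11 15:55",
                               freq="5min", tz="America/New_York"))
df = pd.DataFrame({"Datetime": idx, "Volatility": 1.0})

# Same grouping as in my code above.
grouped = df.groupby(pd.Grouper(key="Datetime", freq="30min", origin="start"))
avg = grouped["Volatility"].mean()

# The result contains overnight bins (16:00 through 09:00 the next day)
# filled with NaN, because the time-based Grouper generates a continuous
# range of bins between the first and last timestamp.
print(avg[avg.isna()].index.min())
```

Running this, the first NaN bin is 2025-06-10 16:00, matching the output above.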
I don't want to keep bins that fall outside the 09:30 to 16:00 time frame. Should I restrict the grouping to that range, or simply drop the rows with NaN from other_df?

Thank you.
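For reference, these are the two workarounds I'm weighing, shown on a small made-up result series shaped like my Avg_Volatility output (the values here are illustrative, not my real numbers):

```python
import pandas as pd

# Hypothetical 30-min-bin result with an overnight NaN tail.
idx = pd.DatetimeIndex(
    ["2025-06-10 15:30", "2025-06-10 16:00", "2025-06-10 16:30"],
    tz="America/New_York", name="Datetime",
)
avg = pd.Series([0.317085, float("nan"), float("nan")],
                index=idx, name="Avg_Volatility")

# Option 1: drop the empty bins after the fact.
cleaned = avg.dropna()

# Option 2: keep only bins whose label falls inside trading hours
# (between_time is inclusive on both ends by default, so the last
# 30-min bin label before the close is 15:30).
trading = avg.between_time("09:30", "15:30")
```

Both leave only the 15:30 bin here; I'm unsure which is more robust if a real in-hours bin ever happens to be empty (dropna would silently remove it, between_time would keep it as NaN).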