I have a DataFrame that contains the price of a ticker like this:
date price
2006-01-02 09:00:00 6548.34
2006-01-02 09:01:00 6457.61
2006-01-02 09:02:00 6452.82
2006-01-02 09:03:00 6452.39
2006-01-02 09:04:00 6451.61
...
2021-01-29 13:29:40 15229.54
2021-01-29 13:29:45 15229.54
2021-01-29 13:29:50 15229.54
2021-01-29 13:29:55 15229.54
2021-01-29 13:30:00 15138.31
How can I remove the hour, minute, and second parts of the date?
I want something like this:
date price
2006-01-02 6577.33
2006-01-03 6488.62
2006-01-04 6469.82
2006-01-05 6555.39
2006-01-06 6448.65
...
2021-01-25 15229.53
2021-01-26 15129.34
2021-01-27 14129.64
2021-01-28 15228.34
2021-01-29 15138.31
I only want the last price of each day, so all the other prices within each day should be dropped.
How can I achieve this?
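To make the goal concrete, here is a minimal sketch of one possible approach on a small made-up frame. It assumes the index is a DatetimeIndex named date and the column is called price; the sample values are illustrative only, and this is not necessarily the best method for a frame with millions of rows.

```python
import pandas as pd

# Made-up sample data (assumption): a DatetimeIndex named "date" and a "price" column.
idx = pd.to_datetime([
    "2006-01-02 09:00:00",
    "2006-01-02 09:01:00",
    "2006-01-02 13:30:00",
    "2006-01-03 09:00:00",
    "2006-01-03 13:30:00",
])
df = pd.DataFrame({"price": [6548.34, 6457.61, 6577.33, 6500.00, 6488.62]}, index=idx)
df.index.name = "date"

# Sort by time so that "last" means the latest timestamp within each day.
df = df.sort_index()

# normalize() sets every timestamp to midnight, so grouping by it yields one
# group per calendar day; last() keeps the final non-null price of each group.
daily_last = df.groupby(df.index.normalize()).last()

print(daily_last)
```

Grouping by the normalized index only produces rows for days that actually appear in the data, so non-trading days do not show up as empty rows.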
df.index returns the following:
>>> df.index
DatetimeIndex(['2006-01-02 09:00:00', '2006-01-02 09:01:00',
'2006-01-02 09:02:00', '2006-01-02 09:03:00',
'2006-01-02 09:04:00', '2006-01-02 09:05:00',
'2006-01-02 09:06:00', '2006-01-02 09:07:00',
'2006-01-02 09:08:00', '2006-01-02 09:09:00',
...
'2021-01-29 13:29:15', '2021-01-29 13:29:20',
'2021-01-29 13:29:25', '2021-01-29 13:29:30',
'2021-01-29 13:29:35', '2021-01-29 13:29:40',
'2021-01-29 13:29:45', '2021-01-29 13:29:50',
'2021-01-29 13:29:55', '2021-01-29 13:30:00'],
dtype='datetime64[ns]', name='date', length=6345485, freq=None)
df.columns returns:
>>> df.columns
Index(['price'], dtype='object', name='stock_id')
pd.to_datetime(df.index, format='%Y%m%d') returns the index unchanged:
>>> pd.to_datetime(df.index, format='%Y%m%d')
DatetimeIndex(['2006-01-02 09:00:00', '2006-01-02 09:01:00',
'2006-01-02 09:02:00', '2006-01-02 09:03:00',
'2006-01-02 09:04:00', '2006-01-02 09:05:00',
'2006-01-02 09:06:00', '2006-01-02 09:07:00',
'2006-01-02 09:08:00', '2006-01-02 09:09:00',
...
'2021-01-29 13:29:15', '2021-01-29 13:29:20',
'2021-01-29 13:29:25', '2021-01-29 13:29:30',
'2021-01-29 13:29:35', '2021-01-29 13:29:40',
'2021-01-29 13:29:45', '2021-01-29 13:29:50',
'2021-01-29 13:29:55', '2021-01-29 13:30:00'],
dtype='datetime64[ns]', name='date', length=6345485, freq=None)
This is strange: the to_datetime() conversion doesn't seem to do anything, because the time part is still there.
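As far as I understand, to_datetime() parses values into datetimes, so on an index that is already datetime64 there is nothing left to convert. The sketch below, on a tiny made-up index, shows that behaviour next to two ways of dropping the time part. The variable names are hypothetical and not from my real code.

```python
import pandas as pd

# Tiny made-up index (assumption) with the same dtype and name as the real one.
idx = pd.DatetimeIndex(["2006-01-02 09:00:00", "2021-01-29 13:30:00"], name="date")

# The values are already datetime64, so to_datetime has nothing to parse
# and returns them as they are, time part included.
same = pd.to_datetime(idx)

# normalize() resets the time to midnight and keeps the datetime64 dtype.
midnight = idx.normalize()

# .date gives an array of plain datetime.date objects (object dtype).
dates_only = idx.date

print(same)
print(midnight)
print(dates_only)
```

normalize() keeps the result a DatetimeIndex, which is convenient for further time-based operations, while .date drops to plain Python date objects.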