I have the following dataset:
   OPEN TIME CLOSE TIME
0   09:44:00   10:07:00
1   10:07:00   11:01:00
2   11:05:00   13:05:00
But the timestamps here are stored as strings. How can I convert them to a time format?
Use pd.to_datetime:
df['Open'] = pd.to_datetime(df['OPEN TIME'],format= '%H:%M:%S' ).dt.time
df['Close'] = pd.to_datetime(df['CLOSE TIME'],format= '%H:%M:%S' ).dt.time
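For anyone who wants to run the two lines above as is, here is a minimal sketch. It assumes the frame from the question can be rebuilt from a plain dict of strings (the construction of `df` and the `first_open` name are my own, not part of the original post).

```python
import datetime
import pandas as pd

# Rebuild the frame from the question; the values start out as plain strings.
df = pd.DataFrame({
    "OPEN TIME": ["09:44:00", "10:07:00", "11:05:00"],
    "CLOSE TIME": ["10:07:00", "11:01:00", "13:05:00"],
})

# Parse the strings, then keep only the time-of-day part.
df["Open"] = pd.to_datetime(df["OPEN TIME"], format="%H:%M:%S").dt.time
df["Close"] = pd.to_datetime(df["CLOSE TIME"], format="%H:%M:%S").dt.time

first_open = df["Open"].iloc[0]
print(type(first_open))  # <class 'datetime.time'>
print(first_open)        # 09:44:00
```

The new columns hold ordinary Python datetime.time objects, so the original string columns are left untouched.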
.dt.time extracts the hour/minute/second part of each value. This is needed because to_datetime also attaches year/month/day information (1900-01-01 when the format contains no date fields).

It's also possible to convert both columns in a one-liner using apply. Try:
df = df.assign(**df[['OPEN TIME', 'CLOSE TIME']].apply(pd.to_datetime, format='%H:%M:%S'))
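To see what that one-liner produces, here is a small check, again assuming the question's data rebuilt from a dict (the `converted` name is just for illustration). It suggests the columns become datetime64 values that carry a placeholder date.

```python
import datetime
import pandas as pd

df = pd.DataFrame({
    "OPEN TIME": ["09:44:00", "10:07:00", "11:05:00"],
    "CLOSE TIME": ["10:07:00", "11:01:00", "13:05:00"],
})

# Same one-liner as above, stored under a new name so `df` keeps its strings.
converted = df.assign(
    **df[["OPEN TIME", "CLOSE TIME"]].apply(pd.to_datetime, format="%H:%M:%S")
)

# Each value is now a full timestamp with a default date in front of the time.
print(converted["OPEN TIME"].iloc[0])  # 1900-01-01 09:44:00
```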
To get the times without dates, use the following:
# assign back to the columns; this can raise a SettingWithCopyWarning if `df` was filtered from another frame
df[['OPEN TIME', 'CLOSE TIME']] = df[['OPEN TIME', 'CLOSE TIME']].apply(lambda x: pd.to_datetime(x, format='%H:%M:%S').dt.time)
# or call assign to create a new dataframe copy; this does not trigger the warning
df = df.assign(**df[['OPEN TIME', 'CLOSE TIME']].apply(lambda x: pd.to_datetime(x, format='%H:%M:%S').dt.time))
This converts each string into a datetime.time object. However, datetime.time has no corresponding pandas dtype, so the columns end up with object dtype and it's difficult to leverage vectorized operations. For example, you cannot compute the time difference between OPEN TIME and CLOSE TIME while they hold datetime.time objects (so there's not much improvement over strings), but you can if they are datetime64. The following creates datetime64 columns and subtracts them:
df1 = df.assign(**df[['OPEN TIME', 'CLOSE TIME']].apply(pd.to_datetime, format='%H:%M:%S'))
df1['CLOSE TIME'] - df1['OPEN TIME']
0 0 days 00:23:00
1 0 days 00:54:00
2 0 days 02:00:00
dtype: timedelta64[ns]
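The contrast described above can be sketched as follows. This is only an illustration under the same assumption as before (the question's data rebuilt from a dict); the names `subtraction_failed`, `duration`, and `minutes` are mine.

```python
import datetime
import pandas as pd

# Plain datetime.time objects do not support subtraction.
subtraction_failed = False
try:
    datetime.time(10, 7) - datetime.time(9, 44)
except TypeError:
    subtraction_failed = True
print(subtraction_failed)  # True

df = pd.DataFrame({
    "OPEN TIME": ["09:44:00", "10:07:00", "11:05:00"],
    "CLOSE TIME": ["10:07:00", "11:01:00", "13:05:00"],
})

# With datetime64 columns the subtraction is vectorized and yields timedeltas.
df1 = df.assign(
    **df[["OPEN TIME", "CLOSE TIME"]].apply(pd.to_datetime, format="%H:%M:%S")
)
duration = df1["CLOSE TIME"] - df1["OPEN TIME"]

# Timedeltas can then be turned into plain numbers, e.g. minutes.
minutes = duration.dt.total_seconds() / 60
print(minutes.tolist())  # [23.0, 54.0, 120.0]
```

If you only need the times for display, the .dt.time version is fine; if you need arithmetic, keeping datetime64 (and ignoring the placeholder date) is probably the easier route.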