0

I have a large dataframe that I am trying to edit. I want to simplify the timestamp column by removing the date and the seconds, so the column only shows a four-digit time, e.g. "00:00". So far, the only approach I know is a for loop, but I have no idea what condition to apply.

                 timestamp        date  activity           Id  total hour activity
720    2003-05-08 00:00:00  2003-05-08         0  condition_1                  NaN
721    2003-05-08 00:01:00  2003-05-08         0  condition_1                  NaN
722    2003-05-08 00:02:00  2003-05-08         0  condition_1                  NaN
723    2003-05-08 00:03:00  2003-05-08         0  condition_1                  NaN
724    2003-05-08 00:04:00  2003-05-08         0  condition_1                  NaN
...                    ...         ...       ...          ...                  ...
10794  2003-05-14 23:54:00  2003-05-14         0  condition_1                  NaN
10795  2003-05-14 23:55:00  2003-05-14        12  condition_1                  NaN
10796  2003-05-14 23:56:00  2003-05-14         0  condition_1                  NaN
10797  2003-05-14 23:57:00  2003-05-14        18  condition_1                  NaN
10798  2003-05-14 23:58:00  2003-05-14         0  condition_1                  NaN
10079 rows × 5 columns

for i, row in df.iterrows():
    if row['timestamp']:
        pass  # delete row
1
  • What condition do you want to apply? Are you sure you need a condition? Won't you be changing the value in the timestamp column of every row? Commented Mar 21, 2021 at 17:09

4 Answers

2

Here I am assuming that the values in the 'timestamp' column are strings (object dtype).

You can do this with the split() and rsplit() string methods:

df['timestamp'] = df['timestamp'].str.split(' ', expand=True)[1].str.rsplit(':', n=1, expand=True)[0]

Explanation: first we split each value on ' ', so

2003-05-08 00:00:00 becomes [2003-05-08, 00:00:00], and we select

'00:00:00'. We then split it once from the right on ':' with rsplit(), so it becomes [00:00, 00], and we keep the 00:00 part (hours and minutes). Splitting once from the left instead would give [00, 00:00], whose second element is minutes and seconds, not hours and minutes.

If the 'timestamp' column is already of datetime dtype, you can use:

df['timestamp'] = df['timestamp'].dt.strftime('%H:%M')
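As a rough check of the two routes described above, here is a minimal sketch on a tiny made-up frame. The three sample values are copied from the question; the frame itself and the names as_text and as_dt are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical sample: string timestamps shaped like those in the question.
df = pd.DataFrame(
    {"timestamp": ["2003-05-08 00:00:00", "2003-05-08 00:01:00", "2003-05-14 23:55:00"]}
)

# String route: drop the date, then drop the trailing ":SS" by splitting once from the right.
as_text = (
    df["timestamp"]
    .str.split(" ", expand=True)[1]
    .str.rsplit(":", n=1, expand=True)[0]
)

# Datetime route: parse first, then format as hours and minutes.
as_dt = pd.to_datetime(df["timestamp"]).dt.strftime("%H:%M")

print(as_text.tolist())  # ['00:00', '00:01', '23:55']
print(as_dt.tolist())    # ['00:00', '00:01', '23:55']
```

Both routes produce strings, so the resulting column can no longer be used with the .dt accessor unless it is parsed again.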

0

To keep only the hours and minutes, you could do the following:

df['timestamp'] = pd.to_datetime(df['timestamp'])  # only needed if the "timestamp" column is not already datetime

df['timestamp'] = df['timestamp'].dt.strftime('%H:%M')  # keeps hours (%H) and minutes (%M)
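The "only if not already datetime" condition can be made explicit. The following is a sketch under assumptions: the two-row frame is invented, and passing an explicit format string is an optional choice that assumes every value follows the "YYYY-MM-DD HH:MM:SS" layout shown in the question.

```python
import pandas as pd
from pandas.api.types import is_datetime64_any_dtype

# Hypothetical sample with string timestamps, as in the question's printout.
df = pd.DataFrame({"timestamp": ["2003-05-08 00:02:00", "2003-05-14 23:57:00"]})

# Convert only when the column is not datetime yet.
if not is_datetime64_any_dtype(df["timestamp"]):
    df["timestamp"] = pd.to_datetime(df["timestamp"], format="%Y-%m-%d %H:%M:%S")

# Keep hours and minutes; the result is a column of strings.
df["timestamp"] = df["timestamp"].dt.strftime("%H:%M")

print(df["timestamp"].tolist())  # ['00:02', '23:57']
```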


0

You don't need a for loop to grab only the hours and minutes. If your column is datetime, you can create a separate column from your timestamp by calling:

df['Time'] = df.timestamp.apply(lambda x: str(x.time())[:-3])

That returns the time as a string in "00:00" format (assuming the timestamps have no fractional seconds). Alternatively, you can just call

df['Time'] = df.timestamp.apply(lambda x: x.time())

to get values of type datetime.time, which display in "00:00:00" format.
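To make the difference between the two results concrete, here is a small sketch. The frame and the column names Time_str and Time_obj are hypothetical; the timestamps have whole seconds, which the string slice relies on.

```python
import datetime
import pandas as pd

# Hypothetical sample: a datetime column with whole-second timestamps.
df = pd.DataFrame(
    {"timestamp": pd.to_datetime(["2003-05-08 00:03:00", "2003-05-14 23:58:00"])}
)

# String result: "HH:MM:SS" with the last three characters (":SS") cut off.
df["Time_str"] = df["timestamp"].apply(lambda x: str(x.time())[:-3])

# Object result: datetime.time values, seconds included.
df["Time_obj"] = df["timestamp"].apply(lambda x: x.time())

print(df["Time_str"].tolist())                               # ['00:03', '23:58']
print(isinstance(df["Time_obj"].iloc[0], datetime.time))     # True
```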


0

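If you want a column, say 'Time', with only the time from the timestamp, you can use this (the column must be of datetime dtype). Note that the result holds datetime.time values, which still include the seconds.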

df['Time'] = df['timestamp'].dt.time
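If the seconds should not be displayed, the time objects can be formatted afterwards. This is a sketch, not part of the answer above: the series ts and the names times and hhmm are assumptions for illustration.

```python
import datetime
import pandas as pd

# Hypothetical datetime series with values shaped like those in the question.
ts = pd.Series(pd.to_datetime(["2003-05-08 00:04:00", "2003-05-14 23:56:00"]))

# .dt.time yields datetime.time objects that keep the seconds.
times = ts.dt.time

# Format each time object as "HH:MM" to hide the seconds (result is strings).
hhmm = times.map(lambda t: t.strftime("%H:%M"))

print(times.tolist())
print(hhmm.tolist())  # ['00:04', '23:56']
```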
