2

I am trying to convert two columns into one datetime column. The input columns look like this:

date            hour
1/1/2015          1
1/1/2015          2
1/1/2015          3

where the values of df.date are strings and the values of df.hour are ints. I am trying to combine these two columns into one such that:

datetime
2015-1-1 1:00:00
2015-1-1 2:00:00
2015-1-1 3:00:00

I thought a simple df['x'] = pd.to_datetime(df[['date', 'hour']]) would work, but I'm getting a ValueError as a result.

3
  • Is the frequency always 1 hour? Commented Jul 7, 2016 at 2:42
  • @JoeR yes, the frequency is always at 1 hour intervals. Sorry, forgot to clarify that. Commented Jul 7, 2016 at 2:46
  • @JoeR. Well, sometimes there could be a lapse from the observations that's why I wanted to convert it from the 2 columns. Commented Jul 7, 2016 at 2:48

3 Answers

2

You can paste the two columns together as a single column and then convert with a corresponding format parameter:

pd.to_datetime(df['date'] + ' ' + df['hour'].astype(str), format = "%d/%m/%Y %H")

# 0   2015-01-01 01:00:00
# 1   2015-01-01 02:00:00
# 2   2015-01-01 03:00:00
# dtype: datetime64[ns]
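
For anyone who wants to run this end to end, here is a minimal self-contained sketch using the sample data from the question. One assumption to flag: the sample dates (1/1/2015) do not reveal whether the strings are day/month/year or month/day/year, so the format below simply keeps the %d/%m/%Y used above. If your data is month-first, swap it to %m/%d/%Y; a mismatched format raises a ValueError on rows such as 1/13/2015, which may be worth checking if the full data fails.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "date": ["1/1/2015", "1/1/2015", "1/1/2015"],
    "hour": [1, 2, 3],
})

# Build one string per row, e.g. "1/1/2015 1", then parse it in a single pass.
# ASSUMPTION: day-first dates; use "%m/%d/%Y %H" for month-first data.
combined = df["date"] + " " + df["hour"].astype(str)
df["datetime"] = pd.to_datetime(combined, format="%d/%m/%Y %H")

print(df["datetime"])
```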

1 Comment

Woah, thanks! The solution seems simple. I never thought to_datetime would work like that. I got an error with my original data, but it worked with some simple data I tested out. I'll look at my original data first. Thanks for the quick response.
2

Basically, you will need to use pandas.to_datetime and datetime.timedelta.

from datetime import timedelta
# assign to a new column rather than overwriting df itself
df['datetime'] = pd.to_datetime(df['date']) + df['hour'].apply(lambda x: timedelta(hours=int(x)))

Explanation:

from datetime import timedelta
dft['date'] = pd.to_datetime(dft['date'])
dft['hour_h'] = dft['hour'].apply(lambda x: timedelta(hours=int(x)))
dff = dft['date']+dft['hour_h']

Output:

dff
Out[42]: 
0   2015-01-01 01:00:00
1   2015-01-01 02:00:00
2   2015-01-01 03:00:00
dtype: datetime64[ns]

1 Comment

@kobrakai, you are welcome. Hope it helps. datetime is indeed a powerful package for dealing with all kinds of datetime problems, regardless of whether you are using pandas or not.
1

This is another approach:

In [224]:
df['datetime'] = pd.to_datetime(df['date']) + pd.TimedeltaIndex(df['hour'], unit='h')
df

Out[224]:
       date  hour            datetime
0  1/1/2015     1 2015-01-01 01:00:00
1  1/1/2015     2 2015-01-01 02:00:00
2  1/1/2015     3 2015-01-01 03:00:00

The key difference here is that a TimedeltaIndex is constructed from the hour column and added to the datetime column returned by to_datetime, so the whole operation is vectorized.
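
A closely related variant, sketched here with the question's sample data: pd.to_timedelta converts the hour column to timedeltas directly, without building an index first. As far as I know, recent pandas releases deprecate the unit argument of the TimedeltaIndex constructor in favor of pd.to_timedelta, so this form may be the safer choice on newer versions. The explicit month-first format is an assumption about the data, since the sample dates are ambiguous.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "date": ["1/1/2015", "1/1/2015", "1/1/2015"],
    "hour": [1, 2, 3],
})

# Parse the date part, then add the hours as a timedelta Series.
# ASSUMPTION: month-first dates; use "%d/%m/%Y" for day-first data.
dates = pd.to_datetime(df["date"], format="%m/%d/%Y")
df["datetime"] = dates + pd.to_timedelta(df["hour"], unit="h")

print(df)
```

Unlike parsing the hour with %H, which only accepts 0 to 23, adding a timedelta also copes with an hour value of 24 by rolling over to the next day.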
