1

I have imported a time series CSV file into a pandas DataFrame, but there is a quirk in how the file records time. Midnight is written as '24:00:00', not '00:00:00' (which is what Python's datetime expects).

To create a datetime column in pandas I've done the following (both 'Date' and 'Time' are strings):

df['Date and Time'] = pd.to_datetime(df['Date'] + ' ' + df['Time'])

However, datetime requires the hour to be between 0 and 23. I can replace '24:00:00' with '00:00:00' using:

df['Time'].replace('24:00:00', '00:00:00', inplace = True)

But then the timestamp refers to the start of that day, not the end of it. Ideally I would then add a day to the date, but I can't work out how to do this. I want to say "Where 'Time' == '00:00:00', add one day to the date". I've tried something like this:

df['Date and Time'][df['Time'] == '00:00:00'] = df['Date and Time'[df['Time'] == '00:00:00'] + timedelta(days = 1)

But that doesn't work (and looks horrible).

Any ideas how I can get this to work?

Thanks!

  • Maybe check this post out Commented Nov 3, 2017 at 17:15
  • It seems like I'd have to do that in a loop before then putting it into a pandas column. Certainly doable, just seems like there could be another easier way. Commented Nov 3, 2017 at 17:58
  • Well you could apply it as a function, only to values >= 24. Should not be inefficient at all Commented Nov 3, 2017 at 19:03

2 Answers

1

From this answer:

import email.utils as eutils
import time
import datetime
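def fix_datetime(d_time):
    # parsedate expects an RFC 2822-style date such as '01 Jan 2015 24:00:00'
    ntuple=eutils.parsedate(d_time)
    # mktime rolls the out-of-range hour 24 over to 00:00 on the next day
    timestamp=time.mktime(ntuple)
    return datetime.datetime.fromtimestamp(timestamp)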

df['Date and Time'] = (df['Date'] + ' ' + df['Time']).apply(fix_datetime)

Resultant column 'Date and Time' is of type datetime64.

If the date is of the form 'YYYY-MM-DD', we first convert it to the RFC 2822 standard like so:

df['Date'] = df['Date'].apply(lambda date: datetime.datetime.strptime(date, '%Y-%m-%d').strftime('%d %b %Y'))
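
For 'YYYY-MM-DD' dates, a possible alternative that skips the RFC 2822 conversion is to parse the date and the time separately and add them. This is only a sketch: the sample rows are made up, and it assumes every 'Time' value is an 'HH:MM:SS' string, which pandas reads as a duration, so '24:00:00' becomes one full day.

```python
import pandas as pd

# Hypothetical sample data; column names follow the question.
df = pd.DataFrame({
    'Date': ['2015-01-01', '2015-01-01'],
    'Time': ['01:00:00', '24:00:00'],
})

# Parse the date on its own, then add the time as an offset from midnight.
# '24:00:00' is a 24-hour offset, so it lands on 00:00 of the next day.
dates = pd.to_datetime(df['Date'], format='%Y-%m-%d')
offsets = pd.to_timedelta(df['Time'])
df['Date and Time'] = dates + offsets
```

Because both steps are vectorized, no per-row apply is needed.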

2 Comments

This doesn't seem to work: eutils.parsedate(d_time) returns None when passed the d_time string. The datetime string is in the format '2015-01-01 01:00:00', and parsedate doesn't seem to understand that.
Ah, I assumed it was of the form '01 Jan 2015'. See my edit.
0

I've worked out a way to make this work, although I'm not sure it's the most elegant. It's loosely based on Sebastian's answer, so thanks!

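from datetime import timedelta

def add_day(timestamp):
  # Rows that were '24:00:00' now have hour 0, so move them to the next day.
  # Note: this also shifts any other time in the 00:00-00:59 range.
  if timestamp.hour == 0:
    timestamp = timestamp + timedelta(days = 1)
  return timestamp

# Assumes '24:00:00' has already been replaced with '00:00:00' in df['Time'].
df['Date and Time'] = pd.to_datetime(df['Date'] + ' ' + df['Time'])
df['Date and Time'] = df['Date and Time'].apply(add_day)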
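
A variation on the same idea, sketched with a boolean mask instead of apply: flag the '24:00:00' rows before parsing, then add a day to only those rows. The sample data and the names is_24 and parseable_time are made up for illustration. Because the check is against the original string rather than hour == 0, a genuine early-morning time such as '00:30:00' is left alone.

```python
import pandas as pd

# Hypothetical sample data; column names follow the question.
df = pd.DataFrame({
    'Date': ['2015-01-01', '2015-01-01', '2015-01-02'],
    'Time': ['23:00:00', '24:00:00', '00:30:00'],
})

# Remember which rows were '24:00:00' before rewriting them.
is_24 = df['Time'] == '24:00:00'

# Build a parseable time string without modifying df['Time'] itself.
parseable_time = df['Time'].mask(is_24, '00:00:00')
df['Date and Time'] = pd.to_datetime(df['Date'] + ' ' + parseable_time)

# Shift only the flagged rows to the following day.
df.loc[is_24, 'Date and Time'] += pd.Timedelta(days=1)
```

Using .loc with the mask also avoids the chained indexing in the original attempt.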
