I currently have a bunch of times in a column written like "27:32:18", meaning someone was waiting for 27 hours, 32 minutes, and 18 seconds. I keep getting "ValueError: hour must be in 0..23" whenever I try to parse these values.

How should I go about parsing those values or converting them to a more standard format? I tried the following as a test on a single value:

time1 = "56:42:12"
time2 = time1.split(':')
time2 = [int(n) for n in time2]
time2.insert(0, time2[0] // 24)
time2[1] %= 24

At that point, time2 is a list consisting of [2, 8, 42, 12], which is equivalent to 2 days, 8 hours, 42 minutes, and 12 seconds. How would I go about converting that to a Python datetime representation in days, hours, minutes, and seconds in a way that will allow Python to parse it? Note that I will eventually be doing unsupervised clustering on these time values, which represent waiting times.

  • That is not a date. A time duration is a different beast altogether. Commented Jun 26, 2014 at 14:02
  • A python datetime object can only be used to represent a specific point in time; e.g. the 31st of July 1889. Durations are expressed in timedelta() objects. Commented Jun 26, 2014 at 14:05
  • An interval needs to have a reference from which it is a delta. Just saying "2 days, 8 hours, 42 minutes" without specifying the since part, will make the date part difficult. If you just want to calculate time durations, this (like Martijn said), is another thing. Commented Jun 26, 2014 at 14:05
  • Thank you! So work with timedeltas instead of datetimes. Commented Jun 26, 2014 at 14:29

1 Answer

You don't have a date, you have a time duration. Durations are related to dates and timestamps only in that they use the same units of time and are displayed in a similar way.

As such, you cannot use dateutil for parsing such values. It is easy enough to split out and parse yourself:

hours, minutes, seconds = map(int, time1.split(':'))

You can then use a datetime.timedelta() object to represent the duration:

td = datetime.timedelta(hours=hours, minutes=minutes, seconds=seconds)

This'll then track the delta in terms of days, seconds and microseconds:

>>> import datetime
>>> time1 = "56:42:12"
>>> hours, minutes, seconds = map(int, time1.split(':'))
>>> datetime.timedelta(hours=hours, minutes=minutes, seconds=seconds)
datetime.timedelta(2, 31332)
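
Since the question mentions clustering on these waiting times, a single number per duration is often easier to work with than the (days, seconds) pair. The following is a minimal sketch, assuming total seconds is an acceptable feature; the helper name parse_duration is hypothetical and not part of any library:

```python
import datetime

def parse_duration(text):
    """Parse an 'H:MM:SS' string whose hour field may exceed 23."""
    hours, minutes, seconds = map(int, text.split(':'))
    return datetime.timedelta(hours=hours, minutes=minutes, seconds=seconds)

td = parse_duration("56:42:12")
print(td.days, td.seconds)   # normalized components: 2 31332
print(td.total_seconds())    # whole duration as a float: 204132.0
```

total_seconds() folds the days back in, so each waiting time becomes one comparable value.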

6 Comments

Thanks! Although for whatever reason it's not letting me actually change the stored values. I have the following for loop. Please note that it's Pandas code:

for time in df.elapsed_time:
    hours, minutes, seconds = map(int, time.split(':'))
    time = timedelta(hours=hours, minutes=minutes, seconds=seconds)

Why do you need to change the stored values? What values are you trying to change?
I need to change the stored values so I can do data analysis on waiting times for patients in an emergency room. There is other pertinent patient information, so I need the updated values to be in the large data frame I am working with. I am trying to convert values stored like "56:42:12" to timedeltas like (2, 31332) so I can eventually do some clustering in scikit-learn.
Hi @BrandonSherman you can use df['elapsed_time_new'] = pd.to_timedelta(df.elapsed_time) to create a new column in your DataFrame that will contain timedelta objects that you can manipulate as you need for your analysis. The documentation for to_timedelta can be found here. Note that these objects are broken down into days, hours, minutes, seconds so you may need some further work to break it down into (days, seconds) which is what you seem to want.
That's really unfortunate :( What you can do instead is write a function (call it convert for short) that takes one of your strings and returns a timedelta object. Then df['elapsed_time_new'] = df['elapsed_time'].apply(convert) will apply convert to each string in your column and store the results in the new column. You can probably base the function on the method Martijn gives in his answer above.
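
A minimal sketch of the convert function suggested in the comment above, reusing the parsing approach from the answer. It is shown on a plain list so that it runs without pandas, and the sample values are made up. Note that the loop in the earlier comment only rebinds the loop variable time, which is why the stored values never changed; building new values avoids that.

```python
from datetime import timedelta

def convert(text):
    """Turn an 'H:MM:SS' string (hours may exceed 23) into a timedelta."""
    hours, minutes, seconds = map(int, text.split(':'))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)

# Hypothetical sample data standing in for df['elapsed_time'].
elapsed_time = ["27:32:18", "56:42:12", "00:05:00"]

# Build new values instead of assigning to the loop variable.
elapsed_time_new = [convert(t) for t in elapsed_time]

# Durations as plain floats, e.g. as a numeric feature for clustering.
elapsed_seconds = [td.total_seconds() for td in elapsed_time_new]
print(elapsed_seconds)  # [99138.0, 204132.0, 300.0]
```

With a DataFrame, the same function would be passed as df['elapsed_time'].apply(convert), as the comment describes.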