I have a bunch of time durations in a list, as follows:

['23m3s', '23:34', '53min 3sec', '2h 3m', '22.10', '1:23:33', ...]

As you can guess, there are N permutations of time formatting in use.

What is the most efficient or simplest way to extract the duration in seconds from each element in Python?

  • :-O But are they totally random? For example, what is 23:34: 23h and 34min? And is 1:23:33 1 day 23 hours 33 min, or 1h 23min 33sec? Commented Jan 6, 2014 at 1:26
  • You will have to write the strptime format for each one and parse them in a loop. Commented Jan 6, 2014 at 1:34
  • @maurelio79 23:34 is 23m 34s and 1:23:33 is 1h 23m 33s. Let's assume this is always the case. Commented Jan 6, 2014 at 8:37
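
The strptime suggestion in the comments could look roughly like the sketch below. The format list is an assumption: one entry per layout in the question's sample, with two-field values read as minutes and seconds as the OP states. The names FORMATS, EPOCH and parse_with_strptime are hypothetical.

```python
from datetime import datetime

# Assumed formats, one per layout seen in the question's sample list.
FORMATS = ['%H:%M:%S', '%M:%S', '%M.%S', '%Mm%Ss', '%Mmin %Ssec', '%Hh %Mm']

# strptime fills in 1900-01-01 when the format carries no date.
EPOCH = datetime(1900, 1, 1)

def parse_with_strptime(text):
    """Try each format in turn and return the duration in whole seconds."""
    for fmt in FORMATS:
        try:
            parsed = datetime.strptime(text, fmt)
        except ValueError:
            continue  # this format does not fit, try the next one
        return int((parsed - EPOCH).total_seconds())
    raise ValueError('no known format matches %r' % text)
```

One limitation of this approach: %M and %S only accept 0 to 59 and %H only 0 to 23, so a value such as '75m' would be rejected rather than converted.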

1 Answer

This is perhaps still a bit crude, but it handles all the data you've posted so far, and the totals in seconds all come to what I would expect. It combines re and timedelta, which is enough for this small sample.

>>> import re
>>> from datetime import timedelta

First, a dictionary of regexes (updated based on your comment):

d = {'hours':   [re.compile(r'(\d+)(?=h)'),
                 re.compile(r'^(\d+)[:.]\d+[:.]\d+')],
     'minutes': [re.compile(r'(\d+)(?=m)'),
                 re.compile(r'^(\d+)[:.]\d+$'),
                 re.compile(r'^\d+[.:](\d+)[.:]\d+')],
     'seconds': [re.compile(r'(\d+)(?=s)'),
                 re.compile(r'^\d+[.:]\d+[.:](\d+)'),
                 re.compile(r'^\d+[:.](\d+)$')]}
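
To see how the two kinds of pattern in the dictionary behave, here is a small illustration using the first two minutes regexes. The variable names unit_minutes and positional_minutes are made up for this example.

```python
import re

# Digits immediately followed by an "m" (the lookahead does not consume it).
unit_minutes = re.compile(r'(\d+)(?=m)')
# First field of a value with exactly two ':' or '.' separated fields.
positional_minutes = re.compile(r'^(\d+)[:.]\d+$')

print(unit_minutes.search('53min 3sec').group(1))   # 53
print(unit_minutes.search('2h 3m').group(1))        # 3
print(positional_minutes.search('23:34').group(1))  # 23
print(positional_minutes.search('1:23:33'))         # None (three fields)
```

The unit patterns pick a number by the letter that follows it, while the positional patterns rely on the anchors to tell two-field values from three-field ones.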

Then a function to try out the regexes (perhaps still a bit crude):

>>> def convert_to_seconds(*time_str):
    for t in time_str:
        td = timedelta(0)
        for key in d:
            for regex in d[key]:
                match = regex.search(t)  # search once and reuse the match
                if match:
                    # key is 'hours', 'minutes' or 'seconds', all of which
                    # are valid timedelta keyword arguments
                    td += timedelta(**{key: int(match.group(1))})
        # total_seconds() includes whole days; td.seconds would wrap at 24h
        print(int(td.total_seconds()))

Here are the results:

>>> t = ['23m3s', '23:34', '53min 3sec', '2h 3m', '22.10', '1:23:33']
>>> convert_to_seconds(*t)
1383
1414
3183
7380
1330
5013

You could add more regexes as you encounter more data, but only to an extent.
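
One possible way to avoid growing the dictionary is to handle labelled values and positional values separately. This is only a sketch, not a tested general solution: the names UNIT_SECONDS, LABELLED and to_seconds are hypothetical, it assumes a unit label always starts with h, m or s, and it assumes (per the OP's comment) that the rightmost positional field is always seconds.

```python
import re

# Seconds per unit, keyed by the first letter of the unit label.
UNIT_SECONDS = {'h': 3600, 'm': 60, 's': 1}

# A number followed by a unit word such as "h", "m", "min", "s" or "sec".
LABELLED = re.compile(r'(\d+)\s*([hms])[a-z]*', re.IGNORECASE)

# Two or three numeric fields separated by ':' or '.'.
POSITIONAL = re.compile(r'\d+(?:[:.]\d+){1,2}')

def to_seconds(text):
    """Return the duration in seconds, or raise ValueError if unrecognised."""
    text = text.strip()
    labelled = LABELLED.findall(text)
    if labelled:
        return sum(int(n) * UNIT_SECONDS[u.lower()] for n, u in labelled)
    if POSITIONAL.fullmatch(text):
        total = 0
        # Fields run from most to least significant, so fold them base 60.
        for field in re.split(r'[:.]', text):
            total = total * 60 + int(field)
        return total
    raise ValueError('unrecognised duration: %r' % text)
```

On the six sample strings from the question this returns 1383, 1414, 3183, 7380, 1330 and 5013. Anything it does not recognise raises ValueError instead of silently counting as zero.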

3 Comments

This is good stuff. And I did explore down this path, however I had to keep adding to the regex dictionary. I'll accept this unless I stumble upon a more elegant solution by the time I actually have to use it... Thanks
23:34 is 23 minutes and 34 seconds, not 23 hours and 34 minutes. Same for 22.10.
@m42 Thanks for pointing that out. I missed the OP's comment to this effect. I have updated the regexes and posted the new results.
