7

How can I parse the following string in Python to extract the year:

'years since 1250-01-01 0:0:0'

The answer should be 1250

3 Answers

25

There are all sorts of ways to do it. Here are several options:

  • dateutil parser in a "fuzzy" mode:

    In [1]: s = 'years since 1250-01-01 0:0:0'
    
    In [2]: from dateutil.parser import parse
    
    In [3]: parse(s, fuzzy=True).year  # resulting year would be an integer
    Out[3]: 1250
    
  • regular expressions with a capturing group:

    In [2]: import re
    
    In [3]: re.search(r"years since (\d{4})", s).group(1)
    Out[3]: '1250'
    
  • splitting by "since" and then by a dash:

    In [2]: s.split("since", 1)[1].split("-", 1)[0].strip()
    Out[2]: '1250'
    
  • or maybe even splitting by the first dash and taking the last four characters of the first substring:

    In [2]: s.split("-", 1)[0][-4:]
    Out[2]: '1250'
    

The last two involve more "moving parts" and might not work if the format of the input string varies.
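
To illustrate that caveat, here is a small sketch that runs the regex and the two split-based options side by side. The second sample string (slashes instead of dashes) is a made-up variation, not something from the question, and the function names are only illustrative.

```python
import re

def year_by_regex(s):
    # Capture the four digits that follow "years since ".
    m = re.search(r"years since (\d{4})", s)
    return m.group(1) if m else None

def year_by_split(s):
    # Split on "since", then keep everything before the first dash.
    return s.split("since", 1)[1].split("-", 1)[0].strip()

def year_by_slice(s):
    # Keep the last four characters before the first dash.
    return s.split("-", 1)[0][-4:]

samples = [
    "years since 1250-01-01 0:0:0",  # the string from the question
    "years since 1250/01/01 0:0:0",  # hypothetical variant with slashes
]

for s in samples:
    print(year_by_regex(s), year_by_split(s), year_by_slice(s))
```

All three agree on the original string. On the slash variant the regex still returns '1250', while the split version returns '1250/01/01 0:0:0' and the slice returns ':0:0', because neither finds a dash to split on.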


2 Comments

Didn't know about "fuzzy". Neat.
Neat! I didn't know about this one either.
5

You can use a regex with a capture group around the four digits, while also requiring a particular pattern around it. I would probably look for a pattern made up of:

  • four digits in a capture group (\d{4})

  • hyphen -

  • two digits \d{2}

  • hyphen -

  • two digits \d{2}

Giving: (\d{4})-\d{2}-\d{2}

Demo:

>>> import re
>>> d = re.findall(r'(\d{4})-\d{2}-\d{2}', 'years since 1250-01-01 0:0:0')
>>> d
['1250']
>>> d[0]
'1250'

If you need it as an int, just convert it:

>>> int(d[0])
1250
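
If the input might not contain a date at all, `d[0]` raises an `IndexError`. Below is a minimal sketch of a helper that handles that case, using `re.search` rather than `re.findall`. The names `DATE_RE` and `extract_year` are made up for illustration.

```python
import re

# Four-digit year followed by -MM-DD; only the year is captured.
DATE_RE = re.compile(r"(\d{4})-\d{2}-\d{2}")

def extract_year(text):
    """Return the year as an int, or None if no YYYY-MM-DD date is found."""
    match = DATE_RE.search(text)
    return int(match.group(1)) if match else None

print(extract_year("years since 1250-01-01 0:0:0"))
print(extract_year("no date here"))
```

Returning `None` leaves it to the caller to decide what a missing date means. Raising a `ValueError` instead would be an equally reasonable choice.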

1 Comment

You don't need the \s in the beginning.
2

The following regex should make the four digit year available as the first capture group:

^.*(\d{4})-\d{2}-\d{2}.*$
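
A quick sketch of how this could be applied from Python, assuming the intended pattern is `^.*(\d{4})-\d{2}-\d{2}.*$`, with the backslash before the `d`. Because the pattern is anchored at both ends, `re.match` is sufficient here.

```python
import re

s = 'years since 1250-01-01 0:0:0'

# Anchored pattern: the year is the first (and only) capture group.
m = re.match(r"^.*(\d{4})-\d{2}-\d{2}.*$", s)

# m is None when the string contains no YYYY-MM-DD date.
year = int(m.group(1)) if m else None
print(year)
```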
