I am querying a server for specific data and need to extract the year from the date field it returns. However, the format of the date field varies, for example:

2009
2009-10-8
2009-10
2017-10-22
2017-10

The obvious approach would be to split the date into a list and take the max (but there is a problem):

year = max(d.split('-'))

For some reason this gives false positives, as 22 seems to be the max rather than 2017. Also, if future calls to the server return the date as "2019/10/20", this will cause issues as well.

    If it's at least known that the year will always be the first 4 characters: d[:4]. The reason max doesn't work for you is that you're comparing strings; you need to turn them into ints first: max(map(int, d.split('-'))). Commented May 9, 2020 at 9:52

2 Answers

The problem is that, while 2017 > 22, '2017' < '22' because it's a string comparison. You could do this to resolve that:

year = max(map(int, d.split('-')))
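To make the string-versus-integer difference concrete, here is a minimal sketch. The helper name year_by_max is hypothetical, and the approach assumes the year is always the largest number in the string (true for any year above 31). It also splits on '/' so that a value like "2019/10/20" would be handled:

```python
import re

def year_by_max(d):
    """Return the largest numeric component of a date string as an int."""
    # Split on either '-' or '/' so both separators are handled.
    parts = re.split(r"[-/]", d)
    # Compare as integers: as strings, '22' sorts after '2017'.
    return max(int(p) for p in parts)

print(max("2017-10-22".split("-")))  # string comparison picks '22'
print(year_by_max("2017-10-22"))     # integer comparison picks 2017
print(year_by_max("2019/10/20"))     # 2019
```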

But instead, if you don't mind being frowned upon by the Long Now Foundation, consider using a regular expression to extract any 4-digit number:

import re

match = re.search(r'\b\d{4}\b', d)
if match:
    year = int(match.group(0))
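Wrapped in a function, the same pattern can be tried against each of the formats from the question. This is only a sketch: extract_year is a hypothetical name, and returning None when no 4-digit number is found is an assumption about what the caller wants:

```python
import re

# Same pattern as above: a standalone run of exactly four digits.
YEAR_RE = re.compile(r"\b\d{4}\b")

def extract_year(d):
    """Return the first 4-digit number in d as an int, or None if absent."""
    match = YEAR_RE.search(d)
    return int(match.group(0)) if match else None

for d in ["2009", "2009-10-8", "2017-10", "2019/10/20", "10-8"]:
    print(d, extract_year(d))
```

Because '-' and '/' are both non-word characters, the \b boundaries work for either separator, so the "2019/10/20" case needs no special handling.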

1 Comment

Ah, the feared Y10K problem… ;-)))

I would use the python-dateutil library to easily extract the year from a date string:

from dateutil.parser import parse

dates = ['2009', '2009-10-8', '2009-10']

for date in dates:
    print(parse(date).year)

Output:

2009
2009
2009
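If installing a third-party package is not an option, a rough standard-library alternative (a different technique from dateutil, not a drop-in replacement) is to try a fixed list of formats with datetime.strptime. The FORMATS tuple and the name year_from_known_formats are assumptions for illustration; any format not listed yields None:

```python
from datetime import datetime

# Formats seen in the question, plus the anticipated slash-separated one.
FORMATS = ("%Y", "%Y-%m", "%Y-%m-%d", "%Y/%m/%d")

def year_from_known_formats(d):
    """Return the year if d matches one of FORMATS, otherwise None."""
    for fmt in FORMATS:
        try:
            # strptime raises ValueError when d does not match fmt exactly.
            return datetime.strptime(d, fmt).year
        except ValueError:
            continue
    return None

for d in ["2009", "2009-10-8", "2009-10", "2019/10/20"]:
    print(year_from_known_formats(d))
```

Unlike the regular expression, this rejects strings that are not valid dates, at the cost of having to list every expected format up front.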
