I have a list of dates like the one below:

date_list = ['1. Okt 2021', '2. Okt 2021', '3. Okt 2021', '4. Okt 2021', '5. Okt 2021', '6. Okt 2021', '24. Sep 2021', '25. Sep 2021', '26. Sep 2021']

I want to convert them to datetime objects:

from datetime import datetime
dates = [datetime.strptime(x,"%d %b %Y") for x in date_list]

The output is:

Traceback (most recent call last):
  File "c:/Users/Benutzt/Desktop/web_scraping/main.py", line 27, in <module>
    dates = [datetime.strptime(x,"%d %M %Y") for x in date_list]
  File "c:/Users/Benutzt/Desktop/web_scraping/main.py", line 27, in <listcomp>
    dates = [datetime.strptime(x,"%d %M %Y") for x in date_list]
  File "C:\Users\Benutzt\anaconda3\lib\_strptime.py", line 568, in _strptime_datetime
    tt, fraction, gmtoff_fraction = _strptime(data_string, format)
  File "C:\Users\Benutzt\anaconda3\lib\_strptime.py", line 349, in _strptime
    raise ValueError("time data %r does not match format %r" %
ValueError: time data '1. Okt 2021' does not match format '%d %b %Y'
  • What month is Okt? Is that English? Commented Oct 1, 2021 at 11:36
  • Oktober, in German. Commented Oct 1, 2021 at 11:38
  • Also, there is no 31st or 32nd of September; could you please clarify? Commented Oct 1, 2021 at 11:39
  • It is only an example because the data is much larger. That was an error on my part; I will rewrite it. Commented Oct 1, 2021 at 11:42
  • Why tag pandas? Are the values in a dataframe? Commented Oct 1, 2021 at 11:47

3 Answers

For language-specific month (or day) names, you can set the locale, e.g. to German:

import locale
locale.setlocale(locale.LC_TIME, 'de_de') # the locale name (2nd argument) is platform-specific, e.g. 'de_DE.UTF-8' on many Linux systems

With the locale set, a list of valid date strings parses as follows (note the "." after %d in the format):

from datetime import datetime
date_list = ['1. Okt 2021', '2. Okt 2021', '3. Okt 2021', '4. Okt 2021', '5. Okt 2021', '6. Okt 2021', '30. Sep 2021']
dates = [datetime.strptime(x, "%d. %b %Y") for x in date_list]

print(dates)
[datetime.datetime(2021, 10, 1, 0, 0), datetime.datetime(2021, 10, 2, 0, 0), datetime.datetime(2021, 10, 3, 0, 0), datetime.datetime(2021, 10, 4, 0, 0), datetime.datetime(2021, 10, 5, 0, 0), datetime.datetime(2021, 10, 6, 0, 0), datetime.datetime(2021, 9, 30, 0, 0)]

Side-note: The locale setting also makes it work in pandas:

import pandas as pd
df = pd.DataFrame({'dates': date_list})
df['dates'] = pd.to_datetime(df['dates'], format="%d. %b %Y")

df['dates']
0   2021-10-01
1   2021-10-02
2   2021-10-03
3   2021-10-04
4   2021-10-05
5   2021-10-06
6   2021-09-30
Name: dates, dtype: datetime64[ns]
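
Because the locale name is platform-specific, a rough sketch of a fallback is to try several names and keep the first one that works. The helper `set_time_locale` and the candidate list below are hypothetical and only for illustration; which names exist depends on the operating system and the locales installed on it.

```python
import locale

def set_time_locale(candidates):
    """Try each locale name in turn; return the first that could be set, else None."""
    for name in candidates:
        try:
            locale.setlocale(locale.LC_TIME, name)
            return name
        except locale.Error:
            # This name is not available on the current system; try the next one.
            continue
    return None

# Candidate names are assumptions, not an exhaustive or guaranteed list.
GERMAN_CANDIDATES = ["de_DE.UTF-8", "de_DE", "de_de", "German_Germany", "German"]

chosen = set_time_locale(GERMAN_CANDIDATES)
if chosen is None:
    print("No German locale available; '%b' will not match names like 'Okt'.")
else:
    print("Using locale:", chosen)
```

If no candidate can be set, strptime keeps using the current locale, so German month names would still fail to parse.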
7 Comments

TypeError: strptime() argument 1 must be str, not float
@mika I cannot reproduce this; the example works fine for me. Please check the state of your variable "date_list", or run my example in a clean namespace to get an idea of how this works.
Yes, I see, thank you! Can I upload the .csv file that the data came from?
The solution is to import locale. Big thanks!
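
The cause of the TypeError above was not established in this thread. One possible cause, stated here as an assumption, is a missing value that was read from a CSV file as a float NaN. A minimal sketch that skips non-string entries could look like this; `parse_or_none` is a hypothetical helper name.

```python
from datetime import datetime

def parse_or_none(value, fmt="%d. %b %Y"):
    """Parse value with strptime; return None for non-strings such as float NaN."""
    if not isinstance(value, str):
        return None
    return datetime.strptime(value, fmt)

# 'Sep' is spelled the same in English and German, so this sample
# does not need a German locale to run.
raw = ["30. Sep 2021", float("nan")]
parsed = [parse_or_none(v) for v in raw]
print(parsed)
```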
You can use the dateparser package:

# Python env: pip install dateparser
# Anaconda env: conda install dateparser
import pandas as pd
from dateparser import parse

df = pd.DataFrame({'Date': ['1. Okt 2021', '2. Okt 2021', '3. Okt 2021',
                            '4. Okt 2021', '5. Okt 2021', '6. Okt 2021',
                            '24. Sep 2021', '25. Sep 2021', '26. Sep 2021']})

df['Date'] = df['Date'].apply(parse, languages=['de'])
print(df['Date'])

# Output:
0   2021-10-01
1   2021-10-02
2   2021-10-03
3   2021-10-04
4   2021-10-05
5   2021-10-06
6   2021-09-24
7   2021-09-25
8   2021-09-26
Name: Date, dtype: datetime64[ns]

For a list:

date_list = ['1. Okt 2021', '2. Okt 2021', '3. Okt 2021',
             '4. Okt 2021', '5. Okt 2021', '6. Okt 2021',
             '24. Sep 2021', '25. Sep 2021', '26. Sep 2021']

dates = [parse(d, languages=['de']) for d in date_list]
print(dates)

# Output:
[datetime.datetime(2021, 10, 1, 0, 0),
 datetime.datetime(2021, 10, 2, 0, 0),
 datetime.datetime(2021, 10, 3, 0, 0),
 datetime.datetime(2021, 10, 4, 0, 0),
 datetime.datetime(2021, 10, 5, 0, 0),
 datetime.datetime(2021, 10, 6, 0, 0),
 datetime.datetime(2021, 9, 24, 0, 0),
 datetime.datetime(2021, 9, 25, 0, 0),
 datetime.datetime(2021, 9, 26, 0, 0)]

2 Comments

@mika. Even if you already chose the right answer for you, can you check my solution. dateparser is a very useful module.
Convenient... I ran it through %timeit and it is about 25x slower than pd.to_datetime with the format specified, so convenience is traded for efficiency, I guess.
It looks like the first part of each string is an ID, e.g. the position of the item in a list, rather than a day. If so, you'll need to remove it before converting the dates. Also, Okt will not match the %b format under an English locale; you'll need to convert it to Oct.

from datetime import datetime
dates = [datetime.strptime(x.split(".")[-1].strip().replace("Okt", "Oct"), "%b %Y") for x in date_list]
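
If the leading number is in fact the day, the month conversion can also be done without changing the locale. The following is a minimal sketch under that assumption; the `GERMAN_MONTHS` table and the `parse_german_date` helper are hypothetical and may need adjusting to the abbreviations used in the actual data.

```python
from datetime import datetime

# German month abbreviations mapped to month numbers.
# This table is an assumption about how the source abbreviates months.
GERMAN_MONTHS = {
    "Jan": 1, "Feb": 2, "Mär": 3, "Mrz": 3, "Apr": 4, "Mai": 5, "Jun": 6,
    "Jul": 7, "Aug": 8, "Sep": 9, "Okt": 10, "Nov": 11, "Dez": 12,
}

def parse_german_date(text):
    """Parse a string such as '1. Okt 2021' without touching the process locale."""
    day_part, month_part, year_part = text.split()
    day = int(day_part.rstrip("."))                   # '1.' -> 1
    month = GERMAN_MONTHS[month_part.rstrip(".")]     # 'Okt' -> 10
    return datetime(int(year_part), month, day)

date_list = ['1. Okt 2021', '24. Sep 2021']
dates = [parse_german_date(x) for x in date_list]
print(dates)
```

An out-of-range day such as 32 makes the datetime constructor raise a ValueError, so invalid entries surface instead of being silently dropped.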

2 Comments

that just ignores the day, no?
@MrFuppes, if you look at the edits to the question, you will see that the original example included the following two date strings: 31. Sep 2021 and 32. Sep 2021. As these are not valid days, I assumed that the first part of the strings were not dates, but were in fact from a list of some type. Perhaps they copied from another document, which could have entries like 88. Dec 2021 and 245. Oct 2022.
