20

I have a pandas DataFrame like this (the 'col' column has two formats):

    col                            val
'12/1/2013'                       value1
'1/22/2014 12:00:01 AM'           value2
'12/10/2013'                      value3
'12/31/2013'                      value4 

I want to convert them into datetime, and I am considering using:

test_df['col']= test_df['col'].map(lambda x: datetime.strptime(x, '%m/%d/%Y'))    
test_df['col']= test_df['col'].map(lambda x: datetime.strptime(x, '%m/%d/%Y %I:%M:%S %p'))

Obviously neither of them works for the whole df. I was thinking about using try and except but didn't have any luck. Any suggestions?

4
  • 1
    for item in test_df.col: test_df.col = datetime.strptime(test_df.col, '%m/%d/%Y') Commented Jun 30, 2015 at 20:08
  • Are you referring to pandas dataframes? Commented Jun 30, 2015 at 20:11
  • @Christopher Pearson oh right! You mean for each item, try and except, right? THANKS! Commented Jun 30, 2015 at 20:12
  • @TigerhawkT3 Yes! pandas. Sorry about not mentioning it... I have updated my question, thanks. Commented Jun 30, 2015 at 20:13
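
The per-item try/except idea from the comments could be sketched roughly as follows. This is only an illustration: the helper name `parse_date` and the `FORMATS` tuple are made up here, and the second format string assumes the time part always looks like `12:00:01 AM`.

```python
from datetime import datetime

# Assumed formats, based on the two sample values in the question.
FORMATS = ('%m/%d/%Y', '%m/%d/%Y %I:%M:%S %p')

def parse_date(text, formats=FORMATS):
    """Return the datetime from the first format that matches text."""
    for fmt in formats:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue  # this format did not match, try the next one
    raise ValueError(f"no known format matches {text!r}")
```

It would then be applied with something like `test_df['col'] = test_df['col'].map(parse_date)`. This runs in pure Python for every row, so it is likely to be slow on large columns.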

4 Answers

19

Just use to_datetime, it's man/woman enough to handle both those formats:

In [4]:
df['col'] = pd.to_datetime(df['col'])
df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 4 entries, 0 to 3
Data columns (total 2 columns):
col    4 non-null datetime64[ns]
val    4 non-null object
dtypes: datetime64[ns](1), object(1)
memory usage: 96.0+ bytes

The df now looks like this:

In [5]:
df

Out[5]:
                  col     val
0 2013-12-01 00:00:00  value1
1 2014-01-22 00:00:01  value2
2 2013-12-10 00:00:00  value3
3 2013-12-31 00:00:00  value4

3 Comments

I guess the problem with this solution is that you would normally provide a format to speed things up; to_datetime without a format is very slow.
@Ian true but if you don't have a fixed format then you have to do it this way
@EdChum - I think if you have a reasonable amount of data, to_datetime is too slow. I've added my answer below.
17

I had two different date formats in the same column, Temps, similar to the OP. They look like this:

01.03.2017 00:00:00.000
01/03/2017 00:13

The timings for the two different code snippets are as follows:

v['Timestamp1'] = pd.to_datetime(v.Temps)

Took 25.5408718585968 seconds

v['Timestamp'] = pd.to_datetime(v.Temps, format='%d/%m/%Y %H:%M', errors='coerce')
mask = v.Timestamp.isnull()
v.loc[mask, 'Timestamp'] = pd.to_datetime(v[mask]['Temps'], format='%d.%m.%Y %H:%M:%S.%f',
                                             errors='coerce')

Took 0.2923243045806885 seconds

In other words, if you have a small number of known formats for your datetimes, don't use to_datetime without a format!

1 Comment

This is a nice solution. Knowing the pains of real-world data, you could even add a check: if there are still nulls after iterating over the known date formats, apply the generic to_datetime to the remaining values. As well as the speed improvement above, this helps minimise the risk of days and months being confused and erroneous results being produced.
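
Iterating over the known formats, as this comment describes, might look something like the sketch below. The function name `parse_known_formats` and the sample data (including the deliberately bad value) are invented for illustration; the format strings are guesses based on the question's examples.

```python
import pandas as pd

def parse_known_formats(series, formats):
    """Parse date strings, trying each known format in turn."""
    # Start with every value unparsed (NaT).
    parsed = pd.Series(pd.NaT, index=series.index, dtype="datetime64[ns]")
    for fmt in formats:
        mask = parsed.isna()  # values that no earlier format matched
        if not mask.any():
            break
        parsed.loc[mask] = pd.to_datetime(series[mask], format=fmt, errors="coerce")
    return parsed

# Sample data modelled on the question; 'not a date' is an invented bad value.
df = pd.DataFrame({"col": ["12/1/2013", "1/22/2014 12:00:01 AM", "12/10/2013", "not a date"]})
df["parsed"] = parse_known_formats(df["col"], ["%m/%d/%Y", "%m/%d/%Y %I:%M:%S %p"])

# Rows still null here matched none of the known formats.
leftover = df["parsed"].isna()
```

Anything flagged in `leftover` could then be passed to the generic `pd.to_datetime` or inspected by hand, depending on how much you trust the remaining values.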
1

You can create a new column:

test_df['col1'] = pd.to_datetime(test_df['col'])

and then drop col and rename col1.

1

It works for me. I had two formats in my column 'fecha_hechos'. The formats were:

  • 2015/03/02
  • 10/02/2010

What I did was:

carpetas_cdmx['Timestamp'] = pd.to_datetime(carpetas_cdmx.fecha_hechos, format='%Y/%m/%d %H:%M:%S', errors='coerce')
mask = carpetas_cdmx.Timestamp.isnull()
carpetas_cdmx.loc[mask, 'Timestamp'] = pd.to_datetime(carpetas_cdmx[mask]['fecha_hechos'], format='%d/%m/%Y %H:%M',errors='coerce')

where carpetas_cdmx is my DataFrame and fecha_hechos is the column with my formats.
