Thanks for taking the time to look at my question.

I am trying to convert two date columns in a pandas dataframe using the code below. I filter on non-null values because "Closed Date" has only 4221 non-null values out of 4272 rows, and the conversion should not crash on the null cells.

In the end, the dataframe should keep its original number of rows, so I don't want to lose the rows that have null values in "Closed Date".

Dataframe overview:

<class 'pandas.core.frame.DataFrame'>
Int64Index: 4272 entries, 0 to 4271
Data columns (total 4 columns):
Created Date    4272 non-null object
Closed Date     4221 non-null object
Agency          4272 non-null object
Borough         4272 non-null object
dtypes: object(4)

The code I wrote:

col='Closed Date'
df[(df[col].notnull())] = df[(df[col].notnull())].apply(lambda x:datetime.datetime.strptime(x,'%m/%d/%Y %I:%M:%S %p'))

The error it raises:

TypeError                                 Traceback (most recent call last)
<ipython-input-155-49014bb3ecb3> in <module>()
      9 
     10 col='Closed Date'
---> 11 df[(df[col].notnull())] = df[(df[col].notnull())].apply(lambda x:datetime.datetime.strptime(x,'%m/%d/%Y %I:%M:%S %p'))
     12 print(type(df[(df[col].notnull())]))

/anaconda/lib/python3.6/site-packages/pandas/core/frame.py in apply(self, func, axis, broadcast, raw, reduce, args, **kwds)
   4358                         f, axis,
   4359                         reduce=reduce,
-> 4360                         ignore_failures=ignore_failures)
   4361             else:
   4362                 return self._apply_broadcast(f, axis)

/anaconda/lib/python3.6/site-packages/pandas/core/frame.py in _apply_standard(self, func, axis, ignore_failures, reduce)
   4454             try:
   4455                 for i, v in enumerate(series_gen):
-> 4456                     results[i] = func(v)
   4457                     keys.append(v.name)
   4458             except Exception as e:

<ipython-input-155-49014bb3ecb3> in <lambda>(x)
      9 
     10 col='Closed Date'
---> 11 df[(df[col].notnull())] = df[(df[col].notnull())].apply(lambda x:datetime.datetime.strptime(x,'%m/%d/%Y %I:%M:%S %p'))
     12 print(type(df[(df[col].notnull())]))

TypeError: ('strptime() argument 1 must be str, not Series', 'occurred at index Created Date')

  • Why don't you use df[col] = pd.to_datetime(df[col], format='%m/%d/%Y %I:%M:%S %p')? NaNs will be stored as NaTs. Commented Sep 6, 2017 at 7:29

1 Answer

I think you only need to_datetime. It converts NaN to NaT, so all values in the column are datetimes:

col='Closed Date'
df[col] = pd.to_datetime(df[col], format='%m/%d/%Y %I:%M:%S %p')
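
For context on the traceback, here is a rough sketch of why the original attempt fails: DataFrame.apply hands each whole column (a Series) to the lambda, which is why strptime complains that it got a Series "at index Created Date". If you really wanted to keep strptime, it would have to be applied to the single column instead. The sample data and the names mask, parsed and result are made up for illustration:

```python
import datetime

import numpy as np
import pandas as pd

# Hypothetical two-row sample, one null cell
df = pd.DataFrame({'Closed Date': ['05/01/2016 05:10:10 AM', np.nan]})
col = 'Closed Date'
fmt = '%m/%d/%Y %I:%M:%S %p'

# Select only the non-null cells of ONE column, so apply() sees strings
mask = df[col].notnull()
parsed = df.loc[mask, col].apply(lambda x: datetime.datetime.strptime(x, fmt))

# Re-align with the full index; rows that were null come back as missing values
result = parsed.reindex(df.index)
print(result)
```

This keeps every row, but to_datetime does the same thing in one call, so it is the simpler route.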

Sample:

df = pd.DataFrame({'Closed Date':['05/01/2016 05:10:10 AM', 
                                  '05/01/2016 05:10:10 AM', 
                                   np.nan]})

col='Closed Date'
df[col] = pd.to_datetime(df[col], format='%m/%d/%Y %I:%M:%S %p')
print (df)
          Closed Date
0 2016-05-01 05:10:10
1 2016-05-01 05:10:10
2                 NaT

print (df.dtypes)
Closed Date    datetime64[ns]
dtype: object
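
Since the question mentions two date columns, the same call can be looped over both. This is only a sketch: the sample values are invented, and errors='coerce' is an optional extra (not needed for NaN handling) that turns strings which do not match the format into NaT instead of raising:

```python
import numpy as np
import pandas as pd

# Hypothetical sample shaped like the question's dataframe
df = pd.DataFrame({
    'Created Date': ['04/30/2016 11:59:59 PM',
                     '05/01/2016 01:00:00 AM',
                     '05/01/2016 02:30:00 PM'],
    'Closed Date': ['05/01/2016 05:10:10 AM',
                    np.nan,
                    '05/02/2016 12:00:00 PM'],
})

fmt = '%m/%d/%Y %I:%M:%S %p'
for col in ['Created Date', 'Closed Date']:
    # NaN becomes NaT; errors='coerce' also maps unparseable strings to NaT
    df[col] = pd.to_datetime(df[col], format=fmt, errors='coerce')

print(df.dtypes)
print(len(df))  # row count is unchanged
```

Be aware that errors='coerce' can hide bad input, so leave it out if you would rather get an error for malformed dates.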
2 Comments

Ah, thanks again. This is only my second post here, so I'm still learning the platform. This reminds me to go back to my previous post and accept the answer :)
Thank you very much, nice day!