I am preprocessing data and want to convert date strings into integers, but my attempt raises an error. How can I convert them?

I have data like this :

0        Apr-12
1        Apr-12
2        Mar-12
3        Apr-12
4        Apr-12

I tried this :

d=df['d_date'].apply(lambda x: datetime.strptime(x, '%m%Y'))

It raised the following error:

ValueError                                Traceback (most recent call last)
<ipython-input-134-173081812744> in <module>()
----> 1 d=test['first_payment_date'].apply(lambda x: datetime.strptime(x, '%m%Y'))

~\Anaconda3\lib\site-packages\pandas\core\series.py in apply(self, func, convert_dtype, args, **kwds)
   4036             else:
   4037                 values = self.astype(object).values
-> 4038                 mapped = lib.map_infer(values, f, convert=convert_dtype)
   4039 
   4040         if len(mapped) and isinstance(mapped[0], Series):

pandas\_libs\lib.pyx in pandas._libs.lib.map_infer()

<ipython-input-134-173081812744> in <lambda>(x)
----> 1 d=test['first_payment_date'].apply(lambda x: datetime.strptime(x, '%m%Y'))

~\Anaconda3\lib\_strptime.py in _strptime_datetime(cls, data_string, format)
    563     """Return a class cls instance based on the input string and the
    564     format string."""
--> 565     tt, fraction = _strptime(data_string, format)
    566     tzname, gmtoff = tt[-2:]
    567     args = tt[:6] + (fraction,)

~\Anaconda3\lib\_strptime.py in _strptime(data_string, format)
    360     if not found:
    361         raise ValueError("time data %r does not match format %r" %
--> 362                          (data_string, format))
    363     if len(data_string) != found.end():
    364         raise ValueError("unconverted data remains: %s" %

ValueError: time data 'Apr12' does not match format '%m%Y'
  • You need %b-%y. See this for more info. Commented Aug 22, 2019 at 16:28
  • pd.to_datetime(df['d_date'],format='%b-%y') expanding on what @harvpan says Commented Aug 22, 2019 at 16:36
  • Question has nothing to do with machine-learning - kindly do not spam irrelevant tags (removed & replaced with datetime). Commented Aug 22, 2019 at 21:31

1 Answer

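IIUC, you need to use %b-%y, since Apr matches %b (abbreviated month name) and 12 matches %y (two-digit year). Your format '%m%Y' expects a numeric month followed by a four-digit year, which is why it fails. Refer to Python's strftime directives for more information. Once you have datetime objects, you can convert them to Unix timestamps.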
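As a minimal standard-library sketch of the directive mismatch: the helper name month_year_to_unix is hypothetical, and the strings are assumed to contain a hyphen as in the sample data (if they really look like 'Apr12', as in the traceback, the format would be '%b%y'). Note that %b depends on the locale, so this assumes English month abbreviations.

```python
from datetime import datetime, timezone


def month_year_to_unix(text):
    """Parse a string such as 'Apr-12' and return Unix seconds (UTC)."""
    # %b = abbreviated month name, %y = two-digit year
    parsed = datetime.strptime(text, "%b-%y")
    # Treat the naive datetime as UTC so the result does not depend
    # on the machine's local time zone.
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())


# The original format expects a numeric month and a four-digit year,
# so it cannot match 'Apr-12' and raises ValueError.
try:
    datetime.strptime("Apr-12", "%m%Y")
except ValueError as exc:
    error_message = str(exc)

print(error_message)
print(month_year_to_unix("Apr-12"))
```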

df:

col
0   Apr-12
1   Apr-12

For a numeric Unix timestamp (seconds since the epoch, returned as floats):

pd.Series(pd.to_datetime(df['col'], format='%b-%y').values.astype(float)).div(10**9)

Output:

0    1.333238e+09
1    1.333238e+09
dtype: float64

Explanation:

pd.to_datetime(df['col'], format='%b-%y')

Outputs:

0   2012-04-01
1   2012-04-01
Name: col, dtype: datetime64[ns]
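If true integers are wanted rather than floats, one possible sketch is to subtract the epoch and floor-divide by one second. The YYYYMM encoding is only an alternative I am assuming might suit preprocessing, and the names unix_seconds and yyyymm are illustrative, not from the question.

```python
import pandas as pd

# Sample data in the same shape as the question
df = pd.DataFrame({"col": ["Apr-12", "Apr-12", "Mar-12"]})

# Parse abbreviated month name and two-digit year
parsed = pd.to_datetime(df["col"], format="%b-%y")

# Integer seconds since the Unix epoch
epoch = pd.Timestamp("1970-01-01")
unix_seconds = (parsed - epoch) // pd.Timedelta(seconds=1)

# Alternative (assumed use case): a compact YYYYMM integer
yyyymm = parsed.dt.year * 100 + parsed.dt.month

print(unix_seconds.tolist())
print(yyyymm.tolist())
```

Which encoding is preferable depends on the later use: Unix seconds preserve spacing between dates, while YYYYMM stays human-readable.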