How to convert int array back to pandas timestamp?

Question

I am able to convert a numpy-array column of type pandas timestamp to an int array:

import numpy as np
import pandas as pd

# pd.datetime was removed in newer pandas releases; pd.Timestamp builds the same values
df = pd.DataFrame({'a': [pd.Timestamp(2019, 1, 11, 5, 30, 1), pd.Timestamp(2019, 1, 11, 5, 30, 1), pd.Timestamp(2019, 1, 11, 5, 30, 1)], 'b': [np.nan, 5.1, 1.6]})

a = df.to_numpy()
a
# array([[Timestamp('2019-01-11 05:30:01'), nan],
#       [Timestamp('2019-01-11 05:30:01'), 5.1],
#       [Timestamp('2019-01-11 05:30:01'), 1.6]], dtype=object)
a[:,0] = a[:,0].astype('datetime64').astype(np.int64)
# array([[1547184601000000, nan],
#        [1547184601000000, 5.1],
#        [1547184601000000, 1.6]], dtype=object)

For this array a, I would like to convert column 0 back to pandas timestamps. As the array is quite big and my overall process is quite time-consuming, I would like to avoid Python loops, apply calls, lambdas and similar constructs. Instead, I am looking for speed-optimized, native NumPy-based functions.

I have already tried things like:

a[:,0].astype('datetime64')

(result: ValueError: Converting an integer to a NumPy datetime requires a specified unit)

and:

import calendar
calendar.timegm(a[:,0].utctimetuple())

(result: AttributeError: 'numpy.ndarray' object has no attribute 'utctimetuple')

How can I convert my column a[:,0] back to

array([[Timestamp('2019-01-11 05:30:01'), nan],
      [Timestamp('2019-01-11 05:30:01'), 5.1],
      [Timestamp('2019-01-11 05:30:01'), 1.6]], dtype=object)

in a speed-optimized way?

What do you mean by "back to"? I can't see the difference between your original data and the desired output.
– Frank AK, Commented Aug 15, 2019 at 3:04
By "back to" I mean getting from the column of ints ('1547184601000000' etc.) back to the Timestamps ('2019-01-11 05:30:01').
– user7468395, Commented Aug 15, 2019 at 3:08

Frank AK · Accepted Answer · 2019-08-15 03:22:58Z

Let's review the docs for DatetimeIndex:

Immutable ndarray of datetime64 data, represented internally as int64, and which can be boxed to Timestamp objects that are subclasses of datetime and carry metadata such as frequency information.

So we can use DatetimeIndex to turn the integers into timestamps, and convert the result back to integers with astype(np.int64).

In [18]: b = a[:,0]                                                             

In [19]: index = pd.DatetimeIndex(b)

In [21]: index.astype(np.int64)                                                 
Out[21]: Int64Index([1547184601000000000, 1547184601000000000, 1547184601000000000], dtype='int64')
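The output above reads the integers as nanoseconds, while the question's integers are microseconds. A minimal sketch of the round trip, assuming the integers really are microseconds since the epoch (as produced by the question's astype('datetime64') step): pd.to_datetime accepts a unit argument, and casting the result to object gives back Timestamp elements. The names micros and stamps are only illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the question's object array: column 0 holds microseconds since the epoch.
a = np.array([[1547184601000000, np.nan],
              [1547184601000000, 5.1],
              [1547184601000000, 1.6]], dtype=object)

# Cast to int64 first so the values stay exact integers, then name the unit explicitly.
micros = a[:, 0].astype(np.int64)
stamps = pd.to_datetime(micros, unit='us')  # vectorized, returns a DatetimeIndex

# Casting to object boxes every element as a pandas Timestamp before it is
# written back into the object-dtype array.
a[:, 0] = stamps.astype(object).to_numpy()
```

No Python-level loop is involved; the conversion runs inside pandas.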


3 Comments

user7468395 Over a year ago

Thanks, that was very helpful. You have to multiply by 1000 though, to get the right result: pd.DatetimeIndex(a[:,0]*1e3)
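The factor of 1000 is needed because DatetimeIndex interprets plain integers as nanoseconds, while the array holds microseconds. A small sketch of the same idea that stays in integer arithmetic, on the assumption that a float factor such as 1e3 could lose precision for values that are not exactly representable as floats:

```python
import numpy as np
import pandas as pd

# Microseconds since the epoch, as in column 0 of the question's array.
micros = np.array([1547184601000000] * 3, dtype=np.int64)

# Scale to nanoseconds with an integer factor so no float rounding can occur.
index = pd.DatetimeIndex(micros * 1000)
```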

wwii Over a year ago

can be boxed to - what does that mean?

Frank AK Over a year ago

@wwii To explain your question, I think you should check the link pandas.pydata.org/pandas-docs/version/0.25/reference/api/…
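As an informal illustration of "boxed" (my reading of the quoted docs, not an official definition): the index stores raw datetime64 values, and pulling out a single element wraps it in a Timestamp object.

```python
from datetime import datetime

import pandas as pd

index = pd.DatetimeIndex(['2019-01-11 05:30:01'])

raw = index.to_numpy()  # the underlying NumPy datetime64 values
boxed = index[0]        # one element, wrapped ("boxed") as a pandas Timestamp

print(raw.dtype)                    # a datetime64 dtype
print(type(boxed).__name__)         # Timestamp
print(isinstance(boxed, datetime))  # True: Timestamp subclasses datetime
```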
