I am able to convert a column of pandas Timestamps in a NumPy object array to integers:
import numpy as np
import pandas as pd
# pd.datetime was deprecated and later removed from pandas; pd.Timestamp builds the same values
df = pd.DataFrame({'a': [pd.Timestamp(2019, 1, 11, 5, 30, 1), pd.Timestamp(2019, 1, 11, 5, 30, 1), pd.Timestamp(2019, 1, 11, 5, 30, 1)], 'b': [np.nan, 5.1, 1.6]})
a = df.to_numpy()
a
# array([[Timestamp('2019-01-11 05:30:01'), nan],
# [Timestamp('2019-01-11 05:30:01'), 5.1],
# [Timestamp('2019-01-11 05:30:01'), 1.6]], dtype=object)
a[:,0] = a[:,0].astype('datetime64').astype(np.int64)
a
# array([[1547184601000000, nan],
# [1547184601000000, 5.1],
# [1547184601000000, 1.6]], dtype=object)
For this array a, I would like to convert column 0 back to pandas Timestamps. Because the array is quite big and my overall process is already time-consuming, I would like to avoid Python loops, apply calls, lambdas, and similar constructs. Instead, I am looking for fast, vectorized NumPy-based functions.
I have already tried things like:
a[:,0].astype('datetime64')
(result: ValueError: Converting an integer to a NumPy datetime requires a specified unit)
and:
import calendar
calendar.timegm(a[:,0].utctimetuple())
(result: AttributeError: 'numpy.ndarray' object has no attribute 'utctimetuple')
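The ValueError from the first attempt arises because NumPy cannot tell whether a bare integer counts seconds, microseconds, or nanoseconds. A minimal sketch of passing the unit explicitly, assuming the integers are microseconds since the Unix epoch (consistent with 1547184601000000 above); the names ints and stamps are illustrative only:

```python
import numpy as np

# Same integer values as column 0 above, held in an object array.
# Assumption: they count microseconds since the Unix epoch.
ints = np.array([1547184601000000] * 3, dtype=object)

# Cast the object column to int64 first, then to datetime64 with an explicit unit.
stamps = ints.astype(np.int64).astype('datetime64[us]')

print(stamps.dtype)  # datetime64[us]
print(stamps[0])     # 2019-01-11T05:30:01.000000
```

This yields numpy.datetime64 values rather than pandas Timestamp objects.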
How can I convert my column a[:,0] back to
array([[Timestamp('2019-01-11 05:30:01'), nan],
[Timestamp('2019-01-11 05:30:01'), 5.1],
[Timestamp('2019-01-11 05:30:01'), 1.6]], dtype=object)
in a speed-optimized way? In other words, how do I convert the ints (1547184601000000 etc.) back to Timestamps ('2019-01-11 05:30:01')?
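One possible direction, sketched here without any benchmarking, is to let pd.to_datetime do the conversion in a single vectorized call and write the resulting Timestamp objects back into the object column. It assumes the integers are microseconds since the Unix epoch, hence unit='us':

```python
import numpy as np
import pandas as pd

# The array as it looks after the int conversion shown earlier.
a = np.array([[1547184601000000, np.nan],
              [1547184601000000, 5.1],
              [1547184601000000, 1.6]], dtype=object)

# Vectorized conversion: int64 microseconds -> DatetimeIndex (unit is an assumption).
idx = pd.to_datetime(a[:, 0].astype(np.int64), unit='us')

# to_numpy(dtype=object) produces pandas Timestamp objects, which fit the object array.
a[:, 0] = idx.to_numpy(dtype=object)

print(a[0, 0])
```

The second column is left untouched, and no Python-level loop, apply, or lambda is involved.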