I have a dataframe with dates that comes from a CSV file. I need to add a column with the actual difference in days between the dates in my column and the date '6/1/2021'. I used
Act_Days.append((pd.to_datetime(df.date[t])-
pd.to_datetime(df.settle_date))/np.timedelta64(1, 'D'))
This code works, but it takes a long time to run because the dataset has about 30K rows and I assume it calculates row by row. Is there any way to increase the speed? I heard that working with numpy arrays is much faster than working with pandas series. However, when I try to convert my dates column to a numpy array, Python doesn't subtract the 6/1/2021 date afterwards. It shows an error:
dates=output.date.to_numpy()
np.datetime64(dates)-np.timedelta64('2021-6-1', 'D')
--------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-9-05fdef3e68dd> in <module>
1 dates=output.date.to_numpy()
----> 2 np.datetime64(dates)-np.timedelta64('2021-6-1', 'D')
ValueError: Could not convert object to NumPy datetime

You subtract two datetime64 values, and the result is a `timedelta64`. A general note: since pandas uses numpy arrays under the hood, I doubt you'll gain much from changing data types; you might benefit more from refactoring the code.
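As a rough sketch of what such a refactoring could look like: parse the whole column once and subtract a single reference date, so the work happens on the full array rather than row by row. The sample data, the `act_days` column name, and the month/day/year format are assumptions for illustration; only the `date` column name comes from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample standing in for the CSV data.
df = pd.DataFrame({"date": ["6/1/2021", "6/15/2021", "7/1/2021"]})

# Parse the whole column in one call instead of once per row.
# The month/day/year format is an assumption about the CSV contents.
dates = pd.to_datetime(df["date"], format="%m/%d/%Y")

# Subtracting a single Timestamp broadcasts over every row and
# yields a timedelta Series; dividing by one day gives float days.
settle = pd.Timestamp("2021-06-01")
df["act_days"] = (dates - settle) / np.timedelta64(1, "D")

# NumPy equivalent: both operands are datetime64 values, so the
# difference is a timedelta64 array.
np_days = (dates.to_numpy() - np.datetime64("2021-06-01")) / np.timedelta64(1, "D")
```

The error in the question most likely comes from passing a date string to `np.timedelta64`, which represents a duration, and from calling `np.datetime64` on an array of unparsed strings. With the column already parsed, the pandas and NumPy versions should give the same numbers, so the NumPy conversion is optional.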