50

Given a pandas DataFrame in Python with an integer column named time, I can convert that column to datetime with the following instruction.

df['time'] = pandas.to_datetime(df['time'], unit='s')

The column now has entries like 2019-01-15 13:25:43.

What is the command to convert the datetime back to an integer timestamp (the number of seconds elapsed since 1970-01-01 00:00:00)?

I checked pandas.Timestamp but could not find a conversion utility and I was not able to use pandas.to_timedelta for this.

Is there any utility for this conversion?


5 Answers

47

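You can cast the column to a 64-bit integer with astype('int64'), which for a datetime64[ns] column gives nanoseconds since the Unix epoch, and divide by 10**9 to get the number of seconds since the epoch.

import pandas as pd
df = pd.DataFrame({'time': [pd.to_datetime('2019-01-15 13:25:43')]})
# The int64 cast yields nanoseconds since the epoch; "/" returns float seconds (use "//" for integer seconds)
df_unix_sec = pd.to_datetime(df['time']).astype('int64') / 10**9
print(df_unix_sec)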

9 Comments

This would be fantastic, but it's not giving the expected result. I tried the following lines: df = pd.DataFrame({'time': [pd.to_datetime('2019-01-15 13:25:43')]}) df['time'] = pandas.to_datetime(df['time'], unit='s',origin='unix') It does not return any error, but I cannot see any change in the column.
Psst, casting to int is in my answer ;-)
@FrancescoBoi actually, I initially misunderstood the to_datetime parameters. Have a look, I also asked a question on SO here: stackoverflow.com/questions/54313463/…. So if you cast it to int, it'll work for you :)
Well, if you can just add that you need to divide by 10 ** 9 to get a Unix timestamp, I'll just delete my answer then.
Since I was getting a float after dividing by 10**9, in my opinion it is better to add another cast: res = (pd.to_datetime(df['time'], unit='s').astype(int)/10**9).astype(int)
41

The easiest and fastest way is to use .view(int):

df['time'] = df['time'].view('int64') // 10**9  # integer divisor keeps the result an integer (1e9 would give floats)

Other options:

df['time'] = df['time'].apply(lambda x: x.value) // 10**9
df['time'] = df['time'].astype('int64') // 10**9

Using %%timeit on 1000 dates I measured:

  • .view: 119 µs ± 998 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
  • .astype: 129 µs ± 676 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
  • .apply: 629 µs ± 5.38 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
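
A rough sketch of how such timings could be reproduced outside a notebook, assuming the standard-library timeit module in place of the %%timeit magic. The helper names and the 1000-date sample are made up for illustration. Because Series.view is, as far as I know, deprecated in recent pandas releases, the sketch takes the view on the underlying NumPy array instead, so it is not an exact replica of the measurement above. Absolute numbers will differ by machine and pandas version.

```python
import timeit

import pandas as pd

# 1000 sample dates (hypothetical data), forced to nanosecond resolution so
# that dividing by 10**9 really yields seconds.
dates = pd.Series(pd.date_range("2019-01-01", periods=1000, freq="D"))
dates = dates.astype("datetime64[ns]")


def via_numpy_view(s):
    # Reinterpret the datetime64[ns] buffer as int64 without copying.
    return pd.Series(s.to_numpy().view("int64") // 10**9, index=s.index)


def via_astype(s):
    # Cast to int64 nanoseconds, then integer-divide down to seconds.
    return s.astype("int64") // 10**9


def via_apply(s):
    # Per-element access to Timestamp.value (nanoseconds since the epoch).
    return s.apply(lambda ts: ts.value) // 10**9


timings = {}
for func in (via_numpy_view, via_astype, via_apply):
    # Best of 3 repeats, 50 calls each, reported as seconds per call.
    best = min(timeit.repeat(lambda: func(dates), number=50, repeat=3)) / 50
    timings[func.__name__] = best
    print(f"{func.__name__}: {best * 1e6:.1f} µs per call")
```

Taking the minimum over a few repeats is a common way to reduce noise from other processes; %%timeit reports mean and standard deviation instead.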

 

5 Comments

.astype(int) was faster for my dask.compute(): df[col] = df[col].apply(lambda x: x.value, meta=(col, int)) # 1.0s vs df[col] = df[col].astype(int) # 0.8s
@WestonA.Greene for such a small time difference, you'd want to run each command over several iterations and calculate the mean and standard error for each estimate. Or test on a much larger set of data.
@DryLabRebel exactly. You can check this using %%timeit on a notebook. I will share the results in the response.
I just updated my answer with the best option and measured run times. I actually came up with a faster solution than .astype!
Thank you DryLabRebel and Ignacio! If you have the time, Ignacio, including in your post the full %%timeit code would help other beginners like me more quickly perform testing in future speed related questions.
9

Subtract the epoch to get a timedelta64 column, then use .dt.total_seconds() on it:

import pandas as pd
df = pd.DataFrame({'time': [pd.to_datetime('2019-01-15 13:25:43')]})

# pd.to_timedelta(df.time).dt.total_seconds() # Is deprecated
(df.time - pd.to_datetime('1970-01-01')).dt.total_seconds()

Output

0    1.547559e+09
Name: time, dtype: float64
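
If integer seconds are wanted rather than floats, one option is to floor-divide the timedelta by a one-second Timedelta, along the lines of the epoch-conversion recipe in the pandas time series guide. This is a minimal sketch; the variable names are illustrative. It does not rely on the column's internal resolution, since no raw nanosecond count is involved.

```python
import pandas as pd

df = pd.DataFrame({'time': [pd.to_datetime('2019-01-15 13:25:43')]})

# Timedelta since the Unix epoch, floor-divided by one second -> whole seconds.
epoch = pd.Timestamp('1970-01-01')
unix_sec = (df['time'] - epoch) // pd.Timedelta('1s')

print(unix_sec.iloc[0])  # 1547558743
```

This assumes a timezone-naive column whose values are meant as UTC; a timezone-aware column would need a timezone-aware epoch to subtract.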


9

One can also use .view(...):

import pandas as pd
df = pd.DataFrame({'time': [pd.to_datetime('2019-01-15 13:25:43')]})
df_unix_sec = pd.to_datetime(df['time']).view(int) // 10 ** 9
print(df_unix_sec)

Casting with .astype(int), recommended above, was deprecated in pandas 1.3.0 and emits a warning:

FutureWarning: casting datetime64[ns] values to int64 with .astype(...) is deprecated and will raise in a future version. Use .view(...) instead.


3

As @Ignacio recommends, this is what I am using to cast to integer (Timestamp.value is the number of nanoseconds since the epoch, not seconds):

df['time'] = df['time'].apply(lambda x: x.value)

Then, to get it back:

df['time'] = df['time'].apply(pd.Timestamp)
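
The same round trip can also be done without calling pd.Timestamp per row. The sketch below assumes pd.to_datetime with an explicit unit for the way back; the column values and variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'time': pd.to_datetime(['2019-01-15 13:25:43',
                                           '2019-01-15 13:25:44'])})

# Forward: nanoseconds since the epoch, as in the answer above.
as_ns = df['time'].apply(lambda ts: ts.value)

# Back: interpret the integers as nanoseconds since the epoch.
restored = pd.to_datetime(as_ns, unit='ns')

# For whole seconds, integer-divide first and pass unit='s' on the way back.
as_s = as_ns // 10**9
restored_from_s = pd.to_datetime(as_s, unit='s')

print(as_s.tolist())  # [1547558743, 1547558744]
```

Converting to whole seconds drops any sub-second part, so the round trip through seconds is exact only for timestamps that fall on a whole second.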

