Pandas - how to convert RangeIndex into DateTimeIndex

Question

I have the following dataframe. It is OHLC one-minute data. Obviously I need the T column to become and index in order to use time-series functionallity

C H L O T V

13712 6873.0 6873.0 6873.0 6873.0 2018-01-13T17:17:00 799.448421 
13713 6878.0 6878.0 6875.0 6875.0 2018-01-13T17:18:00 1707.578666 
13714 6880.0 6880.0 6825.0 6825.0 2018-01-13T17:21:00 481.245707 
13715 6876.0 6876.0 6876.0 6876.0 2018-01-13T17:22:00 839.177283 
13716 6870.0 6878.0 6830.0 6878.0 2018-01-13T17:23:00 4336.830277

I used:

df['T'] = pd.to_datetime(df['T'])

So far so good! The T column is now recognised as a date

Check:

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 13717 entries, 1970-01-01 00:00:00 to 1970-01-01 00:00:00.000013716
Data columns (total 7 columns):
BV    13717 non-null float64
C     13717 non-null float64
H     13717 non-null float64
L     13717 non-null float64
O     13717 non-null float64
T     13717 non-null datetime64[ns]
V     13717 non-null float64
dtypes: datetime64[ns](1), float64(6)
memory usage: 857.3 KB

And now comes the fun and unexplainable part:

df.set_index(df['T'])


   C H L O T V
T

2018-01-03 17:27:00 5710.0 5710.0 5663.0 5667.0 2018-01-03 17:27:00 3863.030204 
2018-01-03 17:28:00 5704.0 5710.0 5663.0 5710.0 2018-01-03 17:28:00 1208.627542 
2018-01-03 17:29:00 5699.0 5699.0 5663.0 5663.0 2018-01-03 17:29:00 1755.123688

Still looks good, but when I check the type of index I get:

RangeIndex(start=0, stop=13717, step=1)

And now if I try:

df.index = pd.to_datetime(df.index)

I end up with:

DatetimeIndex([          '1970-01-01 00:00:00',
               '1970-01-01 00:00:00.000000001',
               '1970-01-01 00:00:00.000000002',
               '1970-01-01 00:00:00.000000003',
               '1970-01-01 00:00:00.000000004' and so on...

which is evidently wrong.

The questions are: 1. Why don't I get the normal DateTimeIndex if I am converting a date to index?

How can I convert that RangeIndex to DateTimeIndex with correct timestamps?

Thanks!

You forget assign back only df = df.set_index('T') or usedf.set_index('T', inplace=True) — jezrael
– jezrael, Commented Jan 14, 2018 at 9:30
But if use csv as input data, simpliest is df = pd.read_csv(file, parse_dates=['T'], index_col=['T']) — jezrael
– jezrael, Commented Jan 14, 2018 at 9:31
Input data is json. But I found out that df.index = df['T'] does the trick — rioZg
– rioZg, Commented Jan 14, 2018 at 11:03
I understand my error now. Without df = df.set_index I was simply converting integers to timestamps adding miliseconds or nanoseconds to initial Unix timestamp. — rioZg
– rioZg, Commented Jan 14, 2018 at 11:15

jezrael · Accepted Answer · 2018-01-14 09:42:36Z

10

If input data are csv the simpliest is use parameters parse_dates and index_col in read_csv:

df = pd.read_csv(file, parse_dates=['T'], index_col=['T'])

If not, then use your solution, don't forget assign back output of set_index and if need drop column T also after DatetimeIndex use T instead df['T']:

df['T'] = pd.to_datetime('T')
df = df.set_index('T')

#alternative solution
#df.set_index('T', inplace=True)

Why don't I get the normal DateTimeIndex if I am converting a date to index?

Because your index is default (0,1,2..), so df.index = pd.to_datetime(df.index) parse integerss like ns and get weird datetimes.

edited Jan 14, 2018 at 9:42

answered Jan 14, 2018 at 9:35

jezrael

868k103 gold badges1.4k silver badges1.3k bronze badges

Sign up to request clarification or add additional context in comments.

1 Comment

Ivelin Ivanov Over a year ago

The first solution kept returning RangeIndex. I tried several different parameters permutations. So this did not work for me with the latest pandas 2.1.4 pd.read_csv(file, parse_dates=['T'], index_col=['T']) However the second solution does work: df['T'] = pd.to_datetime('T') Looks like to_dateset does ensure DateTimeIndex format, while read_csv does not. I could not find in the docs when Pandas uses Index or RangeIndex for datetime formatted input columns.

Collectives™ on Stack Overflow

Pandas - how to convert RangeIndex into DateTimeIndex

1 Answer 1

1 Comment

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

1 Comment

Your Answer

Sign up or log in

Post as a guest

Related