How to Create a Pandas Index Faster?

Question

Why does the following snippet perform so badly?

import numpy
import pandas

time = numpy.array(range(0, 1000000, 10), dtype = numpy.uint32)
index = [ pandas.Timedelta(str(t) + 'ms') for t in time ]

It takes approximately a second and a half on a decent desktop, and the array holds only 100,000 values (0 to 999,990 in steps of 10), each converted to a pandas.Timedelta. Any ideas how to rewrite the last line?

What about pd.to_timedelta(time, unit='ms')?

– jezrael, Commented Jun 18, 2017 at 19:41

jezrael · Accepted Answer · 2017-06-18 19:44:08Z

3

If you need a TimedeltaIndex, you can use to_timedelta or the TimedeltaIndex constructor:

index = pd.to_timedelta(time, unit='ms')

Or:

index = pd.TimedeltaIndex(time, unit='ms')
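As a rough check that the vectorized call produces the same values as the loop in the question, here is a minimal sketch. It assumes `pd` and `np` are the usual aliases for pandas and numpy, and it uses a shorter array than the question (an arbitrary choice) so the slow path finishes quickly.

```python
import numpy as np
import pandas as pd

# Shorter array than in the question so the per-element loop stays fast.
time = np.arange(0, 10_000, 10, dtype=np.uint32)

# Slow path from the question: build one Timedelta object per element.
slow = pd.TimedeltaIndex([pd.Timedelta(str(t) + "ms") for t in time])

# Vectorized path: convert the whole integer array in a single call.
fast = pd.to_timedelta(time, unit="ms")

# Element-wise comparison of the two indexes.
same = bool((slow == fast).all())
print(same, len(fast))
```

The loop pays Python-level overhead (string formatting and parsing) for every element, while to_timedelta handles the integer array as a whole.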


1 Comment

major4x Over a year ago

This improves performance fivefold. You are right: I should use a TimedeltaIndex instead of arrays of Timedelta. Thanks. I will accept the answer in 10 minutes.

piRSquared · Accepted Answer · 2017-06-18 20:42:02Z

3

You can also use pd.timedelta_range, which builds the index directly without the intermediate array:

index = pd.timedelta_range(0, periods=100000, freq='10ms')  # 100,000 steps of 10 ms, matching the question's array
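For the range to reproduce the question's index, periods has to equal the length of the original array, which is 100,000 for range(0, 1000000, 10). The sketch below checks that under the assumption that `pd` and `np` are the usual aliases; it passes the start as the string "0ms" rather than the integer 0, which is a choice made here for explicitness.

```python
import numpy as np
import pandas as pd

# Same values as in the question: 0, 10, ..., 999990 (100,000 entries).
time = np.arange(0, 1_000_000, 10, dtype=np.uint32)

# Index built from the integer array.
from_array = pd.to_timedelta(time, unit="ms")

# Index built directly as a regular range with a 10 ms step.
from_range = pd.timedelta_range(start="0ms", periods=len(time), freq="10ms")

matches = bool((from_array == from_range).all())
print(matches, len(from_range))
```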
