1

Why does the following snippet perform so badly?

import numpy
import pandas

time = numpy.array(range(0, 1000000, 10), dtype = numpy.uint32)
index = [ pandas.Timedelta(str(t) + 'ms') for t in time ]

It takes approximately a second and a half on a decent desktop, and we are talking about only a hundred thousand pandas.Timedelta objects (range(0, 1000000, 10) yields 100,000 values). Any ideas how to rewrite the last line?

1
  • 1
    What about pd.to_timedelta(time, unit='ms')? Commented Jun 18, 2017 at 19:41

2 Answers

3

If you need a TimedeltaIndex, you can use to_timedelta or the TimedeltaIndex constructor (with pandas imported as pd):

index = pd.to_timedelta(time, unit='ms')

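Or, on pandas versions where the TimedeltaIndex constructor still accepts unit (the keyword was deprecated in pandas 2.2):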

index = pd.TimedeltaIndex(time, unit='ms')
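
To see why this helps, here is a minimal sketch that builds the same index both ways and times each once. The helper names build_with_loop and build_vectorized are made up for this illustration, and the timings will vary by machine and pandas version. The loop creates one Timedelta per element by parsing a string at the Python level, while to_timedelta converts the whole integer array in a single call.

```python
import timeit

import numpy as np
import pandas as pd

# Same values as in the question: 0, 10, 20, ... (100,000 entries).
time = np.arange(0, 1_000_000, 10, dtype=np.uint32)


def build_with_loop():
    # Hypothetical helper: one string parse and one Timedelta object per element.
    return pd.TimedeltaIndex([pd.Timedelta(str(t) + 'ms') for t in time])


def build_vectorized():
    # Hypothetical helper: a single conversion of the whole integer array.
    return pd.to_timedelta(time, unit='ms')


loop_seconds = timeit.timeit(build_with_loop, number=1)
vectorized_seconds = timeit.timeit(build_vectorized, number=1)

loop_index = build_with_loop()
fast_index = build_vectorized()

# Compare the underlying timedelta64 values rather than the index dtypes.
same_values = np.array_equal(loop_index.to_numpy(), fast_index.to_numpy())

print('same values:', same_values)
print('loop:        %.4f s' % loop_seconds)
print('to_timedelta: %.4f s' % vectorized_seconds)
```
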

1 Comment

This improves performance five-fold. You are right: I should use a TimedeltaIndex instead of an array of Timedelta objects. Thanks. I will accept the answer in 10 minutes.
3

You can also use pd.timedelta_range, with periods set to the number of values in the question's array (100,000):

index = pd.timedelta_range(0, periods=100000, freq='10ms')
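
As a quick check, assuming the goal is to reproduce the values of the question's time array, the sketch below compares the range-based index with the one converted from the array. For the two to match, periods has to equal the array length; the names from_array and from_range are chosen here only for illustration.

```python
import numpy as np
import pandas as pd

# Same values as in the question: 0, 10, 20, ... milliseconds.
time = np.arange(0, 1_000_000, 10, dtype=np.uint32)

# Index converted from the existing integer array.
from_array = pd.to_timedelta(time, unit='ms')

# Index generated directly, without materializing the integers first.
from_range = pd.timedelta_range(start='0ms', periods=len(time), freq='10ms')

# Compare the underlying timedelta64 values element by element.
ranges_match = np.array_equal(from_array.to_numpy(), from_range.to_numpy())
print(len(from_range), ranges_match)
```

timedelta_range only fits when the values are evenly spaced; for arbitrary millisecond offsets, converting the array with to_timedelta is the more general choice.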
