Python pandas dataframe - any way to set frequency programmatically?

Question

I'm trying to process CSV files like this:

df = pd.read_csv("raw_hl.csv", index_col='time', parse_dates=True)
df.head(2)
                    high        low 
time                
2014-01-01 17:00:00 1.376235    1.375945
2014-01-01 17:01:00 1.376005    1.375775
2014-01-01 17:02:00 1.375795    1.375445
2014-01-01 17:07:00 NaN         NaN 
...
2014-01-01 17:49:00 1.375645    1.375445

type(df.index)
pandas.tseries.index.DatetimeIndex

But these don't automatically have a frequency:

print(df.index.freq)
None

In case they have differing frequencies, it would be handy to be able to set one automatically. The simplest way would be to compare the first two rows:

tdelta = df.index[1] - df.index[0]
tdelta
datetime.timedelta(0, 60)

So far so good, but setting frequency directly to this timedelta fails:

df.index.freq = tdelta
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-25-3f24abacf9de> in <module>()
----> 1 df.index.freq = tdelta

AttributeError: can't set attribute

Is there a way (ideally relatively painless!) to do this?

ANSWER: pandas gives the DataFrame's index an inferred_freq attribute, perhaps to avoid overwriting a user-defined frequency. For this data, df.index.inferred_freq returns 'T'.

So it just seems to be a matter of using this instead of df.index.freq. Thanks to Jeff, who also provides more details below :)
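A minimal sketch of that check, using a small hand-built index as a stand-in for the CSV data (the three timestamps are assumptions for illustration). The alias string that inferred_freq returns depends on the pandas version ('T' in older releases, 'min' from 2.2 on), so the sketch converts it to an offset object rather than comparing strings:

```python
import pandas as pd
from pandas.tseries.frequencies import to_offset

# Hypothetical stand-in for the CSV index: three one-minute timestamps.
idx = pd.DatetimeIndex([
    "2014-01-01 17:00:00",
    "2014-01-01 17:01:00",
    "2014-01-01 17:02:00",
])

# No frequency was declared when the index was built, so freq is None.
print(idx.freq)

# inferred_freq is computed on access and returns an alias string (or None).
inferred = idx.inferred_freq
print(inferred)

# Convert the alias to an offset object to compare independently of spelling.
is_one_minute = to_offset(inferred) == to_offset("1min")
print(is_one_minute)
```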

By default, since you only have 2 elements, the frequency is not computed, nor is it necessary. If you have at least 3 then it will be inferred. – Jeff, commented Dec 22, 2014 at 21:30
In fact the data I'm using has several thousand rows, so this doesn't seem to be it. I'll edit to clarify. – birone, commented Dec 23, 2014 at 2:17
It won't do this unless it's necessary (as it's a tiny bit computational), e.g. when you resample. You can see what it is by doing df.index.inferred_freq. However, if it is STILL None, then it is not a regular frequency. You might want to reindex to make it one. – Jeff, commented Dec 23, 2014 at 2:23
Thanks! I get df.index.inferred_freq = 'T', which I seem to remember means a 1 minute offset. I'll edit this in above. – birone, commented Dec 23, 2014 at 14:38

Jeff · Accepted Answer · 2014-12-23 02:35:23Z

14

If you have a regular frequency, it will be reported when you look at df.index.freq:

In [20]: df = pd.DataFrame({'A' : np.arange(5)},index=pd.date_range('20130101 09:00:00',freq='3T',periods=5))

In [21]: df
Out[21]: 
                     A
2013-01-01 09:00:00  0
2013-01-01 09:03:00  1
2013-01-01 09:06:00  2
2013-01-01 09:09:00  3
2013-01-01 09:12:00  4

In [22]: df.index.freq
Out[22]: <3 * Minutes>

Having an irregular frequency will return None:

In [23]: df.index = df.index[0:2].tolist() + [pd.Timestamp('20130101 09:05:00')] + df.index[-2:].tolist()

In [24]: df
Out[24]: 
                     A
2013-01-01 09:00:00  0
2013-01-01 09:03:00  1
2013-01-01 09:05:00  2
2013-01-01 09:09:00  3
2013-01-01 09:12:00  4

In [25]: df.index.freq

You can recover a regular frequency as follows: downsampling to a lower freq (where you don't have overlapping values), forward filling, then reindexing to the desired frequency and end-points.

In [31]: df.resample('T').ffill().reindex(pd.date_range(df.index[0],df.index[-1],freq='3T'))
Out[31]: 
                     A
2013-01-01 09:00:00  0
2013-01-01 09:03:00  1
2013-01-01 09:06:00  2
2013-01-01 09:09:00  3
2013-01-01 09:12:00  4
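The resample, forward-fill, reindex recipe above can be sketched as a small helper. This is only an illustration: the function name regularize and its fine/target parameters are made up here, and the 'min' alias is used instead of 'T' because pandas 2.2 deprecated the single-letter spelling.

```python
import pandas as pd

# Same irregular index as in the transcript above (09:05 instead of 09:06).
idx = pd.to_datetime([
    "2013-01-01 09:00", "2013-01-01 09:03", "2013-01-01 09:05",
    "2013-01-01 09:09", "2013-01-01 09:12",
])
df = pd.DataFrame({"A": range(5)}, index=idx)


def regularize(frame, fine="1min", target="3min"):
    """Hypothetical helper: put frame onto a regular `target` grid.

    `fine` must be fine enough that no two original rows share a bin.
    """
    # Regular grid spanning the original end-points.
    grid = pd.date_range(frame.index[0], frame.index[-1], freq=target)
    # Resample to the fine grid, carry the last value forward, keep grid points.
    return frame.resample(fine).ffill().reindex(grid)


regular = regularize(df)
print(regular)
```

Because the value recorded at 09:05 is carried forward to 09:06, the result matches the output shown above. Whether forward filling is appropriate depends on the data; for prices it is usually acceptable, for counts it may not be.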


1 Comment

user1201614 Over a year ago

Jeff, super useful, many thanks. Maybe you meant "upsampling to a higher freq" (increasing the number of samples) instead of "Downsampling to a lower freq"?

rrobby86 · Accepted Answer · 2021-11-05 10:51:19Z

1

In my case, loading data from CSV with a regular frequency, freq is None but there is an inferred_freq attribute with the intended value, as the OP pointed out.

With the current version of pandas (1.3.4), assignment to freq seems to work, so a solution would be:

df.index.freq = df.index.inferred_freq

An alternative could be to create a new index

df.index = pd.date_range(
    start=df.index[0],
    periods=len(df),
    freq=df.index.inferred_freq
)
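Both variants assume inferred_freq is not None, which only holds for a regularly spaced index. A hedged sketch that adds that guard is below; the helper name with_inferred_freq and the sample values are assumptions for illustration, and it rebuilds the index with pd.DatetimeIndex(..., freq=...) rather than assigning to freq in place:

```python
import pandas as pd
from pandas.tseries.frequencies import to_offset


def with_inferred_freq(frame):
    """Hypothetical helper: return a copy whose index carries its inferred freq."""
    inferred = frame.index.inferred_freq
    if inferred is None:
        # Irregular (or too short) index: there is nothing sensible to set.
        raise ValueError("index is not regularly spaced; resample or reindex first")
    out = frame.copy()
    # The constructor validates that the timestamps conform to the frequency.
    out.index = pd.DatetimeIndex(frame.index, freq=inferred)
    return out


# Small stand-in for the CSV data from the question.
idx = pd.DatetimeIndex(
    ["2014-01-01 17:00", "2014-01-01 17:01", "2014-01-01 17:02"], name="time"
)
df = pd.DataFrame(
    {"high": [1.376235, 1.376005, 1.375795], "low": [1.375945, 1.375775, 1.375445]},
    index=idx,
)

df = with_inferred_freq(df)
print(df.index.freq)
```

Raising on an irregular index is a design choice for this sketch; returning the frame unchanged would also be reasonable if a missing frequency is acceptable downstream.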

