Slicing a datetimeindex dataframe

Question

I'm new to python and pandas.

I'm trying to write code that compares the second-to-last row of a pandas dataframe indexed by datetime with the last row of the dataframe. However, I am having trouble getting the indexing to work.

For example, for a dataframe

import numpy as np
import pandas as pd

dti = pd.date_range('2018-01-01', periods=10, freq='D')
df = pd.DataFrame(np.random.randn(10, 4), index=dti, columns=['value', 'value1', 'value2', 'value3'])

Produces a dataframe

               value    value1    value2    value3
2018-01-01  0.249018 -0.630937  1.120733  0.343331
2018-01-02 -0.347979  0.218776 -0.327967  0.569075
2018-01-03  0.879929  0.316554  0.159652 -0.872002
2018-01-04 -1.492683 -0.560168 -0.871728  1.739402
2018-01-05 -3.187776 -1.399635  0.873503 -0.098537
2018-01-06 -0.482267 -0.174304  0.821850  1.530861
2018-01-07  0.385600 -0.395019 -0.507524 -0.033996
2018-01-08  1.358576 -2.107309  0.956507 -0.360288
2018-01-09 -0.283369  0.630684  1.148554 -1.907604
2018-01-10  0.896697  0.030671  0.131533 -1.325552

I'm trying to compare

if df.loc[last_row, ['value']] > df.loc[second_last_row, ['value']]:
# Compare to see if 'value' in 2018-01-10 is higher than 'value' in 2018-01-09

I have tried using

length = len(df)
if df.loc[length-1, ['value']] > df.loc[length-2, ['value']]:

but I get an error

TypeError: cannot do index indexing on <class 'pandas.tseries.index.DatetimeIndex'> with these indexers [5] of <class 'int'>

It seems like the DatetimeIndex cannot be manipulated the same way as regular integer indices.

I have also considered using

df['value'].tail(1)

but I'm not sure how to access the second-to-last row using the .tail method.

I need advice on how I would be able to retrieve data from the tail end of my dataframe based on its distance from the end of the dataframe. Thanks!

Kindly read this guide on how to frame questions. I feel you should make some changes by including an input dataframe and an expected output.
– sammywemmy, Commented Apr 21, 2020 at 3:23

jezrael · Accepted Answer · 2020-04-21 05:09:09Z

Use:

import numpy as np
import pandas as pd

np.random.seed(2020)  # fixed seed so the output below is reproducible
dti = pd.date_range('2018-01-01', periods=10, freq='D')
df = pd.DataFrame(np.random.randn(10, 4), index=dti,
                  columns=['value','value1','value2','value3'])
print(df)
               value    value1    value2    value3
2018-01-01 -1.768846  0.075552 -1.130630 -0.651430
2018-01-02 -0.893116 -1.274101 -0.061154  0.064514
2018-01-03  0.410113 -0.572882 -0.801334  1.312035
2018-01-04  1.274699 -1.214358  0.313719 -1.444821
2018-01-05 -0.368961 -0.769227  0.392616  0.057294
2018-01-06  2.089979  0.041971 -0.048341 -0.513154
2018-01-07 -0.084589 -1.215450 -1.412931 -1.486911
2018-01-08  0.382225  0.937673  1.772678  0.878828
2018-01-09  0.331719 -0.306036  1.240266 -0.215627
2018-01-10  0.155929  0.098056  0.832096  2.045205

To select the last value of a column, you can index the DatetimeIndex by position to get the row label and pass that label to DataFrame.loc:

print (df.loc[df.index[-1], 'value'])
0.15592948059188771

print (df.loc[df.index[-2], 'value'])
0.3317191223642817

df.loc[df.index[-1], 'value'] > df.loc[df.index[-2], 'value']
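
As a rough sketch of how this fits into the if statement from the question: passing the column as a plain label ('value') gives back a scalar, while passing a list (['value']) gives back a one-element Series, which is awkward to use as a condition. The variable names last, second_last and as_series are only for illustration.

```python
import numpy as np
import pandas as pd

np.random.seed(2020)
dti = pd.date_range('2018-01-01', periods=10, freq='D')
df = pd.DataFrame(np.random.randn(10, 4), index=dti,
                  columns=['value', 'value1', 'value2', 'value3'])

# Scalar column label -> scalar result, safe to use directly in an if
last = df.loc[df.index[-1], 'value']
second_last = df.loc[df.index[-2], 'value']

# List of column labels -> one-element Series, not a scalar
as_series = df.loc[df.index[-1], ['value']]

if last > second_last:
    print('last value is higher')
else:
    print('last value is not higher')
```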

Your code should instead select by position with DataFrame.iloc. The only extra step is getting the integer position of the value column with Index.get_loc:

length = len(df)
pos = df.columns.get_loc('value')

print (df.iloc[length-1, pos])
0.15592948059188771

print (df.iloc[length-2, pos])
0.3317191223642817

df.iloc[length-1, pos] > df.iloc[length-2, pos]
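
A shorter variant, offered only as a sketch: iloc also accepts negative positions, so len(df) is not strictly needed, and selecting the column first avoids the get_loc step. The same comparison can be done through .tail(2), which the question asked about. The names col, last_two, is_higher and same_answer are made up for this example.

```python
import numpy as np
import pandas as pd

dti = pd.date_range('2018-01-01', periods=10, freq='D')
df = pd.DataFrame(np.random.randn(10, 4), index=dti,
                  columns=['value', 'value1', 'value2', 'value3'])

col = df['value']

# Negative positions count from the end, as with Python lists
last = col.iloc[-1]
second_last = col.iloc[-2]
is_higher = last > second_last

# tail(2) keeps the last two rows in their original order,
# so position 0 is the second-to-last row and position 1 is the last row
last_two = col.tail(2)
same_answer = last_two.iloc[1] > last_two.iloc[0]

print(is_higher, same_answer)
```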
