In the following dataframe, each account may have a different rate each month. I am trying to locate the rate to use for a particular transaction.

E.g. if I am looking for Rate_A for account one on 2018-01-20, I should get the value -1.206412.

What would be the best way to locate this value? I tried resample('D').ffill(), but got an error, as it doesn't seem to work on a MultiIndex.

Thanks

                    Rate_A    Rate_B    Rate_C
date     account                              
2018-01   one    -1.206412  0.132003  1.024180
          two     2.565646 -0.827317  0.569605
2018-02   one     1.431256 -0.076467  0.875906
          two     1.340309 -1.187678 -2.211372

1 Answer

Use DataFrameGroupBy.resample. It works only with a DatetimeIndex, so first convert the second index level to a column, and then convert the remaining index to a DatetimeIndex:

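import pandas as pd

# move the 'account' level into a column, leaving 'date' as the only index level
df = df.reset_index(level=1)
# convert the month labels to timestamps (2018-01 becomes 2018-01-01)
df.index = pd.to_datetime(df.index)
# forward-fill each account's monthly rates onto a daily grid
df = df.groupby('account').resample('D').ffill()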
print (df.head())
                   account    Rate_A    Rate_B   Rate_C
account date                                           
one     2018-01-01     one -1.206412  0.132003  1.02418
        2018-01-02     one -1.206412  0.132003  1.02418
        2018-01-03     one -1.206412  0.132003  1.02418
        2018-01-04     one -1.206412  0.132003  1.02418
        2018-01-05     one -1.206412  0.132003  1.02418

a = df.loc[('one', '2018-01-20'), 'Rate_A']
print (a)
#account  date      
#one      2018-01-20   -1.206412
#Name: Rate_A, dtype: float64
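
For anyone who wants to run the resample approach end to end, here is a minimal, self-contained sketch. It assumes the date level holds month strings such as '2018-01' (the question does not show how the frame was built) and rebuilds only the Rate_A column. The names monthly, flat, daily and rate are illustrative.

```python
import pandas as pd

# Rebuild part of the question's frame; assumption: 'date' holds month strings.
idx = pd.MultiIndex.from_product(
    [['2018-01', '2018-02'], ['one', 'two']], names=['date', 'account']
)
monthly = pd.DataFrame(
    {'Rate_A': [-1.206412, 2.565646, 1.431256, 1.340309]}, index=idx
)

# Same steps as above: account level -> column, month labels -> timestamps.
flat = monthly.reset_index(level='account')
flat.index = pd.to_datetime(flat.index)

# One row per account per calendar day, carrying the last monthly rate forward.
daily = flat.groupby('account').resample('D').ffill()

# A full Timestamp in the key selects one row, so a single value comes back.
rate = daily.loc[('one', pd.Timestamp('2018-01-20')), 'Rate_A']
print(rate)
```

Note that resample only fills between the first and last timestamp of each account, so with this data the daily grid stops at 2018-02-01 and a date such as 2018-02-20 would not be found.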

Another solution, without resample, uses partial string indexing:

a = df.index.get_level_values('date')
b = df.index.get_level_values('account')

df.index = pd.MultiIndex.from_arrays([pd.to_datetime(a), b])
print (df)
                      Rate_A    Rate_B    Rate_C
date       account                              
2018-01-01 one     -1.206412  0.132003  1.024180
           two      2.565646 -0.827317  0.569605
2018-02-01 one      1.431256 -0.076467  0.875906
           two      1.340309 -1.187678 -2.211372


d = '2018-01-20'
a = df.loc[(d.rsplit('-', 1)[0], 'one'), 'Rate_A']
print (a)
date        account
2018-01-01  one       -1.206412
Name: Rate_A, dtype: float64

print (d.rsplit('-', 1)[0])
2018-01

And if the date is a datetime:

d = pd.to_datetime('2018-01-20')
print (d)
2018-01-20 00:00:00

a = df.loc[(d.strftime('%Y-%m'), 'one'), 'Rate_A']
print (a)
date        account
2018-01-01  one       -1.206412
Name: Rate_A, dtype: float64
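
If the date level is left as the month strings shown in the question (an assumption about how the frame is stored), the string slicing and strftime steps above can be folded into one small helper. This is only a sketch: lookup_rate is a hypothetical name, not a pandas function, and the frame below rebuilds just the Rate_A column.

```python
import pandas as pd

def lookup_rate(df, when, account, column):
    """Return the rate for `account` in the month containing `when`."""
    # Accept a string or a datetime-like value and reduce it to 'YYYY-MM'.
    month_key = pd.Timestamp(when).strftime('%Y-%m')
    # Exact (month, account) key on the string-valued MultiIndex.
    return df.loc[(month_key, account), column]

# Assumption: 'date' holds month strings, as displayed in the question.
idx = pd.MultiIndex.from_product(
    [['2018-01', '2018-02'], ['one', 'two']], names=['date', 'account']
)
df = pd.DataFrame(
    {'Rate_A': [-1.206412, 2.565646, 1.431256, 1.340309]}, index=idx
)

print(lookup_rate(df, '2018-01-20', 'one', 'Rate_A'))
print(lookup_rate(df, pd.Timestamp('2018-02-03'), 'two', 'Rate_A'))
```

Unlike the resample version, this does not depend on a daily grid, so any day of a month that has a row can be looked up. A month with no row raises a KeyError.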

1 Comment

thank you so much, both solutions are working perfectly.
