I have a df called 'Series', consisting of over 2600 rows and 120 columns. Here is an extract:

         Date  Gasoil  Gasoline     Oil     Gas
0  2010-12-31  100.00    100.00  100.00  100.00
1  2011-01-03  103.97     99.88  100.18  105.55
2  2011-01-04  100.85     99.33   97.81  106.00
3  2011-01-05  102.02    100.61   98.82  101.54

I have created several empty dataframes with the same column names and index as "Series". Each of these copies needs to hold the result of some function applied to the original Series df (moving averages, rolling percentiles, etc.).

For example, one of these copy dataframes is called "log_returns". In every cell of log_returns, I need to calculate logarithmic returns, based on the corresponding column of the "Series" dataframe.

This is the output I have in mind. For example, the log return of Gasoil on 2011-01-03 is log(103.97/100) = 3.89%.

         Date  Gasoil Gasoline     Oil     Gas
0  2010-12-31                                 
1  2011-01-03   3.89%   -0.12%   0.18%   5.40%
2  2011-01-04  -3.05%   -0.55%  -2.39%   0.43%
3  2011-01-05   1.15%    1.28%   1.03%  -4.30%

In order to do that, I wrote a nested for loop:

rows_list = list(range(1, len(log_returns)))
columns_list = list(range(0, len(log_returns.columns)))

for row in rows_list:
    for column in columns_list:
        log_returns.iloc[row,column] = np.log(Series.iloc[row,column] / Series.iloc[row-1,column])

Unfortunately, the code is really slow to run. Are there faster alternatives? Thanks

2 Answers

2

After setting the Date as the index, the result can be obtained with a one-liner:

import numpy as np
import pandas as pd

df = pd.DataFrame.from_records(
    [(pd.to_datetime('2010-12-31'), 100.00, 100.00, 100.00, 100.00),
     (pd.to_datetime('2011-01-03'), 103.97,  99.88, 100.18, 105.55),
     (pd.to_datetime('2011-01-04'), 100.85,  99.33,  97.81, 106.00),
     (pd.to_datetime('2011-01-05'), 102.02, 100.61,  98.82, 101.54)],
    columns=['Date', 'Gasoil', 'Gasoline', 'Oil', 'Gas'])
df = df.set_index('Date')
# This is the one-liner: log of each value over the previous row's value, in percent
np.log(df / df.shift(1)) * 100
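
As a side note, and only as a sketch rather than part of the original answer: because log(a/b) = log(a) - log(b), the same values can also be obtained by taking `diff()` of the logged frame. The same whole-frame style should carry over to the other "copy" dataframes mentioned in the question, for instance a moving average via `rolling`. The names `log_returns_alt` and `moving_avg`, and the window length of 2, are assumptions chosen for illustration on the four sample rows.

```python
import numpy as np
import pandas as pd

# Sample data from the question, with Date as the index
df = pd.DataFrame(
    {
        "Gasoil": [100.00, 103.97, 100.85, 102.02],
        "Gasoline": [100.00, 99.88, 99.33, 100.61],
        "Oil": [100.00, 100.18, 97.81, 98.82],
        "Gas": [100.00, 105.55, 106.00, 101.54],
    },
    index=pd.to_datetime(["2010-12-31", "2011-01-03", "2011-01-04", "2011-01-05"]),
)
df.index.name = "Date"

# Ratio form, as in the answer above
log_returns = np.log(df / df.shift(1)) * 100

# Difference-of-logs form; the first row is NaN in both versions
log_returns_alt = np.log(df).diff() * 100

# Hypothetical 2-row moving average, computed on all columns at once
moving_avg = df.rolling(window=2).mean()

print(log_returns.round(2))
print(moving_avg)
```

Both forms avoid the Python-level loop over cells, which is where the original code spends its time.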

0

Try:

df.loc[:, "Gasoil":] = (
    np.log(df.loc[:, "Gasoil":] / df.loc[:, "Gasoil":].shift(-1)).shift() * -100
).fillna("")
print(df)

Prints:

         Date    Gasoil  Gasoline       Oil       Gas
0  2010-12-31                                        
1  2011-01-03  3.893221 -0.120072  0.179838  5.401459
2  2011-01-04 -3.046813 -0.552183 -2.394175  0.425432
3  2011-01-05  1.153461  1.280402  1.027319 -4.298628
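
One caveat worth noting: `fillna("")` puts strings into the value columns, so they stop being purely numeric. If the results feed further calculations, it may be preferable to keep `NaN` in the computed frame and build the blank/percent formatting only in a separate copy for display. The following is a minimal sketch under that assumption; `series`, `value_cols` and `display` are illustrative names, and `Date` is kept as a regular column as in the question.

```python
import numpy as np
import pandas as pd

# Sample data from the question, Date as an ordinary column
series = pd.DataFrame(
    {
        "Date": ["2010-12-31", "2011-01-03", "2011-01-04", "2011-01-05"],
        "Gasoil": [100.00, 103.97, 100.85, 102.02],
        "Gasoline": [100.00, 99.88, 99.33, 100.61],
        "Oil": [100.00, 100.18, 97.81, 98.82],
        "Gas": [100.00, 105.55, 106.00, 101.54],
    }
)

# Every column except Date holds prices
value_cols = series.columns.drop("Date")

# Numeric result: first row stays NaN, dtype stays float
log_returns = series.copy()
log_returns[value_cols] = np.log(series[value_cols]).diff() * 100

# Display-only copy: blanks for NaN, two-decimal percentages otherwise
display = log_returns.copy()
display[value_cols] = log_returns[value_cols].apply(
    lambda col: col.map(lambda v: "" if pd.isna(v) else f"{v:.2f}%")
)

print(display)
```

Here `log_returns` remains usable for arithmetic, while `display` reproduces the layout requested in the question.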
