I have a DataFrame called 'Series', with over 2,600 rows and 120 columns. Here is an extract:
Date Gasoil Gasoline Oil Gas
0 2010-12-31 100.00 100.00 100.00 100.00
1 2011-01-03 103.97 99.88 100.18 105.55
2 2011-01-04 100.85 99.33 97.81 106.00
3 2011-01-05 102.02 100.61 98.82 101.54
I have created several empty DataFrames with the same column names and index as "Series". Each of these copies needs to accommodate the result of some function applied to the original "Series" DataFrame (moving averages, rolling percentiles, etc.).
For example, one of these copies is called "log_returns". In every cell of log_returns, I need the logarithmic return computed from the corresponding column of the "Series" DataFrame.
This is the output I have in mind. For example, the log return of Gasoil on 2011-01-03 = log(103.97/100).
Date Gasoil Gasoline Oil Gas
0 2010-12-31
1 2011-01-03 3.89% -0.12% 0.18% 5.40%
2 2011-01-04 -3.05% -0.55% -2.39% 0.43%
3 2011-01-05 1.15% 1.28% 1.03% -4.30%
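As a sanity check, the first entry of the desired output can be reproduced directly from the two Gasoil prices in the extract above:

```python
import numpy as np

# First two Gasoil prices from the extract above.
prev, curr = 100.00, 103.97

# Logarithmic return: log(current / previous).
log_ret = np.log(curr / prev)
print(f"{log_ret:.2%}")  # 3.89%
```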
In order to do that, I wrote a nested for loop:
import numpy as np

rows_list = list(range(1, len(log_returns)))
columns_list = list(range(0, len(log_returns.columns)))
for row in rows_list:
    for column in columns_list:
        log_returns.iloc[row, column] = np.log(Series.iloc[row, column] / Series.iloc[row - 1, column])
Unfortunately, the code is really slow to run. Are there faster alternatives? Thanks
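For scale, this is the kind of vectorized, column-wise computation I suspect exists but haven't managed to write; a minimal sketch on the extract above (Date kept as a regular column, only two price columns shown):

```python
import numpy as np
import pandas as pd

# Rebuild a small version of the "Series" extract above.
prices = pd.DataFrame({
    "Date": ["2010-12-31", "2011-01-03", "2011-01-04", "2011-01-05"],
    "Gasoil": [100.00, 103.97, 100.85, 102.02],
    "Gasoline": [100.00, 99.88, 99.33, 100.61],
})

# Vectorized log returns: log(price_t / price_{t-1}) for every numeric
# column at once; the first row comes out as NaN since it has no prior row.
numeric = prices.drop(columns="Date")
log_returns = np.log(numeric / numeric.shift(1))
log_returns.insert(0, "Date", prices["Date"])
print(log_returns)
```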