I have 4 pandas data frames that are similar in structure to this one:
index day1 day2 day3 day4 day5 ....
0 1.23 5.41 0 0 2.31
1 2.31 7.15 0 0 1.32
...
For each row, I want to calculate the mean, standard deviation, kurtosis, and skewness, and add them as new columns to another existing data frame. Right now I do this with a for loop: I keep a counter, convert it to a string, and append it to each column name so that one iteration does not overwrite the results of the previous one. It looks like this:
phen_1=rain_calc.iloc[:,:20]
phen_2=rain_calc.iloc[:,20:55]
phen_3=rain_calc.iloc[:,55:70]
phen_4=rain_calc.iloc[:,70:80]
phen_5=rain_calc.iloc[:,70:110]
dfs_phens=[phen_1,phen_2,phen_3,phen_4,phen_5]
phen=1
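for df in dfs_phens:
    # Build the column names for this group from the counter
    mean_col = 'mean_' + str(phen)
    std_col = 'std_' + str(phen)
    skew_col = 'skew_' + str(phen)
    kurt_col = 'kurt_' + str(phen)  # was 'mean_', which overwrote the mean column
    total_col = 'total_' + str(phen)
    # Row-wise statistics over this group's columns
    original_df[mean_col] = df.mean(axis=1)
    original_df[std_col] = df.std(axis=1)
    original_df[skew_col] = df.skew(axis=1)
    original_df[kurt_col] = df.kurt(axis=1)
    original_df[total_col] = df.sum(axis=1)
    phen = phen + 1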
This works and gives me the output I want: new columns with the calculated statistics. However, I wonder if there is a smarter, more elegant way to write it :)
So my goal is to improve my script: to name the new columns inside the for loop without building each string by hand every time, as I do now.
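To make the goal concrete, here is a rough sketch of the kind of thing I have in mind. It is not a definitive solution. The random `rain_calc` and the empty `original_df` are stand-ins I made up so the snippet runs on its own, and the `bounds` list and `stats` mapping are hypothetical names. The column ranges are the ones from my loop above.

```python
import numpy as np
import pandas as pd

# Stand-in data (assumption): 6 rows and 110 "day" columns of random values.
rng = np.random.default_rng(0)
rain_calc = pd.DataFrame(
    rng.random((6, 110)),
    columns=[f"day{i}" for i in range(1, 111)],
)
# Stand-in for the existing data frame that receives the new columns.
original_df = pd.DataFrame(index=rain_calc.index)

# Column ranges for each group, same slices as in the loop above.
bounds = [(0, 20), (20, 55), (55, 70), (70, 80), (70, 110)]

# Maps the column-name prefix to the name of the DataFrame method to call.
stats = {"mean": "mean", "std": "std", "skew": "skew", "kurt": "kurt", "total": "sum"}

# enumerate replaces the manual counter; the f-string builds each name once.
for phen, (start, stop) in enumerate(bounds, start=1):
    block = rain_calc.iloc[:, start:stop]
    for label, method in stats.items():
        original_df[f"{label}_{phen}"] = getattr(block, method)(axis=1)
```

The idea is that adding another statistic would only mean adding one entry to `stats`, and adding another group would only mean adding one pair to `bounds`.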