Pythonic way to add multiple calculated columns to a data frame?

Question

Is there a more succinct / pythonic / pandas-native way of writing the following?

all_pos = ['NN', 'VB', 'ADJ']
for col in all_pos:
    df_out['delta_'+col] = df_out[col] - df_mean[col]

df_out and df_mean contain the same column names and indices, and I want to create new columns in df_out containing the difference between them.

So df_out could contain

Index  NN  VB  ADJ
239     9   4    3
250     2   2    1

And df_mean could contain

Index  NN  VB  ADJ
239     3   1    8
250     7   4    3

I would want df_out to look like

Index  NN  VB  ADJ  delta_NN  delta_VB  delta_ADJ
239     9   4    3         6         3         -5
250     2   2    1        -5        -2         -2

mozway · Accepted Answer · 2021-08-26 18:07:01Z

Use a simple subtraction (no need to do it per column) and concat the input and output:

pd.concat([df_out,
           (df_out - df_mean).add_prefix('delta_')
          ], axis=1)

or

df_out.join((df_out - df_mean).add_prefix('delta_'))

(df_out - df_mean) can also be written df_out.sub(df_mean)

output:

       NN  VB  ADJ  delta_NN  delta_VB  delta_ADJ
Index                                            
239     9   4    3         6         3         -5
250     2   2    1        -5        -2         -2
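
For anyone who wants to reproduce the output above, here is a minimal self-contained sketch. The frames are rebuilt from the sample values in the question, and the variable names `delta`, `via_concat` and `via_join` are only illustrative.

```python
import pandas as pd

# Sample data copied from the question; "Index" is used as the index.
df_out = pd.DataFrame(
    {"NN": [9, 2], "VB": [4, 2], "ADJ": [3, 1]},
    index=pd.Index([239, 250], name="Index"),
)
df_mean = pd.DataFrame(
    {"NN": [3, 7], "VB": [1, 4], "ADJ": [8, 3]},
    index=pd.Index([239, 250], name="Index"),
)

# Element-wise difference, aligned on index and column labels.
delta = df_out.sub(df_mean).add_prefix("delta_")

# Both forms return a new frame; df_out itself is left unchanged.
via_concat = pd.concat([df_out, delta], axis=1)
via_join = df_out.join(delta)

print(via_concat)
```

Neither form modifies df_out, so assign the result back (for example `df_out = df_out.join(delta)`) if the new columns should live in df_out.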

NB: I assumed "Index" is the index. If it is a regular column instead, first run:

df_out.set_index('Index', inplace=True)
df_mean.set_index('Index', inplace=True)
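
If the frames also carry columns that should not be subtracted, the column list from the question can be used to restrict the operation. The sketch below assumes "Index" starts out as a regular column and adds a hypothetical extra column `text` purely for illustration; it also reassigns the result of `set_index` rather than using `inplace=True`.

```python
import pandas as pd

# Hypothetical input: "Index" is a regular column and df_out carries an
# extra non-numeric column ("text") that should not be subtracted.
df_out = pd.DataFrame(
    {"Index": [239, 250], "text": ["a", "b"],
     "NN": [9, 2], "VB": [4, 2], "ADJ": [3, 1]}
)
df_mean = pd.DataFrame(
    {"Index": [239, 250], "NN": [3, 7], "VB": [1, 4], "ADJ": [8, 3]}
)

all_pos = ["NN", "VB", "ADJ"]  # column list from the question

# set_index returns a new frame, so reassignment avoids inplace=True.
df_out = df_out.set_index("Index")
df_mean = df_mean.set_index("Index")

# Subtract only the selected columns, then attach the prefixed result.
delta = df_out[all_pos].sub(df_mean[all_pos]).add_prefix("delta_")
result = df_out.join(delta)
```

Selecting `all_pos` on both sides keeps the subtraction away from the `text` column, which has no counterpart in df_mean.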

edited Aug 26, 2021 at 18:07

answered Aug 26, 2021 at 17:57

1 Comment

Henry Yik Over a year ago

You can also use df1.join((df1-df2).add_prefix('delta_')).
