Create multiple new columns filled with 0 in a pandas DataFrame

Question

I am not sure whether this is a good idea. I am using transfer learning to train on some new data. The original model expects 180 columns (features), but the new input data has 500 columns. Cutting columns from the new data is not a good option, so I am thinking of adding more columns to the dataset used for the original model. For example, if I want to add columns 180 to 499 and assign 0 to every cell in them, how can I do it? Please ignore the label column for now. Thanks for your help.

Original df:

         0        1        2        3        4       5  ...  179  label
0  0.28001  0.32042  0.93222  0.87534  0.44252  0.2321
1
2

Expected output:

         0        1        2        3        4       5  ...  179  180  181  182  ...  499  label
0  0.28001  0.32042  0.93222  0.87534  0.44252  0.2321              0    0    0    0    0
1  0.38001  0.42042  0.13222  0.67534  0.64252  0.4321              0    0    0    0    0
2

Andy L. · Accepted Answer · 2020-09-30 22:07:10Z

Since you don't care about the column labels, build a new DataFrame from np.zeros and join it to the original with pd.concat.

Sample df

In [336]: df
Out[336]:
         0        1        2        3        4       5
0  0.28001  0.32042  0.93222  0.87534  0.44252  0.2321
1  0.38001  0.42042  0.13222  0.67534  0.64252  0.4321

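import numpy as np
import pandas as pd

m = 20  # 20 keeps the demo small; change it to 500 for your real data
x, y = df.shape  # x rows, y existing columns

# Append m - y zero-filled columns, named n_0, n_1, ...
df_final = pd.concat([df, pd.DataFrame(np.zeros((x, m - y))).add_prefix('n_')], axis=1)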

In [340]: df_final
Out[340]:
         0        1        2        3        4       5  n_0  n_1  n_2  n_3  \
0  0.28001  0.32042  0.93222  0.87534  0.44252  0.2321  0.0  0.0  0.0  0.0
1  0.38001  0.42042  0.13222  0.67534  0.64252  0.4321  0.0  0.0  0.0  0.0

   n_4  n_5  n_6  n_7  n_8  n_9  n_10  n_11  n_12  n_13
0  0.0  0.0  0.0  0.0  0.0  0.0   0.0   0.0   0.0   0.0
1  0.0  0.0  0.0  0.0  0.0  0.0   0.0   0.0   0.0   0.0
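
One thing worth checking with this approach: pd.concat with axis=1 aligns rows by index label, so the zero-filled frame should carry the same index as df. With a default 0..n-1 index this happens automatically. The following is only a sketch for the case where the index differs (for example after filtering rows). The helper name pad_with_zeros and the sample index values are assumptions made for illustration, not part of pandas or of the answer above.

```python
import numpy as np
import pandas as pd


def pad_with_zeros(df, total_cols, prefix="n_"):
    """Return df widened to total_cols columns by appending zero-filled columns.

    Hypothetical helper for illustration; not a pandas API.
    """
    n_rows, n_cols = df.shape
    if total_cols < n_cols:
        raise ValueError("total_cols must be at least the current column count")
    zeros = pd.DataFrame(
        np.zeros((n_rows, total_cols - n_cols)),
        index=df.index,  # reuse the original index so concat aligns row by row
    ).add_prefix(prefix)
    return pd.concat([df, zeros], axis=1)


# Hypothetical sample with a non-default index
df = pd.DataFrame(
    [[0.28001, 0.32042, 0.93222], [0.38001, 0.42042, 0.13222]],
    index=[10, 11],
)
padded = pad_with_zeros(df, total_cols=8)
print(padded)
```

Without index=df.index, the two frames in this sample would share no row labels, and the result would contain extra rows padded with NaN instead of two rows of zeros.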

If you need the new columns numbered sequentially, continuing from the existing ones:

m = 20
x, y  = df.shape

df_final = pd.concat([df, pd.DataFrame(np.zeros((x, m - y)), columns=range(y, m))], axis=1)

Out[341]:
         0        1        2        3        4       5    6    7    8    9  \
0  0.28001  0.32042  0.93222  0.87534  0.44252  0.2321  0.0  0.0  0.0  0.0
1  0.38001  0.42042  0.13222  0.67534  0.64252  0.4321  0.0  0.0  0.0  0.0

    10   11   12   13   14   15   16   17   18   19
0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
1  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
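
As an alternative sketch (not from the answer above), DataFrame.reindex with fill_value=0 can create the missing integer-named columns in a single call, assuming the feature columns are already named 0, 1, 2, and so on. Keeping the label apart while padding is an assumption about the asker's layout. The names features and labels, and the sample values, are hypothetical.

```python
import pandas as pd

# Hypothetical sample: features with default integer column names 0, 1, 2
features = pd.DataFrame(
    [[0.28001, 0.32042, 0.93222], [0.38001, 0.42042, 0.13222]]
)
# Hypothetical labels kept separately so they are not counted as a feature
labels = pd.Series([1, 0], name="label")

m = 8  # target feature count for the demo; 500 in the question

# reindex keeps the existing columns and fills the missing ones with 0
padded = features.reindex(columns=range(m), fill_value=0)

# Attach the label again as the last column
df_final = padded.assign(label=labels)

print(df_final.columns.tolist())
```

If the label is currently the last column of df, it could be split off first with df.drop(columns="label") and df["label"], so that df.shape does not count it as a feature.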

answered Sep 30, 2020 at 21:46

3 Comments

almo Over a year ago

Thanks, Andy L. I just tried it, but the new columns did not show up. Could you use my two rows and give me a quick demo? Thank you!

Andy L. Over a year ago

@almo: I added a sample df and the resulting output, using m = 20 for the demo. Check my updated answer.

almo Over a year ago

Thanks! I checked it as well. The problem was caused by my label, which I had changed. Thank you :)
