Python3, adding varying number of columns in pandas dataframe

Question

I have a dataframe with two initial columns, col1 and col2. I want to add more columns based on the values in a list named y. The catch is that len(y) varies from case to case.

import numpy as np
import pandas as pd

class Banana:
    def __init__(self):
        self.val1 = None
        self.val2 = None
        self.name = None

b_a = Banana()
b_b = Banana()
b_c = Banana()
bananas = list()

# step 1: data preparation
case_dependent_value = 3
v1 = 1
v2 = np.arange(case_dependent_value)

b_a.val1 = v1
b_a.val2 = v2
b_a.name = 'b_a'

b_b.val1 = v1
b_b.val2 = v2
b_b.name = 'b_b'

b_c.val1 = v1
b_c.val2 = v2
b_c.name = 'b_c'

bananas.append(b_a)
bananas.append(b_b)
bananas.append(b_c)

# step 2: make dataframe
values_1 = list()
names = list()
for each_banana in bananas:
    values_1.append(each_banana.val1)
    names.append(each_banana.name)

df = pd.DataFrame({'values_1': values_1, 'names': names})

The generated df is:

    names | values_1
0    b_a  |  1
1    b_b  |  1
2    b_c  |  1

Now I want to add the val2 of objects b_a, b_b, b_c as columns in the df. The size of val2 changes from case to case, but within each case it is known; in this example it is 3. So the result should be:

    names | values_1 | values_2_1 | values_2_2 | values_2_3
0    b_a  |   1      |    0       |     1      |   2
1    b_b  |   1      |    0       |     1      |   2
2    b_c  |   1      |    0       |     1      |   2

Could you please tell me how to achieve it? Thanks

Nickil Maveli · Accepted Answer · 2017-03-17 09:54:17Z

2

If the number of columns to be created is known beforehand, you could use DF.assign() to add the columns dynamically by unpacking a dictionary into its function call, as shown:

# consider a given size
size = 3
# generate column names with incremental suffixes: values_2_1, values_2_2, ...
cols = ["values_2_{}".format(item) for item in range(1, size + 1)]
# map each column name to its corresponding value in a dict and unpack it into assign()
# assign() returns a new dataframe, so keep the result
df = df.assign(**dict(zip(cols, range(size))))
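For reference, here is a self-contained sketch of the same idea. The dataframe is a stand-in for the one in the question, and `new_columns` and `result` are names introduced only for this illustration. It shows that each keyword passed to `assign()` becomes one column, and that a scalar value is broadcast to every row.

```python
import pandas as pd

# Stand-in for the question's dataframe (assumed contents).
df = pd.DataFrame({"names": ["b_a", "b_b", "b_c"], "values_1": [1, 1, 1]})

size = 3  # number of extra columns, known for the current case
cols = ["values_2_{}".format(i) for i in range(1, size + 1)]

# Each dict key becomes a keyword argument of assign(), i.e. a new column.
# The values here are scalars (0, 1, 2), so every row gets the same number.
new_columns = dict(zip(cols, range(size)))
result = df.assign(**new_columns)

print(result)
```

Because the values are scalars, every row ends up with the same 0, 1, 2. Passing one array per column instead of a scalar gives each row its own values.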

edited Mar 17, 2017 at 9:54

answered Mar 17, 2017 at 9:23

Nickil Maveli

2 Comments

aura Over a year ago

Thanks @NickilMaveli, but your suggestion does not fully solve my problem. The last line hard-codes the values_2 columns of all 3 objects b_a, b_b, and b_c to the same "0, 1, 2". In the real case, the values of b_a.val2, b_b.val2, and b_c.val2 are different. What should I do then?

Nickil Maveli Over a year ago

It's not getting hardcoded, as you can see. The values get filled based on the contents of your list, "y", but they have to match the length of the index. I'm hoping that's a nested list of the form y = [[1,2,3],[4,5,6],[7,8,9]] for a DF whose row count is 3. If it's not in that format, you can easily group the flat list into chunks of three and process it further.
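A minimal sketch of what this comment describes, assuming `y` holds one inner list per dataframe row. The sample numbers and the names `arr`, `flat`, and `chunked` are illustrative, not from the original post. Transposing the nested list yields one array per new column, and each array has the same length as the index, so `assign()` fills every row with its own values.

```python
import numpy as np
import pandas as pd

# Stand-in for the question's dataframe (assumed contents).
df = pd.DataFrame({"names": ["b_a", "b_b", "b_c"], "values_1": [1, 1, 1]})

# Assumed per-row values: one inner list per object (b_a, b_b, b_c).
y = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

arr = np.asarray(y)   # shape: (number of rows, size)
size = arr.shape[1]   # number of new columns for this case
cols = ["values_2_{}".format(i) for i in range(1, size + 1)]

# arr.T iterates column by column, so each new column receives an array
# whose length matches the dataframe's index.
result = df.assign(**dict(zip(cols, arr.T)))

# If the values arrive as one flat list, group them into rows first.
flat = [1, 2, 3, 4, 5, 6, 7, 8, 9]
chunked = np.asarray(flat).reshape(len(df), size)

print(result)
```

With the question's objects, `y` could be built as `[b.val2 for b in bananas]`, provided every `val2` has the same length.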
