Combine multiple columns containing lists into one column given a list of column names

Question

I have a DataFrame in which some columns contain lists:

name   vector_1   vector_2   vector_3
foo    [1, 2]     [1, 3, 5]  [9]
bar    [3, 6]     [2, 4, 6]  [8]

I want to produce a new column in which each row's lists are concatenated into a single flat list, keeping the integers in their original order:

new_col
[1, 2, 1, 3, 5, 9]
[3, 6, 2, 4, 6, 8]

This does exactly what I need:

    df["new_col"] = df["vector_1"] + df["vector_2"] + df["vector_3"]

However, the number of columns (and names of those columns) may change from one user to another. Hence, the requirement is that the columns be passed in as a list ["vector_1", "vector_2", "vector_3"]. This is the bit I am struggling with.

Something like this

    df['new_col'] = df[df.columns.intersection(column_names)].apply(
        lambda x: ','.join(x.dropna().astype(str)),
        axis=1
    )

uses the list of column names fine, but converts the lists to strings resulting in

new_col
[1, 2], [1, 3, 5], [9]
[3, 6], [2, 4, 6], [8]

where the square brackets are part of the str.

Iterating through the rows using the 'column_names' and list comprehension would result in something like

new_col
[1, 2]
[3, 6]
[1, 3, 5]
[2, 4, 6]
[9]
[8]

Any ideas?

jezrael · Accepted Answer · 2021-05-25 08:51:17Z

2

The simplest way is to use sum across the selected columns:

    df['new_col'] = df[df.columns.intersection(column_names)].sum(axis=1)
    print(df)
      name vector_1   vector_2 vector_3             new_col
    0  foo   [1, 2]  [1, 3, 5]      [9]  [1, 2, 1, 3, 5, 9]
    1  bar   [3, 6]  [2, 4, 6]      [8]  [3, 6, 2, 4, 6, 8]
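As I understand it, `sum(axis=1)` works here because adding two lists concatenates them, which is the same operation as the chained `+` in the question. If you prefer to make that explicit, here is a minimal self-contained sketch that folds `+` over the columns with `functools.reduce`. This is an alternative, not the `sum` call above. The frame is rebuilt from the question's sample data, and `column_names` is assumed to be the user-supplied list.

```python
from functools import reduce
import operator

import pandas as pd

# Sample frame rebuilt from the question.
df = pd.DataFrame({
    "name": ["foo", "bar"],
    "vector_1": [[1, 2], [3, 6]],
    "vector_2": [[1, 3, 5], [2, 4, 6]],
    "vector_3": [[9], [8]],
})

# Assumed to come from the user; every name must exist in df.
column_names = ["vector_1", "vector_2", "vector_3"]

# Same as df["vector_1"] + df["vector_2"] + df["vector_3"],
# but driven by the list of names.
df["new_col"] = reduce(operator.add, (df[c] for c in column_names))

print(df["new_col"].tolist())
# [[1, 2, 1, 3, 5, 9], [3, 6, 2, 4, 6, 8]]
```

Note that this version concatenates in the order given by `column_names` and raises a `KeyError` for an unknown name, whereas `df.columns.intersection(column_names)` drops names that are not in the frame.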

If you also need to skip missing values:

    # flatten each row's lists, ignoring NaN cells
    f = lambda x: [z for y in x.dropna() for z in y]
    df['new_col'] = df[df.columns.intersection(column_names)].apply(f, axis=1)
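To see what the `dropna()` variant does, here is a small sketch with one missing cell. The NaN in `vector_2` is made up for illustration and is not in the question's data. `flatten_row` is just a named version of the lambda above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "name": ["foo", "bar"],
    "vector_1": [[1, 2], [3, 6]],
    "vector_2": [[1, 3, 5], np.nan],  # illustrative missing list for "bar"
    "vector_3": [[9], [8]],
})
column_names = ["vector_1", "vector_2", "vector_3"]


def flatten_row(row):
    # Drop NaN cells first, then concatenate the remaining lists.
    return [item for cell in row.dropna() for item in cell]


df["new_col"] = df[column_names].apply(flatten_row, axis=1)

print(df["new_col"].tolist())
# [[1, 2, 1, 3, 5, 9], [3, 6, 8]]
```

Without the `dropna()` the comprehension would try to iterate over the NaN float and raise a `TypeError`.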

If there are no missing values to handle:

    # flatten each row's lists
    f = lambda x: [z for y in x for z in y]
    df['new_col'] = df[df.columns.intersection(column_names)].apply(f, axis=1)
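Since the column list comes from the user, it may help to wrap the comprehension in a function that checks the names first. The sketch below is one way to do that. `combine_list_columns` is a hypothetical helper name, not a pandas function, and it uses `itertools.chain.from_iterable` in place of the nested comprehension.

```python
from itertools import chain

import pandas as pd


def combine_list_columns(df, column_names):
    """Hypothetical helper: one flat list per row, built from column_names in order."""
    missing = [c for c in column_names if c not in df.columns]
    if missing:
        raise KeyError(f"columns not found: {missing}")
    # chain.from_iterable concatenates the row's lists without nesting.
    return df[column_names].apply(
        lambda row: list(chain.from_iterable(row)), axis=1
    )


df = pd.DataFrame({
    "name": ["foo", "bar"],
    "vector_1": [[1, 2], [3, 6]],
    "vector_2": [[1, 3, 5], [2, 4, 6]],
    "vector_3": [[9], [8]],
})

# The result follows the order of the names passed in.
df["new_col"] = combine_list_columns(df, ["vector_3", "vector_1"])
print(df["new_col"].tolist())
# [[9, 1, 2], [8, 3, 6]]
```

Selecting with `df[column_names]` keeps the caller's column order and fails loudly on a bad name. Whether that is preferable to silently ignoring unknown names depends on your use case.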

edited May 25, 2021 at 8:51

answered May 25, 2021 at 8:38
