I have a DataFrame (df) in which some columns contain lists:
name  vector_1  vector_2   vector_3
foo   [1, 2]    [1, 3, 5]  [9]
bar   [3, 6]    [2, 4, 6]  [8]
I want to produce a new column in which each row's lists are concatenated into one flat list, keeping the values as integers, thus:
new_col
[1, 2, 1, 3, 5, 9]
[3, 6, 2, 4, 6, 8]
This does exactly what I need:
df["new_col"] = df["vector_1"] + df["vector_2"] + df["vector_3"]
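For anyone wanting to reproduce this, here is a minimal sketch of the setup. It assumes every cell holds a plain Python list (not a string that looks like a list); the frame construction is my reconstruction of the table above.

```python
import pandas as pd

# Sample data mirroring the table above; each cell is a plain Python list.
df = pd.DataFrame({
    "name": ["foo", "bar"],
    "vector_1": [[1, 2], [3, 6]],
    "vector_2": [[1, 3, 5], [2, 4, 6]],
    "vector_3": [[9], [8]],
})

# On object-dtype columns, + is applied element by element,
# so the lists in each row are concatenated.
df["new_col"] = df["vector_1"] + df["vector_2"] + df["vector_3"]
print(df["new_col"].tolist())
```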
However, the number of columns (and their names) may change from one user to another. Hence, the requirement is that the columns be passed in as a list, e.g. column_names = ["vector_1", "vector_2", "vector_3"]. This is the bit I am struggling with.
Something like this:
df['new_col'] = df[df.columns.intersection(column_names)].apply(
    lambda x: ','.join(x.dropna().astype(str)),
    axis=1
)
uses the list of column names fine, but converts the lists to strings, resulting in
new_col
[1, 2], [1, 3, 5], [9]
[3, 6], [2, 4, 6], [8]
where the square brackets are part of the string.
Iterating through the rows using column_names and a list comprehension results in something like
new_col
[1, 2]
[3, 6]
[1, 3, 5]
[2, 4, 6]
[9]
[8]
Any ideas?
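One direction that might work, sketched under the assumption that every name in column_names exists in df and every cell is a plain Python list (no NaN): zip the selected columns row by row and chain each row's lists together. The sample frame below is a reconstruction of the table in the question, not real data.

```python
from itertools import chain

import pandas as pd

# Reconstructed sample data; each cell is a plain Python list.
df = pd.DataFrame({
    "name": ["foo", "bar"],
    "vector_1": [[1, 2], [3, 6]],
    "vector_2": [[1, 3, 5], [2, 4, 6]],
    "vector_3": [[9], [8]],
})

# Assumed input: the user-supplied list of column names.
column_names = ["vector_1", "vector_2", "vector_3"]

# zip(...) yields one tuple of lists per row;
# chain.from_iterable flattens that tuple by exactly one level.
flattened = [
    list(chain.from_iterable(row))
    for row in zip(*(df[col] for col in column_names))
]

# Wrapping in a Series with df's index keeps the rows aligned.
df["new_col"] = pd.Series(flattened, index=df.index)
print(df["new_col"].tolist())
# [[1, 2, 1, 3, 5, 9], [3, 6, 2, 4, 6, 8]]
```

This avoids the string conversion entirely, so the result stays a list of integers. Rows containing NaN instead of a list would need to be handled separately, since chain.from_iterable cannot iterate over a float.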