I have the following DataFrame:
   A  B  C
0  1  3  3
1  1  9  4
2  4  6  3
I would like to sum every possible combination of these columns, without repetition, so that I end up with a DataFrame containing the following columns: A, B, C, A+B, A+C, B+C, A+B+C. No column should appear more than once in any combination, e.g. A+A+B+C or A+B+B+C.
I would also like each new column to be labelled with the names of the columns it combines (e.g. for the combination A + B, the column name should be 'A_B').
This is the desired DataFrame:
   A  B  C  A_B  A_C  B_C  A_B_C
0  1  3  3    4    4    6      7
1  1  9  4   10    5   13     14
2  4  6  3   10    7    9     13
This is relatively easy with just 3 variables using itertools and I have used the following code to do it:
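import itertools
import pandas as pd

# Sample data from the top of the question
df = pd.DataFrame({'A': [1, 1, 4], 'B': [3, 9, 6], 'C': [3, 4, 3]})

# Sums of every pair of columns
combos_2 = pd.DataFrame({'{}_{}'.format(a, b):
                         df[a] + df[b]
                         for a, b in itertools.combinations(df.columns, 2)})
# Sums of every triple of columns
combos_3 = pd.DataFrame({'{}_{}_{}'.format(a, b, c):
                         df[a] + df[b] + df[c]
                         for a, b, c in itertools.combinations(df.columns, 3)})
# Original columns followed by the pairwise and three-way sums
composites = pd.concat([df, combos_2, combos_3], axis=1)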
However, I can't figure out how to extend this code in a pythonic way to handle a DataFrame with a much larger number of columns, since each combination size currently needs its own hand-written block. Is there a way of making the code above more pythonic and extending it to an arbitrary number of columns? Or is there a more efficient way of generating the combinations?
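For reference, one possible direction is to loop over the combination size instead of writing one block per size. The following is only a minimal sketch under some assumptions: `sum_combinations` and its `sep` parameter are hypothetical names introduced here, and it uses `DataFrame.sum(axis=1)`, which skips NaN values by default, whereas `df[a] + df[b]` would propagate them.

```python
import itertools

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"A": [1, 1, 4], "B": [3, 9, 6], "C": [3, 4, 3]})


def sum_combinations(frame, sep="_"):
    """Return one summed column per non-empty combination of frame's columns.

    Hypothetical helper for illustration; not part of pandas.
    """
    columns = {}
    # r is the combination size: r=1 reproduces the original columns,
    # r=len(frame.columns) gives the sum of all columns.
    for r in range(1, len(frame.columns) + 1):
        for combo in itertools.combinations(frame.columns, r):
            # Join the column names to build labels such as 'A_B'.
            label = sep.join(map(str, combo))
            # Row-wise sum over the selected columns.
            columns[label] = frame[list(combo)].sum(axis=1)
    # Build the result in one step; dict insertion order sets the column order.
    return pd.DataFrame(columns)


composites = sum_combinations(df)
print(composites)
```

Note that n columns produce 2**n - 1 combinations, so the result grows very quickly (20 columns already give more than a million). If only small combinations are needed, the upper bound of the `range` could be capped.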