Pandas Python - Transform a list-of-list column to multiple columns

Question

I would like to transform this DF

pd.DataFrame({"l1": [["fr en","en"]],
              "l2": [["fr en","in","it"]],
              "l3": [["he","es","fi"]],
              "l4": [["es"]]}).T
>> l1  [fr en, en]
   ...
   l4  [es]

to this DTM (document-term matrix):

data = [[1,1,0,0,0,0,0], [1,0,1,1,0,0,0], [0,0,0,0,1,1,1], [0,0,0,0,0,1,0]]
pd.DataFrame(index=["l1","l2","l3","l4"], data=data, columns=["fr en","en","in","it","he","es","fi"])
>>      fr en en in it he es fi
    l1  1     1  0  0  0  0  0
    ... ...

My current, inefficient approach is to chain all possible values and then count-vectorize each row:

from itertools import chain

langs = sorted(set(chain.from_iterable(df[0])))  # every distinct label, in a fixed order
pd.DataFrame(data=df[0].apply(lambda x: [1 if lang in x else 0 for lang in langs]).tolist(), columns=langs, index=df.index)

PS: I don't want to " ".join() the lists, because that would lose information: a multi-word label such as "fr en" would be split into two tokens.
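
To illustrate the concern in the PS, here is a minimal sketch (using two of the rows from the question; the variable names are only illustrative) of what happens when the lists are joined on a space and split again:

```python
import pandas as pd

# Two of the rows from the question: one list of labels per row.
s = pd.Series({"l1": ["fr en", "en"], "l2": ["fr en", "in", "it"]})

# Join each list on a space, then split on spaces again.
roundtrip = s.str.join(" ").str.split(" ")

labels_before = sorted(set(s.explode()))
labels_after = sorted(set(roundtrip.explode()))

print(labels_before)  # ['en', 'fr en', 'in', 'it']
print(labels_after)   # ['en', 'fr', 'in', 'it']
```

The label "fr en" no longer exists after the round trip, which is why a space is not a safe separator for this data.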

jezrael · Accepted Answer · 2018-06-26 12:26:42Z

3

I think you need MultiLabelBinarizer:

from sklearn.preprocessing import MultiLabelBinarizer

mlb = MultiLabelBinarizer()
df = pd.DataFrame(mlb.fit_transform(df[0]),columns=mlb.classes_, index=df.index)
print (df)
    en  es  fi  fr en  he  in  it
l1   1   0   0      1   0   0   0
l2   0   0   0      1   0   1   1
l3   0   1   1      0   1   0   0
l4   0   1   0      0   0   0   0

Alternatively, a slower solution is to join the lists with | and call str.get_dummies, provided this separator does not occur in the data:

df = df[0].str.join('|').str.get_dummies()
print (df)
    en  es  fi  fr en  he  in  it
l1   1   0   0      1   0   0   0
l2   0   0   0      1   0   1   1
l3   0   1   1      0   1   0   0
l4   0   1   0      0   0   0   0
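
The caveat about the separator can be seen in a small sketch. The label "x|y" below is hypothetical (it does not appear in the question's data), and the choice of "\x1f" as a replacement separator is only an assumption about what is unlikely to occur in real labels:

```python
import pandas as pd

# Hypothetical data in which one label itself contains the "|" character.
s = pd.Series({"a": ["x|y", "z"], "b": ["z"]})

# Joining on "|" makes get_dummies split the label "x|y" into "x" and "y".
broken = s.str.join("|").str.get_dummies()
print(list(broken.columns))  # ['x', 'y', 'z']

# Joining on a character that is absent from the data keeps the label whole.
sep = "\x1f"  # ASCII unit separator, assumed not to occur in any label
safe = s.str.join(sep).str.get_dummies(sep=sep)
print(list(safe.columns))  # ['x|y', 'z']
```

If no safe separator can be guaranteed, the MultiLabelBinarizer approach avoids the problem because it never converts the lists to strings.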

1 Comment

J. Doe Over a year ago

Excellent, I did not even think about the MultiLabelBinarizer.
