I have a DataFrame like this:

import numpy as np
import pandas as pd
df = (pd.DataFrame({'ID': ['ID1', 'ID2', 'ID3'], 
                    'Values': [['AB', 'BC'], np.nan, ['AB', 'CD']]}))

df

    ID  Values
0   ID1 [AB, BC]
1   ID2   NaN
2   ID3 [AB, CD]

I want to split the items inside each list into indicator columns, such that:

    ID  AB  BC  CD
0   ID1 1   1   0
1   ID2 0   0   0
2   ID3 1   0   1

1 Answer

The pandas string methods handle missing values gracefully, so use Series.str.join followed by Series.str.get_dummies. DataFrame.pop extracts the column (removing it from df), and the resulting indicator columns are then joined back to the original data:

df = df.join(df.pop('Values').str.join('|').str.get_dummies())
print (df)
    ID  AB  BC  CD
0  ID1   1   1   0
1  ID2   0   0   0
2  ID3   1   0   1
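
To make the chain above easier to follow, here is a step-by-step sketch of the same approach using the sample data from the question. The intermediate names (joined, dummies, result) are illustrative only and not part of the original answer; unlike the one-liner, this version leaves df unchanged.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'ID': ['ID1', 'ID2', 'ID3'],
                   'Values': [['AB', 'BC'], np.nan, ['AB', 'CD']]})

# Step 1: join each list into one delimited string; NaN rows stay NaN.
joined = df['Values'].str.join('|')

# Step 2: split on the default '|' separator and build one indicator
# column per distinct token; NaN rows become all zeros.
dummies = joined.str.get_dummies()

# Step 3: attach the indicator columns to the remaining columns by index.
result = df.drop(columns='Values').join(dummies)
print(result)
```

Note that this relies on '|' never appearing inside the list items themselves, since it is used as the separator.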

EDIT: If the values are not lists but string representations of lists, use ast.literal_eval to convert them to lists first:

import ast

df = (df.join(df.pop('Values')
        .apply(ast.literal_eval)
        .str.join('|')
        .str.get_dummies()))
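
When the column holds strings rather than lists, Series.str.join inserts the separator between the characters of each string, which would explain indicator columns for single characters. The sketch below assumes the column mixes such strings with missing values; since ast.literal_eval raises an error on a non-string such as NaN, a small helper (parse_list, a hypothetical name not in the original answer) converts only the string entries.

```python
import ast

import numpy as np
import pandas as pd

# Assumed input: lists stored as their string representation, plus a NaN.
df = pd.DataFrame({'ID': ['ID1', 'ID2', 'ID3'],
                   'Values': ["['AB', 'BC']", np.nan, "['AB', 'CD']"]})

def parse_list(value):
    # Hypothetical helper: parse strings, pass missing values through unchanged.
    return ast.literal_eval(value) if isinstance(value, str) else value

parsed = df['Values'].apply(parse_list)
dummies = parsed.str.join('|').str.get_dummies()
result = df.drop(columns='Values').join(dummies)
print(result)
```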

3 Comments

With pop, I'm not getting the values as columns but individual characters.
@Hardikgupta - It works fine for me. Are the values in your real data lists?
My real data looks like ['ABC', 'PWR']
