I'm working with a DataFrame like this:

   id type1 type2 type3
0   1   dog   NaN   NaN
1   2   cat   NaN   NaN
2   3   dog   cat   NaN
3   4   cow   NaN   NaN
4   5   dog   NaN   NaN
5   6   cat   NaN   NaN
6   7   cat   dog   cow
7   8   dog   NaN   NaN

How can I transform it into the following DataFrame? Thank you.

   id  dog  cat  cow
0   1  1.0  NaN  NaN
1   2  NaN  1.0  NaN
2   3  1.0  1.0  NaN
3   4  NaN  NaN  1.0
4   5  1.0  NaN  NaN
5   6  NaN  1.0  NaN
6   7  1.0  1.0  1.0
7   8  1.0  NaN  NaN
  • Have you tried using OneHotEncoder from sklearn? Commented Sep 19, 2019 at 5:01
  • Not yet. This isn't for machine learning, so I'm not sure it's an appropriate way. Commented Sep 19, 2019 at 5:04

1 Answer

First, select only the type columns with DataFrame.filter and reshape them with DataFrame.stack, which makes it possible to call Series.str.get_dummies. Then, for 0/1 output, take the max per first level of the MultiIndex and change 0 to NaN with DataFrame.mask. Finally, add the first column back with DataFrame.join:

# max(level=0) was removed in pandas 2.0; groupby(level=0).max() is the equivalent
df1 = df.filter(like='type').stack().str.get_dummies().groupby(level=0).max().mask(lambda x: x == 0)
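
To make the chain easier to follow, here is a minimal step-by-step sketch of the same idea. The sample frame is rebuilt from the question, and the intermediate names (stacked, dummies, per_row, result) are only illustrative. It uses groupby(level=0).max() for the per-row maximum, which should behave the same on old and new pandas versions:

```python
import numpy as np
import pandas as pd

# Sample frame copied from the question.
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6, 7, 8],
    "type1": ["dog", "cat", "dog", "cow", "dog", "cat", "cat", "dog"],
    "type2": [np.nan, np.nan, "cat", np.nan, np.nan, np.nan, "dog", np.nan],
    "type3": [np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, "cow", np.nan],
})

# 1. Keep only the type columns and reshape them into one long Series
#    indexed by (row label, column name). The explicit dropna() guards
#    against stack implementations that keep missing values.
stacked = df.filter(like="type").stack().dropna()

# 2. One 0/1 indicator column per distinct value.
dummies = stacked.str.get_dummies()

# 3. Collapse back to one row per original row label.
per_row = dummies.groupby(level=0).max()

# 4. Replace 0 with NaN (this upcasts the integers to float).
df1 = per_row.mask(per_row == 0)

# 5. Put the id column back in front.
result = df[["id"]].join(df1)
print(result)
```

The indicator columns come out in alphabetical order (cat, cow, dog), because get_dummies sorts the distinct values.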

Or use get_dummies, take the max per column name, and finally change 0 to NaN:

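# dtype=int keeps 0/1 output (pandas 2.0+ defaults to booleans)
df1 = (pd.get_dummies(df.filter(like='type'), prefix='', prefix_sep='', dtype=int)
         .T.groupby(level=0).max().T   # replaces .max(level=0, axis=1), removed in pandas 2.0
         .mask(lambda x: x == 0))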

df = df[['id']].join(df1)
print(df)
   id  cat  cow  dog
0   1  NaN  NaN  1.0
1   2  1.0  NaN  NaN
2   3  1.0  NaN  1.0
3   4  NaN  1.0  NaN
4   5  NaN  NaN  1.0
5   6  1.0  NaN  NaN
6   7  1.0  1.0  1.0
7   8  NaN  NaN  1.0
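
Note that the columns above come out alphabetically (cat, cow, dog), while the question shows dog, cat, cow. If that order matters, one option is to reorder by first appearance. The following is a sketch under that assumption, with the sample frame rebuilt from the question; the names types, combined, order and result are only illustrative:

```python
import numpy as np
import pandas as pd

# Sample frame copied from the question.
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6, 7, 8],
    "type1": ["dog", "cat", "dog", "cow", "dog", "cat", "cat", "dog"],
    "type2": [np.nan, np.nan, "cat", np.nan, np.nan, np.nan, "dog", np.nan],
    "type3": [np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, "cow", np.nan],
})

types = df.filter(like="type")

# Empty prefix and separator give bare labels such as "cat";
# dtype=int forces 0/1 integers instead of booleans.
dummies = pd.get_dummies(types, prefix="", prefix_sep="", dtype=int)

# The same label can now appear several times (once per type column).
# Transpose, group the duplicated labels, take the max, transpose back.
combined = dummies.T.groupby(level=0).max().T

# Distinct values in order of first appearance, reading row by row.
order = list(types.stack().dropna().unique())

# Replace 0 with NaN, then apply the first-appearance column order.
df1 = combined.mask(combined == 0)[order]

result = df[["id"]].join(df1)
print(result)
```

Here order is built from the stacked values, so the result columns are id, dog, cat, cow for this sample data.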