I am having trouble creating dummy variables from a dataset like this one:
x = pd.DataFrame({'Temp': ['Hot', 'Cold', 'Warm', 'Cold'], 'Temp_2': [np.nan, 'Warm', 'Cold', np.nan]})
Note that the values are the same in both variables (Hot, Cold or Warm).
   Temp Temp_2
0   Hot    NaN
1  Cold   Warm
2  Warm   Cold
3  Cold    NaN
My problem is that pd.get_dummies does not take this relationship into account and encodes the two variables independently:
   Temp_Cold  Temp_Hot  Temp_Warm  Temp_2_Cold  Temp_2_Warm
0          0         1          0            0            0
1          1         0          0            0            1
2          0         0          1            1            0
3          1         0          0            0            0
Is there a way to encode it so that I get a result like this instead?
   Cold  Hot  Warm
0     0    1     0
1     1    0     1
2     1    0     1
3     1    0     0
Thanks,
1 and remaining values 0 in a row for that column. Your logic is not suited using get_dummies() Hot and Hot in a same row in that x dataframe?
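One possible way to get the combined encoding is sketched below. It is not the only option. It assumes the goal is a 0/1 flag per category across both columns, so a value that appears twice in the same row (for example Hot and Hot) still yields 1 rather than 2. The names dummies and result are just illustrative.

```python
import numpy as np
import pandas as pd

x = pd.DataFrame({
    'Temp': ['Hot', 'Cold', 'Warm', 'Cold'],
    'Temp_2': [np.nan, 'Warm', 'Cold', np.nan],
})

# Encode every column with an empty prefix, so the dummy columns are named
# only by category (Cold, Hot, Warm) and repeat once per source column.
dummies = pd.get_dummies(x, prefix='', prefix_sep='').astype(int)

# Merge the columns that share a name: transpose, group rows by label,
# take the maximum (a logical OR for 0/1 data), then transpose back.
result = dummies.T.groupby(level=0).max().T

print(result)
```

Taking the maximum rather than the sum is a deliberate choice here: it keeps the output binary even if both columns hold the same value in one row. If a count per row is wanted instead, .sum() could replace .max().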