I have a DataFrame with millions of rows and a lot of NaN values. Here is an example:
index  Company    Area
0      Google     Technology
1      Coca Cola  Drinks
2      NaN        Drinks
3      Apple      Technology
4      NaN        Technology
5      Gatorade   Drinks
6      Dell       Technology
7      Apple      Technology
8      Coca Cola  Drinks
9      NaN        Drinks
10     Google     Technology
My idea is to fill the NaN values in Company with one of the 2 most common Company values for that row's Area.
For example: if the most frequent companies in the Technology area are Apple and Google, I would like to fill the NaN values in the rows where df['Area'] == 'Technology' with one of those two values (chosen at random).
I've already created a grouped DataFrame with the most common values. It looks something like this:
Area        Company
Technology  Google
Technology  Apple
Drinks      Coca Cola
Drinks      Pepsi
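For reference, a table of this shape can be produced roughly as follows. This is only a sketch that assumes the column names shown above; the names counts and top2 are chosen for illustration. On the small sample it gives Coca Cola and Gatorade for Drinks, because Pepsi does not appear in the sample rows.

```python
import numpy as np
import pandas as pd

# Sample data copied from the example above.
df = pd.DataFrame({
    "Company": ["Google", "Coca Cola", np.nan, "Apple", np.nan, "Gatorade",
                "Dell", "Apple", "Coca Cola", np.nan, "Google"],
    "Area": ["Technology", "Drinks", "Drinks", "Technology", "Technology",
             "Drinks", "Technology", "Technology", "Drinks", "Drinks",
             "Technology"],
})

# Count each (Area, Company) pair; groupby drops NaN keys by default,
# so rows with a missing Company are not counted.
counts = df.groupby(["Area", "Company"]).size().reset_index(name="n")

# Sort by frequency within each area and keep the 2 most frequent companies.
top2 = (
    counts.sort_values(["Area", "n"], ascending=[True, False])
          .groupby("Area")
          .head(2)[["Area", "Company"]]
          .reset_index(drop=True)
)
print(top2)
```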
The result should be something like this:
index  Company    Area
0      Google     Technology
1      Coca Cola  Drinks
2      Pepsi      Drinks
3      Apple      Technology
4      Google     Technology
5      Gatorade   Drinks
6      Dell       Technology
7      Apple      Technology
8      Coca Cola  Drinks
9      Pepsi      Drinks
10     Google     Technology
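One possible way to get a result of this kind is sketched below. It assumes the grouped table is available as a DataFrame with Area and Company columns (called top2 here, a name chosen for illustration) and uses NumPy's random generator to pick a candidate for each missing row. The seed is there only to make the run repeatable, and the loop runs once per area rather than once per row.

```python
import numpy as np
import pandas as pd

# Sample data copied from the example above.
df = pd.DataFrame({
    "Company": ["Google", "Coca Cola", np.nan, "Apple", np.nan, "Gatorade",
                "Dell", "Apple", "Coca Cola", np.nan, "Google"],
    "Area": ["Technology", "Drinks", "Drinks", "Technology", "Technology",
             "Drinks", "Technology", "Technology", "Drinks", "Drinks",
             "Technology"],
})

# The grouped table of most common companies, as shown in the post.
top2 = pd.DataFrame({
    "Area": ["Technology", "Technology", "Drinks", "Drinks"],
    "Company": ["Google", "Apple", "Coca Cola", "Pepsi"],
})

rng = np.random.default_rng(0)  # fixed seed, only for repeatability

# Series mapping each Area to its list of candidate companies.
candidates = top2.groupby("Area")["Company"].apply(list)

# Row labels of the missing companies, grouped by area.
missing = df[df["Company"].isna()]
for area, idx in missing.groupby("Area").groups.items():
    if area in candidates.index:
        # Draw one candidate per missing row in this area.
        df.loc[idx, "Company"] = rng.choice(candidates[area], size=len(idx))

print(df)
```

Because the picks are random, which of the two candidates lands in a given row varies with the seed. Areas that have no entry in top2 are left as NaN.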
I hope you can help me.
Thanks!!!