I have a dataframe in which some essential columns (which I need for further machine learning work) contain NaN values. I have another dataframe with similar data, and I want to pull the missing values from it.
For example, df1 is the main dataframe:
id col1 col2 col3 col4 col5
1 A AA 100 5.0 0.9
2 A BB 150 4.2 0.5
3 A CC 100 NaN NaN
4 B AA 300 NaN NaN
5 B BB 100 NaN NaN
6 C BB 50 3.4 0.6
The second dataframe, which I want to use to fill the NaN values in col4 and col5, could look like this:
id col1 col3 col4 col5
100 A 100 4.5 1.0
101 A 100 3.5 0.8
103 B 300 5.0 0.5
105 B 300 5.5 0.8
106 B 100 5.3 0.2
107 C 100 3.0 1.2
The second dataframe has no col2, so I can only merge on col1 and col3, and those key pairs contain duplicates there. Where there are duplicates, I have to choose the row with the maximum col4 value and use it to fill the corresponding values in df1.
For example, the expected df1 after filling in the data would be:
id col1 col2 col3 col4 col5
1 A AA 100 5.0 0.9
2 A BB 150 4.2 0.5
3 A CC 100 4.5 1.0
4 B AA 300 5.5 0.8
5 B BB 100 5.3 0.2
6 C BB 50 3.4 0.6
How would I do that?
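One possible approach, as a minimal sketch rather than a definitive answer: reduce the second dataframe to one row per (col1, col3) pair by keeping the row with the largest col4, left-merge that onto df1, and fill only the missing entries. The names df2, keys, best, merged, result and the "_fill" suffix are my own choices for illustration, and the sketch assumes that col1 and col3 together are the intended merge keys.

```python
import numpy as np
import pandas as pd

# Sample data copied from the tables above.
df1 = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6],
    "col1": ["A", "A", "A", "B", "B", "C"],
    "col2": ["AA", "BB", "CC", "AA", "BB", "BB"],
    "col3": [100, 150, 100, 300, 100, 50],
    "col4": [5.0, 4.2, np.nan, np.nan, np.nan, 3.4],
    "col5": [0.9, 0.5, np.nan, np.nan, np.nan, 0.6],
})
df2 = pd.DataFrame({
    "id": [100, 101, 103, 105, 106, 107],
    "col1": ["A", "A", "B", "B", "B", "C"],
    "col3": [100, 100, 300, 300, 100, 100],
    "col4": [4.5, 3.5, 5.0, 5.5, 5.3, 3.0],
    "col5": [1.0, 0.8, 0.5, 0.8, 0.2, 1.2],
})

keys = ["col1", "col3"]  # assumed merge keys (col2 is absent from df2)

# For each (col1, col3) pair, keep only the df2 row with the largest col4.
best = (
    df2.sort_values("col4", ascending=False)
       .drop_duplicates(subset=keys)
       [keys + ["col4", "col5"]]
)

# Left-merge so every df1 row is kept; the df2 values get a "_fill" suffix.
merged = df1.merge(best, on=keys, how="left", suffixes=("", "_fill"))

# Fill only the missing entries; existing df1 values are not overwritten.
for col in ["col4", "col5"]:
    merged[col] = merged[col].fillna(merged[col + "_fill"])

result = merged.drop(columns=["col4_fill", "col5_fill"])
print(result)
```

Note that this fills col4 and col5 independently, so a df1 row with col4 present but col5 missing would still take col5 from the max-col4 row of the second dataframe. Rows of df1 with no matching (col1, col3) pair are left unchanged.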

col5 always occur in the same rows as the maximum values in col4?