I have a pandas DataFrame with more than 4 columns. Some values in col1 are missing, and I want to fill them using the following fallback approach:
- first, use the average of col1 over the records that have the same col2, col3, and col4 values
- if there is no such record, use the average of col1 over the records that have the same col2 and col3 values
- if there is still no such record, use the average of col1 over the records that have the same col2 value
- if none of the above can be found, use the average of all non-missing values in col1
What's the best way to do this?
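
For reference, here is a minimal sketch of the cascade described above, not necessarily the best way. It assumes the column names are literally col1 through col4 and that the key columns (col2, col3, col4) have no missing values themselves. The function name `fill_col1` and its parameters `target` and `keys` are hypothetical, made up for illustration.

```python
import pandas as pd


def fill_col1(df, target="col1", keys=("col2", "col3", "col4")):
    """Fill missing `target` values with progressively coarser group means.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    keys = list(keys)
    out = df.copy()
    # Go from the most specific grouping (all keys) to the least (first key only).
    for n in range(len(keys), 0, -1):
        # Means are taken from the ORIGINAL data, so values filled at a finer
        # level never influence the averages at a coarser level.
        # transform("mean") ignores NaN and returns one value per row; a group
        # with no non-missing values gives NaN, which falls through to the
        # next, coarser level.
        group_mean = df.groupby(keys[:n])[target].transform("mean")
        out[target] = out[target].fillna(group_mean)
    # Last resort: the mean of all non-missing values in the original column.
    out[target] = out[target].fillna(df[target].mean())
    return out
```

Calling `fill_col1(df)` returns a copy and leaves the original frame unchanged. Computing every mean from the original frame is a design choice: reusing the partially filled column would let imputed values shift the coarser averages.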