This question is very similar to "Updating column in a dataframe based on multiple columns".
I have the following dataset:
    CustomerID    TypeofContact      Occupation  Gender  MaritalStatus
0       200000     Self Enquiry        Salaried  Female         Single
1       200001  Company Invited        Salaried    Male       Divorced
2       200002     Self Enquiry     Free Lancer    Male         Single
3       200003  Company Invited        Salaried  Female       Divorced
4       200004     Self Enquiry  Small Business    Male       Divorced
5       200005  Company Invited        Salaried    Male         Single
6       200006     Self Enquiry  Small Business  Female       Divorced
7       200007     Self Enquiry        Salaried    Male        Married
8       200008              NaN        Salaried    Male         Single
9       200009     Self Enquiry  Small Business    Male       Divorced
10      200010     Self Enquiry  Small Business    Male       Divorced
11      200011              NaN        Salaried  Female         Single
12      200012     Self Enquiry  Small Business    Male        Married
13      200013              NaN  Small Business    Male        Married
14      200014     Self Enquiry        Salaried    Male         Single
I want to replace each NaN in the TypeofContact column with the TypeofContact of the first record that has a non-null TypeofContact and the same combination of Occupation, Gender and MaritalStatus as the NaN record.
Example:
The record with CustomerID 200014 can supply the value for CustomerID 200008, whose TypeofContact is NaN, because both records have the same Occupation, Gender and MaritalStatus.
The same applies to 200012 and 200013. The expected output is:
    CustomerID    TypeofContact      Occupation  Gender  MaritalStatus
0       200000     Self Enquiry        Salaried  Female         Single
1       200001  Company Invited        Salaried    Male       Divorced
2       200002     Self Enquiry     Free Lancer    Male         Single
3       200003  Company Invited        Salaried  Female       Divorced
4       200004     Self Enquiry  Small Business    Male       Divorced
5       200005  Company Invited        Salaried    Male         Single
6       200006     Self Enquiry  Small Business  Female       Divorced
7       200007     Self Enquiry        Salaried    Male        Married
8       200008     Self Enquiry        Salaried    Male         Single
9       200009     Self Enquiry  Small Business    Male       Divorced
10      200010     Self Enquiry  Small Business    Male       Divorced
11      200011     Self Enquiry        Salaried  Female         Single
12      200012     Self Enquiry  Small Business    Male        Married
13      200013     Self Enquiry  Small Business    Male        Married
14      200014     Self Enquiry        Salaried    Male         Single
I was able to do this by creating a second dataframe containing only the non-null records, looping through it, and updating the original dataframe using CustomerID as the identifier.
What would be an efficient way to accomplish this?
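One loop-free possibility, sketched here under a few assumptions, is to group on the three key columns and fill within each group. The names `keys`, `first_in_group`, `filled_first` and `filled_nearest` are made up for illustration, and the sketch assumes the key columns themselves contain no NaN. Note that a strict "first non-null record in the group" rule gives 200008 the value "Company Invited" (from 200005), whereas the expected output above appears to take "Self Enquiry" from 200014. The second variant, a backward fill followed by a forward fill within each group, happens to reproduce the expected output on this sample.

```python
import numpy as np
import pandas as pd

# Sample records copied from the question.
df = pd.DataFrame({
    "CustomerID": range(200000, 200015),
    "TypeofContact": [
        "Self Enquiry", "Company Invited", "Self Enquiry", "Company Invited",
        "Self Enquiry", "Company Invited", "Self Enquiry", "Self Enquiry",
        np.nan, "Self Enquiry", "Self Enquiry", np.nan,
        "Self Enquiry", np.nan, "Self Enquiry",
    ],
    "Occupation": [
        "Salaried", "Salaried", "Free Lancer", "Salaried", "Small Business",
        "Salaried", "Small Business", "Salaried", "Salaried", "Small Business",
        "Small Business", "Salaried", "Small Business", "Small Business", "Salaried",
    ],
    "Gender": [
        "Female", "Male", "Male", "Female", "Male", "Male", "Female", "Male",
        "Male", "Male", "Male", "Female", "Male", "Male", "Male",
    ],
    "MaritalStatus": [
        "Single", "Divorced", "Single", "Divorced", "Divorced", "Single",
        "Divorced", "Married", "Single", "Divorced", "Divorced", "Single",
        "Married", "Married", "Single",
    ],
})

# Hypothetical name for the columns that define a "matching" record.
keys = ["Occupation", "Gender", "MaritalStatus"]

# Variant 1: first non-null TypeofContact in each key group, broadcast to
# every row of that group ("first" skips NaN by default).
first_in_group = df.groupby(keys)["TypeofContact"].transform("first")
# fillna only touches the missing values; existing values are kept.
filled_first = df.assign(TypeofContact=df["TypeofContact"].fillna(first_in_group))

# Variant 2: within each key group, take the next non-null value below the
# NaN row, falling back to the nearest one above it.
filled_nearest = df.assign(
    TypeofContact=df.groupby(keys)["TypeofContact"].transform(
        lambda s: s.bfill().ffill()
    )
)

print(filled_first.loc[[8, 11, 13]])
print(filled_nearest.loc[[8, 11, 13]])
```

In both variants, a record whose key combination has no non-null TypeofContact anywhere in the data would stay NaN.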
Thanks.
For TypeofContact, where does the "combination of not null Occupation, Gender & Marital Status" come into play?