I have an Excel file in which I need to check certain conditions and fill in the remarks column for every row that satisfies them. I read the necessary columns into a DataFrame, and here is how it looks:
svc_no  i_status  caller_id  f_status  result     remarks
11111   WO        11111      WO        Not Match  Duplicate svc_no
22222   WO        22222      WO        Match
11111   WO        n/a        SP        Not Match  Duplicate svc_no
The conditions would be:
- The svc_no is duplicated
- One of the duplicates has a caller_id equal to its svc_no
- The other duplicate has a value of 'n/a' or 'NULL' in caller_id
- Result is Not Match
I used .loc and wrote it this way:
df.loc[(df['svc_no'] != 'NULL') & (df['svc_no'] == df['caller_id']) & (df['svc_no'].duplicated()) & (df['i_status'] == 'WO') & (df['f_status'] == 'WO') & (df['result'] == 'Not Match'), ['remarks']] = 'Duplicate svc_no'
This code may be right for the row where the first duplicate appears, but it does not apply to the row where the other duplicate appears.
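Part of the reason may be how duplicated() picks rows. The following is only a small sketch on the svc_no values from the sample above (stored as strings, which is an assumption): by default duplicated() uses keep='first' and flags only the later occurrences, while keep=False flags every row whose value is repeated.

```python
import pandas as pd

# svc_no values from the sample table, assumed to be strings
svc_no = pd.Series(['11111', '22222', '11111'])

# Default keep='first': the first occurrence is NOT flagged, later ones are
later_only = svc_no.duplicated()

# keep=False: every row whose value appears more than once is flagged
all_repeats = svc_no.duplicated(keep=False)

print(later_only.tolist())   # [False, False, True]
print(all_repeats.tolist())  # [True, False, True]
```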
Question: Is there a way to compare the rows that share a duplicated svc_no and apply the necessary conditions using .loc, or is there another way to do this?
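One possible approach, shown only as a sketch: compute each condition per row, then use groupby(...).transform('any') to broadcast "some row in this svc_no group satisfies it" back to every row of the group, so that .loc can mark both duplicates at once. The sample frame is rebuilt by hand from the table above, all values are assumed to be strings, and the i_status/f_status checks are left out because the listed conditions do not mention them. The variable names are made up for illustration.

```python
import pandas as pd

# Sample data rebuilt from the table above (all values assumed to be strings)
df = pd.DataFrame({
    'svc_no':    ['11111', '22222', '11111'],
    'i_status':  ['WO', 'WO', 'WO'],
    'caller_id': ['11111', '22222', 'n/a'],
    'f_status':  ['WO', 'WO', 'SP'],
    'result':    ['Not Match', 'Match', 'Not Match'],
    'remarks':   ['', '', ''],
})

# Row-level flags
is_dup = df['svc_no'].duplicated(keep=False)        # svc_no appears more than once
same_id = df['svc_no'] == df['caller_id']           # caller_id equals svc_no
missing_id = df['caller_id'].isin(['n/a', 'NULL'])  # caller_id is a placeholder

# Group-level flags: True on every row of a svc_no group if any row in it qualifies
has_same = same_id.groupby(df['svc_no']).transform('any')
has_missing = missing_id.groupby(df['svc_no']).transform('any')

mask = is_dup & has_same & has_missing & (df['result'] == 'Not Match')
df.loc[mask, 'remarks'] = 'Duplicate svc_no'

print(df[['svc_no', 'caller_id', 'result', 'remarks']])
```

On the sample data this marks both rows with svc_no 11111 and leaves the 22222 row untouched. If the status columns also matter, they can be added to mask as further row-level conditions.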