I'm trying to implement a simple voting score in a CSV file using pandas. Basically, if dataframe['C'] == 'Active' and dataframe['Count'] == 0, then dataframe['Combo'] should be 0. If dataframe['C'] == 'Active' and dataframe['Count'] == 1, then dataframe['Combo'] should be 1. If dataframe['C'] == 'Active' and dataframe['Count'] == 2, then dataframe['Combo'] should be 2, and so on.
This is my dataframe:
A     B     C         Count  Combo
Ptn1  Lig1  Inactive  0
Ptn1  Lig1  Inactive  1
Ptn1  Lig1  Active    2      2
Ptn2  Lig2  Active    0      0
Ptn2  Lig2  Inactive  1
Ptn3  Lig3  Active    0      0
Ptn3  Lig3  Inactive  1
Ptn3  Lig3  Inactive  2
Ptn3  Lig3  Inactive  3
Ptn3  Lig3  Active    4      3
This is my code so far for clarity:
import pandas as pd
df = pd.read_csv('affinity.csv')
VOTE = 0
df['Combo'] = ''
df.loc[(df['Classification'] == 'Active') & (df['Count'] == 0), 'Combo'] = VOTE
df.loc[(df['Classification'] == 'Active') & (df['Count'] == 1), 'Combo'] = VOTE + 1
df.loc[(df['Classification'] == 'Active') & (df['Count'] == 2), 'Combo'] = VOTE + 2
df.loc[(df['Classification'] == 'Active') & (df['Count'] > 3), 'Combo'] = VOTE + 3
My code was able to do this correctly. However, there are two 'Active' values for the pair Ptn3-Lig3; one at dataframe['Count'] = 0 and another at dataframe['Count'] = 4.
Is there a way to ignore the second value (i.e. consider only the smallest dataframe['Count'] value) and add the corresponding number to dataframe['Combo']?
I know pandas.DataFrame.drop_duplicates() might be a way to select only unique values, but I would really like to avoid deleting any rows.
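To make the goal concrete, here is a rough sketch of the kind of result I'm after, built on the sample data above rather than on affinity.csv. It assumes the classification column is named 'C' as in the table (my real code uses 'Classification'), that a pair is identified by columns A and B, and that any Count of 3 or more should score 3. The names active, min_active and first_active are just illustrative. The idea is to compute the smallest Count among the Active rows of each pair with groupby and transform, then fill Combo only on the row that matches it, so no rows are dropped.

```python
import pandas as pd

# Sample data copied from the table above (Combo is computed below).
df = pd.DataFrame({
    "A": ["Ptn1"] * 3 + ["Ptn2"] * 2 + ["Ptn3"] * 5,
    "B": ["Lig1"] * 3 + ["Lig2"] * 2 + ["Lig3"] * 5,
    "C": ["Inactive", "Inactive", "Active", "Active", "Inactive",
          "Active", "Inactive", "Inactive", "Inactive", "Active"],
    "Count": [0, 1, 2, 0, 1, 0, 1, 2, 3, 4],
})

active = df["C"] == "Active"

# Mask Inactive rows to NaN, then take the per-pair minimum and broadcast it
# back to every row of that pair, so only Active rows can supply the minimum.
min_active = (
    df["Count"].where(active)
    .groupby([df["A"], df["B"]])
    .transform("min")
)

# True only for the Active row(s) holding the smallest Active Count in each pair.
first_active = active & (df["Count"] == min_active)

# Assumption: scores are capped at 3, so Count >= 3 maps to 3.
df["Combo"] = ""
df.loc[first_active, "Combo"] = df.loc[first_active, "Count"].clip(upper=3)

print(df)
```

With this, the second Active row for Ptn3-Lig3 (Count 4) keeps an empty Combo while every row stays in the dataframe. If two Active rows in the same pair shared the same smallest Count, both would be filled.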