Suppose I have two dataframes, conditions and data:
import pandas as pd

conditions = pd.DataFrame({
    'class': [1, 2, 3, 4, 4, 5, 5, 4, 4, 5, 5, 5],
    'primary_lower': [0, 0, 0, 160, 160, 160, 160, 160, 160, 160, 160, 800],
    'primary_upper': [9999, 9999, 9999, 480, 480, 480, 480, 480, 480, 480, 480, 4000],
    'secondary_lower': [0, 0, 0, 3500, 6100, 3500, 6100, 0, 4800, 0, 4800, 10],
    'secondary_upper': [9999, 9999, 9999, 4700, 9999, 4700, 9999, 4699, 6000, 4699, 6000, 3000],
    'group': ['A', 'A', 'A', 'B', 'B', 'B', 'B', 'C', 'C', 'C', 'C', 'C'],
})

data = pd.DataFrame({
    'class': [1, 1, 4, 4, 5, 5, 2],
    'primary': [2000, 9100, 1100, 170, 300, 210, 1000],
    'secondary': [1232, 3400, 2400, 380, 3600, 4800, 8600],
})
I'd like to generate a new column (group) in the "data" table that assigns a group to each row given the conditions provided in the "conditions" table.
The conditions table is structured so that rows within each group are joined by "OR"s and columns are joined by "AND"s. For example, to be assigned group "B":
(class = 4 AND 160<=primary<=480 AND 3500<=secondary<=4700)
OR
(class = 4 AND 160<=primary<=480 AND 6100<=secondary<=9999)
OR
(class = 5 AND 160<=primary<=480 AND 3500<=secondary<=4700)
OR
(class = 5 AND 160<=primary<=480 AND 6100<=secondary<=9999)
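To make the rule concrete, the group "B" condition above can be written by hand as a boolean mask. This is only a minimal sketch for one group, assuming both bounds are inclusive (as the <= signs suggest); the names is_b and b_rows are just illustrative.

```python
import pandas as pd

data = pd.DataFrame({
    'class': [1, 1, 4, 4, 5, 5, 2],
    'primary': [2000, 9100, 1100, 170, 300, 210, 1000],
    'secondary': [1232, 3400, 2400, 380, 3600, 4800, 8600],
})

# The four OR-ed rows for group "B" share one primary range and cover
# classes 4 and 5 with the same two secondary ranges, so they collapse to:
# class in {4, 5} AND primary range AND (either secondary range).
# Series.between is inclusive on both ends by default.
is_b = (
    data['class'].isin([4, 5])
    & data['primary'].between(160, 480)
    & (
        data['secondary'].between(3500, 4700)
        | data['secondary'].between(6100, 9999)
    )
)

# Rows of data that satisfy the group "B" rule.
b_rows = data[is_b]
```

Writing a mask like this for every group does not scale, which is why I'd like something driven by the conditions table itself.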
Any row that doesn't match any of the conditions should be assigned the group "Other". The final dataframe would then look like this:
+-------+---------+-----------+-------+
| class | primary | secondary | group |
+-------+---------+-----------+-------+
| 1 | 2000 | 1232 | A |
| 1 | 9100 | 3400 | A |
| 4 | 1100 | 2400 | Other |
| 4 | 170 | 380 | C |
| 5 | 300 | 3600 | B |
| 5 | 210 | 4800 | C |
| 2 | 1000 | 8600 | A |
+-------+---------+-----------+-------+
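
One possible way to get this output is to merge the two tables on class, keep only the pairs where both range checks pass, and take one matching rule per data row. The sketch below is not necessarily the most efficient approach, and it rests on an assumption the tables don't state explicitly: the row (5, 300, 3600) satisfies both a "B" rule (3500 to 4700) and a "C" rule (0 to 4699), so to reproduce the "B" shown above it assumes the first matching row in conditions wins. The helper column names row and rule are made up for the illustration.

```python
import pandas as pd

conditions = pd.DataFrame({
    'class': [1, 2, 3, 4, 4, 5, 5, 4, 4, 5, 5, 5],
    'primary_lower': [0, 0, 0, 160, 160, 160, 160, 160, 160, 160, 160, 800],
    'primary_upper': [9999, 9999, 9999, 480, 480, 480, 480, 480, 480, 480, 480, 4000],
    'secondary_lower': [0, 0, 0, 3500, 6100, 3500, 6100, 0, 4800, 0, 4800, 10],
    'secondary_upper': [9999, 9999, 9999, 4700, 9999, 4700, 9999, 4699, 6000, 4699, 6000, 3000],
    'group': ['A', 'A', 'A', 'B', 'B', 'B', 'B', 'C', 'C', 'C', 'C', 'C'],
})
data = pd.DataFrame({
    'class': [1, 1, 4, 4, 5, 5, 2],
    'primary': [2000, 9100, 1100, 170, 300, 210, 1000],
    'secondary': [1232, 3400, 2400, 380, 3600, 4800, 8600],
})

# Pair every data row with every condition row of the same class.
# 'row' and 'rule' (hypothetical names) remember the original positions.
candidates = (
    data.rename_axis('row').reset_index()
        .merge(conditions.rename_axis('rule').reset_index(), on='class')
)

# Keep only the pairs where both inclusive range checks pass (the ANDs).
in_primary = (
    (candidates['primary'] >= candidates['primary_lower'])
    & (candidates['primary'] <= candidates['primary_upper'])
)
in_secondary = (
    (candidates['secondary'] >= candidates['secondary_lower'])
    & (candidates['secondary'] <= candidates['secondary_upper'])
)
matches = candidates[in_primary & in_secondary]

# Assumption: when several rules match, the earliest row in conditions wins.
first_match = (
    matches.sort_values(['row', 'rule'])
           .drop_duplicates('row')
           .set_index('row')['group']
)

# Rows with no matching rule are missing from first_match and become "Other".
result = data.assign(group=first_match.reindex(data.index).fillna('Other'))
print(result)
```

Note that the merge creates one intermediate row per (data row, same-class rule) pair, so memory use grows with the number of rules per class.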