I have a dataframe laid out as shown below, except that the "flag_common" column does not exist yet (it is the output I want):
cat      flag_1  flag_2  flag_3  pop   state       year  flag_common
value1   1       0       0       1.5   Ohio        2000  1
value3   1       1       0       1.7   Ohio        2001  1
value2   1       1       0       3.6   Ohio        2002  1
value11  0       1       0       2.4   Nevada      2001  2
value5   0       0       0       2.9   Nevada      2002  0
value9   0       0       1       11.1  New York    2003  3
value13  0       0       0       23.4  New York    2004  0
value10  1       1       0       0.1   California  2009  1
value7   0       0       0       0.3   California  2010  0
value14  0       1       1       1.1   California  2009  2
The column "flag_common" should be derived from the binary flag columns: it should hold the number of the flag that is set to 1 (1-3 in this example), or 0 if no flag is set. When two or more flags are set to 1 in the same row, the lowest flag number is inserted into "flag_common". This has to be dynamic, i.e. able to handle flag_1 through flag_n.
I have sort of solved it by iterating over the rows with a for-loop, but my data is very large and this becomes quite slow, so I hope there is a vectorized, "pythonic" way to write this.
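To make the rule concrete, here is a rough sketch of what a row-by-row version could look like. This is not my exact code; the helper name lowest_set_flag and the small sample frame are made up for illustration, and it assumes the flag columns are already listed in ascending order.

```python
import pandas as pd

# Small illustrative frame (not the full data set)
df = pd.DataFrame({
    "flag_1": [1, 0, 0, 0],
    "flag_2": [1, 1, 0, 0],
    "flag_3": [0, 1, 1, 0],
})

# Assumption: flag columns are known and listed in ascending order
flag_cols = ["flag_1", "flag_2", "flag_3"]

def lowest_set_flag(row):
    """Return the number of the first flag equal to 1, or 0 if none is set."""
    for col in flag_cols:
        if row[col] == 1:
            return int(col.split("_")[1])
    return 0

# Row-wise apply: simple to read, but slow on large frames
df["flag_common"] = df.apply(lowest_set_flag, axis=1)
print(df)
```

Something of this shape gives the right answer, but it calls a Python function once per row, which is what gets slow.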
Code to build the dataframe is below:
import pandas as pd

df = pd.DataFrame({
    'state': ['Ohio', 'Ohio', 'Ohio', 'Nevada', 'Nevada', 'New York', 'New York', 'California', 'California', 'California'],
    'year': [2000, 2001, 2002, 2001, 2002, 2003, 2004, 2009, 2010, 2009],
    'pop': [1.5, 1.7, 3.6, 2.4, 2.9, 11.1, 23.4, 0.1, 0.3, 1.1],
    'cat': ['value1', 'value3', 'value2', 'value11', 'value5', 'value9', 'value13', 'value10', 'value7', 'value14'],
    'flag_1': [1, 1, 1, 0, 0, 0, 0, 1, 0, 0],
    'flag_2': [0, 1, 1, 1, 0, 0, 0, 1, 0, 1],
    'flag_3': [0, 0, 0, 0, 0, 1, 0, 0, 0, 1],
})
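For reference, the kind of vectorized result I am after might look something like the sketch below. It is only one possible approach, not a confirmed solution: it assumes every flag column is named flag_<number>, uses NumPy's argmax on a boolean array to find the first set flag per row, and runs on a small made-up frame rather than the full data.

```python
import numpy as np
import pandas as pd

# Small illustrative frame (not the full data set)
df = pd.DataFrame({
    "flag_1": [1, 1, 0, 0, 0],
    "flag_2": [0, 1, 1, 0, 0],
    "flag_3": [0, 0, 0, 1, 0],
})

# Collect every flag_<n> column dynamically, sorted by its numeric suffix.
# Columns such as "flag_common" are skipped because the suffix is not a number.
flag_cols = sorted(
    (c for c in df.columns
     if c.startswith("flag_") and c.split("_", 1)[1].isdigit()),
    key=lambda c: int(c.split("_", 1)[1]),
)
flag_numbers = np.array([int(c.split("_", 1)[1]) for c in flag_cols])

flags = df[flag_cols].to_numpy().astype(bool)

# argmax on booleans gives the position of the first True in each row;
# it also returns 0 for all-False rows, so those are masked separately.
first_set = flags.argmax(axis=1)
any_set = flags.any(axis=1)

df["flag_common"] = np.where(any_set, flag_numbers[first_set], 0)
print(df)
```

The sort by numeric suffix is there so that flag_10 would come after flag_2 rather than before it, which a plain string sort would get wrong.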
Thanks in advance for any thoughts and suggestions!