I have a large dataframe I took off an ODBC database. The Dataframe has multiple columns; I'm trying to change the values of one column by filtering two other. First, I filter my dataframe data_prem with both conditions which gives me the correct rows:
data_prem[(data_prem['PRODUCT_NAME']=='ŽZ08') & (data_prem['BENEFIT'].str.contains('19.08.16'))]
Then I use the replace function on the selection to change 'M' value to 'H' value:
data_prem[(data_prem['PRODUCT_NAME']=='ŽZ08') & (data_prem['BENEFIT'].str.contains('19.08.16'))]['Reinsurer'].replace(to_replace='M',value='H',inplace=True,regex=True)
Python warns me I'm trying to modify a copy of the dataframe, even though I'm clearly refering to the original dataframe (I'm posting image so you can see my results).
I also tried using .loc function in the following manner:
data_prem.loc[((data_prem['PRODUCT_NAME']=='ŽZ08') & (data_prem['BENEFIT'].str.contains('19.08.16'))),'Reinsurer'] = 'H'
which changed all rows that fit the second condition (str.contains...), but it didn't apply the first condition. I got replacements in the 'Reinsurer' column for other 'PRODUCT_NAME' values as well.
I've been scouring the web for an answer to this for some time. I've seen some mentions of a bug in the pandas library, not sure if this is what they were talking about.
I would value any opinions you might have, would also be interesting in alternative ways to solving this problem. I filled the 'Reinsurer' column with the map function with 'PRODUCT_NAME' as the input (had a dictionary that connected all 'PRODUCT_NAME' values with 'Reinsurer' values).
data_prem[(data_prem['PRODUCT_NAME']=='ŽZ08') & (data_prem['BENEFIT'].str.contains('19.08.16'))]['Reinsurer'].replace(to_replace='M',value='H',inplace=True,regex=True)is a clear example of chained indexing. The error is correct. Always useloc.loc..locfunction before, I wrote it differently than above:data_prem[(data_prem['PRODUCT_NAME']=='ŽZ08') & (data_prem['BENEFIT'].str.contains('19.08.16'))].loc[data_prem[(data_prem['PRODUCT_NAME']=='ŽZ08') & (data_prem['BENEFIT'].str.contains('19.08.16'))],'Reinsurer'] = 'H'Basically, I was using.locon the filtered dataframe. I overcomplicated the code before (though I remember I tried simpler ways as well and had problems with them before I tried the overly complicated one).