Python dataframe; trouble changing value of column with multiple filters

Question

I have a large dataframe that I pulled from an ODBC database. The dataframe has multiple columns; I'm trying to change the values of one column by filtering on two others. First, I filter my dataframe data_prem with both conditions, which gives me the correct rows:

data_prem[(data_prem['PRODUCT_NAME']=='ŽZ08') & (data_prem['BENEFIT'].str.contains('19.08.16'))]

Then I use the replace function on the selection to change 'M' value to 'H' value:

data_prem[(data_prem['PRODUCT_NAME']=='ŽZ08') & (data_prem['BENEFIT'].str.contains('19.08.16'))]['Reinsurer'].replace(to_replace='M',value='H',inplace=True,regex=True)

pandas warns me that I'm trying to modify a copy of the dataframe, even though I'm clearly referring to the original dataframe (I'm posting an image so you can see my results).

[Image: dataframe filtering]

I also tried using .loc function in the following manner:

data_prem.loc[((data_prem['PRODUCT_NAME']=='ŽZ08') & (data_prem['BENEFIT'].str.contains('19.08.16'))),'Reinsurer'] = 'H'

which changed all rows that fit the second condition (str.contains...), but it didn't apply the first condition. I got replacements in the 'Reinsurer' column for other 'PRODUCT_NAME' values as well.

I've been scouring the web for an answer to this for some time. I've seen some mentions of a bug in the pandas library, not sure if this is what they were talking about.

I would value any opinions you might have, and I would also be interested in alternative ways of solving this problem. I filled the 'Reinsurer' column using the map function with 'PRODUCT_NAME' as the input (I had a dictionary that connected all 'PRODUCT_NAME' values with 'Reinsurer' values).

No, I take the data off a server and create an Excel report.
– MartinV, Commented Dec 10, 2018 at 9:53
It looks very strange to me; I don't find any logical mistake in your code.
– Mohamed Thasin ah, Commented Dec 10, 2018 at 10:02
Your first example, data_prem[(data_prem['PRODUCT_NAME']=='ŽZ08') & (data_prem['BENEFIT'].str.contains('19.08.16'))]['Reinsurer'].replace(to_replace='M',value='H',inplace=True,regex=True), is a clear example of chained indexing. The warning is correct. Always use loc.
– jpp, Commented Dec 10, 2018 at 10:09
You should provide a minimal reproducible example with some data as text so we can reproduce your problem with loc.
– jpp, Commented Dec 10, 2018 at 10:10
OK, this is really strange. When I was trying to use the .loc function before, I wrote it differently than above: data_prem[(data_prem['PRODUCT_NAME']=='ŽZ08') & (data_prem['BENEFIT'].str.contains('19.08.16'))].loc[data_prem[(data_prem['PRODUCT_NAME']=='ŽZ08') & (data_prem['BENEFIT'].str.contains('19.08.16'))],'Reinsurer'] = 'H' Basically, I was using .loc on the filtered dataframe. I overcomplicated the code (though I remember I tried simpler ways as well and had problems with them before I tried the overly complicated one).
– MartinV, Commented Dec 10, 2018 at 11:56

jpp · Accepted Answer · 2018-12-10 12:28:44Z


Given your Boolean mask, you've demonstrated two ways of applying chained indexing. This is the cause of the warning and the reason why you aren't seeing your logic applied as you anticipate. Below, df stands for your data_prem dataframe:

mask = (df['PRODUCT_NAME']=='ŽZ08') & df['BENEFIT'].str.contains('19.08.16')
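A side note on the mask itself: str.contains interprets its pattern as a regular expression by default, so each '.' in '19.08.16' matches any character. The sketch below uses made-up BENEFIT values (not data from the question) to show the difference; passing regex=False is one way to match the date literally, assuming that is what is intended.

```python
import pandas as pd

# Hypothetical BENEFIT values, invented for illustration
benefit = pd.Series(["19.08.16", "19x08y16"])

# Default: the pattern is a regex, so '.' matches any single character
as_regex = benefit.str.contains("19.08.16")

# regex=False treats the pattern as a plain substring
as_literal = benefit.str.contains("19.08.16", regex=False)

print(as_regex.tolist())    # [True, True]
print(as_literal.tolist())  # [True, False]
```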

Chained indexing: Example #1

df[mask]['Reinsurer'].replace(to_replace='M', value='H', inplace=True, regex=True)

Chained indexing: Example #2

df[mask].loc[mask, 'Reinsurer'] = 'H'
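To see why the chained form has no effect, here is a small sketch on an invented three-row dataframe (the column names follow the question, the values are assumptions). df[mask] builds a new object, so the assignment lands on that temporary copy and df itself stays as it was.

```python
import warnings
import pandas as pd

# Hypothetical sample data; only the column names come from the question
df = pd.DataFrame({
    "PRODUCT_NAME": ["ŽZ08", "ŽZ08", "AB01"],
    "BENEFIT": ["x 19.08.16", "other", "y 19.08.16"],
    "Reinsurer": ["M", "M", "M"],
})
mask = (df["PRODUCT_NAME"] == "ŽZ08") & df["BENEFIT"].str.contains("19.08.16", regex=False)

with warnings.catch_warnings():
    # Silence the chained-assignment warning for this demonstration only
    warnings.simplefilter("ignore")
    # Chained indexing: df[mask] is a temporary copy, so this write is lost
    df[mask].loc[mask, "Reinsurer"] = "H"

print(df["Reinsurer"].tolist())  # ['M', 'M', 'M'], the original is unchanged
```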

Avoid chained indexing

You can keep things simple by applying your mask once and using a single loc call:

df.loc[mask, 'Reinsurer'] = 'H'
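As a quick check that the single loc call respects both conditions, the following sketch runs it on an invented three-row dataframe (column names follow the question, the values are assumptions). Only the row that satisfies both the product filter and the benefit filter should change.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question
df = pd.DataFrame({
    "PRODUCT_NAME": ["ŽZ08", "ŽZ08", "AB01"],
    "BENEFIT": ["x 19.08.16", "other", "y 19.08.16"],
    "Reinsurer": ["M", "M", "M"],
})

# Both conditions combined into one Boolean mask
mask = (df["PRODUCT_NAME"] == "ŽZ08") & df["BENEFIT"].str.contains("19.08.16", regex=False)

# One indexing operation on the original dataframe: rows by mask, column by label
df.loc[mask, "Reinsurer"] = "H"

print(df["Reinsurer"].tolist())  # ['H', 'M', 'M']
```

If only existing 'M' values should be overwritten, as in the original replace call, the extra condition can be folded into the mask, for example mask & (df['Reinsurer'] == 'M').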
