
There's a dataframe and I need to replace values above 512 with 263.

So, I used this code line to filter my indexes first:

df.loc[df['Fare']>512]['Fare'].astype(int)

Here is the result of this:

258     512
679     512
737     512
1234    512
Name: Fare, dtype: int64

This looks good, as it filtered all 4 rows with a value above 512. Now I need to replace this value with 263:

df.loc[df['Fare']>512]['Fare']=df.loc[df['Fare']>512]['Fare'].astype(int).replace({512:263},inplace=True)

But it doesn't change anything in my dataframe. For instance, when I look up index 737, I find this:

df.iloc[737]

Result:

Age                                35
Fare                          512.329

So despite the code above, the Fare hasn't been changed to 263.

4 Answers

3

Is there any reason why you aren't just doing

condition = df['Fare'].astype(int) > 512
df.loc[condition, 'Fare'] = 263

The condition is a boolean Series, and .loc assigns the new value only to the rows where that Series is True.
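For illustration, here is a minimal sketch of the same mask-based assignment on made-up data (only the 'Fare' column name comes from the question; the values and the 'Age' column contents are assumptions). It compares the float column directly instead of casting first, because astype(int) truncates 512.329 to 512, which is not greater than 512, and because the cast raises the ValueError quoted in the comments when the column contains NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the 'Fare' column name is from the question.
df = pd.DataFrame({"Age": [35, 22, 40], "Fare": [512.329, 7.25, np.nan]})

# Compare the float column directly. NaN > 512 evaluates to False,
# so rows with a missing fare are simply left untouched.
condition = df["Fare"] > 512

# Single .loc call with [row_indexer, col_indexer]: writes into df itself.
df.loc[condition, "Fare"] = 263

print(df)
```

Only the row whose fare exceeds 512 is changed; the NaN row stays NaN.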


3 Comments

Thanks, but it results in an error: ValueError: Cannot convert non-finite values (NA or inf) to integer. Also, I think using loc is more understandable than this code, by the way.
My code is essentially the same as yours except it doesn't use replace. I would argue that yours is unnecessarily lengthy, but clearly you have other opinions.
Right, now I can see that using loc is redundant
3

The code snippet below would be even simpler:

df.loc[df['Fare']>512, 'Fare'] = 263

This only replaces values in the Fare column. To replace values in several columns, pass a list of column names as the second item inside the square brackets.

The general pandas syntax is:

df.loc[row_indexer,col_indexer] = value
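As a rough sketch of that syntax with more than one column, assuming a made-up frame in which 'Fare_capped' is a hypothetical second column (only 'Fare' appears in the question):

```python
import pandas as pd

# Hypothetical data; 'Fare_capped' is invented purely for this example.
df = pd.DataFrame({
    "Age": [35, 22],
    "Fare": [512.329, 7.25],
    "Fare_capped": [512.329, 7.25],
})

rows = df["Fare"] > 512          # row_indexer: a boolean mask
cols = ["Fare", "Fare_capped"]   # col_indexer: a list of column labels

# The scalar is broadcast to every selected (row, column) cell.
df.loc[rows, cols] = 263

print(df)
```

Columns that are not listed in the column indexer, such as 'Age' here, are not modified.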


2

Remove the inplace = True option.

df.loc[df['Fare']>512]['Fare']=df.loc[df['Fare']>512]['Fare'].astype(int).replace({512:263})

or simply do not assign.

df.loc[df['Fare']>512]['Fare'].astype(int).replace({512:263}, inplace=True)

From the replace docs:

inplace : bool, default False
If True, in place. Note: this will modify any other views on this object (e.g. a column from a DataFrame). Returns the caller if this is True.

As written, you are modifying the dataframe in place, but the right-hand side also returns the caller, so the assignment operator = overwrites your edit with the original values.

EDIT

Actually, in my version (pandas 0.24.0), replace with inplace = True does not return anything, so the statement above about returning the caller may be version dependent (the docs refer to pandas 0.24.2).

As a side note, filtering the data with .loc and then using replace is redundant: .replace({512:263}) only converts values equal to 512, so there is no need to select those values beforehand with .loc.
If you do:

df['Fare'].astype(int).replace({512:263}, inplace=True)

you get the same result.
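One caveat worth checking on your own data: astype(int) builds a new Series, so an in-place replace on it changes only that temporary object and not the dataframe. A small sketch on made-up values (the name tmp is hypothetical and only there to make the temporary visible):

```python
import pandas as pd

df = pd.DataFrame({"Fare": [512.329, 7.25]})  # hypothetical sample data

# astype(int) returns a new integer Series (512.329 is truncated to 512).
tmp = df["Fare"].astype(int)

# The in-place replace modifies tmp only; df never sees the change.
tmp.replace({512: 263}, inplace=True)

print(tmp.tolist())         # the temporary now holds the replaced value
print(df["Fare"].tolist())  # the original column still holds the float fares
```

In the one-line form the temporary is discarded as soon as the statement finishes, which is consistent with the dataframe staying unchanged.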

2 Comments

Thanks Valentino, I didn't see your edit at first. Yes, you're right, .loc is pretty redundant here. But when I use your code, I get a ValueError: Cannot convert non-finite values (NA or inf) to integer
That is probably because you have NaN values in your dataframe (they are considered floats) and you use .astype(int). If you remove .astype(int), the error should disappear.
2

When using .loc, you want to use [row, col] and not [row][col].

Try:

df.loc[df['Fare']>512, 'Fare']=df.loc[df['Fare']>512, 'Fare'].astype(int).replace({512:263},inplace=True)

1 Comment

Thanks Adam, I didn't know that .loc needed that kind of format. However, when I use the new format, it returns NaN for those indexes.
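The NaN reported here would be consistent with replace(..., inplace=True) returning nothing (as another answer observed on pandas 0.24.0), so that None is what ends up assigned to the selected cells. Below is a hedged sketch of the same [row, col] line with inplace=True dropped, on made-up data that contains no NaN (with NaN present, astype(int) would raise the ValueError mentioned in the other comments):

```python
import pandas as pd

df = pd.DataFrame({"Fare": [512.329, 7.25]})  # hypothetical sample data
mask = df["Fare"] > 512

# Without inplace=True, replace() returns a new Series holding the replaced
# values; .loc[row, col] then writes them back into df, aligned on the index.
df.loc[mask, "Fare"] = df.loc[mask, "Fare"].astype(int).replace({512: 263})

print(df)
```

This still goes through truncation and replace; assigning 263 directly with df.loc[mask, "Fare"] = 263 reaches the same result in one step.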
