I have a large Pandas dataframe, and want to replace some values in a subset of the columns based on a condition.

Specifically, I want to replace the values that are greater than one with 1 in every column to the right of the 9th column.

Because the dataframe is so large and growing in both the number of rows and columns over time, I cannot manually specify the names of the columns to change values in. Rather, I just need to specify that column 10 and greater should be inspected for values > 1.

After looking at many different Stack Overflow posts and Pandas documentation, I tried:

df.iloc[df[:,10: ] > 1] = 1

However, this gives me the error “unhashable type: ‘slice’”.

I then tried:

df[df.iloc[:, 10:] > 1] = 1

and

df[df.loc[:, df.columns[10:]] > 1] = 1

as per two suggestions in the comments, but both give me the error “Cannot do inplace boolean setting on mixed-types with a non np.nan value”.

Does anyone know why I’m getting these errors and/or what I should change about my code to avoid them?

Thank you!

Comments
  • Check df[df.iloc[:, 10:] > 1] = 1 Commented Apr 20, 2021 at 14:53
  • Now I get the error “ ‘int’ object has no attribute ‘iloc’”. @ShubhamSharma Commented Apr 20, 2021 at 15:24
  • What about df[df.loc[:, df.columns[10:]] > 1] = 1? Commented Apr 20, 2021 at 15:38
  • Please check the type of df, i.e. type(df); it should be pandas.core.frame.DataFrame. I guess you have overwritten the variable df with an integer value. Commented Apr 20, 2021 at 15:38
  • Ah, yes, I accidentally overwrote it. When I fixed that and ran your code from your first comment, I now get the error "Cannot do inplace boolean setting on mixed-types with a non np.nan value." This is because the first 9 of my columns are a mix of strings and ints, something which I cannot change about the dataframe. @ShubhamSharma Do you have any tips here? I don't want to mess with stacking to fix this. Commented Apr 20, 2021 at 15:57

2 Answers

1. DataFrame.where

We can use iloc to select the columns from position 10 onward, then use where on that slice: values for which the condition x.le(1) (less than or equal to 1) is True are kept, and values for which it is False are replaced with 1.

df.iloc[:, 10:] = df.iloc[:, 10:].where(lambda x: x.le(1), 1)
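To make the where approach concrete, here is a minimal sketch on a small made-up frame. The column names and the `start` variable are assumptions for illustration only; in the question's frame `start` would be 10. Note that iloc positions are zero-based, so `10:` begins at the eleventh column, and `9:` would be needed if the tenth column should be included. Because only the numeric slice is touched, the mixed-type leading columns never take part in the comparison, which is what appears to trigger the "mixed-types" error when the mask is applied to the whole frame.

```python
import pandas as pd

# Hypothetical frame: two mixed-type leading columns, then numeric columns.
df = pd.DataFrame({
    "name": ["a", "b", "c"],
    "code": [7, "x", 3],
    "n1": [0, 2, 1],
    "n2": [5, 0, 1],
})

start = 2  # assumed position of the first numeric column (10 in the question)

numeric = df.iloc[:, start:]
# where() keeps values for which the condition is True (<= 1)
# and substitutes 1 everywhere else.
df.iloc[:, start:] = numeric.where(numeric.le(1), 1)

print(df)
```

The lambda in the original line does the same thing; passing the precomputed boolean frame `numeric.le(1)` is just a more explicit spelling.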

2. DataFrame.clip

Alternatively, we can use clip with an upper limit of 1, which sets every value greater than 1 in the selected slice to 1 and leaves all other values unchanged.

df.iloc[:, 10:] = df.iloc[:, 10:].clip(upper=1)
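A small sketch of the clip approach, again on an invented frame (the column names and `start` are assumptions, not taken from the question). This variant selects the trailing columns by label through `df.columns[start:]`, which may be convenient when the frame keeps gaining columns; it should behave the same as the iloc form above for this purpose.

```python
import pandas as pd

# Hypothetical frame: one string column followed by numeric columns.
df = pd.DataFrame({
    "label": ["a", "b", "c"],
    "n1": [0.5, 3.0, 1.0],
    "n2": [2, 0, -4],
})

start = 1  # assumed position of the first numeric column (10 in the question)

# Labels of every column from position `start` to the end.
cols = df.columns[start:]

# clip(upper=1) caps values above 1 at 1; values at or below 1,
# including negatives, are left as they are.
df[cols] = df[cols].clip(upper=1)

print(df)
```

Since only an upper bound is given, nothing is clipped from below; a `lower=` argument would be needed for that.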

2 Comments

Can you explain what each of these options is doing? I'm not familiar with clip or lambda.
@jmrpink Give me a minute, I'm already adding explanations.
0

I came here looking for how to slice a single column by row position, i.e. something like df.loc[10:, data_name]. If the index is not a RangeIndex, the positional slice 10: has to be replaced with the matching labels taken from df.index. Here is my solution (after some trial and error), based on this answer:

idx = df.loc[:, data_name].index[10:]  # get index
df.loc[idx, data_name] = 1  # replace
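The two lines above can be tried on a toy frame with a non-range index. This is only a sketch: the frame, the index labels, and the cutoff of 2 are made up, and it assumes the index labels are unique (with duplicate labels, loc would match every row carrying one of the selected labels).

```python
import pandas as pd

# Hypothetical single-column frame with string labels instead of a RangeIndex.
df = pd.DataFrame(
    {"value": [5, 6, 7, 8]},
    index=["w", "x", "y", "z"],
)
data_name = "value"

# Translate "rows from position 2 onward" into index labels.
idx = df.index[2:]

# Assign by label: only the rows named in idx are changed.
df.loc[idx, data_name] = 1

print(df)
```

A purely positional alternative would be `df.iloc[2:, df.columns.get_loc(data_name)] = 1`, which avoids the label lookup altogether.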
