I have a DataFrame that I need to modify based on the values in one of its columns. Specifically, when the value in column a is 110 or above, I want column b to be assigned the value -99. The only issue is that the first 3 rows of the DataFrame contain a mix of string and numeric data, so when I try:

df.loc[df['a'] >= 110, 'b'] = -99

I get a TypeError because comparison between str and int is not allowed.

So my question is: how do I do this assignment while ignoring the first 3 rows of the dataframe?

So far I have come up with this rather dodgy workaround:

try:
    df.loc[df['a'] >= 110, 'b'] = -99
except TypeError:
    pass

This seems to work, but it clearly isn't the proper way to do it.

EDIT: Also, this method just skips the first 3 rows, but I really need to keep them as they are.

1 Answer

Try:

df.loc[df['a'].apply(pd.to_numeric, errors='coerce').ge(110), 'b'] = -99

or use errors='ignore'
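A minimal sketch of the coerce approach, assuming a small made-up DataFrame (the sample values below are hypothetical, not from the question). It calls pd.to_numeric on the whole column rather than through apply, which should give the same mask. The conversion result is only used to build the boolean mask, so column a itself is not overwritten and the string rows stay as they are:

```python
import pandas as pd

# Hypothetical sample: the first 3 rows hold strings, the rest hold numbers.
df = pd.DataFrame({
    "a": ["id", "units", "n/a", 100, 110, 150],
    "b": ["name", "kg", "n/a", 1, 2, 3],
})

# Convert a temporary copy of column a; non-numeric entries become NaN.
a_numeric = pd.to_numeric(df["a"], errors="coerce")

# NaN compares as False, so the string rows are never selected.
mask = a_numeric.ge(110)

# Only rows where the mask is True are modified; df["a"] is left untouched.
df.loc[mask, "b"] = -99

print(df)
```

Because the comparison runs on a_numeric and not on df['a'], no TypeError is raised and no NaN is written back into the DataFrame.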

3 Comments

I want to keep the data in the top 3 rows as is. Wouldn't apply(pd.to_numeric,errors='coerse') replace the data with a bunch of NaN?
Make a copy of the first 3 rows and re-append them after that operation.
When I try to do it the way you suggested, I get a weird error: ValueError: invalid error value specified. Although it works if I do errors='ignore'
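If the non-numeric rows are always exactly the first 3, another option is to restrict the comparison to the remaining rows by position, so nothing needs to be copied or re-appended. This is only a sketch under that assumption, and the sample data is hypothetical:

```python
import pandas as pd

# Hypothetical sample: the first 3 rows hold strings, the rest hold numbers.
df = pd.DataFrame({
    "a": ["x", "y", "z", 100, 120, 110],
    "b": ["p", "q", "r", 1, 2, 3],
})

# Look only at the rows after the first 3 (selected by position).
tail_a = pd.to_numeric(df["a"].iloc[3:])

# Index labels of the rows in that slice where a >= 110.
hits = tail_a[tail_a >= 110].index

# Assign through the original DataFrame using those labels.
df.loc[hits, "b"] = -99

print(df)
```

Here pd.to_numeric is called without errors='coerce', so it would raise if a non-numeric value appeared after row 3, which may be preferable to silently skipping it.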
