This should be an easy task, but I am stuck. I have a dataframe with a column of type string, so its values contain letters as well as digits:

Category
AB00
CD01
EF02
GH03
RF04

Now I want to treat the trailing digits of these values as numbers, filter on them, and create a subset dataframe. However, I do not want to change the original dataframe in any way. I tried:

df_subset=df[df['Category'].str[2:4]<=3]

Of course this does not work: the sliced values are strings, so they cannot be evaluated as numbers and compared to 69.

I tried

df_subset=df[int(df['Category'].str[2:4])<=3]

but I am not sure about this. I think it is wrong, or at least not the way it should be done.

  • df['Category'].str[2:4]<='69'? Are you comparing to 69 or to 3? Commented Jan 11, 2023 at 15:47
  • maybe your problem is solved here: stackoverflow.com/questions/11350770/… Commented Jan 11, 2023 at 15:47

2 Answers

Add type conversion to your expression:

df[df['Category'].str[2:].astype(int) <= 3]

  Category
0     AB00
1     CD01
2     EF02
3     GH03
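
A minimal sketch of the same idea on the sample data from the question, to show that the conversion happens on a temporary Series and leaves `df` untouched. The `pd.to_numeric(..., errors="coerce")` variant is an assumption beyond the question: it would only matter if some suffixes were not valid digits.

```python
import pandas as pd

df = pd.DataFrame({"Category": ["AB00", "CD01", "EF02", "GH03", "RF04"]})

# Temporary integer Series built from characters 2-3; df itself is not modified.
suffix = df["Category"].str[2:4].astype(int)
df_subset = df[suffix <= 3]

# Assumed variant: coerce malformed suffixes to NaN instead of raising.
# NaN never satisfies the comparison, so such rows are simply dropped.
suffix_safe = pd.to_numeric(df["Category"].str[2:4], errors="coerce")
df_subset_safe = df[suffix_safe <= 3]

print(df_subset)
```

Since the mask is computed separately, no helper column is ever added to the original dataframe.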

Because the numeric part is zero-padded to a fixed width, you can compare the strings directly:

df_subset = df.loc[df['Category'].str[2:4] <= '03']

Output:

  Category
0     AB00
1     CD01
2     EF02
3     GH03
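
A short sketch of why this works here and where it would stop working. The `unpadded` Series is hypothetical data, not from the question; it only illustrates that string comparison is lexicographic and matches numeric order only when the digits are zero-padded to equal width.

```python
import pandas as pd

df = pd.DataFrame({"Category": ["AB00", "CD01", "EF02", "GH03", "RF04"]})

# Fixed-width, zero-padded suffixes: lexicographic order equals numeric order.
by_string = df.loc[df["Category"].str[2:4] <= "03"]
by_number = df.loc[df["Category"].str[2:4].astype(int) <= 3]

# Hypothetical unpadded values, to show where string comparison breaks.
unpadded = pd.Series(["2", "10", "3"])
string_mask = unpadded <= "3"             # "10" sorts before "3" as text
numeric_mask = unpadded.astype(int) <= 3  # 10 is correctly excluded

print(by_string.equals(by_number))
print(string_mask.tolist(), numeric_mask.tolist())
```

If the width of the numeric part can vary, the integer conversion is the safer choice.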