This question is very similar to "Python: Pandas filter string data based on its string length", but I want to use pandas.DataFrame.query. Say we have a pandas.DataFrame. I would like to keep only the rows where the string length of column A is not equal to 3, using pandas.DataFrame.query:

import pandas as pd
import numpy as np
df = pd.DataFrame({'A' : ['hi', 'hello', 'day', np.nan], 'B' : [1, 2, 3, 4]})  
df.query('A.str.len() != 3')

However, I get the following error:

TypeError: unhashable type: 'numpy.ndarray'
2 Answers

Replacing 3 with "3" works. I'm using pandas 0.23.1.

df.query('A.str.len() != "3"')

Output:

       A  B
0     hi  1
1  hello  2
3    NaN  4

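Alternatively, to drop the np.nan row as well, cast the column to str first. NaN then becomes the 3-character string "nan" and is filtered out like any other string of length 3: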

df.query('A.astype("str").str.len() != "3"')

Output:

       A  B
0     hi  1
1  hello  2
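
Why the cast removes the NaN row can be checked outside of query. The snippet below is only a sketch: the names nan_text, col, as_text and lengths are illustrative, and the element-wise map(str) stands in for astype("str") on the assumption that the two agree on an object column in the pandas versions discussed here (newer releases may keep NaN as a missing value when casting).

```python
import numpy as np
import pandas as pd

# str() renders a float NaN as the three-character text 'nan'
nan_text = str(np.nan)
nan_length = len(nan_text)

# Object column like column A in the question (dtype forced for illustration)
col = pd.Series(['hi', 'hello', 'day', np.nan], dtype=object)

# Apply str() to every element, including the missing one
as_text = col.map(str)

# Once NaN is ordinary text, it has a length like any other string
lengths = as_text.str.len()

print(as_text.tolist())
print(lengths.tolist())
```

Because the missing value now has length 3, the != "3" comparison treats it exactly like 'day'.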

Hope this helps.

As of Pandas 1.4.2, OP's original code works.

Filter out rows where A values have length equal to 3:

df.query('A.str.len() != 3')

Filter out NaN values in addition to strings of length 3 (leverage the fact that NaN != NaN):

df.query('A.str.len() != 3 and A == A')
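
The same logic can be written as plain boolean masks, which may help when checking what each half of the query expression contributes. This is a sketch rather than a replacement for query; the variable names not_missing, length_ok and result are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': ['hi', 'hello', 'day', np.nan], 'B': [1, 2, 3, 4]})

# NaN never equals itself, so this mask is False only on the missing row
not_missing = df['A'] == df['A']

# The length test on its own does not drop the missing row,
# which is why the second condition is needed
length_ok = df['A'].str.len() != 3

# Combine both conditions, mirroring "A.str.len() != 3 and A == A"
result = df[length_ok & not_missing]
print(result)
```

df['A'].notna() expresses the same missing-value check more explicitly than the self-comparison, if readability matters more than brevity.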

With Python 3.9.7 and pandas 1.3.4, gyoza's answer does not filter anything (it returns the entire DataFrame). However, converting the result of str.len() to dtype str works:

df.query('A.str.len().astype("str") != "3"')

or, if the column contains NaN values that also need to be filtered out:

df.query('A.astype("str").str.len().astype("str") != "3"')
