import pandas as pd
import numpy as np
data = 'filename.csv'
df = pd.read_csv(data)  # pd.DataFrame() cannot take a file path; read_csv loads the file
df

        one       two     three   four   five
a  0.469112 -0.282863 -1.509059    bar   True
b  0.932424  1.224234  7.823421    bar  False
c -1.135632  1.212112 -0.173215    bar  False
d  0.232424  2.342112  0.982342  unbar   True
e  0.119209 -1.044236 -0.861849    bar   True
f -2.104569 -0.494929  1.071804    bar  False

I would like to select a range for a certain column, let's say column two. I would like to select all values between -0.5 and +0.5. How does one do this?

I expected to use

-0.5 < df["two"] < 0.5

But this (naturally) gives a ValueError:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

I tried

-0.5 < (df["two"] < 0.5)

But this outputs all True.

The correct output should be

0    True
1    False
2    False
3    False
4    False
5    True

What is the correct way to find a range of values in a pandas dataframe column?

EDIT: Question

Using .between() with

df['two'].between(-0.5, 0.5, inclusive=False)

what would be the difference between

 -0.5 < df['two'] < 0.5

and inequalities like

 -0.5 <= df['two'] < 0.5

?

  • There is a better alternative: df.query('-0.5 <= two < 0.5') Commented Aug 11, 2016 at 7:13
  • @MaxU Thanks for this! I hadn't thought of this. This is very clean Commented Aug 11, 2016 at 7:28
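
The query suggestion in the comment above can be sketched as follows. The DataFrame is rebuilt inline from the "two" column in the question so the example is self-contained; the name subset is only illustrative.

```python
import pandas as pd

# Column "two" from the question, rebuilt inline (assumption: same values and index).
df = pd.DataFrame(
    {"two": [-0.282863, 1.224234, 1.212112, 2.342112, -1.044236, -0.494929]},
    index=list("abcdef"),
)

# query() accepts a chained comparison inside the expression string
# and returns the matching rows rather than a boolean mask.
subset = df.query("-0.5 <= two < 0.5")
print(subset)
```

Note that query returns the filtered rows, whereas between and the explicit comparisons return a boolean Series.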

3 Answers

95

Use between with inclusive=False for strict inequalities:

df['two'].between(-0.5, 0.5, inclusive=False)

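The inclusive parameter determines whether the endpoints are included (True: <=, False: <), and it applies to both bounds. In pandas 1.3 and later it takes the strings "both", "neither", "left", or "right" instead of a boolean. If you want mixed inequalities, you can also code them explicitly: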

(df['two'] >= -0.5) & (df['two'] < 0.5)
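
As a rough illustration of how the endpoint handling differs, here is a small sketch. It assumes pandas 1.3 or later, where inclusive is given as a string; the edge Series is a made-up example whose values sit exactly on the bounds.

```python
import pandas as pd

# Hypothetical values placed exactly on the bounds, to make the difference visible.
edge = pd.Series([-0.5, 0.0, 0.5])

strict = edge.between(-0.5, 0.5, inclusive="neither")  # -0.5 <  x <  0.5
half_open = edge.between(-0.5, 0.5, inclusive="left")  # -0.5 <= x <  0.5
closed = edge.between(-0.5, 0.5, inclusive="both")     # -0.5 <= x <= 0.5

# The explicit form of the half-open interval gives the same mask.
explicit = (edge >= -0.5) & (edge < 0.5)

print(strict.tolist())     # [False, True, False]
print(half_open.tolist())  # [True, True, False]
print(closed.tolist())     # [True, True, True]
```

The results only differ for values that fall exactly on -0.5 or 0.5; for the data in the question, all of these variants produce the same mask.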

4 Comments

What do you mean by using inclusive=False for strict inequalities? I'm not sure I understand the difference between inclusive=True and inclusive=False?
Using between(-0.5, 0.5), what would be the difference between -0.5 < value < 0.5 and -0.5 <= value < 0.5?
NB: The parentheses in the second expression are important.
Will it work for dates as well? df['date'].between('2010-03-01', '2010-05-01', inclusive=False). I found the solution: stackoverflow.com/a/29370182/8927035
18

.between is a good solution, but if you want finer control use this:

(-0.5 <= df['two']) & (df['two'] < 0.5)

The operator & is different from and. The other operators are | for or, ~ for not. See this discussion for more info.

Your statement was the same as this:

(-0.5 <= df['two']) and (df['two'] < 0.5)

Hence it raised the error.
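
To make the difference concrete, here is a minimal sketch of the three attempts, using the "two" column from the question rebuilt inline. The variable names raised, all_true, and mask are only illustrative.

```python
import pandas as pd

two = pd.Series([-0.282863, 1.224234, 1.212112, 2.342112, -1.044236, -0.494929])

# A chained comparison is evaluated as (-0.5 < two) and (two < 0.5).
# `and` needs a single truth value for the left-hand Series, so pandas raises.
try:
    -0.5 < two < 0.5
    raised = False
except ValueError:
    raised = True

# Parenthesising the right-hand side compares -0.5 against booleans
# (False counts as 0, True as 1), so every element comes out True.
all_true = -0.5 < (two < 0.5)

# `&` combines the two boolean Series element by element.
mask = (-0.5 < two) & (two < 0.5)

print(raised)
print(all_true.tolist())
print(mask.tolist())
```

The parentheses around each comparison matter because & binds more tightly than < and >=.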

1 Comment

Thanks for the explanation as to why the ValueError was raised!
1

Here's how you would get the values within the range without using between().

df2 = pd.read_clipboard()
df2["two"][(df2["two"] >= -.5) & (df2["two"] <= .5)]
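
Since read_clipboard depends on whatever is currently on the clipboard, here is a self-contained sketch of the same idea with part of the frame built inline. Using .loc instead of chained indexing is my own substitution, not something the answer above prescribes.

```python
import pandas as pd

# Two of the columns from the question, rebuilt inline (assumption: same values).
df2 = pd.DataFrame(
    {
        "one": [0.469112, 0.932424, -1.135632, 0.232424, 0.119209, -2.104569],
        "two": [-0.282863, 1.224234, 1.212112, 2.342112, -1.044236, -0.494929],
    },
    index=list("abcdef"),
)

# Boolean mask for -0.5 <= two <= 0.5.
mask = (df2["two"] >= -0.5) & (df2["two"] <= 0.5)

values = df2.loc[mask, "two"]  # only the matching values of column "two"
rows = df2[mask]               # the complete matching rows

print(values)
print(rows)
```

Selecting with .loc in a single step avoids the chained indexing df2["two"][mask], which matters mostly when the selection is later assigned to.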
