I have a dataframe where one of the column names is a variable:

import pandas as pd

xx = pd.DataFrame([{'ID': 1, 'Name': 'Abe', 'HasCar': 1},
                   {'ID': 2, 'Name': 'Ben', 'HasCar': 0},
                   {'ID': 3, 'Name': 'Cat', 'HasCar': 1}])

    ID  Name    HasCar
0   1   Abe     1
1   2   Ben     0
2   3   Cat     1

In this dummy example, column 2 could be "HasCar", or "IsStaff", or some other unknowable value. I want to select all rows where column 2 is True, whatever the column name is.

I've tried the following without success:

xx.iloc[:,[2]] ==  1

    HasCar
0   True
1   False
2   True

and then trying to use that as an index results in:

xx[xx.iloc[:,[2]] ==  1]

    ID  Name    HasCar
0   NaN NaN 1.0
1   NaN NaN NaN
2   NaN NaN 1.0

Which isn't helpful. I suppose I could rename column 2, but that feels a little wrong. The issue seems to be that xx.iloc[:,[2]] returns a DataFrame while xx['HasCar'] returns a Series. I can't figure out how to force an (x,1)-shaped DataFrame into a Series without knowing the column name.

Any ideas?

1 Answer

You were almost there, but you sliced in 2D. Passing the list [2] to iloc returns a single-column DataFrame; pass the scalar 2 instead to get a Series:

xx[xx.iloc[:, 2] ==  1]

Output:

   ID Name  HasCar
0   1  Abe       1
2   3  Cat       1
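For reference, here is a minimal self-contained sketch of the fix applied to the question's data. The helper name `rows_where_flag_set` and its `pos` parameter are made up for illustration and are not part of pandas.

```python
import pandas as pd

xx = pd.DataFrame([{'ID': 1, 'Name': 'Abe', 'HasCar': 1},
                   {'ID': 2, 'Name': 'Ben', 'HasCar': 0},
                   {'ID': 3, 'Name': 'Cat', 'HasCar': 1}])


def rows_where_flag_set(df, pos=2):
    """Return the rows where the column at integer position `pos` equals 1.

    Hypothetical helper: the column is addressed by position only,
    so its name never needs to be known.
    """
    mask = df.iloc[:, pos] == 1  # scalar position -> boolean Series
    return df[mask]              # boolean Series -> row filter


selected = rows_where_flag_set(xx)
print(selected)
```

Because the mask is a Series aligned on the row index, the result keeps the original index labels of the matching rows.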

The difference:

# 2D slicing, this gives a DataFrame (with a single column)
xx.iloc[:,[2]]

   HasCar
0       1
1       0
2       1

# 1D slicing, as Series
xx.iloc[:,2]

0    1
1    0
2    1
Name: HasCar, dtype: int64
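The sketch below tries to show why the two forms behave differently as masks, and two ways to turn the single-column DataFrame into a Series without hard-coding the column name. The variable names are illustrative only.

```python
import pandas as pd

xx = pd.DataFrame([{'ID': 1, 'Name': 'Abe', 'HasCar': 1},
                   {'ID': 2, 'Name': 'Ben', 'HasCar': 0},
                   {'ID': 3, 'Name': 'Cat', 'HasCar': 1}])

as_frame = xx.iloc[:, [2]]   # list of positions -> DataFrame, shape (3, 1)
as_series = xx.iloc[:, 2]    # scalar position   -> Series, shape (3,)

# A boolean DataFrame mask is applied cell by cell: every cell that is
# not matched by a True in the mask becomes NaN, and no row is dropped.
masked = xx[as_frame == 1]

# A boolean Series mask selects whole rows.
filtered = xx[as_series == 1]

# Two ways to get a Series without knowing the column name:
via_squeeze = as_frame.squeeze("columns")  # collapse the single column
via_name = xx[xx.columns[2]]               # look the name up by position

print(type(as_frame).__name__, as_frame.shape)
print(type(as_series).__name__, as_series.shape)
```

Of these, xx.iloc[:, 2] is the most direct; squeeze is mainly useful when the single-column DataFrame already comes from somewhere else.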

1 Comment

It's always the small stuff that trips me up...
