How do I iteratively select rows in pandas based on column values?

Question

I'm a complete newbie at pandas so a simpler (though maybe not the most efficient or elegant) solution is appreciated. I don't mind a bit of brute force if I can understand the answer better.

If I have the following DataFrame:

A    B    C 
0    0    1
0    1    1

I want to loop through columns "A", "B" and "C" in that order. During each iteration, I want to select all the rows for which the current column is 1 and none of the previous columns are, save the result, and use it in the next iteration.

So when looking at column A, I wouldn't select anything. Then when looking at column B I would select the second row because B==1 and A==0. Then when looking at column C I would select the first row because A==0 and B==0.

On the first iteration (column A) there's no output as nothing matches the criteria. On the second iteration (column B), the expected output is the second row: 0 1 1. On the third iteration (column C), the expected output is the first row: 0 0 1.
– user3014653, Commented Jun 9, 2022 at 23:34

Corralien · Accepted Answer · 2022-06-10 05:00:19Z


Create a boolean mask:

m = (df == 1) & (df.cumsum(axis=1) == 1)
d = {col: df[m[col]].index.tolist() for col in df.columns if m[col].sum()}

Output:

>>> m
       A      B      C
0  False  False   True
1  False   True  False
2  False  False   True

>>> d
{'B': [1], 'C': [0, 2]}
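To reproduce the output above end to end, here is a minimal self-contained sketch. The three-row frame is an assumption, reconstructed from the mask and row printouts in this answer; the names m and d follow the answer. Note that the cumsum trick relies on the data containing only 0 and 1.

```python
import pandas as pd

# Assumed sample: the question's two rows plus one extra row (0, 0, 1).
df = pd.DataFrame({"A": [0, 0, 0], "B": [0, 1, 0], "C": [1, 1, 1]})

# A cell is True when it holds a 1 and the running row-wise sum is still 1,
# i.e. it is the first 1 in its row (valid for 0/1 data only).
m = (df == 1) & (df.cumsum(axis=1) == 1)

# Map each column that selects at least one row to the matching index labels.
d = {col: df[m[col]].index.tolist() for col in df.columns if m[col].sum()}

print(d)  # {'B': [1], 'C': [0, 2]}
```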

I slightly modified your dataframe (one extra row):

>>> df
   A  B  C
0  0  0  1
1  0  1  1
2  0  0  1

Update

For the expected output on my sample:

for col in df.columns:
    if m[col].sum():
        print(f"\n=== {col} ===")
        print(df[m[col]])

Output:

=== B ===
   A  B  C
1  0  1  1

=== C ===
   A  B  C
0  0  0  1
2  0  0  1
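Since the question asked for something simple and a bit brute force, here is a sketch of the same selection written as an explicit loop that carries state from one iteration to the next. It is not part of the original answer; the names seen and selected are hypothetical, and the three-row sample frame is the one assumed from the outputs above.

```python
import pandas as pd

# Assumed sample frame (same rows as in the printed output above).
df = pd.DataFrame({"A": [0, 0, 0], "B": [0, 1, 0], "C": [1, 1, 1]})

# Tracks, per row, whether any earlier column already held a 1.
seen = pd.Series(False, index=df.index)
selected = {}  # hypothetical container: column name -> selected rows

for col in ["A", "B", "C"]:
    is_one = df[col] == 1
    # Keep rows where the current column is 1 and no previous column was.
    pick = is_one & ~seen
    if pick.any():
        selected[col] = df[pick]
    # Carry the state into the next iteration.
    seen = seen | is_one

for col, rows in selected.items():
    print(f"\n=== {col} ===")
    print(rows)
```

Unlike the cumsum mask, this version only compares against 1, so it does not depend on the other values being exactly 0.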

edited Jun 10, 2022 at 5:00

answered Jun 9, 2022 at 21:43

Corralien


4 Comments

rafaelc Over a year ago

Oh, I see what you mean. Good catch :) +1

Onyambu Over a year ago

Does OP only want the index? In that case, isn't df.idxmax(1) sufficient? i.e. df.idxmax(1).reset_index().groupby(0).agg(list). I am not quite sure whether OP wants just the index or is trying to rearrange the df so that it becomes an upper triangular matrix.

Corralien Over a year ago

I updated my answer. Can you check it please?

Corralien Over a year ago

@onyambu. I returned the index because the expected output is not clear. With index, it's easy to extract rows. I updated my answer with a simple print.
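For reference, the row-wise idxmax idea from the comments can be sketched as follows. This is only an illustration under the assumption of 0/1 data and the three-row sample frame; the names first_col and groups are hypothetical. One caveat worth handling: for a row with no 1 at all, idxmax(axis=1) still returns the first column, so such rows are filtered out first.

```python
import pandas as pd

# Assumed sample frame (0/1 values only).
df = pd.DataFrame({"A": [0, 0, 0], "B": [0, 1, 0], "C": [1, 1, 1]})

# For each row, the label of the first column holding the row maximum,
# which is the first 1 when the data is 0/1.
first_col = df.idxmax(axis=1)

# Drop rows that contain no 1 at all; idxmax would report column "A" for them.
first_col = first_col[df.eq(1).any(axis=1)]

# Group row labels by the column in which their first 1 appears.
groups = {}
for row_label, col in first_col.items():
    groups.setdefault(col, []).append(row_label)

print(groups)  # {'C': [0, 2], 'B': [1]}
```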

rafaelc · Accepted Answer · 2022-06-09 21:35:19Z


Seems like you need a direct use of idxmax

Return index of first occurrence of maximum over requested axis.

NA/null values are excluded.

>>> df.idxmax()
A    0
B    1
C    0
dtype: int64

The values above are the indexes for which your constraints are met. 1 for B means that the second row was "selected". 0 for C, same. The only issue is that, if nothing is found, it'll also return 0.

To address that, you can use where

>>> df.idxmax().where(~df.eq(0).all())

This will make sure that NaNs are returned for all-zero columns.

A    NaN
B    1.0
C    0.0
dtype: float64
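A self-contained version of the two steps above, using the two-row frame from the question. The intermediate names first_hit and has_nonzero are hypothetical and only there for readability.

```python
import pandas as pd

# The two-row frame from the question.
df = pd.DataFrame({"A": [0, 0], "B": [0, 1], "C": [1, 1]})

# Index label of the first maximum in each column.
first_hit = df.idxmax()

# True for columns that contain at least one non-zero value.
has_nonzero = ~df.eq(0).all()

# Replace the misleading 0 of all-zero columns with NaN.
result = first_hit.where(has_nonzero)
print(result)
```

Because of the NaN, the result is upcast to float, which is why the labels print as 1.0 and 0.0.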

edited Jun 9, 2022 at 21:35

answered Jun 9, 2022 at 21:33

rafaelc


4 Comments

Corralien Over a year ago

Yes. I didn't see it. Sorry

Corralien Over a year ago

There is a subtlety: select all the rows. idxmax returns only one row (index)

rafaelc Over a year ago

@Corralien not sure if I follow. I believe OP wants the idxmax indeed. What they meant was to find that for all columns.

Corralien Over a year ago

For example, if there are 2 rows with 1 in C and all previous columns are set to 0 then you have to select the 2 rows not the first. Check my answer, you will understand ;)
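The subtlety discussed in these comments can be seen with a small sketch. The three-row frame is an assumption (the question's data plus a second row whose first 1 is in C): idxmax reports a single label per column, while the boolean mask from the other answer keeps every qualifying row.

```python
import pandas as pd

# Assumed frame: rows 0 and 2 both have their first 1 in column C.
df = pd.DataFrame({"A": [0, 0, 0], "B": [0, 1, 0], "C": [1, 1, 1]})

# Column-wise idxmax gives only the first matching label for C.
only_first = df.idxmax()["C"]
print(only_first)  # 0

# The cumsum-based mask marks the first 1 in each row, so C keeps both rows.
m = (df == 1) & (df.cumsum(axis=1) == 1)
all_rows = df[m["C"]].index.tolist()
print(all_rows)  # [0, 2]
```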
