Python pandas DataFrame selection rows and columns based on specific condition

Question

I have the following:

  cl1 cl2 cl3 .... cln
0  aaa bbb ccc .... nnn
1  bbb aaa ccc .... nnn
2  xxx xxx xxx .... xxx

I need to select the rows in which any column's value satisfies value.lower() == 'aaa'. Here that is rows 0 and 1, so the output should be:

   cl1 cl2 cl3 .... cln
0  aaa bbb ccc .... nnn
1  bbb aaa ccc .... nnn

I tried many approaches, but all of them require the column names to be specified, and in my case I don't know the column names.

So basically, something like this would work if I knew the column names:

df.loc[~df['something1'].str.lower().str.strip().isin(['something2'])]

anky · Accepted Answer · 2019-03-01 20:28:22Z


IIUC you can use:

df[df.eq('aaa').any(axis=1)]

   cl1  cl2  cl3  cln
0  aaa  bbb  ccc  nnn
1  bbb  aaa  ccc  nnn
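
To try the expression above end to end, here is a minimal, self-contained sketch. The sample frame is made up to mirror the question (the elided columns are left out), and the names mask and result are only illustrative.

```python
import pandas as pd

# Hypothetical sample data mirroring the layout in the question.
df = pd.DataFrame({
    "cl1": ["aaa", "bbb", "xxx"],
    "cl2": ["bbb", "aaa", "xxx"],
    "cl3": ["ccc", "ccc", "xxx"],
})

# eq() compares every cell with 'aaa'; any(axis=1) is True for a row
# when at least one of its cells matched. No column names are needed.
mask = df.eq("aaa").any(axis=1)
result = df[mask]
print(result)
```

Keeping the boolean mask in its own variable makes it easy to inspect which rows matched before filtering.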

If lower() has to be taken into consideration:

df[df.apply(lambda x: x.str.lower()).eq('aaa').any(axis=1)] #thanks Chris

Or:

df[df.applymap(str.lower).eq('aaa').any(axis=1)]

The second one is faster, and the first one can handle NaNs.
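
A small sketch of the NaN-tolerant variant, under a few assumptions: the sample data is invented, every column holds strings (the .str accessor raises on numeric columns), and stripping whitespace is added because the question's own attempt uses str.strip(). Since pandas 2.1, applymap is deprecated in favor of DataFrame.map, so the sketch sticks to the apply form.

```python
import numpy as np
import pandas as pd

# Hypothetical data with mixed case, stray whitespace and a missing value.
df = pd.DataFrame({
    "cl1": ["AAA", "bbb", "xxx", np.nan],
    "cl2": ["bbb", " Aaa ", "xxx", "ccc"],
})

# Column-wise .str methods propagate NaN instead of raising,
# which is why this form tolerates missing values.
normalized = df.apply(lambda col: col.str.lower().str.strip())

# NaN never equals 'aaa', so rows containing only NaN/non-matches drop out.
result = df[normalized.eq("aaa").any(axis=1)]
print(result)
```

The original values are returned untouched; only the comparison runs on the normalized copy.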

edited Mar 1, 2019 at 20:28

answered Mar 1, 2019 at 20:12

anky


4 Comments

It_is_Chris Over a year ago

Don't forget to handle case issues, as the OP suggested with str.lower(); something like df[df.apply(lambda x: x.str.lower()).eq('aaa').any(axis=1)]

Gurgen Hovhannisyan Over a year ago

Let me rephrase my question; your solution is part of it. Basically I need the following: if a column's name satisfies column_name.lower().strip() == 'note' and df[column_name] == 'basename', then I need those rows. So I need to check not only the values but also the column names (and not with a plain '==', but with lower().strip() applied to them). I think I found a solution using drop_duplicates, selecting only the rows I need and the value. Trying it now.
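
One possible way to express that selection, as a hedged sketch rather than a confirmed solution: the frame, the column names and the variable names (note_cols, rows) are hypothetical, and the cell values are normalized with lower().strip() as well, which is an assumption based on the question's original attempt.

```python
import pandas as pd

# Hypothetical frame: the 'note' column has stray whitespace and capitals.
df = pd.DataFrame({
    " Note": ["basename", "other", "BaseName "],
    "value": [1, 2, 3],
})

# Pick the columns whose cleaned-up name equals 'note'.
note_cols = df.columns[df.columns.str.lower().str.strip() == "note"]

# Normalize the values in those columns only, then keep rows
# where any of them equals 'basename'.
normalized = df[note_cols].apply(lambda col: col.str.lower().str.strip())
rows = df[normalized.eq("basename").any(axis=1)]
print(rows)
```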

Gurgen Hovhannisyan Over a year ago

Generally, my problem is as follows. I have a df in which one of the rows contains the value 'basename' in a column whose name satisfies column_name.lower().strip() == 'note' (it can be 'Note', 'note', ' note', etc.). I then keep appending new rows. If a new row contains the value 'basename' in a column whose name again satisfies column_name.lower().strip() == 'note', then all previous rows with that value in that column should be removed, except the last one. In pure Python, the code would be:

Gurgen Hovhannisyan Over a year ago

# Python is:
for column_name in df.columns:
    if column_name.lower().strip() == 'note':
        mask = df[column_name].str.lower().str.strip().isin(['basename'])
        if mask.any():
            # remove these rows from df
            df = df.loc[~mask]
# I need all of this written normally in DataFrame syntax in one line
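
A sketch of how the loop in this comment might be written without an explicit Python loop. The thread does not confirm any of this: the sample data and helper names (note_cols, is_base, last_idx) are hypothetical, the index is assumed to be unique, and the "keep only the last match" variant is one reading of the earlier comment.

```python
import pandas as pd

# Hypothetical frame; the 'note' column name carries extra whitespace.
df = pd.DataFrame({
    " Note ": ["basename", "other", "BASENAME", "x"],
    "value": [1, 2, 3, 4],
})

# Columns whose normalized name is 'note'.
note_cols = df.columns[df.columns.str.lower().str.strip() == "note"]

# True for rows holding 'basename' (case/whitespace-insensitive)
# in any of those columns.
is_base = (
    df[note_cols]
    .apply(lambda col: col.str.lower().str.strip())
    .eq("basename")
    .any(axis=1)
)

# Same effect as the loop: drop every matching row.
without_base = df[~is_base]

# Variant: keep only the last matching row (assumes a unique index).
last_idx = df.index[is_base][-1:]  # empty when nothing matched
keep_last = df[~is_base | df.index.isin(last_idx)]
print(keep_last)
```

The first result can be collapsed into a single chained expression if a true one-liner is required; it is split here only for readability.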
