0

I have a pandas dataframe in the below format

No  ei1  ei2  ei3  ei4  ei1_val  ei2_val  ei3_val  ei4_val
123
124
125  0   0    0    1    low      low      high     high

To simplify, I have shown only a subset of columns here but actually the pandas dataframe has columns from ei1 to ei24 and ei1_val to ei24_val.

I have retrieved the column names using the below code:

val_cols = df[[col for col in df.columns if col.endswith("_val")]]
cols = [col.replace('_val', '') for col in val_cols.columns]

After that, I need to drop the rows from dataframe df if all columns in val_cols and all columns in cols are empty. Hence the output dataframe would drop rows with No's 123 and 124. Not sure whether is there a way to do it efficiently in Pandas rather than looping over the columns and checking the values.

Any suggestions would be appreciated.

2
  • How is empty defined? Are those empty strings or NaN? Commented Jun 7, 2021 at 14:47
  • @HenryEcker: They are all empty strings Commented Jun 7, 2021 at 14:50

1 Answer 1

3

IIUC, try:

m = ~df.filter(regex='.*_val').isna().all(axis=1)
df[m]

Output:

    No  ei1  ei2  ei3  ei4 ei1_val ei2_val ei3_val ei4_val
2  125  0.0  0.0  0.0  1.0     low     low    high    high

Find all the columns where the column header ends with _val using regex in the pd.DataFrame.filter method.

Check to see if all values are NaN using isna and all with axis=1

Sign up to request clarification or add additional context in comments.

2 Comments

OP confirms they're empty strings not NaN. .eq('').all(axis=1) should be a quick fix?
FWIW, the regex isn't really checking for ends-with, it might as well match a string that starts with _val, but doesn!t matter here i guess :)

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.