Using python pandas data frame, how do you delete all rows that contain a '-'?

Question

I am importing a .csv file that someone else made, and they filled in some rows with a '-' character wherever there was missing data. The data frame looks something like this:

        Data1    Data2
0       99       1
1       99       2
2       -        3
3       98       4
4       97       5
5       -        -
6       -        -

Except there are thousands of rows, so I don't want to manually search for each row containing a dash to delete it. I have tried the following lines of code, but they keep returning an unaltered data frame with the '-' rows still remaining:

import pandas as pd
    
df = pd.read_csv("data.csv")
    
df[df['Data1'] != '-']

print(df)

My logic here is that any row not containing a '-' should remain, but clearly that's not working. The output I want is:

        Data1    Data2
0       99       1
1       99       2
2       98       4
3       97       5

mozway · Accepted Answer · 2022-11-07 16:25:50Z


The easiest approach is boolean indexing: keep only the rows in which every column is not equal (ne) to '-':

out = df[df.ne('-').all(axis=1)]
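
As a minimal sketch of that line in context, assuming the sample data from the question is built inline (the real data would come from `pd.read_csv`), the mask can be inspected before it is applied. Note that the expression in the question also produces a filtered frame; it just never assigns the result to anything, so `df` is printed unchanged.

```python
import pandas as pd

# Sample data mirroring the question (an assumption; the real frame comes from
# the CSV). Values are strings because the '-' placeholders make pandas read
# the columns as object dtype.
df = pd.DataFrame({
    "Data1": ["99", "99", "-", "98", "97", "-", "-"],
    "Data2": ["1", "2", "3", "4", "5", "-", "-"],
})

# True where a cell is not '-', then True per row only if every cell passes.
mask = df.ne("-").all(axis=1)

# Boolean indexing returns a new frame; assign it, or the result is discarded.
out = df[mask]
print(out)
```

Here `df` itself still has all seven rows; only `out` holds the filtered result.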

If you instead need to drop the rows (e.g. to update the data frame in place), you can use:

m = df.eq('-').any(axis=1)
df.drop(df.index[m], inplace=True)

output:

  Data1 Data2
0    99     1
1    99     2
3    98     4
4    97     5
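
The output above keeps the original index labels (0, 1, 3, 4), whereas the desired output in the question is renumbered from 0 to 3. A sketch of one way to get there, again assuming inline sample data; the `pd.to_numeric` step is an optional extra, not part of the approach above, for the case where the remaining values should be numbers rather than strings:

```python
import pandas as pd

# Sample data mirroring the question (an assumption; the real frame comes from
# the CSV).
df = pd.DataFrame({
    "Data1": ["99", "99", "-", "98", "97", "-", "-"],
    "Data2": ["1", "2", "3", "4", "5", "-", "-"],
})

# Filter as before, then renumber the rows 0..n-1, discarding the old labels.
out = df[df.ne("-").all(axis=1)].reset_index(drop=True)

# Optional: the columns are still strings; convert each one to a numeric dtype.
out = out.apply(pd.to_numeric)
print(out)
```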

