
I need to drop the columns of my dataframe that have a high share of missing values. I have created a new dataframe that holds the count and the ratio of missing values for each column of my original data set.

My original data set, data_merge2, looks like this:

A     B      C      D
123   ABC    X      Y
123   ABC    X      Y
NaN   ABC    NaN   NaN
123   ABC    NaN   NaN
245   ABC    NaN   NaN
345   ABC    NaN   NaN

The count data set, which gives the missing count and the missing ratio (in percent) for each column that has missing values, looks like this:

     missing_count   missing_ratio
  A    1               16.67
  C    4               66.67
  D    4               66.67

The code that I used to create the count dataset looks like this:

#Only check those columns where there are missing values as we have got a lot of columns
new_df = (data_merge2.isna()
        .sum()
        .to_frame('missing_count')
        .assign(missing_ratio = lambda x: x['missing_count']/len(data_merge2)*100)
        .loc[data_merge2.isna().any()] )
print(new_df)

Now I want to drop the columns from the original dataframe whose missing ratio is greater than 50%. How can I achieve this?

2 Answers

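Use DataFrame.count, which returns the number of non-missing values per column, and keep the columns where at least half of the values are present:

data_merge2.loc[:,data_merge2.count().div(len(data_merge2)).ge(0.5)]
#Alternative without the division
#data_merge2[data_merge2.columns[data_merge2.count().mul(2).ge(len(data_merge2))]]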

Or use DataFrame.drop with the index of the new_df DataFrame (drop returns a new DataFrame, so assign the result if you want to keep it):

data_merge2.drop(columns = new_df.index[new_df['missing_ratio'].gt(50)])

Output

       A    B
0  123.0  ABC
1  123.0  ABC
2    NaN  ABC
3  123.0  ABC
4  245.0  ABC
5  345.0  ABC
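
For anyone who wants to try this end to end, here is a minimal, self-contained sketch. The sample frame is reconstructed from the table in the question, and the names missing_fraction and threshold are only illustrative, not part of the original code. It uses isna().mean() as a shortcut for the per-column fraction of missing values, which should give the same selection as the count-based expression above.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question (an assumption about the real data)
data_merge2 = pd.DataFrame({
    "A": [123, 123, np.nan, 123, 245, 345],
    "B": ["ABC"] * 6,
    "C": ["X", "X", np.nan, np.nan, np.nan, np.nan],
    "D": ["Y", "Y", np.nan, np.nan, np.nan, np.nan],
})

# Fraction of missing values per column, between 0.0 and 1.0
missing_fraction = data_merge2.isna().mean()

# Illustrative cutoff: drop columns with more than 50% missing values
threshold = 0.5

# Keep the columns whose missing fraction is at most the threshold
result = data_merge2.loc[:, missing_fraction.le(threshold)]
print(result)
```

The original data_merge2 is left unchanged; result is a new DataFrame containing only the columns that pass the cutoff.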

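Adding another way with query and the symmetric difference of the two indexes. Older pandas versions accepted the ^ (XOR) operator for this, but set operations through operators on an Index were deprecated, and in pandas 2.0 and later they act as element-wise logical operations, so use the method instead:

data_merge2[data_merge2.columns.symmetric_difference(new_df.query('missing_ratio>50').index)]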

Or, more directly, use Index.difference (note that by default it returns the remaining labels sorted, so the column order can change):

data_merge2[data_merge2.columns.difference(new_df.query('missing_ratio>50').index)]

       A    B
0  123.0  ABC
1  123.0  ABC
2    NaN  ABC
3  123.0  ABC
4  245.0  ABC
5  345.0  ABC
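
If the original column order matters, one option is to pass sort=False to Index.difference. The sketch below is only an illustration under stated assumptions: the sample frame is reconstructed from the question, with B deliberately placed before A so the effect on ordering is visible, and to_drop and kept are made-up names.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question; B is placed before A on purpose
# so that sorted and unsorted results differ (an assumption for illustration)
data_merge2 = pd.DataFrame({
    "B": ["ABC"] * 6,
    "A": [123, 123, np.nan, 123, 245, 345],
    "C": ["X", "X", np.nan, np.nan, np.nan, np.nan],
    "D": ["Y", "Y", np.nan, np.nan, np.nan, np.nan],
})

# Rebuild the summary frame from the question: missing count and ratio in percent
missing_count = data_merge2.isna().sum()
new_df = pd.DataFrame({
    "missing_count": missing_count,
    "missing_ratio": missing_count / len(data_merge2) * 100,
})
new_df = new_df[new_df["missing_count"] > 0]

# Labels of the columns with more than 50% missing values
to_drop = new_df.index[new_df["missing_ratio"] > 50]

# sort=False keeps the remaining labels in their original order
kept = data_merge2.columns.difference(to_drop, sort=False)
result = data_merge2[kept]
print(result)
```

With the default sort=None, the same call would return the labels in sorted order (A before B) instead.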
