Best way to remove dataframe columns where every value is the same

Question

Let's say I have a DataFrame in which several columns contain the same value in every row, e.g.

df = pd.DataFrame({'State': ['Texas', 'Texas', 'Texas', 'Texas'],
                   'County': ['Harris', 'Harris', 'Harris', 'Harris'],
                   'Building': [1, 2, 3, 4],
                   'Budget': [7328, 3290, 8342, 4290]})

I want to write a function that simply scans the df and drops the columns where all values are the same.

I have come up with the function below, which also returns a separate Series x of the removed columns and their unique values.

I am new to coding and want to understand if there is a simpler way

import pandas as pd

def drop_monocols(df):              # takes df
    x = dict()
    n = 0
    while n < df.shape[1]:
        if df.iloc[:, n].nunique() == 1:                # finds columns where all values are the same
            x[df.columns[n]] = df.iloc[0, n]            # adds to dict colname:value
            df = df.drop(df.columns[n], axis=1)         # drops useless col; n stays put because the next column shifts into position n
        else:
            n += 1
    x = pd.Series(x)                                    # convert the dict to a Series once, after the loop
    return x, df                                        # returns useless col:value series and cleaned df

Can I use a for loop over the columns instead of while? And is it possible to use .apply here instead of calling a function with df as the argument?

Does this answer your question? Drop all duplicate rows across multiple columns in Python Pandas
– Teemu Risikko, commented Feb 19, 2024 at 17:54

mozway · Accepted Answer · 2024-02-19 18:25:58Z


You can compare every row to the first row, check column by column whether all values match, and use the result for boolean indexing:

out = df.loc[:, ~df.eq(df.iloc[0]).all()]

Variant, keep if any value is different:

out = df.loc[:, df.ne(df.iloc[0]).any()]
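
To make the one-liners above easier to follow, here is a minimal sketch that splits the first one into its intermediate steps, using the DataFrame from the question. The names same_as_first and is_constant are only illustrative and do not come from pandas.

```python
import pandas as pd

df = pd.DataFrame({
    'State': ['Texas', 'Texas', 'Texas', 'Texas'],
    'County': ['Harris', 'Harris', 'Harris', 'Harris'],
    'Building': [1, 2, 3, 4],
    'Budget': [7328, 3290, 8342, 4290],
})

# Element-wise comparison of every row against the first row
# (the first row is aligned on the column labels).
same_as_first = df.eq(df.iloc[0])

# Reduce each column to one boolean: True if every value equals the first.
is_constant = same_as_first.all()

# Boolean indexing on the columns: keep only the non-constant ones.
out = df.loc[:, ~is_constant]
print(out)
```

The variant with ne and any computes the complementary mask directly, so no negation is needed.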

Or, with nunique if you don't have NaNs (or want to ignore NaNs):

out = df.loc[:, df.nunique().gt(1)]
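
The NaN caveat is easiest to see on a small example. The sketch below uses a made-up DataFrame (demo is not from the question) and compares three masks: nunique with its default dropna=True, nunique with dropna=False, and the comparison against the first row. Because NaN never compares equal to NaN, the eq approach treats any column containing NaN in the first row as non-constant.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 'a' is constant apart from one NaN, 'b' is all NaN,
# 'c' genuinely varies.
demo = pd.DataFrame({
    'a': [1.0, 1.0, np.nan],
    'b': [np.nan, np.nan, np.nan],
    'c': [1.0, 2.0, 3.0],
})

# Default nunique ignores NaN: 'a' counts 1 value, 'b' counts 0.
kept_default = demo.loc[:, demo.nunique().gt(1)]

# dropna=False counts NaN as a value: 'a' counts 2, 'b' counts 1.
kept_with_nan = demo.loc[:, demo.nunique(dropna=False).gt(1)]

# Comparing to the first row: NaN != NaN, so 'b' is not seen as constant.
kept_eq = demo.loc[:, ~demo.eq(demo.iloc[0]).all()]

print(list(kept_default.columns))   # ['c']
print(list(kept_with_nan.columns))  # ['a', 'c']
print(list(kept_eq.columns))        # ['a', 'b', 'c']
```

Which behaviour is right depends on whether missing values should count as a distinct value in your data.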

Output:

   Building  Budget
0         1    7328
1         2    3290
2         3    8342
3         4    4290
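
The question also wanted a Series of the removed columns with their single value. One possible way to get both results from the same mask is sketched below; it reuses the name drop_monocols from the question, assumes the DataFrame has at least one row (df.iloc[0] fails on an empty frame), and inherits the NaN behaviour of the eq comparison.

```python
import pandas as pd

def drop_monocols(df):
    """Return (removed, cleaned): a Series mapping each constant column
    to its value, and the DataFrame without those columns."""
    first_row = df.iloc[0]
    is_constant = df.eq(first_row).all()
    removed = first_row[is_constant]     # column name -> repeated value
    cleaned = df.loc[:, ~is_constant]    # only the varying columns
    return removed, cleaned

df = pd.DataFrame({
    'State': ['Texas', 'Texas', 'Texas', 'Texas'],
    'County': ['Harris', 'Harris', 'Harris', 'Harris'],
    'Building': [1, 2, 3, 4],
    'Budget': [7328, 3290, 8342, 4290],
})

removed, cleaned = drop_monocols(df)
print(removed)
print(cleaned)
```

If you prefer method-call syntax over passing df as an argument, df.pipe(drop_monocols) returns the same tuple; .apply works column by column or row by row, so it is less natural for a function that takes the whole DataFrame.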

edited Feb 19, 2024 at 18:25

answered Feb 19, 2024 at 17:59
