Let's say I have a DataFrame in which several columns hold the same value in every row, e.g.

import pandas as pd

df = pd.DataFrame({'State': ['Texas', 'Texas', 'Texas', 'Texas'],
                   'County': ['Harris', 'Harris', 'Harris', 'Harris'],
                   'Building': [1, 2, 3, 4],
                   'Budget': [7328, 3290, 8342, 4290]})

I want to write a function that simply scans the df and drops the columns where all values are the same.
I have come up with the function below, which also returns a separate Series x mapping each removed column to its single value.
def drop_monocols(df):  # takes a DataFrame
    x = dict()
    n = 0
    while n < df.shape[1]:
        if df.iloc[:, n].nunique() == 1:  # finds columns where all values are the same
            x[df.columns[n]] = df.iloc[0, n]  # adds colname: value to the dict
            df = df.drop(df.columns[n], axis=1)  # drops the useless column from df
        else:
            n += 1
    x = pd.Series(x)
    return x, df  # returns the useless col: value Series and the cleaned df
I am new to coding and want to understand if there is a simpler way. Can I use a for loop over the columns instead of a while loop? And is it possible to use .apply here instead of calling a function with df as the argument?
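
To make the two ideas concrete, here is a rough sketch of what they might look like. It is one possible approach under a few assumptions, not necessarily the best one. The names drop_monocols_loop and drop_monocols_mask are made up for illustration. The second version relies on DataFrame.nunique(), which gives the same per-column counts as df.apply(pd.Series.nunique). Both versions assume unique column names and at least one row, and, like the original, they depend on nunique() ignoring NaN by default.

```python
import pandas as pd


def drop_monocols_loop(df):
    """For-loop variant (hypothetical name): find constant columns, then drop them in one call."""
    removed = {}
    for col in df.columns:
        if df[col].nunique() == 1:  # nunique() ignores NaN by default
            removed[col] = df[col].iloc[0]
    # Dropping after the loop avoids changing df while iterating over its columns.
    return pd.Series(removed), df.drop(columns=list(removed))


def drop_monocols_mask(df):
    """Boolean-mask variant (hypothetical name) built on DataFrame.nunique()."""
    is_constant = df.nunique() == 1           # boolean Series indexed by column name
    removed = df.loc[:, is_constant].iloc[0]  # first row of the constant columns
    return removed, df.loc[:, ~is_constant]


df = pd.DataFrame({'State': ['Texas', 'Texas', 'Texas', 'Texas'],
                   'County': ['Harris', 'Harris', 'Harris', 'Harris'],
                   'Building': [1, 2, 3, 4],
                   'Budget': [7328, 3290, 8342, 4290]})

removed, cleaned = drop_monocols_mask(df)
print(removed)  # the State and County values
print(cleaned)  # only the Building and Budget columns remain
```

Neither version modifies the DataFrame that is passed in. Both return a new one, so the result has to be assigned as in the last lines above.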