I have a dataframe with multiple columns of real estate sales data. I would like to find the average price per square foot (column 'ppsf') for all 1-bed, 1-bath sales, grouped by zip code. Here is my attempt (each key in the dict is a zip code):
bed1_bath1={}
for zip in zip_codes:
    bed1_bath1[zip] = (df.loc[(df['bed']==1) & (df['bath']==1) & (df['zip']==zip)]).mean()
The problem is that this adds the mean of all columns from the dataframe to the dictionary. I'm sure there is a better way to do this; maybe using numpy.where?
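One possible direction, sketched here rather than offered as a definitive answer: filter the 1-bed, 1-bath rows once, then group by 'zip' and take the mean of only the 'ppsf' column. The column names come from the attempt above; the sample rows and the variable names `one_bed_one_bath` and `mean_ppsf_by_zip` are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "zip":  [10001, 10001, 10002, 10002, 10002],
    "bed":  [1, 1, 1, 2, 1],
    "bath": [1, 1, 1, 1, 1],
    "ppsf": [1000.0, 1200.0, 800.0, 700.0, 900.0],
})

# Keep only the 1-bed, 1-bath sales (one boolean mask, no loop).
one_bed_one_bath = df[(df["bed"] == 1) & (df["bath"] == 1)]

# Group by zip code and average just the 'ppsf' column.
mean_ppsf_by_zip = one_bed_one_bath.groupby("zip")["ppsf"].mean()

# Convert to a dict keyed by zip code, matching the shape of the original attempt.
bed1_bath1 = mean_ppsf_by_zip.to_dict()
print(bed1_bath1)
```

If the loop is preferred, selecting the column before averaging, i.e. `df.loc[mask, 'ppsf'].mean()`, should also limit the result to a single number per zip code. Note that with groupby, zip codes that have no 1-bed, 1-bath sales simply do not appear in the result, whereas the loop would store NaN for them.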