I have a dataframe with multiple columns of real estate sales data. I would like to find the average price per square foot ('ppsf') for all 1-bed, 1-bath sales by zip code. Here is my attempt (each key in the dict is a zip code):

bed1_bath1={}
for zip in zip_codes:
    bed1_bath1[zip]= (df.loc[(df['bed']==1) & (df['bath']==1) & (df['zip']==zip)]).mean()

The problem is that this adds the mean of all columns from the dataframe to the dictionary. I'm sure there is a better way to do this; maybe using numpy.where?

1 Answer

df[(df['bed']==1) & (df['bath']==1) & (df['zip']==zip)]['ppsf'].mean() would do it. Select the column you are interested in before calculating the mean; that way the mean is never computed for the other columns.
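As an alternative to looping over zip codes, the same per-zip averages can probably be computed in one pass with groupby. This is only a minimal sketch: the sample rows below are made up for illustration, and the column names ('zip', 'bed', 'bath', 'ppsf') are assumed to match the question's dataframe.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "zip":  [10001, 10001, 10002, 10002, 10002],
    "bed":  [1, 1, 1, 2, 1],
    "bath": [1, 1, 1, 1, 1],
    "ppsf": [900.0, 1100.0, 700.0, 650.0, 800.0],
})

# Filter to 1-bed/1-bath rows once, then average ppsf within each zip code.
one_by_one = df[(df["bed"] == 1) & (df["bath"] == 1)]
ppsf_by_zip = one_by_one.groupby("zip")["ppsf"].mean()

# Convert to a plain dict keyed by zip code, like bed1_bath1 in the question.
bed1_bath1 = ppsf_by_zip.to_dict()
print(bed1_bath1)
```

Note that a zip code with no 1-bed/1-bath sales simply does not appear in the result, whereas the loop version stores NaN for it. This version also avoids using zip as a loop variable, which shadows the Python built-in of the same name.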
