What I'm doing: I have two columns in Excel, price and borough (British for district). I generated the whole data set in Python, so I already have a list of the boroughs. I want a breakdown of the mean price for each borough in that list. So far, I've written a program that returns the mean of the entire price column:
import pandas as pd
# Path to the workbook; the data is read from the sheet named "Sheet1"
file = "/Users/my_name/Documents/Startup Ideas/Python Data /file.xlsx"
df = pd.ExcelFile(file).parse("Sheet1")
# Mean of the whole Price column, across all boroughs
mean_value = df["Price"].mean()
The borough list is:
["Chelsea", "Kensington", "Westminster", "Pimlico", "Bank", "Holborn", "Camden", "Islington", "Angel", "Battersea", "Knightsbridge", "Bermondsey", "Newham"]
How would I add another column, i.e. the one built from the borough list, and return the mean price per borough? Thanks very much in advance.
I really don't know where to start.
Here is the source code that generates the input data:
import random
import numpy as np
import pandas as pd
SIZE = 70_000
BOROUGHS = ["Chelsea", "Kensington", "Westminster", "Pimlico", "Bank", "Holborn", "Camden", "Islington", "Angel", "Battersea", "Knightsbridge", "Bermondsey", "Newham"]
# Seeds NumPy's generator only; random.choice below uses Python's own generator
np.random.seed(1)
data3 = pd.DataFrame({"Sq. feet" : np.random.randint(low=75, high=325, size=SIZE),
"Price" : np.random.randint(low=200000, high=1250000, size=SIZE),
"Borough" : [random.choice(BOROUGHS) for _ in range(SIZE)]
})

df.groupby('Borough').mean()
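To make the groupby line above concrete, here is a minimal sketch on a tiny hand-made frame. The four rows and the column name "Borough mean price" are made up for illustration; only "Price" and "Borough" come from the question. Selecting "Price" before calling mean() keeps other numeric columns such as "Sq. feet" out of the result, and transform("mean") is one way to write each borough's mean back onto every row as a new column.

```python
import pandas as pd

# Tiny stand-in for the Excel sheet (values are made up for illustration)
df = pd.DataFrame({
    "Price": [100, 200, 300, 500],
    "Borough": ["Chelsea", "Chelsea", "Camden", "Camden"],
})

# One mean price per borough, returned as a Series indexed by borough
mean_by_borough = df.groupby("Borough")["Price"].mean()

# Same means, broadcast back to every row as an extra column
# ("Borough mean price" is a hypothetical column name)
df["Borough mean price"] = df.groupby("Borough")["Price"].transform("mean")

print(mean_by_borough)
print(df)
```

With the real data, the same two lines should work unchanged on the frame read from the workbook, provided its columns are named "Price" and "Borough".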