I have the following reproducible code, in which I build a DataFrame from a dictionary, group it by the 'Metropolitan area' column, and use agg() to compute the mean for each group:

import numpy as np
import pandas as pd

# Two teams per metropolitan area, so the population appears twice per area
dictionaryMLB = {'Metropolitan area': ['New York City', 'New York City', 'Los Angeles', 'Los Angeles', 'San Francisco Bay Area', 'San Francisco Bay Area', 'Chicago', 'Chicago'],
                 'Population (2016 est.)[8]': [20153634, 20153634, 13310447, 13310447, 6657982, 6657982, 9512999, 9512999],
                 'MLB': ['Yankees', 'Mets', 'Dodgers', 'Angels', 'Giants', 'Athletics', 'Cubs', 'White Sox']}

df = pd.DataFrame(dictionaryMLB)

# Passing a list of functions to agg() produces two-level column labels
df.groupby('Metropolitan area').agg([np.mean])

My output is the following:

                     Population (2016 est.)[8]
                           mean
Metropolitan area   
Chicago                   9512999
Los Angeles               13310447
New York City             20153634
San Francisco Bay Area    6657982

I would like to avoid the two-level column name and keep only one of the two labels, either Population (2016 est.)[8] or mean, to obtain, for example, the following:

                            mean
Metropolitan area   
Chicago                   9512999
Los Angeles               13310447
New York City             20153634
San Francisco Bay Area    6657982

How should I proceed?

  • Add .droplevel(0, axis=1). Commented Sep 16, 2021 at 17:30
  • You can just remove the list: df.groupby('Metropolitan area').agg(np.mean). Commented Sep 16, 2021 at 17:30
  • Or remove the square brackets in agg. Commented Sep 16, 2021 at 17:31
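
The suggestions in the comments could be sketched roughly as follows. This is only an illustration under a few assumptions: the data is a shortened copy of the question's dictionary, the population column is selected explicitly (so the string column MLB is not passed to the mean), and the string 'mean' is used in place of np.mean. The variable names and the third option (named aggregation) are not from the comments.

```python
import pandas as pd

# Shortened copy of the question's data (assumption: two areas are enough to illustrate)
data = {
    'Metropolitan area': ['New York City', 'New York City', 'Los Angeles', 'Los Angeles'],
    'Population (2016 est.)[8]': [20153634, 20153634, 13310447, 13310447],
    'MLB': ['Yankees', 'Mets', 'Dodgers', 'Angels'],
}
df = pd.DataFrame(data)
pop = 'Population (2016 est.)[8]'

# Select only the numeric column before aggregating
grouped = df.groupby('Metropolitan area')[[pop]]

# A list of functions yields two column levels: (column name, function name)
two_level = grouped.agg(['mean'])

# Option 1: drop the top level, keeping only 'mean' as the column label
only_mean = two_level.droplevel(0, axis=1)

# Option 2: pass the function without a list, keeping only the original column name
only_pop = grouped.agg('mean')

# Option 3 (not from the comments): named aggregation picks the output label directly
named = df.groupby('Metropolitan area').agg(mean=(pop, 'mean'))

print(only_mean)
print(only_pop)
print(named)
```

Option 1 and option 3 leave a single column labelled mean, while option 2 keeps the label Population (2016 est.)[8].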
