I have a list called 'gender', and I counted the occurrences of each value in it with Counter:

gender = ['2',
          'Female,',
          'All Female Group,',
          'All Male Group,',
          'Female,',
          'Couple,',
          'Mixed Group,'....]

from collections import Counter

gender_count = Counter(gender)
gender_count
Counter({'2': 1,
     'All Female Group,': 222,
     'All Male Group,': 119,
     'Couple,': 256,
     'Female,': 1738,
     'Male,': 2077,
     'Mixed Group,': 212,
     'NA': 16})

I want to put this dict into a pandas DataFrame. I have used pd.Series (following "Convert Python dict into a dataframe"):

s = pd.Series(gender_count, name='gender count')
s.index.name = 'gender'
s.reset_index()

This displays the DataFrame I want, but I don't know how to save the result of these steps as a pandas DataFrame. I also tried using DataFrame.from_dict():

s2 = pd.DataFrame.from_dict(gender_count, orient='index')

But this creates a dataframe with the categories of gender as the index.

Eventually I want to use the gender categories and their counts for a pie chart.

  • Just add reset_index() at the end of your s2 creation statement. Commented Mar 29, 2017 at 16:42
  • That works! Is there a way I can now change the column names? I tried .rename_axis() but that didn't work. Commented Mar 29, 2017 at 16:49
  • s2.columns = ['lovely name 1', 'lovely name 2'] Commented Mar 29, 2017 at 17:01
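
The suggestions in these comments can be combined into one short sketch. It assumes a small stand-in Counter holding a subset of the counts shown in the question (the full list is elided there), and the column names 'gender' and 'count' are illustrative choices rather than anything the commenters specified.

```python
from collections import Counter

import pandas as pd

# Stand-in for the question's Counter: a subset of the counts shown there.
gender_count = Counter({'Female,': 1738, 'Male,': 2077, 'Couple,': 256})

# orient='index' puts the dict keys in the index; reset_index() moves
# them into an ordinary column and returns a new DataFrame.
s2 = pd.DataFrame.from_dict(gender_count, orient='index').reset_index()

# Assign both column names at once (names chosen for illustration).
s2.columns = ['gender', 'count']
```

Because reset_index() returns a new object instead of modifying in place, the result has to be assigned to a name to be kept.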

3 Answers

Skip the intermediate Counter step and let pandas count the values directly:

gender = ['2',
          'Female',
          'All Female Group',
          'All Male Group',
          'Female',
          'Couple',
          'Mixed Group']

pd.value_counts(gender)

Female              2
2                   1
Couple              1
Mixed Group         1
All Female Group    1
All Male Group      1
dtype: int64
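
To get from these counts to the two-column layout the question asks for, one option is to chain rename_axis and reset_index onto the result. This is only a sketch, assuming a reasonably recent pandas; it calls the Series.value_counts method rather than the top-level function, and the column names are illustrative.

```python
import pandas as pd

gender = ['2', 'Female', 'All Female Group', 'All Male Group',
          'Female', 'Couple', 'Mixed Group']

# Count occurrences; the distinct values become the index.
counts = pd.Series(gender).value_counts()

# Name the index, then move it into a column next to the counts.
df = counts.rename_axis('gender').reset_index(name='count')
```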

In [21]: df = pd.Series(gender_count).rename_axis('gender').reset_index(name='count')

In [22]: df
Out[22]:
              gender  count
0                  2      1
1  All Female Group,    222
2    All Male Group,    119
3            Couple,    256
4            Female,   1738
5              Male,   2077
6       Mixed Group,    212
7                 NA     16

2 Comments

I used your code, but this gives me the error message 'str' object is not callable, for 'gender' and 'count'.
@Lisadk: You're likely using an older version of pandas. See the output of pd.__version__.
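
Since the end goal in the question is a pie chart, a DataFrame built this way could be plotted roughly as follows. This is a sketch, assuming matplotlib is installed; the stand-in counts are a subset of those in the question, and stripping the trailing commas from the labels is a cosmetic step that is not part of the original answer.

```python
from collections import Counter

import matplotlib
matplotlib.use('Agg')  # non-interactive backend; remove to open a window
import matplotlib.pyplot as plt
import pandas as pd

# Subset of the counts shown in the question.
gender_count = Counter({'All Female Group,': 222, 'All Male Group,': 119,
                        'Couple,': 256, 'Female,': 1738, 'Male,': 2077,
                        'Mixed Group,': 212})

# Same chain as in the answer: index -> 'gender', values -> 'count'.
df = pd.Series(gender_count).rename_axis('gender').reset_index(name='count')

# Cosmetic: drop the trailing commas so the labels read cleanly.
df['gender'] = df['gender'].str.rstrip(',')

fig, ax = plt.subplots()
# One wedge per row, labelled with the category and its percentage.
ax.pie(df['count'], labels=df['gender'], autopct='%1.1f%%')
ax.set_title('Gender categories')
# Call fig.savefig(...) or plt.show() here to see the chart.
```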

What about just:

s = pd.DataFrame(gender_count)

2 Comments

Since 'gender_count' is a dict, this does not work (ValueError: If using all scalar values, you must pass an index). EDIT: when you put in index=[0], it gives me the categories of gender as the columns.
This gives me the categories of gender as the columns. I would like 'gender' and 'count' as the columns.
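
The ValueError arises because the dict values are scalars, so pandas has nothing to infer row labels from. One way to get 'gender' and 'count' as columns is to pass the (key, value) pairs as rows instead. A minimal sketch, using a subset of the counts from the question as stand-in data:

```python
from collections import Counter

import pandas as pd

# Subset of the counts shown in the question.
gender_count = Counter({'Female,': 1738, 'Male,': 2077, 'NA': 16})

# Each (category, count) pair becomes one row; most_common() also
# sorts the rows from the largest count to the smallest.
df = pd.DataFrame(gender_count.most_common(), columns=['gender', 'count'])
```

If the original insertion order matters more than sorting, list(gender_count.items()) can be passed in place of most_common().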
