19

I'm using the value_counts() function from Python's pandas package to find the frequency of items in a column. This works and outputs the following:

57     1811
62      630
71      613
53      217
59      185
68       88
52       70

Name: hospitalized, dtype: int64

Here the left column is the item and the right column is how often it occurs in the column.

From there, I wanted to access the first column of items and iterate through that in a for loop. I want to be able to access the item of each row and check if it is equal to another value. If this is true, I want to be able to access the second column and divide it by another number.

My big issue is accessing the first column from the .value_counts() output. Is it possible to access this column and if so, how? The columns aren't named anything specific (since it's just the value_counts() output) so I'm unsure how to access them.

5 Answers

38

Use pandas' Series.items():

import pandas as pd

df = pd.DataFrame({'mycolumn': [1, 2, 2, 2, 3, 3, 4]})
for val, cnt in df.mycolumn.value_counts().items():
    print('value', val, 'was found', cnt, 'times')

value 2 was found 3 times
value 3 was found 2 times
value 4 was found 1 times
value 1 was found 1 times
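
To cover the second half of the question (compare each item to some value and, on a match, divide its count by a number), the same loop can be extended. This is only a minimal sketch: `target` and `divisor` are hypothetical names and values that do not come from the question.

```python
import pandas as pd

df = pd.DataFrame({'mycolumn': [1, 2, 2, 2, 3, 3, 4]})
counts = df['mycolumn'].value_counts()

target = 2    # hypothetical item to look for
divisor = 7   # hypothetical number to divide the count by

ratios = {}
for val, cnt in counts.items():
    # val is the item (left column), cnt is its frequency (right column)
    if val == target:
        ratios[val] = cnt / divisor

# The same lookup without a loop; Series.get returns the default
# (0 here) when the item is not present in the index.
direct = counts.get(target, 0) / divisor
```

If only one item matters, the `counts.get(...)` lookup is probably simpler than looping over every row.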

2 Comments

Clean approach!
I'm getting AttributeError: 'Series' object has no attribute 'iteritems' when calling .iteritems(); .items() works, though.
21

value_counts returns a Pandas Series:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.choice(list("abc"), size=10), columns=["X"])
df["X"].value_counts()
Out[243]: 
c    4
b    3
a    3
Name: X, dtype: int64

For the array of individual values, you can use the index of the Series:

vl_list = df["X"].value_counts().index
Index(['c', 'b', 'a'], dtype='object')

It is of type "Index" but you can iterate over it:

for idx in vl_list:
    print(idx)

c
b
a

Or, to get the values as a NumPy array, use df["X"].value_counts().index.values
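
As a rough illustration of pulling both "columns" out separately, the sketch below uses fixed data instead of random choices so the result is reproducible. It uses `to_numpy()`, which pandas offers alongside `.values`; the variable names are made up for the example.

```python
import pandas as pd

# Fixed sample data (an assumption for reproducibility): 4 x 'c', 3 x 'b', 3 x 'a'
df = pd.DataFrame({"X": list("ccccbbbaaa")})
counts = df["X"].value_counts()

items = counts.index.to_numpy()      # the distinct values (left column)
freqs = counts.to_numpy()            # their frequencies (right column)
items_list = counts.index.tolist()   # plain Python list of the values
```

The index and the values of the Series stay aligned, so `items[i]` occurs `freqs[i]` times.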

2 Comments

What is .index? How did you know you could call .index on value_counts()? I checked the pandas docs and don't see any available attributes for value_counts(), and yet here I see .index and .name both working (i.e. value_counts().name). Where do I find that information?
It returns a Series. You can see the .index syntax here: pandas.pydata.org/pandas-docs/stable/dsintro.html#series
2

You can access the first column by using .keys() or index as below:

df.column_name.value_counts().keys()

df.column_name.value_counts().index
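
A quick check that the two give the same result, since Series.keys() is an alias for the index. The sample data is invented, loosely modelled on the values in the question.

```python
import pandas as pd

# Invented sample: 57 appears 3 times, 62 twice, 71 once
s = pd.Series([57, 57, 62, 57, 62, 71], name="hospitalized")
counts = s.value_counts()

# keys() and .index hold the same labels in the same order
same = counts.keys().equals(counts.index)
items = list(counts.keys())
```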


0

I tried to find and then iterate all value pairs across two different columns and eventually resorted to this:

pairs = df.value_counts(['A', 'B']).reset_index()
# The 'count' column name assumes pandas 2.0 or later; older versions name it 0.
for i in range(len(pairs)):
    print(pairs.loc[i, 'A'])
    print(pairs.loc[i, 'B'])
    print(pairs.loc[i, 'count'])
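
One possible variation on the loop above, sketched with made-up data: passing `name=` to `reset_index()` fixes the name of the counts column yourself, and `itertuples()` avoids indexing by position. The column name `n` and the sample frame are assumptions for the example.

```python
import pandas as pd

# Made-up data: the pair ('x', 1) occurs twice, the other pairs once
df = pd.DataFrame({'A': ['x', 'x', 'y', 'x'], 'B': [1, 1, 2, 3]})

# Name the counts column explicitly instead of relying on the default
pairs = df.value_counts(['A', 'B']).reset_index(name='n')

rows = []
for row in pairs.itertuples(index=False):
    # Each row exposes the columns as attributes
    rows.append((row.A, row.B, row.n))
```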
    


0

I found obtaining the first value a little non-intuitive:

counter = df[columnName].value_counts()
print("Item:", counter.index[0], "with value:", counter.iloc[0])
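
For comparison, here is a sketch of two other ways to reach the most frequent item, using invented data. `idxmax()` and `max()` do not depend on the sort order, while taking the first pair from `items()` relies on value_counts() sorting by frequency, which it does by default.

```python
import pandas as pd

# Invented sample: 57 appears 3 times, 62 twice, 71 once
s = pd.Series([57, 57, 62, 57, 62, 71], name="hospitalized")
counter = s.value_counts()

top_item = counter.idxmax()   # item with the highest count
top_count = counter.max()     # that highest count

# First (item, count) pair in the default descending order
first_item, first_count = next(iter(counter.items()))
```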

