How to get unique values with respective occurrence count from a list in Python?

Question

I have a list which has repeating items and I want a list of the unique items with their frequency.

For example, I have ['a', 'a', 'b', 'b', 'b'], and I want [('a', 2), ('b', 3)].

Looking for a simple way to do this without looping twice.

Just so you know... the answer you accepted violates your "without looping twice" constraint. (I'm commenting here so that you get notified :-).
– Tom, Commented Mar 6, 2010 at 15:41
Can you just clarify your question a little bit too? Are your items always grouped together? Or can they appear in any order in the list? — Tom
– Tom, Commented Mar 6, 2010 at 15:57
Yes, Tom. Although my question does not specify this - but in my particular situation, the values are coming sorted. Thanks. — Samantha Green
– Samantha Green, Commented Mar 6, 2010 at 16:02

jpp · Accepted Answer · 2019-01-28 14:41:16Z

75

With Python 2.7+, you can use collections.Counter.

Otherwise, see this counter recipe.

Under Python 2.7+:

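from collections import Counter

items = ['a', 'a', 'b', 'b', 'b']  # renamed from `input`, which shadows the built-in
c = Counter(items)

# In Python 3, items() returns a view, so wrap it in list() to print a list
print(list(c.items()))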

Output is:

[('a', 2), ('b', 3)]
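If the pairs are wanted in order of frequency rather than first appearance, `Counter.most_common()` may be convenient. This is a minimal sketch, assuming Python 3.7+ (where a `Counter` keeps insertion order); the variable names are illustrative and not part of the original answer.

```python
from collections import Counter

items = ['a', 'a', 'b', 'b', 'b']
counts = Counter(items)           # a single pass over the list

pairs = list(counts.items())      # order of first appearance
by_count = counts.most_common()   # highest count first

print(pairs)     # [('a', 2), ('b', 3)]
print(by_count)  # [('b', 3), ('a', 2)]
```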

edited Jan 28, 2019 at 14:41

jpp


answered Mar 6, 2010 at 15:20

mmmmmm


ghostdog74 · Accepted Answer · 2010-03-06 16:50:05Z

16

>>> mylist=['a', 'a', 'b', 'b', 'b']
>>> [ (i,mylist.count(i)) for i in set(mylist) ]
[('a', 2), ('b', 3)]
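Two caveats with the snippet above seem worth noting: `set` iteration order is arbitrary, so the pairs may come out in any order, and `list.count` rescans the whole list once per unique value. The sketch below, which assumes the items are hashable and sortable, only makes the output order deterministic; the name `unique_counts` is illustrative.

```python
mylist = ['b', 'a', 'b', 'a', 'b']

# sorted(set(...)) fixes the output order. count() still rescans the
# list once per unique value, so the total work is O(n * k).
unique_counts = [(x, mylist.count(x)) for x in sorted(set(mylist))]

print(unique_counts)  # [('a', 2), ('b', 3)]
```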

answered Mar 6, 2010 at 16:50

ghostdog74


Eli Bendersky · Accepted Answer · 2010-03-06 15:48:37Z

15

If your items are grouped (i.e. similar items come together in a bunch), the most efficient method to use is itertools.groupby:

>>> import itertools
>>> [(g[0], len(list(g[1]))) for g in itertools.groupby(['a', 'a', 'b', 'b', 'b'])]
[('a', 2), ('b', 3)]
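To illustrate why the grouping requirement matters, here is a small sketch (the list `ungrouped` is a made-up example): `groupby` only merges adjacent equal items, so an unsorted list yields one entry per run. Sorting first restores per-value totals, at the cost of an O(n log n) sort.

```python
from itertools import groupby

ungrouped = ['a', 'b', 'a', 'b', 'b']

# groupby merges only adjacent equal items, so 'a' and 'b' repeat here.
runs = [(key, len(list(group))) for key, group in groupby(ungrouped)]

# Sorting first brings equal items together.
totals = [(key, len(list(group))) for key, group in groupby(sorted(ungrouped))]

print(runs)    # [('a', 1), ('b', 1), ('a', 1), ('b', 2)]
print(totals)  # [('a', 2), ('b', 3)]
```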

edited Mar 6, 2010 at 15:48

answered Mar 6, 2010 at 15:18

Eli Bendersky


6 Comments

Eli Bendersky Over a year ago

@Tom: I'm aware of this limitation. When the items are grouped, however, groupby is the efficient and preferred approach

Tom Over a year ago

You should make that clear... notice the constraint in the question says "I have a list which has repeating items"... the list the OP gave was just an example. I don't think this solution is general enough. If the OP specified that the input list always had the elements grouped, I would agree.

Eli Bendersky Over a year ago

@Tom: you're right - I've updated the answer (BTW I assumed from his "repeating items" that they're grouped)

Tom Over a year ago

Ok Eli... thanks for the update :-). I revoke my -1 because your answer is now more clear.

geotheory Over a year ago

Is there a way to sort the resulting tuple list by count?
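One possible way to sort such a list of `(value, count)` tuples by count is `sorted` with a key on the second element. This is a sketch rather than a reply from the thread; the `('c', 1)` entry is added only to make the ordering visible.

```python
from operator import itemgetter

pairs = [('a', 2), ('b', 3), ('c', 1)]

# itemgetter(1) picks the count out of each (value, count) tuple.
ascending = sorted(pairs, key=itemgetter(1))
descending = sorted(pairs, key=itemgetter(1), reverse=True)

print(ascending)   # [('c', 1), ('a', 2), ('b', 3)]
print(descending)  # [('b', 3), ('a', 2), ('c', 1)]
```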


jpp · Accepted Answer · 2018-09-07 15:31:22Z

7

If you are willing to use a 3rd party library, NumPy offers a convenient solution. This is particularly efficient if your list contains only numeric data.

import numpy as np

L = ['a', 'a', 'b', 'b', 'b']

res = list(zip(*np.unique(L, return_counts=True)))

# [('a', 2), ('b', 3)]  (the elements are NumPy scalars; NumPy 2 prints them as np.str_('a'), np.int64(2))

To understand the syntax, note np.unique here returns a tuple of unique values and counts:

uniq, counts = np.unique(L, return_counts=True)

print(uniq)    # ['a' 'b']
print(counts)  # [2 3]

See also: What are the advantages of NumPy over regular Python lists?

answered Sep 7, 2018 at 15:31

jpp


Tom · Accepted Answer · 2010-03-06 15:31:06Z

3

I know this isn't a one-liner, but I like it because it makes clear that we pass over the initial list of values only once (instead of calling count on it):

>>> from collections import defaultdict
>>> l = ['a', 'a', 'b', 'b', 'b']
>>> d = defaultdict(int)
>>> for i in l:
...  d[i] += 1
... 
>>> d
defaultdict(<class 'int'>, {'a': 2, 'b': 3})
>>> list(d.items())
[('a', 2), ('b', 3)]
>>>

answered Mar 6, 2010 at 15:31

Tom


ghostdog74 · Accepted Answer · 2010-03-06 16:34:06Z

3

the "old school way".

>>> alist=['a', 'a', 'b', 'b', 'b']
>>> d={}
>>> for i in alist:
...    if i not in d: d[i]=1  # dict.has_key() was removed in Python 3
...    else: d[i]+=1
...
>>> d
{'a': 2, 'b': 3}

answered Mar 6, 2010 at 16:34

ghostdog74


Aaron · Accepted Answer · 2010-03-06 15:48:21Z

1

Another way to do this would be

mylist = [1, 1, 2, 3, 3, 3, 4, 4, 4, 4]
mydict = {}
for i in mylist:
    if i in mydict: mydict[i] += 1
    else: mydict[i] = 1

then to get the list of tuples,

mytups = [(i, mydict[i]) for i in mydict]

This only goes over the list once, but it does have to traverse the dictionary once as well. However, given that there are a lot of duplicates in the list, then the dictionary should be a lot smaller, hence faster to traverse.

Nevertheless, not a very pretty or concise bit of code, I'll admit.
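A slightly more compact variant of the same single-pass idea uses `dict.get` with a default of 0, which removes the membership test. This is a sketch, assuming Python 3.7+ so that the dictionary keeps insertion order.

```python
mylist = [1, 1, 2, 3, 3, 3, 4, 4, 4, 4]

mydict = {}
for i in mylist:
    # get() returns 0 for a value not seen yet, so no branch is needed.
    mydict[i] = mydict.get(i, 0) + 1

mytups = list(mydict.items())

print(mytups)  # [(1, 2), (2, 1), (3, 3), (4, 4)]
```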

answered Mar 6, 2010 at 15:48

Aaron


3 Comments

Tom Over a year ago

This is identical in spirit to my solution... except defaultdict consolidates the first part (since you don't have to check for existence) and list(mydict.iteritems()) is shorter than the list comprehension.

PaulMcG Over a year ago

mytups = mydict.items() is a simpler way to get the list of tuples.

Aaron Over a year ago

Thanks @Paul and @Tom. It seems like there is always a better way to do something in Python. :)

arte · Accepted Answer · 2010-03-06 17:28:09Z

1

A solution without hashing:

from functools import reduce  # reduce is no longer a built-in in Python 3

def lcount(lst):
   return reduce(lambda a, b: a[0:-1] + [(a[-1][0], a[-1][1]+1)] if a and b == a[-1][0] else a + [(b, 1)], lst, [])

>>> lcount([])
[]
>>> lcount(['a'])
[('a', 1)]
>>> lcount(['a', 'a', 'a', 'b', 'b'])
[('a', 3), ('b', 2)]
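The same run counting can also be written as an explicit loop, which may be easier to read and avoids rebuilding the result list at every step. This is a sketch; `lcount_loop` is a hypothetical name. Like the `reduce` version, it only compares adjacent items with `==`, so it works for unhashable items such as lists, but it counts runs rather than totals when equal items are not adjacent.

```python
def lcount_loop(lst):
    """Count runs of adjacent equal items without hashing them."""
    result = []
    for item in lst:
        if result and result[-1][0] == item:
            # Same as the previous item: extend the current run.
            result[-1] = (item, result[-1][1] + 1)
        else:
            # New value: start a run of length 1.
            result.append((item, 1))
    return result

print(lcount_loop([]))                         # []
print(lcount_loop(['a', 'a', 'a', 'b', 'b']))  # [('a', 3), ('b', 2)]
print(lcount_loop([[1], [1], [2]]))            # [([1], 2), ([2], 1)]
```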

answered Mar 6, 2010 at 17:28

arte


Kevlar · Accepted Answer · 2015-04-29 21:33:40Z

1

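Convert any data structure into a pandas Series s. The loop below prints each distinct count together with the number of unique values that occur that many times:

CODE:

for i in sorted(s.value_counts().unique()):
  print(i, (s.value_counts()==i).sum())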

edited Apr 29, 2015 at 21:33

Kevlar


answered Apr 29, 2015 at 20:59

Ali Arar


zashishz · Accepted Answer · 2018-05-15 10:05:48Z

0

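With the help of pandas you can do:

import pandas as pd
my_list = ['a', 'a', 'b', 'b', 'b']
# Series.value_counts() is used because the top-level pd.value_counts is deprecated in recent pandas versions
dict(pd.Series(my_list).value_counts())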

answered May 15, 2018 at 10:05

zashishz
