
I just ran into a problem where I need to turn a list, e.g. l = [1, 2, 3, 4], into a dict of counts, e.g. {1: 1, 2: 1, 3: 1, 4: 1}. Should I use collections.Counter() or write the loop myself? Is the built-in method faster than a hand-written loop?

  • Use collections.Counter(), because this is what it does. Why write your own code? ;) Commented Nov 9, 2016 at 18:29
  • You can always test if something is faster, with the timeit module. In Python 3, the Counter object has C performance improvements and is very fast indeed. Commented Nov 9, 2016 at 18:39
  • @MartijnPieters: I measured it using timeit and my own code is faster than Counter. Maybe I did something wrong, but everything looks the same to me. Commented Nov 9, 2016 at 18:43
  • @anonymous: I did mention that in Python 3 it is faster. Are you using Python 2 perhaps? See my answer below for a benchmark. Commented Nov 9, 2016 at 18:47
  • Yes, I am using Python 2. Maybe that is the reason. Commented Nov 9, 2016 at 18:48

2 Answers


You can always test if something is faster, with the timeit module. In Python 3, the Counter object has C performance improvements and is very fast indeed:

>>> from timeit import timeit
>>> import random, string
>>> from collections import Counter, defaultdict
>>> def count_manually(it):
...     res = defaultdict(int)
...     for el in it:
...         res[el] += 1
...     return res
...
>>> test_data = [random.choice(string.printable) for _ in range(10000)]
>>> timeit('count_manually(test_data)', 'from __main__ import test_data, count_manually', number=2000)
1.4321454349992564

>>> timeit('Counter(test_data)', 'from __main__ import test_data, Counter', number=2000)
0.776072466003825

Here Counter() was almost twice as fast (about 0.78 s versus 1.43 s for 2000 runs).

That said, unless you are counting in a performance-critical section of your code, focus on readability and maintainability, and in that respect a Counter() wins hands-down over write-your-own code.

Next to all that, Counter() objects offer functionality on top of dictionaries: they can be treated as multisets (you can sum or subtract counters, and produce unions or intersections), and they can efficiently give you the top N elements by count.
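To illustrate the multiset behaviour and the top-N lookup mentioned above, here is a minimal sketch. The two sample strings and the variable names are made up for the example; the operators and most_common() are part of the standard collections.Counter API.

```python
from collections import Counter

a = Counter("abracadabra")   # a: 5, b: 2, r: 2, c: 1, d: 1
b = Counter("alakazam")      # a: 4, l: 1, k: 1, z: 1, m: 1

total = a + b     # add counts element by element
diff = a - b      # subtract counts, dropping anything that is not positive
union = a | b     # keep the larger count for each element
inter = a & b     # keep the smaller count for each element

# The two most frequent elements of `a`, highest count first.
top_two = a.most_common(2)

print(total["a"], diff["a"], union["a"], inter["a"])
print(top_two)
```

Doing the same with a plain dict or defaultdict would need a hand-written loop for each of these operations.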


2 Comments

Yes, you are right. In Python 2 it is slower, but it is faster in Python 3.
Thanks very much!!

It depends on readability vs. efficiency. Let's look at both implementations first. I will use this list for the sample run:

my_list = [1, 2, 3, 4, 4, 5, 4, 3, 2]

Using collections.Counter():

from collections import Counter
d = Counter(my_list)

Using collections.defaultdict() to create my own counter:

from collections import defaultdict
d = defaultdict(int)
for i in my_list:
    d[i] += 1

As you can see, collections.Counter() is more readable.
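As a quick sanity check, the sketch below runs both approaches on the sample list and confirms they produce the same counts. The variable names are illustrative only.

```python
from collections import Counter, defaultdict

my_list = [1, 2, 3, 4, 4, 5, 4, 3, 2]

# Built-in counting.
counted = Counter(my_list)

# Hand-written counting with a defaultdict.
manual = defaultdict(int)
for item in my_list:
    manual[item] += 1

# Both hold the same key/count pairs.
same_counts = dict(counted) == dict(manual)

# A Counter returns 0 for a key it has never seen, without inserting it.
missing = counted[99]

print(same_counts, counted[4], missing)
```

One difference worth knowing: looking up a missing key in the defaultdict inserts it with a value of 0, while the Counter leaves itself unchanged.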

Let's look at efficiency using timeit:

  • In Python 2.7:

    mquadri$ python -m "timeit" -c "from collections import defaultdict" "d=defaultdict(int)" "for i in [1, 2, 3, 4, 4, 5, 4, 3, 2]: d[i] += 1"
    100000 loops, best of 3: 2.95 usec per loop
    
    mquadri$ python -m "timeit" -c "from collections import Counter" "Counter([1, 2, 3, 4, 4, 5, 4, 3, 2])"
    100000 loops, best of 3: 6.5 usec per loop
    

    collections.Counter() is roughly twice as slow as the hand-written loop (6.5 usec vs 2.95 usec).

  • In Python 3:

    mquadri$ python3 -m "timeit" -c "from collections import defaultdict" "d=defaultdict(int)" "for i in [1, 2, 3, 4, 4, 5, 4, 3, 2]: d[i] += 1"
    100000 loops, best of 3: 3.1 usec per loop
    
    mquadri$ python3 -m "timeit" -c "from collections import Counter" "Counter([1, 2, 3, 4, 4, 5, 4, 3, 2])"
    100000 loops, best of 3: 5.57 usec per loop
    

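    In this run collections.Counter() is still slower than the hand-written loop (5.57 usec vs 3.1 usec, about 1.8 times). With only nine elements, the fixed cost of constructing the Counter likely dominates the measurement.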
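Note that in timeit's command-line interface, setup code is passed with -s; in the commands above the import is passed after -c, so it is probably being timed as part of the statement. The sketch below is one way to repeat the comparison from Python 3.5+ with the setup kept out of the timed code, for a tiny list and a larger one. The compare() helper, the sizes, and the run count are assumptions chosen for illustration, and the numbers will differ by machine and Python version.

```python
from collections import Counter, defaultdict
from timeit import timeit
import random
import string


def count_manually(items):
    """Count occurrences with a hand-written loop."""
    res = defaultdict(int)
    for el in items:
        res[el] += 1
    return res


def compare(size, number=200):
    """Time both approaches on `size` random characters (hypothetical helper)."""
    data = [random.choice(string.printable) for _ in range(size)]
    # Names are supplied through `globals`, so no import is part of the timed code.
    env = {"data": data, "Counter": Counter, "count_manually": count_manually}
    manual = timeit("count_manually(data)", globals=env, number=number)
    builtin = timeit("Counter(data)", globals=env, number=number)
    return manual, builtin


for size in (9, 10000):
    manual, builtin = compare(size)
    print(f"n={size}: loop {manual:.4f}s, Counter {builtin:.4f}s")
```

Comparing the two sizes helps show whether a difference comes from per-call overhead or from the counting itself.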

1 Comment

Still I think it's safe to say that it's O(n) complexity either way.
