
The problem statement is:

Write an efficient program to count number of 1s in binary representation of an integer.

I found a post on this problem here which outlines multiple solutions that run in O(log n) time, including Brian Kernighan's algorithm and the gcc __builtin_popcount() method.

One solution that wasn't mentioned was the Python approach bin(n).count("1"), which achieves the same effect. Does this method also run in O(log n) time?

  • It runs in O(n) time, where n is the total number of bits. Commented Sep 2, 2016 at 22:20
  • Unlikely. The efficient solutions are designed specifically for counting bits, but the count() function operates on any string and can't use those optimizations. Commented Sep 2, 2016 at 22:21
  • At least O(n) time. The .count("1") has to visit each bit. Is bin(n) also linear? Commented Sep 2, 2016 at 22:21
  • @Robᵩ But there are log(n) bits in bin(n) by definition of binary representation Commented Sep 2, 2016 at 22:22
  • 1
    @loremIpsum1771 I'd discuss your thinking with the interviewer. Generally it is your thinking process that matters, not if you memorised the most efficient method of counting bits. Commented Sep 2, 2016 at 23:51

2 Answers


You are converting the integer to a string, which means Python has to produce N '0' and '1' characters, where N is the number of bits in the integer. You then use str.count(), which must visit every character in the string to count the '1' characters.

All in all you have an O(N) algorithm, with a relatively high constant cost.

Note that this is the same complexity as the code you linked to: the integer n has log(n) bits, and the algorithm still has to take N = log(n) steps to count them. The bin(n).count('1') approach is thus equivalent in complexity, just with the added cost of producing the string in the first place.
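For reference, here is a minimal sketch of the Kernighan algorithm mentioned in the question, written in Python; it too does work proportional to the bits of n (one loop iteration per set bit):

```python
def kernighan_popcount(n):
    """Count set bits via Kernighan's trick: n & (n - 1) clears
    the lowest set bit, so the loop runs once per 1-bit."""
    count = 0
    while n:
        n &= n - 1  # drop the lowest set bit
        count += 1
    return count

print(kernighan_popcount(0b101101))  # 4
```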

At the cost of a table, you could move to processing integers per byte:

table = [0]
while len(table) < 256:
    # each pass doubles the table; entry b ends up holding the popcount of byte b
    table += [t + 1 for t in table]

length = sum(map(table.__getitem__, n.to_bytes(n.bit_length() // 8 + 1, 'little')))
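As a quick self-contained sanity check (assuming n is any non-negative integer), the per-byte lookup agrees with bin(n).count('1'):

```python
# Popcount table for all 256 byte values: table[b] == bin(b).count('1').
table = [0]
while len(table) < 256:
    table += [t + 1 for t in table]

def perbyte(n):
    # Split n into little-endian bytes and sum the per-byte counts.
    return sum(map(table.__getitem__,
                   n.to_bytes(n.bit_length() // 8 + 1, 'little')))

for n in (0, 1, 255, 123456789, 2**64 - 1):
    assert perbyte(n) == bin(n).count('1')
```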

However, because Python needs to produce a series of new objects (a bytes object and several integers), this method is never quite fast enough to beat the bin(n).count('1') method:

>>> from random import choice
>>> import timeit
>>> table = [0]
>>> while len(table) < 256:
...     table += [t + 1 for t in table]
...
>>> def perbyte(n): return sum(map(table.__getitem__, n.to_bytes(n.bit_length() // 8 + 1, 'little')))
...
>>> def strcount(n): return bin(n).count('1')
...
>>> n = int(''.join([choice('01') for _ in range(2 ** 16)]))
>>> for f in (strcount, perbyte):
...     print(f.__name__, timeit.timeit('f(n)', 'from __main__ import f, n', number=1000))
...
strcount 1.11822146497434
perbyte 1.4401431040023454

No matter the bit-length of the test number, perbyte is consistently slower by a noticeable margin.


11 Comments

But there are log(n) characters in bin(n). Thus, count will run in O(len(bin(n))) = O(log n)
@DmitryTorba: yes, I was already editing in an explanation there; the linked article is rather disingenuous in that its algorithm is just as linear; just because the integer n has log(n) bits doesn't make the bit-counting algorithm suddenly logarithmic.
So the count() method is going to be linear because the time to create and append to the string will dominate asymptotically? And also, wouldn't the algorithms in the article still be log(n) since there are log(n) bits?
@MartijnPieters In my tests, your table way is consistently significantly slower than the bin(n).count('1') way... (about factor 1.3 slower for large strings, and about factor 2.6 slower for short strings, even when I prepare the table ahead of timing).
@StefanPochmann I didn't do any performance tests myself (lack of time), so I appreciate the effort there. How large are the integers you tested with? Does switching to 'big' for endianness make any difference?

Let's say you are trying to count the number of set bits of n. On typical Python implementations, bin will compute the binary representation in O(log n) time and count will go through the string, resulting in an overall O(log n) complexity.

However, note that the complexity of an algorithm is usually expressed in terms of the "size" of the input. When you work with integers, that size corresponds to their logarithm, i.e. the number of bits. That's why the current algorithm is said to have linear complexity (the variable is m = log n, and the complexity is O(m)).
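To make the variable explicit, a small sketch using only standard Python (the specific value of n here is just an illustration):

```python
import math

n = 1_000_000
m = n.bit_length()  # the "size" of the input: number of bits needed for n

# m is floor(log2 n) + 1, so O(m) and O(log n) describe the same growth.
assert m == math.floor(math.log2(n)) + 1

# bin(n).count('1') visits each of the m digit characters once,
# so it is linear in m -- i.e. O(m) = O(log n).
print(m, bin(n).count('1'))  # 20 7
```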

4 Comments

I'm not sure I understand how it would be linear. If the input size is m (meaning there are m bits) and m equals log n (where n is the value of the number), then shouldn't the runtime be log(n)?
loremIpsum1771: The runtime is O(log(n)) and it is also O(m). When we say an algorithm is "linear", "quadratic" or "logarithmic" without specifying the variable, the default one is the size of the input. Here that is m, not n. If this algorithm were said to be logarithmic, it would mean its complexity is O(log(m)), i.e. O(log(log(n))).
Oh ok, I think I understand what you're saying. So then it would be best just to say the algorithm is linear as to avoid misinterpretation?
loremIpsum1771: provided that the person you're talking to is aware of this rule, yes (in fact, it comes from the theoretical meaning of the time complexity of an algorithm). But a lot of people make this mistake, so I think the safest way to state the complexity of your algorithm in everyday life is to spell out the meaning of your variable (e.g. "I'm counting the number of bits of n, and my algorithm has O(log n) time complexity").
