1

How does Python hash long numbers? I guess it takes O(1) time for 32-bit ints, but the way long integers work in Python makes me think the complexity is not O(1) for them. I've looked for answers in relevant questions, but have found none straightforward enough to make me confident. Thank you in advance.

1 Answer 1

5

The long_hash() function indeed loops and depends on the size of the integer, yes:

/* The following loop produces a C unsigned long x such that x is
   congruent to the absolute value of v modulo ULONG_MAX.  The
   resulting x is nonzero if and only if v is. */
while (--i >= 0) {
    /* Force a native long #-bits (32 or 64) circular shift */
    x = (x >> (8*SIZEOF_LONG-PyLong_SHIFT)) | (x << PyLong_SHIFT);
    x += v->ob_digit[i];
    /* If the addition above overflowed we compensate by
       incrementing.  This preserves the value modulo
       ULONG_MAX. */
    if (x < v->ob_digit[i])
        x++;
}

where i is the 'object size', e.g. the number of digits used to represent the number, where the size of a digit depends on your platform.

Sign up to request clarification or add additional context in comments.

7 Comments

As far as I know, in Python 3 all numbers are of long type, so is it safe to say that it's always O(i) to hash a number in Python 3?
@GrayFall9: yes, and yes. Python 3 int is what Python 2 called long. Note that the digit size is usually either 15 or 30 bits, so even for large numbers the number of iterations is very small. You'd need numbers over 1152921504606846976 to even go to 3 iterations on most modern systems.
@EliKorvigo: to be clear the time complexity is O(log(n)) where n is the integer. For large integers that are less than available memory, you could use gmpy2, to improve time performance
@J.F.Sebastian well, that's strange. I just tried to hash a random 500-digit integer in Python 2 and it finished instantly on my machine.
@EliKorvigo: That's because math.log2(500) is just under 9. You'll need to use far bigger numbers to see an effect. What made you think that O(logN) meant that code is going to run slow for N=500?
|

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.