Generating random 16 digit number in Python

Question

str(uuid.uuid4().int>>64)[0:8] + str(uuid.uuid4().int>>64)[0:8]

I want to create a 16 digit random number with the above code. If I generate it in two parts, does it make it more random or can I just do the following:

str(uuid.uuid4().int>>64)[0:16]

I think it is more complicated than this. A random number generator can depend on how many times it has been called. I once saw a generator that always returned the same number on the first call, but a different number on the second call. The question is legitimate.
– Taha, Commented Aug 7, 2014 at 7:38
It depends on the implementation of uuid. I wonder why str(random.randint(0, 99999999)) + str(random.randint(0, 99999999)) isn't appropriate. — Antonis Christofides
– Antonis Christofides, Commented Aug 7, 2014 at 7:38
Even str(uuid.uuid4().int>>64)[0:16] can yield fewer than 16 digits when the shifted value has leading zeros, because str() drops them. I found such a case while running the script.
– Taha, Commented Aug 7, 2014 at 8:00

Taha · Accepted Answer · 2014-08-07 08:45:09Z

5

Be careful about which random number generator you use. I tested the distribution of the generated numbers. Here is what I found:

import uuid
import numpy as np
import matplotlib.pyplot as plt

# Generation of 100000 numbers using the [0:8] + [0:8] technique
# (int64 is requested explicitly because 16-digit values overflow a 32-bit int)
res1 = np.empty(shape=100000, dtype=np.int64)
for i in range(res1.size):
    tmp = str(uuid.uuid4().int >> 64)[0:8] + str(uuid.uuid4().int >> 64)[0:8]
    res1[i] = int(tmp)

# Generation of 100000 numbers using the [0:16] technique
res2 = np.empty(shape=100000, dtype=np.int64)
for i in range(res2.size):
    tmp = str(uuid.uuid4().int >> 64)[0:16]
    res2[i] = int(tmp)

# Histogram plot (density=True replaces the removed normed=1 argument)
n, bins, patches = plt.hist(res1, 100, density=True, histtype='stepfilled')
# Style the first histogram; patches must exist before setp is called
plt.setp(patches, 'facecolor', 'g', 'alpha', 0.75)
n, bins, patches = plt.hist(res2, 100, density=True, histtype='stepfilled')
plt.show()

[Figure: Generation of random numbers as proposed in the question]

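As the histograms show, the two methods give the same distribution. Nevertheless, the second one, [0:16], can produce a number with fewer than 16 digits: whenever the 64-bit value is below 10^15, its decimal string has at most 15 digits.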

Why not use the randint function instead?

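# Generation of random numbers using NumPy's randint function
# (integer bounds and an explicit 64-bit dtype avoid overflow where the default int is 32-bit)
res3 = np.random.randint(10**15, 10**16, 100000, dtype=np.int64)
# Plot
n, bins, patches = plt.hist(res3, 100, density=True, histtype='stepfilled', label='randint')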

[Figure: Generation of random numbers with the function randint]

This way, you are sure to get a uniform distribution of 16-digit numbers.
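For reference, the same idea also works with only the standard library. The sketch below assumes that a plain integer with exactly 16 digits (nonzero first digit) is wanted; `random_16_digits` is a hypothetical helper name, not something from the answer above.

```python
import random


def random_16_digits():
    """Return a uniformly distributed integer with exactly 16 decimal digits."""
    # randrange excludes the upper bound, so the result lies in [10**15, 10**16 - 1].
    return random.randrange(10**15, 10**16)


number = random_16_digits()
print(number, len(str(number)))
```

Drawing the whole number in one call avoids the length problem of concatenating two separately generated halves, where a short half silently shortens the result.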

answered Aug 7, 2014 at 8:45

Taha


2 Comments

MiniGunnR Over a year ago

I'm trying to lower the probability of a number repeating. Which is why I did it in two parts. Does your third method create unique numbers no matter how many times I make the call?

Taha Over a year ago

The last method doesn't guarantee that two numbers are never the same, but the probability of that happening is very low. You can use the function np.unique to eliminate duplicated numbers. I ran a test generating 100,000,000 numbers, and none of them was duplicated.
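If duplicates must be ruled out rather than just made unlikely, one option is to keep drawing until enough distinct values have been collected. This is a minimal sketch that uses a Python set instead of np.unique; the helper name `unique_16_digit_numbers` is hypothetical.

```python
import random


def unique_16_digit_numbers(count):
    """Return `count` distinct 16-digit integers."""
    seen = set()
    # A set ignores repeated values, so keep drawing until it holds `count` items.
    while len(seen) < count:
        seen.add(random.randrange(10**15, 10**16))
    return list(seen)


numbers = unique_16_digit_numbers(1000)
print(len(numbers), len(set(numbers)))
```

As a rough estimate, n draws from the 9 × 10^15 possible values give about n² / (2 × 9 × 10^15) colliding pairs. For n = 100,000,000 that is about 0.56, so a run without duplicates is plausible but not guaranteed. This sketch only covers a single run; uniqueness across separate runs would need the already issued numbers to be stored somewhere.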

Jason S · Accepted Answer · 2014-08-07 09:30:27Z

3

The uuid4 implementation in Python tries to use a system-provided uuid generator if available, then os.urandom() (so-called "true" randomness), then random.randrange() (which uses a PRNG) if neither is available. In the first two cases the randomness "should" be about as random as you can ask for from your computer. In the PRNG case each random byte is generated separately, so concatenating two halves really shouldn't help.

We can empirically check how even the distribution of digits is using code like this:

import uuid
digits = [0] * 10
for i in range(100000):
    x = str(uuid.uuid4().int)[-16:]
    for d in x:
        digits[int(d)] += 1
print(digits)

Note that I changed your code, removing >>64 because it can make the number too short and changing the slice to take the last 16 digits instead. The distribution of digits is pretty even.

[159606, 159916, 160188, 160254, 159815, 159680, 159503, 160015, 160572, 160451]

Now, let's see what changing to str(uuid.uuid4().int)[-8:] + str(uuid.uuid4().int)[-8:] does from a distribution standpoint:

[159518, 160205, 159843, 159997, 160493, 160187, 160626, 159665, 159429, 160037]

Basically nothing.

Incidentally, taking from the start of the string without the bit shift:

[151777, 184443, 184347, 166726, 151925, 152038, 152178, 152192, 151873, 152501]

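There is a bias toward 1s, 2s, and (to a lesser degree) 3s. The 6 nonrandom bits of a uuid4 may contribute slightly, but most of the effect comes from the leading digit: 2^128 is about 3.4 × 10^38, so a 39-digit value can only start with 1, 2, or 3.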
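The leading-digit effect can be checked without sampling. The sketch below assumes a value drawn uniformly from 1 to 2**128 - 1 (ignoring the fixed version and variant bits) and counts exactly how many such values start with each digit; `leading_digit_share` is a hypothetical helper.

```python
from fractions import Fraction

LIMIT = 2**128  # uuid4().int is always below this bound


def leading_digit_share(digit, limit=LIMIT):
    """Share of integers in [1, limit) whose decimal form starts with `digit`."""
    count = 0
    power = 1
    while digit * power < limit:
        # Values with this leading digit at this magnitude:
        # [digit * power, (digit + 1) * power), clipped to the limit.
        low = digit * power
        high = min((digit + 1) * power, limit)
        count += high - low
        power *= 10
    return Fraction(count, limit - 1)


for d in range(1, 10):
    print(d, round(float(leading_digit_share(d)), 3))
```

Under that assumption, 1 and 2 each lead roughly a third of all values and 3 leads about 15%, while every other digit leads only about 3%. That is consistent with the extra counts for 1, 2, and 3 in the output above.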

edited Aug 7, 2014 at 9:30

answered Aug 7, 2014 at 8:27

Jason S


3 Comments

coder.in.me Over a year ago

Good one. A couple of points: (a) Out of the 128 bits, two sub-fields are always fixed: the 4-bit version number (value 4) and the 2-bit variant field at the top of clock_seq_hi_and_reserved (binary 10). The former sits in the first 64 bits and is followed by the last 12 bits of that half. This may be the reason why there is a bias towards 1s and 2s. The latter (only 2 bits) sits at the top of the last 64 bits and is followed by 62 random bits, so there is less of a bias here. It would be interesting to see in which positions the 1s and 2s appear to clarify the matter.

coder.in.me Over a year ago

(b) Except for these two sub-fields, everything else is random. This means potentially the random ones can all be zeros in very very very rare cases! This implies that we definitely need to pad with leading zero digits to get to 16-digit length as needed. We shouldn't pad with anything else for danger of introducing biases while padding.

Antonis Christofides Over a year ago

Not directly related to the question, but I think os.urandom() is not "true" randomness; it uses /dev/urandom on unix, which is non-blocking; i.e. it will give you stuff even if it doesn't have entropy. So I have my doubts as to whether os.urandom() is really appropriate for cryptographic use, as the Python manual says. When I want a strong cryptographic key, what I do is I manually open /dev/random and read from there.
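Two points from these comments, padding with leading zeros and drawing from the operating system's random source, can be combined using the standard-library `secrets` module (Python 3.6 and later). This is a sketch under the assumption that a 16-character digit string, possibly starting with 0, is acceptable; `random_digit_string` is a hypothetical helper.

```python
import secrets


def random_digit_string(length=16):
    """Return `length` decimal digits as a string, zero-padded on the left."""
    # randbelow draws uniformly from [0, 10**length) using the OS random source.
    value = secrets.randbelow(10**length)
    # zfill pads only with leading zeros, so every digit string is equally likely.
    return str(value).zfill(length)


token = random_digit_string()
print(token, len(token))
```

Because the range is an exact power of ten, no digit position is favored, which sidesteps the slicing bias discussed in the answer.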

MariusSiuram · Accepted Answer · 2014-08-07 08:43:01Z

1

Looking at only your title, I should ask, why not:

from random import randint
s = ''
for i in range(16):
    s = s + str(randint(0,9))

You have not explained the reason to use UUID, and to me, it seems quite odd.
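The same digit-by-digit idea can be written without repeated string concatenation. This is a minimal sketch using `random.choices` (available since Python 3.6); unlike the integer-based approaches, the first digit may be 0 here.

```python
import random
import string

# Draw 16 digits independently and join them into one string.
s = ''.join(random.choices(string.digits, k=16))
print(s)
```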

answered Aug 7, 2014 at 8:43

MariusSiuram


1 Comment

coder.in.me Over a year ago

Indeed. I would have done this myself because I was not aware of uuid until I read this post.
