2
str(uuid.uuid4().int>>64)[0:8] + str(uuid.uuid4().int>>64)[0:8]

I want to create a 16 digit random number with the above code. If I generate it in two parts, does it make it more random or can I just do the following:

str(uuid.uuid4().int>>64)[0:16]
6
  • random + random = random. I think it's the same. Commented Aug 7, 2014 at 7:26
  • I think it is more complicated than this. A random generating function can take the number of calls into account. I once saw a generator that always returned the same number on the first call but a different number on the second call. The question is legitimate. Commented Aug 7, 2014 at 7:38
  • It depends on the implementation of uuid. I wonder why str(random.randint(0, 99999999)) + str(random.randint(0, 99999999)) isn't appropriate. Commented Aug 7, 2014 at 7:38
  • That could be less than 16 digits. Commented Aug 7, 2014 at 7:50
  • Even str(uuid.uuid4().int>>64)[0:16] can be less than 16 digits if the first is a zero. And I found such a case while running the script. Commented Aug 7, 2014 at 8:00
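The point made in the last comment can be checked with a little arithmetic. This is only a sketch, assuming the shifted value behaves like a uniformly random 64-bit integer: its decimal string has fewer than 16 characters whenever the value is below 10^15. The helper name `short_probability` is made up for illustration.

```python
def short_probability(digits=16, bits=64):
    """Chance that a uniform integer in [0, 2**bits) has fewer than
    `digits` decimal digits, i.e. that it is below 10**(digits - 1)."""
    return 10 ** (digits - 1) / 2 ** bits

# Roughly how often str(uuid.uuid4().int >> 64)[0:16] comes out too short,
# under the uniformity assumption stated above.
print(short_probability())
```

The probability is small but not negligible, which fits the comment that such a case turned up while running the script.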

3 Answers

5

I invite you to be careful about the random number generator you are using. I made a test of the generated numbers distribution. Here is what I found:

import uuid
import numpy as np
import matplotlib.pyplot as plt

# Generation of 100000 numbers using the [0:8] + [0:8] technique
# (int64 is requested explicitly so that 16-digit values cannot overflow)
res1 = np.empty(shape=100000, dtype=np.int64)
for i in range(res1.size):
    tmp = str(uuid.uuid4().int>>64)[0:8] + str(uuid.uuid4().int>>64)[0:8]
    res1[i] = int(tmp)

# Generation of 100000 numbers using the [0:16] technique
res2 = np.empty(shape=100000, dtype=np.int64)
for i in range(res2.size):
    tmp = str(uuid.uuid4().int>>64)[0:16]
    res2[i] = int(tmp)

# Histogram plot (density=True normalizes each histogram)
n, bins, patches = plt.hist(res1, 100, density=True, histtype='stepfilled')
# Style the first histogram only after plt.hist has created its patches
plt.setp(patches, 'facecolor', 'g', 'alpha', 0.75)
n, bins, patches = plt.hist(res2, 100, density=True, histtype='stepfilled')
plt.show()

[Figure: histogram of the random numbers generated as proposed in the question]

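As the histograms show, the two methods give the same distribution. Nevertheless, the second one, [0:16], can occasionally return fewer than 16 digits: when the shifted 64-bit value is below 10^15, its decimal string is shorter than 16 characters.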

Why not use NumPy's randint function instead?

# Generation of random numbers using NumPy's `randint` function
# (integer bounds; the upper bound 10**16 is exclusive)
res3 = np.random.randint(10**15, 10**16, 100000, dtype=np.int64)
# Plot
n, bins, patches = plt.hist(res3, 100, density=True, histtype='stepfilled', label='randint')

[Figure: histogram of the random numbers generated with the function randint]

This way, you are sure to get a uniform distribution of 16-digit numbers.


2 Comments

I'm trying to lower the probability of a number repeating. Which is why I did it in two parts. Does your third method create unique numbers no matter how many times I make the call?
The last method doesn't guarantee that two numbers are never the same, but the probability of that happening is very low. You can use the function np.unique to eliminate duplicated numbers. I ran a test generating 100,000,000 numbers, and none of them was duplicated.
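How likely a repeat is depends strongly on how many numbers are drawn. The sketch below uses the standard birthday approximation, assuming the draws are uniform and independent over the 9 × 10^15 possible 16-digit values; the function names are made up for illustration, and the approximation is only reasonable while the number of draws is far smaller than the number of possible values.

```python
import math

def expected_duplicate_pairs(draws, space):
    """Expected number of colliding pairs among `draws` uniform samples
    taken from `space` equally likely values."""
    return draws * (draws - 1) / (2 * space)

def prob_no_duplicates(draws, space):
    """Birthday approximation: P(no repeat) is about exp(-expected pairs)."""
    return math.exp(-expected_duplicate_pairs(draws, space))

space = 10**16 - 10**15  # count of integers from 10**15 to 10**16 - 1
for draws in (10**4, 10**6, 10**8):
    print(draws, prob_no_duplicates(draws, space))
```

For a few thousand draws a repeat is extremely unlikely, but at 100,000,000 draws the estimate is no longer negligible, so a uniqueness check such as np.unique is still worth keeping if duplicates must never occur.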
3

The uuid4 implementation in Python tries to use a system-provided uuid generator if available, then os.urandom() (so-called "true" randomness), then random.randrange() (which uses a PRNG) if neither is available. In the first two cases the randomness "should" be about as random as you can ask for from your computer. In the PRNG case each random byte is generated separately, so concatenating two halves really shouldn't help.

We can empirically check how even the distribution of digits is using code like this:

import uuid
digits = [0] * 10
for i in range(100000):
    x = str(uuid.uuid4().int)[-16:]
    for d in x:
        digits[int(d)] += 1
print(digits)

Note that I changed your code, removing >>64 because it can make the number too short and changing the slice to take the last 16 digits instead. The distribution of digits is pretty even.

[159606, 159916, 160188, 160254, 159815, 159680, 159503, 160015, 160572, 160451]

Now, let's see what changing to str(uuid.uuid4().int)[-8:] + str(uuid.uuid4().int)[-8:] does from a distribution standpoint:

[159518, 160205, 159843, 159997, 160493, 160187, 160626, 159665, 159429, 160037]

Basically nothing.

Incidentally, taking from the start of the string without the bit shift:

[151777, 184443, 184347, 166726, 151925, 152038, 152178, 152192, 151873, 152501]

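There is a bias toward 1s and 2s (and, to a lesser extent, 3s). Most of it comes from the leading digit: a 128-bit integer is below 2^128 ≈ 3.4 × 10^38, so its first decimal digit is usually 1, 2, or 3. The 6 non-random bits of a uuid4 play little part in this.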
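Part of this bias can be reproduced without generating any uuid at all. The sketch below assumes an integer drawn uniformly from [0, 2^128), which ignores the fixed uuid4 bits, and computes how often each decimal digit would appear in the leading position. The helper `leading_digit_probs` is made up for illustration.

```python
def leading_digit_probs(limit):
    """P(first decimal digit == d) for a uniform integer in [0, limit)."""
    counts = {d: 0 for d in range(1, 10)}
    power = 1
    while power < limit:
        for d in range(1, 10):
            # Integers whose leading digit is d at this magnitude
            lo, hi = d * power, (d + 1) * power
            counts[d] += max(0, min(hi, limit) - lo)
        power *= 10
    return {d: c / limit for d, c in counts.items()}

for digit, p in leading_digit_probs(2**128).items():
    print(digit, round(p, 4))
```

Under this assumption the digits 1 and 2 each lead about a third of the time, 3 leads noticeably more often than 4 to 9, and 0 never leads, which is the same shape as the counts above.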

3 Comments

Good one. Couple of points: (a) Out of the 128-bits, two sub-fields are always fixed: 4-bit version number (value=4), 2-bit clock seq reserved (value=1). The former occurs in the first 64 bits, with trailing 12 bits. This may be the reason why there is a bias towards 1s and 2s. The latter (only 2 bits) occurs in the last 64 bits, with trailing 56 bits, so there is less of a bias here. It will be interesting to see in which positions 1s and 2s appear to clarify the matter.
(b) Except for these two sub-fields, everything else is random. This means potentially the random ones can all be zeros in very very very rare cases! This implies that we definitely need to pad with leading zero digits to get to 16-digit length as needed. We shouldn't pad with anything else for danger of introducing biases while padding.
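The zero-padding suggested in (b) could look like the sketch below. It keeps the last 16 decimal digits, as the answer does, and pads with leading zeros so the result is always exactly 16 characters long. The helper names are made up for illustration, and the variant based on `secrets.randbelow` is an alternative that is not part of the original discussion.

```python
import secrets
import uuid

def pad_to_16(n):
    """Keep the last 16 decimal digits of n, left-padded with zeros."""
    return f"{n % 10**16:016d}"

def random_16_digits_uuid():
    # Last 16 digits of a uuid4, as in the answer, but never shorter than 16
    return pad_to_16(uuid.uuid4().int)

def random_16_digits_secrets():
    # Alternative: draw directly from [0, 10**16) using the OS random source
    return pad_to_16(secrets.randbelow(10**16))

print(random_16_digits_uuid())
print(random_16_digits_secrets())
```

Both variants can return strings that start with 0. If the first digit must be non-zero, the range has to be restricted instead of padded.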
Not directly related to the question, but I think os.urandom() is not "true" randomness; it uses /dev/urandom on unix, which is non-blocking; i.e. it will give you stuff even if it doesn't have entropy. So I have my doubts as to whether os.urandom() is really appropriate for cryptographic use, as the Python manual says. When I want a strong cryptographic key, what I do is I manually open /dev/random and read from there.
1

Looking at only your title, I should ask, why not:

from random import randint
s = ''
for i in range(16):
    s = s + str(randint(0,9))

You have not explained the reason to use UUID, and to me, it seems quite odd.

1 Comment

Indeed. I would have done this myself because I was not aware of uuid until I read this post.
