I know that random is not truly random, but my test has a massively different result than the calculated probability.

Please find the error if there is one.

The code below generates random strings of length 3 from 62 possible characters and reports at what point a string repeats. The probability I calculated of repeating is 238,328, or 62^3.

When I run my code, it repeats after about 615 iterations on average.

    import random
    import string
    
    def get_random_string(length):
        choices = string.ascii_letters + string.digits
        result_str = ''.join(random.choice(choices) for i in range(length))
        return result_str
    
    def main():
        global repeated
        results = []
        for i in range(100000):
            tmpstr = get_random_string(3)
            if tmpstr in results:
                print(f"it took {i} times to repeat.")
                repeated.append(i)
                break
            else:
                results.append(tmpstr)
    
    if __name__ == "__main__":
        print(len(string.ascii_letters + string.digits))
        repeated = []
        for i in range(10):
            main()
        print(sum(repeated)/len(repeated))
        print("done")
  • 4
    en.wikipedia.org/wiki/Birthday_problem Commented Sep 5, 2022 at 14:06
  • 3
    "The probability I calculated of repeating is 238,328 or 62^3" - nope, a random number generator is not expected to generate every single number before generating a number again. That's called shuffling. Commented Sep 5, 2022 at 14:13
  • 2
    Thanks @tevemadar you are right, my math is wrong. Commented Sep 5, 2022 at 14:16
  • 1
    "I want to add a random string to a filename to make it unique" - add a timestamp. If your code may run on more machines at the same time, add a timestamp and a machine id. Commented Sep 5, 2022 at 14:16
  • 4
    Using Ramanujan's approximation from en.wikipedia.org/wiki/… , the average is 612.5 Commented Sep 5, 2022 at 14:27

1 Answer

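As tevemadar suggested, my math is wrong: 62^3 is the number of possible strings, not the number of draws expected before one repeats. The birthday problem (en.wikipedia.org/wiki/Birthday_problem) explains why a repeat shows up much sooner.

Paul Hankin calculated the average to be 612.5 using Ramanujan's approximation, which lines up with my tests.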
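
The 612.5 figure can be checked numerically. The following is a minimal sketch, not code from the comments: the function names `expected_draws_until_repeat` and `ramanujan_estimate` are made up for this illustration, and the estimate keeps only the first two terms, sqrt(pi*n/2) + 2/3, of the expansion described on the Wikipedia page. It assumes every one of the 62^3 strings is equally likely.

```python
import math


def expected_draws_until_repeat(n):
    """Expected number of uniform draws from n values until the first repeat.

    Uses E[T] = sum over k >= 0 of P(T > k), where P(T > k) is the
    probability that the first k draws are all distinct.
    """
    total = 0.0
    all_distinct = 1.0  # P(first k draws are all distinct), starting at k = 0
    for k in range(n + 1):
        total += all_distinct
        # The next draw must avoid the k values already seen.
        all_distinct *= 1 - k / n
        if all_distinct < 1e-18:  # remaining terms are negligible
            break
    return total


def ramanujan_estimate(n):
    """Two-term approximation (assumed truncation): sqrt(pi*n/2) + 2/3."""
    return math.sqrt(math.pi * n / 2) + 2 / 3


if __name__ == "__main__":
    n = 62 ** 3  # number of possible 3-character strings
    print(expected_draws_until_repeat(n))
    print(ramanujan_estimate(n))
```

Both values should come out near 612.5 for n = 62^3. Note that the loop index `i` printed by the code in the question counts the strings generated before the repeating one, so it is one less than the draw count computed here.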