Demonstrating the Universality of the Uniform using numpy - an issue with transformation

Question

Recently I wanted to demonstrate how to generate a continuous random variable using the universality of the Uniform, with NumPy and matplotlib. However, the generated random variable looks a little off to me, and I don't know whether that is caused by the way NumPy's random.uniform and vectorize work or whether I am doing something fundamentally wrong.

Let U ~ Unif(0, 1) and X = F^-1(U). Then X is a random variable with CDF F. (Here F^-1 denotes the quantile function; I omit the second part of the universality theorem because it is not needed.)
Let's assume that the CDF of interest is the standard logistic CDF:

F(x) = e^x / (1 + e^x)

then:

F^-1(u) = log(u / (1 - u))

According to the universality of the Uniform, to generate such a random variable it is enough to plug U ~ Unif(0, 1) into F^-1. So I wrote a very simple code snippet for that:

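import numpy as np

# Draw one million samples from Unif(0, 1)
U = np.random.uniform(0, 1, 1000000)

def logistic(u):
    # Quantile function (inverse CDF) of the standard logistic distribution
    x = np.log(u / (1 - u))
    return x

logistic_transform = np.vectorize(logistic)
X = logistic_transform(U)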

However, the result seems a little off to me. The histogram of the generated random variable X resembles a logistic distribution (whose simplified CDF I used), but the values seem to be distributed very unevenly, and I can't wrap my head around why. I would be grateful for any suggestions. Below are the histograms of U and X.

Warren Weckesser · Accepted Answer · 2022-08-15 18:23:52Z


You have a large sample size, so you can increase the number of bins in your histogram and still get a good number of samples per bin. If you are using matplotlib's hist function, try (for example) bins=400. I get this plot, which has the symmetry that I think you expected:
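
A minimal sketch of the effect, assuming only NumPy. The seed, the helper name peak_bin_center, and the use of np.histogram in place of a plot are illustrative choices, not part of the original code. Equal-width bins are spread between the sample minimum and maximum, so with only a few bins each bin is wide and the bins need not be centred on 0; with 400 bins the fullest bin should land close to 0, where the logistic density peaks.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
U = rng.uniform(0, 1, 1_000_000)
X = np.log(U / (1 - U))  # inverse logistic CDF applied to the uniform sample


def peak_bin_center(samples, bins):
    """Return (centre of the fullest bin, bin width) for an equal-width histogram."""
    counts, edges = np.histogram(samples, bins=bins)
    i = np.argmax(counts)
    return (edges[i] + edges[i + 1]) / 2, edges[1] - edges[0]


# Compare a coarse histogram with a fine one
for bins in (10, 400):
    center, width = peak_bin_center(X, bins)
    print(f"bins={bins:3d}  bin width={width:.3f}  centre of fullest bin={center:+.3f}")
```

With matplotlib, the equivalent setting would be plt.hist(X, bins=400).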

Also (and this is not relevant to the question) your function logistic will handle a NumPy array without being wrapped in vectorize, so you can save a few CPU cycles by writing X = logistic(U). And you can save a few lines of code by using scipy.special.logit instead of implementing it yourself.
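
A minimal sketch of both points, assuming SciPy is installed. The seed and the names X_direct and X_scipy are illustrative. It calls logistic directly on the array, with no vectorize wrapper, and compares the result with scipy.special.logit.

```python
import numpy as np
from scipy.special import logit


def logistic(u):
    # np.log and the arithmetic operators already work elementwise on arrays
    return np.log(u / (1 - u))


rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
U = rng.uniform(0, 1, 1_000_000)

X_direct = logistic(U)  # no np.vectorize needed
X_scipy = logit(U)      # SciPy's built-in version of the same transform

# Check that the two approaches agree up to floating-point tolerance
print(np.allclose(X_direct, X_scipy))
```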


1 Comment

Scolpe Over a year ago

I've checked everything three times but not the bins; such a silly mistake. Thank you very much! After setting the bins to 100, the histogram looks as expected. I know about scipy.special.logit, but I wanted to explain to someone how we can use U ~ Unif(0, 1) to construct any continuous r.v., provided we have the desired CDF and can calculate its inverse. The theorem, although simple, is quite hard for many people to grasp, so I wanted to use a simulation, but I hadn't realised that the problem lay in how matplotlib determines bins. Thank you once again ;)
