Recently I wanted to demonstrate generating a continuous random variable using the universality of the Uniform, with a combination of NumPy and matplotlib. However, the generated random variable looks a little off to me, and I don't know whether that is caused by how NumPy's random.uniform and vectorize work or whether I am doing something fundamentally wrong here.

  • Let U ~ Unif(0, 1) and X = F^-1(U). Then X is a random variable with CDF F (note that F^-1 here denotes the quantile function; I omit the second part of the universality because it is not needed here).

  • Let's assume that the CDF of interest to me is:

F(x) = e^x / (1 + e^x)

then:

F^-1(u) = log(u / (1 - u))

  • According to the universality of the uniform, to generate the random variable it is enough to plug U ~ Unif(0, 1) into F^-1. I've therefore written a very simple code snippet for that:
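import numpy as np

# One million draws from Unif(0, 1)
U = np.random.uniform(0, 1, 1000000)

def logistic(u):
    # Quantile function F^-1(u) of the standard logistic distribution
    x = np.log(u / (1 - u))
    return x

logistic_transform = np.vectorize(logistic)
X = logistic_transform(U)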

However, the result looks a little off to me. The histogram of the generated random variable X resembles a logistic distribution (whose simplified CDF I used), but the r.v. seems to be distributed very unevenly, and I can't wrap my head around why. I would be grateful for any suggestions. Below are the histograms of U and X.

Histogram of X

Histogram of U and X

1 Answer

You have a large sample size, so you can increase the number of bins in your histogram and still get a good number of samples per bin. If you are using matplotlib's hist function, try (for example) bins=400. I get this plot, which has the symmetry that I think you expected:

histogram
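
To see why the bin count matters, here is a minimal sketch that uses np.histogram instead of a plot so the numbers can be inspected directly. The seed, the variable names, and the use of NumPy's default_rng are my own choices for illustration; 10 is matplotlib's default bin count, and the same bins value can be passed to matplotlib's hist.

```python
import numpy as np

# Seed chosen arbitrarily so the run is reproducible (an assumption, not from the question).
rng = np.random.default_rng(seed=0)
U = rng.uniform(0.0, 1.0, 1_000_000)

# Inverse logistic CDF applied to the uniform sample.
X = np.log(U / (1.0 - U))

# Coarse binning: 10 equal-width bins spanning [X.min(), X.max()].
coarse_counts, coarse_edges = np.histogram(X, bins=10)
coarse_width = coarse_edges[1] - coarse_edges[0]

# Finer binning, as suggested above.
fine_counts, fine_edges = np.histogram(X, bins=400)
fine_width = fine_edges[1] - fine_edges[0]

print(f"smallest and largest sample: {X.min():.2f}, {X.max():.2f}")
print(f"bin width with 10 bins:  {coarse_width:.3f}")
print(f"bin width with 400 bins: {fine_width:.3f}")
print(f"sample mean: {X.mean():.4f}")
```

With only 10 bins, each bin is several units wide, and the edges depend on the most extreme values in the sample, so they are generally not placed symmetrically around 0. That alone can make a symmetric sample look lopsided.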


Also (and this is not relevant to the question) your function logistic will handle a NumPy array without being wrapped in vectorize, so you can save a few CPU cycles by writing X = logistic(U). You can also save a few lines of code by using scipy.special.logit instead of implementing it yourself.
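
A small sketch of that point, assuming the same logistic function as in the question. The seed, the sample size, and the names X_direct and X_wrapped are hypothetical choices for illustration, and the SciPy comparison is skipped if SciPy is not installed.

```python
import numpy as np

def logistic(u):
    # Quantile function of the standard logistic distribution.
    # np.log and the arithmetic operators already work elementwise on arrays.
    return np.log(u / (1 - u))

rng = np.random.default_rng(seed=1)  # arbitrary seed for reproducibility
U = rng.uniform(0.0, 1.0, 100_000)

X_direct = logistic(U)                 # plain call on the whole array
X_wrapped = np.vectorize(logistic)(U)  # calls logistic once per element

# Both approaches should agree up to floating-point tolerance.
print(np.allclose(X_direct, X_wrapped))

try:
    from scipy.special import logit
    # logit(p) computes log(p / (1 - p)), the same transform.
    print(np.allclose(X_direct, logit(U)))
except ImportError:
    pass  # SciPy is optional for this sketch
```

np.vectorize is a convenience wrapper around a Python-level loop, so the direct call is the one that actually benefits from NumPy's array operations.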

1 Comment

I've checked everything three times but not the bins; such a silly mistake. Thank you very much! After setting the bins to 100, the histogram looks as expected. I know about scipy.special.logit, but I wanted to explain to someone how we can use U ~ Unif(0, 1) to construct any continuous r.v., as long as we have the desired CDF and can calculate its inverse. The theorem, although simple, is quite hard to grasp for many people, so I wanted to use simulation, but I had not realised that the problem lay in how matplotlib determines bins. Thank you once again ;)
