How to calculate cumulative normal distribution?

Question

I am looking for a function in Numpy or Scipy (or any rigorous Python library) that will give me the cumulative normal distribution function in Python.

Alex Reynolds · Accepted Answer · 2025-01-29 21:07:24Z

175

Here's an example:

>>> from scipy.stats import norm
>>> norm.cdf(1.96)
0.9750021048517795
>>> norm.cdf(-1.96)
0.024997895148220435

In other words, approximately 95% of the standard normal distribution lies within 1.96 (roughly two) standard deviations of its mean of zero (0.975 - 0.025 = 0.95).

If you need the inverse CDF:

>>> norm.ppf(norm.cdf(1.96))
array(1.9599999999999991)
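
The same calls also work on whole arrays. The following is a minimal sketch, assuming the illustrative sample values shown (they are not from the answer); it passes a NumPy array to norm.cdf and uses the optional loc (mean) and scale (standard deviation) arguments:

```python
import numpy as np
from scipy.stats import norm

# Illustrative inputs (assumed for this example).
x = np.array([-1.96, 0.0, 1.96])

# Standard normal CDF, evaluated element-wise.
p = norm.cdf(x)

# Non-standard normal: loc is the mean, scale is the standard deviation.
# Shifting and scaling the inputs the same way gives the same probabilities.
p_shifted = norm.cdf(10.0 + 2.0 * x, loc=10.0, scale=2.0)

# ppf is the inverse of cdf, so this recovers x up to rounding error.
x_back = norm.ppf(p)
```

Because ppf inverts cdf only up to floating-point rounding, compare results with a tolerance rather than with ==.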

edited Jan 29 at 21:07

answered Apr 30, 2009 at 22:24

Alex Reynolds

97.3k59 gold badges251 silver badges356 bronze badges


7 Comments

Irvan Over a year ago

Also, you can specify the mean (loc) and variance (scale) as parameters. e.g, d = norm(loc=10.0, scale=2.0); d.cdf(12.0); Details here: docs.scipy.org/doc/scipy-0.14.0/reference/generated/…

qkhhly Over a year ago

@Irvan, the scale parameter is actually the standard deviation, NOT the variance.

WestCoastProjects Over a year ago

Why does scipy name these as loc and scale ? I used the help(norm.ppf) but then what the heck are loc and scale - need a help for the help..

Michael Ohlrogge Over a year ago

@javadba - location and scale are more general terms in statistics that are used to parameterize a wide range of distributions. For the normal distribution, they line up with mean and sd, but not so for other distributions.

WestCoastProjects Over a year ago

@MichaelOhlrogge . Thx! Here is a page from NIST explaining further itl.nist.gov/div898/handbook/eda/section3/eda364.htm


Xavier Guihot · Accepted Answer · 2019-02-28 19:50:14Z

62

Starting with Python 3.8, the standard library provides the NormalDist object as part of the statistics module.

It can be used to get the cumulative distribution function (cdf - probability that a random sample X will be less than or equal to x) for a given mean (mu) and standard deviation (sigma):

from statistics import NormalDist

NormalDist(mu=0, sigma=1).cdf(1.96)
# 0.9750021048517796

Which can be simplified for the standard normal distribution (mu = 0 and sigma = 1):

NormalDist().cdf(1.96)
# 0.9750021048517796

NormalDist().cdf(-1.96)
# 0.024997895148220428
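
As a rough sketch of a few related NormalDist operations (the mean of 10 and standard deviation of 2 are hypothetical values chosen only for illustration): an interval probability as a difference of two CDF values, the inverse CDF via inv_cdf, and evaluation at several points. NormalDist.cdf takes one number at a time, so the last step uses a list comprehension.

```python
from statistics import NormalDist

# Hypothetical distribution parameters, chosen only for illustration.
dist = NormalDist(mu=10.0, sigma=2.0)

# P(8 < X <= 12): difference of the CDF at the two endpoints.
p_between = dist.cdf(12.0) - dist.cdf(8.0)

# inv_cdf is the inverse of cdf (the quantile function).
x_975 = NormalDist().inv_cdf(0.975)

# cdf accepts a single number, so loop over several points explicitly.
points = [-1.0, 0.0, 1.0]
cdf_values = [NormalDist().cdf(pt) for pt in points]
```

For large arrays, an array-aware implementation such as scipy.stats.norm.cdf is likely the more convenient choice.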

answered Feb 28, 2019 at 19:50

Xavier Guihot

62.8k26 gold badges320 silver badges202 bronze badges

4 Comments

dcl Over a year ago

Based on some quick checks, this is significantly faster than norm.cdf from scipy.stats and a fair bit faster than both scipy and math implementations of erf.

hasManyStupidQuestions Over a year ago

Does this vectorize? Or should someone use the scipy implementation if they need to compute the CDF evaluated at all points in an array?

Juozas Over a year ago

Awesome. Maybe you know how to get inverse (normsinv)? Edit: OK, it is inv_cdf(). Thank you!

zeit Dec 5, 2024 at 12:09

A quick look at NormalDist().cdf() shows that it depends on erf(). Ref: github.com/python/cpython/blob/3.8/Lib/statistics.py#L941

gibbone · Accepted Answer · 2018-01-07 20:01:15Z

61

It may be too late to answer the question, but since Google still leads people here, I decided to write my solution here.

Since Python 2.7, the math library has included the error function math.erf(x).

The erf() function can be used to compute traditional statistical functions such as the cumulative standard normal distribution:

from math import erf, sqrt

def phi(x):
    """Cumulative distribution function for the standard normal distribution."""
    return (1.0 + erf(x / sqrt(2.0))) / 2.0

Ref:

https://docs.python.org/2/library/math.html

https://docs.python.org/3/library/math.html

How are the Error Function and Standard Normal distribution function related?

edited Jan 7, 2018 at 20:01

gibbone

2,74023 silver badges22 bronze badges

answered Mar 26, 2015 at 7:40

WTIFS

1,0309 silver badges13 bronze badges

2 Comments

Hannes Landeholm Over a year ago

This was exactly what I was looking for. If anyone else wonders how this can be used to calculate the "percentage of data that lies within one standard deviation", well: 1 - (1 - phi(1)) * 2 = 0.6827 ("68% of data within 1 standard deviation")

Bernhard Barker Over a year ago

For a general normal distribution, it would be def phi(x, mu, sigma): return (1 + erf((x - mu) / sigma / sqrt(2))) / 2.
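
The two comments above can be combined into one short sketch. The default arguments and the helper name within are assumptions made for this example, not part of the answer:

```python
from math import erf, sqrt

def phi(x, mu=0.0, sigma=1.0):
    """CDF of a normal distribution with mean mu and standard deviation sigma."""
    return (1.0 + erf((x - mu) / (sigma * sqrt(2.0)))) / 2.0

def within(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return phi(k) - phi(-k)

one_sigma = within(1)   # about 0.6827
two_sigma = within(2)   # about 0.9545
```

By symmetry, phi(k) - phi(-k) equals 1 - (1 - phi(k)) * 2, the expression used in the first comment.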

Unknown · Accepted Answer · 2009-04-30 22:23:28Z

21

Adapted from here http://mail.python.org/pipermail/python-list/2000-June/039873.html

from math import exp
def erfcc(x):
    """Complementary error function."""
    z = abs(x)
    t = 1. / (1. + 0.5*z)
    r = t * exp(-z*z-1.26551223+t*(1.00002368+t*(.37409196+
        t*(.09678418+t*(-.18628806+t*(.27886807+
        t*(-1.13520398+t*(1.48851587+t*(-.82215223+
        t*.17087277)))))))))
    if (x >= 0.):
        return r
    else:
        return 2. - r

def ncdf(x):
    return 1. - 0.5*erfcc(x/(2**0.5))
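
To get a feel for how accurate this approximation is, here is a small sketch that compares it with math.erfc from the standard library. The coefficients are copied from the code above and evaluated with Horner's rule; the grid on [-5, 5] is an arbitrary choice for this example. The coefficients appear to match the erfcc routine in Numerical Recipes, which describes it as a Chebyshev fit with a fractional error below 1.2e-7.

```python
from math import erfc, exp

# Polynomial coefficients from the answer, lowest power of t first.
COEFFS = [-1.26551223, 1.00002368, 0.37409196, 0.09678418, -0.18628806,
          0.27886807, -1.13520398, 1.48851587, -0.82215223, 0.17087277]

def erfcc(x):
    """Approximate complementary error function (same formula as above)."""
    z = abs(x)
    t = 1.0 / (1.0 + 0.5 * z)
    # Horner evaluation of the polynomial in t.
    poly = 0.0
    for c in reversed(COEFFS):
        poly = poly * t + c
    r = t * exp(-z * z + poly)
    return r if x >= 0.0 else 2.0 - r

# Largest absolute difference from math.erfc over a grid on [-5, 5].
grid = [i / 100.0 for i in range(-500, 501)]
max_abs_err = max(abs(erfcc(x) - erfc(x)) for x in grid)
```

On this grid the difference stays well below 1e-6, which is why the comment about simply using the standard library's error function applies in most cases.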

answered Apr 30, 2009 at 22:23

Unknown

47k29 gold badges142 silver badges184 bronze badges

3 Comments

Marc Over a year ago

Since the std lib implements math.erf(), there is no need for a sep implementation.

TmSmth Over a year ago

i was not able to find an answer, where do those numbers come from ?

tbrugere Over a year ago

@TmSmth If I had to guess, this looks like some kind of approximation of what is inside the exponential, so you can probably calculate them with some kind of Taylor expansion after fiddling with your function a bit (changing variables, then say r = t * exp(-z**2 - f(t)) and do a Taylor expansion of f, which can be found numerically).

Cerin · Accepted Answer · 2010-08-19 20:14:08Z

19

To build upon Unknown's example, the Python equivalent of the function normdist() implemented in a lot of libraries would be:

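from math import exp, pi, sqrt

# Requires erfcc() from Unknown's answer.
def normcdf(x, mu, sigma):
    t = x - mu
    y = 0.5 * erfcc(-t / (sigma * sqrt(2.0)))
    if y > 1.0:
        y = 1.0
    return y

def normpdf(x, mu, sigma):
    u = (x - mu) / abs(sigma)
    y = (1 / (sqrt(2 * pi) * abs(sigma))) * exp(-u * u / 2)
    return y

def normdist(x, mu, sigma, f):
    # f selects the cumulative distribution (true) or the density (false)
    if f:
        y = normcdf(x, mu, sigma)
    else:
        y = normpdf(x, mu, sigma)
    return y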
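
On Python 3.8 or later, a similar normdist() can be sketched on top of statistics.NormalDist without the hand-written erfcc. This is an alternative under that assumption, not the code from this answer, and the parameter name cumulative is chosen here for readability:

```python
from statistics import NormalDist

def normdist(x, mu, sigma, cumulative):
    """Return the normal CDF at x if cumulative is true, else the PDF."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(x) if cumulative else dist.pdf(x)

# Illustrative call with a hypothetical mean of 10 and standard deviation of 2.
cdf_at_mean = normdist(10.0, 10.0, 2.0, True)    # 0.5
pdf_at_mean = normdist(10.0, 10.0, 2.0, False)   # 1 / (2 * sqrt(2 * pi))
```

One difference to keep in mind: the normpdf above takes abs(sigma), whereas NormalDist raises an error for a negative sigma.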

edited Aug 19, 2010 at 20:14

answered Aug 19, 2010 at 19:35

Cerin

65.6k106 gold badges349 silver badges562 bronze badges

Comments

Salvador Dali · Accepted Answer · 2015-11-20 10:24:57Z

16

Alex's answer shows a solution for the standard normal distribution (mean = 0, standard deviation = 1). If you have a normal distribution with mean m and standard deviation s (which is sqrt(var)), you can calculate:

from scipy.stats import norm

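# Example values (illustrative only): point val, mean m, standard deviation s
val, m, s = 12.0, 10.0, 2.0
v1, v2 = 8.0, 12.0

# P(X < val)
print(norm.cdf(val, m, s))

# P(X > val)
print(1 - norm.cdf(val, m, s))

# P(v1 < X < v2)
print(norm.cdf(v2, m, s) - norm.cdf(v1, m, s))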

Read more about cdf here and scipy implementation of normal distribution with many formulas here.
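
For the upper tail, scipy.stats.norm also offers the survival function sf, which computes P(X > val) directly. The sketch below uses illustrative values (assumed, not from the answer) placed far in the tail, where subtracting the CDF from 1 loses all precision:

```python
from scipy.stats import norm

# Illustrative values (assumed): a point 10 standard deviations above the mean.
val, m, s = 30.0, 10.0, 2.0

# Subtracting from 1 loses the tail: the CDF rounds to exactly 1.0 here.
naive_upper = 1 - norm.cdf(val, m, s)

# The survival function computes P(X > val) directly and stays positive.
upper = norm.sf(val, m, s)
```

For values near the mean the two approaches agree closely, so the difference only matters for very small tail probabilities.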

answered Nov 20, 2015 at 10:24

Salvador Dali

224k151 gold badges726 silver badges766 bronze badges

Comments

David Miller · Accepted Answer · 2019-02-06 22:17:18Z

2

Taken from above:

>>> from scipy.stats import norm
>>> norm.cdf(1.96)
0.9750021048517795
>>> norm.cdf(-1.96)
0.024997895148220435

For a two-tailed test:

>>> import numpy as np
>>> z = 1.96
>>> p_value = 2 * norm.cdf(-np.abs(z))
>>> p_value
0.04999579029644087
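
If SciPy is not available, the same two-tailed calculation can be sketched with the standard library alone, assuming Python 3.8 or later. The function name two_tailed_p is hypothetical:

```python
from statistics import NormalDist

def two_tailed_p(z):
    """Two-tailed p-value for a z statistic under the standard normal."""
    # Taking -abs(z) makes the result independent of the sign of z.
    return 2.0 * NormalDist().cdf(-abs(z))

p_value = two_tailed_p(1.96)   # about 0.05
```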

answered Feb 6, 2019 at 22:17

David Miller

5075 silver badges5 bronze badges

Comments

Samuel Corradi · Accepted Answer · 2020-05-30 14:55:18Z

0

It is as simple as this:

import math
def my_cdf(x):
    return 0.5*(1+math.erf(x/math.sqrt(2)))

I found the formula on this page: https://www.danielsoper.com/statcalc/formulas.aspx?id=55

answered May 30, 2020 at 14:55

Samuel Corradi

493 bronze badges
