Statistical Distributions

This page contains a list of common probability distributions that you may want to use to disperse your inputs in a Monte Carlo analysis. For an overview of common distributions and how they are related, start with this excellent blog post: Common Probability Distributions: The Data Scientist’s Crib Sheet.

Usage

# After having initialized your Sim object as 'sim',
# create a continuous uniform random variable between 1 and 5
from scipy.stats import uniform
sim.addInVar(name='var1', dist=uniform, distkwargs={'loc':1, 'scale':4})

Continuous Distributions

Uniform Distribution [SciPy Ref, Wikipedia]:
uniform(loc, scale), where loc is the lower bound and scale is size of the range, such that the distribution returns x from the inclusive range [loc, loc + scale].

Normal Distribution [SciPy Ref, Wikipedia]:
norm(loc, scale), where loc is the mean and scale is the standard deviation. Also known as a Gaussian distribution. The returned range of x is unbounded. The normal distribution pops up often thanks to the central limit theorem, which states that any distributions when added together will tend towards a normal distribution.

Log-Normal Distribution [SciPy Ref, Wikipedia]:
lognorm(s = sigma, scale = numpy.exp(mu)), where mu is the mean and sigma is the standard deviation of the underlying normal distribution. Also known as a Gaussian distribution. Retuns x ≥ 0. Similarly to a normal distribution, this pops up due to the central limit theorem in the log domain, stating that any distributions when multiplied together will tend towards a log-normal distribution. Think of stock prices being log-normally distributed when their rate of return is normally distributed.

Exponential Distribution [SciPy Ref, Wikipedia]:
expon(scale = 1/lambda), where lambda is the expected rate parameter for the associated poisson process. The returned range of x is unbounded. Think of a call center which receives an average of lambda calls per minute, and this is the odds of x minutes passing between subsequent calls.

Discrete Distributions

Random Integers in Range [SciPy Ref, Wikipedia]:
randint(low, high), where low and high are the lower and upper bounds of the integer range. Also known as a discrete uniform distribution. Returns k in {low, ..., high - 1}.

Random Integers with Custom Weights [SciPy Ref]:
rv_discrete(values=(xk, pk)), where xk is a list of integers and pk is a list of the probabilities associated with returning each integer. The sum of pk must equal 1, and each probability in it must be 0 < p < 1. Returns x in xk.

Bernoulli Distribution [SciPy Ref, Wikipedia]:
bernoulli(p), where p is the probability of success. Equivalent to a “weighted coin flip”. Returns k in {0, 1}.

Binomial Distribution [SciPy Ref, Wikipedia]:
binom(n, p), where p is the probability of success for a single bernoulli trial, and n is the number of trials to conduct. Returns k in {0, 1, ..., n}. Think of the odds of k heads in n weighted coin flips.

Geometric Distribution [SciPy Ref, Wikipedia]:
geom(p), where p is the probability of success for a single bernoulli trial. Returns k ≥ 1. Think of the odds of the number k of weighted coins you need to flip before you hit heads.

Poisson Distribution [SciPy Ref, Wikipedia]:
poisson(mu), where mu is the expected rate of occurances (notated as lambda on wikipedia). Returns k ≥ 0. Think of a call center that receives an average of lambda calls per minute, and this gives the odds of receiving k calls in any given minute.