
Continuous probability

Continuous random variables

Random variables were previously defined in the discrete probability notes as:

A random variable is a function that maps each outcome of the sample space to some numerical value.

Given a sample space $\Omega$, a random variable $X$ with values in some set $E$ is a function:

$$X : \Omega \to E$$

Where $E$ was typically $\mathbb{N}$ or $\mathbb{Z}$ for discrete RVs.

However, in continuous probability, the codomain $E$ is always $\mathbb{R}$.

Therefore, a continuous random variable is a random variable which can take on uncountably many values (has an uncountably infinite range).


Given a sample space $\Omega$, a continuous random variable $X$ is a function:

$$X : \Omega \to \mathbb{R}$$

Examples

Typical examples include the lifetime of a car battery (used later in these notes), the exact time between arrivals at a queue, or a measured temperature: quantities that vary over a continuum rather than being counted.

Note: Random variables can be partly continuous and partly discrete!

Probability density function

Why can't we use the PMF anymore?

A continuous random variable has what could be thought of as infinite precision.

More specifically, a continuous random variable can realise uncountably many real number values within its range, just as there are uncountably many points in a line segment.

So we have infinitely many values whose probabilities must total one. This means that each individual probability must be infinitesimal, and therefore:

$$P(X = x) = 0 \quad \text{for every } x \in \mathbb{R}$$

It is clear from this result that the probability mass function which we previously used in discrete probability will no longer provide any useful information.

Definition

A probability density function is a function whose integral over an interval gives the probability that the value of a random variable falls within the interval.

$X$ is a continuous random variable if there is a function $f_X : \mathbb{R} \to [0, \infty)$ such that:

$$P(a \le X \le b) = \int_a^b f_X(x)\,dx$$

The function $f_X$ is called the probability density function (PDF).


For better reasoning as to why $P(X = x) = 0$, we can now use the definition above:

$$P(X = x) = \int_x^x f_X(t)\,dt = 0$$

Properties

The following properties follow from the axioms:

- $f_X(x) \ge 0$ for all $x \in \mathbb{R}$
- $\int_{-\infty}^{\infty} f_X(x)\,dx = 1$
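
As an illustration (not from the original notes), here is a minimal sketch using SciPy's numerical integration to check these properties for an example density, the exponential PDF $f(x) = \lambda e^{-\lambda x}$:

```python
from scipy.integrate import quad
import numpy as np

lam = 2.0  # example rate parameter (arbitrary choice)

def pdf(x):
    """Exponential PDF: lam * exp(-lam * x) for x >= 0, else 0."""
    return lam * np.exp(-lam * x) if x >= 0 else 0.0

# Property: the density integrates to 1 over its support.
total, _ = quad(pdf, 0, np.inf)
print(total)  # ~1.0

# P(a <= X <= b) is the integral of the PDF over [a, b].
prob, _ = quad(pdf, 0.5, 1.5)
print(prob)

# P(X = x) is an integral over a degenerate interval, hence 0.
point, _ = quad(pdf, 1.0, 1.0)
print(point)  # 0.0
```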

Cumulative distribution function

Sometimes also called the cumulative density function (to differentiate it from the cumulative distribution of a discrete random variable), the cumulative distribution function of a continuous random variable $X$ evaluated at $x$ is the probability that $X$ will take a value less than or equal to $x$.

The cumulative distribution function is denoted $F_X$, and defined as:

$$F_X(x) = P(X \le x) = \int_{-\infty}^{x} f_X(t)\,dt$$

Additionally, if $f_X$ is continuous at $x$:

$$\frac{d}{dx} F_X(x) = f_X(x)$$


The definition of the probability density function given earlier can be expressed in terms of the cumulative distribution function, by the fundamental theorem of calculus:

$$P(a \le X \le b) = \int_a^b f_X(x)\,dx = F_X(b) - F_X(a)$$

Properties

- $F_X$ is non-decreasing
- $\lim_{x \to -\infty} F_X(x) = 0$ and $\lim_{x \to \infty} F_X(x) = 1$
- $P(X > x) = 1 - F_X(x)$

Example

Suppose the lifetime $X$ of a car battery has a given probability of lasting more than $t$ days. Find the probability density function of $X$.

We are given the complementary cumulative distribution function $P(X > t)$.

And we can determine the cumulative distribution function, and from it the density:

$$F_X(t) = P(X \le t) = 1 - P(X > t), \qquad f_X(t) = \frac{d}{dt} F_X(t)$$
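
The specific survival function from the original example has been lost, so as a hedged illustration, suppose a hypothetical $P(X > t) = e^{-t/100}$ and carry out the derivation symbolically:

```python
import sympy as sp

t = sp.symbols('t', positive=True)

# Hypothetical complementary CDF (survival function); an assumption
# chosen only to illustrate the derivation, not the notes' original.
survival = sp.exp(-t / 100)

# F(t) = 1 - P(X > t), and f(t) = F'(t).
F = 1 - survival
f = sp.diff(F, t)

print(sp.simplify(f))  # exp(-t/100)/100
```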

Expectation

If a continuous random variable $X$ is given, and its distribution is given by a probability density function $f_X$, then the expected value of $X$ (if the expected value exists) can be calculated as:

$$E[X] = \int_{-\infty}^{\infty} x f_X(x)\,dx$$

Moments

The $n$-th moment of a continuous random variable $X$ is given by:

$$E[X^n] = \int_{-\infty}^{\infty} x^n f_X(x)\,dx$$

Properties

In general, the properties of expectation for continuous random variables are the same as those of discrete random variables, but with sums replaced by integrals. In particular, for a function $g$:

$$E[g(X)] = \int_{-\infty}^{\infty} g(x) f_X(x)\,dx$$

And linearity still holds: $E[aX + b] = aE[X] + b$.

Variance

If the random variable $X$ represents samples generated by a continuous distribution with probability density function $f_X$, then the population variance is given by:

$$\text{Var}(X) = E\big[(X - \mu)^2\big] = \int_{-\infty}^{\infty} (x - \mu)^2 f_X(x)\,dx = E[X^2] - (E[X])^2$$

Where $\mu = E[X]$. All properties from the variance of discrete random variables still hold for continuous random variables.
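
A small sketch (my own illustration) computing the mean and variance of an example density by numerical integration, cross-checked against scipy.stats:

```python
from scipy.integrate import quad
from scipy import stats
import numpy as np

lam = 0.5  # example exponential rate (arbitrary)
pdf = lambda x: lam * np.exp(-lam * x)

# E[X] = integral of x f(x); E[X^2] = integral of x^2 f(x).
mean, _ = quad(lambda x: x * pdf(x), 0, np.inf)
second_moment, _ = quad(lambda x: x**2 * pdf(x), 0, np.inf)
var = second_moment - mean**2

print(mean, var)  # 2.0, 4.0
print(stats.expon.stats(scale=1/lam, moments='mv'))  # matches: mean=2, var=4
```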

Distributions

Uniform distribution

The uniform distribution with parameters $a$ and $b$ is a distribution where all intervals of the same length on the distribution's support $[a, b]$, for a random variable $X \sim \text{Unif}(a, b)$, are equally probable.

The support $[a, b]$ is defined by the two parameters $a$ and $b$.

The probability density function for a uniformly distributed random variable would be:

$$f_X(x) = \begin{cases} \dfrac{1}{b - a} & a \le x \le b \\ 0 & \text{otherwise} \end{cases}$$

Additionally, the cumulative distribution function is given by:

$$F_X(x) = \begin{cases} 0 & x < a \\ \dfrac{x - a}{b - a} & a \le x \le b \\ 1 & x > b \end{cases}$$

| Parameter | Meaning |
| --- | --- |
| $a$ | Minimum value |
| $b$ | Maximum value |

| Quantity (or function) | Formula |
| --- | --- |
| Mean (expected value) | $\frac{a + b}{2}$ |
| Variance | $\frac{(b - a)^2}{12}$ |
| Moment-generating function | $\frac{e^{tb} - e^{ta}}{t(b - a)}$ for $t \ne 0$, and $1$ for $t = 0$ |
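
A quick sketch (illustrative, not from the notes) checking these formulas with scipy.stats; note that scipy parameterises the uniform distribution by loc = a and scale = b - a:

```python
from scipy import stats

a, b = 2.0, 5.0  # example parameters (arbitrary)
X = stats.uniform(loc=a, scale=b - a)

print(X.pdf(3.0))   # 1 / (b - a) = 1/3
print(X.cdf(3.5))   # (x - a) / (b - a) = 0.5
print(X.mean())     # (a + b) / 2 = 3.5
print(X.var())      # (b - a)^2 / 12 = 0.75
```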

Exponential distribution

The exponential distribution is the probability distribution that describes the time between events in a process in which events occur continuously and independently at a constant average rate.

An exponentially distributed random variable $X \sim \text{Exp}(\lambda)$ with rate parameter $\lambda > 0$ has the probability density function:

$$f_X(x) = \begin{cases} \lambda e^{-\lambda x} & x \ge 0 \\ 0 & x < 0 \end{cases}$$

Additionally, the cumulative distribution function is given by:

$$F_X(x) = \begin{cases} 1 - e^{-\lambda x} & x \ge 0 \\ 0 & x < 0 \end{cases}$$

| Parameter | Meaning |
| --- | --- |
| $\lambda$ | Constant average rate |

| Quantity (or function) | Formula |
| --- | --- |
| Mean (expected value) | $\frac{1}{\lambda}$ |
| Variance | $\frac{1}{\lambda^2}$ |
| Moment-generating function | $\frac{\lambda}{\lambda - t}$ for $t < \lambda$ |
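
Another illustrative sketch; scipy parameterises the exponential distribution by scale = 1/λ:

```python
from scipy import stats
import numpy as np

lam = 1.5  # example rate (arbitrary)
X = stats.expon(scale=1 / lam)

print(X.pdf(1.0), lam * np.exp(-lam * 1.0))  # both ~0.3347
print(X.cdf(1.0), 1 - np.exp(-lam * 1.0))    # both ~0.7769
print(X.mean(), X.var())                     # 1/lam and 1/lam^2
```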

Gaussian distribution

To denote a random variable which is distributed according to the Gaussian distribution, we write $X \sim N(\mu, \sigma^2)$, with standard deviation $\sigma$, variance $\sigma^2$ and mean/expectation $\mu$.

The probability density function for a Gaussian distributed random variable would be:

$$f_X(x) = \frac{1}{\sigma \sqrt{2\pi}} e^{-\frac{(x - \mu)^2}{2\sigma^2}}$$

Additionally, the cumulative distribution function is given by the integral:

$$F_X(x) = \frac{1}{\sigma \sqrt{2\pi}} \int_{-\infty}^{x} e^{-\frac{(t - \mu)^2}{2\sigma^2}}\,dt$$

Note: We must use an evaluation table to determine the CDF evaluated at $x$, since the antiderivative of the Gaussian density is not an elementary function.

| Parameter | Meaning |
| --- | --- |
| $\mu$ | Mean/expectation of the distribution (also its median and mode) |
| $\sigma^2$ | Variance |

| Quantity (or function) | Formula |
| --- | --- |
| Mean (expected value) | $\mu$ |
| Variance | $\sigma^2$ |
| Moment-generating function | $e^{\mu t + \frac{1}{2}\sigma^2 t^2}$ |

Standard normal distribution

The standard normal distribution (sometimes just called the normal distribution, though this is ambiguous naming) is a special case of the Gaussian distribution, when $\mu = 0$ and $\sigma^2 = 1$.

To denote a random variable which is (standard) normally distributed, we write $Z \sim N(0, 1)$.

Additionally, the cumulative distribution function is given by the integral:

$$\Phi(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-t^2/2}\,dt$$

Note: This integral doesn't evaluate to any simple expression as it cannot be expressed in terms of elementary functions, and instead relies on the special error function $\operatorname{erf}$. Instead, we must use an evaluation table - specifically Table 5.1 in Section 5.4.
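
In practice one can also evaluate $\Phi$ numerically rather than with a table; an illustrative sketch:

```python
from scipy import stats
from math import erf, sqrt

z = 1.96
print(stats.norm.cdf(z))             # ~0.9750
print(0.5 * (1 + erf(z / sqrt(2))))  # same value via the error function
```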

Approximations of the binomial distribution

Recall that the binomial distribution is a discrete probability distribution representing the number of successes in a sequence of $n$ independent experiments, with each experiment being a Bernoulli trial (success/failure experiment) with probability of success $p$.

For a binomially distributed random variable $X \sim \text{Bin}(n, p)$, the probability mass function is given by:

$$P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k}$$

Where $k$ is the number of successes in $n$ trials.

Poisson approximation

Recall that for a Poisson distributed random variable $X \sim \text{Pois}(\lambda)$, the probability mass function is given by:

$$P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}$$

Where $k$ is the number of successes if they occur at rate $\lambda$.

We can approximate the binomial distribution with the Poisson distribution reasonably well when $n$ is large and $p$ is small (with $\lambda = np$). This is true because the binomial PMF converges to the Poisson PMF when $n \to \infty$ with $np = \lambda$ held fixed — that is:

$$\lim_{n \to \infty} \binom{n}{k} \left(\frac{\lambda}{n}\right)^k \left(1 - \frac{\lambda}{n}\right)^{n - k} = \frac{\lambda^k e^{-\lambda}}{k!}$$
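
A small numerical check (my own illustration) of the approximation for large $n$ and small $p$:

```python
from scipy import stats

n, p = 1000, 0.003  # large n, small p (arbitrary example values)
lam = n * p

for k in range(6):
    exact = stats.binom.pmf(k, n, p)
    approx = stats.poisson.pmf(k, lam)
    print(k, round(exact, 6), round(approx, 6))  # the two columns agree closely
```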

Gaussian/normal approximation

Note that a binomially distributed random variable such as $X \sim \text{Bin}(n, p)$ can be expressed as a sum of $n$ Bernoulli random variables — that is:

$$X = \sum_{i=1}^{n} X_i, \qquad X_i \sim \text{Bern}(p)$$

Additionally, note that:

$$E[X] = np, \qquad \text{Var}(X) = np(1 - p)$$

We then have, for large $n$, $X \approx N(np, np(1 - p))$.


This section may not be examinable, but is useful for deriving the Gaussian approximation.

A standard score (denoted $z$) is the number of standard deviations by which a data point is above or below the mean value of what is being observed or measured.

To standardise a data point $x$, we can use the normal standardisation formula:

$$z = \frac{x - \mu}{\sigma}$$


If we use the normal standardisation formula for $X \sim \text{Bin}(n, p)$, we get:

$$Z = \frac{X - np}{\sqrt{np(1 - p)}}$$

By using the fact that $X$ can be expressed as a sum of Bernoulli random variables (as discussed earlier), and the central limit theorem (which will be discussed a bit later), we can see that:

$$Z = \frac{X - np}{\sqrt{np(1 - p)}} \xrightarrow{d} N(0, 1) \quad \text{as } n \to \infty$$

Note: The normal approximation of the binomial is reasonable when $n$ is large, or more specifically when $np$ and $n(1 - p)$ are not too small; a common rule of thumb being:

$$np(1 - p) \ge 10$$

De Moivre-Laplace theorem

For a sequence of Bernoulli random variables $X_1, X_2, \ldots$ with $S_n = X_1 + \cdots + X_n$, we have (for $n \to \infty$):

$$P\left(a \le \frac{S_n - np}{\sqrt{np(1 - p)}} \le b\right) \to \Phi(b) - \Phi(a)$$

Or alternatively, with $\mu = np$ and $\sigma = \sqrt{np(1 - p)}$:

$$\binom{n}{k} p^k (1 - p)^{n - k} \approx \frac{1}{\sigma \sqrt{2\pi}} e^{-\frac{(k - \mu)^2}{2\sigma^2}}$$


This theorem essentially states that the probability mass function of the centred and normalised binomial random variable converges (for $n \to \infty$ with $p$ fixed) to the probability density function of the standard normal random variable.

Continuity correction

Sometimes when using the De Moivre-Laplace theorem, or approximating a discrete probability distribution with a continuous probability distribution, we must use continuity correction. For a discrete random variable $X$ approximated by a continuous random variable $Y$, we can write:

$$P(X = k) \approx P\left(k - \tfrac{1}{2} \le Y \le k + \tfrac{1}{2}\right)$$

Example

Consider a fair coin being tossed $n$ times.

Let the random variable $X$ represent the number of heads.

Then $X \sim \text{Bin}(n, \tfrac{1}{2})$.

Approximate $P(X = k)$ using the Gaussian random variable $Y \sim N\!\left(\tfrac{n}{2}, \tfrac{n}{4}\right)$.


First, we can start by correcting the discrete random variable for continuity:

$$P(X = k) \approx P\left(k - \tfrac{1}{2} \le Y \le k + \tfrac{1}{2}\right) = \Phi\!\left(\frac{k + \frac{1}{2} - \frac{n}{2}}{\sqrt{n}/2}\right) - \Phi\!\left(\frac{k - \frac{1}{2} - \frac{n}{2}}{\sqrt{n}/2}\right)$$

We can compare this to the result of letting $X$ be a binomially distributed random variable.

Recall that $P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k}$. Therefore the exact probability follows directly from the binomial PMF.

As you can see, approximating with a Gaussian random variable led to a reasonably accurate probability, but remember that we get a better estimate when $n$ is large.
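
Since the original example's specific numbers were lost, here is an illustrative version with assumed values $n = 40$ and $k = 20$ (my choice, not necessarily the notes' values):

```python
from scipy import stats
from math import sqrt

n, k = 40, 20  # assumed values for illustration
p = 0.5
mu, sigma = n * p, sqrt(n * p * (1 - p))

# Normal approximation with continuity correction:
approx = stats.norm.cdf((k + 0.5 - mu) / sigma) - stats.norm.cdf((k - 0.5 - mu) / sigma)

# Exact binomial probability:
exact = stats.binom.pmf(k, n, p)

print(approx, exact)  # ~0.1256 vs ~0.1254
```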

Relating probability density functions

Suppose we have a continuous random variable $X$ and some continuous, invertible function $g$. Note that $Y = g(X)$ is also a random variable.

We will look at relating the two probability density functions $f_X$ and $f_Y$ by considering two different cases for $g$ — when $g$ is an increasing function and when it is a decreasing function.

$g$ is an increasing function

By the definition of increasing functions, we must have:

$$x_1 < x_2 \implies g(x_1) < g(x_2)$$

If we look at the cumulative distribution function for $Y$, we can determine a relationship between $f_X$ and $f_Y$:

$$F_Y(y) = P(g(X) \le y) = P\left(X \le g^{-1}(y)\right) = F_X\left(g^{-1}(y)\right)$$

$$\implies f_Y(y) = f_X\left(g^{-1}(y)\right) \frac{d}{dy} g^{-1}(y)$$

$g$ is a decreasing function

By the definition of decreasing functions, we must have:

$$x_1 < x_2 \implies g(x_1) > g(x_2)$$

Once again, if we consider the cumulative distribution function for $Y$, we can determine a relationship between $f_X$ and $f_Y$:

$$F_Y(y) = P(g(X) \le y) = P\left(X \ge g^{-1}(y)\right) = 1 - F_X\left(g^{-1}(y)\right)$$

$$\implies f_Y(y) = -f_X\left(g^{-1}(y)\right) \frac{d}{dy} g^{-1}(y)$$
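
An illustrative numerical check of the increasing case (my own example), using $X \sim \text{Exp}(1)$ and $g(x) = \sqrt{x}$, so that $g^{-1}(y) = y^2$:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=1_000_000)
y = np.sqrt(x)  # Y = g(X) with g increasing

# Analytic density from the formula: f_Y(y) = f_X(y^2) * d/dy(y^2) = 2y * exp(-y^2)
ys = np.linspace(0.05, 3.0, 10)
analytic = 2 * ys * np.exp(-ys**2)

# Empirical density estimate from the samples (histogram-based):
hist, edges = np.histogram(y, bins=200, range=(0, 4), density=True)
centres = (edges[:-1] + edges[1:]) / 2
empirical = np.interp(ys, centres, hist)

print(np.round(analytic, 3))
print(np.round(empirical, 3))  # close agreement with the analytic values
```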

Hazard rate function

The hazard rate function is the frequency with which a component fails, expressed in failures per unit of time.

Although the hazard rate function is often thought of as the probability that a failure occurs in a specified interval given no failure before time $t$, it is not actually a probability because it can exceed one.

The hazard rate function for a continuous random variable $X$ is given by:

$$\lambda(t) = \frac{f_X(t)}{\bar{F}(t)}$$

Where:

$$\bar{F}(t) = 1 - F_X(t) = P(X > t)$$

is the survival function.
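
A short sketch (illustration only) computing the hazard rate numerically for an exponential random variable, previewing the result derived in Example 1 below: the hazard rate is constant.

```python
import numpy as np
from scipy import stats

lam = 0.7  # example rate (arbitrary)
X = stats.expon(scale=1 / lam)

t = np.linspace(0.1, 5, 5)
hazard = X.pdf(t) / X.sf(t)  # sf(t) = 1 - cdf(t) is the survival function

print(hazard)  # constant, equal to lam = 0.7 at every t
```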

Example 1

Consider an exponentially distributed random variable $X \sim \text{Exp}(\lambda)$.

Recall that for $t \ge 0$:

$$f_X(t) = \lambda e^{-\lambda t}, \qquad F_X(t) = 1 - e^{-\lambda t}$$