Continuous probability

**Random variables** were previously defined in the discrete probability notes as:

A **random variable** is a function that maps each outcome of the sample space to some numerical value. Given a sample space $\Omega$, a random variable $X$ with values in some set $E$ is a function:

$$X : \Omega \to E$$

Where $E$ was typically $\mathbb{N}$ or $\mathbb{Z}$ for discrete RVs.

However, in continuous probability, the codomain is always $\mathbb{R}$.

Therefore, a **continuous random variable** is a random variable which can take on infinitely many values (has an uncountably infinite range).

Given a sample space $\Omega$, a **continuous random variable** $X$ is a function:

$$X : \Omega \to \mathbb{R}$$

- The continuous random variable could be the length of a randomly selected telephone call in seconds.
- The continuous random variable could be the volume of water in a bucket.

**Note**: Random variables can be partly continuous and partly discrete!

A continuous random variable has what could be thought of as *infinite precision*.

More specifically, a continuous random variable can realise infinitely many real values within its range, as there are infinitely many points in a line segment.

So we have infinitely many values whose probabilities must sum to one. This means that these probabilities must each be **infinitesimal**, and therefore, for any single value $x$:

$$P(X = x) = 0$$

It is clear from this result that the **probability mass function** which we previously used in discrete probability will no longer provide any useful information.

A **probability density function** is a function whose integral over an interval gives the probability that the value of a random variable falls within the interval.

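$X$ is a continuous random variable if there is a function $f_X$ such that, for any interval $[a, b]$:

$$P(a \leq X \leq b) = \int_a^b f_X(x)\,dx$$

The function $f_X$ is called the **probability density function** (**PDF**).

For better reasoning as to why $P(X = x) = 0$, we can now use the definition above, since a single point is an interval of zero width:

$$P(X = x) = \int_x^x f_X(t)\,dt = 0$$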

The following properties follow from the axioms:

- $f_X(x) \geq 0$ for all $x \in \mathbb{R}$
- $\int_{-\infty}^{\infty} f_X(x)\,dx = 1$
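The definition and properties of a density can be checked numerically. The sketch below is illustrative only: `integrate` is a hand-rolled midpoint rule and `example_pdf` is a hypothetical density (an exponential shape with rate 2) chosen for the demonstration, not something taken from these notes.

```python
import math


def integrate(f, a, b, steps=100_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width


def example_pdf(x, rate=2.0):
    """Hypothetical density for illustration: rate * exp(-rate * x) for x >= 0."""
    return rate * math.exp(-rate * x) if x >= 0 else 0.0


# P(0.5 <= X <= 1.5) is the area under the density between 0.5 and 1.5.
prob_interval = integrate(example_pdf, 0.5, 1.5)

# The total area should be 1 (the tail beyond x = 40 is negligible here).
total_area = integrate(example_pdf, 0.0, 40.0)

# A single point is an interval of zero width, so its probability is 0.
prob_point = integrate(example_pdf, 1.0, 1.0)

print(prob_interval, total_area, prob_point)
```

For this particular density the interval probability can also be computed by hand as $e^{-1} - e^{-3}$, which gives a quick sanity check on the numerical result.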

Sometimes also called the **cumulative density function** (to differentiate it from the cumulative distribution function of a discrete random variable), the **cumulative distribution function** of a continuous random variable $X$ evaluated at $x$ is the probability that $X$ will take a value less than or equal to $x$.

The cumulative distribution function is denoted $F_X(x)$, and defined as:

$$F_X(x) = P(X \leq x) = \int_{-\infty}^{x} f_X(t)\,dt$$

Additionally, if $f_X$ is continuous at $x$:

$$f_X(x) = \frac{d}{dx} F_X(x)$$

The definition of the probability density function given earlier can be expressed in terms of the cumulative distribution function, by the **fundamental theorem of calculus**:

$$P(a \leq X \leq b) = \int_a^b f_X(x)\,dx = F_X(b) - F_X(a)$$

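- The cumulative distribution function is a non-decreasing function: if $a \leq b$, then $F_X(a) \leq F_X(b)$.
- $\lim_{x \to -\infty} F_X(x) = 0$ and $\lim_{x \to \infty} F_X(x) = 1$.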

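Suppose the lifetime $X$ of a car battery has a given probability $P(X > x)$ of lasting more than $x$ days. Find the probability density function of $X$.

We are given the **complementary cumulative distribution function** $P(X > x)$.

And we can determine the cumulative distribution function:

$$F_X(x) = P(X \leq x) = 1 - P(X > x)$$

Differentiating $F_X$ with respect to $x$ then gives the probability density function $f_X$.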
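The same route from a survival probability to a density can be sketched numerically. The survival function below is a made-up stand-in, $P(X > x) = e^{-x/500}$; it is an assumption for illustration and is not the function from the exercise.

```python
import math


def survival(x):
    """Hypothetical P(X > x), assumed for illustration only."""
    return math.exp(-x / 500.0) if x >= 0 else 1.0


def cdf(x):
    """F_X(x) = 1 - P(X > x)."""
    return 1.0 - survival(x)


def pdf(x, h=1e-4):
    """Central-difference approximation of the derivative of F_X at x."""
    return (cdf(x + h) - cdf(x - h)) / (2.0 * h)


density_at_100 = pdf(100.0)
print(density_at_100)
```

For this assumed survival function the exact density is $\frac{1}{500} e^{-x/500}$, so the numerical derivative can be compared against a closed form.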

If a continuous random variable $X$ is given, and its distribution is given by a probability density function $f_X$, then the expected value of $X$ (if the expected value exists) can be calculated as:

$$E[X] = \int_{-\infty}^{\infty} x f_X(x)\,dx$$

The $n$-th moment of a continuous random variable $X$ is given by:

$$E[X^n] = \int_{-\infty}^{\infty} x^n f_X(x)\,dx$$

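In general, the properties of expectation for continuous random variables are the same as those of discrete random variables, but with sums replaced by integrals:

**Linearity**: for a set of tuples $\{(X_i, a_i)\}_{i=1}^{n}$, each consisting of a continuous random variable $X_i$ and a corresponding constant $a_i$:

$$E\left[\sum_{i=1}^{n} a_i X_i\right] = \sum_{i=1}^{n} a_i E[X_i]$$

In general, if $Y = g(X)$ is a function of $X$ (e.g. $X^2$ or $e^X$), then $Y$ is also a random variable.

If $Y = g(X)$, its expectation is given by:

$$E[g(X)] = \int_{-\infty}^{\infty} g(x) f_X(x)\,dx$$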

*Plus the rest of the properties from discrete random variable expectations*

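If the random variable $X$ represents samples generated by a continuous distribution with probability density function $f_X$, then the population variance is given by:

$$\mathrm{Var}(X) = E\left[(X - \mu)^2\right] = \int_{-\infty}^{\infty} (x - \mu)^2 f_X(x)\,dx$$

where $\mu = E[X]$.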

All properties from the variance of discrete random variables still hold for continuous random variables.
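As a rough numerical check of these integral formulas, the sketch below computes the mean, second moment and variance of a hypothetical density $f(x) = 2x$ on $[0, 1]$ (chosen only for illustration), and confirms that the familiar shortcut $\mathrm{Var}(X) = E[X^2] - E[X]^2$ from the discrete case gives the same value as the defining integral. The `integrate` helper is a simple midpoint rule, not a library routine.

```python
def integrate(f, a, b, steps=100_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width


def density(x):
    """Hypothetical density for illustration: f(x) = 2x on [0, 1], 0 elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0


# E[X] and E[X^2] as integrals of x * f(x) and x^2 * f(x).
mean = integrate(lambda x: x * density(x), 0.0, 1.0)
second_moment = integrate(lambda x: x**2 * density(x), 0.0, 1.0)

# Variance from the definition and from the moment shortcut.
variance_by_definition = integrate(lambda x: (x - mean) ** 2 * density(x), 0.0, 1.0)
variance_by_moments = second_moment - mean**2

print(mean, second_moment, variance_by_definition, variance_by_moments)
```

By hand, this density has $E[X] = 2/3$, $E[X^2] = 1/2$ and $\mathrm{Var}(X) = 1/18$.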

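The **uniform distribution** with parameters $a$ and $b$ is a distribution where all intervals of the same length on the distribution's support $[a, b]$, for a random variable $X \sim U(a, b)$, are equally probable.

The support is defined by the two parameters $a$ and $b$.

The probability density function for a uniformly distributed random variable would be:

$$f_X(x) = \begin{cases} \dfrac{1}{b - a} & \text{for } a \leq x \leq b \\ 0 & \text{otherwise} \end{cases}$$

Additionally, the cumulative distribution function is given by:

$$F_X(x) = \begin{cases} 0 & \text{for } x < a \\ \dfrac{x - a}{b - a} & \text{for } a \leq x \leq b \\ 1 & \text{for } x > b \end{cases}$$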

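Parameter | Meaning |
---|---|
$a$ | Minimum value |
$b$ | Maximum value |

Quantity (or function) | Formula |
---|---|
Mean (expected value) | $\frac{a + b}{2}$ |
Variance | $\frac{(b - a)^2}{12}$ |
Moment-generating function | $\frac{e^{tb} - e^{ta}}{t(b - a)}$ for $t \neq 0$ (and $1$ for $t = 0$) |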

The **exponential distribution** is the probability distribution that describes the time between events in a process in which events occur continuously and independently at a **constant average rate**.

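An exponentially distributed random variable $X \sim \mathrm{Exp}(\lambda)$ with rate parameter $\lambda > 0$ has the probability density function:

$$f_X(x) = \begin{cases} \lambda e^{-\lambda x} & \text{for } x \geq 0 \\ 0 & \text{for } x < 0 \end{cases}$$

Additionally, the cumulative distribution function is given by:

$$F_X(x) = \begin{cases} 1 - e^{-\lambda x} & \text{for } x \geq 0 \\ 0 & \text{for } x < 0 \end{cases}$$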

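Parameter | Meaning |
---|---|
$\lambda$ | Constant average rate |

Quantity (or function) | Formula |
---|---|
Mean (expected value) | $\frac{1}{\lambda}$ |
Variance | $\frac{1}{\lambda^2}$ |
Moment-generating function | $\frac{\lambda}{\lambda - t}$ for $t < \lambda$ |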
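One way to get a feel for the exponential distribution is to simulate it. The sketch below uses inverse-transform sampling, a technique not covered in these notes: if $U$ is uniform on $[0, 1)$, then $-\ln(1 - U)/\lambda$ has CDF $1 - e^{-\lambda x}$. The rate, seed and sample size are arbitrary choices for illustration.

```python
import math
import random


def sample_exponential(rate, rng):
    """Draw one sample by inverting the CDF F(x) = 1 - exp(-rate * x)."""
    u = rng.random()  # uniform on [0, 1), so 1 - u is never 0
    return -math.log(1.0 - u) / rate


rng = random.Random(0)  # fixed seed so the run is reproducible
rate = 2.0
samples = [sample_exponential(rate, rng) for _ in range(200_000)]

sample_mean = sum(samples) / len(samples)
sample_variance = sum((s - sample_mean) ** 2 for s in samples) / len(samples)

# These should be close to 1 / rate and 1 / rate**2 respectively.
print(sample_mean, sample_variance)
```

With $\lambda = 2$, the sample mean and variance should land near $1/\lambda = 0.5$ and $1/\lambda^2 = 0.25$, up to sampling noise.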

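To denote a random variable $X$ which is distributed according to the **Gaussian distribution**, we write $X \sim \mathcal{N}(\mu, \sigma^2)$, with standard deviation $\sigma$, variance $\sigma^2$ and mean/expectation $\mu$.

The probability density function for a Gaussian distributed random variable would be:

$$f_X(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$$

Additionally, the cumulative distribution function is given by the integral:

$$F_X(x) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{x} \exp\left(-\frac{(t - \mu)^2}{2\sigma^2}\right) dt$$

**Note**: We must use an evaluation table to determine the CDF evaluated at $x$, since $F_X$ is not an elementary function.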

Parameter | Meaning |
---|---|
$\mu$ | Mean/expectation of the distribution (also its median and mode) |
$\sigma^2$ | Variance |

Quantity (or function) | Formula |
---|---|
Mean (expected value) | $\mu$ |
Variance | $\sigma^2$ |
Moment-generating function | $\exp\left(\mu t + \frac{\sigma^2 t^2}{2}\right)$ |

The **standard normal distribution** (sometimes **normal distribution**, though this is ambiguous naming) is a special case of the **Gaussian distribution**, when $\mu = 0$ and $\sigma^2 = 1$.

To denote a random variable $Z$ which is (standard) normally distributed, we write $Z \sim \mathcal{N}(0, 1)$.

Additionally, the cumulative distribution function is given by the integral:

$$\Phi(z) = P(Z \leq z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-t^2/2}\,dt$$

**Note**: This integral cannot be expressed in terms of elementary functions, and instead relies on a special function (the error function, $\operatorname{erf}$). When working by hand, we must use an evaluation table, specifically Table 5.1 in Section 5.4.
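When a table is not to hand, the CDF can also be evaluated in software. The sketch below uses the standard identity $\Phi(z) = \frac{1}{2}\left(1 + \operatorname{erf}\left(z/\sqrt{2}\right)\right)$ together with `math.erf` from the Python standard library; the function name `normal_cdf` is just a label chosen here.

```python
import math


def normal_cdf(x, mu=0.0, sigma=1.0):
    """CDF of a Gaussian with mean mu and standard deviation sigma, via erf."""
    z = (x - mu) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


phi_at_zero = normal_cdf(0.0)                          # by symmetry, 0.5
phi_at_196 = normal_cdf(1.96)                          # roughly 0.975
within_one_sd = normal_cdf(1.0) - normal_cdf(-1.0)     # roughly 0.6827

print(phi_at_zero, phi_at_196, within_one_sd)
```

Non-standard Gaussians are handled by the same function through the `mu` and `sigma` arguments, which amounts to standardising $x$ first.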

Recall that the binomial distribution is a discrete probability distribution representing the number of successes in a sequence of $n$ independent experiments, with each experiment being a Bernoulli trial (success/failure experiment) with probability of success $p$.

For a binomially distributed random variable $X \sim \mathrm{Bin}(n, p)$, the probability mass function is given by:

$$P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k}$$

Where $k$ is the number of successes in $n$ trials.

Recall that for a Poisson distributed random variable $X \sim \mathrm{Poisson}(\lambda)$, the probability mass function is given by:

$$P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}$$

Where $k$ is the number of successes if they occur at rate $\lambda$.

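We can approximate the binomial distribution with the Poisson distribution reasonably well when $n$ is large and $p$ is small (with $\lambda = np$). This is true because $\mathrm{Bin}(n, p) \to \mathrm{Poisson}(\lambda)$ when $n \to \infty$ with $np = \lambda$ held fixed, that is:

$$\lim_{n \to \infty} \binom{n}{k} \left(\frac{\lambda}{n}\right)^k \left(1 - \frac{\lambda}{n}\right)^{n - k} = \frac{\lambda^k e^{-\lambda}}{k!}$$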
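The quality of the Poisson approximation can be checked directly by comparing the two probability mass functions. In the sketch below, $n = 1000$ and $p = 0.003$ are arbitrary illustrative values (so $\lambda = np = 3$), and the helper names are labels chosen here.

```python
import math


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p)."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return lam**k * math.exp(-lam) / math.factorial(k)


n, p = 1000, 0.003  # illustrative values: large n, small p
lam = n * p

# Largest pointwise difference between the two PMFs over k = 0..20.
max_gap = max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(21))

for k in range(6):
    print(k, binomial_pmf(k, n, p), poisson_pmf(k, lam))
print("largest gap:", max_gap)
```

Rerunning with a larger $p$ (say $p = 0.3$ with $n = 10$, which has the same $\lambda$) shows the gap growing, which is why the approximation is stated for small $p$.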

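Note that a binomially distributed random variable such as $X \sim \mathrm{Bin}(n, p)$ can be expressed as a sum of independent **Bernoulli random variables** $X_1, \dots, X_n$, each with success probability $p$, that is:

$$X = \sum_{i=1}^{n} X_i$$

Additionally, note that:

- $E[X_i] = p$ and $\mathrm{Var}(X_i) = p(1 - p)$
- $E[X] = np$ and $\mathrm{Var}(X) = np(1 - p)$

We then have $X \approx \mathcal{N}(np,\, np(1 - p))$ for large $n$.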

This section may not be examinable, but is useful for deriving the Gaussian approximation

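A **standard score** (denoted $z$) is the number of standard deviations by which a data point is above or below the mean value of what is being observed or measured.

To standardise a data point $x$, we can use the normal standardisation formula:

$$z = \frac{x - \mu}{\sigma}$$

If we use the normal standardisation formula for $X \sim \mathrm{Bin}(n, p)$, we get:

$$Z = \frac{X - np}{\sqrt{np(1 - p)}}$$

By using the fact that $X$ can be expressed as a sum of Bernoulli random variables (as discussed earlier), and the **central limit theorem** (which will be discussed a bit later), we can see that:

$$P\left(\frac{X - np}{\sqrt{np(1 - p)}} \leq z\right) \to \Phi(z) \quad \text{as } n \to \infty$$

where $\Phi$ is the cumulative distribution function of the standard normal distribution.

**Note**: The normal approximation of the binomial is reasonable when $n$ is large, or more specifically when $p$ and $1 - p$ are not too small relative to $n$. A common rule of thumb is:

$$np(1 - p) \geq 10$$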

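For the sequence of Bernoulli random variables, we have (for $q = 1 - p$ and $n \to \infty$):

$$\binom{n}{k} p^k q^{n - k} \simeq \frac{1}{\sqrt{2\pi npq}} \exp\left(-\frac{(k - np)^2}{2npq}\right)$$

Or alternatively, with $\mu = np$ and $\sigma^2 = npq$:

$$\binom{n}{k} p^k q^{n - k} \simeq \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(k - \mu)^2}{2\sigma^2}\right)$$

This theorem essentially states that the probability mass function of the centred and normalised binomial random variable converges (for $n \to \infty$ and $k$ near $np$) to the probability density function of the normal random variable.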

Sometimes when using the De Moivre-Laplace theorem, or approximating a discrete probability distribution with a continuous probability distribution, we must use **continuity correction**. For an integer-valued discrete random variable $X$, we can write:

$$P(X = k) = P\left(k - \tfrac{1}{2} \leq X \leq k + \tfrac{1}{2}\right)$$

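Consider a **fair coin** being tossed $n$ times. Let the random variable $X$ represent the number of heads.

Then $X \sim \mathrm{Bin}\left(n, \tfrac{1}{2}\right)$, with mean $\tfrac{n}{2}$ and standard deviation $\tfrac{\sqrt{n}}{2}$.

Approximate $P(X = k)$ using the Gaussian random variable.

First, we can start by correcting the discrete random variable for continuity:

$$P(X = k) = P\left(k - \tfrac{1}{2} \leq X \leq k + \tfrac{1}{2}\right) \approx \Phi\left(\frac{k + \frac{1}{2} - \frac{n}{2}}{\sqrt{n}/2}\right) - \Phi\left(\frac{k - \frac{1}{2} - \frac{n}{2}}{\sqrt{n}/2}\right)$$

where $\Phi$ is the cumulative distribution function of the standard normal distribution.

We can compare this to the result of letting $X$ be a binomially distributed random variable.

Recall that $P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k}$. Therefore:

$$P(X = k) = \binom{n}{k} \left(\frac{1}{2}\right)^n$$

Approximating with a Gaussian random variable leads to a reasonably accurate probability, but remember that we get a better estimate when $n$ is large.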
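To make the comparison concrete, the sketch below evaluates both the continuity-corrected Gaussian approximation and the exact binomial probability. The values $n = 40$ and $k = 20$ are assumptions picked for illustration, since the example's original numbers are not shown here; `normal_cdf` is a small helper built on `math.erf`.

```python
import math


def normal_cdf(x, mu, sigma):
    """CDF of a Gaussian with mean mu and standard deviation sigma, via erf."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


n, k, p = 40, 20, 0.5  # assumed values: 40 tosses of a fair coin, 20 heads
mu = n * p
sigma = math.sqrt(n * p * (1.0 - p))

# Continuity correction: P(X = k) is approximated by P(k - 1/2 <= Y <= k + 1/2)
# for a Gaussian Y with the same mean and variance as X.
approx = normal_cdf(k + 0.5, mu, sigma) - normal_cdf(k - 0.5, mu, sigma)

# Exact binomial probability for comparison.
exact = math.comb(n, k) * p**k * (1.0 - p) ** (n - k)

print(approx, exact, abs(approx - exact))
```

Under these assumed values the two results agree to about three decimal places; trying smaller $n$ shows the agreement getting worse.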

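Suppose we have a continuous random variable $X$ and some continuous function $g$. Note that $Y = g(X)$ is also a random variable.

We will look at relating the two probability density functions $f_X$ and $f_Y$ by considering two different cases for $g$: when $g$ is an increasing function and when it is a decreasing function. In both cases we assume that $g$ is strictly monotonic and differentiable, so that the inverse $g^{-1}$ exists.

When $g$ is increasing, by the definition of increasing functions, we must have:

$$x_1 \leq x_2 \implies g(x_1) \leq g(x_2)$$

If we look at the cumulative distribution function for $Y$, we can determine a relationship between $f_X$ and $f_Y$:

$$F_Y(y) = P(g(X) \leq y) = P\left(X \leq g^{-1}(y)\right) = F_X\left(g^{-1}(y)\right) \implies f_Y(y) = f_X\left(g^{-1}(y)\right) \frac{d}{dy} g^{-1}(y)$$

When $g$ is decreasing, by the definition of decreasing functions, we must have:

$$x_1 \leq x_2 \implies g(x_1) \geq g(x_2)$$

Once again, if we consider the cumulative distribution function for $Y$, we can determine a relationship between $f_X$ and $f_Y$:

$$F_Y(y) = P(g(X) \leq y) = P\left(X \geq g^{-1}(y)\right) = 1 - F_X\left(g^{-1}(y)\right) \implies f_Y(y) = -f_X\left(g^{-1}(y)\right) \frac{d}{dy} g^{-1}(y)$$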
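The decreasing case can be illustrated with a small sketch, assuming the standard change-of-variables result $f_Y(y) = -f_X\left(g^{-1}(y)\right) \frac{d}{dy} g^{-1}(y)$ for a strictly decreasing, differentiable $g$. The choices here are assumptions for illustration: $X$ uniform on $(0, 1)$ and $g(x) = -\ln x$, for which $Y = g(X)$ should come out exponentially distributed with rate 1.

```python
import math


def f_X(x):
    """Density of X, assumed uniform on (0, 1) for this illustration."""
    return 1.0 if 0.0 < x < 1.0 else 0.0


def g_inverse(y):
    """Inverse of the decreasing map g(x) = -ln(x)."""
    return math.exp(-y)


def derivative(func, y, h=1e-6):
    """Central-difference approximation of the derivative of func at y."""
    return (func(y + h) - func(y - h)) / (2.0 * h)


def f_Y(y):
    """Density of Y = g(X) in the decreasing case (note the leading minus sign)."""
    return -f_X(g_inverse(y)) * derivative(g_inverse, y)


for y in (0.5, 1.0, 2.5):
    print(y, f_Y(y), math.exp(-y))  # formula result next to exp(-y)
```

The minus sign matters: the derivative of $g^{-1}$ is negative for a decreasing $g$, so without it the resulting "density" would be negative.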

The **hazard rate function** is the frequency with which a component fails, expressed in failures per unit of time.

Although the hazard rate function is often thought of as the probability that a failure occurs in a specified interval given no failure before time $t$, it is **not** actually a probability because it can exceed one.

The hazard rate function for a continuous random variable $X$ is given by:

$$h(t) = \frac{f(t)}{1 - F(t)} = \frac{f(t)}{S(t)}$$

Where:

- $f(t)$ is called the **failure density function**; its integral over a specified interval is the probability that the failure will fall in that interval.
- $F(t)$ is called the **failure distribution function**, and is the probability of the failure of a component, up to and including a certain time $t$.
- $S(t) = 1 - F(t)$ is called the **survival function**, and is the complementary cumulative distribution function: the probability of survival of a component past a certain time $t$.
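The claim that a hazard rate can exceed one is easy to see numerically. The sketch below assumes the usual definition, density divided by survival probability, and applies it to a hypothetical lifetime that is uniform on $[0, 1]$ (an assumption for illustration only); the function names are labels chosen here.

```python
def uniform_pdf(t, a=0.0, b=1.0):
    """Failure density of a lifetime assumed uniform on [a, b]."""
    return 1.0 / (b - a) if a <= t <= b else 0.0


def uniform_cdf(t, a=0.0, b=1.0):
    """Failure distribution function of the same uniform lifetime."""
    if t < a:
        return 0.0
    if t > b:
        return 1.0
    return (t - a) / (b - a)


def hazard_rate(pdf, cdf, t):
    """Hazard rate: failure density divided by the survival function 1 - F(t)."""
    return pdf(t) / (1.0 - cdf(t))


hazard_at_half = hazard_rate(uniform_pdf, uniform_cdf, 0.5)   # 1 / (1 - 0.5)
hazard_near_end = hazard_rate(uniform_pdf, uniform_cdf, 0.9)  # 1 / (1 - 0.9)

print(hazard_at_half, hazard_near_end)
```

Both values are well above one, and the rate keeps growing as $t$ approaches the end of the support, which is why the hazard rate should be read as failures per unit time rather than as a probability.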

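Consider an exponentially distributed random variable $X \sim \mathrm{Exp}(\lambda)$.

Recall that for $t \geq 0$:

$$f(t) = \lambda e^{-\lambda t} \quad \text{and} \quad F(t) = 1 - e^{-\lambda t}$$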