# Discrete probability

For any probabilistic experiment or process, the set of all its possible outcomes is called its **sample space**.

E.g. Consider the following probabilistic (random) experiment:

Flip a fair coin 7 times in a row, and see what happens

The possible outcomes of the example above are all the sequences of Heads ($H$) and Tails ($T$) of length 7. In other words, they are the set of strings $\{H, T\}^7$.

The cardinality of this sample space is $2^7 = 128$, because there are two choices for each of the seven characters of the string.
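As a quick sanity check, we can enumerate this sample space directly (a minimal Python sketch; the variable names are our own):

```python
from itertools import product

# All outcomes of 7 coin flips: strings over {H, T} of length 7.
sample_space = ["".join(flips) for flips in product("HT", repeat=7)]

print(len(sample_space))  # 2**7 = 128
```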

Sample spaces may be infinite.

E.g.

Flip a fair coin repeatedly until a Heads is flipped.

However, in **discrete probability**, we focus on finite and countable sample spaces.

A **probability distribution** over a finite or countable set $S$ is a function:

$$P : S \to [0, 1]$$

such that $\sum_{s \in S} P(s) = 1$.

In other words, each outcome $s \in S$ is assigned a probability $P(s)$ such that $0 \leq P(s) \leq 1$, and the probabilities of all outcomes sum to 1.

E.g. Suppose a fair coin is tossed $n$ times consecutively. This random experiment defines a probability distribution on $S = \{H, T\}^n$, where for all $s \in S$, $P(s) = \frac{1}{2^n}$.

Since $|S| = 2^n$, we have $\sum_{s \in S} P(s) = 2^n \cdot \frac{1}{2^n} = 1$.

E.g. Suppose a fair coin is tossed repeatedly until it lands heads. This random experiment defines a probability distribution $P$ on $S = \{H, TH, TTH, TTTH, \ldots\}$, such that, for all $k \geq 1$, $P(T^{k-1}H) = \frac{1}{2^k}$.

Note that $\sum_{k=1}^{\infty} \frac{1}{2^k} = 1$.
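We can check this numerically: the partial sums of $\frac{1}{2^k}$ approach 1 (an illustrative sketch; the cutoff of 60 terms is arbitrary):

```python
# Partial sums of P(T^{k-1} H) = 1/2^k approach 1 as more terms are added.
total = sum(1 / 2**k for k in range(1, 60))
print(total)  # very close to 1
```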

For a **countable** sample space $S$, an **event** $E$ is simply a subset $E \subseteq S$ of the set of possible outcomes.

Given a probability distribution $P$ on $S$, we define **the probability of the event $E$** to be $P(E) = \sum_{s \in E} P(s)$.

E.g. For $S = \{H, T\}^7$ (seven consecutive coin flips), the following are events:

- The third coin toss came up heads

- The fourth and fifth coin tosses did not both come up tails
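The probabilities of these two events can be computed by summing over the outcomes they contain (a minimal Python sketch; event names are our own):

```python
from itertools import product
from fractions import Fraction

outcomes = ["".join(t) for t in product("HT", repeat=7)]
p = Fraction(1, len(outcomes))  # each outcome has probability 1/2^7

# Event: the third coin toss came up heads.
e1 = [s for s in outcomes if s[2] == "H"]
# Event: the fourth and fifth tosses did not both come up tails.
e2 = [s for s in outcomes if not (s[3] == "T" and s[4] == "T")]

print(len(e1) * p, len(e2) * p)  # 1/2 and 3/4
```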

E.g. For $S = \{H, TH, TTH, TTTH, \ldots\}$, the following are events:

- The first time the coin comes up heads is after an even number of coin tosses

For an event $E \subseteq S$, define the **complement event** to be $\overline{E} = S \setminus E$.

**Theorem:** Suppose $E_1, E_2, E_3, \ldots$ is a (finite or countable) sequence of pairwise disjoint events from the sample space $S$. In other words, $E_i \subseteq S$ and $E_i \cap E_j = \emptyset$ for all $i \neq j$. Then

$$P\Big(\bigcup_i E_i\Big) = \sum_i P(E_i)$$

Furthermore, for each event $E \subseteq S$, $P(\overline{E}) = 1 - P(E)$.

**Proof:** Follows easily from definitions:

$P(\bigcup_i E_i) = \sum_{s \in \bigcup_i E_i} P(s)$, and $P(E_i) = \sum_{s \in E_i} P(s)$; thus, since the sets $E_i$ are pairwise disjoint, $P(\bigcup_i E_i) = \sum_i \sum_{s \in E_i} P(s) = \sum_i P(E_i)$.

Likewise, since $E \cup \overline{E} = S$ and $E \cap \overline{E} = \emptyset$, we have $1 = P(S) = P(E) + P(\overline{E})$, so $P(\overline{E}) = 1 - P(E)$.

**Definition**: Let $P$ be a probability distribution on a sample space $S$, and let $E, F \subseteq S$ be two events such that $P(F) > 0$.

The **conditional probability** of $E$ given $F$, denoted $P(E \mid F)$, is defined by:

$$P(E \mid F) = \frac{P(E \cap F)}{P(F)}$$

E.g. A fair coin is flipped three times. Suppose we know that some event $F$ occurs.

What is the probability that an event $E$ occurs, given this knowledge?

For the example above, there are 8 flip sequences, $\{H, T\}^3$, all with probability $\frac{1}{8}$. The outcomes in the events $F$ and $E \cap F$ can be listed explicitly. So, since all outcomes are equally likely, $P(E \mid F) = \frac{P(E \cap F)}{P(F)} = \frac{|E \cap F|}{|F|}$.
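As an illustration, take $F$ = "the first flip is tails" and $E$ = "an odd number of flips come up tails" (these particular events are our own choice, not necessarily those of the original example):

```python
from itertools import product
from fractions import Fraction

outcomes = ["".join(t) for t in product("HT", repeat=3)]  # 8 equally likely outcomes

# Illustrative events (assumed for this sketch):
F = {s for s in outcomes if s[0] == "T"}            # first flip is tails
E = {s for s in outcomes if s.count("T") % 2 == 1}  # odd number of tails

# For equally likely outcomes, P(E | F) = |E ∩ F| / |F|.
cond = Fraction(len(E & F), len(F))
print(cond)  # 1/2
```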

Intuitively, two events are independent if knowing whether one occurred does not alter the probability of the other. Formally:

**Definition:** Events $E$ and $F$ are called **independent** if $P(E \cap F) = P(E) \cdot P(F)$.

Note that if $P(F) \neq 0$, then $E$ and $F$ are independent if and only if $P(E \mid F) = P(E)$.

Thus, the probability of $E$ is not altered by knowing that $F$ occurs.

E.g. A fair coin is flipped three times. To decide whether two given events $E$ and $F$ are independent, compute $P(E)$, $P(F)$, and $P(E \cap F)$, and check whether $P(E \cap F) = P(E) \cdot P(F)$.

**Definition:** Events $E_1, \ldots, E_n$ are called **pairwise independent** if for every pair $i \neq j$, $E_i$ and $E_j$ are independent (i.e. $P(E_i \cap E_j) = P(E_i) \cdot P(E_j)$).

Events $E_1, \ldots, E_n$ are called **mutually independent** if $P\big(\bigcap_{j \in J} E_j\big) = \prod_{j \in J} P(E_j)$ for every subset $J \subseteq \{1, \ldots, n\}$.

Clearly, mutual independence implies pairwise independence. However, **pairwise independence does not imply mutual independence**.
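A standard witness for this: flip two fair coins and consider "first flip heads", "second flip heads", and "both flips agree". A minimal Python sketch (names are our own) checks that each pair is independent but the triple is not:

```python
from itertools import product
from fractions import Fraction

# Two fair coin flips: 4 equally likely outcomes.
S = ["".join(t) for t in product("HT", repeat=2)]

def P(event):
    return Fraction(len(event), len(S))

E1 = {s for s in S if s[0] == "H"}   # first flip heads
E2 = {s for s in S if s[1] == "H"}   # second flip heads
E3 = {s for s in S if s[0] == s[1]}  # both flips agree

# Each pair is independent...
print(P(E1 & E2) == P(E1) * P(E2))  # True
print(P(E1 & E3) == P(E1) * P(E3))  # True
print(P(E2 & E3) == P(E2) * P(E3))  # True
# ...but the three events are not mutually independent:
print(P(E1 & E2 & E3) == P(E1) * P(E2) * P(E3))  # False
```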

A **Bernoulli trial** is a probabilistic experiment that has two outcomes: **success** or **failure** (e.g. heads or tails).

We suppose that $p$ is the probability of success, and $q = 1 - p$ is the probability of failure.

We can of course have repeated Bernoulli trials. We typically assume the different trials are **mutually independent**.

E.g. A biased coin, which comes up heads with probability $p$, is flipped $n$ times consecutively. What is the probability that it comes up heads exactly $k$ times?

**Theorem:** The probability of exactly $k$ successes in $n$ mutually independent Bernoulli trials, with probability $p$ of success and $q = 1 - p$ of failure in each trial, is:

$$\binom{n}{k} p^k q^{n-k}$$

**Proof:** We can associate $n$ Bernoulli trials with outcomes $(t_1, \ldots, t_n) \in \{H, T\}^n$. Each sequence $(t_1, \ldots, t_n)$ with exactly $k$ heads and $n - k$ tails occurs with probability $p^k q^{n-k}$. There are $\binom{n}{k}$ such sequences with exactly $k$ heads.

The **binomial distribution** with parameters $n$ and $p$, denoted $b(k; n, p)$, defines a probability distribution on $\{0, 1, \ldots, n\}$, given by:

$$b(k; n, p) = \binom{n}{k} p^k (1 - p)^{n - k}$$
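This formula translates directly into code, and we can verify that the probabilities over $k = 0, \ldots, n$ sum to 1 (a minimal sketch; the function name and sample parameters are our own):

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n Bernoulli(p) trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# The probabilities b(k; n, p) over k = 0..n sum to 1.
print(sum(binomial_pmf(k, 10, 0.3) for k in range(11)))  # 1.0 (up to rounding)
```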

A finite set of events $E_1, \ldots, E_n$ is **pairwise independent** if every pair of events is independent; that is, if and only if for all distinct pairs of indices $i, j$:

$$P(E_i \cap E_j) = P(E_i) \cdot P(E_j)$$

A finite set of events $E_1, \ldots, E_n$ is **mutually independent** if every event is independent of any intersection of the other events; that is, if and only if for every $m$-element subset $\{i_1, \ldots, i_m\}$ of $\{1, \ldots, n\}$:

$$P(E_{i_1} \cap \cdots \cap E_{i_m}) = P(E_{i_1}) \cdots P(E_{i_m})$$

**Bayes' theorem:** Let $E$ and $F$ be events such that $P(E) > 0$ and $P(F) > 0$. Then:

$$P(F \mid E) = \frac{P(E \mid F)\,P(F)}{P(E \mid F)\,P(F) + P(E \mid \overline{F})\,P(\overline{F})}$$
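Bayes' theorem lets us invert a conditional probability. A numeric sketch with illustrative numbers of our own choosing (a diagnostic test with a 1% prevalence, 99% true-positive rate, and 5% false-positive rate):

```python
# Assumed illustrative numbers, not from the text:
p_f = 0.01              # P(F): prior probability of the condition
p_e_given_f = 0.99      # P(E|F): positive test given the condition
p_e_given_not_f = 0.05  # P(E|~F): false-positive rate

# Denominator: P(E) = P(E|F) P(F) + P(E|~F) P(~F)
p_e = p_e_given_f * p_f + p_e_given_not_f * (1 - p_f)

# Bayes' theorem: P(F|E) = P(E|F) P(F) / P(E)
p_f_given_e = p_e_given_f * p_f / p_e
print(p_f_given_e)  # ≈ 0.1667: a positive test is far from conclusive
```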

**Definition:** A **random variable** is a function $X : S \to \mathbb{R}$ that assigns a real value to each outcome in a sample space $S$.

E.g. Suppose a biased coin is flipped $n$ times. The sample space is $S = \{H, T\}^n$. The function $X$ that assigns to each outcome $s \in S$ the number of coin tosses that came up heads is one random variable.

For a random variable $X$ we write $P(X = r)$ as shorthand for the probability $P(\{s \in S \mid X(s) = r\})$. The **distribution** of a random variable $X$ is given by the set of pairs $(r, P(X = r))$.

**Note:** These definitions of a random variable and its distribution are only adequate in the context of **discrete** probability distributions.

Suppose a biased coin comes up heads with probability $p$ each time it is tossed. Suppose we repeatedly flip this coin until it comes up heads.

What is the probability that we flip the coin exactly $n$ times, for $n \geq 1$?

**Answer:** The sample space is $S = \{H, TH, TTH, TTTH, \ldots\}$. Assuming mutual independence of coin flips, the probability of $T^{n-1}H$ is $(1 - p)^{n-1} p$. Note that this does define a probability distribution on $S$, because:

$$\sum_{n=1}^{\infty} (1 - p)^{n-1} p = p \cdot \frac{1}{1 - (1 - p)} = 1$$

A random variable $X$ is said to have a **geometric distribution** with parameter $p$ if $P(X = k) = (1 - p)^{k-1} p$ for all positive integers $k$.
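A numeric check of this distribution (an illustrative sketch; the parameter $p = 0.25$ and the truncation at 2000 terms are our own choices, and the mean $1/p$ is a standard fact about the geometric distribution):

```python
# Geometric distribution with parameter p: P(X = k) = (1-p)^(k-1) * p.
p = 0.25
pmf = [(1 - p) ** (k - 1) * p for k in range(1, 2000)]

total = sum(pmf)                                   # ≈ 1: a valid distribution
mean = sum(k * pk for k, pk in enumerate(pmf, 1))  # ≈ 1/p = 4
print(total, mean)
```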

**Recall:** A **random variable** is a function $X : S \to \mathbb{R}$ that assigns a real value to each outcome in a sample space $S$.

The **expected value** (or **expectation**, or **mean**) of a random variable $X$, denoted by $E(X)$, is defined by:

$$E(X) = \sum_{s \in S} P(s) \, X(s)$$

Here $P$ is the underlying probability distribution on the sample space $S$.

**Question:** Let $X$ be the random variable outputting the number that comes up when a **fair die** is rolled. What is the expected value, $E(X)$, of $X$?

**Answer:** $E(X) = \sum_{i=1}^{6} \frac{1}{6} \cdot i = \frac{21}{6} = \frac{7}{2}$.

However, sometimes this way can be inefficient for calculating $E(X)$. Take the sample space of 11 coin flips, for example: $S = \{H, T\}^{11}$. Our summation will have $2^{11} = 2048$ terms!

Recall $P(X = r)$ denotes the probability $P(\{s \in S \mid X(s) = r\})$.

Recall that for a function $X : S \to \mathbb{R}$, the image of $S$ under $X$ is $X(S) = \{r \mid X(s) = r \text{ for some } s \in S\}$.

**Theorem:** For a random variable $X$ on a sample space $S$,

$$E(X) = \sum_{r \in X(S)} r \cdot P(X = r)$$

**Proof:** $E(X) = \sum_{s \in S} P(s) X(s)$, but for each $r \in X(S)$, if we sum all terms $P(s) X(s)$ such that $X(s) = r$, we get $r \cdot P(X = r)$ as their sum. So, summing over all $r \in X(S)$, we get $E(X) = \sum_{r \in X(S)} r \cdot P(X = r)$.

So, if $X(S)$ is small, and if we can compute $P(X = r)$ for each $r \in X(S)$, then we need to sum a lot fewer terms to calculate $E(X)$.
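Both formulas can be compared on the 11-coin-flip example (a minimal sketch with exact rational arithmetic; variable names are our own):

```python
from itertools import product
from math import comb
from fractions import Fraction

# X = number of heads in 11 fair coin flips; 2^11 = 2048 outcomes.
S = list(product("HT", repeat=11))
P = Fraction(1, len(S))  # each outcome is equally likely

# Definition of E(X): one term per outcome (2048 terms).
e_by_outcomes = sum(P * s.count("H") for s in S)

# Better expression: one term per value r = 0, ..., 11 (12 terms),
# using P(X = r) = C(11, r) / 2^11.
e_by_values = sum(Fraction(r * comb(11, r), 2**11) for r in range(12))

print(e_by_outcomes, e_by_values)  # both equal 11/2
```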

**Theorem:** The expected number of successes in $n$ (mutually independent) Bernoulli trials, with probability $p$ of success in each trial, is $np$.

**Theorem:** For any random variables $X_1, \ldots, X_n$ on $S$,

$$E(X_1 + \cdots + X_n) = E(X_1) + \cdots + E(X_n)$$

Furthermore, for any $a, b \in \mathbb{R}$ and any random variable $X$ on $S$,

$$E(aX + b) = a \, E(X) + b$$

In other words, the expectation function is a **linear function**.
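Linearity can be checked concretely on two fair dice (an illustrative sketch; names are our own):

```python
from itertools import product
from fractions import Fraction

# Roll two fair dice; X = first die, Y = second die.
S = list(product(range(1, 7), repeat=2))
P = Fraction(1, len(S))  # 36 equally likely outcomes

e_x = sum(P * x for x, _ in S)          # E(X) = 7/2
e_y = sum(P * y for _, y in S)          # E(Y) = 7/2
e_sum = sum(P * (x + y) for x, y in S)  # E(X + Y)

print(e_sum == e_x + e_y)  # True: linearity of expectation
```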

The **variance** and **standard deviation** of a random variable give us ways to measure, roughly, how far the value of the random variable tends to be from its expectation.

For a random variable $X$ on a sample space $S$, the **variance** of $X$, denoted by $V(X)$, is defined by:

$$V(X) = \sum_{s \in S} \big(X(s) - E(X)\big)^2 \, P(s)$$

The **standard deviation** of $X$, denoted by $\sigma(X)$, is defined by $\sigma(X) = \sqrt{V(X)}$.

E.g. Consider a random variable $X$ whose values stay close to $E(X)$, and a random variable $Y$ with the same expectation whose values swing far from it.

Then $E(X) = E(Y)$, but $V(Y)$ is much larger than $V(X)$, and likewise $\sigma(Y)$ is much larger than $\sigma(X)$.

## Theorem

For any random variable $X$,

$$V(X) = E(X^2) - E(X)^2$$
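This identity can be verified on the fair-die example (a minimal sketch with exact rational arithmetic; names are our own):

```python
from fractions import Fraction

# Fair die roll: X takes values 1..6, each with probability 1/6.
p = Fraction(1, 6)
values = range(1, 7)

e_x = sum(p * v for v in values)                  # E(X)   = 7/2
e_x2 = sum(p * v * v for v in values)             # E(X^2) = 91/6
v_def = sum(p * (v - e_x) ** 2 for v in values)   # V(X) by definition

print(v_def == e_x2 - e_x**2)  # True: V(X) = E(X^2) - E(X)^2
```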

## Proof