# Probability Distributions

Given a probability space $(\Omega, \mathbb{P})$ and a random variable $X$, the **distribution** of $X$ tells us how $X$ *distributes* probability mass on the real number line. Loosely speaking, the distribution tells us where we can expect to find $X$ and with what probabilities.

**Definition** (Distribution of a random variable)

The distribution (or *law*) of a random variable $X$ is the probability measure on $\mathbb{R}$ which maps a set $A \subset \mathbb{R}$ to $\mathbb{P}(X \in A)$.

**Exercise**

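Suppose that $X$ represents the amount of money you're going to win with the lottery ticket you just bought. Suppose that $\nu$ is the law of $X$. Then, for example, $\nu((0, \infty)) = \mathbb{P}(X > 0)$ is the probability that you win a positive amount.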

We can think of $X$ as pushing forward the probability mass from $\Omega$ to $\mathbb{R}$ by sending the probability mass at $\omega$ to $X(\omega)$ for each $\omega \in \Omega$. The probability masses at multiple $\omega$'s can stack up at the same point on the real line if $X$ maps the $\omega$'s to the same value.
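On a finite sample space, the push-forward picture can be sketched in a few lines. The function name `pushforward`, the dictionary representation of the probability measure, and the two-coin-flip example are illustrative assumptions, not part of the text above.

```python
from collections import defaultdict
from fractions import Fraction


def pushforward(prob, X):
    """Return the law of X as a dict {value: total mass sent to that value}.

    prob maps each outcome omega to its probability mass;
    X is a function from outcomes to real numbers.
    """
    law = defaultdict(Fraction)  # missing values start at Fraction(0)
    for omega, mass in prob.items():
        # Masses of different omegas stack up when X sends them to the same value.
        law[X(omega)] += mass
    return dict(law)


def measure(law, A):
    """Mass the law assigns to a finite set A of real numbers."""
    return sum((mass for value, mass in law.items() if value in A), Fraction(0))


# Hypothetical example: two fair coin flips, X = number of heads.
prob = {omega: Fraction(1, 4) for omega in ["HH", "HT", "TH", "TT"]}
law = pushforward(prob, lambda omega: omega.count("H"))
```

Here the outcomes `HT` and `TH` are both sent to the value 1, so the law places their combined mass there.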

**Exercise**

A problem on a test requires students to match molecule diagrams to their appropriate labels. Suppose there are three labels and three diagrams and that a student guesses a matching uniformly at random. Let $X$ denote the number of diagrams the student correctly labels. What is the probability mass function of the distribution of $X$?

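*Solution.* The number of correctly labeled diagrams is an integer between 0 and 3 inclusive. Suppose the labels are $A$, $B$, and $C$, and suppose the correct labeling sequence is $ABC$ (the final result would be the same regardless of the correct labeling sequence). The sample space $\Omega = \{ABC, ACB, BAC, BCA, CAB, CBA\}$ consists of all six possible labeling sequences, and each of them is equally likely since the student applies the labels uniformly at random. Exactly two correct labels is impossible, because placing two labels correctly forces the third to be correct as well. So we have

$$
\begin{aligned}
\mathbb{P}(X = 3) &= \mathbb{P}(\{ABC\}) = \tfrac{1}{6}, \\
\mathbb{P}(X = 2) &= \mathbb{P}(\varnothing) = 0, \\
\mathbb{P}(X = 1) &= \mathbb{P}(\{ACB, BAC, CBA\}) = \tfrac{3}{6} = \tfrac{1}{2}, \\
\mathbb{P}(X = 0) &= \mathbb{P}(\{BCA, CAB\}) = \tfrac{2}{6} = \tfrac{1}{3}.
\end{aligned}
$$

The probability mass function of the distribution of $X$, written here as $m_X$, is therefore given by $m_X(0) = \tfrac{1}{3}$, $m_X(1) = \tfrac{1}{2}$, $m_X(2) = 0$, and $m_X(3) = \tfrac{1}{6}$.

All together, we have

$$
m_X(x) =
\begin{cases}
\tfrac{1}{3} & \text{if } x = 0, \\
\tfrac{1}{2} & \text{if } x = 1, \\
\tfrac{1}{6} & \text{if } x = 3, \\
0 & \text{otherwise.}
\end{cases}
$$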
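The probability mass function for the matching exercise can be checked by brute-force enumeration of the six labeling sequences. This is a minimal sketch; the variable names and the choice of `ABC` as the correct sequence are assumptions made for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations

correct = ("A", "B", "C")             # assumed correct labeling sequence
guesses = list(permutations(correct))  # the six equally likely guesses

# For each guess, count the positions where it agrees with the correct sequence.
counts = Counter(
    sum(g == c for g, c in zip(guess, correct)) for guess in guesses
)

# Probability mass at each possible number of correct labels, 0 through 3.
pmf = {k: Fraction(counts[k], len(guesses)) for k in range(4)}
```

The resulting dictionary assigns zero mass to the value 2, reflecting the fact that no permutation of three items fixes exactly two of them.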

## Cumulative distribution function

The distribution of a random variable $X$ can also be specified by its **cumulative distribution function**.

**Definition** (Cumulative distribution function)

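If $X$ is a random variable, then its cumulative distribution function $F_X$ is the function from $\mathbb{R}$ to $[0, 1]$ defined by $F_X(x) = \mathbb{P}(X \le x)$.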
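For a discrete random variable, the cumulative distribution function at $x$ is the total probability mass sitting at values less than or equal to $x$. The sketch below evaluates it for the distribution from the label-matching exercise (mass $\tfrac{1}{3}$ at 0, $\tfrac{1}{2}$ at 1, and $\tfrac{1}{6}$ at 3); the function name `cdf` and the dictionary representation are illustrative assumptions.

```python
from fractions import Fraction


def cdf(pmf, x):
    """F(x) = P(X <= x) for a discrete distribution given as {value: mass}."""
    return sum((mass for value, mass in pmf.items() if value <= x), Fraction(0))


# Distribution of the number of correctly labeled diagrams.
pmf = {0: Fraction(1, 3), 1: Fraction(1, 2), 3: Fraction(1, 6)}

# Evaluate the CDF at a few points, including points between the atoms.
points = (-1, 0, 0.5, 1, 2, 3)
values = [cdf(pmf, x) for x in points]
```

The CDF of a discrete distribution is a step function: it is flat between atoms (for example on the interval from 1 to 3 here) and jumps by the mass at each atom.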

**Exercise**

Consider a random variable

*Solution.* The first one is true, since the CDF goes from about 0.1 at

The second one is also true, since there is no probability mass past 2.

The third one is false: there is no probability mass in the interval from

**Exercise**

Suppose that

*Solution.* By definition of

where the last step follows since

**Exercise**

Random variables with the same cumulative distribution function are not necessarily equal as random variables, because the probability mass sitting at each point on the real line can come from different $\omega$'s.

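For example, consider the two-fair-coin-flip experiment and let $X$ be the number of heads. Find another random variable $Y$ which is not equal to $X$ but which has the same distribution as $X$.

*Solution.* If we define $Y$ to be the number of *tails*, then it's clear from symmetry that it has the same distribution as $X$.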

(In fact, we can express
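The two-coin-flip example can be checked directly. The sketch below assumes that $X$ counts heads and $Y$ counts tails; the helper name `law` and the dictionary representation of the random variables are illustrative choices.

```python
from collections import Counter
from fractions import Fraction

outcomes = ["HH", "HT", "TH", "TT"]         # equally likely outcomes
X = {w: w.count("H") for w in outcomes}     # assumed: number of heads
Y = {w: w.count("T") for w in outcomes}     # assumed: number of tails


def law(rv):
    """Distribution of a random variable on the uniform four-outcome space."""
    counts = Counter(rv.values())
    return {v: Fraction(c, len(outcomes)) for v, c in counts.items()}


same_law = law(X) == law(Y)                              # compare distributions
differ_on = [w for w in outcomes if X[w] != Y[w]]        # compare as functions
```

The two distributions coincide, yet the functions disagree on the outcomes `HH` and `TT`, so the variables are not equal as random variables.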