Notes 4

1 Random Variables

In applications we are interested in quantitative properties of experimental results.

Example: Toss a coin three times and count the number of heads. The sample space is S = {(t,t,t), (t,t,h), (t,h,t), (h,t,t), (t,h,h), (h,t,h), (h,h,t), (h,h,h)}. The random variable X counts the number of heads; thus if w = (t,t,h) occurs, then X(t,t,h) = 1. Probabilities are assigned to values of a random variable by connecting the probabilities associated with the outcomes of the experiment with the values the random variable takes on those outcomes.

Example: (continued) Assume all elements of S are equally likely. Then X = 1 corresponds to the set {(t,t,h), (t,h,t), (h,t,t)}, so P(X = 1) = P({(t,t,h), (t,h,t), (h,t,t)}) = 3/8. Let B = {2, 3}. By X ∈ B we mean the subset A = {(t,h,h), (h,t,h), (h,h,t), (h,h,h)} of S such that X(w) ∈ B for all w ∈ A, so P(X ∈ B) = P(A) = 4/8 = 0.5. By X = j we mean the set A = {w ∈ S : X(w) = j}. So X = 0 is the set A = {(t,t,t)} and P(X = 0) = P(A) = 1/8, etc.

Definition: A function X : S → ℝ mapping elements of the sample space into the real numbers is called a random variable.

Remark: A random variable is a deterministic function. What is random is the selection of w ∈ S.

Example: Suppose you select a point (x, y) at random in the unit circle {(x, y) : x² + y² ≤ 1}. Let D = √(x² + y²) be the distance of the selected point to the origin. Then D is a random variable.
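Example (simulation): a minimal Python sketch, using only the standard library, that estimates P(D ≤ 1/2) by drawing points uniformly in the unit circle; the exact value, 1/4, is derived below.

```python
import math
import random

def sample_distance():
    """Draw a point uniformly from the unit circle by rejection
    sampling and return its distance D to the origin."""
    while True:
        x, y = random.uniform(-1, 1), random.uniform(-1, 1)
        if x * x + y * y <= 1:
            return math.sqrt(x * x + y * y)

n = 100_000
estimate = sum(sample_distance() <= 0.5 for _ in range(n)) / n
print(estimate)  # close to 0.25, the exact value derived below
```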

1.1 Discrete Random Variables

Definition: If the possible values of X are finite or countable, we say that X is a discrete random variable.

Definition: Let x_1, x_2, . . . denote the possible values of X. Then p(x_i) = P(X = x_i) = P(w ∈ S : X(w) = x_i) is called the probability mass function. Notice that p(x_i) ≥ 0 and that

Σ_{i=1}^{∞} p(x_i) = Σ_i P(w ∈ S : X(w) = x_i) = P(S) = 1.

Convenient notation (not used in the book): p_X(a) = P(X = a). This notation is useful when you want to emphasize that you are referring to the random variable X. When there is no danger of confusion we drop the subscript.

Example: (continued) p(0) = 1/8, p(1) = 3/8, p(2) = 3/8, p(3) = 1/8. The probability mass function summarizes all probability information associated with the random variable.
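Example (computation): the pmf of the coin-tossing example can be tabulated by brute-force enumeration of the sample space; a minimal Python sketch, standard library only:

```python
from itertools import product
from collections import Counter
from fractions import Fraction

# Sample space: all 8 equally likely outcomes of three tosses.
S = list(product("th", repeat=3))

# X counts the heads; P(X = x) = |{w : X(w) = x}| / |S|.
counts = Counter(w.count("h") for w in S)
pmf = {x: Fraction(c, len(S)) for x, c in sorted(counts.items())}
print(pmf)  # {0: 1/8, 1: 3/8, 2: 3/8, 3: 1/8}
```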

Example: (continued) P(X ≤ 1) = p(0) + p(1) = 0.5, P(1 ≤ X ≤ 2) = p(1) + p(2) = 3/4, P(X > 1) = p(2) + p(3) = 0.5.

Although the probability mass function contains all information about a discrete random variable, the cumulative distribution function is frequently used.

Definition: The cumulative distribution function (cdf) F of a random variable X is given by F_X(a) = P(X ≤ a). When it is clear that we are talking about the random variable X we may simply write F(a). The notation X ∼ F signifies that F is the distribution of the random variable X. If X is a discrete random variable we can write

F(a) = Σ_{t ≤ a} P(X = t) = Σ_{t ≤ a} p(t).

Example: (continued) F(0) = 1/8, F(1) = 4/8, F(2) = 7/8, F(3) = 1.

Properties of the cdf:
1. F(−∞) = P(X ≤ −∞) = 0,
2. F(∞) = P(X ≤ ∞) = 1,
3. if a < b then F(a) ≤ F(b).

Any function satisfying the above three properties is the cdf of a random variable.

Facts:
1. P(a < X ≤ b) = F(b) − F(a).
2. P(a ≤ X ≤ b) = F(b) − F(a) + p(a).
3. P(a < X < b) = F(b) − F(a) − p(b).
4. P(a ≤ X < b) = F(b) − F(a) − p(b) + p(a).

Example: (continued) P(1 ≤ X ≤ 3) = F(3) − F(1) + p(1) = 1 − 4/8 + 3/8 = 7/8.
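Example (computation): the interval facts are easy to verify mechanically from the pmf; a minimal Python sketch (the helper F below is ours, not from the book):

```python
from fractions import Fraction

pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

def F(a):
    """Cumulative distribution function F(a) = P(X <= a)."""
    return sum(p for x, p in pmf.items() if x <= a)

# Fact 2: P(a <= X <= b) = F(b) - F(a) + p(a).
a, b = 1, 3
print(F(b) - F(a) + pmf[a])            # 7/8, matching the example
print(sum(pmf[x] for x in (1, 2, 3)))  # direct check: also 7/8
```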

1.2 Continuous Random Variables

Although almost all actual measurements made in engineering and scientific work are really discrete, it is often conceptually convenient to think in terms of a continuum of possible values; examples are measurements of height and weight. It is also mathematically convenient to deal with such continuous random variables. We cannot define continuous random variables in terms of a probability mass function because continuous random variables have an uncountable number of possible values. We can, however, work with the concept of the cumulative distribution function and then derive from it a new concept, the probability density function, that has properties similar to those of the probability mass function. Recall

F_X(x) = P(X ≤ x) = P({w ∈ S : X(w) ≤ x}).

Example: Before exploring the properties of the cdf of continuous random variables, let us work out the cdf of the distance to the origin of a point selected at random within the unit circle. Clearly F_D(d) = 1 for all d > 1 and F_D(d) = 0 for all d < 0. What about values of d ∈ [0, 1]?

F_D(d) = P(√(x² + y²) ≤ d) = P(x² + y² ≤ d²) = πd²/π = d².

Notice that F_D(d) is an increasing and continuous function.

Example: What is the probability that D ≤ 1/2? F_D(1/2) = 1/4.

Example: What is the probability that 1/2 < D ≤ 2/3? Clearly F_D(2/3) − F_D(1/2) = 4/9 − 1/4. What is the probability that the distance is exactly equal to 1/2? Zero.

Now let us go back to a generic random variable X:

P(a < X ≤ b) = P(X ≤ b) − P(X ≤ a) = F(b) − F(a).

What is P(X = b)?

P(X = b) ≡ lim_{a↑b} P(a < X ≤ b) = lim_{a↑b} [F(b) − F(a)] = F(b) − F(b⁻).

Here F(b⁻) is the limit of F(x) as x approaches b from the left, so P(X = b) is the jump of F at b. For example, if X is the number of heads in three tosses of a fair coin, we have F(x) = 0.125 for 0 ≤ x < 1 and F(x) = 0.5 for 1 ≤ x < 2. Thus P(X = 1) = F(1) − F(1⁻) = 0.5 − 0.125 = 0.375. What if F is continuous (has no jump) at b? Then P(X = b) = 0. If F is continuous everywhere, then P(X = x) = 0 for all x. What if F is continuously differentiable? Then there exists a function f(x) = F′(x), called the probability density function, such that

F(b) = ∫_{−∞}^{b} f(x) dx.

Consequently,

P(a < X ≤ b) = F(b) − F(a) = ∫_{a}^{b} f(x) dx.

Example: Find the probability density function of D. Clearly f = 0 for d < 0 and for d > 1. For d ∈ [0, 1] we have f(d) = F′(d) = 2d. Notice that


• f(x) ≥ 0, and
• ∫_{−∞}^{∞} f(x) dx = 1.

Since P(x − δ/2 < X ≤ x + δ/2) ≈ f(x)δ for small δ, the value of f(x) is related to the probability that X takes values close to x. For example, if f(x) = 2f(y) it is almost twice as likely for X to fall in a small neighborhood of x as in a small neighborhood of y.

Example: For the random variable D, f(1/4) = 1/2 while f(3/4) = 3/2, so it is 3 times more likely to end up at a distance near 3/4 than at a distance near 1/4.

To summarize, for continuous random variables there is a probability density function f(x) ≥ 0 such that

P(X ∈ A) = ∫_A f(u) du.

In particular, if A = {X ≤ x}, then

F(x) = P(X ≤ x) = ∫_{−∞}^{x} f(u) du.
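Example (computation): for D the approximation P(x − δ/2 < X ≤ x + δ/2) ≈ f(x)δ can be checked directly, since F_D(d) = d² is known exactly; a minimal Python sketch:

```python
delta = 0.01
F = lambda d: d * d   # cdf of D on [0, 1]
f = lambda d: 2 * d   # density of D on [0, 1]

for d in (0.25, 0.75):
    # Exact probability P(d - delta/2 < D <= d + delta/2).
    exact = F(d + delta / 2) - F(d - delta / 2)
    print(d, exact, f(d) * delta)  # the two agree (exactly here, since F is quadratic)
```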

Given the cdf F(x) we can obtain the density function f(x) by differentiation. Conversely, given the density function f(x) we can obtain the cdf F(x) by integration.

Properties of the cdf, as in the discrete case:
1. F(−∞) = P(X ≤ −∞) = 0,
2. F(∞) = P(X ≤ ∞) = 1,
3. if a < b then F(a) ≤ F(b).

However, the calculation of probabilities over intervals is simpler:
1. P(a < X ≤ b) = F(b) − F(a).
2. P(a ≤ X ≤ b) = F(b) − F(a).
3. P(a < X < b) = F(b) − F(a).
4. P(a ≤ X < b) = F(b) − F(a).

Example: f(t) = 0 for t < 0, f(t) = t/2 for 0 < t ≤ 1, f(t) = 0.75 for 1 < t ≤ 2, and f(t) = 0 for t > 2. Then F(a) = a²/4 for 0 < a ≤ 1, F(a) = .25 + .75(a − 1) for 1 < a ≤ 2, and F(a) = 1 for a > 2. The cdf is convenient for probability calculations: P(.1 < X ≤ 1.2) = F(1.2) − F(.1) = .25 + .15 − .01/4 = .40 − .0025 = .3975.
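Example (computation): a minimal Python sketch encoding the piecewise cdf above and reproducing the calculation:

```python
def F(a):
    """cdf of the piecewise density f in the example above."""
    if a <= 0:
        return 0.0
    if a <= 1:
        return a * a / 4
    if a <= 2:
        return 0.25 + 0.75 * (a - 1)
    return 1.0

print(F(1.2) - F(0.1))  # 0.3975
```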

2 Joint Random Variables

Often we need to deal with two or more random variables at the same time. Here we discuss the case of discrete and continuous joint random variables.

2.1 Joint Distributions of Discrete Random Variables

Example: Toss three fair coins and let X be the number of heads in the first two tosses and Y the number of heads in all three tosses. Give an explicit mapping to obtain p(0, 0) = p(0, 1) = p(2, 2) = p(2, 3) = 1/8 and p(1, 1) = p(1, 2) = 1/4. Let A = {(0, 0), (1, 1), (2, 2)}; then p(A) = 0.5. What is P(X = 1)? Well, X = 1 is equivalent to the set {(1, 1), (1, 2)}, so P(X = 1) = 0.5.

2.1.1 Marginal Probability Mass Functions

In general, p_X(x) = Σ_y p(x, y) and p_Y(y) = Σ_x p(x, y) are known as the marginal probability mass functions. If X and Y are discrete random variables, there is a joint probability mass function p(·, ·) with the following three properties:

p(x, y) ≥ 0,

Σ_x Σ_y p(x, y) = 1,

P((X, Y) ∈ A) = Σ_{(x,y)∈A} p(x, y).
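Example (computation): continuing the three-coin example, the joint pmf and its marginals can be tabulated directly; a minimal Python sketch, standard library only:

```python
from itertools import product
from collections import defaultdict
from fractions import Fraction

joint = defaultdict(Fraction)
for w in product("th", repeat=3):
    x = w[:2].count("h")  # heads in the first two tosses
    y = w.count("h")      # heads in all three tosses
    joint[(x, y)] += Fraction(1, 8)

# Marginals: sum the joint pmf over the other variable.
pX, pY = defaultdict(Fraction), defaultdict(Fraction)
for (x, y), p in joint.items():
    pX[x] += p
    pY[y] += p

print(dict(joint))  # p(1,1) = p(1,2) = 1/4, the four corners 1/8 each
print(dict(pX))     # P(X = 1) = 1/2, as computed above
```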

2.1.2 Conditional Distribution: Discrete Case

We define the conditional pmf of X given Y = y as

p_{X|Y}(x|y) = P(X = x | Y = y) = p(x, y) / p_Y(y),

provided p_Y(y) > 0, and it is defined to be zero otherwise.

Example: Referring to our earlier example, we see that

P(X = 1 | Y = 2) = .25 / (.25 + .125) = 2/3.

Notice also that p(x, y) = p_{X|Y}(x|y) p_Y(y), and therefore by adding over y we can obtain p_X(x).

Example: Suppose P(X = k | N = n) = C(n, k) p^k (1 − p)^{n−k} for k = 0, 1, . . . , n, and P(N = n) = exp(−λ) λ^n / n!. Then P(X = k, N = n) = P(X = k | N = n) P(N = n) and

P(X = k) = Σ_{n ≥ k} P(X = k, N = n).

Work it out and it turns out that

P(X = k) = exp(−λp) (λp)^k / k!.
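Example (computation): the identity can be checked numerically by truncating the sum over n; a minimal Python sketch, standard library only (λ = 3 and p = 0.4 are arbitrary choices):

```python
from math import comb, exp, factorial

lam, p, k = 3.0, 0.4, 2

# P(X = k) = sum over n >= k of Binomial(k; n, p) * Poisson(n; lam).
total = sum(
    comb(n, k) * p**k * (1 - p) ** (n - k) * exp(-lam) * lam**n / factorial(n)
    for n in range(k, 200)  # truncation; the tail is negligible
)
closed_form = exp(-lam * p) * (lam * p) ** k / factorial(k)
print(total, closed_form)  # both approximately 0.21686
```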

2.2 Joint Distribution of Continuous Random Variables

If X and Y are continuous random variables, there exists a joint density function f(x, y) with the following three properties:

f(x, y) ≥ 0,

∫∫ f(x, y) dx dy = 1, and,

for well-defined subsets A of ℝ²,

P((X, Y) ∈ A) = ∫∫_A f(x, y) dx dy.

Example: f(x, y) = 1 on the unit square. What is the probability that (X, Y) is in the set [0, .5] × [.5, .8]? It is the area of the set, .5 × .3 = .15.

2.2.1 Marginal Density Functions

Let F(x, y) = P(X ≤ x, Y ≤ y). This is called the joint cdf of X and Y. When X and Y are continuous, we have

F(x, y) = ∫_{−∞}^{x} ∫_{−∞}^{y} f(u, v) dv du.

Now,

P(X ≤ x) = F(x, ∞) = ∫_{−∞}^{x} [ ∫_{−∞}^{∞} f(u, v) dv ] du.

On the other hand,

P(X ≤ x) = ∫_{−∞}^{x} f_X(u) du.

So it follows that

f_X(x) = ∫_{−∞}^{∞} f(x, v) dv.

Consequently, the density of X is obtained by integrating out the second variable. Similarly, the density of Y is obtained by integrating out the first variable:

f_Y(y) = ∫_{−∞}^{∞} f(u, y) du.

Example: f(x, y) = (12/7)(x² + xy) on the unit square. Compute f_X and f_Y. Integrating, we obtain f_X(x) = (12/7)(x² + x/2) and f_Y(y) = (4 + 6y)/7.

Q. How can we compute P(x_1 < X < x_2, y_1 < Y < y_2) from F(·, ·)?
A. F(x_2, y_2) − F(x_1, y_2) − F(x_2, y_1) + F(x_1, y_1).
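Example (computation): the marginal densities in the example above can be verified symbolically; a minimal sketch assuming the third-party sympy package is available:

```python
import sympy as sp

x, y = sp.symbols("x y", nonnegative=True)
f = sp.Rational(12, 7) * (x**2 + x * y)  # joint density on the unit square

fX = sp.integrate(f, (y, 0, 1))  # integrate out the second variable
fY = sp.integrate(f, (x, 0, 1))  # integrate out the first variable
print(sp.simplify(fX))  # 12*x**2/7 + 6*x/7, i.e. (12/7)(x**2 + x/2)
print(sp.simplify(fY))  # 6*y/7 + 4/7, i.e. (4 + 6y)/7
```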


2.2.2 Conditional Distribution: Continuous Case

In the continuous case the conditional density of X given Y = y is defined as

f_{X|Y}(x|y) = f(x, y) / f_Y(y),

provided f_Y(y) > 0, and it is defined to be zero otherwise.

Example: f(x, y) = λ²e^{−λy} on 0 ≤ x ≤ y. Then f_X(x) = λe^{−λx}, f_Y(y) = λ²ye^{−λy}, and

f_{X|Y}(x|y) = f(x, y) / f_Y(y) = 1/y

on 0 ≤ x ≤ y; that is, given Y = y, X is uniform on [0, y]. Notice that f(x, y) = f_{X|Y}(x|y) f_Y(y) and that by integrating over y we can obtain f_X(x).
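Example (computation): the last remark, that integrating f_{X|Y}(x|y) f_Y(y) over y recovers f_X(x), can be checked numerically; a minimal sketch assuming the third-party scipy package is available (λ = 2 and x = 0.7 are arbitrary choices):

```python
from math import exp
from scipy.integrate import quad

lam = 2.0
f_Y = lambda y: lam**2 * y * exp(-lam * y)  # marginal density of Y
f_X_given_Y = lambda x, y: 1 / y            # uniform on [0, y]

x0 = 0.7
# f_X(x0) = integral of f_{X|Y}(x0|y) f_Y(y) over y > x0,
# where the joint density is positive (upper limit 50 truncates the tail).
val, _ = quad(lambda y: f_X_given_Y(x0, y) * f_Y(y), x0, 50)
print(val, lam * exp(-lam * x0))  # both approximately 0.4932
```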

3 Independent Random Variables

Let X and Y be random variables with joint cdf F(x, y). Recall that F_X(x) = F(x, ∞) and F_Y(y) = F(∞, y). X and Y are said to be independent if F(x, y) = F_X(x)F_Y(y). Notice that this definition is valid for both discrete and continuous random variables. If X and Y are continuous and independent, then f(x, y) = f_X(x)f_Y(y). If X and Y are discrete and independent, then p(x, y) = p_X(x)p_Y(y).
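Example (computation): for the pair (X, Y) of Section 2.1 the product test fails, so X and Y are not independent (unsurprisingly, since both count heads from the same tosses); a minimal Python sketch:

```python
from itertools import product
from collections import defaultdict
from fractions import Fraction

joint, pX, pY = defaultdict(Fraction), defaultdict(Fraction), defaultdict(Fraction)
for w in product("th", repeat=3):
    x, y = w[:2].count("h"), w.count("h")
    joint[(x, y)] += Fraction(1, 8)
    pX[x] += Fraction(1, 8)
    pY[y] += Fraction(1, 8)

# Independence would require p(x, y) = pX(x) * pY(y) for every pair.
independent = all(joint[(x, y)] == pX[x] * pY[y] for x in pX for y in pY)
print(independent)  # False: e.g. p(0,3) = 0 but pX(0)*pY(3) = (1/4)(1/8)
```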

