Stochastic Signal Processing

Dr. Đỗ Hồng Tuấn
Department of Telecommunications, Faculty of Electrical and Electronics Engineering
Ho Chi Minh City University of Technology (HCMUT)
E-mail: [email protected]


Outline (1)
Chapter 1: Discrete-Time Signal Processing. z-Transform; Linear Time-Invariant Filters; Discrete Fourier Transform (DFT).
Chapter 2: Stochastic Processes and Models. Review of Probability and Random Variables; Stochastic Models; Stochastic Processes.
Chapter 3: Spectrum Analysis. Spectral Density; Spectral Representation of Stochastic Processes; Spectral Estimation; Other Statistical Characteristics of a Stochastic Process.


Outline (2)
Chapter 4: Eigenanalysis. Properties of Eigenvalues and Eigenvectors.
Chapter 5: Wiener Filters. Minimum Mean-Squared Error; Wiener-Hopf Equations; Channel Equalization; Linearly Constrained Minimum Variance (LCMV) Filter.
Chapter 6: Linear Prediction. Forward Linear Prediction; Backward Linear Prediction; Levinson-Durbin Algorithm.
Chapter 7: Kalman Filters. Kalman Algorithm; Applications of the Kalman Filter: Tracking the Trajectory of an Object, and System Identification.


Outline (3)
Chapter 8: Linear Adaptive Filtering. Adaptive Filters and Applications; Method of Steepest Descent; Least-Mean-Square Algorithm; Recursive Least-Squares Algorithm; Examples of Adaptive Filtering: Adaptive Equalization and Adaptive Beamforming.
Chapter 9: Estimation Theory. Fundamentals; Minimum Variance Unbiased Estimators; Maximum Likelihood Estimation; Eigenanalysis Algorithms for Spectral Estimation; Example: Direction-of-Arrival (DoA) Estimation.


References
[1] Simon Haykin, Adaptive Filter Theory, Prentice Hall, 1996 (3rd Ed.), 2001 (4th Ed.).
[2] Steven M. Kay, Fundamentals of Statistical Signal Processing: Estimation Theory, Prentice Hall, 1993.
[3] Alan V. Oppenheim, Ronald W. Schafer, Discrete-Time Signal Processing, Prentice Hall, 1989.
[4] Athanasios Papoulis, Probability, Random Variables, and Stochastic Processes, McGraw-Hill, 1991 (3rd Ed.), 2001 (4th Ed.).
[5] Dimitris G. Manolakis, Vinay K. Ingle, Stephen M. Kogon, Statistical and Adaptive Signal Processing, Artech House, 2005.


Goal of the course
Introduction to the theory and algorithms used for the analysis and processing of random (stochastic) signals, and their applications to communications problems.
• Understanding random signals via their statistical description (theory of probability, random variables and stochastic processes), their models, and the dependence between the samples of one or more discrete-time random signals.
• Developing theoretical methods and practical techniques for processing random signals to achieve a predefined, application-dependent objective.
• Major applications in communications: signal modeling, spectral estimation, frequency-selective filtering, adaptive filtering, array processing, etc.


Random Signals vs. Deterministic Signals Example: Discrete-time random signals (a) and the dependence between the samples (b), [5].


Techniques for Processing Random Signals
• Signal analysis (signal modeling, spectral estimation): the primary goal is to extract useful information for understanding and classifying the signals. Typical applications: extraction of useful information from received signals, system modeling/identification, detection and classification of radar and sonar targets, speech recognition, signal representation for data compression, etc.
• Signal filtering (frequency-selective filtering, adaptive filtering, array processing): the main objective is to improve the quality of a signal according to a performance criterion. Typical applications: noise and interference cancellation, echo cancellation, channel equalization, active noise control, etc.


Applications of Adaptive Filters (1) System identification [5]:

System inversion [5]:


Applications of Adaptive Filters (2) Signal prediction [5]:

Interference cancellation [5]:


Applications of Adaptive Filters (3) Simple model of a digital communications system with channel equalization [5]:

Pulse trains: (a) without intersymbol interference (ISI) and (b) with ISI. ISI distortion: the tails of adjacent pulses interfere with the current pulse and can lead to an incorrect decision. The equalizer can compensate for the ISI distortion.


Applications of Adaptive Filters (4) Block diagram of the basic components of an active noise control system [5]:


Applications of Array Processing (1) Beamforming [5]:


Applications of Array Processing (2) Example of (adaptive) beamforming for interference mitigation with an airborne radar [5]:


Applications of Array Processing (3) (Adaptive) Sidelobe canceler [5]:


Chapter 1: Discrete-Time Signal Processing
• z-Transform.
• Linear Time-Invariant Filters.
• Discrete Fourier Transform (DFT).


1. Z-Transform (1)
• Discrete-time signals: signals described as a time series, i.e., a sequence of uniformly spaced samples whose varying amplitudes carry the useful information content of the signal.
• Consider a time series (sequence) {u(n)}, also written u(n), with samples u(n), u(n-1), u(n-2), ..., where n is the discrete time index.
• z-transform of u(n):

  U(z) = Z[u(n)] = \sum_{n=-\infty}^{\infty} u(n)\, z^{-n}     (1.1)

z: complex variable. z-transform pair: u(n) ↔ U(z).
• Region of convergence (ROC): the set of values of z for which U(z) converges uniformly.


1. Z-Transform (2)
• Properties:
  - Linearity (superposition):

    a\,u_1(n) + b\,u_2(n) \leftrightarrow a\,U_1(z) + b\,U_2(z)     (1.2)

    ROC of (1.2): intersection of the ROC of U_1(z) and the ROC of U_2(z).
  - Time shifting:

    u(n) \leftrightarrow U(z) \;\Rightarrow\; u(n - n_0) \leftrightarrow z^{-n_0} U(z)     (1.3)

    n_0: integer. ROC of (1.3): same as the ROC of U(z).
    Special case: n_0 = 1 ⇒ u(n-1) ↔ z^{-1} U(z); z^{-1} is the unit-delay element.
  - Convolution theorem:

    \sum_{i=-\infty}^{\infty} u_1(i)\, u_2(n - i) \leftrightarrow U_1(z)\, U_2(z)     (1.4)

    ROC of (1.4): intersection of the ROC of U_1(z) and the ROC of U_2(z).

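As a quick illustration of the convolution theorem (1.4), the sketch below (a minimal Python check; the two short causal sequences are assumed, not from the slides) verifies that time-domain convolution matches multiplication of the z-transform polynomials in z^{-1}.

```python
import numpy as np

# Finite-length causal sequences, so the z-transforms are polynomials in z^-1.
u1 = np.array([1.0, 2.0, 3.0])      # U1(z) = 1 + 2 z^-1 + 3 z^-2
u2 = np.array([4.0, 5.0])           # U2(z) = 4 + 5 z^-1

conv_time = np.convolve(u1, u2)                      # convolution sum of (1.4)
conv_z = np.polynomial.polynomial.polymul(u1, u2)    # coefficients of U1(z) U2(z)

print(conv_time)   # [ 4. 13. 22. 15.]
print(conv_z)      # identical coefficients
```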

1. Linear Time-Invariant (LTI) Filters (1)
• Definition: an LTI filter maps an input v(n) to an output u(n) such that
  - Linearity: the input a v_1(n) + b v_2(n) produces the output a u_1(n) + b u_2(n);
  - Time invariance: the input v(n - k) produces the output u(n - k).
• Impulse response h(n): the output of the LTI filter when the input is a unit impulse applied at time n = 0.
  For an arbitrary input v(n), the output is given by the convolution sum

  u(n) = \sum_{i=-\infty}^{\infty} h(i)\, v(n - i)     (1.5)


1. LTI Filters (2)
• Transfer function:
  - Applying the z-transform to both sides of (1.5):

    U(z) = H(z)\,V(z),  where  U(z) ↔ u(n),  V(z) ↔ v(n),  H(z) ↔ h(n)     (1.6)

    H(z) is the transfer function of the filter:

    H(z) = \frac{U(z)}{V(z)}     (1.7)

  - When the input sequence v(n) and the output sequence u(n) are related by a difference equation of order N,

    \sum_{j=0}^{N} a_j\, u(n-j) = \sum_{j=0}^{N} b_j\, v(n-j)     (1.8)

    with constant coefficients a_j, b_j, applying the z-transform gives

    H(z) = \frac{U(z)}{V(z)} = \frac{\sum_{j=0}^{N} b_j z^{-j}}{\sum_{j=0}^{N} a_j z^{-j}} = \frac{b_0}{a_0} \cdot \frac{\prod_{k=1}^{N} (1 - c_k z^{-1})}{\prod_{k=1}^{N} (1 - d_k z^{-1})}     (1.9)

1. LTI Filters (3)
• From (1.9), there are two distinct types of LTI filters:
  - Finite-duration impulse response (FIR) filters: d_k = 0 for all k (all-zero filter); h(n) has finite duration.
  - Infinite-duration impulse response (IIR) filters: H(z) has at least one non-zero pole; h(n) has infinite duration. When c_k = 0 for all k, the filter is an all-pole filter.
  - See examples of FIR and IIR filters in the next two slides.

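The following sketch (assuming NumPy/SciPy are available; the coefficients are illustrative, not taken from the slides) implements the difference equation (1.8) with scipy.signal.lfilter and contrasts an FIR impulse response with an IIR one.

```python
import numpy as np
from scipy.signal import lfilter

# lfilter(b, a, v) computes u(n) from sum_j a_j u(n-j) = sum_j b_j v(n-j).
v = np.zeros(8)
v[0] = 1.0                                   # unit impulse input

# FIR example: all d_k = 0 -> a = [1];  u(n) = 0.5 v(n) + 0.5 v(n-1)
u_fir = lfilter(b=[0.5, 0.5], a=[1.0], x=v)

# IIR example: one pole at z = 0.9;  u(n) = v(n) + 0.9 u(n-1)
u_iir = lfilter(b=[1.0], a=[1.0, -0.9], x=v)

print(u_fir)   # impulse response is zero after two samples (finite duration)
print(u_iir)   # impulse response 0.9^n decays but never reaches zero (infinite duration)
```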

1. LTI Filters (4)
[Figure: block diagram of an FIR (transversal) filter. The input v(n) passes through a chain of unit-delay elements z^{-1}; the delayed samples v(n-1), ..., v(n-M) are weighted by the coefficients a_1, ..., a_M and summed to form the output u(n).]

1. LTI Filters (5)
[Figure: block diagram of an all-pole IIR filter. The output u(n) is fed back through unit-delay elements z^{-1}, weighted by a_1, ..., a_M, and added to the input v(n) to form u(n).]

1. LTI Filters (6)
• Causality and stability:
  - An LTI filter is causal if

    h(n) = 0 \quad \text{for } n < 0     (1.10)

  - An LTI filter is stable if the output sequence is bounded for every bounded input sequence. From (1.5), the necessary and sufficient condition is

    \sum_{k=-\infty}^{\infty} |h(k)| < \infty     (1.11)

  - A causal LTI filter is stable if and only if all of the poles of the filter's transfer function lie inside the unit circle in the z-plane (the region of stability). (See more in [3].)
[Figure: z-plane with the unit circle; the region of stability is the interior of the unit circle.]
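A minimal sketch of the pole test stated above, with an illustrative (assumed) denominator: writing the denominator of H(z) = 1/(1 - 0.9 z^{-1} + 0.2 z^{-2}) as the polynomial z^2 - 0.9 z + 0.2, its roots are the poles of H(z).

```python
import numpy as np

a = [1.0, -0.9, 0.2]                       # coefficients of z^2 - 0.9 z + 0.2
poles = np.roots(a)
print(poles)                               # [0.5, 0.4]
print(np.all(np.abs(poles) < 1.0))         # True -> the causal filter is stable
```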

1. Discrete Fourier Transform (DFT) (1)
• The Fourier transform of a sequence u(n) is obtained from the z-transform by setting z = exp(j2πf), where f is the real frequency variable. When u(n) has finite duration, its Fourier representation becomes the discrete Fourier transform (DFT). For numerical computation of the DFT, the efficient fast Fourier transform (FFT) is used.
• u(n): finite-duration sequence of length N. The DFT of u(n) is

  U(k) = \sum_{n=0}^{N-1} u(n) \exp\!\left(-\frac{j 2\pi k n}{N}\right), \quad k = 0, \ldots, N-1     (1.12)

  The inverse DFT (IDFT) of U(k) is

  u(n) = \frac{1}{N} \sum_{k=0}^{N-1} U(k) \exp\!\left(\frac{j 2\pi k n}{N}\right), \quad n = 0, \ldots, N-1     (1.13)

  u(n) and U(k) have the same length N, hence the name "N-point DFT".
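The sketch below checks (1.12)-(1.13) numerically for a short assumed sequence: a direct DFT from the definition agrees with numpy.fft, and the IDFT recovers u(n).

```python
import numpy as np

rng = np.random.default_rng(0)
N = 8
u = rng.standard_normal(N)

# Direct DFT from the definition (1.12)
n = np.arange(N)
W = np.exp(-2j * np.pi * np.outer(n, n) / N)   # W[k, n] = exp(-j 2 pi k n / N)
U_direct = W @ u

U_fft = np.fft.fft(u)                          # FFT computes the same transform
print(np.allclose(U_direct, U_fft))            # True
print(np.allclose(np.fft.ifft(U_fft), u))      # IDFT (1.13) recovers u(n): True
```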

Chapter 2: Stochastic Processes and Models
• Review of Probability and Random Variables.
• Review of Stochastic Processes and Stochastic Models.

2. Review of Probability and Random Variables
• Axioms of Probability.
• Repeated Trials.
• Concepts of Random Variables.
• Functions of Random Variables.
• Moments and Conditional Statistics.
• Sequences of Random Variables.

2. Review of Probability Theory: Basics (1)
• Probability theory deals with the study of random phenomena, which under repeated experiments yield different outcomes that have certain underlying patterns about them. The notion of an experiment assumes a set of repeatable conditions that allow any number of identical repetitions. When an experiment is performed under these conditions, certain elementary events ξ_i occur in different but completely uncertain ways. We can assign a nonnegative number P(ξ_i), the probability of the event ξ_i, in various ways.
Laplace's classical definition: the probability of an event A is defined a priori, without actual experimentation, as

  P(A) = \frac{\text{number of outcomes favorable to } A}{\text{total number of possible outcomes}},

provided all these outcomes are equally likely.

2. Review of Probability Theory: Basics (2)
Relative frequency definition: the probability of an event A is defined as

  P(A) = \lim_{n \to \infty} \frac{n_A}{n},

where n_A is the number of occurrences of A and n is the total number of trials.
The axiomatic approach to probability, due to Kolmogorov, developed through a set of axioms (below), is generally recognized as superior to the above definitions, as it provides a solid foundation for complicated applications.

2. Review of Probability Theory: Basics (3)
The totality of all ξ_i, known a priori, constitutes a set Ω, the set of all experimental outcomes:

  Ω = \{ ξ_1, ξ_2, \ldots, ξ_k, \ldots \}

Ω has subsets A, B, C, .... Recall that if A is a subset of Ω, then ξ ∈ A implies ξ ∈ Ω. From A and B, we can generate other related subsets A ∪ B, A ∩ B, \bar{A}, \bar{B}:

  A ∪ B = \{ ξ \mid ξ ∈ A \text{ or } ξ ∈ B \},
  A ∩ B = \{ ξ \mid ξ ∈ A \text{ and } ξ ∈ B \},
  \bar{A} = \{ ξ \mid ξ ∉ A \}.

2. Review of Probability Theory: Basics (4)
[Figure: Venn diagrams of A, B, A ∪ B, A ∩ B, and the complement of A.]
If A ∩ B = φ (the empty set), then A and B are said to be mutually exclusive (M.E.). A partition of Ω is a collection of mutually exclusive subsets of Ω such that their union is Ω:

  A_i ∩ A_j = φ \ (i ≠ j), \quad \text{and} \quad \bigcup_i A_i = Ω.

[Figure: a partition A_1, A_2, ..., A_n of Ω, and two mutually exclusive sets A, B with A ∩ B = φ.]

2. Review of Probability Theory: Basics (5)
De Morgan's laws:

  \overline{A ∪ B} = \bar{A} ∩ \bar{B}; \qquad \overline{A ∩ B} = \bar{A} ∪ \bar{B}

[Figure: Venn diagrams illustrating De Morgan's laws.]
Often it is meaningful to talk about at least some of the subsets of Ω as events, for which we must have a mechanism to compute their probabilities.
Example: Consider the experiment where two coins are simultaneously tossed. The various elementary events are

2. Review of Probability Theory: Basics (6)

  ξ_1 = (H, H), \; ξ_2 = (H, T), \; ξ_3 = (T, H), \; ξ_4 = (T, T), \quad \text{and} \quad Ω = \{ ξ_1, ξ_2, ξ_3, ξ_4 \}.

The subset A = { ξ_1, ξ_2, ξ_3 } is the same as "head has occurred at least once" and qualifies as an event. Suppose two subsets A and B are both events; then consider:
"Does an outcome belong to A or B?" (A ∪ B)
"Does an outcome belong to A and B?" (A ∩ B)
"Does an outcome fall outside A?"

2. Review of Probability Theory: Basics (7)
Thus the sets A ∪ B, A ∩ B, \bar{A}, \bar{B}, etc., also qualify as events. We shall formalize this using the notion of a field.
• Field: a collection of subsets of a nonempty set Ω forms a field F if
  (i) Ω ∈ F;
  (ii) if A ∈ F, then \bar{A} ∈ F;
  (iii) if A ∈ F and B ∈ F, then A ∪ B ∈ F.
Using (i)-(iii), it is easy to show that A ∩ B, \bar{A} ∩ B, etc., also belong to F. For example, from (ii) we have \bar{A} ∈ F and \bar{B} ∈ F; using (iii) this gives \bar{A} ∪ \bar{B} ∈ F, and applying (ii) again we get \overline{\bar{A} ∪ \bar{B}} = A ∩ B ∈ F, where we have used De Morgan's theorem from Basics (5).

2. Review of Probability Theory: Basics (8)
• Axioms of Probability
For any event A, we assign a number P(A), called the probability of the event A. This number satisfies the following three conditions, which act as the axioms of probability:
  (i) P(A) ≥ 0 (probability is a nonnegative number);
  (ii) P(Ω) = 1 (probability of the whole set is unity);
  (iii) if A ∩ B = φ, then P(A ∪ B) = P(A) + P(B).
(Note that (iii) applies when A and B are mutually exclusive (M.E.) events.)

2. Review of Probability Theory: Basics (9)
The following conclusions follow from these axioms:
  a. P(A ∪ \bar{A}) = P(A) + P(\bar{A}) = 1, or P(\bar{A}) = 1 − P(A).
  b. P{φ} = 0.
  c. If A and B are not mutually exclusive (M.E.), then P(A ∪ B) = P(A) + P(B) − P(AB).
• Conditional Probability and Independence
In N independent trials, suppose N_A, N_B, N_{AB} denote the number of times the events A, B and AB occur, respectively. For large N,

  P(A) ≈ \frac{N_A}{N}, \quad P(B) ≈ \frac{N_B}{N}, \quad P(AB) ≈ \frac{N_{AB}}{N}.

Among the N_A occurrences of A, only N_{AB} of them are also found among the N_B occurrences of B. Thus the ratio

  \frac{N_{AB}}{N_B} = \frac{N_{AB}/N}{N_B/N} = \frac{P(AB)}{P(B)}

2. Review of Probability Theory: Basics (10)
is a measure of "the event A given that B has already occurred". We denote this conditional probability by P(A|B), the probability of "the event A given that B has occurred". We define

  P(A \mid B) = \frac{P(AB)}{P(B)},

provided P(B) ≠ 0. As shown below, this definition satisfies all of the probability axioms discussed earlier.
Independence: A and B are said to be independent events if P(AB) = P(A) ⋅ P(B). Then

  P(A \mid B) = \frac{P(AB)}{P(B)} = \frac{P(A)\, P(B)}{P(B)} = P(A).

Thus if A and B are independent, the event that B has occurred does not shed any more light on the event A. It makes no difference to A whether B has occurred or not.

2. Review of Probability Theory: Basics (11)
Let

  A = A_1 ∪ A_2 ∪ A_3 ∪ \cdots ∪ A_n,     (*)

a union of n independent events. Then by De Morgan's law

  \bar{A} = \bar{A}_1 \bar{A}_2 \cdots \bar{A}_n,

and using their independence,

  P(\bar{A}) = P(\bar{A}_1 \bar{A}_2 \cdots \bar{A}_n) = \prod_{i=1}^{n} P(\bar{A}_i) = \prod_{i=1}^{n} (1 - P(A_i)).

Thus for any A as in (*),

  P(A) = 1 - P(\bar{A}) = 1 - \prod_{i=1}^{n} (1 - P(A_i)),

a useful result.

2. Review of Probability Theory: Basics (12)
Bayes' theorem:

  P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}

Although simple enough, Bayes' theorem has an interesting interpretation: P(A) represents the a priori probability of the event A. Suppose B has occurred, and assume that A and B are not independent. How can this new information be used to update our knowledge about A? Bayes' rule above takes into account the new information ("B has occurred") and gives the a posteriori probability of A given B.
We can also view the event B as new knowledge obtained from a fresh experiment. We know something about A as P(A). The new information is available in terms of B, and it should be used to improve our knowledge/understanding of A. Bayes' theorem gives the exact mechanism for incorporating such new information.

2. Review of Probability Theory: Basics (13)
Let A_1, A_2, ..., A_n be pairwise disjoint, A_i A_j = φ, whose union covers B. Then we have

  P(B) = \sum_{i=1}^{n} P(B A_i) = \sum_{i=1}^{n} P(B \mid A_i)\, P(A_i).

A more general version of Bayes' theorem:

  P(A_i \mid B) = \frac{P(B \mid A_i)\, P(A_i)}{P(B)} = \frac{P(B \mid A_i)\, P(A_i)}{\sum_{i=1}^{n} P(B \mid A_i)\, P(A_i)}.

2. Review of Probability Theory: Basics (14)
Example: Three switches connected in parallel operate independently. Each switch remains closed with probability p. (a) Find the probability of receiving an input signal at the output. (b) Find the probability that switch S1 is open given that an input signal is received at the output.
[Figure: three switches s1, s2, s3 connected in parallel between the input and the output.]
Solution: a. Let A_i = "switch S_i is closed". Then P(A_i) = p, i = 1, 2, 3. Since the switches operate independently, we have

  P(A_i A_j) = P(A_i)\, P(A_j); \quad P(A_1 A_2 A_3) = P(A_1)\, P(A_2)\, P(A_3).

2. Review of Probability Theory: Basics (15)
Let R = "an input signal is received at the output". For the event R to occur, either switch 1 or switch 2 or switch 3 must remain closed, i.e.,

  R = A_1 ∪ A_2 ∪ A_3.

Using the result from Basics (11),

  P(R) = P(A_1 ∪ A_2 ∪ A_3) = 1 - (1 - p)^3 = 3p - 3p^2 + p^3.

b. We need P(\bar{A}_1 \mid R). From Bayes' theorem,

  P(\bar{A}_1 \mid R) = \frac{P(R \mid \bar{A}_1)\, P(\bar{A}_1)}{P(R)} = \frac{(2p - p^2)(1 - p)}{3p - 3p^2 + p^3} = \frac{2 - 3p + p^2}{3 - 3p + p^2}.

Because of the symmetry of the switches, we also have P(\bar{A}_1 \mid R) = P(\bar{A}_2 \mid R) = P(\bar{A}_3 \mid R).

2. Review of Probability Theory: Repeated Trials (1) ‰ Consider two independent experiments with associated probability models (Ω1, F1, P1) and (Ω2, F2, P2). Let ξ∈Ω1, η∈Ω2 represent elementary events. A joint performance of the two experiments produces an elementary events ω = (ξ, η). How to characterize an appropriate probability to this “combined event” ? Consider the Cartesian product space Ω = Ω1× Ω2 generated from Ω1 and Ω2 such that if ξ ∈ Ω1 and η ∈ Ω2 , then every ω in Ω is an ordered pair of the form ω = (ξ, η). To arrive at a probability model we need to define the combined trio (Ω, F, P). Suppose A∈F1 and B ∈ F2. Then A × B is the set of all pairs (ξ, η), where ξ ∈ A and η ∈ B. Any such subset of Ω appears to be a legitimate event for the combined experiment. Let F denote the field composed of all such subsets A × B together with their unions and compliments. In this combined experiment, the probabilities of the events A × Ω2 and Ω1 × B are such that P( A × Ω 2 ) = P1 ( A), P (Ω1 × B ) = P2 ( B ). Boä moân Vieãn thoâng Khoa Ñieän-Ñieän töû


2. Review of Probability Theory: Repeated Trials (2) Moreover, the events A × Ω2 and Ω1 × B are independent for any A ∈ F1 and B ∈ F2 . Since ( A × Ω 2 ) ∩ (Ω1 × B ) = A × B, then P( A × B ) = P( A × Ω 2 ) ⋅ P(Ω1 × B ) = P1 ( A) P2 ( B )

for all A ∈ F1 and B ∈ F2 . This equation extends to a unique probability measure P( ≡ P1 × P2 ) on the sets in F and defines the combined trio (Ω, F, P). ‰ Generalization: Given n experiments Ω1 , Ω 2 , , Ωn , and their associated Fi and Pi , i = 1 → n , let Ω = Ω1 × Ω2 ×

× Ωn

represent their Cartesian product whose elementary events are the ordered n-tuples ξ 1 , ξ 2 , , ξ n , where ξ i ∈ Ω i . Events in this combined space are of the form A1 × A2 × × An where Ai ∈ Fi , Boä moân Vieãn thoâng Khoa Ñieän-Ñieän töû

44

SSP2008 BG, CH, ÑHBK

2. Review of Probability Theory: Repeated Trials (3) If all these n experiments are independent, and P(Ai) is the probability of the event Ai in Fi then as before

P ( A1 × A2 ×

× An ) = P1 ( A1 ) P2 ( A2 )

Pn ( An ).

Example: An event A has probability p of occurring in a single trial. Find the probability that A occurs exactly k times, k ≤ n in n trials. Solution: Let (Ω, F, P) be the probability model for a single trial. The outcome of n experiments is an n-tuple

ω = { ξ1 , ξ 2 , , ξ n }∈ Ω0 , × Ω. The event A occurs where every ξi ∈ Ω and Ω 0 = Ω × Ω × at trial # i , if ξ i ∈ A . Suppose A occurs exactly k times in ω. Then k of the ξi belong to A, say ξ i1 , ξ i 2 , , ξ i k , and the remaining n-k are contained in its compliment in A .

Boä moân Vieãn thoâng Khoa Ñieän-Ñieän töû

45

SSP2008 BG, CH, ÑHBK

2. Review of Probability Theory: Repeated Trials (4) Using independence, the probability of occurrence of such an ω is given by P0 (ω ) = P ({ ξi1 , ξi2 , = P ( A) P ( A)

, ξik ,

, ξin }) = P({ ξ i1 }) P({ ξi2 })

P({ ξik })

P({ ξin })

P( A) = p k q n −k .

P( A) P ( A) P ( A) n −k

k

However the k occurrences of A can occur in any particular location inside ω. Let ω1 , ω 2 , , ω N represent all such events in which A occurs exactly k times. Then

" A occurs exactly k times in n trials" = ω1 ∪ ω 2 ∪

∪ ωN .

But, all these ωi s are mutually exclusive, and equiprobable. Thus P (" A occurs exactly k times in n trials" ) N

= ∑ P0 (ωi ) = NP0 (ω ) = Np k q n −k , i =1

Boä moân Vieãn thoâng Khoa Ñieän-Ñieän töû

46

SSP2008 BG, CH, ÑHBK

2. Review of Probability Theory: Repeated Trials (5) Recall that, starting with n possible choices, the first object can be chosen n different ways, and for every such choice the second one in (n-1) ways, … and the kth one (n-k+1) ways, and this gives the total choices for k objects out of n to be n(n-k)…(n-k+1). But, this includes the k! choices among the k objects that are indistinguishable for identical objects. As a result ⎛n⎞ n( n − 1) (n − k + 1) n! = =⎜ ⎟ N= k! ( n − k )! k! ⎜⎝ k ⎟⎠ represents the number of combinations, or choices of n identical objects taken k at a time. Thus, we obtain Bernoulli formula

Pn ( k ) = P(" A occurs exactly k times in n trials" ) ⎛ n ⎞ k n −k = ⎜⎜ ⎟⎟ p q , k = 0,1,2, ⎝k ⎠

Boä moân Vieãn thoâng Khoa Ñieän-Ñieän töû

47

, n,

SSP2008 BG, CH, ÑHBK

2. Review of Probability Theory: Repeated Trials (6) Independent repeated experiments of this nature, where the outcome is either a “success” ( = A) or a “failure” ( = A) are characterized as Bernoulli trials, and the probability of k successes in n trials is given by Bernoulli formula, where p represents the probability of “success” in any one trial. ‰ Bernoulli trial: consists of repeated independent and identical experiments each of which has only two outcomes A or A with P( A) = p, and P ( A) = q. The probability of exactly k occurrences of A in n such trials is given by Bernoulli formula. Let

X k = " exactly k occurrence s in n trials" . Since the number of occurrences of A in n trials must be an integer k = 0, 1, 2,…, n, either X0 or X1 or X2 or … or Xn must occur in such an experiment. Thus

P(X 0 ∪ X1 ∪ Boä moân Vieãn thoâng Khoa Ñieän-Ñieän töû

48

∪ X n ) = 1. SSP2008 BG, CH, ÑHBK

2. Review of Probability Theory: Repeated Trials (7)
But X_i, X_j are mutually exclusive. Thus

  P(X_0 ∪ X_1 ∪ \cdots ∪ X_n) = \sum_{k=0}^{n} P(X_k) = \sum_{k=0}^{n} \binom{n}{k} p^k q^{n-k}.

For a given n and p, what is the most likely value of k?
[Figure: P_n(k) versus k for n = 12, p = 1/2.]
From the figure, the most probable value of k is the number that maximizes P_n(k) in the Bernoulli formula. To obtain this value, consider the ratio

  \frac{P_n(k-1)}{P_n(k)} = \frac{n!\, p^{k-1} q^{n-k+1}}{(n-k+1)!\,(k-1)!} \cdot \frac{(n-k)!\, k!}{n!\, p^k q^{n-k}} = \frac{k}{n-k+1} \cdot \frac{q}{p}.

2. Review of Probability Theory: Repeated Trials (8)
Repeating the ratio from the previous slide,

  \frac{P_n(k-1)}{P_n(k)} = \frac{k}{n-k+1} \cdot \frac{q}{p}.

Thus P_n(k) ≥ P_n(k−1) if k(1−p) ≤ (n−k+1)p, i.e., k ≤ (n+1)p. Hence P_n(k), as a function of k, increases up to k = (n+1)p if this is an integer, or otherwise up to the largest integer k_max less than (n+1)p; this value represents the most likely number of successes (or heads) in n trials.

2. Review of Probability Theory: Random Variables (1)
• Let (Ω, F, P) be a probability model for an experiment, and X a function that maps every ξ ∈ Ω to a unique point x ∈ R, the set of real numbers. Since the outcome ξ is not certain, so is the value X(ξ) = x. Thus if B is some subset of R, we may want to determine the probability of "X(ξ) ∈ B". To determine this probability, we can look at the set A = X^{-1}(B) ⊂ Ω that contains all ξ ∈ Ω that map into B under the function X.
[Figure: the mapping X from Ω to R; A = X^{-1}(B) ⊂ Ω is mapped into B ⊂ R.]
Obviously, if the set A = X^{-1}(B) also belongs to the associated field F, then it is an event and the probability of A is well defined; in that case we can say

  \text{Probability of the event } "X(ξ) ∈ B" = P(X^{-1}(B)).

2. Review of Probability Theory: R. V (2)
However, X^{-1}(B) may not always belong to F for all B, thus creating difficulties. The notion of a random variable (r.v) makes sure that the inverse mapping always results in an event, so that we are able to determine the probability for any B ⊂ R.
• Random variable (r.v): a finite single-valued function X(·) that maps the set of all experimental outcomes Ω into the set of real numbers R is said to be a r.v if the set {ξ | X(ξ) ≤ x} is an event (∈ F) for every x in R. Alternatively, X is said to be a r.v if X^{-1}(B) ∈ F, where B represents semi-infinite intervals of the form {−∞ < x ≤ a} and all other sets that can be constructed from these by performing the set operations of union, intersection and negation any number of times. Thus if X is a r.v, then

  \{ ξ \mid X(ξ) \leq x \} = \{ X \leq x \}

is an event for every x.

2. Review of Probability Theory: R. V (3)
What about {a < X ≤ b} and {X = a}? Are they also events? In fact, with b > a, since {X ≤ a} and {X ≤ b} are events, {X ≤ a}^c = {X > a} is an event, and hence {X > a} ∩ {X ≤ b} = {a < X ≤ b} is also an event. Thus {a − 1/n < X ≤ a} is an event for every n. Consequently,

  \bigcap_{n=1}^{\infty} \left\{ a - \frac{1}{n} < X \leq a \right\} = \{ X = a \}

is also an event. All events have well-defined probability. Thus the probability of the event {ξ | X(ξ) ≤ x} must depend on x. Denote

  P\{ ξ \mid X(ξ) \leq x \} = F_X(x) \geq 0,

which is referred to as the probability distribution function (PDF) associated with the r.v X.

2. Review of Probability Theory: R. V (4)
• Distribution function: if g(x) is a distribution function, then
  (i) g(+∞) = 1; g(−∞) = 0;
  (ii) if x_1 < x_2, then g(x_1) ≤ g(x_2);
  (iii) g(x^+) = g(x) for all x.
It can be shown that the PDF F_X(x) satisfies these properties for any r.v X.
Additional properties of a PDF:
  (iv) if F_X(x_0) = 0 for some x_0, then F_X(x) = 0 for x ≤ x_0;
  (v) P{X(ξ) > x} = 1 − F_X(x);
  (vi) P{x_1 < X(ξ) ≤ x_2} = F_X(x_2) − F_X(x_1), x_2 > x_1;
  (vii) P(X(ξ) = x) = F_X(x) − F_X(x^−).

2. Review of Probability Theory: R. V (5) X is said to be a continuous-type r.v if its distribution function FX(x) is continuous. In that case FX(x -) = FX(x) for all x, and from property (vii) we get P{X = x} = 0. If FX(x) is constant except for a finite number of jump discontinuities (piece-wise constant; step-type), then X is said to be a discrete-type r.v. If xi is such a discontinuity point, then from (vii)

pi = P{X = xi } = FX ( xi ) − FX ( xi− ).


2. Review of Probability Theory: R. V (6)
Example: X is a r.v such that X(ξ) = c for all ξ ∈ Ω. Find F_X(x).
Solution: For x < c, {X(ξ) ≤ x} = {φ}, so that F_X(x) = 0; for x ≥ c, {X(ξ) ≤ x} = Ω, so that F_X(x) = 1.
[Figure: F_X(x) is a unit step at x = c.]
Example: Toss a coin, Ω = {H, T}. Suppose the r.v X is such that X(T) = 0, X(H) = 1, with P(T) = 1 − p. Find F_X(x).
Solution:
  For x < 0: {X(ξ) ≤ x} = {φ}, so that F_X(x) = 0.
  For 0 ≤ x < 1: {X(ξ) ≤ x} = {T}, so that F_X(x) = P{T} = 1 − p.
  For x ≥ 1: {X(ξ) ≤ x} = {H, T} = Ω, so that F_X(x) = 1.
[Figure: F_X(x) is a staircase with a jump of q = 1 − p at x = 0 and a jump of p at x = 1.]

2. Review of Probability Theory: R. V (7)
Example: A fair coin is tossed twice, and let the r.v X represent the number of heads. Find F_X(x).
Solution: In this case Ω = {HH, HT, TH, TT}, and X(HH) = 2, X(HT) = X(TH) = 1, X(TT) = 0.
  For x < 0: {X(ξ) ≤ x} = φ ⇒ F_X(x) = 0.
  For 0 ≤ x < 1: {X(ξ) ≤ x} = {TT} ⇒ F_X(x) = P{TT} = P(T) P(T) = 1/4.
  For 1 ≤ x < 2: {X(ξ) ≤ x} = {TT, HT, TH} ⇒ F_X(x) = P{TT, HT, TH} = 3/4.
  For x ≥ 2: {X(ξ) ≤ x} = Ω ⇒ F_X(x) = 1.
Also, P{X = 1} = F_X(1) − F_X(1^−) = 3/4 − 1/4 = 1/2.
[Figure: the staircase PDF F_X(x) with values 1/4, 3/4 and 1 after the jumps at x = 0, 1, 2.]
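As a quick numerical illustration of this example (the simulation setup is assumed, not part of the slides), the empirical distribution function of the simulated number of heads reproduces the staircase values 1/4, 3/4, 1.

```python
import numpy as np

rng = np.random.default_rng(2)
tosses = rng.integers(0, 2, size=(200_000, 2))   # 1 = head, 0 = tail
X = tosses.sum(axis=1)                           # number of heads in two tosses

for x in [-0.5, 0.5, 1.5, 2.5]:
    print(x, np.mean(X <= x))    # approx 0, 0.25, 0.75, 1.0 -> matches F_X(x)
```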

2. Review of Probability Theory: R. V (8)
• Probability density function (p.d.f): the derivative of the distribution function F_X(x) is called the probability density function f_X(x) of the r.v X. Thus

  f_X(x) = \frac{dF_X(x)}{dx}.

Since

  \frac{dF_X(x)}{dx} = \lim_{\Delta x \to 0} \frac{F_X(x + \Delta x) - F_X(x)}{\Delta x} \geq 0,

from the monotone-nondecreasing nature of F_X(x), it follows that f_X(x) ≥ 0 for all x. f_X(x) is a continuous function if X is a continuous-type r.v. However, if X is a discrete-type r.v, then its p.d.f has the general form

  f_X(x) = \sum_i p_i\, \delta(x - x_i).

[Figure: a discrete p.d.f as a train of impulses of weight p_i at the points x_i.]

2. Review of Probability Theory: R. V (9)
where the x_i represent the jump-discontinuity points in F_X(x). In the figure, f_X(x) represents a collection of positive discrete masses, and it is known as the probability mass function (p.m.f) in the discrete case.
We also obtain

  F_X(x) = \int_{-\infty}^{x} f_X(u)\, du.

Since F_X(+\infty) = 1, it yields

  \int_{-\infty}^{+\infty} f_X(x)\, dx = 1,

which justifies its name as the density function. Further, we also get

  P\{ x_1 < X(ξ) \leq x_2 \} = F_X(x_2) - F_X(x_1) = \int_{x_1}^{x_2} f_X(x)\, dx.

2. Review of Probability Theory: R. V (10)
Thus the area under f_X(x) in the interval (x_1, x_2) represents the probability

  P\{ x_1 < X(ξ) \leq x_2 \} = F_X(x_2) - F_X(x_1) = \int_{x_1}^{x_2} f_X(x)\, dx.

[Figure: (a) F_X(x) with the increment F_X(x_2) − F_X(x_1); (b) f_X(x) with the corresponding area between x_1 and x_2.]
Often, r.vs are referred to by their specific density functions, both in the continuous and discrete cases, and in what follows we shall list a number of them in each category.

2. Review of Probability Theory: R. V (11)
• Continuous-type Random Variables
1. Normal (Gaussian): X is said to be a normal or Gaussian r.v if

  f_X(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x-\mu)^2 / 2\sigma^2}.

This is a bell-shaped curve, symmetric around the parameter μ, and its distribution function is given by

  F_X(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(y-\mu)^2 / 2\sigma^2}\, dy = G\!\left(\frac{x-\mu}{\sigma}\right),

where G(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}} e^{-y^2/2}\, dy is often tabulated. Since f_X(x) depends on the two parameters μ and σ², the notation X ∼ N(μ, σ²) will be used to represent f_X(x).
[Figure: the bell-shaped Gaussian p.d.f centered at μ.]

2. Review of Probability Theory: R. V (12)
2. Uniform: X ∼ U(a, b), a < b, if

  f_X(x) = \begin{cases} \frac{1}{b-a}, & a \leq x \leq b, \\ 0, & \text{otherwise.} \end{cases}

3. Exponential: X ∼ ε(λ), if

  f_X(x) = \begin{cases} \frac{1}{\lambda} e^{-x/\lambda}, & x \geq 0, \\ 0, & \text{otherwise.} \end{cases}

4. Gamma: X ∼ G(α, β), α > 0, β > 0, if

  f_X(x) = \begin{cases} \frac{x^{\alpha-1}}{\Gamma(\alpha)\,\beta^{\alpha}} e^{-x/\beta}, & x \geq 0, \\ 0, & \text{otherwise.} \end{cases}

If α = n, an integer, Γ(n) = (n−1)!.

2. Review of Probability Theory: R. V (13)
5. Beta: X ∼ β(a, b), a > 0, b > 0, if

  f_X(x) = \begin{cases} \frac{1}{\beta(a,b)} x^{a-1} (1-x)^{b-1}, & 0 < x < 1, \\ 0, & \text{otherwise,} \end{cases}

where the Beta function β(a, b) is defined as

  \beta(a, b) = \int_0^1 u^{a-1} (1-u)^{b-1}\, du.

6. Chi-square: X ∼ χ²(n) if

  f_X(x) = \begin{cases} \frac{x^{n/2-1} e^{-x/2}}{2^{n/2}\, \Gamma(n/2)}, & x \geq 0, \\ 0, & \text{otherwise.} \end{cases}

Note that χ²(n) is the same as Gamma(n/2, 2).

2. Review of Probability Theory: R. V (14)
7. Rayleigh: X ∼ R(σ²), if

  f_X(x) = \begin{cases} \frac{x}{\sigma^2} e^{-x^2 / 2\sigma^2}, & x \geq 0, \\ 0, & \text{otherwise.} \end{cases}

8. Nakagami-m distribution:

  f_X(x) = \begin{cases} \frac{2}{\Gamma(m)} \left(\frac{m}{\Omega}\right)^{m} x^{2m-1} e^{-m x^2 / \Omega}, & x \geq 0, \\ 0, & \text{otherwise.} \end{cases}

9. Cauchy: X ∼ C(α, μ), if

  f_X(x) = \frac{\alpha / \pi}{\alpha^2 + (x - \mu)^2}, \quad -\infty < x < +\infty.

2. Review of Probability Theory: R. V (15)
10. Laplace:

  f_X(x) = \frac{1}{2\lambda} e^{-|x|/\lambda}, \quad -\infty < x < +\infty.

11. Student's t-distribution with n degrees of freedom:

  f_T(t) = \frac{\Gamma((n+1)/2)}{\sqrt{\pi n}\, \Gamma(n/2)} \left(1 + \frac{t^2}{n}\right)^{-(n+1)/2}, \quad -\infty < t < +\infty.

12. Fisher's F-distribution:

  f_Z(z) = \begin{cases} \frac{\Gamma\{(m+n)/2\}\, m^{m/2} n^{n/2}}{\Gamma(m/2)\, \Gamma(n/2)} \cdot \frac{z^{m/2-1}}{(n + mz)^{(m+n)/2}}, & z \geq 0, \\ 0, & \text{otherwise.} \end{cases}

2. Review of Probability Theory: R. V (16)
• Discrete-type Random Variables
1. Bernoulli: X takes the values (0, 1), with P(X = 0) = q, P(X = 1) = p.
2. Binomial: X ∼ B(n, p) if

  P(X = k) = \binom{n}{k} p^k q^{n-k}, \quad k = 0, 1, 2, \ldots, n.

3. Poisson: X ∼ P(λ) if

  P(X = k) = e^{-\lambda} \frac{\lambda^k}{k!}, \quad k = 0, 1, 2, \ldots, \infty.

4. Discrete uniform:

  P(X = k) = \frac{1}{N}, \quad k = 1, 2, \ldots, N.

2. Review of Probability Theory: Function of R. V (1)
• Let X be a r.v defined on the model (Ω, F, P) and suppose g(x) is a function of the variable x. Define Y = g(X). Is Y necessarily a r.v? If so, what are its PDF F_Y(y) and p.d.f f_Y(y)? Consider some of the following functions to illustrate the technical details.
Example 1: Y = aX + b.
  Suppose a > 0. Then

  F_Y(y) = P(Y(ξ) \leq y) = P(aX(ξ) + b \leq y) = P\!\left(X(ξ) \leq \frac{y-b}{a}\right) = F_X\!\left(\frac{y-b}{a}\right),

  and

  f_Y(y) = \frac{1}{a} f_X\!\left(\frac{y-b}{a}\right).

2. Review of Probability Theory: Function of R. V (2)
  If a < 0, then

  F_Y(y) = P(Y(ξ) \leq y) = P(aX(ξ) + b \leq y) = P\!\left(X(ξ) > \frac{y-b}{a}\right) = 1 - F_X\!\left(\frac{y-b}{a}\right),

  and hence

  f_Y(y) = -\frac{1}{a} f_X\!\left(\frac{y-b}{a}\right).

  For all a,

  f_Y(y) = \frac{1}{|a|} f_X\!\left(\frac{y-b}{a}\right).

2. Review of Probability Theory: Function of R. V (3)
Example 2: Y = X².

  F_Y(y) = P(Y(ξ) \leq y) = P(X^2(ξ) \leq y).

If y < 0, then the event {X²(ξ) ≤ y} = φ, and hence F_Y(y) = 0 for y < 0.
For y > 0, from the figure, the event {Y(ξ) ≤ y} = {X²(ξ) ≤ y} is equivalent to {x_1 < X(ξ) ≤ x_2} with x_1 = −√y, x_2 = +√y. Hence

  F_Y(y) = P(x_1 < X(ξ) \leq x_2) = F_X(x_2) - F_X(x_1) = F_X(\sqrt{y}) - F_X(-\sqrt{y}), \quad y > 0.

By direct differentiation, we get

  f_Y(y) = \begin{cases} \frac{1}{2\sqrt{y}} \left( f_X(\sqrt{y}) + f_X(-\sqrt{y}) \right), & y > 0, \\ 0, & \text{otherwise.} \end{cases}

[Figure: the parabola Y = X² and the interval (x_1, x_2) mapped to {Y ≤ y}.]

2. Review of Probability Theory: Function of R. V (4)
If f_X(x) is an even function, then f_Y(y) reduces to

  f_Y(y) = \frac{1}{\sqrt{y}} f_X(\sqrt{y})\, U(y).

In particular, if X ∼ N(0, 1), so that

  f_X(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2},

then we obtain the p.d.f of Y = X² to be

  f_Y(y) = \frac{1}{\sqrt{2\pi y}} e^{-y/2}\, U(y).

We notice that this equation represents a chi-square r.v with n = 1, since Γ(1/2) = √π. Thus, if X is a Gaussian r.v with μ = 0 and σ = 1, then Y = X² represents a chi-square r.v with one degree of freedom (n = 1).

2. Review of Probability Theory: Function of R. V (5) Example 3: Let

In this case

g( X )

X > c, ⎧ X − c, ⎪ Y = g ( X ) = ⎨ 0, − c < X ≤ c, ⎪ X + c, X ≤ −c. ⎩

−c

X

c

P (Y = 0) = P ( −c < X (ξ ) ≤ c ) = FX (c ) − FX ( −c ).

For y > 0, we have x > c, and Y(ζ) = X(ζ) – c, so that

FX (x )

FY ( y ) = P (Y (ξ ) ≤ y ) = P ( X (ξ ) − c ≤ y ) = P ( X (ξ ) ≤ y + c ) = FX ( y + c ),

x

y > 0.

Similarly y < 0, if x < - c, and Y(ζ) = X(ζ) + c, so that

FY ( y )

FY ( y ) = P (Y (ξ ) ≤ y ) = P ( X (ξ ) + c ≤ y )

Thus

Boä moân Vieãn thoâng Khoa Ñieän-Ñieän töû

= P ( X (ξ ) ≤ y − c ) = FX ( y − c ),

y < 0.

y

⎧ f X ( y + c ), y > 0, ⎪ f Y ( y ) = ⎨[ FX ( c ) − FX ( − c )]δ ( y ), ⎪ f ( y − c ), y < 0. ⎩ X 71

SSP2008 BG, CH, ÑHBK

2. Review of Probability Theory: Function of R. V (6) Example 4: Half-wave rectifier

Y

⎧ x, x > 0, Y = g ( X ); g ( x ) = ⎨ ⎩0, x ≤ 0.

In this case

X

P(Y = 0) = P( X (ξ ) ≤ 0) = FX (0).

and for y > 0, since Y = X FY ( y ) = P (Y (ξ ) ≤ y ) = P ( X (ξ ) ≤ y ) = FX ( y ).

Thus y > 0, ⎧ f X ( y ), ⎪ fY ( y ) = ⎨ FX (0)δ ( y ) y = 0, ⎪ y < 0, 0, ⎩

Boä moân Vieãn thoâng Khoa Ñieän-Ñieän töû

= f X ( y )U ( y ) + FX (0)δ ( y ).

72

SSP2008 BG, CH, ÑHBK

2. Review of Probability Theory: Function of R. V (7)
• Note: as a general approach, given Y = g(X), first sketch the graph y = g(x) and determine the range space of y. Suppose a < y < b is the range space of y = g(x). Then clearly F_Y(y) = 0 for y < a and F_Y(y) = 1 for y > b, so that F_Y(y) can be nonzero only in a < y < b. Next, determine whether there are discontinuities in the range space of y; if so, evaluate P(Y(ξ) = y_i) at these discontinuities. In the continuous region of y, use the basic approach

  F_Y(y) = P(g(X(ξ)) \leq y)

and determine the appropriate events in terms of the r.v X for every y. Finally, we must have F_Y(y) for −∞ < y < +∞ and obtain

  f_Y(y) = \frac{dF_Y(y)}{dy} \quad \text{in } a < y < b.

2. Review of Probability Theory: Function of R. V (8)
• However, if Y = g(X) is a continuous function, it is easy to obtain f_Y(y) directly as

  f_Y(y) = \sum_i \frac{1}{|dy/dx|_{x_i}} f_X(x_i) = \sum_i \frac{1}{|g'(x_i)|} f_X(x_i). \qquad (♦)

The summation index i in this equation depends on y: for every y, the equation y = g(x_i) must be solved to obtain the total number of solutions and the actual solutions x_1, x_2, ..., all in terms of y. For example, if Y = X², then for all y > 0, x_1 = −√y and x_2 = +√y represent the two solutions for each y. Notice that the solutions x_i are all in terms of y, so that the right side of (♦) is only a function of y. Moreover,

  \frac{dy}{dx} = 2x, \quad \text{so that} \quad \left|\frac{dy}{dx}\right|_{x = x_i} = 2\sqrt{y}.

[Figure: the parabola Y = X² with the two solutions x_1, x_2 for a given y.]
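The sketch below is a hedged numerical check of (♦) for Y = X² with X ∼ N(0, 1) (the binning and sample size are assumed): the two roots ±√y each contribute f_X(x_i)/|2x_i|, and the result is compared against a histogram of simulated samples.

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.standard_normal(500_000)

edges = np.linspace(0.25, 4.0, 31)                 # avoid the 1/sqrt(y) spike near 0
counts, _ = np.histogram(x**2, bins=edges)
empirical = counts / (len(x) * np.diff(edges))     # density estimate per bin
centers = 0.5 * (edges[:-1] + edges[1:])

f_X = lambda t: np.exp(-t**2 / 2) / np.sqrt(2 * np.pi)
formula = (f_X(np.sqrt(centers)) + f_X(-np.sqrt(centers))) / (2 * np.sqrt(centers))

print(np.round(empirical[:5], 3))
print(np.round(formula[:5], 3))                    # should closely agree
```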

2. Review of Probability Theory: Function of R. V (9) Using (♦), we obtain

(

)

⎧ 1 f ( y ) + f X (− y ) , y > 0, ⎪ fY ( y ) = ⎨ 2 y X ⎪⎩ 0, otherwise ,

which agrees with the result in Example 2. Example 5: Y = 1/X. Find fY(y) Here for every y, x1 = 1/y is the only solution, and

dy 1 dy = − 2 so that dx x dx Then, from (♦), we obtain

fY ( y ) = Boä moân Vieãn thoâng Khoa Ñieän-Ñieän töû

x = x1

1 2 = = y , 2 1/ y

⎛1⎞ 1 f ⎜ ⎟⎟. 2 X⎜ y ⎝ y⎠ 75

SSP2008 BG, CH, ÑHBK

2. Review of Probability Theory: Function of R. V (10) In particular, suppose X is a Cauchy r.v with parameter α so that

α /π f X ( x) = 2 , − ∞ < x < +∞. α + x2 In this case, Y = 1/X has the p.d.f 1 (1 / α ) / π α /π fY ( y ) = 2 2 = , − ∞ < y < +∞. 2 2 2 y α + (1 / y ) (1 / α ) + y But this represents the p.d.f of a Cauchy r.v with parameter (1/α). Thus if X ∼ C(α), then 1/X ∼ C(1/α). Example 6: Suppose fX(x) = 2x / π 2, 0 < x < π and Y = sinX . Determine fY(y). Since X has zero probability of falling outside the interval (0, π), y = sinx has zero probability of falling outside the interval (0, 1). Clearly fY(y) = 0 outside this interval. Boä moân Vieãn thoâng Khoa Ñieän-Ñieän töû

76

SSP2008 BG, CH, ÑHBK

2. Review of Probability Theory: Function of R. V (11) For any 0 < y < 1, the equation y = sinx has an infinite number of solutions …x1, x2, x3,… (see the Figure below), where x1 = sin-1y is the principal solution. Moreover, using the symmetry we also get x2 = π - x1 etc. Further, dy = cos x = 1 − sin 2 x = 1 − y 2 dx

so that dy dx

fX (x)

= 1− y . 2

x = xi

(a)

x3

π

x

y = sin x

y

x1

x −1

x2

π

x3

x

(b) Boä moân Vieãn thoâng Khoa Ñieän-Ñieän töû

77

SSP2008 BG, CH, ÑHBK

2. Review of Probability Theory: Function of R. V (12) From (♦), we obtain for 0 < y < 1 fY ( y ) =

+∞



i =−∞ i ≠0

1 1− y

2

f X ( xi ).

But from the figure, in this case fX(-x1) = fX(x3) = fX(x4) = … = 0 (Except for fX(x1) and fX(x2) the rest are all zeros). Thus fY ( y ) =

1 1− y

( f X ( x1 ) + 2

f X ( x2 ) ) =

⎛ 2 x1 2 x2 ⎞ ⎜ 2 + 2 ⎟ 2 π ⎠ 1− y ⎝ π 1

2 ⎧ , 0 < y < 1, 2( x1 + π − x1 ) ⎪ 2 = = ⎨π 1 − y 2 2 π 1− y ⎪⎩ 0, otherwise.

fY ( y )

2

π

1 Boä moân Vieãn thoâng Khoa Ñieän-Ñieän töû

78

y

SSP2008 BG, CH, ÑHBK

2. Review of Probability Theory: Function of R. V (13) ‰ Functions of a discrete-type r.v Suppose X is a discrete-type r.v with P(X = xi) = pi , x = x1, x2,…, xi, … and Y = g(X). Clearly Y is also of discrete-type, and when x = xi, yi = g(xi) and for those yi , P(Y = yi) = P(X = xi) = pi , y = y1, y2,…, yi, … Example 7: Suppose X ∼ P(λ), so that

P( X = k ) = e

λk

−λ

k!

, k = 0,1,2,

Define Y = X2 + 1. Find the p.m.f of Y. X takes the values 0, 1, 2,…, k, … so that Y only takes the value 1, 2, 5,…, k2+1,… and P(Y = k2+1) = P(X = k) so that for j = k2+1

(

P(Y = j ) = P X = Boä moân Vieãn thoâng Khoa Ñieän-Ñieän töû

)

j −1 = e

−λ

λ

j −1

( j − 1)! 79

, j = 1, 2, 5,

, k 2 + 1,

.

SSP2008 BG, CH, ÑHBK

2. Review of Probability Theory: Moments… (1)
• The mean or expected value of a r.v X is defined as

  \eta_X = \bar{X} = E(X) = \int_{-\infty}^{+\infty} x\, f_X(x)\, dx.

If X is a discrete-type r.v, then we get

  \eta_X = \bar{X} = E(X) = \int x \sum_i p_i\, \delta(x - x_i)\, dx = \sum_i x_i p_i \int \delta(x - x_i)\, dx = \sum_i x_i p_i = \sum_i x_i P(X = x_i).

The mean represents the average value of the r.v in a very large number of trials. For example, if X ∼ U(a, b), then

  E(X) = \int_a^b \frac{x}{b-a}\, dx = \frac{1}{b-a}\, \frac{x^2}{2} \Big|_a^b = \frac{b^2 - a^2}{2(b-a)} = \frac{a+b}{2}

is the midpoint of the interval (a, b).

2. Review of Probability Theory: Moments… (2) On the other hand if X is exponential with parameter λ, then

E( X ) = ∫



x

0

λ

e

−x / λ



dx = λ ∫ ye − y dy = λ , 0

implying that the parameter λ represents the mean value of the exponential r.v. Similarly if X is Poisson with parameter λ , we get ∞



E ( X ) = ∑ kP ( X = k ) = ∑ ke =e



λk

λk k!

k =0

k =0

−λ

−λ



λi

=e

−λ



λk

∑ k k! k =1

∑ (k − 1)! = λe ∑ i! = λe λ eλ = λ. −λ

k =1



i =0

Thus the parameter λ also represents the mean of the Poisson r.v. Boä moân Vieãn thoâng Khoa Ñieän-Ñieän töû

81

SSP2008 BG, CH, ÑHBK

2. Review of Probability Theory: Moments… (3) In a similar manner, if X is binomial, then its mean is given by n ⎛ n ⎞ k n −k n! E ( X ) = ∑ kP ( X = k ) = ∑ k ⎜⎜ ⎟⎟ p q = ∑ k p k q n −k (n − k )! k! k =0 k =0 ⎝ k ⎠ k =1 n n −1 ( n − 1)! n! k n −k =∑ p q = np ∑ p i q n −i −1 = np( p + q)n −1 = np. k =1 ( n − k )! ( k − 1)! i =0 ( n − i − 1)! i! n

n

Thus np represents the mean of the binomial r.v. For the normal r.v, E( X ) = =

Boä moân Vieãn thoâng Khoa Ñieän-Ñieän töû

1 2πσ 1

2

2πσ

2

∫ ∫

+∞ −∞ +∞ −∞

xe −( x − μ ) ye − y

2

2

/ 2σ 2

/ 2σ 2

dx =

dy + μ ⋅

0

1 2πσ 1 2πσ

2

2

∫ ∫

+∞ −∞

+∞ −∞

( y + μ )e − y

e− y

2

/ 2σ 2

2

/ 2σ 2

dy

dy = μ .

1

82

SSP2008 BG, CH, ÑHBK

2. Review of Probability Theory: Moments… (4)
Given X ∼ f_X(x), suppose Y = g(X) defines a new r.v with p.d.f f_Y(y). The new r.v Y has a mean μ_Y given by

  \mu_Y = E(Y) = E(g(X)) = \int_{-\infty}^{+\infty} y\, f_Y(y)\, dy = \int_{-\infty}^{+\infty} g(x)\, f_X(x)\, dx.

In the discrete case,

  E(Y) = \sum_i g(x_i)\, P(X = x_i).

From these equations, f_Y(y) is not required to evaluate E(Y) for Y = g(X).
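A minimal illustration of this point (the example distribution and function are assumed): E[g(X)] for X ∼ U(0, 1) and g(x) = x² is computed both as a sample average of g(X) and as a quadrature of ∫ g(x) f_X(x) dx, without ever forming f_Y.

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.random(1_000_000)                  # X ~ U(0, 1)

print(np.mean(x**2))                       # ~ 0.3333 (sample mean of g(X))

# Deterministic quadrature of ∫ g(x) f_X(x) dx with f_X(x) = 1 on (0, 1)
grid = np.linspace(0.0, 1.0, 10_001)
print(np.trapz(grid**2 * 1.0, grid))       # 0.3333... = E[X^2] = 1/3
```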

2. Review of Probability Theory: Moments… (5) Example: Determine the mean of Y = X2 where X is a Poisson r.v. E (X

2



)= ∑ k

2

P( X = k ) =

k =0

=e

−λ



∑k

e

−λ

k =0



λk

λk k!



=e

∑ k ( k − 1)! = e ∑ ( i + 1) −λ

k =1

= λe

2

−λ

⎛ ∞ λi ⎜⎜ ∑ i + i ! ⎝ i=0

i=0



∑ i=0

λi ⎞

⎟ = λe i! ⎟⎠

−λ

−λ



∑k k =1

2

λk k!

λi +1 i!

⎛ ∞ λi λ ⎞ ⎜⎜ ∑ i + e ⎟⎟ i ! ⎝ i =1 ⎠

∞ ⎞ ⎛ ∞ λi λ m +1 −λ ⎛ λ ⎞ + e λ ⎟⎟ = λ e ⎜⎜ ∑ + e ⎟⎟ = λ e ⎜⎜ ∑ ⎠ ⎠ ⎝ m =0 m! ⎝ i =1 ( i − 1)! = λ e − λ (λ e λ + e λ ) = λ 2 + λ . −λ

In general, E(Xk) is known as the kth moment of r.v X. Thus if X ∼ P(λ), its second moment is given by the above equation. Boä moân Vieãn thoâng Khoa Ñieän-Ñieän töû

84

SSP2008 BG, CH, ÑHBK

PILLAI

2. Review of Probability Theory: Moments… (6)
• For a r.v X with mean μ, X − μ represents the deviation of the r.v from its mean. Since this deviation can be either positive or negative, consider the quantity (X − μ)²; its average value E[(X − μ)²] represents the average mean-square deviation of X around its mean. Define

  \sigma_X^2 = E[(X - \mu)^2] > 0.

With g(X) = (X − μ)², we get

  \sigma_X^2 = \int_{-\infty}^{+\infty} (x - \mu)^2 f_X(x)\, dx > 0,

where σ_X² is known as the variance of the r.v X, and its square root σ_X = \sqrt{E[(X - \mu)^2]} is known as the standard deviation of X. Note that the standard deviation represents the root-mean-square spread of the r.v X around its mean μ.

2. Review of Probability Theory: Moments… (7) Alternatively, the variance can be calculated by Var ( X ) = σ X2 = ∫ =∫

+∞ −∞

(x

+∞ −∞

2

− 2 xμ + μ 2 ) f X ( x )dx +∞

x f X ( x )dx − 2 μ ∫ x f X ( x )dx + μ 2

= E (X

2

−∞

2

)− μ

2

= E (X

2

) − [E ( X )]

2

___ 2

2

=X −X .

Example: Determine the variance of Poisson r.v

σ = X − X = (λ2 + λ ) − λ2 = λ. 2

___ 2

2

X

Thus for a Poisson r.v, mean and variance are both equal to its parameter λ

Boä moân Vieãn thoâng Khoa Ñieän-Ñieän töû

86

SSP2008 BG, CH, ÑHBK

2. Review of Probability Theory: Moments… (8) Example: Determine the variance of the normal r.v N(μ , σ2 ) We have

Var ( X ) = E [( X − μ ) 2 ] = ∫

Use of the identity



+∞ −∞

f X ( x )dx = ∫



+∞ −∞

2 ( x − μ) −∞

+∞

1

−∞

2πσ

for a normal p.d.f. This gives

e −( x − μ )

2

1

+∞

/ 2σ 2

2

2πσ

e −( x − μ )

2

/ 2σ 2

2

e −( x − μ )

2

/ 2σ 2

dx.

dx = 1

dx = 2π σ .

Differentiating both sides with respect to σ , we get 2 +∞ ( x − μ ) −( x − μ ) 2 / 2σ 2 e dx = 2π 3 ∫ −∞

or

+∞

σ

2 ( ) x − μ ∫ −∞ Boä moân Vieãn thoâng Khoa Ñieän-Ñieän töû

1 2 πσ

2

87

e −( x− μ )

2

/ 2σ

2

dx = σ 2 , SSP2008 BG, CH, ÑHBK

2. Review of Probability Theory: Moments… (9) ‰ Moments: In general

___ n

mn = X = E ( X n ), n ≥ 1

are known as the moments of the r.v X, and

μn = E[( X − μ )n ] are known as the central moments of X. Clearly, the mean μ = m1 , and the variance σ2 = μ2. It is easy to relate mn and μn. Infact ⎛ n ⎛n⎞ k n −k ⎞ ⎜ E X E X μn = [( − μ ) ] = ⎜ ∑ ⎜⎜ ⎟⎟ ( − μ ) ⎟⎟ ⎝ k =0 ⎝ k ⎠ ⎠ n n ⎛n⎞ ⎛n⎞ k n −k = ∑ ⎜⎜ ⎟⎟E (X ) ( − μ ) = ∑ ⎜⎜ ⎟⎟ mk ( − μ )n −k . k =0 ⎝ k ⎠ k =0 ⎝ k ⎠ n

In general, the quantities E[(X - a)n] are known as the generalized moments of X about a, and E[⏐X⏐n] are known as the absolute moments of X. Boä moân Vieãn thoâng Khoa Ñieän-Ñieän töû

88

SSP2008 BG, CH, ÑHBK

2. Review of Probability Theory: Moments… (10)
• The characteristic function of a r.v X is defined as

  \Phi_X(\omega) = E(e^{jX\omega}) = \int_{-\infty}^{+\infty} e^{jx\omega} f_X(x)\, dx.

Thus Φ_X(0) = 1 and |Φ_X(ω)| ≤ 1 for all ω. For discrete r.vs the characteristic function reduces to

  \Phi_X(\omega) = \sum_k e^{jk\omega} P(X = k).

Example: If X ∼ P(λ), then its characteristic function is given by

  \Phi_X(\omega) = \sum_{k=0}^{\infty} e^{jk\omega} e^{-\lambda} \frac{\lambda^k}{k!} = e^{-\lambda} \sum_{k=0}^{\infty} \frac{(\lambda e^{j\omega})^k}{k!} = e^{-\lambda} e^{\lambda e^{j\omega}} = e^{\lambda(e^{j\omega} - 1)}.

Example: If X is a binomial r.v, its characteristic function is given by

  \Phi_X(\omega) = \sum_{k=0}^{n} e^{jk\omega} \binom{n}{k} p^k q^{n-k} = \sum_{k=0}^{n} \binom{n}{k} (p e^{j\omega})^k q^{n-k} = (p e^{j\omega} + q)^n.

.

Taking the first derivative with respect to ω, and letting it to be equal to zero, we get 1 ∂Φ X (ω ) ∂Φ X (ω ) ∂ω

= jE ( X ) or E ( X ) =

ω =0

j

∂ω

.

ω =0

Similarly, the second derivative gives

1 ∂ 2Φ X (ω ) E( X ) = 2 , 2 j ∂ω ω =0 2

and repeating this procedure k times, we obtain the kth moment of X to be 1 ∂ k Φ X (ω ) k E( X ) = k , k ≥ 1. k j ∂ω ω =0 Boä moân Vieãn thoâng Khoa Ñieän-Ñieän töû

90

SSP2008 BG, CH, ÑHBK

2. Review of Probability Theory: Moments… (12) Example: If X ∼ P(λ), then jω ∂Φ X (ω ) = e −λ e λe λje jω , ∂ω so that E(X) = λ. The second derivative gives ∂ 2Φ X (ω ) − λ λe jω jω 2 λe j ω 2 jω = e e ( λ je ) + e λ j e , 2 ∂ω

(

so that

Boä moân Vieãn thoâng Khoa Ñieän-Ñieän töû

)

E ( X 2 ) = λ2 + λ ,

91

SSP2008 BG, CH, ÑHBK

2. Review of Probability Theory: Two r.vs (1) ‰ Let X and Y denote two random variables (r.v) based on a probability model (Ω, F, P). Then P ( x1 < X (ξ ) ≤ x2 ) = FX ( x2 ) − FX ( x1 ) = ∫ f X ( x )dx, x2

and

x1

P ( y1 < Y (ξ ) ≤ y2 ) = FY ( y2 ) − FY ( y1 ) = ∫ fY ( y )dy. y2

y1

What about the probability that the pair of r.vs (X,Y) belongs to an arbitrary region D? In other words, how does one estimate, for example, P[( x1 < X (ξ ) ≤ x2 ) ∩ ( y1 < Y (ξ ) ≤ y2 )] = ?

Towards this, we define the joint probability distribution function of X and Y to be FXY ( x, y ) = P[( X (ξ ) ≤ x ) ∩ (Y (ξ ) ≤ y )] = P( X ≤ x, Y ≤ y ) ≥ 0, where x and y are arbitrary real numbers. Boä moân Vieãn thoâng Khoa Ñieän-Ñieän töû

92

SSP2008 BG, CH, ÑHBK

2. Review of Probability Theory: Two r.vs (2) ‰ Properties (i)

FXY ( −∞, y ) = FXY ( x,−∞) = 0, FXY ( +∞,+∞) = 1.

(ii)

P ( x1 < X (ξ ) ≤ x2 , Y (ξ ) ≤ y ) = FXY ( x2 , y ) − FXY ( x1 , y ).

P ( X (ξ ) ≤ x, y1 < Y (ξ ) ≤ y2 ) = FXY ( x, y2 ) − FXY ( x, y1 ). (iii)

P ( x1 < X (ξ ) ≤ x2 , y1 < Y (ξ ) ≤ y2 ) = FXY ( x2 , y2 ) − FXY ( x2 , y1 ) − FXY ( x1 , y2 ) + FXY ( x1 , y1 ).

This is the probability that (X,Y) belongs to the rectangle R0. Y

y2

R0 y1 X

x2

x1 Boä moân Vieãn thoâng Khoa Ñieän-Ñieän töû

93

SSP2008 BG, CH, ÑHBK

2. Review of Probability Theory: Two r.vs (3) ‰ Joint probability density function (Joint p.d.f): By definition, the joint p.d.f of X and Y is given by ∂ 2 FXY ( x, y ) . f XY ( x, y ) = ∂x ∂y and hence we obtain the useful formula

FXY ( x, y ) = ∫

x −∞



y −∞

f XY (u, v ) dudv.

Using property (i), we also get +∞

+∞

−∞

−∞

∫ ∫

f XY ( x, y ) dxdy = 1.

The probability that (X,Y) belongs to an arbitrary region D is given by P (( X , Y ) ∈ D ) = ∫

Boä moân Vieãn thoâng Khoa Ñieän-Ñieän töû



( x , y )∈D

94

f XY ( x, y )dxdy.

SSP2008 BG, CH, ÑHBK

2. Review of Probability Theory: Two r.vs (4) ‰ Marginal Statistics: In the context of several r.vs, the statistics of each individual ones are called marginal statistics. Thus FX(x) is the marginal probability distribution function of X, and fX(x) is the marginal p.d.f of X. It is interesting to note that all marginals can be obtained from the joint p.d.f. In fact FX ( x ) = FXY ( x,+∞), FY ( y ) = FXY ( +∞, y ). Also +∞ +∞ f X ( x ) = ∫ f XY ( x, y )dy , fY ( y ) = ∫ f XY ( x, y )dx. −∞

−∞

If X and Y are discrete r.vs, then pij = P(X = xi, Y = yi) represents their joint p.d.f, and their respective marginal p.d.fs are given by P( X = xi ) = ∑ P( X = xi , Y = y j ) = ∑ pij j

j

P(Y = y j ) = ∑ P( X = xi , Y = y j ) = ∑ pij i

i

The joint P.D.F and/or the joint p.d.f represent complete information about the r.vs, and their marginal p.d.fs can be evaluated from the joint p.d.f. However, given marginals, (most often) it will not be possible to compute the joint p.d.f. Boä moân Vieãn thoâng Khoa Ñieän-Ñieän töû

95

SSP2008 BG, CH, ÑHBK

2. Review of Probability Theory: Two r.vs (5) ‰ Independence of r.vs Definition: The random variables X and Y are said to be statistically independent if the events {X(ζ) ∈ A} and {Y(ζ) ∈ B} are independent events for any two sets A and B in x and y axes respectively. Applying the above definition to the events {X(ζ) ≤ x} and {Y(ζ) ≤ y}, we conclude that, if the r.vs X and Y are independent, then P (( X (ξ ) ≤ x ) ∩ (Y (ξ ) ≤ y ) ) = P ( X (ξ ) ≤ x ) P (Y (ξ ) ≤ y )

i.e.,

FXY ( x, y ) = FX ( x ) FY ( y )

or equivalently, if X and Y are independent, then we must have f XY ( x, y ) = f X ( x ) fY ( y ).

If X and Y are discrete-type r.vs then their independence implies

P( X = xi , Y = y j ) = P ( X = xi ) P(Y = y j ) for all i, j. Boä moân Vieãn thoâng Khoa Ñieän-Ñieän töû

96

SSP2008 BG, CH, ÑHBK

2. Review of Probability Theory: Two r.vs (6) The equations in previous slide give us the procedure to test for independence. Given fXY(x,y), obtain the marginal p.d.fs fX(x) and fY(x) and examine whether these equations (last two equations) are valid. If so, the r.vs are independent, otherwise they are dependent. Example: Given

⎧ xy 2e − y , 0 < y < ∞, 0 < x < 1, f XY ( x, y ) = ⎨ otherwise. ⎩ 0, Determine whether X and Y are independent.

We have

+∞



0

0

f X ( x ) = ∫ f XY ( x, y )dy = x ∫ y 2e − y dy ∞ ∞ = x ⎛⎜ − 2 ye − y + 2 ∫ ye − y dy ⎞⎟ = 2 x, 0 < x < 1. 0 0 ⎠ ⎝

Similarly,

fY ( y ) = ∫

1 0

y2 −y f XY ( x, y )dx = e , 0 < y < ∞. 2

In this case, f XY ( x, y ) = f X ( x ) fY ( y ), and hence X and Y are independent r.vs. Boä moân Vieãn thoâng Khoa Ñieän-Ñieän töû

97

SSP2008 BG, CH, ÑHBK

2. Review of Probability Theory: Func. of 2 r.vs (1) ‰ Given two random variables X and Y and a function g(x,y), we form a new random variable Z = g(X, Y). Given the joint p.d.f fXY(x, y), how does one obtain fZ(z) the p.d.f of Z ? Problems of this type are of interest from a practical standpoint. For example, a receiver output signal usually consists of the desired signal buried in noise, and the above formulation in that case reduces to Z = X + Y. It is important to know the statistics of the incoming signal for proper receiver design. In this context, we shall analyze problems of the following type: X +Y max( X , Y ) min( X , Y )

X −Y Z = g ( X ,Y )

XY X /Y

X 2 +Y 2

tan −1 ( X / Y ) Boä moân Vieãn thoâng Khoa Ñieän-Ñieän töû

98

SSP2008 BG, CH, ÑHBK

2. Review of Probability Theory: Func. of 2 r.vs (2)

• Start with

$$F_Z(z) = P(Z(\zeta) \le z) = P(g(X,Y) \le z) = P[(X,Y) \in D_z] = \iint_{(x,y)\in D_z} f_{XY}(x,y)\,dx\,dy,$$

where $D_z$ in the xy plane represents the region such that $g(x,y) \le z$ is satisfied. To determine $F_Z(z)$, it is enough to find the region $D_z$ for every z and then evaluate the integral there.

Example 1: Z = X + Y. Find fZ(z).

$$F_Z(z) = P(X+Y \le z) = \int_{y=-\infty}^{+\infty} \int_{x=-\infty}^{z-y} f_{XY}(x,y)\,dx\,dy,$$

since the region $D_z$ of the xy plane where $x+y \le z$ is the shaded area to the left of the line $x+y = z$ (i.e., the half-plane bounded by $x = z - y$). Integrating over the horizontal strip along the x-axis first (inner integral) and then sliding that strip along the y-axis from $-\infty$ to $+\infty$ (outer integral), we cover the entire shaded area.

2. Review of Probability Theory: Func. of 2 r.vs (3)

We can find fZ(z) by differentiating FZ(z) directly. In this context, it is useful to recall the differentiation rule due to Leibnitz. Suppose

$$H(z) = \int_{a(z)}^{b(z)} h(x,z)\,dx.$$

Then

$$\frac{dH(z)}{dz} = \frac{db(z)}{dz}\, h(b(z),z) - \frac{da(z)}{dz}\, h(a(z),z) + \int_{a(z)}^{b(z)} \frac{\partial h(x,z)}{\partial z}\,dx.$$

Using the above equations, we get

$$f_Z(z) = \int_{-\infty}^{+\infty} \left( \frac{\partial}{\partial z} \int_{-\infty}^{z-y} f_{XY}(x,y)\,dx \right) dy = \int_{-\infty}^{+\infty} \big( f_{XY}(z-y,\, y) - 0 \big)\, dy = \int_{-\infty}^{+\infty} f_{XY}(z-y,\, y)\,dy.$$

If X and Y are independent, $f_{XY}(x,y) = f_X(x)\, f_Y(y)$, then we get

$$f_Z(z) = \int_{y=-\infty}^{+\infty} f_X(z-y)\, f_Y(y)\,dy = \int_{x=-\infty}^{+\infty} f_X(x)\, f_Y(z-x)\,dx. \qquad \text{(Convolution!)}$$
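As a numerical illustration of the convolution formula (a sketch that is not part of the original notes), the Python snippet below simulates Z = X + Y for an arbitrary pair of independent densities, here X ~ Exp(1) and Y ~ N(0, 1), and compares the histogram of Z with the convolution integral evaluated by simple quadrature.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Z = X + Y with X ~ Exp(1) and Y ~ N(0, 1), independent (an arbitrary test case)
n = 200_000
z = rng.exponential(1.0, n) + rng.normal(0.0, 1.0, n)

# Monte-Carlo estimate of f_Z
hist, edges = np.histogram(z, bins=100, range=(-4, 8), density=True)
centers = 0.5 * (edges[:-1] + edges[1:])

# f_Z(z) = integral of f_X(x) f_Y(z - x) dx, evaluated by simple quadrature
dx = 0.01
x = np.arange(0.0, 25.0, dx)                     # support of the exponential density
f_z = (stats.expon.pdf(x) * stats.norm.pdf(centers[:, None] - x)).sum(axis=1) * dx

print(np.max(np.abs(f_z - hist)))                # small -> simulation matches the convolution
```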

2. Review of Probability Theory: Func. of 2 r.vs (4)

Example 2: X and Y are independent normal r.vs with zero mean and common variance σ². Determine fZ(z) for Z = X² + Y². We have

$$F_Z(z) = P(X^2 + Y^2 \le z) = \iint_{x^2+y^2 \le z} f_{XY}(x,y)\,dx\,dy.$$

But $X^2 + Y^2 \le z$ represents the interior of a circle of radius $\sqrt{z}$, and hence

$$F_Z(z) = \int_{y=-\sqrt{z}}^{\sqrt{z}} \int_{x=-\sqrt{z-y^2}}^{\sqrt{z-y^2}} f_{XY}(x,y)\,dx\,dy.$$

This gives, after differentiation (Leibnitz rule),

$$f_Z(z) = \int_{y=-\sqrt{z}}^{\sqrt{z}} \frac{1}{2\sqrt{z-y^2}} \Big( f_{XY}\big(\sqrt{z-y^2},\, y\big) + f_{XY}\big(-\sqrt{z-y^2},\, y\big) \Big)\, dy. \qquad (*)$$

2. Review of Probability Theory: Func. of 2 r.vs (5)

Moreover, X and Y are said to be jointly normal (Gaussian) distributed if their joint p.d.f has the form

$$f_{XY}(x,y) = \frac{1}{2\pi\sigma_X\sigma_Y\sqrt{1-\rho^2}} \exp\!\left\{ \frac{-1}{2(1-\rho^2)} \left( \frac{(x-\mu_X)^2}{\sigma_X^2} - \frac{2\rho(x-\mu_X)(y-\mu_Y)}{\sigma_X\sigma_Y} + \frac{(y-\mu_Y)^2}{\sigma_Y^2} \right) \right\},$$

for $-\infty < x < +\infty$, $-\infty < y < +\infty$, $|\rho| < 1$. With zero means this becomes

$$f_{XY}(x,y) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}} \exp\!\left\{ \frac{-1}{2(1-\rho^2)} \left( \frac{x^2}{\sigma_1^2} - \frac{2\rho xy}{\sigma_1\sigma_2} + \frac{y^2}{\sigma_2^2} \right) \right\}.$$

With $\rho = 0$ and $\sigma_1 = \sigma_2 = \sigma$, direct substitution into (*) gives

$$f_Z(z) = \int_{y=-\sqrt{z}}^{\sqrt{z}} \frac{1}{\sqrt{z-y^2}}\cdot \frac{e^{-(z-y^2+y^2)/2\sigma^2}}{2\pi\sigma^2}\, dy = \frac{e^{-z/2\sigma^2}}{\pi\sigma^2} \int_0^{\pi/2} \frac{\sqrt{z}\cos\theta}{\sqrt{z}\cos\theta}\, d\theta = \frac{1}{2\sigma^2}\, e^{-z/2\sigma^2}\, U(z),$$

using the substitution $y = \sqrt{z}\sin\theta$ (and symmetry in y).

2. Review of Probability Theory: Func. of 2 r.vs (6)

Thus, if X and Y are independent zero-mean Gaussian r.vs with common variance σ², then X² + Y² is an exponential r.v with parameter 2σ².

Example 3: Let $Z = \sqrt{X^2 + Y^2}$. Find fZ(z).

The present case corresponds to the region $X^2 + Y^2 \le z^2$, a circle of radius z. Thus

$$F_Z(z) = \int_{y=-z}^{z} \int_{x=-\sqrt{z^2-y^2}}^{\sqrt{z^2-y^2}} f_{XY}(x,y)\,dx\,dy,$$

$$f_Z(z) = \int_{y=-z}^{z} \frac{z}{\sqrt{z^2-y^2}} \Big( f_{XY}\big(\sqrt{z^2-y^2},\, y\big) + f_{XY}\big(-\sqrt{z^2-y^2},\, y\big) \Big)\, dy.$$

Now suppose X and Y are independent Gaussian as in Example 2; we obtain

$$f_Z(z) = 2\int_0^{z} \frac{z}{\sqrt{z^2-y^2}}\cdot \frac{2\,e^{-(z^2-y^2+y^2)/2\sigma^2}}{2\pi\sigma^2}\, dy = \frac{2z\,e^{-z^2/2\sigma^2}}{\pi\sigma^2} \int_0^{\pi/2} \frac{z\cos\theta}{z\cos\theta}\, d\theta = \frac{z}{\sigma^2}\, e^{-z^2/2\sigma^2}\, U(z),$$

with the substitution $y = z\sin\theta$. (Rayleigh distribution!)

2. Review of Probability Theory: Func. of 2 r.vs (7)

Thus, if W = X + iY, where X and Y are real, independent normal r.vs with zero mean and equal variance, then the r.v $|W| = \sqrt{X^2 + Y^2}$ has a Rayleigh density. W is said to be a complex Gaussian r.v with zero mean, whose real and imaginary parts are independent r.vs and whose magnitude has a Rayleigh distribution. What about its phase

$$\theta = \tan^{-1}\!\left(\frac{X}{Y}\right)?$$

Clearly, the principal value of θ lies in the interval (−π/2, +π/2). If we let U = tan θ = X/Y, then it can be shown that U has a Cauchy distribution with

$$f_U(u) = \frac{1/\pi}{u^2 + 1}, \qquad -\infty < u < \infty.$$

As a result,

$$f_\theta(\theta) = \frac{f_U(\tan\theta)}{|d\theta/du|} = \frac{1}{(1/\sec^2\theta)}\cdot \frac{1/\pi}{\tan^2\theta + 1} = \begin{cases} 1/\pi, & -\pi/2 < \theta < \pi/2,\\ 0, & \text{otherwise.}\end{cases}$$
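The following Monte-Carlo sketch (not from the original notes) checks the two facts numerically for an assumed σ: the magnitude of W = X + jY is Rayleigh distributed, and its four-quadrant phase, obtained here with np.angle rather than the principal-value arctangent above, is uniform on (−π, π].

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
sigma, n = 2.0, 100_000

# W = X + jY with X, Y ~ N(0, sigma^2), independent
w = rng.normal(0.0, sigma, n) + 1j * rng.normal(0.0, sigma, n)

mag = np.abs(w)                 # should be Rayleigh with scale sigma
phase = np.angle(w)             # should be uniform on (-pi, pi]

# Kolmogorov-Smirnov tests against the theoretical distributions
print(stats.kstest(mag, "rayleigh", args=(0, sigma)))
print(stats.kstest(phase, "uniform", args=(-np.pi, 2 * np.pi)))
```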


2. Review of Probability Theory: Func. of 2 r.vs (8)

To summarize, the magnitude and phase of a zero-mean complex Gaussian r.v have Rayleigh and uniform distributions respectively. Interestingly, as we will see later, these two derived r.vs are also independent of each other!

Example 4: Redo Example 3, where X and Y have nonzero means μX and μY respectively. Since

$$f_{XY}(x,y) = \frac{1}{2\pi\sigma^2}\, e^{-[(x-\mu_X)^2 + (y-\mu_Y)^2]/2\sigma^2},$$

proceeding as in Example 3 we obtain the Rician probability density function

$$f_Z(z) = \frac{z\,e^{-(z^2+\mu^2)/2\sigma^2}}{2\pi\sigma^2} \int_{-\pi/2}^{\pi/2} \Big( e^{z\mu\cos(\theta-\phi)/\sigma^2} + e^{-z\mu\cos(\theta+\phi)/\sigma^2} \Big)\, d\theta$$
$$= \frac{z\,e^{-(z^2+\mu^2)/2\sigma^2}}{2\pi\sigma^2} \left( \int_{-\pi/2}^{\pi/2} e^{z\mu\cos(\theta-\phi)/\sigma^2}\, d\theta + \int_{\pi/2}^{3\pi/2} e^{z\mu\cos(\theta-\phi)/\sigma^2}\, d\theta \right) = \frac{z\,e^{-(z^2+\mu^2)/2\sigma^2}}{\sigma^2}\, I_0\!\left(\frac{z\mu}{\sigma^2}\right),$$

2. Review of Probability Theory: Func. of 2 r.vs (9)

where

$$x = z\cos\theta, \quad y = z\sin\theta, \quad \mu = \sqrt{\mu_X^2 + \mu_Y^2}, \quad \mu_X = \mu\cos\phi, \quad \mu_Y = \mu\sin\phi,$$

and

$$I_0(\eta) = \frac{1}{2\pi}\int_0^{2\pi} e^{\eta\cos(\theta-\phi)}\,d\theta = \frac{1}{\pi}\int_0^{\pi} e^{\eta\cos\theta}\,d\theta$$

is the modified Bessel function of the first kind and zeroth order.

Thus, if X and Y have nonzero means μX and μY respectively, then $Z = \sqrt{X^2 + Y^2}$ is said to be a Rician r.v. Such a situation arises in a fading multipath scenario where there is a dominant constant component (mean) in addition to a zero-mean Gaussian r.v. The constant component may be the line-of-sight signal, and the zero-mean Gaussian part could be due to random multipath components adding up incoherently (see diagram: the line-of-sight signal plus multipath/Gaussian noise sum to give the Rician output). The envelope of such a signal is said to have a Rician p.d.f.
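A brief simulation sketch of the Rician envelope (illustrative parameter values, not part of the original notes); it relies on scipy's rice distribution, whose shape parameter is b = μ/σ with scale σ.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
sigma, mu_x, mu_y, n = 1.0, 2.0, 1.5, 100_000
mu = np.hypot(mu_x, mu_y)

# envelope of a complex Gaussian with a nonzero mean (line-of-sight) component
z = np.abs((mu_x + rng.normal(0, sigma, n)) + 1j * (mu_y + rng.normal(0, sigma, n)))

# compare with scipy's Rician distribution: shape b = mu/sigma, scale = sigma
print(stats.kstest(z, "rice", args=(mu / sigma, 0, sigma)))
```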


2. Review of Probability Theory: Joint Moments… (1)

• Given two r.vs X and Y and a function g(x,y), define the r.v Z = g(X,Y). The mean of Z can be defined as

$$\mu_Z = E(Z) = \int_{-\infty}^{+\infty} z\, f_Z(z)\,dz,$$

or, in a more useful form,

$$E(Z) = \int_{-\infty}^{+\infty} z\, f_Z(z)\,dz = \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} g(x,y)\, f_{XY}(x,y)\,dx\,dy.$$

If X and Y are discrete-type r.vs, then

$$E[g(X,Y)] = \sum_i \sum_j g(x_i, y_j)\, P(X = x_i, Y = y_j).$$

Since expectation is a linear operator, we also get

$$E\Big( \sum_k a_k\, g_k(X,Y) \Big) = \sum_k a_k\, E[g_k(X,Y)].$$


2. Review of Probability Theory: Joint Moments… (2)

If X and Y are independent r.vs, it is easy to see that V = g(X) and W = h(Y) are always independent of each other. We get the interesting result

$$E[g(X)h(Y)] = \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} g(x)h(y)\, f_X(x)\, f_Y(y)\,dx\,dy = \int_{-\infty}^{+\infty} g(x) f_X(x)\,dx \int_{-\infty}^{+\infty} h(y) f_Y(y)\,dy = E[g(X)]\, E[h(Y)].$$

In the case of one random variable, we defined the parameters mean and variance to represent its average behavior. How does one parametrically represent similar cross-behavior between two random variables? Towards this, we can generalize the variance definition as follows.

• Covariance: Given any two r.vs X and Y, define

$$\mathrm{Cov}(X,Y) = E[(X-\mu_X)(Y-\mu_Y)],$$

or

$$\mathrm{Cov}(X,Y) = E(XY) - \mu_X\mu_Y = E(XY) - E(X)E(Y) = \overline{XY} - \overline{X}\,\overline{Y}.$$


2. Review of Probability Theory: Joint Moments… (3)

It is easy to see that

$$|\mathrm{Cov}(X,Y)| \le \sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}.$$

We define the normalized parameter

$$\rho_{XY} = \frac{\mathrm{Cov}(X,Y)}{\sigma_X\sigma_Y} = \frac{\mathrm{Cov}(X,Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}}, \qquad -1 \le \rho_{XY} \le 1,$$

which represents the correlation coefficient between X and Y.

Uncorrelated r.vs: If ρXY = 0, then X and Y are said to be uncorrelated r.vs. If X and Y are uncorrelated, then E(XY) = E(X)E(Y).

Orthogonality: X and Y are said to be orthogonal if E(XY) = 0. If either X or Y has zero mean, then orthogonality implies uncorrelatedness and vice versa.

If X and Y are independent r.vs, then they are also uncorrelated.


2. Review of Probability Theory: Joint Moments… (4)

Naturally, if two random variables are statistically independent, then there cannot be any correlation between them (ρXY = 0). However, the converse is in general not true: random variables can be uncorrelated without being independent.

Example 5: Let Z = aX + bY. Determine the variance of Z in terms of σX, σY and ρXY. We have

$$\mu_Z = E(Z) = E(aX + bY) = a\mu_X + b\mu_Y,$$
$$\sigma_Z^2 = \mathrm{Var}(Z) = E\big[(Z-\mu_Z)^2\big] = E\big[\big(a(X-\mu_X) + b(Y-\mu_Y)\big)^2\big]$$
$$= a^2 E(X-\mu_X)^2 + 2ab\, E\big[(X-\mu_X)(Y-\mu_Y)\big] + b^2 E(Y-\mu_Y)^2 = a^2\sigma_X^2 + 2ab\,\rho_{XY}\sigma_X\sigma_Y + b^2\sigma_Y^2.$$

In particular, if X and Y are independent, then ρXY = 0 and

$$\sigma_Z^2 = a^2\sigma_X^2 + b^2\sigma_Y^2.$$
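A Monte-Carlo sketch of Example 5 (arbitrary illustrative values of a, b, σX, σY and ρXY; not part of the original notes):

```python
import numpy as np

rng = np.random.default_rng(3)
a, b = 2.0, -1.0
sigma_x, sigma_y, rho = 1.5, 0.8, 0.6
n = 500_000

# correlated zero-mean Gaussian pair with the desired sigma_x, sigma_y, rho
cov = [[sigma_x**2, rho * sigma_x * sigma_y],
       [rho * sigma_x * sigma_y, sigma_y**2]]
x, y = rng.multivariate_normal([0, 0], cov, n).T

z = a * x + b * y
var_theory = a**2 * sigma_x**2 + 2 * a * b * rho * sigma_x * sigma_y + b**2 * sigma_y**2
print(np.corrcoef(x, y)[0, 1], rho)      # sample vs true correlation coefficient
print(z.var(), var_theory)               # sample vs theoretical variance of Z
```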


2. Review of Probability Theory: Joint Moments… (5)

• Moments: the joint moment of order (k, m) of X and Y is

$$E[X^k Y^m] = \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} x^k y^m\, f_{XY}(x,y)\,dx\,dy.$$

Following the one-random-variable case, we can define the joint characteristic function between two random variables, which will turn out to be useful for moment calculations.

• Joint characteristic function: between X and Y it is defined as

$$\Phi_{XY}(u,v) = E\big(e^{j(Xu+Yv)}\big) = \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} e^{j(xu+yv)}\, f_{XY}(x,y)\,dx\,dy.$$

Note that $|\Phi_{XY}(u,v)| \le \Phi_{XY}(0,0) = 1$. It is easy to show that

$$E(XY) = \frac{1}{j^2}\,\frac{\partial^2 \Phi_{XY}(u,v)}{\partial u\,\partial v}\bigg|_{u=0,\,v=0}.$$

2. Review of Probability Theory: Joint Moments… (6)

If X and Y are independent r.vs, then we obtain

$$\Phi_{XY}(u,v) = E\big(e^{juX}\big)\, E\big(e^{jvY}\big) = \Phi_X(u)\,\Phi_Y(v).$$

Also

$$\Phi_X(u) = \Phi_{XY}(u,0), \qquad \Phi_Y(v) = \Phi_{XY}(0,v).$$

• More on Gaussian r.vs: the joint characteristic function of two jointly Gaussian r.vs is

$$\Phi_{XY}(u,v) = E\big(e^{j(Xu+Yv)}\big) = e^{\,j(\mu_X u + \mu_Y v) - \frac{1}{2}\left(\sigma_X^2 u^2 + 2\rho\sigma_X\sigma_Y uv + \sigma_Y^2 v^2\right)}.$$

Example 6: Let X and Y be jointly Gaussian r.vs with parameters N(μX, μY, σX², σY², ρ). Define Z = aX + bY. Determine fZ(z). In this case we can use the characteristic function to solve:

$$\Phi_Z(u) = E\big(e^{jZu}\big) = E\big(e^{j(aX+bY)u}\big) = E\big(e^{jauX + jbuY}\big) = \Phi_{XY}(au,\, bu).$$


2. Review of Probability Theory: Joint Moments… (7)

or

$$\Phi_Z(u) = e^{\,j(a\mu_X + b\mu_Y)u - \frac{1}{2}\left(a^2\sigma_X^2 + 2\rho ab\,\sigma_X\sigma_Y + b^2\sigma_Y^2\right)u^2} = e^{\,j\mu_Z u - \frac{1}{2}\sigma_Z^2 u^2},$$

where

$$\mu_Z \triangleq a\mu_X + b\mu_Y, \qquad \sigma_Z^2 \triangleq a^2\sigma_X^2 + 2\rho ab\,\sigma_X\sigma_Y + b^2\sigma_Y^2.$$

Thus, Z = aX + bY is also Gaussian with mean and variance as above. We conclude that any linear combination of jointly Gaussian r.vs generates a Gaussian r.v.

Example 7: Suppose X and Y are jointly Gaussian r.vs as in Example 6. Define two linear combinations Z = aX + bY and W = cX + dY. Determine their joint distribution. The characteristic function of Z and W is given by

$$\Phi_{ZW}(u,v) = E\big(e^{j(Zu+Wv)}\big) = E\big(e^{j(aX+bY)u + j(cX+dY)v}\big) = E\big(e^{jX(au+cv) + jY(bu+dv)}\big) = \Phi_{XY}(au+cv,\, bu+dv).$$


2. Review of Probability Theory: Joint Moments… (8)

Similar to Example 6, we get

$$\Phi_{ZW}(u,v) = e^{\,j(\mu_Z u + \mu_W v) - \frac{1}{2}\left(\sigma_Z^2 u^2 + 2\rho_{ZW}\sigma_Z\sigma_W uv + \sigma_W^2 v^2\right)},$$

where

$$\mu_Z = a\mu_X + b\mu_Y, \qquad \mu_W = c\mu_X + d\mu_Y,$$
$$\sigma_Z^2 = a^2\sigma_X^2 + 2ab\rho\,\sigma_X\sigma_Y + b^2\sigma_Y^2, \qquad \sigma_W^2 = c^2\sigma_X^2 + 2cd\rho\,\sigma_X\sigma_Y + d^2\sigma_Y^2,$$

and

$$\rho_{ZW} = \frac{ac\,\sigma_X^2 + (ad+bc)\,\rho\,\sigma_X\sigma_Y + bd\,\sigma_Y^2}{\sigma_Z\sigma_W}.$$

Thus, Z and W are also jointly Gaussian r.vs with means, variances and correlation coefficient as above.
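The same kind of simulation can be used to check Example 7; the sketch below (illustrative parameter values, not part of the original notes) compares sample means, variances and the sample correlation coefficient of Z and W with the formulas above.

```python
import numpy as np

rng = np.random.default_rng(4)
mu_x, mu_y, sx, sy, rho = 1.0, -0.5, 1.2, 0.7, 0.4
a, b, c, d = 1.0, 2.0, -1.0, 0.5
n = 500_000

cov = [[sx**2, rho * sx * sy], [rho * sx * sy, sy**2]]
x, y = rng.multivariate_normal([mu_x, mu_y], cov, n).T
z, w = a * x + b * y, c * x + d * y

sz2 = a**2 * sx**2 + 2 * a * b * rho * sx * sy + b**2 * sy**2
sw2 = c**2 * sx**2 + 2 * c * d * rho * sx * sy + d**2 * sy**2
rho_zw = (a * c * sx**2 + (a * d + b * c) * rho * sx * sy + b * d * sy**2) / np.sqrt(sz2 * sw2)

print(z.mean(), a * mu_x + b * mu_y)       # sample vs theoretical mean of Z
print(z.var(), sz2, w.var(), sw2)          # sample vs theoretical variances
print(np.corrcoef(z, w)[0, 1], rho_zw)     # sample vs theoretical rho_ZW
```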


2. Review of Probability Theory: Joint Moments… (9)

To summarize, any two linear combinations of jointly Gaussian random variables (independent or dependent) are also jointly Gaussian r.vs.

(Diagram: Gaussian input → linear operator → Gaussian output.)

Gaussian r.vs are also interesting because of the following result.

• Central Limit Theorem: Suppose X1, X2, …, Xn are a set of zero-mean independent, identically distributed (i.i.d) random variables with some common distribution and finite variance σ². Consider their scaled sum

$$Y = \frac{X_1 + X_2 + \cdots + X_n}{\sqrt{n}}.$$

Then, asymptotically (as n → ∞),

$$Y \to N(0, \sigma^2).$$


2. Review of Probability Theory: Joint Moments… (10)

The central limit theorem states that a large sum of independent random variables, each with finite variance, tends to behave like a normal random variable. Thus the individual p.d.fs become unimportant when analyzing the collective sum behavior. If we model a noise phenomenon as the sum of a large number of independent random variables (e.g., electron motion in resistor components), then this theorem allows us to conclude that noise behaves like a Gaussian r.v.
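A small demonstration of the central limit theorem (a sketch with an arbitrary non-Gaussian choice for the common distribution; not part of the original notes):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
n, trials = 1000, 50_000

# zero-mean i.i.d. uniform samples on (-0.5, 0.5); variance sigma^2 = 1/12
x = rng.uniform(-0.5, 0.5, size=(trials, n))
y = x.sum(axis=1) / np.sqrt(n)          # scaled sum from the CLT statement

sigma = np.sqrt(1 / 12)
print(y.mean(), y.std(), sigma)                       # mean ~ 0, std ~ sigma
print(stats.kstest(y, "norm", args=(0, sigma)))       # close to N(0, sigma^2)
```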


2. Review of Stochastic Processes and Models

• Mean, Autocorrelation.
• Stationarity.
• Deterministic Systems.
• Discrete-Time Stochastic Processes.
• Stochastic Models.


2. Review of Stochastic Processes: Introduction (1)

• Let ζ denote the random outcome of an experiment. To every such outcome suppose a waveform X(t, ζ) is assigned. The collection of such waveforms forms a stochastic process. The set of {ζk} and the time index t can be continuous or discrete (countably infinite or finite) as well. For fixed ζi ∈ S (the set of all experimental outcomes), X(t, ζi) is a specific time function. For fixed t, X1 = X(t1, ζi) is a random variable. The ensemble of all such realizations X(t, ζ) over time represents the stochastic process (or random process) X(t).

(Figure: an ensemble of realizations X(t, ζ1), X(t, ζ2), …, X(t, ζk) plotted against time, with sampling instants t1 and t2 marked.)

For example, X(t) = a·cos(ω0t + φ), where φ is a uniformly distributed random variable in (0, 2π), represents a stochastic process.

2. Review of Stochastic Processes: Introduction (2)

If X(t) is a stochastic process, then for fixed t, X(t) represents a random variable. Its distribution function is given by

$$F_X(x,t) = P\{X(t) \le x\}.$$

Notice that FX(x, t) depends on t, since for a different t we obtain a different random variable. Further,

$$f_X(x,t) \triangleq \frac{dF_X(x,t)}{dx}$$

represents the first-order probability density function of the process X(t). For t = t1 and t = t2, X(t) represents two different random variables X1 = X(t1) and X2 = X(t2), respectively. Their joint distribution is given by

$$F_X(x_1, x_2, t_1, t_2) = P\{X(t_1) \le x_1,\ X(t_2) \le x_2\},$$

and

$$f_X(x_1, x_2, t_1, t_2) \triangleq \frac{\partial^2 F_X(x_1, x_2, t_1, t_2)}{\partial x_1\, \partial x_2}$$

represents the second-order density function of the process X(t).


2. Review of Stochastic Processes: Introduction (3)

Similarly, fX(x1, x2, …, xn, t1, t2, …, tn) represents the nth-order density function of the process X(t). Complete specification of the stochastic process X(t) requires the knowledge of fX(x1, x2, …, xn, t1, t2, …, tn) for all ti, i = 1, 2, …, n and for all n (an almost impossible task in reality!).

• Mean of a stochastic process:

$$\mu(t) \triangleq E\{X(t)\} = \int_{-\infty}^{+\infty} x\, f_X(x,t)\,dx$$

represents the mean value of the process X(t). In general, the mean of a process can depend on the time index t.

• Autocorrelation function of a process X(t) is defined as

$$R_{XX}(t_1, t_2) \triangleq E\{X(t_1)\, X^*(t_2)\} = \int\!\!\int x_1 x_2^*\, f_X(x_1, x_2, t_1, t_2)\,dx_1\,dx_2,$$

and it represents the interrelationship between the random variables X1 = X(t1) and X2 = X(t2) generated from the process X(t).


2. Review of Stochastic Processes: Introduction (4)

Properties:

(i) $R_{XX}(t_1, t_2) = R_{XX}^*(t_2, t_1) = [E\{X(t_2)\, X^*(t_1)\}]^*$.

(ii) $R_{XX}(t, t) = E\{|X(t)|^2\} > 0$  (average instantaneous power).

(iii) $R_{XX}(t_1, t_2)$ represents a nonnegative definite function, i.e., for any set of constants $\{a_i\}_{i=1}^n$,

$$\sum_{i=1}^{n}\sum_{j=1}^{n} a_i a_j^*\, R_{XX}(t_i, t_j) \ge 0;$$

this follows by noticing that $E\{|Y|^2\} \ge 0$ for $Y = \sum_{i=1}^{n} a_i X(t_i)$.

The function

$$C_{XX}(t_1, t_2) = R_{XX}(t_1, t_2) - \mu_X(t_1)\,\mu_X^*(t_2)$$

represents the autocovariance function of the process X(t).


2. Review of Stochastic Processes: Introduction (5)

Example:

$$X(t) = a\cos(\omega_0 t + \varphi), \qquad \varphi \sim U(0, 2\pi).$$

This gives

$$\mu_X(t) = E\{X(t)\} = aE\{\cos(\omega_0 t + \varphi)\} = a\cos\omega_0 t\, E\{\cos\varphi\} - a\sin\omega_0 t\, E\{\sin\varphi\} = 0,$$

since

$$E\{\cos\varphi\} = \frac{1}{2\pi}\int_0^{2\pi} \cos\varphi\,d\varphi = 0 = E\{\sin\varphi\}.$$

Similarly,

$$R_{XX}(t_1, t_2) = a^2 E\{\cos(\omega_0 t_1 + \varphi)\cos(\omega_0 t_2 + \varphi)\} = \frac{a^2}{2} E\{\cos\omega_0(t_1 - t_2) + \cos(\omega_0(t_1 + t_2) + 2\varphi)\} = \frac{a^2}{2}\cos\omega_0(t_1 - t_2).$$
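A quick ensemble simulation of this example (illustrative values of a, ω0, t1 and t2; not part of the original notes) confirms the zero mean and the autocorrelation (a²/2)·cos ω0(t1 − t2):

```python
import numpy as np

rng = np.random.default_rng(6)
a, w0 = 1.0, 2 * np.pi * 5.0
t1, t2 = 0.13, 0.31
n = 200_000

phi = rng.uniform(0, 2 * np.pi, n)           # one random phase per realization
x1 = a * np.cos(w0 * t1 + phi)
x2 = a * np.cos(w0 * t2 + phi)

print(x1.mean())                                              # ensemble mean ~ 0
print((x1 * x2).mean(), a**2 / 2 * np.cos(w0 * (t1 - t2)))    # R_XX(t1,t2) vs theory
```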


2. Review of Stochastic Processes: Stationary (1)

• Stationary processes exhibit statistical properties that are invariant to a shift of the time index. Thus, for example, second-order stationarity implies that the statistical properties of the pairs {X(t1), X(t2)} and {X(t1+c), X(t2+c)} are the same for any c. Similarly, first-order stationarity implies that the statistical properties of X(ti) and X(ti+c) are the same for any c. In strict terms, the statistical properties are governed by the joint probability density function. Hence a process is nth-order Strict-Sense Stationary (S.S.S) if

$$f_X(x_1, x_2, \ldots, x_n, t_1, t_2, \ldots, t_n) \equiv f_X(x_1, x_2, \ldots, x_n, t_1+c, t_2+c, \ldots, t_n+c) \qquad (*)$$

for any c, where the left side represents the joint density function of the random variables X1 = X(t1), X2 = X(t2), …, Xn = X(tn), and the right side corresponds to the joint density function of the random variables X'1 = X(t1+c), X'2 = X(t2+c), …, X'n = X(tn+c). A process X(t) is said to be strict-sense stationary if (*) is true for all ti, i = 1, 2, …, n; n = 1, 2, …, and any c.


2. Review of Stochastic Processes: Stationary (2)

For a first-order strict-sense stationary process, from (*) we have

$$f_X(x, t) \equiv f_X(x, t+c)$$

for any c. In particular, c = −t gives

$$f_X(x, t) = f_X(x),$$

i.e., the first-order density of X(t) is independent of t. In that case

$$E[X(t)] = \int_{-\infty}^{+\infty} x\, f_X(x)\,dx = \mu, \quad \text{a constant.}$$

Similarly, for a second-order strict-sense stationary process we have from (*)

$$f_X(x_1, x_2, t_1, t_2) \equiv f_X(x_1, x_2, t_1+c, t_2+c)$$

for any c. For c = −t2 we get

$$f_X(x_1, x_2, t_1, t_2) \equiv f_X(x_1, x_2, t_1 - t_2),$$

i.e., the second-order density function of a strict-sense stationary process depends only on the difference of the time indices t1 − t2 = τ.


2. Review of Stochastic Processes: Stationary (3)

In that case the autocorrelation function is given by

$$R_{XX}(t_1, t_2) \triangleq E\{X(t_1)\, X^*(t_2)\} = \int\!\!\int x_1 x_2^*\, f_X(x_1, x_2, \tau = t_1 - t_2)\,dx_1\,dx_2 = R_{XX}(t_1 - t_2) \triangleq R_{XX}(\tau) = R_{XX}^*(-\tau),$$

i.e., the autocorrelation function of a second-order strict-sense stationary process depends only on the difference of the time indices τ. However, the basic conditions for first- and second-order stationarity are usually difficult to verify. In that case, we often resort to a looser definition of stationarity, known as Wide-Sense Stationarity (W.S.S). Thus, a process X(t) is said to be Wide-Sense Stationary if

(i) $E\{X(t)\} = \mu$, and

(ii) $E\{X(t_1)\, X^*(t_2)\} = R_{XX}(t_1 - t_2)$,


2. Review of Stochastic Processes: Stationary (4)

i.e., for wide-sense stationary processes, the mean is a constant and the autocorrelation function depends only on the difference between the time indices. Strict-sense stationarity always implies wide-sense stationarity. However, the converse is not true in general; the only exception is the Gaussian process: if X(t) is a Gaussian process, then wide-sense stationarity (w.s.s) ⇒ strict-sense stationarity (s.s.s).


2. Review of Stochastic Processes: Systems (1)

• A deterministic system transforms each input waveform X(t, ζi) into an output waveform Y(t, ζi) = T[X(t, ζi)] by operating only on the time variable t. A stochastic system operates on both the variables t and ζ. Thus, in a deterministic system, a set of realizations at the input corresponding to a process X(t) generates a new set of realizations Y(t, ζ) at the output, associated with a new process Y(t).

(Diagram: realizations X(t, ζi) of X(t) pass through T[·] to give realizations Y(t, ζi) of Y(t).)

Our goal is to study the output process statistics in terms of the input process statistics and the system function.


2. Review of Stochastic Processes: Systems (2)

Deterministic systems are classified as follows:

• Memoryless systems: Y(t) = g[X(t)].
• Systems with memory:
  - Time-varying systems.
  - Time-invariant systems, including linear systems Y(t) = L[X(t)] and, in particular, Linear Time-Invariant (LTI) systems with impulse response h(t):

$$Y(t) = \int_{-\infty}^{+\infty} h(t-\tau)\, X(\tau)\,d\tau = \int_{-\infty}^{+\infty} h(\tau)\, X(t-\tau)\,d\tau.$$


2. Review of Stochastic Processes: Systems (3)

• Memoryless Systems: the output Y(t) in this case depends only on the present value of the input X(t), i.e., Y(t) = g{X(t)}.

- Strict-sense stationary input → memoryless system → strict-sense stationary output.
- Wide-sense stationary input → memoryless system → need not be stationary in any sense.
- X(t) stationary Gaussian with R_XX(τ) → memoryless system → Y(t) stationary, but not Gaussian, with R_XY(τ) = η R_XX(τ).

2. Review of Stochastic Processes: Systems (4)

Theorem: If X(t) is a zero-mean stationary Gaussian process, and Y(t) = g[X(t)], where g(·) represents a nonlinear memoryless device, then

$$R_{XY}(\tau) = \eta\, R_{XX}(\tau), \qquad \eta = E\{g'(X)\},$$

where g′(x) is the derivative with respect to x.

• Linear Systems: L[·] represents a linear system if

$$L\{a_1 X(t_1) + a_2 X(t_2)\} = a_1 L\{X(t_1)\} + a_2 L\{X(t_2)\}.$$

Let Y(t) = L{X(t)} represent the output of a linear system.

• Time-Invariant System: L[·] represents a time-invariant system if

$$Y(t) = L\{X(t)\} \;\Rightarrow\; L\{X(t - t_0)\} = Y(t - t_0),$$

i.e., a shift in the input results in the same shift in the output.

• If L[·] satisfies both the linearity and the time-invariance conditions, then it corresponds to a linear time-invariant (LTI) system.


2. Review of Stochastic Processes: Systems (5)

LTI systems can be uniquely represented in terms of their output to a delta function: the impulse response of the system, h(t).

(Diagram: an impulse δ(t) applied to the LTI system produces the impulse response h(t); an arbitrary input X(t) produces the output Y(t).)

Then

$$Y(t) = \int_{-\infty}^{+\infty} h(t-\tau)\, X(\tau)\,d\tau = \int_{-\infty}^{+\infty} h(\tau)\, X(t-\tau)\,d\tau,$$

where

$$X(t) = \int_{-\infty}^{+\infty} X(\tau)\,\delta(t-\tau)\,d\tau.$$

2. Review of Stochastic Processes: Systems (6)

Thus

$$Y(t) = L\{X(t)\} = L\Big\{\int_{-\infty}^{+\infty} X(\tau)\,\delta(t-\tau)\,d\tau\Big\} = \int_{-\infty}^{+\infty} L\{X(\tau)\,\delta(t-\tau)\}\,d\tau \quad \text{(by linearity)}$$
$$= \int_{-\infty}^{+\infty} X(\tau)\, L\{\delta(t-\tau)\}\,d\tau = \int_{-\infty}^{+\infty} X(\tau)\, h(t-\tau)\,d\tau = \int_{-\infty}^{+\infty} h(\tau)\, X(t-\tau)\,d\tau \quad \text{(by time-invariance).}$$

Then, the mean of the output process is given by

$$\mu_Y(t) = E\{Y(t)\} = \int_{-\infty}^{+\infty} E\{X(\tau)\}\, h(t-\tau)\,d\tau = \int_{-\infty}^{+\infty} \mu_X(\tau)\, h(t-\tau)\,d\tau = \mu_X(t) * h(t).$$

132

SSP2008 BG, CH, ÑHBK

2. Review of Stochastic Processes: Systems (7) Similarly, the cross-correlation function between the input and output processes is given by

R XY (t1 , t2 ) = E{ X (t1 )Y * (t2 )} +∞

= E{ X (t1 ) ∫ − ∞ X * (t2 − α )h * (α )dα } +∞

= ∫ − ∞ E{ X (t1 ) X * (t2 − α )}h * (α )dα +∞

= ∫ − ∞ R XX (t1 , t2 − α )h * (α )dα = R XX (t1 , t2 ) ∗ h * (t2 ).

Boä moân Vieãn thoâng Khoa Ñieän-Ñieän töû

133

SSP2008 BG, CH, ÑHBK

2. Review of Stochastic Processes: Systems (8) Finally the output autocorrelation function is given by

RYY (t1 , t2 ) = E{Y (t1 )Y * (t2 )} +∞

= E{∫ − ∞ X (t1 − β )h ( β )dβ Y * (t2 )} +∞

= ∫ − ∞ E{ X (t1 − β )Y * (t2 )}h ( β )dβ +∞

= ∫ − ∞ R XY (t1 − β , t2 )h ( β )dβ = R XY (t1 , t2 ) ∗ h(t1 ), or

RYY (t1 , t2 ) = R XX (t1 , t2 ) ∗ h * (t2 ) ∗ h (t1 ).

In particular, if X(t) is wide-sense stationary, then we have μX(t) = μX Also +∞ *

R XY ( t1 , t 2 ) = ∫ − ∞ R XX ( t1 − t 2 + α ) h (α ) dα Δ

= R XX (τ ) ∗ h * ( −τ ) = R XY (τ ), τ = t1 − t 2 . Boä moân Vieãn thoâng Khoa Ñieän-Ñieän töû

134

SSP2008 BG, CH, ÑHBK

2. Review of Stochastic Processes: Systems (9) Thus X(t) and Y(t) are jointly w.s.s. Further, the output autocorrelation simplifies to +∞ RYY (t1 , t 2 ) = ∫ −∞ RXY (t1 − β − t 2 )h( β )dβ , τ = t1 − t 2

= RXY (τ ) ∗ h(τ ) = RYY (τ ). or

RYY (τ ) = RXX (τ ) ∗ h* ( −τ ) ∗ h(τ ).

Boä moân Vieãn thoâng Khoa Ñieän-Ñieän töû


2. Review of Stochastic Processes: Systems (10)

(a) X(t) wide-sense stationary process → LTI system h(t) → Y(t) wide-sense stationary process.

(b) X(t) strict-sense stationary process → LTI system h(t) → Y(t) strict-sense stationary process.

(c) X(t) Gaussian process (also stationary) → linear system → Y(t) Gaussian process (also stationary).


2. Review of Stochastic Processes: Systems (11)

• White Noise Process: W(t) is said to be a white noise process if

$$R_{WW}(t_1, t_2) = q(t_1)\,\delta(t_1 - t_2),$$

i.e., E[W(t1) W*(t2)] = 0 unless t1 = t2. W(t) is said to be wide-sense stationary (w.s.s) white noise if E[W(t)] = constant and

$$R_{WW}(t_1, t_2) = q\,\delta(t_1 - t_2) = q\,\delta(\tau).$$

If W(t) is also a Gaussian process (white Gaussian process), then all of its samples are independent random variables (why?). For a w.s.s. white noise input W(t), the output N(t) = h(t) ∗ W(t) satisfies

$$E[N(t)] = \mu_W \int_{-\infty}^{+\infty} h(\tau)\,d\tau, \quad \text{a constant, and} \quad R_{NN}(\tau) = q\,\delta(\tau) * h^*(-\tau) * h(\tau) = q\, h^*(-\tau) * h(\tau) = q\,\rho(\tau),$$

where

$$\rho(\tau) = h(\tau) * h^*(-\tau) = \int_{-\infty}^{+\infty} h(\alpha)\, h^*(\alpha + \tau)\,d\alpha.$$
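A discrete-time sketch of this result (an assumed FIR impulse response and noise power q; not part of the original notes): white noise is filtered by h[n] and the sample autocorrelation of the colored output is compared with q·ρ[k].

```python
import numpy as np

rng = np.random.default_rng(7)
q = 2.0                                   # white-noise power, R_WW[k] = q*delta[k]
h = np.array([1.0, -0.5, 0.25])           # an arbitrary FIR impulse response h[n]
n = 1_000_000

w = rng.normal(0.0, np.sqrt(q), n)        # w.s.s. white Gaussian noise
y = np.convolve(w, h, mode="same")        # colored noise N = h * W

# theoretical R_NN[k] = q * rho[k], with rho the deterministic autocorrelation of h
rho = np.correlate(h, h, mode="full")     # lags -2 .. 2
r_theory = q * rho

# sample estimate of R_NN[k] for the same lags
lags = np.arange(-(len(h) - 1), len(h))
r_est = np.array([np.mean(y[: n - abs(k)] * y[abs(k):]) for k in lags])
print(np.round(r_est, 3))
print(np.round(r_theory, 3))
```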


2. Review of Stochastic Processes: Systems (12)

Thus the output of a white noise process through an LTI system represents a (colored) noise process.

(Diagram: white noise W(t) → LTI h(t) → colored noise N(t) = h(t) ∗ W(t).)

Note: white noise need not be Gaussian. “White” and “Gaussian” are two different concepts!

2. Review of Stochastic Processes: Discrete Time (1)

• A discrete-time stochastic process (DTStP) Xn = X(nT) is a sequence of random variables. The mean, autocorrelation and autocovariance functions of a discrete-time process are given by

$$\mu_n = E\{X(nT)\}, \qquad R(n_1, n_2) = E\{X(n_1 T)\, X^*(n_2 T)\}, \qquad C(n_1, n_2) = R(n_1, n_2) - \mu_{n_1}\mu_{n_2}^*,$$

respectively. As before, the strict-sense and wide-sense stationarity definitions apply here also. For example, X(nT) is wide-sense stationary if

$$E\{X(nT)\} = \mu, \quad \text{a constant,}$$

and

$$E\big[X\{(k+n)T\}\, X^*\{kT\}\big] \triangleq R(n) = r_n = r_{-n}^*,$$

i.e., R(n1, n2) = R(n1 − n2) = R*(n2 − n1).


2. Review of Stochastic Processes: Discrete Time (2)

• If X(nT) represents a wide-sense stationary input to a discrete-time system {h(nT)}, and Y(nT) is the system output, then as before the cross-correlation function satisfies

$$R_{XY}(n) = R_{XX}(n) * h^*(-n),$$

and the output autocorrelation function is given by

$$R_{YY}(n) = R_{XY}(n) * h(n), \qquad \text{or} \qquad R_{YY}(n) = R_{XX}(n) * h^*(-n) * h(n).$$

Thus wide-sense stationarity from input to output is preserved for discrete-time systems also.


2. Review of Stochastic Processes: Discrete Time (3)

• The mean (or ensemble average μ) of a stochastic process is obtained by averaging across the process, while the time average is obtained by averaging along the process as

$$\hat{\mu} = \frac{1}{M} \sum_{n=0}^{M-1} X_n,$$

where M is the total number of time samples used in the estimation. For a wide-sense stationary DTStP Xn, if the time average converges to the ensemble average in the sense that

$$\lim_{M \to \infty} E\big[(\mu - \hat{\mu})^2\big] = 0,$$

the process Xn is said to be mean ergodic (in the mean-square error sense).
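A minimal sketch of the idea (an i.i.d., hence mean-ergodic, process with an assumed mean; not part of the original notes): the time average of one realization and the ensemble average at a fixed time both converge to μ.

```python
import numpy as np

rng = np.random.default_rng(8)
mu, M = 3.0, 100_000

# one realization of an i.i.d. (hence mean-ergodic) process X_n with mean mu
x = mu + rng.normal(0.0, 1.0, M)
mu_hat = x.mean()                      # time average over a single realization

# ensemble average at a fixed n, over many independent realizations
ensemble = mu + rng.normal(0.0, 1.0, 10_000)
print(mu_hat, ensemble.mean())         # both close to mu = 3.0
```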


2. Review of Stochastic Processes: Discrete Time (4)

• Let us define an (M×1) observation vector xn containing the elements of the time series Xn, Xn−1, …, Xn−M+1:

$$\mathbf{x}_n = [X_n, X_{n-1}, \ldots, X_{n-M+1}]^T.$$

An (M×M) correlation matrix R (using the wide-sense stationarity condition) can be defined as

$$\mathbf{R} = E\big[\mathbf{x}_n \mathbf{x}_n^H\big] = \begin{bmatrix} R(0) & R(1) & \cdots & R(M-1)\\ R(-1) & R(0) & \cdots & R(M-2)\\ \vdots & \vdots & \ddots & \vdots\\ R(-M+1) & R(-M+2) & \cdots & R(0) \end{bmatrix}.$$

The superscript H denotes Hermitian transposition.


2. Review of Stochastic Processes: Discrete Time (5)

Properties:

(i) The correlation matrix R of a stationary DTStP is Hermitian: RH = R, or R(−k) = R*(k). Therefore

$$\mathbf{R} = \begin{bmatrix} R(0) & R(1) & \cdots & R(M-1)\\ R^*(1) & R(0) & \cdots & R(M-2)\\ \vdots & \vdots & \ddots & \vdots\\ R^*(M-1) & R^*(M-2) & \cdots & R(0) \end{bmatrix}.$$

(ii) The matrix R of a stationary DTStP is Toeplitz: all elements on the main diagonal are equal, and the elements on any subdiagonal are also equal.

(iii) Let x be an arbitrary (nonzero) (M×1) complex-valued vector; then xHRx ≥ 0 (R is nonnegative definite).

(iv) If xBn is the backward arrangement of xn, i.e., xBn = [xn−M+1, xn−M+2, …, xn]T, then E[xBn xBnH] = RT.
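A small sketch (assumed real-valued lags R(k); not part of the original notes) that builds the Toeplitz correlation matrix with scipy and checks properties (i) and (iii):

```python
import numpy as np
from scipy.linalg import toeplitz

# correlation lags R(0..M-1) of some WSS process (illustrative AR(1)-like values)
M, a, var = 4, 0.7, 1.0
r = var * a ** np.arange(M)                 # R(k) = var * a^k, k = 0..M-1 (real-valued)

R = toeplitz(r)                             # Hermitian Toeplitz correlation matrix
print(R)
print(np.allclose(R, R.conj().T))           # (i)  Hermitian
print(np.all(np.linalg.eigvalsh(R) >= 0))   # (iii) nonnegative definite
```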


2. Review of Stochastic Processes: Discrete Time (6)

(v) Consider the correlation matrices RM and RM+1, corresponding to M and M+1 observations of the process; these matrices are related by

$$\mathbf{R}_{M+1} = \begin{bmatrix} R(0) & \mathbf{r}^H \\ \mathbf{r} & \mathbf{R}_M \end{bmatrix} \qquad \text{or} \qquad \mathbf{R}_{M+1} = \begin{bmatrix} \mathbf{R}_M & \mathbf{r}^{B*} \\ \mathbf{r}^{BT} & R(0) \end{bmatrix},$$

where rH = [R(1), R(2), …, R(M)] and rBT = [r(−M), r(−M+1), …, r(−1)].


2. Review of Stochastic Processes: Discrete Time (7)

• Consider a time series consisting of a complex sine wave plus noise:

$$u_n = u(n) = \alpha \exp(j\omega n) + v(n), \qquad n = 0, \ldots, N-1.$$

The sources of the sine wave and the noise are independent. It is assumed that v(n) has zero mean and autocorrelation function given by

$$E\big[v(n)\, v^*(n-k)\big] = \begin{cases} \sigma_v^2, & k = 0,\\ 0, & k \ne 0.\end{cases}$$

For a lag k, the autocorrelation function of the process u(n) is

$$r(k) = E\big[u(n)\, u^*(n-k)\big] = \begin{cases} |\alpha|^2 + \sigma_v^2, & k = 0,\\ |\alpha|^2 e^{j\omega k}, & k \ne 0.\end{cases}$$

2. Review of Stochastic Processes: Discrete Time (8)

Therefore, the correlation matrix of u(n) is

$$\mathbf{R} = |\alpha|^2 \begin{bmatrix} 1 + \dfrac{1}{\rho} & \exp(j\omega) & \cdots & \exp(j\omega(M-1)) \\ \exp(-j\omega) & 1 + \dfrac{1}{\rho} & \cdots & \exp(j\omega(M-2)) \\ \vdots & \vdots & \ddots & \vdots \\ \exp(-j\omega(M-1)) & \exp(-j\omega(M-2)) & \cdots & 1 + \dfrac{1}{\rho} \end{bmatrix},$$

where $\rho = \dfrac{|\alpha|^2}{\sigma_v^2}$ is the signal-to-noise ratio (SNR).
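The sketch below (illustrative α, ω, σv and M; not part of the original notes) builds this correlation matrix directly and compares it with the sample average of xn xnH over the observation vectors.

```python
import numpy as np

rng = np.random.default_rng(9)
M, N = 4, 200_000
alpha, omega, sigma_v = 1.0, 0.8, 0.5

# theoretical correlation matrix of u(n) = alpha*exp(j*omega*n) + v(n)
k = np.arange(M)
R_theory = abs(alpha) ** 2 * np.exp(1j * omega * (k[None, :] - k[:, None]))
R_theory += sigma_v ** 2 * np.eye(M)        # i.e. |alpha|^2 (1 + 1/rho) on the diagonal

# sample estimate: average of x_n x_n^H over the observation vectors
v = rng.normal(0, sigma_v / np.sqrt(2), N) + 1j * rng.normal(0, sigma_v / np.sqrt(2), N)
u = alpha * np.exp(1j * omega * np.arange(N)) + v
X = np.stack([u[M - 1 - m : N - m] for m in range(M)])   # rows: u(n), u(n-1), ..., u(n-M+1)
R_hat = (X @ X.conj().T) / X.shape[1]

print(np.max(np.abs(R_hat - R_theory)))     # small
```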

2. Review of Stochastic Models (1)

• Consider an input–output representation

$$X(n) = -\sum_{k=1}^{p} a_k X(n-k) + \sum_{k=0}^{q} b_k W(n-k),$$

where X(n) may be considered as the output of a system {h(n)} driven by the input W(n). Using the z-transform, this gives

$$X(z) \sum_{k=0}^{p} a_k z^{-k} = W(z) \sum_{k=0}^{q} b_k z^{-k}, \qquad a_0 \equiv 1,$$

or

$$H(z) = \sum_{k=0}^{\infty} h(k) z^{-k} = \frac{X(z)}{W(z)} = \frac{b_0 + b_1 z^{-1} + b_2 z^{-2} + \cdots + b_q z^{-q}}{1 + a_1 z^{-1} + a_2 z^{-2} + \cdots + a_p z^{-p}} \triangleq \frac{B(z)}{A(z)}$$

represents the transfer function of the associated system response {h(n)}, so that

$$X(n) = \sum_{k=0}^{\infty} h(n-k)\, W(k).$$

(Diagram: W(n) → h(n) → X(n).)

X(n) SSP2008 BG, CH, ÑHBK

2. Review of Stochastic Models (2) Notice that the transfer function H(z) is rational with p poles and q zeros that determine the model order of the underlying system. The output X(n) undergoes regression over p of its previous values and at the same time a moving average based on W(n), W(n-1), …, W(n-q) of the input over (q + 1) values is added to it, thus generating an Auto Regressive Moving Average (ARMA (p, q)) process X(n). Generally the input {W(n)} represents a sequence of uncorrelated random variables of zero mean and constant variance so that

RWW (n) = σ W2δ (n). If in addition, {W(n)} is normally distributed then the output {X(n)} also represents a strict-sense stationary normal process. If q = 0, then X(n) represents an Auto Regressive AR(p) process (all-pole process), and if p = 0, then X(n) represents an Moving Average MA(q) process (all-zero process). Boä moân Vieãn thoâng Khoa Ñieän-Ñieän töû


2. Review of Stochastic Models (3)

AR(1) process: An AR(1) process has the form

$$X(n) = a X(n-1) + W(n),$$

and the corresponding system transfer function is

$$H(z) = \frac{1}{1 - a z^{-1}} = \sum_{n=0}^{\infty} a^n z^{-n},$$

provided |a| < 1. Thus

$$h(n) = a^n, \qquad |a| < 1,$$

represents the impulse response of a stable AR(1) system. We get the output autocorrelation sequence of an AR(1) process to be

$$R_{XX}(n) = \sigma_W^2\,\delta(n) * \{a^{-n}\} * \{a^n\} = \sigma_W^2 \sum_{k=0}^{\infty} a^{|n|+k}\, a^k = \sigma_W^2\, \frac{a^{|n|}}{1 - a^2}.$$

The normalized (in terms of RXX(0)) output autocorrelation sequence is given by

$$\rho_X(n) = \frac{R_{XX}(n)}{R_{XX}(0)} = a^{|n|}, \qquad |n| \ge 0.$$
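A simulation check of the AR(1) autocorrelation (assumed values of a and σW; not part of the original notes); note the sign flip in the denominator polynomial passed to lfilter, since the AR(1) equation here is written with +a:

```python
import numpy as np
from scipy.signal import lfilter

rng = np.random.default_rng(12)
a, sigma_w, n = 0.8, 1.0, 1_000_000

w = rng.normal(0.0, sigma_w, n)
x = lfilter([1.0], [1.0, -a], w)                 # X(n) = a X(n-1) + W(n)

r0 = np.mean(x * x)
print(r0, sigma_w**2 / (1 - a**2))               # R_XX(0) vs sigma_W^2 / (1 - a^2)

for k in (1, 2, 3):
    rho_hat = np.mean(x[:-k] * x[k:]) / r0
    print(k, rho_hat, a**k)                      # rho_X(k) vs a^|k|
```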

149

SSP2008 BG, CH, ÑHBK

2. Review of Stochastic Models (4) It is instructive to compare an AR(1) model discussed above by superimposing a random component to it, which may be an error term associated with observing a first order AR process X(n). Thus Y ( n) = X ( n) + V ( n) where X(n) ~ AR(1), and V(n) is an uncorrelated random sequence with zero mean and variance σ2V that is also uncorrelated with {W(n)}. Then, we obtain the output autocorrelation of the observed process Y(n) to be RYY (n) = RXX ( n) + RVV ( n) = RXX (n) + σ V2δ (n)

a |n| 2 = σW + σ δ ( n) V 2 1− a so that its normalized version is given by n=0 ⎧1 Δ RYY (n) = ⎨ |n| ρY (n) = RYY (0) ⎩c a n = ±1, ± 2, where σ W2 c= 2 < 1. 2 2 σ W + σ V (1 − a ) 2

Boä moân Vieãn thoâng Khoa Ñieän-Ñieän töû

150

SSP2008 BG, CH, ÑHBK

2. Review of Stochastic Models (5) The results demonstrate the effect of superimposing an error sequence on an AR(1) model. For non-zero lags, the autocorrelation of the observed sequence {Y(n)} is reduced by a constant factor compared to the original process {X(n)}. The superimposed error sequence V(n) only affects the corresponding term in Y(n) (term by term). However, a particular term in the “input sequence” W(n) affects X(n) and Y(n) as well as all subsequent observations. ρ X (0) = ρ Y (0) = 1 ρ X (k ) > ρY (k )

0

Boä moân Vieãn thoâng Khoa Ñieän-Ñieän töû

n k

151

SSP2008 BG, CH, ÑHBK

2. Review of Stochastic Models (6) AR(2) Process: An AR(2) process has the form

X ( n) = a1 X ( n − 1) + a2 X (n − 2) + W (n) and the corresponding transfer function is given by ∞

H ( z ) = ∑ h( n) z − n =

so that

n =0

1 b1 b2 = + 1 − a1 z −1 − a2 z − 2 1 − λ1 z −1 1 − λ2 z −1

h(0) = 1, h(1) = a1 , h(n) = a1h(n − 1) + a2 h(n − 2), n ≥ 2

and in term of the poles of the transfer function, we have

h(n) = b1λ1n + b2 λn2 ,

n≥0

that represents the impulse response of the system. We also have

λ1 + λ2 = a1 ,

λ1λ2 = −a2 ,

b1 + b2 = 1, b1λ1 + b2 λ2 = a1. and H(z) stable implies | λ1 |< 1, | λ2 |< 1. Boä moân Vieãn thoâng Khoa Ñieän-Ñieän töû

152

SSP2008 BG, CH, ÑHBK

2. Review of Stochastic Models (7) Further, the output autocorrelations satisfy the recursion R XX ( n ) = E{ X ( n + m ) X * ( m )}

= E{[ a1 X ( n + m − 1) + a2 X ( n + m − 2)] X * ( m )} 0

+ E{W ( n + m ) X * ( m )} = a1 R XX ( n − 1) + a2 R XX ( n − 2) and hence their normalized version is given by R (n ) ρ X ( n ) =Δ XX = a1 ρ X ( n − 1) + a 2 ρ X ( n − 2). R XX (0) By direct calculation using, the output autocorrelations are given by RXX (n) = RWW (n) ∗ h* (− n) ∗ h(n) = σ W2 h* (− n) ∗ h(n) ∞

= σ W ∑ h * ( n + k ) ∗ h( k ) 2

k =0

⎛ | b1 |2 (λ1* ) n b1*b2 (λ1* ) n b1b2* (λ*2 ) n | b2 |2 (λ*2 ) n ⎞ = σW ⎜ + + + ⎟ * 2 2 * 1 − λ1λ2 1 − λ1λ2 1− | λ2 | ⎠ ⎝ 1− | λ1 | 2

Boä moân Vieãn thoâng Khoa Ñieän-Ñieän töû

153

SSP2008 BG, CH, ÑHBK

2. Review of Stochastic Models (8) Then, the normalized output autocorrelations may be expressed as

RXX (n) *n *n ρ X ( n) = = c1λ1 + c2 λ2 RXX (0)

Boä moân Vieãn thoâng Khoa Ñieän-Ñieän töû

154

SSP2008 BG, CH, ÑHBK

2. Review of Stochastic Models (9) ‰ An ARMA (p, q) system has only p + q + 1 independent coefficients, (ak, k = 1… p; bi, i = 0… q) and hence its impulse response sequence {hk} also must exhibit a similar dependence among them. In fact, according to P. Dienes (1931), and Kronecker (1881) states that the necessary and ∞ −k sufficient condition for H ( z ) = ∑ k = 0 hk z to represent a rational system (ARMA) is that

det H n = 0, where

n≥N

(for all sufficiently large n ),

⎛ h0 h1 h2 ⎜h h h3 Δ 1 2 ⎜ Hn = ⎜ ⎜h h ⎝ n n +1 hn + 2

hn ⎞ hn +1 ⎟ ⎟. ⎟ h2 n ⎟⎠

i.e., in the case of rational systems for all sufficiently large n, the Hankel matrices Hn all have the same rank. Boä moân Vieãn thoâng Khoa Ñieän-Ñieän töû

155

SSP2008 BG, CH, ÑHBK
