Fundamentals of Communications A1: Simple Statistics EE3158 Professor Ian Groves
[email protected] www.ctr.kcl.ac.uk/members
Simple Statistics
Averages, Spreads, Z scores, Normal Distribution, Finding probabilities, Application to Telecommunications
A1 - Simple Statistics
Averages
Mean, median and mode
Weekly rent (£) paid by 15 students sharing accommodation, 1998:
45 35 51 45 51 40 42 46 37 42 47 49 49 36 42
Mean (or average): add the observations and divide by the number of observations.
$\bar{x} = \frac{\sum x}{n} = \frac{657}{15} = 43.8$
Averages…2
Median: the middle observation. Rank the observations and find the middle one, the (n+1)/2th observation.
35 36 37 40 42 42 42 45 45 46 47 49 49 51 51
The (15+1)/2 = 8th observation is 45.
Mode: the most frequent observation, in this case 42.
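The three averages can be checked with a short sketch using Python's standard `statistics` module. The variable names are illustrative and the data are the 15 weekly rents listed on the Averages slide.

```python
from statistics import mean, median, mode

# Weekly rent (£) for the 15 students, copied from the Averages slide.
rents = [45, 35, 51, 45, 51, 40, 42, 46, 37, 42, 47, 49, 49, 36, 42]

rent_mean = mean(rents)      # sum of observations / number of observations
rent_median = median(rents)  # middle value of the ranked observations
rent_mode = mode(rents)      # most frequent observation

print(f"mean = {rent_mean:.1f}, median = {rent_median}, mode = {rent_mode}")
```

With an odd number of observations the median is a single ranked value; with an even number, `median` averages the two middle values.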
Spreads
Standard Deviation (SD), calculated as below:
calculate residuals: individual observation minus mean
square and sum these
divide by number of observations minus 1 [gives Variance]
take square root for Standard Deviation
$SD = \sqrt{\frac{\sum (Y_i - \bar{Y})^2}{n - 1}}$
Example: people's heights (cm)
190 185 182 208 186 187 189 179 183 191 179
mean 187.18, SD 8.02
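The standard deviation steps can be followed one at a time in a minimal sketch. The function name `sample_sd` is an illustrative choice, not something from the slides. The divisor is n - 1, as on the Spreads slide.

```python
from math import sqrt

# Heights (cm) from the Spreads slide.
heights_cm = [190, 185, 182, 208, 186, 187, 189, 179, 183, 191, 179]

def sample_sd(values):
    """Standard deviation with the n - 1 divisor."""
    n = len(values)
    mean = sum(values) / n
    residuals = [v - mean for v in values]              # observation minus mean
    variance = sum(r * r for r in residuals) / (n - 1)  # square, sum, divide by n - 1
    return sqrt(variance)                               # square root gives the SD

height_mean = sum(heights_cm) / len(heights_cm)
height_sd = sample_sd(heights_cm)
print(f"mean = {height_mean:.2f} cm, SD = {height_sd:.2f} cm")
```

The standard library's `statistics.stdev` uses the same n - 1 divisor and should give the same figure.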
Z Scores
used to ‘normalise’ data
$Z_i = \frac{X_i - \bar{X}}{SD}$
Z = (observation − mean) / standard deviation
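As an illustration, the Z score formula can be applied to the heights data from the Spreads slide. This is a sketch with illustrative variable names. After normalising, the scores have zero mean and unit standard deviation.

```python
from statistics import mean, stdev

# Heights (cm) from the Spreads slide.
heights_cm = [190, 185, 182, 208, 186, 187, 189, 179, 183, 191, 179]

m = mean(heights_cm)
sd = stdev(heights_cm)  # n - 1 divisor, as on the Spreads slide

# Z score: (observation - mean) / standard deviation
z_scores = [(x - m) / sd for x in heights_cm]

for x, z in zip(heights_cm, z_scores):
    print(f"{x} cm -> Z = {z:+.2f}")
```

The tallest person (208 cm) comes out about 2.6 standard deviations above the mean.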
Normal Distribution
A general statistical theorem, the Central Limit Theorem, states that the probability distribution of any quantity which arises as the sum of the effects of a large number of separate contributions is the Gaussian (or normal) distribution, which for unit variance and zero mean is given by:
$Z(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$
Normal Distribution…2
The probability that x is greater than some value xo is found by integrating this equation over the range xo to infinity:
$P(x > x_o) = \int_{x_o}^{\infty} Z(t)\,dt$
This is also known as the Gaussian Q-function. It cannot be solved in closed form, so it is integrated numerically and tabulated.
The total area under the curve (the integral from minus infinity to plus infinity) = 1.
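A minimal sketch of both routes to Q(x): a simple trapezoidal integration of Z(t), and the standard identity Q(x) = 0.5 erfc(x/√2), using `math.erfc` from the Python standard library. The function names, the upper integration limit and the step count are illustrative assumptions.

```python
from math import erfc, exp, pi, sqrt

def normal_pdf(x):
    """Z(x): Gaussian density for zero mean and unit variance."""
    return exp(-x * x / 2.0) / sqrt(2.0 * pi)

def q_function(x):
    """Q(x) = P(X > x), via the complementary error function."""
    return 0.5 * erfc(x / sqrt(2.0))

def q_numeric(x0, span=12.0, steps=20000):
    """Trapezoidal integration of Z(t) from x0 to x0 + span.

    The tail beyond x0 + span is assumed negligible.
    """
    h = span / steps
    total = 0.5 * (normal_pdf(x0) + normal_pdf(x0 + span))
    for i in range(1, steps):
        total += normal_pdf(x0 + i * h)
    return total * h

for x in (0.0, 1.0, 2.0, 3.0):
    print(f"x = {x:.1f}  erfc: {q_function(x):.5f}  numeric: {q_numeric(x):.5f}")
```

The two columns should agree to the precision shown, which is the sense in which the function is "tabulated".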
Areas Under the Normal Curve
‘Bell’-shaped curve, symmetrical about the mean.
Most observations fall within +/- one SD of the mean.
Tabulated Results
Values of Q(x) for various x

x      Q(x)
0.0    0.50000
0.5    0.30854
1.0    0.15866
1.5    0.06681
2.0    0.02275
2.5    0.00621
3.0    0.00135
3.5    0.00023
4.0    3.17 x 10^-5
Some ‘rules of thumb’ (SD is Standard Deviation):
about 68% of a population fall within +/- 1 SD
about 95% fall within +/- 2 SD
about 99.7% fall within +/- 3 SD
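The rules of thumb can be checked against the normal distribution. For a normally distributed population, the fraction within +/- k SD of the mean is erf(k/√2). The sketch below uses `math.erf`, and the function name is an illustrative choice.

```python
from math import erf, sqrt

def fraction_within(k):
    """Fraction of a normal population within +/- k standard deviations."""
    return erf(k / sqrt(2.0))

for k in (1, 2, 3):
    print(f"+/- {k} SD: {100 * fraction_within(k):.1f}%")
```

To one decimal place the figures are 68.3%, 95.4% and 99.7%.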
For large values of x we can approximate Q(x) as Z(x)/x with less than 10% error for x>3
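A short sketch comparing the approximation Z(x)/x with Q(x) computed from `math.erfc`. The function names are illustrative. The relative error is just under 10% at x = 3 and shrinks as x grows.

```python
from math import erfc, exp, pi, sqrt

def normal_pdf(x):
    """Z(x): Gaussian density for zero mean and unit variance."""
    return exp(-x * x / 2.0) / sqrt(2.0 * pi)

def q_exact(x):
    """Q(x) via the complementary error function."""
    return 0.5 * erfc(x / sqrt(2.0))

def q_approx(x):
    """Large-x approximation Q(x) ~ Z(x) / x."""
    return normal_pdf(x) / x

def relative_error(x):
    return (q_approx(x) - q_exact(x)) / q_exact(x)

for x in (3.0, 3.5, 4.0, 5.0):
    print(f"x = {x:.1f}  Q = {q_exact(x):.3e}  Z/x = {q_approx(x):.3e}  "
          f"error = {100 * relative_error(x):.1f}%")
```

The approximation always overestimates Q(x), so error rates obtained from it are slightly pessimistic.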
Gaussian Probabilities (Slide 11)

Level Xo   20log Xo (dB)   + 6 dB (dB)   Error Rate
1.28        2.1             8.1          10^-1
2.33        7.3            13.3          10^-2
3.09        9.8            15.8          10^-3
3.73       11.4            17.4          10^-4
4.28       12.6            18.6          10^-5
4.76       13.6            19.6          10^-6
5.21       14.3            20.3          10^-7
5.62       15.0            21.0          10^-8

Values for Xo are taken from published tables for Xo < 3, otherwise computed from the approximation (Excel Solver function).
Levels are expressed as a signal-to-noise ratio in dB. This is a power ratio, and power goes as voltage squared, hence 20log(Xo). Use the + 6 dB column for pulse detection in the presence of noise.
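The table can be regenerated, approximately, by inverting the Q-function. The sketch below is not the Excel Solver approach mentioned on the slide. It uses `statistics.NormalDist().inv_cdf` from the Python standard library (Python 3.8 or later), and the function name is illustrative. Because it inverts the exact distribution rather than the Z(x)/x approximation, the levels for error rates of 10^-4 and below come out slightly smaller than the tabulated ones (for example 3.72 rather than 3.73).

```python
from math import log10
from statistics import NormalDist

def level_for_error_rate(p):
    """Return Xo such that P(x > Xo) = p for a zero-mean, unit-variance Gaussian."""
    # By symmetry of the normal curve, P(x > Xo) = p means Xo = -inv_cdf(p).
    return -NormalDist().inv_cdf(p)

for exponent in range(1, 9):
    p = 10.0 ** -exponent
    xo = level_for_error_rate(p)
    db = 20.0 * log10(xo)
    # The "+ 6 dB" column follows the slide; 20*log10(2) is strictly 6.02 dB.
    print(f"10^-{exponent}: Xo = {xo:.2f}  20log Xo = {db:5.1f} dB  + 6 dB = {db + 6.0:5.1f} dB")
```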
Error Rates
For digital telecommunication systems we are interested in low error rates when detecting pulses in the presence of noise.
An error occurs if the instantaneous noise voltage exceeds half the pulse amplitude of E volts.
If the error rate is 10^-3, say, Xo is 3.09 from the previous table.
0.5E/SD = 3.09, so E/SD = 6.18
20log(6.18) = 15.8 dB, i.e. 6 dB greater than the 20log Xo value (9.8 dB) in that table.
We can now plot an error rate curve for a rectangular pulse detected in the presence of noise as a function of signal to noise ratio.
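The worked example generalises to any target error rate: find Xo, double it to get E/SD, and convert to dB. This is a minimal sketch under the slide's assumption that the decision threshold sits at half the pulse amplitude. The function name is illustrative, and the inverse Q-function is taken from `statistics.NormalDist`.

```python
from math import log10
from statistics import NormalDist

def required_snr_db(error_rate):
    """Signal-to-noise ratio (dB) needed for a given error rate.

    Assumes an error occurs when the noise voltage exceeds 0.5 * E,
    where the noise is Gaussian with standard deviation SD.
    """
    xo = -NormalDist().inv_cdf(error_rate)  # Xo = 0.5 * E / SD
    e_over_sd = 2.0 * xo                    # E / SD
    return 20.0 * log10(e_over_sd)          # voltage ratio expressed in dB

print(f"Error rate 10^-3 needs about {required_snr_db(1e-3):.1f} dB")
```

For an error rate of 10^-3 this reproduces the 15.8 dB figure above.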
Error Rate Curve
[Figure: error probability, plotted as error rate 10^y for y from 0 to -9, against signal to noise ratio from 8 to 24 dB]
Calculated curve for a rectangular pulse from the Gaussian Probabilities table (Slide 11)
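Points on a curve of this kind can be computed directly. This is a sketch, not the calculation used to draw the original figure. It takes the signal-to-noise ratio as 20log(E/SD), places the threshold at 0.5E, and evaluates Q with `math.erfc`. The function name is illustrative.

```python
from math import erfc, log10, sqrt

def error_probability(snr_db):
    """Error probability for a rectangular pulse in Gaussian noise.

    snr_db is taken as 20*log10(E / SD); an error is assumed to occur
    when the noise voltage exceeds half the pulse amplitude.
    """
    e_over_sd = 10.0 ** (snr_db / 20.0)
    xo = 0.5 * e_over_sd
    return 0.5 * erfc(xo / sqrt(2.0))  # Q(xo)

# Same axis range as the plot: 8 to 24 dB in 2 dB steps.
for snr_db in range(8, 25, 2):
    pe = error_probability(snr_db)
    print(f"{snr_db:2d} dB  log10(error rate) = {log10(pe):7.2f}")
```

The error rate falls steeply with signal-to-noise ratio: a few dB of extra signal changes it by several orders of magnitude.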