SCHOOL OF MATHEMATICS MATHEMATICS FOR PART I ENGINEERING Self Study Course
MODULE 25
STATISTICS II
Module Topics
1. Mean and standard error of sample data
2. Normal distribution
3. Sampling
4. Confidence intervals for means
5. Hypothesis testing
A:
Work Scheme based on JAMES (THIRD EDITION)
1. Turn to p.887 and study section 13.4.8. Work through Example 13.19.
2. Read through section 13.4.9 on scaling and adding random variables. Note the shaded results on p.889 for the mean $E(X + Y)$ and, provided $X$ and $Y$ are independent, the corresponding result for $\mathrm{Var}(X + Y)$.
3. Move on to section 13.4.10, which concerns sample data. Study first the introductory comments and the subsection on sample average and variance. It is important for you to note that the first definition of sample variance in J. is not correct, and you should use the quantity $S_{X,n-1}^2$, defined in the middle of p.891. The reason why the divisor is $n - 1$ and not $n$ is that the differences $X_i - \bar{X}$ sum to zero, and so are not independent. Given any $n - 1$ differences the $n$th is then completely determined, and $S_X^2$ is the average of only $n - 1$ independent terms. Clearly the values of $S_{X,n}^2$ and $S_{X,n-1}^2$ come close together when $n$ is large. The expression for $S_{X,n-1}^2$ can be written in a different form, which is often useful in calculations, and you will find this alternative definition on the Formula Sheet.
Work through Example 13.22, noting that the answer for standard deviation will be slightly different due to the change in definition.
***Do Exercise 42(a, sample average and correct sample standard deviation only) on p.895***
4. Read the introduction to section 13.5. The Poisson distribution is not considered in this module. Study section 13.5.1 on the binomial distribution. For the Bernoulli trial and binomial distribution it is common to introduce $q = 1 - p$, so that $q$ represents the probability of a failure. You can then write the formula at the top of p.898 in the form $\mathrm{Var}(Y) = npq$. Study Example 13.25.
5. Omit section 13.5.2 and turn to p.901. Study section 13.5.3 on the normal distribution. The standard normal distribution, with mean 0 and variance 1, is extremely useful in applications. J. points out that if a random variable $X$ is normally distributed with mean $\mu_X$ and variance $\sigma_X^2$ (i.e. $X \sim N(\mu_X, \sigma_X^2)$), then $Z = \dfrac{X - \mu_X}{\sigma_X}$ is normally distributed with mean 0 and variance 1 (i.e. $Z \sim N(0, 1)$). Figure 13.22 on p.903 shows a large collection of results. A very much smaller subset is given on the Formula Sheet.
Work through Examples 13.28 and 13.29. In part (a) of the solution to 13.29 note that, because of symmetry of the standard normal distribution about $Z = 0$, we know $P(Z < -2) = P(Z > 2)$, and then $P(Z > 2) = 1 - P(Z \le 2) = 1 - \Phi(2)$.
***Do Exercises 50, 51, 59 on p.911***
6. The remainder of the topics to be covered in this module are not in J., so the material is discussed in some detail below.
One of the major problems in statistics is estimating the properties of a large population from the properties of a sample of individuals chosen from that population. The theory of sampling deals with this estimation process and with determining the accuracy of the estimates.
Select at random a sample of $n$ observations $X_1, X_2, \ldots, X_n$ taken from a population. From these $n$ observations you can calculate the values of a number of statistical quantities, for example the sample mean $\bar{X}$. If you choose another random sample of size $n$ from the same population, a different value of the statistic will, in general, result. In fact, if repeated random samples are taken, you can regard the statistic itself as a random variable, and its distribution is called the sampling distribution of the statistic.
As an illustration, consider the distribution of heights of all adult men in England. It is an empirical fact that this distribution conforms very closely to the normal curve. However, take a large number of samples of size four, drawn at random from the population, and calculate the mean height of each sample. How will these mean heights be distributed? We find that they too are normally distributed, about the same mean as the original distribution. However, the dispersion of the second distribution is smaller. This is expected, since a random sample of four is likely to include men both above and below average height; the mean of the sample will therefore tend to deviate from the true mean less than a single observation will.
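The behaviour described in the heights illustration can be explored with a small simulation. The sketch below is only illustrative: the population mean of 175 cm and standard deviation of 7 cm are assumed values chosen for the demonstration (they are not given in the text), and the names `POP_MEAN`, `POP_SD` and `simulate_sample_means` are hypothetical.

```python
import random
import statistics

# Assumed (hypothetical) population parameters, heights in cm.
POP_MEAN = 175.0
POP_SD = 7.0
SAMPLE_SIZE = 4        # samples of size four, as in the illustration
NUM_SAMPLES = 20_000   # number of repeated random samples


def simulate_sample_means(seed=0):
    """Return the mean of each of NUM_SAMPLES random samples of size SAMPLE_SIZE."""
    rng = random.Random(seed)  # fixed seed so that the run is repeatable
    means = []
    for _ in range(NUM_SAMPLES):
        sample = [rng.gauss(POP_MEAN, POP_SD) for _ in range(SAMPLE_SIZE)]
        means.append(statistics.fmean(sample))
    return means


sample_means = simulate_sample_means()
mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)

print(f"mean of the sample means: {mean_of_means:.2f} (population mean {POP_MEAN})")
print(f"s.d. of the sample means: {sd_of_means:.2f} (population s.d. {POP_SD})")
```

With these assumed values the sample means should centre on the population mean, while their standard deviation should come out at roughly half the population value, i.e. noticeably less dispersed than single observations.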
The very important general result can be stated as follows:

If random samples of size $n$ are taken from a distribution whose mean is $\mu_X$ and whose standard deviation is $\sigma_X$, then the sample means form a distribution with the same mean $\mu_X$, but with a smaller standard deviation given by $\sigma_{\bar{X}} = \sigma_X/\sqrt{n}$.

Note that the theorem is independent of the distribution of the parent population, and holds whether it is normal, binomial or any other. However, if the parent distribution is normal, then it can be shown that the sampling distribution of the sample mean is also normal. The standard deviation $\sigma_{\bar{X}} = \sigma_X/\sqrt{n}$ of the sample mean is usually called the standard error of the sample mean to distinguish it from the standard deviation of the parent population. Let us now present three worked examples.

Example A: A random sample is drawn from a population with a known standard deviation of 2.0. Find the standard error of the sample mean if the sample is of size (i) 9, (ii) 25, (iii) 100. What sample size would give a standard error equal to 0.5?
The standard error of the sample mean is $\sigma_X/\sqrt{n}$, if $\sigma_X$ is the standard deviation for the whole population. Hence, the required standard errors of the sample mean are
(i) $2/\sqrt{9} = 0.667$, to 3 decimal places, (ii) $2/\sqrt{25} = 0.4$, (iii) $2/\sqrt{100} = 0.2$.
If the standard error is equal to 0.5, then $2/\sqrt{n} = 0.5 = 1/2$. Squaring, it then follows that $4/n = 1/4$ or $n = 16$, thus the sample size is $n = 16$.
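The arithmetic of Example A can be checked with a few lines of Python. This is a minimal sketch; the function names `standard_error` and `sample_size_for` are hypothetical helpers introduced only for this illustration.

```python
import math


def standard_error(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)


def sample_size_for(sigma, target_se):
    """Smallest whole sample size whose standard error does not exceed target_se."""
    # Rearranging sigma / sqrt(n) <= target_se gives n >= (sigma / target_se)^2.
    return math.ceil((sigma / target_se) ** 2)


sigma = 2.0  # known population standard deviation from Example A
errors = {n: round(standard_error(sigma, n), 3) for n in (9, 25, 100)}
required_n = sample_size_for(sigma, 0.5)

print(errors)       # standard errors for n = 9, 25, 100
print(required_n)   # sample size giving a standard error of 0.5
```

Because the standard error falls only as $1/\sqrt{n}$, halving it requires four times as many observations, which is why moving from $n = 25$ to $n = 100$ takes the standard error from 0.4 to 0.2.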
Example B: The diameters of shafts made by a certain manufacturing process are known to be normally distributed with mean 2.500 cm and standard deviation 0.009 cm. What is the distribution of the sample mean diameter of nine such shafts selected at random? Calculate the proportion of such sample means which can be expected to exceed 2.506 cm.
The earlier result states that the sampling distribution of the sample mean will also be normal, with the same mean, 2.500 cm, but with a standard deviation (or standard error) $\sigma_{\bar{X}} = 0.009/\sqrt{9} = 0.003$ cm. In order to calculate the probability that $\bar{X} > 2.506$, we standardize in the usual way by putting $Z = (\bar{X} - 2.500)/0.003$. Then
$$P(\bar{X} > 2.506) = P\left(Z > \frac{2.506 - 2.500}{0.003}\right) = P(Z > 2.0) = 1 - P(Z \le 2.0) = 1 - \Phi(2.0) = 1 - 0.9772 = 0.0228 .$$
Hence, the proportion of sample means which can be expected to exceed 2.506 cm is 2.28%.

Example C: What is the probability that an observed value of a normally distributed random variable lies within one standard deviation from the mean?
Suppose the given normally distributed random variable, $X$ say, has mean $\mu_X$ and standard deviation $\sigma_X$, i.e. $X \sim N(\mu_X, \sigma_X^2)$. We need to calculate $P(\mu_X - \sigma_X \le X \le \mu_X + \sigma_X)$. Define $Z = \dfrac{X - \mu_X}{\sigma_X}$, then $Z \sim N(0, 1)$. It follows that
$$P(\mu_X - \sigma_X \le X \le \mu_X + \sigma_X) = P(-1 \le Z \le 1) = 2P(0 \le Z \le 1) \quad \text{(by symmetry)}$$
$$= 2(\Phi(1) - \Phi(0)) = 2(0.8413 - 0.5000) = 2(0.3413) = 0.6826 .$$

7. When the parent distribution is normal, then so is the sampling distribution of the sample mean. What happens when the parent distribution is not normal? The rather surprising result is that, provided reasonably large samples are taken (i.e. $n \ge 30$), the sampling distribution of $\bar{X}$ is approximately normal whatever the distribution of the parent population. This remarkable result, known as the central limit theorem, can be stated as follows:

If a random sample of size $n$ is taken from ANY distribution with mean $\mu_X$ and standard deviation $\sigma_X$, the sampling distribution of $\bar{X}$ is approximately normal, with mean $\mu_X$ and standard deviation $\sigma_X/\sqrt{n}$, the approximation improving as $n$ increases.

Study the following example.

Example D: It is known that a particular make of light bulb has an average life of 800 hrs with a standard deviation of 48 hrs. Find the probability that a random sample of 144 bulbs will have an average life of less than 790 hrs.
Since the number of bulbs in the sample is large, the sample mean will be approximately normally distributed with mean 800 and $\sigma_{\bar{X}} = 48/\sqrt{144} = 4$. Put $Z = (\bar{X} - 800)/4$. Then
$$P(\bar{X} < 790) = P\left(Z < \frac{790 - 800}{4}\right) = P(Z < -2.5) = P(Z > 2.5) = 1 - P(Z \le 2.5) = 0.0062 .$$
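The probabilities in Examples B, C and D can be reproduced without tables. The sketch below assumes Python 3.8 or later, where the standard library provides `statistics.NormalDist`; the helper name `prob_mean_exceeds` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # the standard normal distribution N(0, 1)


def prob_mean_exceeds(x, mu, sigma, n):
    """P(sample mean > x) when the sample mean is N(mu, sigma^2 / n)."""
    z = (x - mu) / (sigma / sqrt(n))   # standardize using the standard error
    return 1.0 - STD_NORMAL.cdf(z)     # 1 - Phi(z)


# Example B: mean diameter of nine shafts exceeding 2.506 cm.
p_b = prob_mean_exceeds(2.506, 2.500, 0.009, 9)

# Example C: within one standard deviation of the mean, Phi(1) - Phi(-1).
p_c = STD_NORMAL.cdf(1.0) - STD_NORMAL.cdf(-1.0)

# Example D: mean life of 144 bulbs below 790 hrs.
p_d = 1.0 - prob_mean_exceeds(790.0, 800.0, 48.0, 144)

print(f"Example B: {p_b:.4f}")
print(f"Example C: {p_c:.4f}")
print(f"Example D: {p_d:.4f}")
```

The value computed for Example C may differ from 0.6826 in the fourth decimal place, because the worked solution uses the table value $\Phi(1) = 0.8413$, which is itself rounded.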
The main results concerning the distribution of $\bar{X}$ can be summarised as follows:

$$Z = \frac{\bar{X} - \mu_X}{\sigma_X/\sqrt{n}}$$

Size of sample | Form of distribution of Z
$n \ge 30$ | Good approximation to the standard normal distribution $N(0, 1)$, regardless of the distribution of the parent population.
$n < 30$ | The distribution is approximately $N(0, 1)$ if the distribution of the parent population is approximately normal.
all $n$ | The distribution is $N(0, 1)$ if the distribution of the parent population is normal.
***Do Exercise A: The length of a steel rod in a given batch is normally distributed with mean 3.10 m and standard deviation 0.16 m. Calculate the proportion of rods that are longer than 3.50 m.
***Do Exercise B: The lifetimes of a batch of video components are known to be normally distributed with mean 500 hours and standard deviation 50 hours. A purchaser requires at least 95% of them to have a lifetime greater than 400 hours. Will the batch meet the purchaser's specification?
8. Let us now consider confidence intervals. Suppose you take a sample chosen at random from a population; then the mean, $\bar{X}$, of the sample is said to provide a point estimator of the population mean $\mu_X$. This point estimator is a random variable distributed in some way around the population mean; but it contains within itself no measure of its precision. A method of estimation that does also contain such a measure is the confidence interval, within which you can be reasonably sure that the value of $\mu_X$ lies.
The procedure is to calculate a number $k$ such that the interval $(\bar{X} - k, \bar{X} + k)$ has a specified probability of containing the population mean. The values usually chosen for this probability are 0.95 and 0.99. For example, if $k$ is calculated so that
$$P(\bar{X} - k \le \mu_X \le \bar{X} + k) = 0.95 ,$$
then the interval is called the 95% confidence interval. That is, if the interval is calculated for very many samples, then about 95 out of every 100 intervals would contain $\mu_X$.
To proceed further, assume that the population variance $\sigma_X^2$ is known and that
$$Z = \frac{\bar{X} - \mu_X}{\sigma_X/\sqrt{n}} \qquad (1)$$
is distributed with the standard normal distribution $N(0, 1)$; the conditions for this are displayed in the boxed results above. Now the results on the Formula Sheet show that 95% of the standard normal distribution lies between $-1.96$ and $1.96$. Hence
$$P\left(-1.96 \le \frac{\bar{X} - \mu_X}{\sigma_X/\sqrt{n}} \le 1.96\right) = 0.95 .$$
Next re-arrange the inequalities inside the bracket. If $\dfrac{\bar{X} - \mu_X}{\sigma_X/\sqrt{n}} \le 1.96$ then $\bar{X} - \mu_X \le 1.96\,\sigma_X/\sqrt{n}$ and hence $\bar{X} - 1.96\,\sigma_X/\sqrt{n} \le \mu_X$. Similarly, if $-1.96 \le \dfrac{\bar{X} - \mu_X}{\sigma_X/\sqrt{n}}$ then $\mu_X \le \bar{X} + 1.96\,\sigma_X/\sqrt{n}$. It follows that the expression can be re-written as
$$P\left(\bar{X} - 1.96\,\frac{\sigma_X}{\sqrt{n}} \le \mu_X \le \bar{X} + 1.96\,\frac{\sigma_X}{\sqrt{n}}\right) = 0.95 .$$
Hence, the interval
$$\bar{X} - 1.96\,\frac{\sigma_X}{\sqrt{n}} \le \mu_X \le \bar{X} + 1.96\,\frac{\sigma_X}{\sqrt{n}} \qquad (2)$$
is the 95% confidence interval for $\mu_X$. Similarly $\bar{X} \pm 2.58\,\dfrac{\sigma_X}{\sqrt{n}}$ is the 99% confidence interval.
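The interval (2) is straightforward to compute. The sketch below is one possible way to do so, assuming Python 3.8 or later for `statistics.NormalDist`; the function name `confidence_interval` is hypothetical, and the multiplier is obtained from the inverse normal distribution function rather than from the Formula Sheet, so it is slightly more precise than 1.96 or 2.58. The numbers used are those of Example E below.

```python
from math import sqrt
from statistics import NormalDist


def confidence_interval(sample_mean, sigma, n, level=0.95):
    """Two-sided confidence interval for the mean when sigma is known."""
    # Critical value z with P(-z <= Z <= z) = level (about 1.96 for 0.95).
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    half_width = z * sigma / sqrt(n)   # z times the standard error
    return sample_mean - half_width, sample_mean + half_width


# Data of Example E: 45 measurements, sample mean 12.91, sigma 2.0.
low, high = confidence_interval(12.91, 2.0, 45)
print(f"95% confidence interval: ({low:.2f}, {high:.2f})")

# A 99% interval is wider, because the multiplier rises to about 2.58.
low99, high99 = confidence_interval(12.91, 2.0, 45, level=0.99)
print(f"99% confidence interval: ({low99:.2f}, {high99:.2f})")
```

Passing the confidence level as a parameter makes it easy to compare intervals: a higher level of confidence always produces a wider interval for the same data.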
Work through the example below.

Example E: The percentage of copper in a certain chemical is to be estimated by taking a series of measurements on randomly chosen small quantities of the chemical and using the sample mean to estimate the true percentage. From previous experience, individual measurements of this type are known to have a standard deviation of 2.0%. How many measurements must be made so that the standard error of the estimate is less than 0.3? If the sample mean $\bar{w}$ of 45 measurements is found to be 12.91%, give a 95% confidence interval for the true percentage, $\omega$.
Assume that $n$ measurements are made. The standard error of the sample mean is $(2/\sqrt{n})$%. If the required precision is to be achieved, we must have $2/\sqrt{n} < 0.3$, i.e. $n > 4/0.09 = 44.4$. Since $n$ must be an integer, at least 45 measurements must be made to achieve the required precision.
With a sample of 45 measurements we can use the central limit theorem and take the sample mean percentage $\bar{W}$ to be distributed normally with mean $\omega$ and standard error $2/\sqrt{45}$. Hence, if $\omega$ is the true percentage, it follows that $Z = \dfrac{\bar{W} - \omega}{2/\sqrt{45}}$ is distributed as $N(0, 1)$. Since 95% of the area under the standard normal curve lies between $Z = -1.96$ and $Z = 1.96$,
$$P\left(-1.96 \le \frac{\bar{W} - \omega}{2/\sqrt{45}} \le 1.96\right) = 0.95 .$$
Re-arranging, we obtain
$$P\left(\bar{W} - 1.96\,\frac{2}{\sqrt{45}} \le \omega \le \bar{W} + 1.96\,\frac{2}{\sqrt{45}}\right) = 0.95 .$$
Hence, the 95% confidence interval for the true percentage is
$$(12.91 - 1.96(0.298),\ 12.91 + 1.96(0.298)) = (12.33, 13.49) .$$
***Do Exercise C: A machine fills cartons of liquid; the mean fill is adjustable but the dial on the gauge is not very accurate. The standard deviation of the quantity of fill is 6 ml. A sample of 30 cartons gave a measured average content of 570 ml. Find 90% and 95% confidence intervals for the mean.
***Do Exercise D: While performing a certain task under simulated weightlessness, the pulse rate of 32 astronaut trainees increased on average by 26.4 beats per minute, with a standard deviation of 4.28 beats per minute. Construct a 95% confidence interval for the true average increase in the pulse rate of astronaut trainees performing the given task.
9. Given a sample of $n$ observations $X_1, X_2, \ldots, X_n$, the sample mean $\bar{X}$ can be calculated. The quantity
$$s^2 = \frac{1}{n - 1} \sum_{i=1}^{n} \left(X_i - \bar{X}\right)^2 \qquad (3)$$
is called the sample variance. The quantity $s^2$ varies from sample to sample, and hence it has a sampling distribution. It can be shown that the mean of the sample variance, averaged over many samples, is equal to the population variance $\sigma_X^2$. It is for this reason that it is preferable to use the sample variance $s^2$ as an estimator of $\sigma_X^2$.
In our discussion of confidence intervals for the mean, it was assumed that the population variance $\sigma_X^2$ was known. What happens when this is not the case? For samples of size $n \ge 30$, a good estimate of $\sigma_X^2$ is obtained by calculating the sample variance $s^2$, and then replacing $\sigma_X$ by $s$ in equation (1). However, for small samples ($n < 30$), the values of $s^2$ fluctuate considerably from sample to sample; the sampling distribution of the values of $\dfrac{\bar{X} - \mu_X}{s/\sqrt{n}}$ is no longer a standard normal distribution. If the samples are taken from a normal distribution, the sampling distribution of $\dfrac{\bar{X} - \mu_X}{s/\sqrt{n}}$ is called the t-distribution. The latter distribution depends on $n - 1$ and is extensively tabulated, but is not considered further in this module.
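The claim that $s^2$ in (3), averaged over many samples, recovers the population variance can be illustrated numerically. The following sketch is a simulation under assumed values, not a proof: the population is taken to be normal with standard deviation 2 (variance 4), the sample size is 5, and the names used are hypothetical. It compares the divisor $n - 1$ with the divisor $n$.

```python
import random
import statistics

POP_SD = 2.0           # assumed population s.d., so the population variance is 4.0
SAMPLE_SIZE = 5        # deliberately small, to make the effect visible
NUM_SAMPLES = 20_000   # number of repeated samples

rng = random.Random(1)  # fixed seed so that the run is repeatable
variances_n_minus_1 = []
variances_n = []

for _ in range(NUM_SAMPLES):
    sample = [rng.gauss(0.0, POP_SD) for _ in range(SAMPLE_SIZE)]
    mean = sum(sample) / SAMPLE_SIZE
    sum_sq = sum((x - mean) ** 2 for x in sample)      # sum of squared differences
    variances_n_minus_1.append(sum_sq / (SAMPLE_SIZE - 1))  # equation (3)
    variances_n.append(sum_sq / SAMPLE_SIZE)                # divisor n instead

avg_n_minus_1 = statistics.fmean(variances_n_minus_1)
avg_n = statistics.fmean(variances_n)

print(f"average of s^2 with divisor n - 1: {avg_n_minus_1:.2f}")
print(f"average of s^2 with divisor n:     {avg_n:.2f}")
print(f"population variance:               {POP_SD ** 2:.2f}")
```

In a run of this kind the divisor $n - 1$ should give an average close to the population variance, while the divisor $n$ should underestimate it by a factor of about $(n - 1)/n$, here 0.8.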
10. As mentioned earlier, one of the most important aspects of statistics is using data from a sample to make inferences about the population from which the sample was drawn. An assumption about the population is called a statistical hypothesis. A random sample is chosen and, from the information contained in the sample, we try to decide whether or not the hypothesis is true. If the evidence from the sample is inconsistent with the hypothesis, then it is rejected—i.e. we conclude that it is false. If the evidence is consistent with the hypothesis, then it is accepted— or, at least, not rejected. The procedure adopted in investigating the evidence is called a statistical test. The hypothesis which we are trying to test in any instance is called the null hypothesis. It will always be stated so as either to specify a particular value of the population parameter or to specify that two or more parameters are equal. The null hypothesis is usually denoted by H0 . A common example is the statement that the population mean has a particular value—i.e. H0 : µX = µ0 . A contrary assumption is called an alternative hypothesis and is usually denoted by H1 . In contrast to H0 , it usually specifies a range of values for the parameter. Examples of three alternative hypotheses to the null hypothesis H0 : µX = µ0 are (i) H1 : µX > µ0 ,
(ii) H1 : µX < µ0 ,
(iii) H1 : µX ≠ µ0 .
Alternative hypotheses of types (i) and (ii) are said to be one-sided or one-tailed, those of type (iii) are two-sided or two-tailed.
The result of a test is a decision to choose $H_0$ or $H_1$. Such a decision is subject to uncertainty, and hence to error. Two types of error are possible. The first (an error of type I) occurs when we reject $H_0$ on the basis of the test although it happens to be true. The probability of making such an error is called the level of significance of the test and is decided on before testing. In practice, the values most commonly chosen for the significance level are 5% or 1%. The other type of error (type II) occurs when you accept the null hypothesis on the basis of the test although it happens to be false.
The above ideas are now applied to the determination of whether or not the mean, $\bar{X}$, of a sample is consistent with a specified population mean $\mu_0$. The null hypothesis is $H_0: \mu_X = \mu_0$ and a suitable statistic to use is
$$Z = \frac{\bar{X} - \mu_0}{\sigma_X/\sqrt{n}} ,$$
where $\sigma_X$ is the standard deviation of the population (assumed known) and $n$ is the size of the sample. You then need to determine the range of values of $Z$ for which the null hypothesis would be accepted. This is known as the acceptance region for the test and it will depend, as you shall see below, on the pre-determined significance level and also on our choice of $H_1$. The corresponding range of values of $Z$ for which $H_0$ is rejected is called the rejection region. A worked example is now presented.
Example F: A standard process produces yarn with mean breaking strength 15.8 kg and standard deviation 1.9 kg. A modification is introduced and a sample of 30 lengths of yarn produced by the new process is tested to see if the breaking strength has changed. The sample mean breaking strength is 16.5 kg. Assuming the standard deviation is unchanged, is it correct to say that there is no change in the mean breaking strength?
In this case, the null hypothesis is that there has been no change in the mean breaking strength and the alternative hypothesis is that there has been a change. Thus
$$H_0: \mu_X = \mu_0 , \qquad H_1: \mu_X \ne \mu_0 ,$$
where $\mu_0 = 15.8$ and $\mu_X$ is the mean breaking strength for the new process.
If $H_0$ is true, then we know that the statistic $Z = \dfrac{\bar{X} - \mu_0}{\sigma_X/\sqrt{n}}$ has approximately the $N(0, 1)$ distribution, where $\bar{X}$ is the mean breaking strength of the 30 sample values, $n = 30$ and $\sigma_X$ is the standard deviation.
Let us test at the 5% significance level. This means that there is a rejection region of 2.5% in each tail, as shown in the diagram (since, under $H_0$, $P(Z > 1.96) = P(Z < -1.96) = 1 - P(Z \le 1.96) = 1 - \Phi(1.96) = 0.025$, i.e. 2.5%). This is an example of a two-sided test leading to a two-tailed rejection region.

[Figure 6a: standard normal curve showing the acceptance region $-1.96 \le Z \le 1.96$ and a rejection region of 2½% in each tail.]

The test is therefore: accept $H_0$ if $-1.96 \le Z \le 1.96$, reject $H_0$ otherwise.
From the data, $Z = \dfrac{16.5 - 15.8}{1.9/\sqrt{30}} = 2.018$. Hence, $H_0$ is rejected at the 5% significance level: i.e. the evidence suggests that there IS a change in the mean breaking strength.
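The two-sided test of Example F can be sketched in code as follows. This is an illustrative outline only, assuming Python 3.8 or later for `statistics.NormalDist`; the function names `z_statistic` and `two_sided_test` are hypothetical, and the critical value is computed from the inverse normal distribution function rather than read from the Formula Sheet.

```python
from math import sqrt
from statistics import NormalDist


def z_statistic(sample_mean, mu0, sigma, n):
    """Test statistic Z = (sample mean - mu0) / (sigma / sqrt(n))."""
    return (sample_mean - mu0) / (sigma / sqrt(n))


def two_sided_test(z, alpha=0.05):
    """Return (reject H0?, critical value) for H1: mu != mu0 at level alpha."""
    # alpha / 2 of the probability lies in each tail (about 1.96 when alpha = 0.05).
    critical = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return abs(z) > critical, critical


# Data of Example F: mu0 = 15.8, sigma = 1.9, n = 30, sample mean 16.5.
z = z_statistic(16.5, 15.8, 1.9, 30)
reject, critical = two_sided_test(z)

print(f"Z = {z:.3f}, critical value = {critical:.2f}")
print("reject H0" if reject else "accept H0")
```

Separating the calculation of $Z$ from the decision rule mirrors the steps of the worked example: the statistic depends only on the data, while the acceptance region depends on the significance level and on the form of $H_1$.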
Suppose now that the modification in the process was specifically designed so as to increase the strength of the yarn. The null hypothesis in this case is, as before, that the mean breaking strength is unchanged. The alternative hypothesis in this case is that the mean breaking strength is increased. Thus
$$H_0: \mu_X = \mu_0 , \qquad H_1: \mu_X > \mu_0 .$$
Now $H_0$ is rejected if the value of $Z$ is unreasonably large. You are now dealing with a one-sided test and the acceptance and (one-tailed) rejection regions at the 5% significance level are shown below.

[Figure 6b: standard normal curve showing the acceptance region $Z \le 1.64$ and a 5% rejection region in the upper tail, $Z > 1.64$.]

At the 5% significance level, the test is: accept $H_0$ if $Z \le 1.64$, reject $H_0$ otherwise.
For the data of this particular problem, Z = 2.018 and, once again, the null hypothesis is rejected. However, you should compare the two diagrams above, which illustrate the statement that the rejection region for a test depends on the form of the alternative hypothesis and on the significance level. ***Do Exercise E: The strength of steel wire made by an existing process is normally distributed with a mean value of 1250 and a standard deviation of 150 (in appropriate units). A batch of wire is made by a new process and a sample of 25 measurements gives an average strength of 1312. The standard deviation of the measurements is assumed to be unchanged. Do the measurements provide evidence that the new process is an improvement? (Use a 5% significance level.) ***Do Exercise F: For a certain chemical product it is thought that the true percentage of phosphorus is 3 per cent. 64 analyses give a mean percentage of 3.3 per cent with a standard deviation of 0.8 per cent. Is the sample mean significantly different from 3 per cent?
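The statement above, that the rejection region depends on the form of the alternative hypothesis and on the significance level, can be made concrete by computing the critical values directly. The sketch below assumes Python 3.8 or later for `statistics.NormalDist`; the function name `critical_value` is hypothetical. The value $Z = 2.018$ is the one obtained in Example F.

```python
from statistics import NormalDist


def critical_value(alpha, two_sided):
    """Upper critical value of Z for a test at significance level alpha."""
    # A two-sided test splits alpha equally between the two tails.
    tail = alpha / 2.0 if two_sided else alpha
    return NormalDist().inv_cdf(1.0 - tail)


z_observed = 2.018  # test statistic from Example F

for alpha in (0.05, 0.01):
    for two_sided in (True, False):
        c = critical_value(alpha, two_sided)
        kind = "two-sided" if two_sided else "one-sided"
        decision = "reject H0" if z_observed > c else "accept H0"
        print(f"{kind}, {alpha:.0%} level: critical value {c:.3f}, {decision}")
```

The one-sided 5% critical value computed this way is about 1.645, which the text quotes as 1.64. For these particular data the decision changes with the significance level as well as with the critical value, which is why the level must be fixed before the test is carried out.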
B:
Work Scheme based on STROUD (FIFTH EDITION)
Only a small proportion of the material in this module can be found in S., in parts of Programmes 27 and 28, so it seems sensible just to work through A: Work scheme based on JAMES (THIRD EDITION), presented above.
Specimen Test 25

1. Find the sample average and standard deviation for the following sample of component lifetimes (in thousands of hours): 5.6, 4.1, 6.0, 5.8, 5.2, 4.3, 6.4, 5.5, 6.0, 5.1

2. The weights of components produced in a particular industrial process are normally distributed with a mean weight of 18 kg and a standard deviation of 2 kg. Determine the proportion of components with weights
(i) over 21 kg,
(ii) at most 19 kg,

3. Random samples of size $n$ are taken from a normal distribution whose mean is $\mu$ and whose standard deviation is $\sigma$.
(i) What is the mean of the sampling distribution of sample means?
(ii) What is the standard deviation of the sampling distribution of sample means?
(iii) If a large number of samples of size 9 is taken, for what proportion of the samples will the mean $\bar{X}$ exceed $\mu$ by at least $\sigma/3$?

4. A machine produces metal bars. A sample of 100 bars is taken. The mean length $\bar{X}$ of the bars in the sample and the sample variance $s^2$ have the values $\bar{X} = 120$ mm, $s^2 = 400$ mm$^2$. Estimate
(i) the standard deviation $\sigma$ of the population from which the sample is drawn,
(ii) the standard error of the mean,
(iii) the 95% confidence interval for the population mean.

5. The average life of a random sample of 100 light bulbs is 3580 hours with standard deviation 400 hours. We wish to determine whether this indicates that the average life of this brand of light bulbs is greater than 3500 hours.
(i) State the appropriate null hypothesis.
(ii) State the alternative hypothesis.
(iii) Are we dealing with a one-sided or with a two-sided test?
(iv) Assuming that $Z$ is normally distributed as $N(0, 1)$, show on a diagram the acceptance and rejection regions for this problem at the 5% level of significance.
(v) Should the null hypothesis be rejected at the 5% level of significance?
(vi) What is the probability of observing a sample mean $\ge 3580$ when the true mean is 3500 hours?