Point Estimation : Definition : Point estimation is the choice of a statistic, i.e. a single number calculated from sample data, for which we have some expectation or assurance that it is reasonably close to the parameter it is supposed to estimate.
Point estimation of a mean : Parameter: population mean µ. Data: a random sample X₁, ..., Xₙ. Estimator: X̄. Estimate of standard error: s/√n.
Unbiased estimator : Let θ be the parameter of interest and θ̂ be a statistic. Then θ̂ is said to be an unbiased estimator, or its value an unbiased estimate, if and only if the mean of the sampling distribution of the estimator equals θ, whatever the value of θ.
Remark : It is a mathematical fact that X̄ is an unbiased estimator of the population mean µ, provided the observations are a random sample.
More efficient unbiased estimator : A statistic θ̂₁ is said to be a more efficient unbiased estimator of the parameter θ than the statistic θ̂₂ if
1. θ̂₁ and θ̂₂ are both unbiased estimators of θ;
2. the variance of the sampling distribution of the first estimator is no larger than that of the second, and is smaller for at least one value of θ.
Maximum error of estimate : Error of estimate = |X̄ − µ|. To examine this error, we assert with probability 1 − α that the inequality
−z_{α/2} ≤ (X̄ − µ)/(σ/√n) ≤ z_{α/2}, or |X̄ − µ| ≤ z_{α/2} · σ/√n,
is satisfied. Then the maximum error of estimate is
E = z_{α/2} · σ/√n,
i.e. the error will be less than z_{α/2} · σ/√n with probability 1 − α (or we say with (1 − α)100% confidence that the error is at most E).
Sample size determination : Solving the above equation for n, we get the sample size
n = (z_{α/2} · σ / E)².
Remark : Note that the method discussed so far requires that σ be known or that it can be approximated with the sample standard deviation s.
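The two formulas above translate directly into code. The sketch below uses hypothetical values (σ = 2.0, n = 25, E = 0.5, not from the text) and rounds the required sample size up to the next whole number, as is conventional.

```python
import math

def max_error(z, sigma, n):
    # E = z_{alpha/2} * sigma / sqrt(n)
    return z * sigma / math.sqrt(n)

def sample_size(z, sigma, E):
    # n = (z_{alpha/2} * sigma / E)^2, rounded up to the next whole number
    return math.ceil((z * sigma / E) ** 2)

# hypothetical values: sigma = 2.0, n = 25, 95% confidence (z_{0.025} = 1.96)
E = max_error(1.96, 2.0, 25)        # 1.96 * 2 / 5 = 0.784
n = sample_size(1.96, 2.0, 0.5)     # (1.96 * 2 / 0.5)^2 = 61.47 -> 62
```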
Maximum error of estimate (σ unknown) : If sampling is done from a normal population, we can find the maximum error of estimate by using the fact that
t = (X̄ − µ)/(S/√n)
is a random variable having the t distribution with n − 1 degrees of freedom. Thus we have
E = t_{α/2} · s/√n,
where t_{α/2} has probability α/2 of being exceeded by a t random variable having n − 1 degrees of freedom.
Interval Estimation : Definition : An interval estimate is an interval for which we can assert with a reasonable degree of certainty that it will contain the parameter under consideration.
Large sample confidence interval for µ (σ known) : Suppose that we have a large (n ≥ 30) random sample from a population with unknown mean µ and known variance σ². Then, referring to the inequality
−z_{α/2} ≤ (X̄ − µ)/(σ/√n) ≤ z_{α/2}
and calculating x̄, we obtain
x̄ − z_{α/2} · σ/√n < µ < x̄ + z_{α/2} · σ/√n.
An interval of this kind is known as a confidence interval for µ having the degree of confidence 1 − α, or (1 − α)100%, and its endpoints are called confidence limits.
Large sample confidence interval for µ (σ unknown) : Since σ is unknown in most applications, we may have to make the further approximation of substituting for σ the sample standard deviation s. Thus,
x̄ − z_{α/2} · s/√n < µ < x̄ + z_{α/2} · s/√n.
Small sample confidence interval for µ : For a small sample (n < 30), with t_{α/2} based on n − 1 degrees of freedom, we have the following (1 − α)100% confidence interval formula:
x̄ − t_{α/2} · s/√n < µ < x̄ + t_{α/2} · s/√n.
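Both interval formulas can be sketched as follows. The data values are hypothetical, and the critical values are hardcoded from the usual normal and t tables (z_{0.025} = 1.96; t_{0.025} with 15 degrees of freedom = 2.131) rather than computed.

```python
import math

def z_interval(xbar, s, n, z):
    # large-sample interval: xbar ± z_{alpha/2} * s / sqrt(n)
    half = z * s / math.sqrt(n)
    return xbar - half, xbar + half

def t_interval(xbar, s, n, t):
    # small-sample interval: xbar ± t_{alpha/2} * s / sqrt(n),
    # with t taken from tables for n - 1 degrees of freedom
    half = t * s / math.sqrt(n)
    return xbar - half, xbar + half

# hypothetical data: xbar = 50, s = 6
lo, hi = z_interval(50, 6, 36, 1.96)       # 50 ± 1.96 -> (48.04, 51.96)
lo_t, hi_t = t_interval(50, 6, 16, 2.131)  # t_{0.025,15} = 2.131 from tables
```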
Ex. 7.6 (Pg 225): To estimate the average time it takes to assemble a certain computer component, the industrial engineer at an electronics firm timed 49 technicians in the performance of this task, getting a mean of 12.00 min and a standard deviation of 1.9 min.
(a) What can we say with 95% confidence about the maximum error if x = 12.00 is used as a point estimate of the actual average time required to do the job?
Sol. : (a) n = 49, x̄ = 12.00, s = 1.9, 1 − α = 0.95, z_{0.025} = 1.96.
∴ E = z_{α/2} · s/√n = 1.96 × 1.9/√49 = 0.532.
Thus we can say with 95% confidence that the maximum error of estimate is 0.532.
(b) Use the given data to construct a 99% confidence interval for the true average time it takes to assemble the computer component.
(b) n = 49, x̄ = 12.00, s = 1.9, 1 − α = 0.99, z_{0.005} = 2.575. Using the confidence interval formula,
12 − 2.575 × 1.9/√49 < µ < 12 + 2.575 × 1.9/√49
11.3 < µ < 12.7
Thus, we are 99% confident that the interval from 11.3 min. to 12.7 min. contains the true average time.
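The arithmetic of Ex. 7.6 can be checked with a short script; this simply reuses the numbers given in the exercise with the tabled critical values 1.96 and 2.575.

```python
import math

n, xbar, s = 49, 12.00, 1.9

# (a) maximum error at 95% confidence, z_{0.025} = 1.96
E = 1.96 * s / math.sqrt(n)          # 1.96 * 1.9 / 7 = 0.532

# (b) 99% confidence interval, z_{0.005} = 2.575
half = 2.575 * s / math.sqrt(n)
lo, hi = xbar - half, xbar + half    # approximately (11.3, 12.7)
```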
Ex. 7.7 (Pg 225): With reference to Ex. 7.6, with what confidence can we assert that the sample mean does not differ from the true mean by more than 15 seconds?
Sol. : n = 49, s = 1.9, E = 0.25 min. Using E = z_{α/2} × s/√n,
z_{α/2} = 0.25 × 7 / 1.9 = 0.9211
P(Z < 0.9211) = 1 − α/2 = 0.8212
1 − α = 0.6424
Therefore, we can assert with 64.24% confidence that the error is at most 15 sec.
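This calculation can be reproduced without a normal table by expressing the standard normal CDF through the error function; the exact value differs from the tabled 0.6424 only in the fourth decimal.

```python
import math

def phi(z):
    # standard normal CDF expressed via the error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

n, s, E = 49, 1.9, 0.25        # 15 seconds = 0.25 minute
z = E * math.sqrt(n) / s       # 0.25 * 7 / 1.9 = 0.9211
confidence = 2 * phi(z) - 1    # 1 - alpha, roughly 0.642
```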
Ex. 7.11 (Pg 225): The principal of a college wants to use the mean of a random sample to estimate the average amount of time students take to get from one class to the next, and she wants to be able to assert with 98% confidence that the error is at most 0.25 minute. If it can be presumed from experience that σ = 1.25 minutes, how large a sample will she have to take?
Sol. : σ = 1.25, E = 0.25, 1 − α = 0.98 ⇒ α/2 = 0.01, z_{0.01} = 2.33.
∴ n = (z_{α/2} · σ / E)² = (2.33 × 1.25 / 0.25)² ≈ 136.
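As a check on the sample size formula with the exercise's numbers (using z_{0.01} = 2.33 from the normal table, consistent with the answer 136):

```python
import math

sigma, E = 1.25, 0.25
z = 2.33                              # z_{0.01} from the normal table
n = math.ceil((z * sigma / E) ** 2)   # (2.33 * 5)^2 = 135.72 -> 136
```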
Ex. 7.15(Pg226): A random sample of 100 teachers in a large metropolitan area revealed a mean weekly salary of Rs. 2,000 with a standard deviation of Rs. 43. With what degree of confidence can we assert that the average weekly salary of all teachers in the metropolitan area is between Rs. 1,985 and Rs. 2,015?
Sol. : n = 100, x̄ = 2000, s = 43, and the given interval is 1985 < µ < 2015.
Using x̄ − z_{α/2} · s/√n < µ < x̄ + z_{α/2} · s/√n, we get
2000 + z_{α/2} × 43/√100 = 2015 ⇒ z_{α/2} = 3.49
∴ P(Z < 3.49) = 1 − α/2 = 0.9998, so 1 − α = 0.9996.
Therefore, we can assert with 99.96% confidence that the average weekly salary of all teachers in the metropolitan area is between Rs. 1,985 and Rs. 2,015.
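The same inversion can be done in code. Note the exact CDF gives about 0.9995; the 0.9996 above comes from the table's rounding of P(Z < 3.49) to 0.9998.

```python
import math

def phi(z):
    # standard normal CDF via the error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

n, xbar, s = 100, 2000, 43
z = (2015 - xbar) * math.sqrt(n) / s   # 150/43, roughly 3.49
confidence = 2 * phi(z) - 1            # roughly 0.9995
```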
Maximum likelihood estimation : Consider a random sample of size n from a discrete population f(x; θ) that depends on a parameter θ. The joint distribution is f(x₁; θ) f(x₂; θ) ... f(xₙ; θ).
Once the observations x₁, x₂, ..., xₙ become available, we could substitute their actual values into the joint density. After the substitution, the resulting function of θ
L(θ ) = f ( x1 ;θ ) f ( x2 ;θ )...... f ( xn ;θ )
is called the likelihood function. The maximum likelihood estimator of θ is the random variable which equals the value of θ that maximizes the probability of the observed sample.
Because this is an after-the-fact calculation, we say that it maximizes the likelihood. The same procedure for obtaining the maximum likelihood estimator applies in the continuous case.
Ex. 7.23(a)(pg 227): Find the maximum likelihood estimator for λ when f ( x; λ ) is the Poisson distribution.
Sol.:
f(x; λ) = e^{−λ} λˣ / x!,  x = 0, 1, ...,  λ > 0.
The likelihood function is
L(λ) = (e^{−λ} λ^{x₁}/x₁!) · (e^{−λ} λ^{x₂}/x₂!) · ... · (e^{−λ} λ^{xₙ}/xₙ!)
= e^{−nλ} λ^{Σxᵢ} / (x₁! x₂! ... xₙ!),
where the sums run over i = 1 to n. Take
g(λ) = ln L(λ) = −nλ + (Σ xᵢ) ln λ − ln(x₁! x₂! ... xₙ!)
g′(λ) = −n + (1/λ) Σ xᵢ = 0 ⇒ λ = (1/n) Σ xᵢ
g″(λ) = −(1/λ²) Σ xᵢ < 0.
This shows that g(λ) is maximum at λ = (1/n) Σ xᵢ; consequently, L(λ) is maximum at λ = (1/n) Σ xᵢ.
Hence, the maximum likelihood estimator is λ̂ = (1/n) Σ Xᵢ = X̄.
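The derivation can be verified numerically: evaluate the Poisson log-likelihood g(λ) on a hypothetical data set (the counts below are made up) and confirm it is largest at the sample mean.

```python
import math

def poisson_loglik(lam, data):
    # g(lambda) = -n*lambda + (sum x_i) ln(lambda) - ln(x_1! x_2! ... x_n!)
    n = len(data)
    log_fact = sum(math.lgamma(x + 1) for x in data)  # ln(x!) = lgamma(x + 1)
    return -n * lam + sum(data) * math.log(lam) - log_fact

data = [2, 3, 1, 4, 0, 2]        # hypothetical counts
mle = sum(data) / len(data)      # sample mean = 2.0

# the log-likelihood is largest at the sample mean
assert poisson_loglik(mle, data) > poisson_loglik(mle - 0.3, data)
assert poisson_loglik(mle, data) > poisson_loglik(mle + 0.3, data)
```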
Ex. 7.24(a)(pg 227): Find the maximum likelihood estimator for β when f ( x; β ) is the exponential distribution.
Sol.:
f(x; β) = (1/β) e^{−x/β},  x > 0,  β > 0.
The likelihood function is
L(β) = (1/β) e^{−x₁/β} · (1/β) e^{−x₂/β} · ... · (1/β) e^{−xₙ/β}
= (1/βⁿ) e^{−(Σxᵢ)/β},
where the sums run over i = 1 to n. Take
g(β) = ln L(β) = −(1/β) Σ xᵢ − n ln β
g′(β) = (1/β²) Σ xᵢ − n/β = 0 ⇒ β = (1/n) Σ xᵢ
g″(β) = (1/β²)(n − (2/β) Σ xᵢ) = −n³/(Σ xᵢ)² < 0 for β = (1/n) Σ xᵢ.
This shows that g(β) is maximum at β = (1/n) Σ xᵢ; consequently, L(β) is maximum at β = (1/n) Σ xᵢ.
Hence, the maximum likelihood estimator is β̂ = (1/n) Σ Xᵢ = X̄.
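The same numerical check works for the exponential case; the lifetimes below are hypothetical, and the constant-free log-likelihood g(β) from the derivation is evaluated directly.

```python
import math

def exp_loglik(beta, data):
    # g(beta) = -(sum x_i)/beta - n ln(beta)
    return -sum(data) / beta - len(data) * math.log(beta)

data = [0.8, 2.1, 1.3, 0.4, 3.2]   # hypothetical lifetimes
mle = sum(data) / len(data)        # sample mean = 1.56

# the log-likelihood is largest at the sample mean
assert exp_loglik(mle, data) > exp_loglik(0.8 * mle, data)
assert exp_loglik(mle, data) > exp_loglik(1.2 * mle, data)
```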
Estimation of Proportions : Suppose X denotes the number of times that an appropriate event occurs in n trials. Then the point estimator of the population proportion is the sample proportion X/n.
If the n trials satisfy the assumptions underlying the binomial distribution, one can show that the sample proportion is an unbiased estimator of the binomial parameter p , namely, of the true proportion we are trying to estimate on the basis of a sample.
When n is large, we can construct approximate confidence intervals for the binomial parameter p by using the normal approximation. Accordingly, we can assert with probability 1 − α that the inequality
−z_{α/2} < (X − np)/√(np(1 − p)) < z_{α/2}
will be satisfied.
Making the further approximation of substituting x/n for p in √(np(1 − p)), we get the large sample confidence interval for p:
x/n − z_{α/2} √[(x/n)(1 − x/n)/n] < p < x/n + z_{α/2} √[(x/n)(1 − x/n)/n]
where the degree of confidence is (1 − α)100%.
Maximum error of estimate: Again using the normal approximation, we can assert with probability 1 − α that the error will be at most
E = z_{α/2} √(p(1 − p)/n),
with the observed value x/n substituted for p.
Sample size determination : Solving the above formula for n, we get
n = p(1 − p) (z_{α/2}/E)².
Remark: This formula cannot be used unless we know the possible value of p. If no such value is available, we can make use of the fact that p(1 − p) is at most 1/4, corresponding to p = 1/2.
Sample size (p unknown) :
n = (1/4)(z_{α/2}/E)².
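The proportion formulas can be sketched together. The interval call below uses the data of Ex. 9.3 that follows; the sample size call uses hypothetical values (95% confidence, E = 0.05) with the p-unknown bound p(1 − p) ≤ 1/4.

```python
import math

def proportion_interval(x, n, z):
    # x/n ± z_{alpha/2} * sqrt((x/n)(1 - x/n)/n)
    p_hat = x / n
    half = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half

def sample_size_p(z, E, p=None):
    # n = p(1-p)(z/E)^2; with p unknown, use the bound p(1-p) <= 1/4
    pq = 0.25 if p is None else p * (1 - p)
    return math.ceil(pq * (z / E) ** 2)

lo, hi = proportion_interval(116, 200, 2.575)  # roughly (0.49, 0.67)
n_unknown = sample_size_p(1.96, 0.05)          # hypothetical: 0.25*(39.2)^2 -> 385
```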
Ex. 9.3 (Pg. 286): In a random sample of 200 industrial accidents, it was found that 116 were due at least partially to unsafe working conditions. Construct a 99% confidence interval for the corresponding true proportion using (a) Table 9; (b) the large-sample confidence interval formula.
Sol. : n = 200, x = 116, α = 1 − 0.99 = 0.01.
(a) x/n = 0.58. Using Table 9, 0.49 < p < 0.67.
(b) x/n = 0.58, z_{0.005} = 2.575. Using the large-sample confidence interval formula, we get
0.58 − 2.575 √(0.58(1 − 0.58)/200) < p < 0.58 + 2.575 √(0.58(1 − 0.58)/200)
∴ 0.49 < p < 0.67.
Ex. 9.4 (Pg. 286): With reference to the previous exercise, what can we say with 99% confidence about the maximum error if we use the sample proportion to estimate the corresponding true proportion?
Sol. : x/n = 0.58, z_{α/2} = z_{0.005} = 2.575. Then the error is at most
E = 2.575 √(0.58(1 − 0.58)/200) = 0.0899.
Ex. 9.8 (Pg. 286): Among 200 fish caught in a large lake, 36 were inedible due to the pollution of the environment. If we use 36/200 = 0.18 as an estimate of the corresponding true proportion, with what confidence can we assert that the error of this estimate is at most 0.070?
Sol. : x/n = 0.18, n = 200, E = 0.07. Then,
E = z_{α/2} √(0.18(1 − 0.18)/200) = 0.07 ⇒ z_{α/2} = 2.577
P(Z < 2.577) = 1 − α/2 = 0.9950
1 − α = 0.9900
Therefore, we can assert with 99% confidence that the error of this estimate is at most 0.07.
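The inversion in Ex. 9.8 can be recomputed exactly, again using the error-function form of the normal CDF instead of a table.

```python
import math

def phi(z):
    # standard normal CDF via the error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

x, n, E = 36, 200, 0.07
p_hat = x / n                                 # 0.18
z = E / math.sqrt(p_hat * (1 - p_hat) / n)    # roughly 2.577
confidence = 2 * phi(z) - 1                   # roughly 0.990
```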