Review Chaps 3-4
Chapter 3 A Closer Look at Assumptions

Robustness: A statistical procedure is robust to departures from a particular assumption if it is valid even when the assumption is not met. "Valid" means that the uncertainty measures, such as confidence intervals and p-values, are very nearly equal to those obtained when the assumption holds. The t procedures assume that the data are normal, that the two populations have equal variances, and that all the observations are independent of one another.

Departures from the Normality Assumption: The t procedures are robust to departures from normality. Data depart from normality when their distribution is not symmetric and bell- or mound-shaped, or when the tails are too long or too short. While this is subjective to some extent, assessing how severe the departure from normality is will be an important part of your training.
Departures from the Equal Variances Assumption: These departures can be more serious. This assumption is best checked by looking at histograms of both samples as well as the sample standard deviations. Often, so long as the sample sizes are similar, the uncertainty measures will still be reasonably close to the true ones.
Departures from Independence: Lack of independence can be caused by serial or spatial correlation or by cluster effects. This assumption can usually be checked easily by considering the experimental design and the data collection procedures. Data that fail to meet the independence assumption require methods other than those presented here. What are some examples of data sets that violate the independence assumption?
Resistance: A statistical procedure is resistant if it does not change very much when a small part of the data changes. The t procedures are not resistant to outliers. Outliers should be identified, the analysis performed both with and without them, and both sets of results reported when the experiment is published.
Transformations of the Data: Sometimes departures from normality can be corrected by transformations. The most common transformation is the log transform, and there are two common versions: the natural log and the log base 10. Writers often use "log" to mean either one. In my notes, I will usually write ln for the natural log and log for the log base 10, though I may occasionally use log for either. Your text uses log to mean the natural log and log10 (rarely) to mean the log base 10. Please do not let this confuse or upset you.

Multiplicative Treatment Effect under the Log Transformation: If an additive treatment effect is found for the log-transformed data, i.e.,
y = log(x) + δ,

then, transforming back to the original scale,

e^y = e^(log(x) + δ) = x·e^δ.

Thus, e^δ is the multiplicative effect on the original scale of measurement. To test for a treatment effect, we use the usual t-test on the log-transformed data.
Population Model: Estimating the Ratio of Population Medians: When drawing inferences about population means using a log or natural log transform, there is a problem with transforming back to the original scale of the data: the mean of the log-transformed sample is not the log of the mean of the original sample, so taking the antilog does not give an estimate of the mean on the original scale. However, if the log-transformed data have a symmetric distribution, then mean[log(Y)] = median[log(Y)], and median[log(Y)] = log[median(Y)]. In words, the median, or 50th percentile, of the log-transformed values is the log of the 50th percentile of the original values. So, when we transform back to the original scale, we are drawing inferences about the median. If we denote the averages of the two log-transformed samples by Z̄1 and Z̄2, then the difference of these two quantities estimates the log of the ratio of the medians. That is,
Z̄2 − Z̄1 = log[ median(Y2) / median(Y1) ],

where Y1 and Y2 represent the two samples, and hence the right-hand side is an estimate of the log of the ratio of the two population medians.

Other transformations: There are many transformations one can try. There are rules of thumb, but it usually boils down to trial and error. Some common transformations are the square root, the reciprocal, and the arcsine.
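To make the recipe concrete, here is a minimal Python sketch (the data and variable names are made up for illustration): run the usual t-test on the log scale, then exponentiate the difference in mean logs to estimate the multiplicative effect, i.e., the ratio of population medians, along with a back-transformed confidence interval.

```python
import numpy as np
from scipy import stats

# Hypothetical positive-valued samples on the original scale.
y1 = np.array([1.2, 3.4, 2.2, 5.1, 4.0, 2.8])
y2 = np.array([2.5, 6.8, 4.1, 9.9, 8.2, 5.6])

# Work on the log scale, where the treatment effect is assumed additive.
z1, z2 = np.log(y1), np.log(y2)

# Usual two-sample (pooled) t-test on the log-transformed data.
t_stat, p_val = stats.ttest_ind(z2, z1)

# The difference in mean logs estimates log[median(Y2)/median(Y1)];
# exponentiating gives the estimated ratio of medians (multiplicative effect).
diff = z2.mean() - z1.mean()
ratio = np.exp(diff)

# A 95% CI on the log scale back-transforms to a CI for the ratio.
n1, n2 = len(z1), len(z2)
sp2 = ((n1 - 1) * z1.var(ddof=1) + (n2 - 1) * z2.var(ddof=1)) / (n1 + n2 - 2)
se = np.sqrt(sp2 * (1 / n1 + 1 / n2))
tcrit = stats.t.ppf(0.975, n1 + n2 - 2)
lo, hi = np.exp(diff - tcrit * se), np.exp(diff + tcrit * se)

print(f"t = {t_stat:.3f}, p = {p_val:.4f}")
print(f"estimated ratio of medians = {ratio:.3f}, 95% CI = ({lo:.3f}, {hi:.3f})")
```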
Chapter 4 Alternatives to t-tools
Welch’s t-Test for Comparing Two Normal Populations with Unequal Variances: As mentioned earlier, when the variances are not assumed equal, the standard error for the estimate of the difference between the two population means is

SE_W(ȳ2 − ȳ1) = √( s1²/n1 + s2²/n2 ).

The degrees of freedom are difficult in this case; the exact d.f., and hence the exact distribution, are not known. The best approximation, Satterthwaite’s approximation, is

df_W = [SE_W(ȳ2 − ȳ1)]⁴ / ( [SE(ȳ2)]⁴/(n2 − 1) + [SE(ȳ1)]⁴/(n1 − 1) ),

where SE(ȳi) = si/√ni.
The t statistic and p-value are then calculated in exactly the same way as for the pooled variance test.
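Here is a minimal Python sketch of the whole procedure (the data are hypothetical). For comparison, scipy's built-in stats.ttest_ind(y2, y1, equal_var=False) uses the same Satterthwaite approximation and should agree.

```python
import numpy as np
from scipy import stats

def welch_t(y1, y2):
    """Welch's t-test with Satterthwaite's approximate degrees of freedom."""
    y1, y2 = np.asarray(y1, float), np.asarray(y2, float)
    n1, n2 = len(y1), len(y2)
    se1_sq = y1.var(ddof=1) / n1            # [SE(ybar1)]^2 = s1^2 / n1
    se2_sq = y2.var(ddof=1) / n2            # [SE(ybar2)]^2 = s2^2 / n2
    se_w = np.sqrt(se1_sq + se2_sq)         # SE_W(ybar2 - ybar1)
    df_w = se_w**4 / (se2_sq**2 / (n2 - 1) + se1_sq**2 / (n1 - 1))
    t = (y2.mean() - y1.mean()) / se_w
    p = 2 * stats.t.sf(abs(t), df_w)        # two-sided p-value
    return t, df_w, p

y1 = [10.1, 12.3, 9.8, 11.5, 10.9]          # hypothetical sample 1
y2 = [14.2, 13.1, 16.8, 12.9, 15.4, 14.7]   # hypothetical sample 2
print(welch_t(y1, y2))
```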
Wilcoxon Rank-Sum Test: Say you believe that students who go home to their families for Thanksgiving weekend actually do better on their exams because they need to decompress more than they need to study. Say you took a random sample of 8 students who went home for Thanksgiving and 8 who stayed in Missoula and studied, and then obtained their final exam scores out of a total of 200 possible points. Here are the resulting data:

Went Home    Studied
 137.9134     113.25
 142.4956      95.94
 129.6706      90.04
 115.4934     104.44
 183.4077     119.21
 123.5596     106.88
  94.7618      94.99
 102.0240     131.09

The hypothesis test is then:

H0: µh − µs = 0
versus
Ha: µh − µs > 0
To perform the rank-sum test, we pool the two samples and rank all N = n1 + n2 observations together. Once we have found the ranks, the test statistic W, the sum of the ranks in one of the groups, can be calculated. Under the null hypothesis of no difference, we have

µW = n1(N + 1)/2,   Var(W) = n1·n2·(N + 1)/12.

Our two samples are just large enough (each > 7) that we can perform this test by calculating a z-score:

z = (W − µW) / SD(W).
Value        Group      Rank
 90.0400     Studied     1
 94.7618     Home        2
 94.9900     Studied     3
 95.9400     Studied     4
102.0240     Home        5
104.4400     Studied     6
106.8800     Studied     7
113.2500     Studied     8
115.4934     Home        9
119.2100     Studied    10
123.5596     Home       11
129.6706     Home       12
131.0900     Studied    13
137.9134     Home       14
142.4956     Home       15
183.4077     Home       16
We now sum the ranks for the ‘home’ data. This yields W = 2 + 5 + 9 + 11 + 12 + 14 + 15 + 16 = 84. For this example, µW = 8(17)/2 = 68 and Var(W) = 8·8·17/12 ≈ 90.67, so

z = (84 − 68) / √90.67 ≈ 1.68.

The one-sided p-value is about 0.046, so at the 0.05 level there is (borderline) evidence against the null hypothesis of no difference between the two groups.
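The arithmetic can be checked with a short Python sketch (normal approximation, no continuity correction, matching the calculation above):

```python
import numpy as np
from scipy import stats

home = np.array([137.9134, 142.4956, 129.6706, 115.4934,
                 183.4077, 123.5596, 94.7618, 102.0240])
studied = np.array([113.25, 95.94, 90.04, 104.44,
                    119.21, 106.88, 94.99, 131.09])

# Pool the two samples, rank everything together, then sum the 'home' ranks.
combined = np.concatenate([home, studied])
ranks = stats.rankdata(combined)        # average ranks would handle ties
W = ranks[:len(home)].sum()             # 84

n1, n2 = len(home), len(studied)
N = n1 + n2
mu_W = n1 * (N + 1) / 2                 # 68
var_W = n1 * n2 * (N + 1) / 12          # about 90.67
z = (W - mu_W) / np.sqrt(var_W)
p_one_sided = stats.norm.sf(z)          # Ha: 'home' scores are higher

print(f"W = {W}, z = {z:.3f}, one-sided p = {p_one_sided:.4f}")
```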
Wilcoxon Signed-Rank Test for Paired Data:
To compute the signed-rank test statistic we perform the following calculations:

1) Compute the difference in each of the n pairs.
2) Drop the zeros from the list.
3) Order the absolute differences from smallest to largest and assign ranks.
4) The signed-rank statistic, S, is the sum of the ranks from the pairs for which the difference is positive.
This procedure can be performed in SPSS under Nonparametric Tests → 2 Related Samples → choose ‘Wilcoxon’.
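A minimal Python sketch of steps 1) through 4) (the paired data below are hypothetical):

```python
import numpy as np
from scipy import stats

def signed_rank_S(x, y):
    """Signed-rank statistic S for paired samples x and y."""
    d = np.asarray(x, float) - np.asarray(y, float)  # 1) pairwise differences
    d = d[d != 0]                                    # 2) drop the zeros
    ranks = stats.rankdata(np.abs(d))                # 3) rank the |differences|
    return ranks[d > 0].sum()                        # 4) sum ranks of positive d

before = [140, 133, 127, 118, 152, 141]   # hypothetical paired measurements
after  = [135, 136, 120, 118, 147, 138]
print(signed_rank_S(before, after))
```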
Exact p-value for the Signed-Rank Test: The exact p-value is the proportion of all assignments of outcomes within each pair that lead to a test statistic at least as extreme as the one observed. In the schizophrenia example, we assign “affected” or “not affected” within each of the 15 pairs, regardless of true status, in 2^15 different ways, and then calculate the statistic under each of these assignments. The distribution of the statistics calculated in this way is the exact distribution under the null hypothesis of no treatment effect.

Normal Approximation for the Signed-Rank Test: A normal approximation gives us an expected value and standard deviation:
µS = n(n + 1)/4,   SD(S) = √[ n(n + 1)(2n + 1)/24 ].
We then compare the usual z-statistic, z = (S − µS)/SD(S), to the quantiles of the normal distribution to obtain a p-value in the usual way.
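Both approaches fit in a short Python sketch: exact enumeration over the 2^n sign assignments (feasible for small n) and the normal approximation (the differences below are hypothetical).

```python
import itertools
import numpy as np
from scipy import stats

def signed_rank_pvalues(d):
    """One-sided p-values for the signed-rank statistic: exact and normal."""
    d = np.asarray(d, float)
    d = d[d != 0]                        # drop zero differences
    n = len(d)
    ranks = stats.rankdata(np.abs(d))
    S = ranks[d > 0].sum()

    # Exact: every one of the 2^n ways to assign signs within the pairs.
    null_stats = [ranks[np.array(signs)].sum()
                  for signs in itertools.product([False, True], repeat=n)]
    p_exact = np.mean([s >= S for s in null_stats])

    # Normal approximation.
    mu_S = n * (n + 1) / 4
    sd_S = np.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    p_normal = stats.norm.sf((S - mu_S) / sd_S)
    return S, p_exact, p_normal

d = [5, -3, 7, 5, 3, -2, 6, 4]           # hypothetical paired differences
print(signed_rank_pvalues(d))
```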