Psych 494 Fall 2001 Solution 4

1.

Variable=X1
W:Normal   0.827419   Pr<W   0.0007
[Normal probability plot for X1: vertical axis 2.5 to 27.5, horizontal axis normal quantiles -2 to +2]
Variable=X2
W:Normal   0.924988   Pr<W   0.0697
[Normal probability plot for X2: vertical axis 1 to 17, horizontal axis normal quantiles -2 to +2]
Variable=X3
W:Normal   0.967717   Pr<W   0.6204
[Normal probability plot for X3: vertical axis 3 to 17, horizontal axis normal quantiles -2 to +2]
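The Q-Q plots show that only X1 departs grossly from normality. The Shapiro-Wilk statistics confirm this observation: at the .05 level, normality is rejected for X1 (W = 0.827, p = .0007) but not for X2 (W = 0.925, p = .0697) or X3 (W = 0.968, p = .6204).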
2)
[Chi-square plot: Distance on the vertical axis (0 to 16), ChiSq on the horizontal axis (0 to 10)]
Most of the observations (except for observation 9) lie along the line. This suggests that the assumption of multivariate normality is tenable.
TOPLOT
Obs  ChiSq      Distance
1    1.4236522  0.8807247
2    4.0135936  3.8113869
3    3.0763677  2.8907255
4    3.6648708  3.7549258
5    1.0878828  0.6533776
6    1.9635806  1.143333
7    0.7558302  0.5598385
8    4.4149844  3.8723804
9    9.8374093  15.262769
10   0.1848318  0.0355722
11   2.3659739  1.3338642
12   1.7767839  1.019959
13   0.5843744  0.4707491
14   1.2543524  0.6645781
15   3.3554449  3.7399316
16   2.5857116  2.3258664
17   0.922479   0.6167535
18   2.8213339  2.3763783
19   0.4011734  0.1499929
20   6.2513886  6.8316784
21   7.40688    8.3442448
22   1.5973096  0.9699873
23   4.8904101  3.9779303
24   2.1593473  1.2612803
25   5.4773439  5.0517725
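The count of distances falling below the 80% chi-square quantile can be checked directly from the Distance column of TOPLOT. The following is a minimal Python sketch rather than part of the original SAS program; the names distances, CHI80 and count_below are illustrative, and the cutoff is the CHI80 value printed by the program.

```python
# Squared statistical distances for the 25 observations,
# copied from the Distance column of the TOPLOT printout.
distances = [
    0.8807247, 3.8113869, 2.8907255, 3.7549258, 0.6533776,
    1.143333, 0.5598385, 3.8723804, 15.262769, 0.0355722,
    1.3338642, 1.019959, 0.4707491, 0.6645781, 3.7399316,
    2.3258664, 0.6167535, 2.3763783, 0.1499929, 6.8316784,
    8.3442448, 0.9699873, 3.9779303, 1.2612803, 5.0517725,
]

# 80th percentile of the chi-square distribution with 3 df,
# taken from the CHI80 value printed by the SAS program.
CHI80 = 4.6416277


def count_below(values, cutoff):
    """Return how many values fall strictly below the cutoff."""
    return sum(1 for v in values if v < cutoff)


n = len(distances)
expected = 0.80 * n                       # count expected under normality
observed = count_below(distances, CHI80)  # count actually below the cutoff
print(f"expected about {expected:.0f} of {n}, observed {observed}")
```

The four distances above the cutoff belong to observations 9, 20, 21 and 25.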
CHI80
4.6416277

Around 20 observations (80% of the 25) are expected to fall below the quantile 4.64. The actual count is 21, which is very close to what is expected under normality. Based on the chi-square plot and the probability contour plot, we can say that it is reasonable to assume trivariate normality.

3)
Legend for the plots: A = 1 observation, B = 2 observations, etc.
[Plot of X1*X2: X1 on the vertical axis (0 to 30), X2 on the horizontal axis (0 to 20)]
[Plot of X1*X3: X1 on the vertical axis (0 to 30), X3 on the horizontal axis (0 to 20)]
[Plot of X2*X3: X2 on the vertical axis (0 to 20), X3 on the horizontal axis (0 to 20)]

It can be argued that at least one of the observations (9) is an outlier. This is evident from the scatter plots and from its large statistical distance (15.26). Observation 21 (distance 8.34) is also a likely candidate. Both observations are dropped in the SAS program before the analyses in problem 4, leaving n = 23.
4.
a)
H0: µ' = [10, 10, 10]
H1: µ' ≠ [10, 10, 10]

T_SQ        T_SQCRIT    PVALUE
30.373968   10.224691   0.0004974
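As a cross-check on the printed values, T_SQ can be rescaled to an F statistic with (p, n - p) degrees of freedom, using the same scaling that appears in the pvalue line of the SAS program. This is a minimal Python sketch, assuming n = 23 (the value set in the program after two observations are dropped) and p = 3; the function name t2_to_f is illustrative.

```python
def t2_to_f(t_sq, n, p):
    """Rescale Hotelling's T^2 to an F statistic with (p, n - p) df."""
    return (n - p) * t_sq / ((n - 1) * p)


n, p = 23, 3            # sample size used in the program, number of variables
t_sq = 30.373968        # T_SQ from the printout
t_sq_crit = 10.224691   # T_SQCRIT from the printout (alpha = .05)

f_stat = t2_to_f(t_sq, n, p)       # observed F on (3, 20) df
f_crit = t2_to_f(t_sq_crit, n, p)  # implied upper 5% point of F(3, 20)
reject = t_sq > t_sq_crit          # same decision on either scale

print(f"F = {f_stat:.4f}, critical F = {f_crit:.4f}, reject H0: {reject}")
```

Comparing T_SQ with T_SQCRIT and comparing the rescaled F with its critical value give the same decision, because both are multiplied by the same positive constant.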
T² = 30.37 exceeds the critical value of 10.22, with a p-value of .0005. At an alpha level of .05, we reject the null hypothesis that the means are all equal to 10.

b)
VAL
24.898543
6.757645
4.6820876

VEC
0.458362    0.0873568   0.884462
0.7521083   -0.568348   -0.333637
0.473537    0.8181376   -0.326211
The length of the longest axis is 2·√24.90·√(10.22/23) = 6.65. The next longest axis has a length of 2·√6.76·√(10.22/23) = 3.47, while the shortest axis has a length of 2·√4.68·√(10.22/23) = 2.89. The directions of the axes of the ellipsoid are given by the corresponding eigenvectors.
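The three axis lengths can be reproduced from the printed eigenvalues, the critical value T_SQCRIT = 10.224691, and n = 23. This is a minimal Python sketch of that arithmetic, not part of the original SAS program; the function name axis_lengths is illustrative.

```python
import math


def axis_lengths(eigenvalues, t_sq_crit, n):
    """Full axis lengths of the confidence ellipsoid for the mean vector.

    Each axis has length 2 * sqrt(eigenvalue) * sqrt(t_sq_crit / n).
    """
    scale = math.sqrt(t_sq_crit / n)
    return [2.0 * math.sqrt(lam) * scale for lam in eigenvalues]


val = [24.898543, 6.757645, 4.6820876]   # VAL from the printout
lengths = axis_lengths(val, t_sq_crit=10.224691, n=23)

for lam, length in zip(val, lengths):
    print(f"eigenvalue {lam:10.6f} -> axis length {length:.2f}")
```

Because the scale factor √(10.22/23) is common to all three axes, the ratio of any two axis lengths equals the ratio of the square roots of their eigenvalues.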
c)
Simultaneous confidence intervals for µ1, µ2, µ3 and µ1 + µ2 + µ3:
T-Square Simultaneous Confidence Intervals
SCLM
               Lower       Upper
µ1             9.2549748   13.243286
µ2             4.7246222   10.188421
µ3             8.4131009   12.755595
µ1 + µ2 + µ3   23.647662   34.932338

Bonferroni Simultaneous Confidence Intervals
SCLM
               Lower       Upper
µ1             9.552743    12.945518
µ2             5.1325507   9.7804928
µ3             8.7373125   12.431383
µ1 + µ2 + µ3   24.490179   34.089821
The four intervals based on the Bonferroni method are shorter than those based on T². In most situations, for the same confidence level, Bonferroni intervals are narrower and therefore more precise. Although one may be inclined to always use the Bonferroni method, some situations dictate that simultaneous confidence intervals be based on T². When no particular contrast is of interest before the study is carried out, the T² method is ideal, since it maintains the same confidence level even if data snooping is involved. When the number of contrasts is relatively large, one may also be better off with the T² method, regardless of when the contrasts are formulated.
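The comparison of interval lengths can be made concrete from the two SCLM printouts. The following minimal Python sketch computes each interval's width and the Bonferroni-to-T² width ratio; the variable names are illustrative and the limits are copied from the printed output.

```python
labels = ["mu1", "mu2", "mu3", "mu1+mu2+mu3"]

# (lower, upper) limits copied from the two SCLM printouts.
t2_intervals = [
    (9.2549748, 13.243286),
    (4.7246222, 10.188421),
    (8.4131009, 12.755595),
    (23.647662, 34.932338),
]
bonf_intervals = [
    (9.552743, 12.945518),
    (5.1325507, 9.7804928),
    (8.7373125, 12.431383),
    (24.490179, 34.089821),
]


def width(interval):
    """Length of a (lower, upper) interval."""
    lower, upper = interval
    return upper - lower


ratios = [width(b) / width(t) for b, t in zip(bonf_intervals, t2_intervals)]

for name, t, b, r in zip(labels, t2_intervals, bonf_intervals, ratios):
    print(f"{name:12s} T2 width {width(t):7.4f}  "
          f"Bonferroni width {width(b):7.4f}  ratio {r:.4f}")
```

The ratio is the same for all four intervals because both methods use the same center and standard error for each linear combination and differ only in the critical multiplier.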
SAS Program for Homework 4:

options ls=78;

data milk;
  infile 'milk.dat';
  input x1-x3;
run;

* Univariate Test For Normality;
proc univariate normal plot;
run;

* Multivariate Normality and Chi-square Plot;
proc iml;
  START distance(X);
    n=nrow(x);
    p=ncol(x);
    one=J(N,1);
    xbar=t(X[:,]);
    I=I(n);
    SSCP=X`*(I-one*t(one)*1/n)*X;
    s=SSCP/(N-1);
    d=j(nrow(x),1);
    means=one*t(xbar);
    * Squared statistical distance of each observation from the mean;
    do i=1 to nrow(x);
      d[i]=(x[i,]-means[i,])*inv(s)*t(x[i,]-means[i,]);
    end;
    y=ranktie(d);
    chisq=cinv((y-.5)/n,p);
    toplot=chisq||d;
    call pgraf(toplot,'*','ChiSq','Distance','Chi-square Plot');
    print toplot;
  finish;

  use milk;
  read all var _num_ into X;
  run distance(x);
  chi80=cinv(.8,3);
  print chi80;
quit;

* Scatter plots of each pair of variables;
proc plot hpercent=50 vpercent=50;
  plot x1*x2;
  plot x1*x3;
  plot x2*x3;
run;

* Drop observations 9 and 21;
data milk1;
  set milk;
  if _n_ ne 9 and _n_ ne 21;
run;

proc iml;
  use milk1;
  read all var{x1 x2 x3} into x;

  * Mean vector and covariance matrix;
  START stat(X,Xbar,S);
    N=nrow(X);
    one=J(N,1);
    Xbar=t(X[:,]);
    I=I(n);
    SSCP=X`*(I-one*t(one)*1/n)*X;
    S=SSCP/(N-1);
  FINISH stat;

  * Hotelling T-square test of H0: mean vector = mu;
  START test(x,xbar,s,mu,alpha);
    n=nrow(x);
    p=ncol(x);
    t_sq=n*(xbar-mu)`*inv(s)*(xbar-mu);
    t_sqcrit=(n-1)*p/(n-p)*finv(1-alpha,p,n-p);
    pvalue=1-probf((n-p)*t_sq/((n-1)*p),p,n-p);
    print t_sq t_sqcrit pvalue;
  FINISH test;

  mu={10,10,10};
  alpha=.05;
  run stat(x,xbar,s);
  run test(x,xbar,s,mu,alpha);
  call eigen(val,vec,s);
  print val, vec;

  * T-square simultaneous intervals for the rows of a;
  start sclm_t2(xbar,s,n,a,alpha);
    p=nrow(s);
    i=i(p);
    nclm=nrow(a);
    sclm=j(nclm,2);
    crit=p*(n-1)*finv(1-alpha,p,n-p)/(n*(n-p));
    do i=1 to nclm;
      me=sqrt(crit*a[i,]*s*t(a[i,]));
      sclm[i,1]=a[i,]*xbar-me;
      sclm[i,2]=a[i,]*xbar+me;
    end;
    print , 'T-Square Simultaneous Confidence Intervals', sclm;
  finish sclm_t2;

  a={1 0 0, 0 1 0, 0 0 1, 1 1 1};
  n=23;
  alpha=.05;
  run stat(x,xbar,s);
  run sclm_t2(xbar,s,n,a,alpha);

  * Bonferroni simultaneous intervals for the rows of a;
  start sclm_b(xbar,s,n,a,alpha);
    p=nrow(s);
    i=i(p);
    nclm=nrow(a);
    sclm=j(nclm,2);
    crit=tinv(1-alpha/(2*nclm),n-1);
    do i=1 to nclm;
      me=crit*sqrt(a[i,]*s*t(a[i,])/n);
      sclm[i,1]=a[i,]*xbar-me;
      sclm[i,2]=a[i,]*xbar+me;
    end;
    print , 'Bonferroni Simultaneous Confidence Intervals', sclm;
  finish sclm_b;

  run sclm_b(xbar,s,n,a,alpha);
quit;