Assignment 3
Fundamentals of Epidemiological Analysis II (3009373)
Diego Alejandro Muñoz Gaviria, Catalina Otálvaro Ramírez
March 23, 2019
10.5 Swan (1986) gives the following data from a study of infant respiratory disease. Each cell of the table shows the number out of so many observed children who developed bronchitis or pneumonia in their first year of life, classified by sex and type of feeding (with the risk in parentheses).

Type of feeding       Boys            Girls
Bottle only           77/458 (0.17)   48/384 (0.13)
Breast + supplement   19/147 (0.13)   16/127 (0.13)
Breast only           47/494 (0.10)   31/464 (0.07)
The major question of interest is whether the risk of illness is affected by the type of feeding. Also, is the risk the same for both sexes and, if there are differences between the feeding groups, are they the same for boys and girls?

(i) Fit all possible linear logistic regression models to the data. Use your results to answer all the preceding questions through significance testing. Summarize your findings using odds ratios with 95 % confidence intervals.

Three different models can be fitted to look for such differences: one that includes only type of feeding, one that includes only sex, and one that includes both covariates.

Model fitted to type of feeding, where feeding 1 = Bottle only, feeding 2 = Breast + supplement and feeding 3 = Breast only.

The LOGISTIC Procedure

Model Information
Data Set                      ADE.SIRS
Response Variable (Events)    illness
Response Variable (Trials)    total
Model                         binary logit
Optimization Technique        Fisher's scoring

Number of Observations Read   6
Number of Observations Used   6
Sum of Frequencies Read       2074
Sum of Frequencies Used       2074

Response Profile
Ordered Value   Binary Outcome   Total Frequency
1               Event            238
2               Nonevent         1836

Class Level Information
Class     Value   Design Variables
feeding   1        1    0
          2        0    1
          3       -1   -1

Model Convergence Status
Convergence criterion (GCONV=1E-8) satisfied.
Model Fit Statistics
                             Intercept and Covariates
Criterion   Intercept Only   Log Likelihood   Full Log Likelihood
AIC         1480.102         1463.426         43.217
SC          1485.739         1480.338         60.129
-2 Log L    1478.102         1457.426         37.217

Testing Global Null Hypothesis: BETA=0
Test               Chi-Square   DF   Pr > ChiSq
Likelihood Ratio   20.6763      2    <.0001
Score              20.3480      2    <.0001
Wald               19.8447      2    <.0001

Type 3 Analysis of Effects
Effect    DF   Wald Chi-Square   Pr > ChiSq
feeding   2    19.8447           <.0001

Analysis of Maximum Likelihood Estimates
Parameter   DF   Estimate   Standard Error   Wald Chi-Square   Pr > ChiSq
Intercept   1    -2.0304    0.0790           661.2600          <.0001
feeding 1   1     0.2836    0.0968           8.5877            0.0034
feeding 2   1     0.1092    0.1310           0.6958            0.4042

Odds Ratio Estimates
Effect           Point Estimate   95% Wald Confidence Limits
feeding 1 vs 3   1.967            1.458   2.654
feeding 2 vs 3   1.652            1.082   2.524

Association of Predicted Probabilities and Observed Responses
Percent Concordant   39.1     Somers' D   0.163
Percent Discordant   22.8     Gamma       0.263
Percent Tied         38.1     Tau-a       0.033
Pairs                436968   c           0.581
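The odds ratios in this output are not simply the exponentiated coefficients, because the Class Level Information table shows effect coding (the reference level, feeding 3, is coded -1 on both design variables). The following minimal Python sketch recomputes them under that assumption and cross-checks them against the raw counts pooled over sex. The variable names are illustrative, and the estimates are the rounded values from the listing.

```python
import math

# Effect-coded estimates copied (rounded) from the feeding-only SAS fit.
b_feeding1 = 0.2836  # Bottle only
b_feeding2 = 0.1092  # Breast + supplement
# Under effect coding the reference level takes minus the sum of the others.
b_feeding3 = -(b_feeding1 + b_feeding2)  # Breast only

# Odds ratio between two levels = exp(difference of their effects).
or_1_vs_3 = math.exp(b_feeding1 - b_feeding3)
or_2_vs_3 = math.exp(b_feeding2 - b_feeding3)


def odds(ill, total):
    """Odds of illness given the number ill and the number observed."""
    return ill / (total - ill)


# Cross-check with the raw counts, boys and girls pooled.
bottle = odds(77 + 48, 458 + 384)
mixed = odds(19 + 16, 147 + 127)
breast = odds(47 + 31, 494 + 464)
crude_1_vs_3 = bottle / breast
crude_2_vs_3 = mixed / breast

print(round(or_1_vs_3, 3), round(crude_1_vs_3, 3))
print(round(or_2_vs_3, 3), round(crude_2_vs_3, 3))
```

Because the feeding-only model is saturated for the three pooled feeding groups, the model-based and crude odds ratios should agree up to rounding of the coefficients.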
In this fitted model the estimated odds ratio is Ψ̂_{feeding,13} = 1.967, taking level 3 (Breast only) as the reference level. That is, the odds of respiratory illness for infants fed Bottle only are approximately 1.97 times the odds for infants fed Breast only. The difference is statistically significant, since the 95 % confidence interval (1.458, 2.654) does not contain 1.
With Ψ̂_{feeding,23} = 1.652, the odds of respiratory illness for infants fed Breast + supplement are about 1.65 times the odds for infants fed Breast only. This difference is statistically significant as well, since the 95 % confidence interval (1.082, 2.524) does not contain 1.

Model fitted to sex, where sex 1 = Boys and sex 2 = Girls.

Model Information
Data Set                      ADE.SIRS
Response Variable (Events)    illness
Response Variable (Trials)    total
Model                         binary logit
Optimization Technique        Fisher's scoring

Number of Observations Read   6
Number of Observations Used   6
Sum of Frequencies Read       2074
Sum of Frequencies Used       2074

Response Profile
Ordered Value   Binary Outcome   Total Frequency
1               Event            238
2               Nonevent         1836

Class Level Information
Class   Value   Design Variables
sex     1        1
        2       -1

Model Convergence Status
Convergence criterion (GCONV=1E-8) satisfied.
Model Fit Statistics
                             Intercept and Covariates
Criterion   Intercept Only   Log Likelihood   Full Log Likelihood
AIC         1480.102         1476.626         56.417
SC          1485.739         1487.900         67.692
-2 Log L    1478.102         1472.626         52.417

Testing Global Null Hypothesis: BETA=0
Test               Chi-Square   DF   Pr > ChiSq
Likelihood Ratio   5.4761       1    0.0193
Score              5.4324       1    0.0198
Wald               5.3975       1    0.0202

Type 3 Analysis of Effects
Effect   DF   Wald Chi-Square   Pr > ChiSq
sex      1    5.3975            0.0202

Analysis of Maximum Likelihood Estimates
Parameter   DF   Estimate   Standard Error   Wald Chi-Square   Pr > ChiSq
Intercept   1    -2.0629    0.0702           864.0594          <.0001
sex 1       1     0.1630    0.0702           5.3975            0.0202

Odds Ratio Estimates
Effect       Point Estimate   95% Wald Confidence Limits
sex 1 vs 2   1.386            1.052   1.824

Association of Predicted Probabilities and Observed Responses
Percent Concordant   28.8     Somers' D   0.080
Percent Discordant   20.8     Gamma       0.162
Percent Tied         50.4     Tau-a       0.016
Pairs                436968   c           0.540
Fitting this model gives an estimated odds ratio Ψ̂_{Boys,Girls} = 1.386, taking Girls as the reference level: the odds of respiratory illness for boys are 1.386 times the odds for girls. The difference is statistically significant, since the 95 % confidence interval (1.052, 1.824) does not contain 1.
Model fitted to type of feeding and sex.

Number of Observations Read   6
Number of Observations Used   6
Sum of Frequencies Read       2074
Sum of Frequencies Used       2074

Response Profile
Ordered Value   Binary Outcome   Total Frequency
1               Event            238
2               Nonevent         1836

Class Level Information
Class     Value   Design Variables
sex       1        1
          2       -1
feeding   1        1    0
          2        0    1
          3       -1   -1

Model Convergence Status
Convergence criterion (GCONV=1E-8) satisfied.
Model Fit Statistics
                             Intercept and Covariates
Criterion   Intercept Only   Log Likelihood   Full Log Likelihood
AIC         1480.102         1460.449         40.240
SC          1485.739         1482.998         62.789
-2 Log L    1478.102         1452.449         32.240

Testing Global Null Hypothesis: BETA=0
Test               Chi-Square   DF   Pr > ChiSq
Likelihood Ratio   25.6534      3    <.0001
Score              25.2344      3    <.0001
Wald               24.6148      3    <.0001

Type 3 Analysis of Effects
Effect    DF   Wald Chi-Square   Pr > ChiSq
sex       1    4.9109            0.0267
feeding   2    19.3786           <.0001

Analysis of Maximum Likelihood Estimates
Parameter   DF   Estimate   Standard Error   Wald Chi-Square   Pr > ChiSq
Intercept   1    -2.0496    0.0801           654.9727          <.0001
sex 1       1     0.1563    0.0705           4.9109            0.0267
feeding 1   1     0.2806    0.0969           8.3844            0.0038
feeding 2   1     0.1081    0.1311           0.6791            0.4099

Odds Ratio Estimates
Effect           Point Estimate   95% Wald Confidence Limits
sex 1 vs 2       1.367            1.037   1.802
feeding 1 vs 3   1.953            1.447   2.636
feeding 2 vs 3   1.643            1.075   2.512

Association of Predicted Probabilities and Observed Responses
Percent Concordant   50.2     Somers' D   0.196
Percent Discordant   30.6     Gamma       0.243
Percent Tied         19.2     Tau-a       0.040
Pairs                436968   c           0.598
This model, with no interaction between sex and feeding, gives the estimated effect of each variable adjusted for the other. The corresponding odds ratios and 95 % confidence intervals are:

Comparison        Odds ratio   95 % CI
Boys : Girls      1.37         (1.04, 1.80)
Bottle : Breast   1.95         (1.45, 2.64)
Mixed : Breast    1.64         (1.08, 2.51)
(ii) Fit the model with explanatory variables sex and type of feeding (but no interaction). Calculate the residuals, deviance residuals and standardised deviance residuals and comment on the results.

After fitting the model with PROC LOGISTIC, the residuals were obtained from PROC GENMOD using the OUTPUT statement, which produces the following table:

Sex    Feed     Resid     Dev Resid   St Dev Resid
Boy    Bottle    0.8742    0.1096      0.2462
Boy    Mixed    -2.1175   -0.5052     -0.8579
Boy    Breast    1.2433    0.1922      0.3670
Girl   Bottle   -0.8742   -0.1342     -0.2473
Girl   Mixed     2.1175    0.5896      0.8279
Girl   Breast   -1.2433   -0.2284     -0.3707
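As a check on the table above, the raw and deviance residuals can be recomputed by hand from the no-interaction fit. The sketch below is an illustration, not the authors' code: it plugs in the rounded effect-coded estimates from the SAS listing (so results match only to about two decimals) and uses the standard binomial deviance residual formula. Standardised deviance residuals also need the leverages, which are not reproduced here.

```python
import math

# Observed counts from the exercise table: (sex, feeding) -> (ill, total).
observed = {
    ("Boy", "Bottle"): (77, 458),
    ("Boy", "Mixed"): (19, 147),
    ("Boy", "Breast"): (47, 494),
    ("Girl", "Bottle"): (48, 384),
    ("Girl", "Mixed"): (16, 127),
    ("Girl", "Breast"): (31, 464),
}

# Effect-coded estimates copied (rounded) from the no-interaction SAS fit.
intercept = -2.0496
sex_effect = {"Boy": 0.1563, "Girl": -0.1563}
feed_effect = {"Bottle": 0.2806, "Mixed": 0.1081}
feed_effect["Breast"] = -(feed_effect["Bottle"] + feed_effect["Mixed"])


def cell_residuals(ill, total, eta):
    """Return (raw, deviance) residuals for one binomial cell."""
    p = 1.0 / (1.0 + math.exp(-eta))
    fitted = total * p
    raw = ill - fitted
    dev_sq = 2.0 * (
        ill * math.log(ill / fitted)
        + (total - ill) * math.log((total - ill) / (total - fitted))
    )
    # The deviance residual carries the sign of the raw residual.
    deviance = math.copysign(math.sqrt(max(dev_sq, 0.0)), raw)
    return raw, deviance


residuals = {}
for (sex, feed), (ill, total) in observed.items():
    eta = intercept + sex_effect[sex] + feed_effect[feed]
    residuals[(sex, feed)] = cell_residuals(ill, total, eta)

for key, (raw, dev) in residuals.items():
    print(key, round(raw, 2), round(dev, 2))
```

All six deviance residuals are well below 2 in absolute value, which is consistent with the no-interaction model fitting these six cells adequately.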
SAS program

/* Study of Infant Respiratory Disease */
data ADE.SIRS;
  input illness total sex$ feeding$;
  cards;
77 458 1 1
19 147 1 2
47 494 1 3
48 384 2 1
16 127 2 2
31 464 2 3
;
run;

/* Model 1: type of feeding only */
proc logistic data=ADE.SIRS;
  class feeding;
  model illness/total = feeding;
run;

/* Model 2: sex only */
proc logistic data=ADE.SIRS;
  class sex;
  model illness/total = sex;
run;

/* Model 3: sex and type of feeding, no interaction */
proc logistic data=ADE.SIRS;
  class sex feeding;
  model illness/total = sex feeding;
run;

/* Residuals for model 3; each OUTPUT keyword needs a variable name */
proc genmod data=ADE.SIRS;
  class sex feeding;
  model illness/total = sex feeding / dist=binomial link=logit;
  output out=res resraw=resid resdev=dev_res stdresdev=st_dev_res;
run;
10.10 Repeat the analysis of Exercise 6.1, the unmatched case–control study of oral contraceptive use and breast cancer, using logistic regression modelling. Compare results.

6.1 In a case–control study of the use of oral contraceptives (OCs) and breast cancer in New Zealand, Paul et al. (1986) identified cases over a 2-year period from the National Cancer Registry and controls by random selection from electoral rolls. The following data were compiled.

Used OCs?   Cases   Controls
Yes         310     708
No          123     189
Total       433     897
(i) Estimate the odds ratio for breast cancer, OC users versus nonusers. Specify a 95 % confidence interval for the true odds ratio.

After fitting the logistic regression model [model cases/total = UsedOC] in SAS, the estimated effect of UsedOC is −0.3963, which corresponds to an odds ratio Ψ̂ = e^{−0.3963} = 0.673 with 95 % confidence interval (0.517, 0.876). The negative sign of the estimate, the value Ψ̂ < 1 and the fact that the interval excludes 1 (so the effect is statistically significant) indicate that Yes in UsedOC lowers the odds of breast cancer by about 32.7 % (1 − 0.673 = 0.327). In these data UsedOC therefore appears as a protective factor against breast cancer. Equivalently, women with No in UsedOC have odds of breast cancer 1/0.673 = 1.49 times those of women with Yes in UsedOC.

(ii) Test whether OC use appears to be associated with breast cancer.

The Wald test of association gives a chi-square statistic of 8.6978 with p-value = 0.0032. At the 5 % significance level the null hypothesis is rejected, so there is a statistical association between UsedOC and breast cancer.

These results are similar to those obtained by hand for Exercise 6.1. The practical advantage of logistic regression is that all the estimates and tests are produced quickly by a single procedure.

The SAS program and output for the results above follow:

/* Unmatched case-control study: oral contraceptive (OC) use */
data ADE.OC;
  input UsedOC cases controls total;
  cards;
1 310 708 1018
0 123 189 312
;
run;

proc logistic data=ADE.OC;
  model cases/total = UsedOC;
run;
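With a single binary exposure, the logistic regression estimate coincides with the usual cross-product odds ratio from the 2 × 2 table, and the Wald interval follows from the standard error of the log odds ratio. The short Python sketch below (variable names are illustrative) reproduces the reported figures directly from the counts in Exercise 6.1, using 1.96 as the normal quantile.

```python
import math

# Counts from the 2 x 2 table of Exercise 6.1.
cases_yes, cases_no = 310, 123
controls_yes, controls_no = 708, 189

# Cross-product odds ratio, OC users versus nonusers.
odds_ratio = (cases_yes * controls_no) / (cases_no * controls_yes)
log_or = math.log(odds_ratio)

# Standard error of the log odds ratio (Woolf's formula).
se_log_or = math.sqrt(
    1 / cases_yes + 1 / cases_no + 1 / controls_yes + 1 / controls_no
)

# 95 % Wald confidence limits on the odds-ratio scale.
z = 1.96
ci_low = math.exp(log_or - z * se_log_or)
ci_high = math.exp(log_or + z * se_log_or)

# Wald chi-square statistic for H0: log odds ratio = 0.
wald_chi2 = (log_or / se_log_or) ** 2

print(round(odds_ratio, 3), round(ci_low, 3), round(ci_high, 3))
print(round(log_or, 4), round(se_log_or, 4), round(wald_chi2, 2))
```

The agreement with the manual analysis is exact by construction here, since a model with one binary covariate is saturated for a 2 × 2 table.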
The LOGISTIC Procedure

Model Information
Data Set                      ADE.OC
Response Variable (Events)    cases
Response Variable (Trials)    total
Model                         binary logit
Optimization Technique        Fisher's scoring

Number of Observations Read   2
Number of Observations Used   2
Sum of Frequencies Read       1330
Sum of Frequencies Used       1330

Response Profile
Ordered Value   Binary Outcome   Total Frequency
1               Event            433
2               Nonevent         897

Model Convergence Status
Convergence criterion (GCONV=1E-8) satisfied.

Model Fit Statistics
                             Intercept and Covariates
Criterion   Intercept Only   Log Likelihood   Full Log Likelihood
AIC         1680.440         1673.872         17.362
SC          1685.633         1684.258         27.748
-2 Log L    1678.440         1669.872         13.362

Testing Global Null Hypothesis: BETA=0
Test               Chi-Square   DF   Pr > ChiSq
Likelihood Ratio   8.5675       1    0.0034
Score              8.7534       1    0.0031
Wald               8.6978       1    0.0032

Analysis of Maximum Likelihood Estimates
Parameter   DF   Estimate   Standard Error   Wald Chi-Square   Pr > ChiSq
Intercept   1    -0.4295    0.1158           13.7476           0.0002
UsedOC      1    -0.3963    0.1344           8.6978            0.0032

Odds Ratio Estimates
Effect   Point Estimate   95% Wald Confidence Limits
UsedOC   0.673            0.517   0.876

Association of Predicted Probabilities and Observed Responses
Percent Concordant   22.4     Somers' D   0.073
Percent Discordant   15.1     Gamma       0.196
Percent Tied         62.5     Tau-a       0.032
Pairs                388401   c           0.537
10.13 Refer to the venous thromboembolism matched case–control study of Exercise 6.12.

In a matched case–control study of venous thromboembolism (VTE) and use of hormone replacement therapy (HRT), Daly et al. (1996) screened women aged 45–64 years admitted to hospitals in the Oxford Regional Health Authority (UK) with a suspected diagnosis of VTE. From these, 103 cases of idiopathic VTE were recruited. Each case was individually matched with up to two hospital controls with diagnoses judged to be unrelated to HRT use, such as diseases of the eyes, ears or skin. Matching criteria were 5-year age group, district of admission and date of admission (between 2 weeks before and 4 months after the admission date of the corresponding case). Altogether there were 178 controls. The data are available from the web site for this book (Appendix A). Confirm the following summary table:

                                  Number of controls using HRT
Matching ratio   Case uses HRT?   0    1    2
1:1              yes              7    3
                 no               17   1
1:2              yes              15   15   4
                 no               27   11   3

(i) Use logistic regression to repeat the analysis of Exercise 6.12. Compare results.

Using this summary table,
a. Test for no association between hormone replacement therapy (HRT) use and venous thromboembolism.
b. Estimate the odds ratio, and find the associated 95 % confidence interval, for HRT users versus nonusers.
Repeating the analysis of Exercise 6.12 with conditional logistic regression, with HRT as the only explanatory variable, gives the following results.

The estimated effect of HRT is β̂1 = 1.0957. The positive sign of β̂1 and the estimated odds ratio Ψ̂ = 2.991 > 1 indicate that, with HRT = 1 denoting use, the odds of venous thromboembolism for HRT users are about 3 times the odds for nonusers. HRT use is therefore a risk factor for VTE.

The 95 % confidence interval for Ψ, comparing women aged 45–64 years who use HRT with those who do not, is (1.607, 5.568). Since the interval does not include 1, the data support the conclusion that HRT use increases the odds of VTE; they are compatible with an increase of between about 1.6 and 5.6 times.

The test of association between HRT and VTE (significance of the factor HRT) gives a Wald statistic of 11.9516 with p-value = 0.0005. The null hypothesis is rejected at the 5 % significance level, so there is a statistical association between HRT and VTE.
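The conditional estimate can be recovered from the summary table alone, because each matched set contributes a term that depends only on whether the case uses HRT and on how many of its controls do. The sketch below is an illustrative reconstruction under that standard conditional likelihood, not the SAS algorithm: it maximises the log-likelihood by bisection on a numerical derivative. The tuple layout and function names are assumptions made for this example.

```python
import math

# Summary table as (controls per case, case uses HRT, controls using HRT, sets).
matched_sets = [
    (1, 1, 0, 7), (1, 1, 1, 3), (1, 0, 0, 17), (1, 0, 1, 1),
    (2, 1, 0, 15), (2, 1, 1, 15), (2, 1, 2, 4),
    (2, 0, 0, 27), (2, 0, 1, 11), (2, 0, 2, 3),
]


def log_lik(beta):
    """Conditional log-likelihood for 1:M matched sets, one binary exposure."""
    psi = math.exp(beta)
    total = 0.0
    for m, case_exposed, k, count in matched_sets:
        numerator = psi if case_exposed else 1.0
        # Denominator sums over every member of the set being the case.
        denominator = numerator + k * psi + (m - k)
        total += count * math.log(numerator / denominator)
    return total


def score(beta, h=1e-6):
    """Central-difference approximation to the derivative of log_lik."""
    return (log_lik(beta + h) - log_lik(beta - h)) / (2 * h)


# The log-likelihood is concave, so the score has a single root; bisect on it.
low, high = 0.0, 3.0
for _ in range(60):
    mid = (low + high) / 2
    if score(mid) > 0:
        low = mid
    else:
        high = mid

beta_hat = (low + high) / 2
psi_hat = math.exp(beta_hat)
minus2ll_null = -2 * log_lik(0.0)
minus2ll_fit = -2 * log_lik(beta_hat)

print(round(beta_hat, 4), round(psi_hat, 3))
print(round(minus2ll_null, 3), round(minus2ll_fit, 3))
```

The difference between the two −2 log-likelihood values is the likelihood ratio chi-square, which gives a further check against the SAS listing.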
The LOGISTIC Procedure
Conditional Analysis

Model Information
Data Set                              ADE.VTE
Response Variable                     CC
Number of Response Levels             2
Number of Strata                      103
Model                                 binary logit
Optimization Technique                Newton-Raphson ridge

Number of Observations Read           281
Number of Observations Used           281
Number of Observations Informative    281

Response Profile
Ordered Value   CC   Total Frequency
1               0    178
2               1    103
Probability modeled is CC=1.

Strata Summary
Response Pattern   CC=0   CC=1   Number of Strata   Frequency
1                  1      1      28                 56
2                  2      1      75                 225

Newton-Raphson Ridge Optimization
Without Parameter Scaling
Convergence criterion (GCONV=1E-8) satisfied.

Model Fit Statistics
Criterion   Without Covariates   With Covariates
AIC         203.608              192.467
SC          203.608              196.105
-2 Log L    203.608              190.467

Testing Global Null Hypothesis: BETA=0
Test               Chi-Square   DF   Pr > ChiSq
Likelihood Ratio   13.1414      1    0.0003
Score              12.9151      1    0.0003
Wald               11.9516      1    0.0005

Analysis of Conditional Maximum Likelihood Estimates
Parameter   DF   Estimate   Standard Error   Wald Chi-Square   Pr > ChiSq
HRT         1    1.0957     0.3170           11.9516           0.0005

Odds Ratio Estimates
Effect   Point Estimate   95% Wald Confidence Limits
HRT      2.991            1.607   5.568
This is the SAS output from which the results above were taken. They are very similar to those obtained in Exercise 6.12 (Ψ = 3.00, 95 % CI (1.61, 5.59)), where it was likewise concluded that HRT and VTE are associated. Both methods therefore lead to the same conclusions, with logistic regression giving more detailed output.

(ii) The associated dataset also includes data on body mass index (BMI), a potential confounding factor in the relationship between HRT and venous thromboembolism. Test for a significant effect of HRT on venous thromboembolism, adjusting for BMI. Estimate the odds ratio for HRT users versus nonusers, adjusting for BMI. Does BMI appear to have a strong confounding effect?

Model Information
Data Set                              ADE.VTE
Response Variable                     CC
Number of Response Levels             2
Number of Strata                      103
Number of Uninformative Strata        3
Frequency Uninformative               4
Model                                 binary logit
Optimization Technique                Newton-Raphson ridge

Number of Observations Read           281
Number of Observations Used           278
Number of Observations Informative    274

Response Profile
Ordered Value   CC   Total Frequency
1               0    176
2               1    102
Probability modeled is CC=1.

Note: 3 observations were deleted due to missing values for the response, explanatory, or strata variables.

Strata Summary
Response Pattern   CC=0   CC=1   Number of Strata   Frequency
1                  0      1      2                  2
2                  1      1      26                 52
3                  2      0      1                  2
4                  2      1      74                 222

Newton-Raphson Ridge Optimization
Without Parameter Scaling
Convergence criterion (GCONV=1E-8) satisfied.

Model Fit Statistics
Criterion   Without Covariates   With Covariates
AIC         198.638              185.741
SC          198.638              192.997
-2 Log L    198.638              181.741

Testing Global Null Hypothesis: BETA=0
Test               Chi-Square   DF   Pr > ChiSq
Likelihood Ratio   16.8968      2    0.0002
Score              16.3724      2    0.0003
Wald               14.8877      2    0.0006

Analysis of Conditional Maximum Likelihood Estimates
Parameter   DF   Estimate   Standard Error   Wald Chi-Square   Pr > ChiSq
HRT         1    1.1115     0.3252           11.6862           0.0006
BMI         1    0.0543     0.0257           4.4637            0.0346

Odds Ratio Estimates
Effect   Point Estimate   95% Wald Confidence Limits
HRT      3.039            1.607   5.748
BMI      1.056            1.004   1.110
According to these results, the odds ratio for HRT adjusted for BMI is Ψ̂ = 3.039, 95 % CI (1.607, 5.748), and the effect of HRT remains statistically significant (p = 0.0006). BMI does not appear to confound the relationship between HRT and VTE: the change from the unadjusted estimate (Ψ̂ = 2.991) is minimal. BMI itself has a small but statistically significant effect (Ψ̂ = 1.056 per unit of BMI, 95 % CI (1.004, 1.110), p = 0.0346), but including it leaves the odds ratio for HRT essentially unchanged.

SAS program

/* Matched case-control study of venous thromboembolism (VTE) */
data ADE.VTE;
  input ID CC HRT BMI;
  cards;
1 1 0 32.74
1 0 0 22.00
1 0 0 29.83
. . .
135 1 0 29.16
135 0 0 24.46
;
run;

/* Conditional logistic regression, HRT only */
proc logistic data=ADE.VTE;
  strata ID;
  model CC(event='1') = HRT;
  output;
run;

/* Conditional logistic regression, HRT adjusted for BMI */
proc logistic data=ADE.VTE;
  strata ID;
  model CC(event='1') = HRT BMI;
run;
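One common way to quantify confounding is the relative change in the exposure odds ratio after adjustment. The sketch below applies it to the two HRT estimates reported in the SAS listings. The 10 % threshold is a conventional rule of thumb assumed here for illustration, not something stated in the exercise, and the two fits are not on identical samples (281 versus 278 observations), so the comparison is approximate.

```python
import math

# Conditional log odds ratios for HRT copied from the two SAS fits.
beta_unadjusted = 1.0957  # HRT only
beta_adjusted = 1.1115    # HRT adjusted for BMI

or_unadjusted = math.exp(beta_unadjusted)
or_adjusted = math.exp(beta_adjusted)

# Relative change in the odds ratio caused by adjusting for BMI.
pct_change = 100 * (or_adjusted - or_unadjusted) / or_unadjusted

# Assumed rule of thumb: a change above about 10 % suggests confounding.
THRESHOLD_PCT = 10.0
bmi_confounds = abs(pct_change) > THRESHOLD_PCT

print(round(or_unadjusted, 3), round(or_adjusted, 3))
print(round(pct_change, 1), bmi_confounds)
```

A change of this size is far below the assumed threshold, in line with the conclusion that BMI is not a strong confounder here.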
10.6 Saetta et al. (1991) carried out a prospective, single-blind experiment to determine whether gastric content is forced into the small bowel when gastric-emptying procedures are employed with people who have poisoned themselves. Each of 60 subjects was asked to swallow 20 barium-impregnated polythene pellets. Of the 60, 20 received a gastric lavage, 20 received induced emesis and 20 (controls) received no gastric decontamination. The number of residual pellets, counted by x-ray, in the intestine after ingestion for each subject was,

for the induced emesis group: 0, 15, 2, 0, 0, 15, 1, 16, 0, 1, 1, 0, 6, 0, 0, 1, 0, 16, 7, 11
for the gastric lavage group: 9, 3, 4, 15, 3, 5, 0, 0, 2, 11, 0, 0, 0, 0, 7, 5, 9, 0, 0, 0
and for the control group: 0, 9, 0, 0, 4, 5, 0, 0, 13, 0, 0, 12, 0, 0, 1, 0, 4, 4, 6, 7

Using these data:

(i) Implement ANOVA

As a first step, a normality analysis was carried out, which shows that the data are not normally distributed: the Shapiro-Wilk test, with statistic W = 0.7693 and p-value < 0.0001, rejects the null hypothesis of normality. An ANOVA is therefore not strictly appropriate, since its assumptions are not met. Nevertheless, an ANOVA was run as an exploratory step to see what is going on inside the structure of these data. The ANOVA (with normality diagnostics) gives F = 0.37 with p-value = 0.6909, so the null hypothesis of equal means across the three levels of the variable gastric is not rejected. Levene's test (F = 2.37, p-value = 0.1024) does not reject homogeneity of variance.
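The one-way ANOVA F statistic quoted above can be reproduced from the raw pellet counts with the usual between-group and within-group sums of squares. The following is a minimal standard-library sketch for checking the arithmetic; the dictionary and helper names are illustrative.

```python
# Residual pellet counts by group, as listed in the exercise.
groups = {
    "emesis": [0, 15, 2, 0, 0, 15, 1, 16, 0, 1, 1, 0, 6, 0, 0, 1, 0, 16, 7, 11],
    "lavage": [9, 3, 4, 15, 3, 5, 0, 0, 2, 11, 0, 0, 0, 0, 7, 5, 9, 0, 0, 0],
    "control": [0, 9, 0, 0, 4, 5, 0, 0, 13, 0, 0, 12, 0, 0, 1, 0, 4, 4, 6, 7],
}


def mean(values):
    return sum(values) / len(values)


all_values = [x for values in groups.values() for x in values]
n_total = len(all_values)
n_groups = len(groups)
grand_mean = mean(all_values)

# Between-group (model) and within-group (error) sums of squares.
ss_between = sum(
    len(values) * (mean(values) - grand_mean) ** 2 for values in groups.values()
)
ss_within = sum(
    (x - mean(values)) ** 2 for values in groups.values() for x in values
)

ms_between = ss_between / (n_groups - 1)
ms_within = ss_within / (n_total - n_groups)
f_value = ms_between / ms_within

for name, values in groups.items():
    print(name, mean(values))
print(round(ss_between, 4), round(ss_within, 4), round(f_value, 2))
```

The group means and sums of squares printed here correspond to the LSMEAN and ANOVA tables in the PROC GLM output.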
(ii) Implement Kruskal-Wallis

As noted above, since ANOVA is not appropriate, the data are analysed without assuming a distributional form (non-parametric), using a Kruskal-Wallis test. The test statistic is 0.3758 with p-value = 0.8287 for the three categories of gastric (control, emesis, lavage), so there is no evidence of a difference between the groups. The mean rank scores are 28.9 for the control group, 32.125 for emesis and 30.475 for lavage; the standard deviation of each rank sum under H0 is 60.75.

SAS output for the Kruskal-Wallis analysis:
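The Kruskal-Wallis statistic can also be checked by hand: rank the pooled data using average ranks for ties, sum the ranks within each group, and apply the tie correction. The sketch below does this with the standard library only. With 2 degrees of freedom the chi-square upper tail has the closed form exp(−H/2), which is used for the asymptotic p-value; the names are illustrative.

```python
import math
from collections import Counter

# Residual pellet counts by group, as listed in the exercise.
groups = {
    "control": [0, 9, 0, 0, 4, 5, 0, 0, 13, 0, 0, 12, 0, 0, 1, 0, 4, 4, 6, 7],
    "emesis": [0, 15, 2, 0, 0, 15, 1, 16, 0, 1, 1, 0, 6, 0, 0, 1, 0, 16, 7, 11],
    "lavage": [9, 3, 4, 15, 3, 5, 0, 0, 2, 11, 0, 0, 0, 0, 7, 5, 9, 0, 0, 0],
}

pooled = sorted(x for values in groups.values() for x in values)
n_total = len(pooled)

# Average rank for each distinct value (ties share the mean of their ranks).
rank_of = {}
i = 0
while i < n_total:
    j = i
    while j < n_total and pooled[j] == pooled[i]:
        j += 1
    rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
    i = j

rank_sums = {
    name: sum(rank_of[x] for x in values) for name, values in groups.items()
}

# Uncorrected Kruskal-Wallis statistic.
h_raw = 12 / (n_total * (n_total + 1)) * sum(
    rank_sums[name] ** 2 / len(values) for name, values in groups.items()
) - 3 * (n_total + 1)

# Tie correction: 1 - sum(t^3 - t) / (N^3 - N) over groups of tied values.
tie_counts = Counter(pooled).values()
correction = 1 - sum(t ** 3 - t for t in tie_counts) / (n_total ** 3 - n_total)
h_stat = h_raw / correction

# Chi-square upper tail with 2 degrees of freedom (3 groups).
p_value = math.exp(-h_stat / 2)

print(rank_sums)
print(round(h_stat, 4), round(p_value, 4))
```

The tie correction matters here because 27 of the 60 observations are zero; without it the statistic would be noticeably smaller.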
The NPAR1WAY Procedure

Wilcoxon Scores (Rank Sums) for Variable respel
Classified by Variable gastric
gastric   N    Sum of Scores   Expected Under H0   Std Dev Under H0   Mean Score
control   20   578.00          610.0               60.751415          28.9000
emesis    20   642.50          610.0               60.751415          32.1250
lavage    20   609.50          610.0               60.751415          30.4750
Average scores were used for ties.

Kruskal-Wallis Test
Chi-Square   DF   Pr > ChiSq
0.3758       2    0.8287

Monte Carlo Estimates for the Exact Test
Probability Pr >= ChiSq
Estimate                0.8313
99% Confidence Limits   0.8217   0.8409
Number of Samples       10000
Initial Seed            364567942
The SAS program and its output, which support the results discussed above for the ANOVA and Kruskal-Wallis analyses, are given below.

SAS program

%web_drop_table(ADE.GE);

FILENAME REFFILE '/folders/myfolders/ADE/gastric-emptying.csv';

PROC IMPORT DATAFILE=REFFILE DBMS=CSV OUT=ADE.GE;
  GETNAMES=YES;
RUN;

PROC CONTENTS DATA=ADE.GE;
RUN;

%web_open_table(ADE.GE);

PROC SORT DATA=ADE.GE;
  by gastric;
run;

PROC UNIVARIATE PLOT NORMAL data=ADE.GE;
  BY gastric;
  VAR respel;
run;

/* ANOVA */
proc glm data=ADE.GE order=data plots=diagnostics;
  class gastric;
  model respel = gastric;
  lsmeans gastric / pdiff cl;
  means gastric / hovtest;
run;

* pairwise comparisons using Student t tests;
* emesis vs lavage;
proc ttest data=ADE.GE;
  where gastric in ('emesis','lavage');
  class gastric;
  var respel;
run;

* emesis vs control;
proc ttest data=ADE.GE;
  where gastric in ('emesis','control');
  class gastric;
  var respel;
run;

* lavage vs control;
proc ttest data=ADE.GE;
  where gastric in ('lavage','control');
  class gastric;
  var respel;
run;

/* Kruskal-Wallis, with Monte Carlo estimate of the exact p-value */
proc npar1way data=ADE.GE wilcoxon;
  class gastric;
  exact wilcoxon / mc;
  var respel;
run;

PROC RANK data=ADE.GE OUT=ADE.geranks;
  VAR respel;
run;

/* Printing the ranks for the data: */
PROC PRINT DATA=ADE.geranks;
run;

/* Performing the Bonferroni Multiple Comparisons: */
PROC GLM DATA=ADE.geranks;
  CLASS gastric;
  MODEL respel = gastric;
  LSMEANS gastric / CL PDIFF ADJUST=BON;
run;
The UNIVARIATE Procedure
Variable: respel

Moments
N                 60           Sum Weights        60
Mean              3.83333333   Sum Observations   230
Std Deviation     5.02929272   Variance           25.2937853
Skewness          1.2014062    Kurtosis           0.24103148
Uncorrected SS    2374         Corrected SS       1492.33333
Coeff Variation   131.198941   Std Error Mean     0.6492789

Basic Statistical Measures
Location            Variability
Mean     3.833333   Std Deviation         5.02929
Median   1.000000   Variance              25.29379
Mode     0.000000   Range                 16.00000
                    Interquartile Range   6.50000

Tests for Location: Mu0=0
Test          Statistic      p Value
Student's t   t   5.903986   Pr > |t|    <.0001
Sign          M   16.5       Pr >= |M|   <.0001
Signed Rank   S   280.5      Pr >= |S|   <.0001

Tests for Normality
Test                 Statistic         p Value
Shapiro-Wilk         W      0.769273   Pr < W      <0.0001
Kolmogorov-Smirnov   D      0.246741   Pr > D      <0.0100
Cramer-von Mises     W-Sq   0.91413    Pr > W-Sq   <0.0050
Anderson-Darling     A-Sq   5.318862   Pr > A-Sq   <0.0050

Quantiles (Definition 5)
Level        Quantile
100% Max     16.0
99%          16.0
95%          15.0
90%          12.5
75% Q3       6.5
50% Median   1.0
25% Q1       0.0
10%          0.0
5%           0.0
1%           0.0
0% Min       0.0

Extreme Observations
Lowest          Highest
Value   Obs     Value   Obs
0       49      15      37
0       48      15      38
0       47      15      60
0       46      16      39
0       45      16      40
The GLM Procedure

Class Level Information
Class     Levels   Values
gastric   3        control emesis lavage

Number of Observations Read   60
Number of Observations Used   60

The GLM Procedure
Dependent Variable: respel

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model             2    19.233333        9.616667      0.37      0.6909
Error             57   1473.100000      25.843860
Corrected Total   59   1492.333333

R-Square   Coeff Var   Root MSE   respel Mean
0.012888   132.6179    5.083686   3.833333

Source    DF   Type I SS     Mean Square   F Value   Pr > F
gastric   2    19.23333333   9.61666667    0.37      0.6909

Source    DF   Type III SS   Mean Square   F Value   Pr > F
gastric   2    19.23333333   9.61666667    0.37      0.6909
The GLM Procedure
Least Squares Means

gastric   respel LSMEAN   LSMEAN Number
control   3.25000000      1
emesis    4.60000000      2
lavage    3.65000000      3

Least Squares Means for effect gastric
Pr > |t| for H0: LSMean(i)=LSMean(j)
Dependent Variable: respel
i/j   1        2        3
1              0.4046   0.8044
2     0.4046            0.5569
3     0.8044   0.5569

gastric   respel LSMEAN   95% Confidence Limits
control   3.250000        0.973704   5.526296
emesis    4.600000        2.323704   6.876296
lavage    3.650000        1.373704   5.926296

Least Squares Means for Effect gastric
i   j   Difference Between Means   95% Confidence Limits for LSMean(i)-LSMean(j)
1   2   -1.350000                  -4.569169   1.869169
1   3   -0.400000                  -3.619169   2.819169
2   3    0.950000                  -2.269169   4.169169

Note: To ensure overall protection level, only probabilities associated with pre-planned comparisons should be used.
The GLM Procedure

Levene's Test for Homogeneity of respel Variance
ANOVA of Squared Deviations from Group Means
Source    DF   Sum of Squares   Mean Square   F Value   Pr > F
gastric   2    5173.0           2586.5        2.37      0.1024
Error     57   62138.9          1090.2

The GLM Procedure

                        respel
Level of gastric   N    Mean         Std Dev
control            20   3.25000000   4.24108973
emesis             20   4.60000000   6.29452561
lavage             20   3.65000000   4.46359544