DEMENTIA RESEARCH CENTRE INSTITUTE OF NEUROLOGY

Non-parametric tests – 16th October 2018 Saiful Islam

Chris Hardy

Feedback from you about past 2 lectures

• Didn’t understand the concept of Standard Error (SE) and Confidence Interval (CI)
• Useful examples
• Bad timing: in the afternoon, when everyone was tired / sleepy
• Class quizzes are useful
• Overall standard very good / excellent

Standard Deviation (SD) vs Standard Error (SE)

• The SE quantifies the typical error, or difference, between the mean measured in a sample and the theoretical mean in the population from which the sample was drawn
• The SE indicates how accurately the sample mean estimates the population mean
• The standard deviation (SD) of a sample of observations measures how a typical observation in the sample differs (deviates) from the sample mean

SD: measures variability in the population or sample
SE = SD/√n: measures variability in the sample means

Confidence interval: an example

• We can be 95% confident that the interval contains the unknown but true value of the population mean
• Assume we have a large sample (n > 30), say of patients' low blood pressure readings, and that these data are normally distributed
• We need to obtain a lower and an upper limit for the interval:

95% CI for the mean is (x̄ - 2 × SE) to (x̄ + 2 × SE)

• The multiplier 2 is an approximation of 1.96
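As a rough illustration of how SD, SE and the approximate 95% CI relate, here is a minimal Python sketch (Python is used here only for illustration; the lecture itself uses Stata). The blood pressure readings are simulated and purely hypothetical, and the seed, mean and spread are assumptions made for the example.

```python
import math
import random
import statistics

# Hypothetical data: 40 simulated blood pressure readings (not from the lecture).
random.seed(1)
readings = [random.gauss(80, 10) for _ in range(40)]

n = len(readings)
mean = statistics.mean(readings)
sd = statistics.stdev(readings)   # SD: how a typical observation deviates from the mean
se = sd / math.sqrt(n)            # SE = SD/sqrt(n): how precisely the mean is estimated

# Approximate 95% CI, using 2 in place of 1.96 as on the slide.
lower = mean - 2 * se
upper = mean + 2 * se

print(f"n={n} mean={mean:.2f} SD={sd:.2f} SE={se:.2f}")
print(f"approx. 95% CI: ({lower:.2f}, {upper:.2f})")
```

Note that the SE is always smaller than the SD once n > 1, and it shrinks as the sample grows, while the SD does not.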

Next meeting date: 23rd October at 4:15pm

• Where to meet: QS7 Cluster room (same as before)
• Feedback will be collected for today’s and the 23rd October lectures
• Free coffee, snacks available!!
• Please contact Max & Nikita to join in this survey
• Max : [email protected]
• Nikita : [email protected]

Learning Outcomes for today’s lecture

• What are non-parametric tests?
• When do you need these tests?
• Difference between paired and unpaired data
• How to compare data using:
  • Mann-Whitney test (two independent samples)
  • Kruskal-Wallis test (more than two independent samples)
  • Wilcoxon signed-rank test (paired data)
• How to interpret results from STATA output

Current research at UCL

Why do patients with logopenic variant primary progressive aphasia (lvPPA) struggle to understand speech?

Logopenic Variant PPA

• A ‘language-led’ dementia syndrome
• Rare variant of Alzheimer’s disease
• Young onset (<65)
• Impaired repetition of phrases and sentences
• Word-finding difficulties
• People also report having difficulties understanding speech in noisy places

Speech perception in lvPPA [video]

How do we investigate this experimentally?


• 17 healthy controls
• 7 patients with lvPPA
• (9 patients with nfvPPA)
• Patients in National Hospital for Neurology and Neurosurgery
• Tested by researchers in Dementia Research Centre

Paradigm

[Figure: the stimulus “Nine hundred and sixty five”, plotted as frequency against time (msec), from 0 to 1800]

Research question

Do people with lvPPA struggle to understand degraded speech?

Can they improve over time?

List of variables

• Diagnosis (Categorical)
• SinewaveScore (SWS) (Continuous)
• Bin 1 (Trials 1-10 SWS; Continuous)
• Bin 2 (Trials 11-20 SWS; Continuous)
• Bin 3 (Trials 21-30 SWS; Continuous)
• Bin 4 (Trials 31-40 SWS; Continuous)

Study objectives

Main outcome: SinewaveScore = ability to understand degraded speech

1. Compare controls with lvPPA
2. Compare controls with nfvPPA
3. Compare controls, lvPPA, and nfvPPA
4. Compare SWS score between two time points (Bin 1 and Bin 4) in the lvPPA group

What are non-parametric tests?

• ‘Parametric’ tests involve estimating parameters such as the mean, and assume that the sample means are normally distributed (planned for lecture 4 on 23rd October 2018)
• Often data do not follow a Normal distribution, e.g. number of cigarettes smoked, cost to the NHS
• Such data often have positively skewed distributions

[Figure: histogram of units of alcohol per week (x-axis, 0 to 50) against frequency (y-axis, 0 to 20). Mean = 8.03, Std. Dev. = 12.952, N = 30]

What are non-parametric tests?

• ‘Non-parametric’ (NP) tests were developed for situations like these, where fewer assumptions have to be made
• Sometimes called distribution-free tests
• NP tests STILL have assumptions, but these are less stringent
• NP tests can be applied to Normal data, but parametric tests have greater power IF their assumptions are met

Ranks

• The practical difference between parametric and NP methods is that NP methods use the ranks of the values rather than the actual values
• E.g.

Actual:  1  2  3  4  5  7  13  22  38  45
Rank:    1  2  3  4  5  6   7   8   9  10
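The ranking step can be sketched in a few lines of Python (for illustration only; the lecture uses Stata). The helper name `midranks` is hypothetical. It gives tied values the average of the ranks they would occupy, which is the usual convention in rank-based tests.

```python
def midranks(values):
    """Return the rank of each value (1 = smallest); ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value is tied with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(pos, end + 1):
            ranks[order[k]] = average_rank
        pos = end + 1
    return ranks


actual = [1, 2, 3, 4, 5, 7, 13, 22, 38, 45]
print(midranks(actual))        # the slide's example: ranks 1 to 10
print(midranks([5, 7, 7, 1]))  # the two tied 7s share rank 3.5
```

Because only the ordering matters, replacing 45 by 4500 in the example would leave every rank unchanged, which is why rank-based tests are less affected by extreme values.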

Median

• The median is the value above and below which 50% of the data lie
• If the data are ranked in order, it is the middle value
• In symmetric distributions the mean and median are the same
• In skewed distributions, the median is more appropriate

Class exercise: find the median (1 min)

• Blood pressure measures of 7 patients: 135, 138, 140, 140, 141, 142, 143
  Median = ?
• No. of cigarettes smoked: 0, 1, 2, 2, 2, 3, 5, 5, 8, 10
  Median = ?
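One way to check an answer to the exercise is the short Python sketch below (Python's standard `statistics` module, not the lecture's Stata). With an odd number of values the median is the middle value; with an even number it is the average of the two middle values.

```python
import statistics

blood_pressure = [135, 138, 140, 140, 141, 142, 143]  # 7 values: middle is the 4th
cigarettes = [0, 1, 2, 2, 2, 3, 5, 5, 8, 10]          # 10 values: average the 5th and 6th

bp_median = statistics.median(blood_pressure)
cig_median = statistics.median(cigarettes)

print("Blood pressure median:", bp_median)
print("Cigarettes median:", cig_median)
```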

Paired and Not Paired Comparisons

• If you have the same sample measured on two separate occasions, this is a paired comparison
• Two independent samples is not a paired comparison
• Different samples which are ‘matched’ by age and gender are paired

Non-parametric tests: Wilcoxon tests

• Frank Wilcoxon was a chemist in the USA who developed these tests
• Non-parametric Wilcoxon signed-rank test ≡ parametric paired t-test
• Non-parametric Wilcoxon rank-sum test ≡ parametric independent t-test
• To compare more than 2 independent samples, use the Kruskal-Wallis test
• Mann-Whitney test ≡ Wilcoxon rank-sum test

Please note that the parametric tests will be discussed in the next lecture (lecture 4) on 23rd October 2018.

• Histogram

[Figure: histogram of numbersSWS (x-axis, 0 to 150) against frequency (y-axis, 0 to 20)]

• The distribution is clearly not normal

• Histogram by group

[Figure: density of numbersSWS1 (x-axis, 0 to 150) with a normal curve overlaid, one panel for each of Group 0, 1 and 2. Graphs by Group]

• 0 vs 1: not similar
• 0 vs 2: not similar
• 1 vs 2: similar

• Null hypothesis: there is no difference in the distribution of SWS scores between the control and lvPPA groups
• Alternative hypothesis: there is some difference in the distribution of SWS scores between the control and lvPPA groups

• Now we check the quantile-quantile (q-q) plot to assess normality for group = 1 (control group)
• Data points lying away from the straight line suggest the data are not normal

• Now we check the quantile-quantile (q-q) plot to assess normality for group = 2 (lvPPA)
• Data points lying away from the straight line suggest the data are not normal. There are also very few data points.

• The control group and the lvPPA group are independent, the samples are very small and neither is normally distributed, so a non-parametric test is appropriate
• We should choose the non-parametric test for two independent samples, the Mann-Whitney test, to compare SWS scores

STATA code

. ranksum numbersSWS1 if Group==0 | Group==1, by(Group)

STATA output

Two-sample Wilcoxon rank-sum (Mann-Whitney) test

       Group |      obs    rank sum    expected
-------------+---------------------------------
           0 |       17         264       212.5
           1 |        7          36        87.5
-------------+---------------------------------
    combined |       24         300         300

unadjusted variance      247.92
adjustment for ties       -1.19
                     ----------
adjusted variance        246.73

Ho: numbe~S1(Group==0) = numbe~S1(Group==1)
             z =   3.279
    Prob > |z| =   0.0010



• The output gives us a handy table displaying the two groups, their obs (number of observations), the observed rank sums and the rank sums that would be expected if the null hypothesis were true (i.e. if there were no difference).
• Tied ranks can be an issue, so below the table there is a variance adjustment to account for these ties.
• Then you are reminded of the null hypothesis, and given the z-statistic (3.279) and p-value (0.001), which suggest that there is a significant difference in the distribution (medians) of SWS between the control and experimental (lvPPA) groups.
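The numbers in this output can be cross-checked by hand. The Python sketch below (an illustration, not part of the lecture's Stata workflow) rebuilds the expected rank sum, the unadjusted variance and the z-statistic from the group sizes and the observed rank sum, using the standard large-sample formulas. The function name `rank_sum_z` is hypothetical, and the tie-adjusted variance (246.73) is taken directly from the Stata output, because the raw data needed to recompute it are not in the handout.

```python
import math


def rank_sum_z(n1, n2, rank_sum_1, adjusted_variance=None):
    """Normal approximation for the Wilcoxon rank-sum (Mann-Whitney) test.

    n1, n2       : group sizes
    rank_sum_1   : observed rank sum of the first group
    Returns (expected rank sum, unadjusted variance, z, two-sided p).
    """
    n = n1 + n2
    expected = n1 * (n + 1) / 2        # rank sum of group 1 under the null hypothesis
    unadjusted = n1 * n2 * (n + 1) / 12
    variance = unadjusted if adjusted_variance is None else adjusted_variance
    z = (rank_sum_1 - expected) / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value from the standard normal
    return expected, unadjusted, z, p


# Controls (n = 17, rank sum 264) vs lvPPA (n = 7); adjusted variance from the output.
expected, unadjusted, z, p = rank_sum_z(17, 7, 264, adjusted_variance=246.73)
print(f"expected={expected} unadjusted variance={unadjusted:.2f} z={z:.3f} p={p:.4f}")
```

The observed rank sum for controls (264) is well above the 212.5 expected under the null hypothesis, which is what drives the large z-statistic.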

Class Quiz (1 min, in pairs)

• Can we perform this NP test if the SWS scores for the control and experimental (lvPPA) groups are not independent?

• Your answer: Yes / No

• Null hypothesis: there is no difference in the distribution of SWS scores between the control and nfvPPA groups
• Alternative hypothesis: there is some difference in the distribution of SWS scores between the control and nfvPPA groups

• Now we check the quantile-quantile (q-q) plot to assess normality for nfvPPA
• Data points lying away from the straight line suggest the data are not normal. There are also very few data points.

• The control group and the nfvPPA group are independent, the samples are very small and neither is normally distributed, so a non-parametric test is appropriate
• We should choose the non-parametric test for two independent samples, the Mann-Whitney test, to compare SWS scores

STATA code

. ranksum numbersSWS1 if Group==0 | Group==2, by(Group)

STATA output

Two-sample Wilcoxon rank-sum (Mann-Whitney) test

       Group |      obs    rank sum    expected
-------------+---------------------------------
           0 |       17         286         221
           2 |        8          39         104
-------------+---------------------------------
    combined |       25         325         325

unadjusted variance      294.67
adjustment for ties       -1.25
                     ----------
adjusted variance        293.42

Ho: numbe~S1(Group==0) = numbe~S1(Group==2)
             z =   3.795
    Prob > |z| =   0.0001



• The output gives us a handy table displaying the two groups, their obs (number of observations), the observed rank sums and the rank sums that would be expected if the null hypothesis were true (i.e. if there were no difference).
• Tied ranks can be an issue, so below the table there is a variance adjustment to account for these ties.
• Then you are reminded of the null hypothesis, and given the z-statistic (3.795) and p-value (0.0001), which suggest that there is a significant difference in the distribution (medians) of SWS between the control and experimental (nfvPPA) groups.

• Null hypothesis: there is no difference in the distribution of SWS scores among the control, lvPPA and nfvPPA groups
• Alternative hypothesis: the distribution of SWS scores differs for at least one pair of groups

1. We have a quantitative measure for each outcome
2. The overall outcome is not normally distributed
3. The shapes of the distributions are not similar for at least one pair of groups (e.g. control vs lvPPA)
4. The groups are independent of each other

• The overall outcome is not normally distributed, so a non-parametric test is preferred
• We have more than two independent groups
• We will therefore use a non-parametric test called the Kruskal-Wallis equality-of-populations rank test

STATA command

. kwallis numbersSWS1, by(diagnosis1)

STATA output

Kruskal-Wallis equality-of-populations rank test

  +----------------------------+
  | diagno~1 | Obs | Rank Sum  |
  |----------+-----+-----------|
  |  Control |  17 |   397.00  |
  |    lvPPA |   7 |    63.50  |
  |   nfvPPA |   8 |    67.50  |
  +----------------------------+

chi-squared =    19.371 with 2 d.f.
probability =     0.0001

chi-squared with ties =    19.414 with 2 d.f.
probability =     0.0001



• We had ties in our data, so we consult the tie-adjusted Kruskal-Wallis H test results (the "chi-squared with ties" lines of the output).
• The first of these lines (i.e., "chi-squared with ties = 19.414 with 2 d.f.") reports the chi-squared value and the degrees of freedom of the test.
• The line below it (i.e., "probability = 0.0001") gives the statistical significance of the Kruskal-Wallis H test (i.e., the p-value).
• The p-value is 0.0001, which is below 0.05, so there is a statistically significant difference in the median SWS score between the three diagnosis groups (i.e., control vs lvPPA vs nfvPPA).
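The unadjusted chi-squared value in the output can be reproduced from the table alone. The Python sketch below (illustrative only; the lecture uses Stata's `kwallis`) applies the standard Kruskal-Wallis formula H = 12/(N(N+1)) × Σ(R²/n) - 3(N+1) to the group sizes and rank sums. The function name is hypothetical. The tie-adjusted value (19.414) is not recomputed, because it needs the raw data, and the p-value shortcut used here is exact only for 2 degrees of freedom, i.e. three groups.

```python
import math


def kruskal_wallis_h(groups):
    """groups: list of (number of observations, rank sum) pairs, one per group.

    Returns (H without tie correction, degrees of freedom).
    """
    n_total = sum(n for n, _ in groups)
    weighted = sum(rank_sum ** 2 / n for n, rank_sum in groups)
    h = 12 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)
    return h, len(groups) - 1


# (Obs, Rank Sum) for Control, lvPPA and nfvPPA, taken from the Stata output.
h, df = kruskal_wallis_h([(17, 397.0), (7, 63.5), (8, 67.5)])

# Upper tail of the chi-squared distribution; exp(-h/2) is exact only when df == 2.
p = math.exp(-h / 2)

print(f"H={h:.3f} with {df} d.f., p={p:.4f}")
```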

• There are only 7 patients in the lvPPA group
• The sample size is very small (<10), so a non-parametric test is appropriate
• We want to compare SWS scores between time 1 and time 4

• First take the differences in SWS between the two time points:

Table: SWS scores at two time points for the patients with lvPPA (diff = Time4 - Time1)

Time1   Time4   diff
    2       8      6
   12      17      5
    2       6      4
   20      27      7
   30      28     -2
    1       0     -1
   23      30      7

• Almost all the data points in the q-q plot are away from the straight line, so we apply the non-parametric Wilcoxon signed-rank test (an alternative to the paired t-test) to test the hypothesis that there is no difference in SWS score between the two time points.
• Use the STATA command: gen diff=Time4-Time1 if Diagnosis==2

STATA command

. signrank diff=0

STATA output

Wilcoxon signed-rank test

        sign |      obs   sum ranks    expected
-------------+---------------------------------
    positive |        5          25          14
    negative |        2           3          14
        zero |        0           0           0
-------------+---------------------------------
         all |        7          28          28

unadjusted variance       35.00
adjustment for ties       -0.13
adjustment for zeros       0.00
                     ----------
adjusted variance         34.88

Ho: diff1 = 0
             z =   1.863
    Prob > |z| =   0.0625

STATA output

. centile diff, centile(50)

                                               -- Binom. Interp. --
    Variable |     Obs  Percentile    Centile  [95% Conf. Interval]
-------------+-----------------------------------------------------
        diff |       7          50          5   -1.685714         7

• The test gives a p-value of 0.0625, suggesting that there is not enough evidence of a difference in median SWS score between the two time points for the patients with lvPPA.
• The 95% confidence interval around the median difference runs from -1.69 to 7. This interval includes 0, which indicates that there is not much difference in the sinewave score of the lvPPA patients between the two time points.
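The signed-rank calculation can be followed step by step using the seven pairs from the table. The Python sketch below (for illustration; the lecture uses Stata's `signrank`) ranks the absolute differences, sums the ranks by sign, and applies the normal approximation. It assumes there are no zero differences, which holds for these data, and it uses the sum of squared ranks divided by 4 as the tie-adjusted variance; that matches the adjusted variance in the output here, but it is a simplification of what Stata reports in general. The helper name `midrank` is hypothetical.

```python
import math

time1 = [2, 12, 2, 20, 30, 1, 23]
time4 = [8, 17, 6, 27, 28, 0, 30]
diffs = [later - earlier for earlier, later in zip(time1, time4)]
abs_diffs = [abs(d) for d in diffs]


def midrank(x, data):
    """Rank of x within data (1 = smallest); tied values share the average rank."""
    smaller = sum(1 for v in data if v < x)
    equal = sum(1 for v in data if v == x)
    return smaller + (equal + 1) / 2


ranks = [midrank(a, abs_diffs) for a in abs_diffs]

# Sum the ranks separately for positive and negative differences.
positive_sum = sum(r for r, d in zip(ranks, diffs) if d > 0)
negative_sum = sum(r for r, d in zip(ranks, diffs) if d < 0)

n = len(diffs)
expected = n * (n + 1) / 4                 # expected sum of positive ranks under the null
variance = sum(r * r for r in ranks) / 4   # equals n(n+1)(2n+1)/24 when there are no ties
z = (positive_sum - expected) / math.sqrt(variance)
p = math.erfc(abs(z) / math.sqrt(2))       # two-sided p-value from the standard normal

print(f"positive={positive_sum} negative={negative_sum} expected={expected}")
print(f"adjusted variance={variance:.3f} z={z:.3f} p={p:.4f}")
```

Five of the seven patients improved, but with so few pairs the evidence does not reach the conventional 0.05 threshold.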

Take home: What statistical methods should I use to analyze my data?

• Choose appropriate statistical methods/tests
• This will be covered in the next lecture, 23rd October 2018

Suggested Reading

• An Introduction to Medical Statistics by Martin Bland (4th edition): pages 117-191
• Medical Statistics by B. Kirkwood & J. Sterne: pages 344-350

• Any questions?
