Discus – A software program to assess judgment of glaucomatous damage in optic disc photographs.

Short title

Discus

Words, Figures, Tables

3600, 3, 2

Codes & Presentations

GL, Poster at ARVO meeting in May 2008 (program # 3625)

Keywords

glaucoma, optic disc, sensitivity, specificity, diagnostic performance

(Word-cloud image created with www.wordle.net, courtesy Jon Feinberg.)

Authors

Jonathan Denniss, MCOptom (1,2); Damian Echendu, OD, MSc (1); David B Henson, PhD, FCOptom (1,2); Paul H Artes, PhD (1,2,3) (corresponding author)

Affiliations & Correspondence

1. Research Group for Eye and Vision Sciences, University of Manchester, England
2. Manchester Royal Eye Hospital, Manchester, England
3. Ophthalmology and Visual Sciences, Dalhousie University, Rm 2035, West Victoria, 1276 South Park St, Halifax, Nova Scotia B3H 2Y9, Canada

[email protected]

Commercial Relationships

None

Support

College of Optometrists PhD studentship (JD) Nova Scotia Health Research Foundation Grant Med-727 (PHA)

Abstract

Aim

To describe a software package (Discus) for evaluating clinicians’ assessment of optic disc damage, and to provide reference data from a group of expert observers.

Methods

Optic disc images were selected from patients with manifest or suspected glaucoma or ocular hypertension who attended the Manchester Royal Eye Hospital. Eighty images came from eyes without evidence of visual field (VF) loss in at least 4 consecutive tests (VF-negatives), and 20 images from eyes with repeatable VF loss (VF-positives). Software was written to display these images in randomized order, for up to 60 seconds. Expert observers (n=12) rated optic disc damage on a 5-point scale (definitely healthy, probably healthy, not sure, probably damaged, definitely damaged).

Results

Optic disc damage as determined by the expert observers predicted VF loss with less than perfect accuracy (mean area under the receiver operating characteristic curve [AUROC], 0.78; range, 0.72 to 0.85). When the responses were combined across the panel of experts, the AUROC reached 0.87, corresponding to a sensitivity of ~60% at 90% specificity. While the observers’ performances were similar, there were large differences between the criteria they adopted (p<0.001), even though all observers had been given identical instructions.

Conclusion

Discus provides a simple and rapid means of assessing important aspects of optic disc interpretation. The data from the panel of expert observers provide a reference against which students, trainees, and clinicians may compare themselves. The program and the analyses described in this paper are freely accessible from http://discusproject.blogspot.com/.

Introduction

The detection of early damage of the optic disc is an important yet difficult task.1, 2

In many patients with glaucoma, optic disc damage is the first clinically detectable sign of disease. In the Ocular Hypertension Treatment Study, for example, almost 60% of patients who converted to glaucoma developed optic disc changes before exhibiting reproducible visual field damage.3, 4 Broadly similar findings were obtained in the European Glaucoma Prevention Study; in approximately 40% of those participants who developed glaucoma, optic disc changes were recognised before visual field changes.5 However, the diverse range of optic disc appearances in a healthy population, combined with the many ways in which glaucomatous damage may affect the appearance of the disc, makes it difficult to detect features of early damage.6, 7

While several imaging technologies that provide reproducible assessment of the optic disc and retinal nerve fibre layer have been developed over recent decades (confocal scanning laser tomography, nerve fibre layer polarimetry, and optical coherence tomography), the diagnostic performance of these technologies has not been consistently better than that achieved by clinicians.8-11 Subjective assessment of the optic disc, either by slitlamp biomicroscopy or by inspection of photographs, therefore still plays a pivotal role in the clinical care of patients at risk from glaucoma.8

Many papers describe the optic disc changes in glaucoma,6, 7, 12-14 and several authors have examined the agreement between clinicians in diagnosing glaucoma, in differentiating between different types of optic disc damage, or in estimating specific parameters such as cup/disc ratios.15-25 However, because there is no objective reference standard for optic disc damage, it is difficult for students, trainees, or clinicians to assess their judgments against an external reference.

In this paper, we describe a software package (“Discus”) which observers can use to view and interpret a set of selected optic disc images under controlled conditions. We further present reference data from 12 expert observers against which future observers can be evaluated, or evaluate themselves.

Methods

Selection of Images

To obtain a set of optic disc images with a wide spectrum of early glaucomatous damage, data were selected from patients who had attended the Optometrist-led Glaucoma Assessment (OLGA) clinics at the Royal Eye Hospital (Manchester, UK) between June 2003 and May 2007. This clinic sees patients who are deemed at risk of developing glaucoma, for example owing to ocular hypertension, or who have glaucoma but are thought to be at low risk of progression and are well controlled on medical therapy. Patients undergo regular examinations (normally at intervals of 6 months) by specifically trained optometrists. During each visit, visual field examinations (Humphrey Field Analyzer program 24-2, SITA-Standard) and non-stereoscopic fundus photography are performed (Topcon TRC-50EX, field-of-view 20 degrees, resolution 2000×1312 pixels, 24-bit colour).

For this study, images were considered for inclusion if the patient had undergone at least 4 visual field tests on each eye (n=665). The 4 most recent visual fields were then analysed to establish two distinct groups, visual field (VF-) positive and VF-negative (Table 1). Images from patients who did not meet the criteria of either group were excluded.

Table 1: Inclusion criteria for VF-positive and VF-negative groups. For inclusion in the VF-negative group, the criteria had to be met in both eyes. In addition, the between-eye differences in MD and PSD had to be less than 1.0 dB.

                 MD                          PSD
VF-positive      between -2.5 and -10.0 dB   between 3.0 and 15.0 dB
VF-negative      better than [>] -1.5 dB     better than [<] 2.0 dB

If both eyes of a patient met these criteria, a single eye was randomly selected. A small number of eyes (n=17) were excluded owing to clearly non-glaucomatous visual field loss (for example, hemianopia) or non-glaucomatous lesions visible on the fundus photographs (e.g. chorioretinal scars). There were 155 eyes in the VF-positive and 144 eyes in the VF-negative group.

To eliminate any potential clues other than glaucomatous optic disc damage, we matched the image quality in the VF-negative and VF-positive groups. One of the authors (DE) viewed the images on a computer monitor in random order and graded each one on a five-point scale for focus and uniformity of illumination. During grading, the observer was unaware of the status of the image (VF-positive or -negative), and the area of the disc had been masked from view. A final set of 20 VF-positive images and 80 VF-negative images was then created such that the distribution of image quality was similar in both groups (Table 2). The total size of the image set (100), and the ratio of VF-positive to VF-negative images (20:80), had been decided beforehand to limit the duration of the experiments and to keep the emphasis on discs with early damage.

Table 2: Characteristics of the VF-positive and VF-negative groups. Image quality was scored subjectively on a scale from 1 to 5. Differences between groups were tested for statistical significance by Mann-Whitney U (MWU) tests.

                       Image Quality   Age, y        MD, dB         PSD, dB
VF-positive (n=20)     1.82 (1.20)     66.0 (13.1)   -6.20 (1.76)   5.58 (2.15)
VF-negative (n=80)     1.68 (1.33)     61.3 (9.3)    +0.60 (0.4)    1.50 (0.16)
p-value (MWU)          0.67            0.35          <0.001         <0.001

Expert Observers

For the present study, 12 expert observers (either glaucoma fellowship-trained ophthalmologists working in glaucoma sub-speciality clinics, n=10, or scientists involved in research on the optic disc in glaucoma, n=2) were selected. Observers were approached ad hoc during scientific meetings or contacted by e-mail or letter with a request for participation.

Prior to the experiments, the observers were given written instructions detailing the selection of the image set. The instructions also stipulated that responses should be given on the basis of apparent optic disc damage rather than the perceived likelihood of visual field damage.

Experiments

In order to present images under controlled conditions, and to collect the observers’ responses, a software package, Discus (version 3.0E, Figure 1), was developed in Delphi (CodeGear, San Francisco, CA). Details on the availability and configuration of the software are provided in the Appendix.

The software displayed the images, in random order, on a computer monitor. After the observer had triggered a new presentation by hitting the “Next” button, an image was displayed until the observer responded by clicking one of 5 buttons (definitely healthy, probably healthy, not sure, probably damaged, definitely damaged). After a time-out period of 60 seconds the image would disappear, but observers were allowed unlimited time to give a response. To guard against occasional finger errors, observers were also allowed to change their response, as long as this occurred before the “Next” button was hit.

To assess the consistency of the observers, 26 images were presented twice (2 in the VF-positive group, 24 in the VF-negative group). No feedback was provided during the sessions.

Fig 1: Screenshot of the Discus software. Images remained on display for up to 60 seconds, or until the observer clicked on one of the 5 response categories. A new presentation was triggered by hitting the “Next” button.

Analysis

The responses were transformed to a numerical scale ranging from -2 (“definitely healthy”) to +2 (“definitely damaged”). The proportion of repeated images in which the responses differed by one or more categories was calculated for each observer. For all subsequent analyses, however, only the second of the two responses was used. All analyses were carried out in the freely available open-source environment R, and the ROCR library was used to plot the ROC curves.26, 27
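The response coding and the repeat-consistency measure are simple to reproduce. The sketch below uses Python as a stand-in for the R analysis actually used in the paper; the example responses are invented for illustration.

```python
# Response coding used in the analysis: 5 ordinal categories mapped to -2..+2.
SCALE = {
    "definitely healthy": -2,
    "probably healthy": -1,
    "not sure": 0,
    "probably damaged": 1,
    "definitely damaged": 2,
}

def repeat_discrepancy(first, second):
    """Proportion of repeated images whose two responses differ by
    one or more categories."""
    pairs = list(zip(first, second))
    differing = sum(1 for a, b in pairs if SCALE[a] != SCALE[b])
    return differing / len(pairs)

# Hypothetical responses to 4 repeated images (not study data):
first = ["not sure", "probably damaged", "definitely healthy", "probably healthy"]
second = ["probably damaged", "probably damaged", "definitely healthy", "not sure"]
print(repeat_discrepancy(first, second))  # 0.5
```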

Individual observers’ ROC curves

To obtain an objective measure of individual observers’ performance at discriminating between eyes with and without visual field damage, ROC curves were derived from each set of responses. For this analysis, the visual field status was the reference standard, and responses in the “not sure” category were interpreted as lying between “probably healthy” and “probably damaged”. If an observer had used all five response categories, the ROC curve would contain 4 points (A-D). Point A, the most conservative criterion (most specific but least sensitive), gave the sensitivity and specificity to visual field damage when only the “definitely damaged” responses were treated as test positives while all other responses (“probably damaged”, “not sure”, “probably healthy”, “definitely healthy”) were interpreted as test negatives. For point D, the least conservative criterion (most sensitive but least specific), only “definitely healthy” responses were interpreted as test negatives, and all other responses as test positives.
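The construction of the four operating points can be illustrated with a small sketch (Python; the responses and reference labels below are invented, not study data). Each cut-off on the -2 to +2 scale yields one point on the ROC curve:

```python
# Each cut-off treats responses at or above it as test-positive; cut-offs
# 2, 1, 0, -1 correspond to points A (most conservative) through D (least
# conservative) described in the text.

def roc_points(responses, truth):
    """Return (false-positive rate, sensitivity) for each of the four
    cut-offs on the -2..+2 response scale."""
    points = []
    for cutoff in (2, 1, 0, -1):
        test_pos = [r >= cutoff for r in responses]
        sens = sum(p for p, t in zip(test_pos, truth) if t) / sum(truth)
        fpr = sum(p for p, t in zip(test_pos, truth) if not t) / (len(truth) - sum(truth))
        points.append((fpr, sens))
    return points

# Invented example: 4 VF-positive (truth=1) and 6 VF-negative (truth=0) images.
responses = [2, 1, 0, -1, 2, 1, 0, -1, -2, -2]
truth     = [1, 1, 1,  1, 0, 0, 0,  0,  0,  0]
for label, (fpr, sens) in zip("ABCD", roc_points(responses, truth)):
    print(label, fpr, sens)
```

Point A here catches only one of the four VF-positive images (sensitivity 0.25) but also flags only one VF-negative image, illustrating the specificity/sensitivity trade-off along the curve.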

Individual observers’ criteria

When using a subjective scale, as in the current study, the responses depend on the observer’s interpretation of the categories and their individual inclination to respond with “probably damaged” or “definitely damaged” (response criterion). A cautious observer, for example, might regard a particular optic nerve head (ONH) as “probably damaged” whilst an equally skilled but less cautious observer might respond with “not sure” or “probably healthy”. To investigate the variation in criteria within our group, we compared the observers’ mean responses across the entire image set.

Combining responses of expert observers

To estimate the performance of a panel of experts, and to obtain a reference other than visual field damage for judging current as well as future observers’ responses, the mean response of the 12 expert observers was calculated for each of the 100 images.

To estimate whether the expert group (n=12) was sufficiently large, we investigated how the performance of the combined panel changed depending on the number of included observers. Areas under the ROC curve were calculated for all possible combinations of 2, 3, 4…11 observers to derive the mean performance, as well as the minimum and maximum.
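This subset analysis amounts to averaging responses over every combination of observers and computing the ROC area each time. A minimal sketch follows (Python stand-in for the R analysis; the panel data are invented). The AUROC is computed here via the Mann-Whitney relation, the probability that a randomly chosen VF-positive image receives a higher score than a randomly chosen VF-negative one:

```python
from itertools import combinations

def auroc(scores, truth):
    """AUROC via the Mann-Whitney relation: probability that a random
    positive scores higher than a random negative (ties count 0.5)."""
    pos = [s for s, t in zip(scores, truth) if t]
    neg = [s for s, t in zip(scores, truth) if not t]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Invented example: 3 observers' responses (-2..+2) to 5 images, 2 VF-positive.
panel = [
    [2, 1, -1, -2, 0],
    [1, 2, 0, -1, -1],
    [2, 2, -2, -1, 0],
]
truth = [1, 1, 0, 0, 0]

# For each panel size k, combine every subset by its mean response per image
# and report the range of AUROCs, mirroring the analysis described above.
for k in (2, 3):
    areas = []
    for subset in combinations(panel, k):
        means = [sum(col) / k for col in zip(*subset)]
        areas.append(auroc(means, truth))
    print(k, min(areas), max(areas))
```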

Relationship between responses of individual observers and expert panel

As a measure of overall agreement between the expert observers, independent of their individual response criteria, the Spearman rank correlation coefficient between the 12 sets of responses was computed. The underlying rationale of this analysis is that, by assigning each image to one of five ordinal categories, each observer had in fact ranked the 100 images. If two observers had performed identical rankings, the Spearman coefficient would be 1, regardless of the actual responses assigned.

Results

The experiments took between 13 and 46 minutes (mean, 29 min) to complete. On average, the observers responded 7 seconds after the images were first presented on the screen, and the median response latencies of individual observers ranged from 4 to 16 seconds. The reproducibility of individual observers’ responses was moderate: on average, discrepancies of one category were seen in 44% (12) of 26 repeated images (range, 23-62%).

Individual observers’ results are shown in Fig. 2A-L. The points labelled A, B, C, and D represent the trade-off between the positive rates in the VF-positive (vertical axis) and VF-negative groups (horizontal axis) achieved with the four possible classification criteria. Point A, for example, shows the trade-off when only discs in the “definitely damaged” category are regarded as test-positives. Point B gives the trade-off when discs in both the “definitely damaged” and “probably damaged” categories are regarded as test-positives. For D, the least conservative criterion, only responses of “definitely healthy” were interpreted as negatives. To indicate the precision of these estimates, 95% confidence intervals were added to point B.

Areas under the curve (AUROC) ranged from 0.71 (95% CI, 0.58-0.85) to 0.88 (95% CI, 0.82-0.96), with a mean of 0.79. There was no relationship between observers’ overall performance and their median response latency (Spearman’s rho = 0.34, p = 0.29).

In contrast to their similar overall performance, the observers’ response criteria differed substantially (p<0.001, Friedman test). For example, the proportion of discs in the VF-positive category which were classified as “definitely damaged” ranged from 15% to 90%, while the proportion of discs in the VF-negative category classified as “definitely healthy” ranged from 8% to 68%. In Fig 2A-L, the response criterion is represented by the inclination of the red line with its origin in the bottom right corner. If the responses had been exactly balanced between the “damaged” and “healthy” categories, the inclination of the line would be 45 degrees. A more horizontal line represents a more conservative criterion (less likely to respond with “probably damaged” or “definitely damaged”), while a more vertical line represents a less conservative criterion. There was no relationship between the observers’ performance (AUROC) and their response criterion (Spearman’s rho = 0.41, p = 0.18).

To derive the “best possible” performance as a reference for future observers, the responses of the expert panel were combined by calculating the mean response obtained for each image. The ROC curve for the combined responses (grey curve in Fig. 2A-L) enclosed an area of 0.87.

Fig. 2. Receiver operating characteristic (ROC) curves for the classification of optic disc photographs by the 12 expert observers (A-L), with a reference standard of visual field damage. The x-axis (positive rate in the VF-negative group) measures specificity to visual field damage, while the y-axis (positive rate in the VF-positive group) gives the sensitivity. Point A (most conservative criterion) shows the trade-off between sensitivity and specificity when only “definitely damaged” responses are interpreted as test positives. Point D (the least conservative criterion) shows the trade-off when all but “definitely healthy” responses are interpreted as test positives. Boxplots (right) give the distributions of response latencies, and the number of times each response was selected.

Fig. 2 (cont). To facilitate comparison, the grey ROC curve and the dotted grey line represent the performance and the criterion of the group as a whole, respectively. Results provided in numerical format are the area under the ROC curve (AUC); the percentage of the AUC as compared to that of the entire group, (individual ROC area - 0.5) / (expert panel ROC area - 0.5); the Spearman rank correlation of the individual’s responses with those of the entire group; the mean difference between repeated responses; and the average response as a measure of criterion (-2 = “definitely healthy”, -1 = “probably healthy”, 0 = “not sure”, 1 = “probably damaged”, and 2 = “definitely damaged”).

To investigate how the performance of an expert panel varies with the number of contributing observers, the area under the ROC curve was derived for all possible combinations of 2, 3, 4, etc, up to 11, observers (Fig. 3). The limit of the ROC area was approached with 6 or more observers, and it appeared that a further increase in the number of observers would not have had a substantial effect on the performance of the panel.

Fig. 3. Performance (area under ROC curve) of the combined expert panel as a function of the number of included observers. All possible combinations of 2 to 11 observers were evaluated. The mean area under the ROC curve approaches its limit with approximately 6 observers.

Individual observers’ Spearman rank correlation coefficients with the combined expert panel ranged from 0.62 to 0.86, with a median of 0.79. There was no relationship between the Spearman coefficient and the area under the ROC curve (r = 0.09, p = 0.78).

Discussion

The objective of this work was to establish an easy-to-use tool for clinicians, trainees, and students to assess their skill at interpreting optic discs for signs of glaucoma-related damage, and to provide data from a panel of experts as a reference for future observers. The study also showed that meaningful experiments with Discus can be performed within a relatively short time.

All observers in this study had ROC areas significantly smaller than 1, and even when the judgments of the observers were averaged, the combined responses of the panel failed to discriminate perfectly between optic discs in the VF-positive and VF-negative groups. These findings are not surprising, given the lack of a strong association between structure and function in early glaucoma that has been reported by many previous studies.28-33 However, the experiments provide a powerful illustration of how difficult it is to make diagnostic decisions in glaucoma based solely on the optic disc.

At a fixed specificity of 90%, the combined panel’s sensitivity to visual field loss was 60%. This is within the range of performances previously reported for clinical observers and objective imaging tools.9, 34-37 Unfortunately, objective imaging data are not available for the patients in the current dataset and we are therefore unable to perform a direct comparison. However, the methodology developed in this paper may prove useful for future studies that compare diagnostic performance between clinicians and imaging tools in different clinical settings. A potential weakness of our study was the relatively small size of the expert group (n=12). However, by averaging every possible combination of 2 to 11 observers within the group, we demonstrated that our panel was likely to have attained near-maximum performance, and that a larger group of observers would have been unlikely to change our findings substantially.

One challenging issue is how to derive complete and easily interpretable summary measures of performance in the absence of a reference standard of optic disc damage. Such summary measures would be useful for giving feedback and for establishing targets for students and trainees. We used visual field data as the criterion to separate optic disc images into VF-positive and VF-negative groups, and there was no selection based on the presence or type of optic disc damage which would have biased our sample.38-40 The ROC area therefore measures the statistical separation between an observer’s responses to optic discs in eyes with and without visual field damage.41, 42 However, owing to the lack of a strong correlation between structure and function, visual field loss is not an ideal metric for optic disc damage in early glaucoma. For example, it is likely that a substantial proportion of the VF-negative images show early structural damage, whereas some optic discs in the VF-positive group may still appear healthy.

We have attempted to address the problem of a missing reference standard in two complementary ways. First, a new observer’s ROC area can be compared to that of the expert panel, such that the statistic is re-scaled to cover a potential range from near zero (corresponding to chance performance, AUROC = 0.5) to around 100% (AUROC = 0.87, the performance of the expert panel).
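This re-scaling is a one-line calculation; a small sketch (Python, using the AUROC values reported in this paper — 0.87 for the combined panel, 0.79 for the mean individual observer):

```python
# Re-scaled performance statistic: an observer's AUROC expressed as a
# percentage of the expert panel's, with 0.5 (chance) mapped to zero.
def percent_of_panel(auc, panel_auc=0.87):
    return 100 * (auc - 0.5) / (panel_auc - 0.5)

print(round(percent_of_panel(0.79), 1))  # an observer at the group mean
print(percent_of_panel(0.5))             # chance performance scores 0
```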

Second, we suggest that the Spearman rank correlation coefficient may be useful as a measure of agreement between a future observer’s responses and those of the expert panel.43 Because this coefficient takes into account the relative ranking of the responses, and not their overall magnitude, it is independent of the observer’s response criterion. Consider, for example, three images graded as “probably damaged”, “probably healthy”, and “definitely healthy” by the expert group. An observer responding with “definitely damaged”, “not sure”, and “probably healthy” would differ in criterion but agree on the relative ranking of damage, and their rank correlation with the expert panel would be 1.0 (perfect). Our data suggest that observers may achieve similar ROC areas with rather different responses (consider observers D and F as an example), and the lack of association between the ROC area and the rank correlation means that these statistics measure somewhat independent aspects of decision-making.
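The worked example above can be checked numerically. The sketch below (Python; responses coded on the study’s -2 to +2 scale) implements the textbook Spearman formula for untied data:

```python
# Two observers who differ in criterion but rank the three images
# identically have a Spearman rank correlation of exactly 1.

def spearman(x, y):
    """Spearman rank correlation (assumes no tied values)."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0] * len(v)
        for rank, i in enumerate(order):
            r[i] = rank + 1
        return r
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

expert   = [1, -1, -2]   # "probably damaged", "probably healthy", "definitely healthy"
observer = [2, 0, -1]    # "definitely damaged", "not sure", "probably healthy"
print(spearman(expert, observer))  # 1.0
```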

A surprising finding was that individual observers in our study adopted very different response criteria, even though they had been provided with identical written instructions and identical information on the source of the images and the distribution of visual field damage in the sample (compare observers A and E, for example). It is possible that we might have been able to control the criteria more closely, for example by instructing observers to use the “probably damaged” category if they believed that the chances of the eye being healthy were less than, say, 10%. More importantly, however, our findings underscore the need to distinguish between differences in diagnostic performance and differences in diagnostic criterion whenever subjective ratings of optic disc damage are involved. This is the principal reason why we avoided the use of kappa statistics, which measure overall agreement but do not isolate differences in criterion.44, 45

The outpatient clinic from which our images were obtained sees a relatively high proportion of patients suspected of having glaucoma who do not have visual field loss. Because our image sample is not representative of an unselected population, the ROC curves are likely to underestimate clinicians’ true performance at detecting glaucoma by ophthalmoscopy. However, the use of a “difficult” data set may also be seen as an advantage, as it allows observers’ performance to be assessed on the type of optic disc more likely to cause diagnostic problems in clinical practice.

In addition to the source of our images, there are several other reasons why performance on Discus should not be regarded as a truly representative measure of an observer’s real-world diagnostic capability. First, we used non-stereoscopic images. Stereoscopic images would have been more representative of slitlamp biomicroscopy, the current standard of care, and there is evidence that many features of glaucomatous damage may be more clearly apparent in stereoscopic images.46 However, the gain over monoscopic images is probably not large.47-50 Second, Discus does not permit a comparison of fellow eyes, which often provides important clues in patients with early damage.51 Third, by displaying photographic images on a computer monitor we cannot assess an observer’s aptitude at obtaining an adequate view of the optic disc in real patients.

Notwithstanding these limitations, we believe that Discus provides a useful assessment of some important aspects of recognising glaucomatous optic disc damage. Further studies with Discus are now being undertaken to examine the performance of ophthalmology residents and other trainees as compared with our expert group. These studies will also provide insight into which features of glaucomatous optic disc damage are least well recognised, and how clinicians use information on prior probability in their clinical decision-making.

Conclusions

The Discus software may be useful in the assessment and training of clinicians involved in the detection of glaucoma. It is freely available from http://discusproject.blogspot.com, and interested users may analyse their results using an automated web server on this site.

Acknowledgements

Robert Harper, Amanda Harding and Jo Marcks of the OLGA clinic at the Manchester Royal Eye Hospital supported this project and contributed ideas. Jonathan Layes (Medicine) and Bijan Farhoudi (Computer Science) of Dalhousie University helped to improve the software and to implement an automated analysis on our server. We are most grateful to all 12 anonymous observers for their participation.

Appendix

At present, Discus is available only for the Windows operating system. The software can be called with different start-up parameters. These parameters (and their defaults) are:

1) Duration of image presentations, in ms (10000)
2) Rate of repetitions in the visual field positive group (0.1)
3) Rate of repetitions in the visual field negative group (0.3)
4) Save-To-Desktop status (1)

If the Save-To-Desktop status is set to 1, a tab-delimited file will be saved to the desktop. The user can then upload this file to our server and retrieve their results after a few seconds.

15

279

References

280 281

1.

Weinreb RN, Tee Khaw P. Primary open-angle glaucoma. Lancet 2004;363:1711-1720.

282

2.

Garway-Heath DF. Early diagnosis in glaucoma. In: Nucci C, Cerulli L, Osborne NN, Bagetta G (eds), Progress in Brain Research; 2008:47-57.

3.

Gordon MO, Beiser JA, Brandt JD, et al. The Ocular Hypertension Treatment Study: Baseline Factors That Predict the Onset of Primary Open-Angle Glaucoma. Archives of Ophthalmology 2002;120:714.

4.

Keltner JL, Johnson CA, Anderson DR, et al. The association between glaucomatous visual fields and optic nerve head features in the Ocular Hypertension Treatment Study. Ophthalmology 2006;113:1603-1612.

5.

Predictive Factors for Open-Angle Glaucoma among Patients with Ocular Hypertension in the European Glaucoma Prevention Study. Ophthalmology 2007;114:3-9.

6.

6. Broadway DC, Nicolela MT, Drance SM. Optic Disk Appearances in Primary Open-Angle Glaucoma. Survey of Ophthalmology 1999;43:223-243.

7. Jonas JB, Budde WM, Panda-Jonas S. Ophthalmoscopic evaluation of the optic nerve head. Survey of Ophthalmology 1999;43:293-320.

8. Lin SC, Singh K, Jampel HD, et al. Optic Nerve Head and Retinal Nerve Fiber Layer Analysis: A Report by the American Academy of Ophthalmology. Ophthalmology 2007;114:1937-1949.

9. Sharma P, Sample PA, Zangwill LM, Schuman JS. Diagnostic Tools for Glaucoma Detection and Management. Survey of Ophthalmology 2008;53.

10. Zangwill LM, Bowd C, Weinreb RN. Evaluating the Optic Disc and Retinal Nerve Fiber Layer in Glaucoma II: Optical Image Analysis. Seminars in Ophthalmology 2000;15:206-220.

11. Mowatt G, Burr JM, Cook JA, et al. Screening Tests for Detecting Open-Angle Glaucoma: Systematic Review and Meta-analysis. Invest Ophthalmol Vis Sci 2008;49:5373-5385.

12. Fingeret M, Medeiros FA, Susanna Jr R, Weinreb RN. Five rules to evaluate the optic disc and retinal nerve fiber layer for glaucoma. Optometry 2005;76:661-668.

13. Susanna Jr R, Vessani RM. New findings in the evaluation of the optic disc in glaucoma diagnosis. Current Opinion in Ophthalmology 2007;18:122-128.

14. Caprioli J. Clinical evaluation of the optic nerve in glaucoma. Transactions of the American Ophthalmological Society 1994;92:589.

15. Lichter PR. Variability of expert observers in evaluating the optic disc. Transactions of the American Ophthalmological Society 1976;74:532.

16. Tielsch JM, Katz J, Quigley HA, Miller NR, Sommer A. Intraobserver and interobserver agreement in measurement of optic disc characteristics. Ophthalmology 1988;95:350-356.

17. Nicolela MT, Drance SM, Broadway DC, Chauhan BC, McCormick TA, LeBlanc RP. Agreement among clinicians in the recognition of patterns of optic disk damage in glaucoma. American Journal of Ophthalmology 2001;132:836-844.

18. Spalding JM, Litwak AB, Shufelt CL. Optic nerve evaluation among optometrists. Optom Vis Sci 2000;77:446-452.

19. Harper R, Reeves B, Smith G. Observer variability in optic disc assessment: implications for glaucoma shared care. Ophthalmic Physiol Opt 2000;20:265-273.

20. Harper R, Radi N, Reeves BC, Fenerty C, Spencer AF, Batterbury M. Agreement between ophthalmologists and optometrists in optic disc assessment: training implications for glaucoma co-management. Graefe's Archive for Clinical and Experimental Ophthalmology 2001;239:342-350.

21. Spry PG, Spencer IC, Sparrow JM, et al. The Bristol Shared Care Glaucoma Study: reliability of community optometric and hospital eye service test measures. British Journal of Ophthalmology 1999;83:707-712.

22. Abrams LS, Scott IU, Spaeth GL, Quigley HA, Varma R. Agreement among optometrists, ophthalmologists, and residents in evaluating the optic disc for glaucoma. Ophthalmology 1994;101:1662-1667.

23. Varma R, Steinmann WC, Scott IU. Expert agreement in evaluating the optic disc for glaucoma. Ophthalmology 1992;99:215-221.

24. Azuara-Blanco A, Katz LJ, Spaeth GL, Vernon SA, Spencer F, Lanzl IM. Clinical agreement among glaucoma experts in the detection of glaucomatous changes of the optic disk using simultaneous stereoscopic photographs. American Journal of Ophthalmology 2003;136:949-950.

25. Sung VCT, Bhan A, Vernon SA. Agreement in assessing optic discs with a digital stereoscopic optic disc camera (Discam) and Heidelberg retina tomograph. British Journal of Ophthalmology 2002:196-202.

26. Ihaka R, Gentleman R. R: A Language for Data Analysis and Graphics. Journal of Computational and Graphical Statistics 1996;5:299-314.

27. Sing T, Sander O, Beerenwinkel N, Lengauer T. ROCR: visualizing classifier performance in R. Bioinformatics 2005;21:3940-3941.

28. Anderson RS. The psychophysics of glaucoma: Improving the structure/function relationship. Progress in Retinal and Eye Research 2006;25:79-97.

29. Garway-Heath DF, Holder GE, Fitzke FW, Hitchings RA. Relationship between electrophysiological, psychophysical, and anatomical measurements in glaucoma. Investigative Ophthalmology and Visual Science 2002;43:2213-2220.

30. Johnson CA, Cioffi GA, Liebmann JR, Sample PA, Zangwill LM, Weinreb RN. The relationship between structural and functional alterations in glaucoma: A review. Seminars in Ophthalmology 2000;15:221-233.

31. Harwerth RS, Quigley HA. Visual field defects and retinal ganglion cell losses in patients with glaucoma. Archives of Ophthalmology 2006;124:853-859.

32. Caprioli J. Correlation of visual function with optic nerve and nerve fiber layer structure in glaucoma. Survey of Ophthalmology 1989;33:319-330.

33. Caprioli J, Miller JM. Correlation of structure and function in glaucoma. Quantitative measurements of disc and field. Ophthalmology 1988;95:723-727.

34. Deleon-Ortega JE, Arthur SN, McGwin Jr G, Xie A, Monheit BE, Girkin CA. Discrimination between glaucomatous and nonglaucomatous eyes using quantitative imaging devices and subjective optic nerve head assessment. Invest Ophthalmol Vis Sci 2006;47:3374-3380.

35. Mardin CY, Jünemann AGM. The diagnostic value of optic nerve imaging in early glaucoma. Current Opinion in Ophthalmology 2001;12:100-104.

36. Greaney MJ, Hoffman DC, Garway-Heath DF, Nakla M, Coleman AL, Caprioli J. Comparison of optic nerve imaging methods to distinguish normal eyes from those with glaucoma. Investigative Ophthalmology and Visual Science 2002;43:140-145.

37. Harper R, Reeves B. The sensitivity and specificity of direct ophthalmoscopic optic disc assessment in screening for glaucoma: a multivariate analysis. Graefe's Archive for Clinical and Experimental Ophthalmology 2000;238:949-955.

38. Whiting P, Rutjes AWS, Reitsma JB, Glas AS, Bossuyt PMM, Kleijnen J. Sources of Variation and Bias in Studies of Diagnostic Accuracy: A Systematic Review. Annals of Internal Medicine 2004;140:189-202.

39. Medeiros FA, Ng D, Zangwill LM, Sample PA, Bowd C, Weinreb RN. The effects of study design and spectrum bias on the evaluation of diagnostic accuracy of confocal scanning laser ophthalmoscopy in glaucoma. Investigative Ophthalmology and Visual Science 2007;48:214-222.

40. Harper R, Henson D, Reeves BC. Appraising evaluations of screening/diagnostic tests: the importance of the study populations. British Journal of Ophthalmology 2000;84:1198.

41. Hanley JA. Receiver operating characteristic (ROC) methodology: The state of the art. Critical Reviews in Diagnostic Imaging 1989;29:307-335.

42. Hanley JA, McNeil BJ. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 1982;143:29-36.

43. Svensson E. A coefficient of agreement adjusted for bias in paired ordered categorical data. Biometrical Journal 1997;39:643-657.

44. Fleiss JL. Measuring nominal scale agreement among many raters. Psychological Bulletin 1971;76:378-382.

45. Feinstein AR, Cicchetti DV. High agreement but low kappa: I. The problems of two paradoxes. J Clin Epidemiol 1990;43:543-549.

46. Morgan JE, Sheen NJL, North RV, Choong Y, Ansari E. Digital imaging of the optic nerve head: Monoscopic and stereoscopic analysis. British Journal of Ophthalmology 2005;89:879-884.

47. Hrynchak P, Hutchings N, Jones D, Simpson T. A comparison of cup-to-disc ratio measurement in normal subjects using optical coherence tomography image analysis of the optic nerve head and stereo fundus biomicroscopy. Ophthalmic and Physiological Optics 2004;24:543-550.

48. Parkin B, Shuttleworth G, Costen M, Davison C. A comparison of stereoscopic and monoscopic evaluation of optic disc topography using a digital optic disc stereo camera. British Journal of Ophthalmology 2001:1347-1351.

49. Vingrys AJ, Helfrich KA, Smith G. The role that binocular vision and stereopsis have in evaluating fundus features. Optom Vis Sci 1994;71:508-515.

50. Rumsey KE, Rumsey JM, Leach NE. Monocular vs. stereospecific measurement of cup-to-disc ratios. Optometry and Vision Science 1990;67:546-550.

51. Harasymowycz P, Davis B, Xu G, Myers J, Bayer A, Spaeth GL. The use of RADAAR (ratio of rim area to disc area asymmetry) in detecting glaucoma and its severity. Canadian Journal of Ophthalmology 2004;39:240-244.
