Dr N. M. Jansonius Department of Ophthalmology University Hospital Groningen PO Box 30 001 9700 RB Groningen The Netherlands Tel: + 31 50 361 2510 Fax: + 31 50 361 1709 Email: firstname.lastname@example.org
Purpose: To describe the baseline data of a large cohort of patients included for follow-up with perimetry using the frequency doubling technique (FDT) and with quantification of the retinal nerve fibre layer as assessed by GDx, and to calculate the sensitivity and specificity of both devices from these baseline data.
Methods: Regular visitors to our glaucoma service were included. All subjects were followed for at least 4 years with FDT in full-threshold mode, GDx and conventional perimetry. Patients were classified as having either glaucoma or suspect glaucoma, according to baseline perimetry results. In addition, a group of healthy subjects was recruited outside the hospital.
Results: A total of 452 glaucoma patients, 423 glaucoma suspects and 237 healthy subjects were incorporated into the analyses. Sensitivities for both FDT and GDx were fixed at 0.90. For the group as a whole, the specificity was 0.81 for FDT, using number of depressed test-points p < 0.01 in the total deviation probability plot with a cut-off point > 1, and 0.78 for GDx, using the Number, with a cut-off point > 29. The area under the receiver operating characteristic (ROC) curve was 0.92 for FDT and 0.94 for GDx. Of the subjects with suspect glaucoma, 75% showed normal FDT test results and 52% showed normal GDx results. Unlike FDT, GDx failed to detect some moderate/severe glaucoma cases.
Conclusions: The performances of FDT and GDx are approximately equivalent in terms of sensitivity, specificity and area under the ROC curve. In glaucoma suspects, GDx in particular yielded a rather high percentage of positive test results. The majority of these positive test results are presumably false-positive results rather than results indicating preperimetric glaucoma.
It is not known whether the new devices are suitable for follow-up. We therefore decided to follow a large group of patients, using both FDT and GDx.
We included almost 1000 patients attending the glaucoma service of our hospital and followed this cohort prospectively for at least 4 years. The city of Groningen and its surrounding rural area are ideal for carrying out a long-term follow-up study because of the remote location of Groningen within the Netherlands and its low migration rate.
The aim of the present study is to describe the baseline data of the cohort and to evaluate some of the screening properties of FDT and GDx from these data. We recruited healthy subjects from outside the hospital to determine the specificity. These healthy subjects were compared with the patients from the glaucoma service with a normal baseline visual field, termed ‘glaucoma suspects’. This comparison is interesting because it might provide insight into the ability of FDT/GDx to detect so-called preperimetric glaucoma.
Material and Methods
Patient data and study protocol
The inclusion period for our study ran from 1st July 2000 to 30th June 2001. During this period, 1051 patients (509 men and 542 women) attended the glaucoma service at our outpatients department. Their mean age was 65 years (SD 15 years, range 13 − 91 years). Patients with a sufficient number of reliable visual fields to allow classification (see below), who agreed to participate (informed consent), were included in the study. No further exclusion criteria were applied in order to obtain a sample as representative as possible of visitors to a glaucoma service. At the baseline visit, at least one FDT and GDx test result was obtained. In 187 patients, additional FDT measurements were performed to explore a possible learning effect in FDT. No significant learning effect was found in this group (Heeg et al. 2003).
In addition to the patients, 237 healthy subjects were recruited outside the hospital by advertising in old people's homes, the local blood bank and other public places. People who visited an ophthalmologist regularly for glaucoma-related reasons, such as glaucoma or ocular hypertension, or who had a positive family history of glaucoma, were excluded. No further exclusion criteria were applied in order to get a sample as representative as possible of the population studied. The mean age of these healthy subjects was 63 years (SD 11 years, range 33 − 94 years).
We used the Humphrey Field Analyser (HFA) 30–2 SITA Fast for perimetry (Carl Zeiss Meditec Inc., Dublin, California, USA). In some cases, Goldmann perimetry (Haag Streit AG, Bern, Switzerland) was applied in addition to HFA. When using the HFA, an abnormal visual field was defined as one of the following:
1. glaucoma hemifield test (GHT) outside normal limits;
2. pattern standard deviation (PSD) p < 0.05, or
3. three adjacent non-edge points p < 0.05 in the pattern deviation probability plot, of which at least one point was p < 0.01, with all points on the same side of the horizontal meridian (LTG-P criterion; Katz et al. 1991).
Defects had to be compatible with glaucoma and without any other explanation. Abnormal first visual fields were discarded because of a learning effect in static computerized perimetry (Heijl et al. 1989; Heijl & Bengtsson 1996). Patients with a reproducible visual field defect in at least one eye were classified as glaucoma patients. Therefore, at least three visual fields were required for the diagnosis of glaucoma.
Patients attending our glaucoma service who had a normal latest visual field in both eyes were considered as glaucoma suspects. Patients with an insufficient number of visual fields to allow classification (such as a patient with an abnormal visual field that was not confirmed or falsified within the inclusion period) were excluded from the analysis.
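The classification rules above can be sketched in code. The following is a minimal illustration only, not the procedure actually used in the study: the `Field` attributes and function names are hypothetical, the clinical judgement that a defect must be compatible with glaucoma is not modelled, and "reproducible" is assumed here to mean that the two most recent fields of an eye are abnormal once an abnormal first field has been discarded.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Field:
    """One HFA visual field; all attribute names are hypothetical."""
    ght_outside_normal_limits: bool
    psd_p_below_005: bool
    ltgp_cluster: bool  # three adjacent non-edge points meeting the LTG-P rule


def is_abnormal(f: Field) -> bool:
    # A field is abnormal if any one of the three criteria is met.
    return f.ght_outside_normal_limits or f.psd_p_below_005 or f.ltgp_cluster


def eye_status(fields: List[Field]) -> Optional[str]:
    """Return 'defect', 'normal' or None (not classifiable) for one eye.

    Assumption: fields are in chronological order, and a defect is
    reproducible when the last two retained fields are both abnormal.
    """
    flags = [is_abnormal(f) for f in fields]
    if flags and flags[0]:
        flags = flags[1:]      # discard an abnormal first field (learning effect)
    if not flags:
        return None
    if not flags[-1]:
        return "normal"        # latest field is normal
    if len(flags) >= 2 and flags[-2]:
        return "defect"        # abnormal result confirmed on a second field
    return None                # abnormal, but neither confirmed nor falsified


def classify_patient(right: List[Field], left: List[Field]) -> str:
    statuses = [eye_status(right), eye_status(left)]
    if "defect" in statuses:
        return "glaucoma"      # reproducible defect in at least one eye
    if statuses == ["normal", "normal"]:
        return "suspect"       # normal latest field in both eyes
    return "excluded"          # insufficient fields to allow classification
```

Under these assumptions an eye needs at least three fields before it can contribute a glaucoma diagnosis when its first field is abnormal, which is in line with the requirement stated above. Patients with only one functional eye would need a separate branch.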
An HFA visual field was considered reliable if fixation losses were ≤ 20%, false-positives ≤ 10% and false-negatives ≤ 10%. In the case of an isolated increased number of fixation losses, fixation was considered acceptable if a well defined blind spot was present and the technician had observed a stable fixation. False-positives > 15% were considered unacceptable. In the case of 10–15% false-positives, the field was rejected if the GHT indicated an abnormally high sensitivity, if more depressed test-points were found in the pattern deviation probability plot than in the total deviation probability plot, if individual points > 37 dB were observed, or if the blind spot was absent (pseudo-fixation loss). An isolated increased number of false-negatives was considered compatible with glaucoma (Bengtsson & Heijl 2000). In the case of either repeatedly unreliable HFA test results (two to four attempts were usually performed) or loss of central vision, Goldmann perimetry was performed in addition to the HFA. For a Goldmann field, a defect had to be a paracentral/arcuate Bjerrum scotoma, a nasal step, a temporal rest or a central island.
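As an illustration, the HFA reliability rules described above might be encoded as follows. This is a sketch under stated assumptions: the function and parameter names are hypothetical, percentages are passed as numbers between 0 and 100, and a field with more than one reliability index outside its limit is treated as unreliable, which is an assumption of the sketch because the text only addresses isolated deviations.

```python
def hfa_field_reliable(
    fixation_loss_pct: float,
    false_pos_pct: float,
    false_neg_pct: float,
    blind_spot_well_defined: bool = False,
    stable_fixation_observed: bool = False,
    ght_abnormally_high_sensitivity: bool = False,
    more_depressed_points_pd_than_td: bool = False,
    any_point_above_37_db: bool = False,
    blind_spot_absent: bool = False,
) -> bool:
    """Return True if the field would be accepted as reliable."""
    fl_high = fixation_loss_pct > 20
    fp_high = false_pos_pct > 10
    fn_high = false_neg_pct > 10

    if not (fl_high or fp_high or fn_high):
        return True                       # all indices within limits
    if false_pos_pct > 15:
        return False                      # always unacceptable
    if sum([fl_high, fp_high, fn_high]) > 1:
        return False                      # assumption: exceptions apply to isolated deviations only
    if fl_high:
        # Isolated fixation losses: accept if fixation was evidently good.
        return blind_spot_well_defined and stable_fixation_observed
    if fn_high:
        return True                       # isolated false-negatives are compatible with glaucoma
    # Remaining case: 10-15% false-positives; reject on any sign of a "trigger-happy" field.
    return not (
        ght_abnormally_high_sensitivity
        or more_depressed_points_pd_than_td
        or any_point_above_37_db
        or blind_spot_absent
    )
```

For example, `hfa_field_reliable(30, 5, 5)` is rejected, whereas the same indices with a well defined blind spot and observed stable fixation are accepted.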
Frequency doubling technique
Testing was performed with the frequency doubling perimeter (FDT software Version 2.60; Carl Zeiss Meditec Inc., Dublin, California, USA) using the C-20 full-threshold mode. Four parameters were studied: Mean deviation (MD) in dB, PSD in dB, the number of depressed test-points p < 0.01 in the total deviation probability plot (TD), and the number of depressed test-points p < 0.01 in the pattern deviation probability plot (PD). For details of frequency doubling perimetry, see Maddess & Henry (1992), Johnson & Samuels (1997) and Maddess et al. (1999).
Nerve fibre analyser (GDx)
Testing was performed using the nerve fibre analyser (GDx; Version 2.0.10; Laser Diagnostic Technologies, San Diego, California, USA). Six parameters were studied: superior maximum in µm (supM), inferior maximum in µm (infM), ellipse modulation (eMod), ellipse average in µm (eAvg), symmetry (symm) and the Number. The Number is an integer between 0 and 100 computed by the device. The higher the Number, the more likely it is that a patient has glaucoma. An explanation of these parameters has been published by Weinreb et al. (1998). In addition, subjective assessments have been published by Choplin & Lundy (2001) and Nicolela et al. (2001).
In our study, the subjective assessment (subj) was scored on an ordinal scale of 0 − 2, where 0 indicated no signs of glaucoma, 1 indicated a borderline case and 2 indicated clear signs of glaucomatous damage.
Six images were recorded for each eye. Another six images were recorded when the first series did not contain an image of sufficiently high quality.
High image quality required a well centred optic nerve head, an in-focus image, equal illumination in all quadrants and an absence of motion artefacts. A mean image was created when at least two images with high image quality were available. For details of scanning laser polarimetry, see Weinreb et al. (1990) and Dreher & Reiter (1992).
In this study, we counted patients, not eyes. An FDT or GDx test was considered negative if the test could be finished and was found to be normal in both eyes. In all other cases the test was considered positive (see Discussion). In the case of subjects with only one functional eye, only that eye was used for classifying the patient as having glaucoma or as a glaucoma suspect and for determining the test result.
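The per-patient rule can be written compactly. The sketch below is illustrative only; the function name and the encoding of eye results (`True` for a finished, normal test, `False` for an abnormal test, `None` for a test that could not be finished) are assumptions introduced here.

```python
from typing import Optional, Sequence


def patient_test_positive(eye_results: Sequence[Optional[bool]]) -> bool:
    """Combine per-eye FDT or GDx results into one patient-level result.

    eye_results holds one entry per functional eye:
    True = finished and normal, False = abnormal, None = could not be finished.
    The test is negative only if every functional eye is finished and normal.
    """
    if not eye_results:
        raise ValueError("at least one functional eye is required")
    return not all(result is True for result in eye_results)
```

A patient with one normal eye and one unfinished test is therefore counted as positive, and a patient with a single functional eye is judged on that eye alone.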
Sensitivities and specificities can be calculated for any cut-off point. New devices should not fail to detect most of the manifest glaucoma cases that can be diagnosed using older techniques. We, therefore, determined cut-off points of various parameters so that sensitivity = 0.90, and studied the resulting specificity. The 95% confidence intervals (95% CIs) of sensitivity and specificity were calculated from:
p ± 1.96*sqrt[p(1 – p)/n]
where p is the proportion (sensitivity or specificity) and n the number of glaucoma patients (in the case of sensitivity) or the number of healthy subjects (in the case of specificity). This approximation is justified for n > 100 (Abramson & Gahlinger 2001). The chi-squared test with Yates' correction was applied for the comparison of independent proportions. Paired proportions were compared using McNemar's test with Yates' correction.
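The two steps described above, choosing a cut-off that fixes sensitivity at about 0.90 and attaching a 95% CI to a proportion, can be sketched as follows. The function names are hypothetical, and the sketch assumes that higher parameter values are more abnormal (as for TD or the Number); a parameter such as MD, where lower is more abnormal, would have to be negated first.

```python
import math
from typing import Sequence, Tuple


def wald_ci(p: float, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Normal-approximation interval p +/- z*sqrt(p(1-p)/n), clipped to [0, 1]."""
    half_width = z * math.sqrt(p * (1.0 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)


def cutoff_for_sensitivity(patient_scores: Sequence[float], target: float = 0.90) -> float:
    """Largest cut-off c such that the fraction of patients with score > c is >= target."""
    n = len(patient_scores)
    # Walk from the strictest to the most lenient cut-off; the final candidate
    # lies below every score and therefore always reaches sensitivity 1.0.
    candidates = sorted(set(patient_scores), reverse=True) + [min(patient_scores) - 1]
    for c in candidates:
        sensitivity = sum(score > c for score in patient_scores) / n
        if sensitivity >= target:
            return c
    raise AssertionError("unreachable")


def specificity(healthy_scores: Sequence[float], cutoff: float) -> float:
    """Fraction of healthy subjects whose score does not exceed the cut-off."""
    return sum(score <= cutoff for score in healthy_scores) / len(healthy_scores)
```

As a check on the formula, `wald_ci(0.81, 237)` gives an interval of roughly 0.76 to 0.86 for a specificity of 0.81 in 237 healthy subjects.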
Progression of manifest glaucoma occurs at a rate of about 5–10% per year (Heijl et al. 2002), and conversion of ocular hypertension to glaucoma at a rate of approximately 1–2% per year (Kass et al. 2002). From these rates it can be calculated that our sample size should have enabled us to identify approximately 140 patients with progression (at least 120 with a 95% CI) and about 25 patients with conversion (at least 18 with a 95% CI) during a 4-year follow-up. Consequently, we may need to continue the follow-up of our suspects for longer than 4 years.
For the screening part of the study as presented in this article, our sample size should have enabled us to determine sensitivities of around 0.90 with 95% CIs of 0.87–0.93 and specificities of approximately 0.80, with 95% CIs of 0.75 − 0.85 (Abramson & Gahlinger 2001).
Results
We included and classified 875 of 1051 consecutive patients attending our glaucoma service. Reliable HFA test results were obtained for both eyes in 767 of these 875 patients (HFA-eligible patients). In the remaining 108 patients, Goldmann perimetry was performed in addition to HFA. These were mainly elderly subjects with advanced glaucoma. Of the 875 patients, 452 were classified as glaucoma patients. The remaining 423 patients attending the glaucoma service were considered to be glaucoma suspects. Table 1 summarizes the classification of the patients. All 237 healthy subjects completed at least one FDT test. A GDx was performed in a subgroup of 108 healthy subjects. Analyses were carried out for the groups as a whole (Tables 2 and 3; Fig. 1) in order to get results as representative as possible of the populations studied, and also for several subgroups (Tables 4 − 7).
Table 1. Classification of the patients (n = 875) from the glaucoma service included in the study. Of the patients participating, 767 were able to provide reliable HFA test results; the remaining 108 patients were tested using Goldmann perimetry.
Table 2. Reliability of FDT and GDx test results in glaucoma patients (n = 452), suspects (n = 423) and healthy subjects (n = 237 for FDT; n = 108 for GDx). Percentages are given in brackets.
Rows: fixation losses > 1/6; false-positives > 1/6; false-negatives > 1/3; > 1 abnormal category; high image quality.
Table 3. Specificity based on healthy subjects (n = 237 for FDT; n = 108 for GDx) and the proportion of normal test results in suspects from our glaucoma service (n = 423) for various FDT and GDx parameters. Sensitivity (based on 452 glaucoma patients) was fixed at 0.90. Parameter asymm was defined as [1-symm] = [1-supM/infM].
Parameter (cut-off): sensitivity (95% CI); normal test in suspects (95% CI); specificity (95% CI)
MD (< − 1.8): (0.87 − 0.92); (0.64 − 0.73); (0.66 − 0.78)
PSD: (0.88 − 0.94); (0.53 − 0.63); (0.62 − 0.74)
TD (> 1): (0.87 − 0.93); (0.70 − 0.79); (0.76 − 0.86)
PD: (0.87 − 0.93); (0.48 − 0.58); (0.53 − 0.65)
supM: (0.87 − 0.92); (0.33 − 0.42); (0.63 − 0.80)
infM: (0.87 − 0.93); (0.34 − 0.44); (0.60 − 0.77)
eMod: (0.87 − 0.92); (0.42 − 0.52); (0.39 − 0.58)
eAvg: (0.87 − 0.93); (0.23 − 0.32); (0.41 − 0.59)
asymm: (0.85 − 0.91); (0.11 − 0.18); (0.12 − 0.27)
Number (> 29): (0.87 − 0.93); (0.47 − 0.57); (0.70 − 0.86)
subj: (0.88 − 0.93); (0.54 − 0.64); (0.77 − 0.91)
MD = mean deviation in dB; PSD = pattern standard deviation in dB; TD = number of depressed test-points p < 0.01 in total deviation probability plot; PD = number of depressed test-points p < 0.01 in pattern deviation probability plot; supM = superior maximum in µm; infM = inferior maximum in µm; eMod = ellipse modulation; eAvg = ellipse average in µm; Number = the Number; subj = subjective assessment.
Table 4. FDT and GDx results from the subgroup of healthy subjects (n = 108) in which both FDT and GDx measurements were performed.
Cross-tabulated categories: TD ≤ 1 and TD > 1 against Number ≤ 29 and Number > 29.
TD = FDT parameter number of depressed test-points p < 0.01 in total deviation probability plot; Number = GDx parameter the Number.
Table 7. Sensitivity of FDT and GDx for patients with early glaucoma (defined as MD(HFA) ≥ − 6 dB in the worse eye; n = 123) and for the remaining moderate and severe glaucoma cases (n = 329).
Group: FDT sensitivity (95% CI); GDx sensitivity (95% CI)
Early glaucoma: (0.55 − 0.72); (0.68 − 0.83)
Moderate/severe glaucoma: (0.99 − 1.00); (0.93 − 0.97)
The worse eye of the HFA-eligible glaucoma patients had an average mean deviation of − 11.5 dB (SD 8.6 dB, range − 31.5 to − 1.5 dB). Glaucoma was bilateral in approximately half the cases. Most of the glaucoma suspects, 271 of the 423, were patients who regularly visited the glaucoma service because of ocular hypertension, defined as > 20 mmHg on at least two separate visits. A total of 104 of the 423 patients had a positive family history of glaucoma, defined as at least one first- or second-degree relative with manifest glaucoma, and 61 of the 423 patients had both ocular hypertension and a positive family history.
Reliable FDT and GDx test results were obtained from the majority of the subjects. This is shown in Table 2. For both FDT and GDx, tests that could not be finished occurred almost exclusively in glaucoma patients. The major reason was loss of central vision in either eye. Fixation losses and false-positive catch trials during FDT were found almost equally in all groups. False-negative catch trials appeared to be limited to glaucoma patients. Image quality for GDx was better in healthy subjects than in glaucoma suspects. Likewise, suspects appeared to give a better image quality than glaucoma patients.
Table 3 shows the sensitivity and proportion of glaucoma suspects with normal baseline FDT/GDx based on the glaucoma service patients (n = 875), and the specificity based on the healthy subjects from outside the hospital (n = 237 for FDT; n = 108 for GDx). The specificity was 0.81 for FDT (TD > 1) and 0.78 for GDx (Number > 29). This difference is not significant (p = 0.62; chi-squared test). Table 4 presents a 2 × 2 table with FDT and GDx results from the subgroup of healthy subjects (n = 108) in which both FDT and GDx measurements were recorded. A paired comparison in this subgroup did not reveal a significant difference between FDT and GDx either (p = 0.12; McNemar's test). Interestingly, healthy subjects with an abnormal FDT test result were usually not the same individuals as healthy subjects with an abnormal GDx. Parameter subj in GDx seemed to have a higher specificity than the Number. This difference, however, was not significant (p = 0.19; McNemar's test). As subj, unlike the Number, requires a well trained interpreter, we decided to confine our analyses to the Number. Interestingly, the specificity of FDT (0.81) did not differ significantly from the proportion of normal test results in suspects (0.75; p = 0.10; chi-squared test). In contrast, the specificity of GDx (0.78) was significantly higher than the proportion of normal test results in suspects (0.52; p < 0.001). Fig. 1 presents the receiver operating characteristic (ROC) curves of TD and the Number. The areas under the curve were 0.92 and 0.94, respectively.
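For readers who wish to repeat comparisons of this kind, the two tests can be sketched with the Python standard library alone. The function names are hypothetical. For McNemar's test the sketch uses the common continuity-corrected statistic (|b − c| − 1)²/(b + c); whether this is exactly the variant applied in the study is an assumption. The p value for one degree of freedom is obtained from the complementary error function.

```python
import math
from typing import Tuple


def _p_value_chi2_1df(statistic: float) -> float:
    # Survival function of the chi-squared distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(statistic / 2.0))


def chi2_yates(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Chi-squared test with Yates' correction for the 2 x 2 table [[a, b], [c, d]].

    Rows are the two independent groups; columns are normal/abnormal results.
    """
    n = a + b + c + d
    corrected = max(0.0, abs(a * d - b * c) - n / 2.0)
    statistic = n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return statistic, _p_value_chi2_1df(statistic)


def mcnemar_corrected(b: int, c: int) -> Tuple[float, float]:
    """Continuity-corrected McNemar test; b and c are the discordant pair counts."""
    if b + c == 0:
        return 0.0, 1.0                      # no discordant pairs, no evidence of a difference
    statistic = max(0.0, abs(b - c) - 1.0) ** 2 / (b + c)
    return statistic, _p_value_chi2_1df(statistic)


if __name__ == "__main__":
    # Hypothetical counts back-calculated from the rounded specificities
    # (192/237 = 0.81 for FDT, 84/108 = 0.78 for GDx); not the study's raw data.
    print(chi2_yates(192, 45, 84, 24))
```

Because counts reconstructed from rounded proportions are only approximate, such a calculation need not reproduce a published p value exactly.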
In 546 of the 875 patients, ophthalmic history was limited to (suspect) glaucoma and possible glaucoma surgery and/or cataract surgery, which had been carried out in at least one eye in 197 and 226 of the 875 subjects, respectively. Table 5 shows the sensitivity and proportion of normal test results in suspects in this subgroup of 546 patients (number of depressed test-points p < 0.01 in total deviation probability plot with cut-off point > 1 for FDT, the Number with cut-off point > 29 for GDx). Table 6 presents an overview of the various ophthalmic disorders found in our sample population.
Table 5. Sensitivity and proportion of normal test results in suspects for a subgroup of patients with ophthalmic history limited to (suspect) glaucoma (n = 546; glaucoma surgery and/or cataract surgery allowed).
Device: sensitivity (95% CI); normal test in suspects (95% CI)
FDT: (0.83 − 0.91); (0.77 − 0.85)
GDx: (0.83 − 0.91); (0.52 − 0.63)
Table 6. Overview of the various ophthalmic disorders as found in our sample population (n = 875). Cataract was only mentioned if it contributed to a visual acuity < 6/12. Data apply to the right eye only. Disorders occurring in less than 1% of the eyes were omitted.
Disorders listed: age-related macular degeneration; branch retinal vein occlusion; central retinal vein occlusion; laser for diabetic retinopathy; other retinal pathology; optic nerve/chiasmal disorders; homonymous field defects.
Glaucoma patients with an MD(HFA) ≥ − 6 dB in the worse eye were classified as early glaucoma cases (n = 123; average MD − 3.1 dB; SD 1.6 dB; range from − 6.0 to + 1.5 dB). Table 7 presents both the sensitivity of this subgroup and that of the remaining moderate/severe glaucoma cases (n = 329). All false-negative FDT test results (i.e. a normal FDT test result in a glaucoma patient) were found within the subgroup of early glaucoma cases. Contrary to this, GDx failed to detect 5% of the moderate/severe glaucoma cases.
Discussion
This article describes the baseline data of 875 patients from our glaucoma service who were prospectively followed using FDT, GDx and conventional perimetry. From these baseline data and data from 237 healthy subjects from outside the hospital, some screening-related properties of FDT and GDx were evaluated. In our glaucoma service, three-quarters of the patients with a normal visual field had a normal FDT test result. This proportion (0.75) is close to the specificity found in healthy subjects from outside the hospital (0.81). Half the patients with a normal visual field had a normal GDx. This proportion (0.52) is lower than the specificity (0.78). With the selected parameters and cut-off points (number of depressed test-points p < 0.01 in the total deviation probability plot > 1 for FDT; the Number > 29 for GDx) nine out of 10 glaucoma cases were detected during a single examination. Unlike FDT, GDx failed to detect some moderate/severe glaucoma cases.
Most of the subjects were able to provide reliable FDT/GDx test results. There are several ways to handle the remaining unfinished and apparently unreliable tests. Although these test results are often excluded from analysis, such an approach may yield an overly optimistic estimate of diagnostic performance. In a real-life screening setting, a test should be conclusive in all cases: patients are either sent home after reassurance or are referred for further work-up. We considered unfinished tests as positive because these patients required further assessment, as did patients with abnormal test results. Apparently unreliable tests (any catch trial category outside normal limits with FDT, or borderline or poor image quality with GDx) could likewise either be interpreted as if they were reliable or be considered positive, irrespective of the test result. We decided to interpret these tests rather than considering them positive because that approach yielded a better diagnostic performance.
How does our approach influence specificity when compared to an approach in which all unfinished and apparently unreliable tests are excluded from analysis? Unfinished tests appeared to be limited to glaucoma patients and glaucoma suspects (Table 2). Therefore, considering these tests as positive did not compromise specificity. To explore the effect of interpreting apparently unreliable tests, we recalculated the specificity after the exclusion of all subjects with an apparently unreliable test. The specificity was now 0.80 for FDT (0.81 before exclusion) and 0.82 for GDx (0.78 before exclusion). These data may suggest that GDx specificity is more dependent on reliability than FDT specificity.
It is an intrinsic problem of long-term studies that devices (or preferred treatment modalities) may change during the study. A 24–2 version of FDT (Matrix) has been developed with 54 test locations instead of 17, and a version of GDx with a variable corneal compensator (GDx-VCC) has been launched. The GDx-VCC seems to display a slightly better diagnostic performance (Zhou & Weinreb 2002; Bagga et al. 2003; Reus et al. 2003). For follow-up, however, the variable corneal compensator might be less important because the corneal birefringence is supposed to be very stable. Neither the Matrix nor the GDx-VCC was available at the time we started our study.
There is ongoing discussion as to whether new devices such as the FDT and GDx are able to detect glaucoma earlier than standard automated perimetry. A related question is whether our suspects with abnormal FDT/GDx test results were false-positive cases or so-called preperimetric glaucoma patients. The FDT did not show a significant difference between the specificity (0.81) and the proportion of suspects with a normal FDT test result (0.75). Our sample size should have enabled us to detect a 0.11 difference between these proportions at a level of significance of 0.05 and a power of 0.90 (Abramson & Gahlinger 2001). From our data it can therefore be concluded that the difference is less than 0.11. This indicates that at least 60–70% of the suspects with abnormal FDT test results were false-positive cases rather than preperimetric glaucoma cases. The proportion of suspects with normal test results (0.52) was lower than the specificity (0.78) in GDx. This finding may suggest that GDx is able to pick up glaucoma earlier than standard automated perimetry and FDT. From the GDx data it could be concluded that up to approximately 30% of the suspects, that is, approximately 60% of the suspects with abnormal GDx results, might actually have been preperimetric glaucoma cases. These percentages, however, seem to be unrealistic. The majority of the suspects, 271 of the 423, suffered from ocular hypertension. Approximately 1–2% of ocular hypertensive patients will convert to manifest glaucoma each year (Kass et al. 2002). If we assume that preperimetric glaucoma precedes manifest glaucoma by 5 years, and that all suspects convert to manifest glaucoma at the same rate as ocular hypertensive patients, then our group of 423 suspects would typically have contained 30 (i.e. 5–10%) preperimetric glaucoma cases. Clearly, this percentage is much lower than 30%.
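The order of magnitude of these figures can be checked with a few lines of arithmetic. The sketch below is one possible reading of the calculation, not necessarily the one used to obtain the numbers quoted above: it multiplies the annual conversion rate by the assumed 5-year lead time without compounding, and it takes the excess of positive GDx results in suspects over the false-positive rate in healthy subjects as an upper estimate of preperimetric cases. The function names are hypothetical.

```python
from typing import Tuple


def expected_preperimetric(n_suspects: int, annual_rate: float, lead_time_years: float) -> float:
    """Expected number of preperimetric cases, using a simple linear approximation."""
    return n_suspects * annual_rate * lead_time_years


def excess_positive_share(specificity: float, normal_in_suspects: float) -> Tuple[float, float]:
    """Return (excess positives as a fraction of all suspects,
    excess positives as a fraction of suspects with a positive test)."""
    positives = 1.0 - normal_in_suspects          # positive tests among suspects
    expected_false_positives = 1.0 - specificity  # positive tests among healthy subjects
    excess = positives - expected_false_positives
    return excess, excess / positives


if __name__ == "__main__":
    for rate in (0.01, 0.02):
        n = expected_preperimetric(423, rate, 5)
        print(f"rate {rate:.0%}: about {n:.0f} cases ({n / 423:.0%} of suspects)")
    excess, share = excess_positive_share(0.78, 0.52)
    print(f"GDx excess positives: {excess:.2f} of suspects, {share:.2f} of positive suspects")
```

With the rates given in the text this yields roughly 21 to 42 expected cases (5–10% of the 423 suspects), whereas the GDx excess amounts to about a quarter of all suspects and about half of the suspects with a positive test, which is of the same order as the approximate percentages discussed above.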
The majority of the suspects with an abnormal GDx, just like those with a positive FDT, are likely to represent false-positive cases rather than preperimetric glaucoma cases. The question remaining, therefore, concerns why there are more false-positive GDx test results in suspects than in healthy subjects. Possible explanations for this are a higher prevalence of other ophthalmic pathology and a lower image quality of the GDx scans in suspects compared to healthy subjects. The prevalence of other ophthalmic disorders is presumably higher in suspects because they are similar to glaucoma patients in that they are often detected accidentally during an ophthalmic consultation for another reason.
In conclusion, the FDT parameter with the best diagnostic performance in terms of sensitivity and specificity appeared to be the number of depressed test-points p < 0.01 in the FDT total deviation probability plot, with a cut-off point of > 1 abnormal test-point. The GDx parameter the Number, with a cut-off point > 29, performed equally well. The GDx test, within a glaucoma service, results in a rather high percentage of positive test results in patients with a normal visual field. Presumably, the majority of these positive test results are false-positive results rather than results indicating preperimetric glaucoma.
It is not possible to conclude from the results of this cross-sectional analysis whether FDT/GDx are able to detect preperimetric glaucoma. Our prospective study must answer this question: following the suspects longitudinally will allow us to investigate the predictive value of both a positive and a negative FDT/GDx baseline test result for conversion to glaucoma.
This research was supported by the Dutch Health Care Insurance Council (CVZ) through the Department of Medical Technology Assessment (MTA) of the University Hospital Groningen, the Netherlands.