Clinical studies in optometry often collect data from either one or both eyes of a subject. A recent survey of ophthalmology journals, however, suggested a variety of different approaches both to eye selection in ‘one-eye’ studies and methods of analysis in ‘two-eye’ studies. Many studies did not describe clearly the procedures used or violated the statistical assumptions of independence of the data.
There are a number of issues raised by the decision to collect data from one or both eyes. First, if one eye per subject is studied, then how is that eye to be selected? Second, if data from both eyes are collected, how should the data be analysed? Measurements obtained from right and left eyes are usually correlated[1, 2] whereas many statistical procedures, such as t-tests, analysis of variance (anova), confidence intervals (CI), or linear regression, assume that observations are an independent sample of the population. An important problem in testing hypotheses is the possibility of making a Type 1 error, i.e., rejecting the null hypothesis (H0) when it is true. Since the variance between eyes is usually less than that between subjects, the overall variance of a sample of measurements combined from both eyes is likely to be an underestimate of the true variance, resulting in an increased risk of a Type 1 error. Hence, data collected from both eyes from a sample of subjects cannot be combined without taking the correlation into account. If measurements are included from both eyes without consideration of their mutual correlation, there may be a significant effect on the results of the experiment.[4, 5] Third, there may be an advantage in using both eyes in a study, especially in an experimental context, as one eye may be used as a control for the other, rather than recruiting a separate control population.[6, 7] An experimental treatment can be applied in healthy subjects to one eye, selected at random (the treated eye), the other acting as a control. In addition, in diseases which are essentially monocular, with the fellow eye essentially healthy, one eye can be regarded as the ‘diseased’ eye and the other as the control. Moreover, information concerning the diseased eye may be obtained from the fellow eye either by application of the conditional model of Rosner or by marginal models in which one directly models the marginal probabilities of disease for each eye.
These methods have been little used in optometry but enable diagnostic information from both eyes to be used explicitly in diagnosis.
The purpose of this article is to provide statistical advice for authors carrying out clinical studies in optometry that involve the question of whether to collect data from one or both eyes. First, current practice is reviewed with reference to articles published in three optometric journals, viz., Ophthalmic and Physiological Optics (OPO), Optometry and Vision Science (OVS), and Clinical and Experimental Optometry (CEO), during the period 2009–2012. Second, statistical advice relevant to the analysis of data from both eyes is described in a variety of experimental contexts.
Of the 230 articles reviewed for this study, published in the period 2009–2012, 148/230 (64%) obtained data from one eye and 82/230 (36%) obtained data from both eyes.
Of the 148 one-eye studies (Table 1), a variety of methods of selecting that eye were used: 52/148 (35%) selected the right eye, 3/148 (2%) the left eye, 19/148 (13%) a randomly selected eye, 34/148 (23%) the better or the worse/diseased eye, 5/148 (3%) the dominant eye, and in 35/148 (24%) no selection criteria were given. There were no significant differences in the distribution of these frequencies among journals (χ² = 14.48, 12 DF, P = 0.31) or years (χ² = 12.93, 15 DF, P = 0.60).
Table 1. Frequency of different methods of selecting the eye in studies employing one eye only in articles published in three optometry journals 2009 – 2012 (OPO, Ophthalmic and Physiological Optics; OVS, Optometry and Vision Science; CEO, Clinical and Experimental Optometry; N, number of articles)
|Selection of eye|
|Journal||N||Right eye||Left eye||Random eye||Better eye||Worse eye||Dominant eye||No criteria given|
Of the 82 two-eye studies (Table 2): (1) 18/82 (22%) made measurements on both eyes but analysed data from one eye only, most commonly the right eye, (2) 10/82 (12%) analysed data from both eyes separately, (3) 10/82 (12%) analysed data from both eyes, taking into account the correlation between eyes, (4) 15/82 (18%) analysed both eyes using one eye as a treated or diseased eye, and the other as a control, and (5) 29/82 (35%) analysed both eyes either without correction or it was unclear exactly how the data had been analysed. Where data from both eyes were analysed, a variety of methods of analysis were employed including clustered anova, nested anova, anova with right and left eye included as a ‘within subject’ factor, and the Bland and Altman test of agreement. There were no significant differences in the distribution of the frequencies of analysis in two-eye studies among journals (χ² = 7.44, 8 DF, P = 0.51) or years (χ² = 16.91, 15 DF, P = 0.32).
Table 2. Frequency of different methods of analysis of data employing both eyes in articles published in three optometry journals 2009 – 2012 (OPO, Ophthalmic and Physiological Optics; OVS, Optometry and Vision Science; CEO, Clinical and Experimental Optometry; N, Number of articles)
|Method of analysis|
|Journal||N||Data from one eye only||Each eye taken separately||Both eyes corrected for correlation||Both eyes (adjacent as control)||Both eyes uncorrected for correlation|
As in previous reports reviewing clinical studies in ophthalmology[1, 10], the optometric data suggest a wide range of current practice in the design and analysis of studies involving one or both eyes. Two main problems were identified in the optometric journals: too many studies failed to use all the available data, and too many did not analyse the data appropriately. These are the same problems identified by Murdoch et al. and Karakosta et al. There was no evidence that these problems varied significantly among the three optometry journals or that the methods employed had changed markedly over the years reviewed.
A significant proportion of studies chose not to exploit, or chose to avoid, the between-eye correlation by measuring one eye only[11, 12], a procedure which can result in the loss of statistical power. In addition, there was no consistency in the procedures employed for the selection of the measured eye. Where either eye could have been chosen, the majority of studies selected the right eye, fewer choosing a randomly selected eye[13-18], and even fewer the left eye. In some studies, the better or dominant eye was selected[19-23] while in other studies the eye was self-selected on clinical grounds, i.e., the eye in which the signs and symptoms of disease were most evident.[24-26] In a significant proportion of articles, the selection criteria were either not described at all or were unclear, a finding similar to that in ophthalmology. Where either eye could be selected, the only statistically valid procedure is to select the eye at random unless an alternative can be justified, because consistently selecting the right eye can result in bias. There may be systematic differences between right and left eyes: some conditions are more prevalent in either the left or the right eye, e.g., early glaucomatous defects may favour the right eye, as in certain types of migraine. As a consequence, selecting the right eye may provide a random sample of right eyes but a biased sample of all eyes.
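Random selection of the study eye is easy to make reproducible. The following is a minimal sketch, not a prescribed procedure: the function name `assign_study_eyes`, the subject labels, and the seed value are all hypothetical, and the only assumption is that each subject's eye is chosen independently with equal probability.

```python
import random


def assign_study_eyes(subject_ids, seed=2013):
    """Choose 'R' or 'L' independently, with equal probability, for each subject.

    The seed is a hypothetical value; fixing it lets the allocation be
    reproduced and reported alongside the study protocol.
    """
    rng = random.Random(seed)
    return {sid: rng.choice(("R", "L")) for sid in subject_ids}


# Hypothetical subject labels S01..S12
subjects = [f"S{i:02d}" for i in range(1, 13)]
allocation = assign_study_eyes(subjects)
for sid, eye in allocation.items():
    print(sid, eye)
```

Recording the seed and the resulting allocation in the methods section would also address the reporting problem noted above, since the selection criterion is then explicit.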
A smaller proportion of studies utilized data obtained from both eyes, and a significant proportion of these did so without correction for correlation, a result similar to that in ophthalmology. Such a procedure is likely to underestimate standard errors (S.E.) and to result in probability (P) values that are too low and CI that are too narrow, these problems becoming more profound as the degree of correlation between eyes increases.[4, 5] Some investigators attempt to avoid this problem by analysing data from one eye only, rejecting data from the fellow eye[11, 28-30]. This approach rejects valid data, reduces the potential power of the study, and raises the ethical question of subjecting patients to measurements that are not used in the subsequent analysis. Other investigators average data from both eyes, and the problems of this procedure are discussed by Newcombe and Duff and Murdoch et al., or analyse data from each eye separately, which avoids the problem of rejecting useful data. Averaging data from both eyes can be a useful procedure if the correlation between the two eyes is high (close to unity) and if a treatment is applied which affects both eyes equally. Obviously, averaging would not be recommended if the treatment is locally administered to one eye. However, as a result of averaging, the data analysis is likely to be less efficient and have less power as it does not utilize the fact that right and left eyes can be regarded as a ‘within subjects’ factor.[33-35]
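The size of the underestimate can be illustrated with the standard result for equally correlated pairs: if both eyes of n subjects are pooled and the between-eye correlation is ρ, the variance of the sample mean is σ²(1 + ρ)/(2n) rather than σ²/(2n). The sketch below assumes equal variance in right and left eyes; the function name and the numerical values are hypothetical.

```python
import math


def se_of_mean_two_eyes(sd, n_subjects, rho):
    """Return (naive, corrected) standard errors of a mean pooled over both eyes.

    naive     : treats the 2n eyes as independent observations
    corrected : inflates the naive value by sqrt(1 + rho), the design effect
                for clusters of size two with between-eye correlation rho
    """
    n_eyes = 2 * n_subjects
    naive = sd / math.sqrt(n_eyes)
    corrected = naive * math.sqrt(1.0 + rho)
    return naive, corrected


# Hypothetical example: S.D. = 3 units, 20 subjects, a range of correlations
for rho in (0.0, 0.5, 0.9):
    naive, corrected = se_of_mean_two_eyes(sd=3.0, n_subjects=20, rho=rho)
    print(f"rho={rho:.1f}  naive S.E.={naive:.3f}  corrected S.E.={corrected:.3f}")
```

At ρ = 0 the two values coincide, and at ρ = 1 the corrected S.E. equals that of a sample of n subjects, i.e., the second eye adds no information about the mean.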
A variety of statistical procedures are available to analyse data collected from both eyes in a variety of experimental circumstances (Table 3). A number of statistical tests specifically designed for correlated quantitative data have been described, including non-parametric procedures such as the Wilcoxon test, which compares medians from paired data[36, 37], and methods for linear regression.[38, 39] Between-eye correlation can be measured using the intraclass correlation coefficient (ICC). The ICC measures the relationship between paired measurements of the same variable from the same subject, i.e., right and left eyes. It does not apply to pairs of different measurements made on the same experimental unit, e.g., intraocular pressure (IOP) and corneal thickness made on a sample of right eyes, for which Pearson's correlation coefficient would be appropriate. Various methods of calculating the ICC have been proposed[42-44], the usual method involving the calculation of the within-subject and between-subject components of variance from an anova. If, however, the two observations per subject vary in a predictable way, e.g., the dominant eye may always give higher values, then the method described by Rosner et al., which takes this bias into account, can be used. Investigators may also wish to study in more detail the extent of agreement between a measurement made on the right and left eyes of a sample of subjects[45, 46] and this should be carried out using Bland and Altman's method.[47-49] The essential feature of a Bland and Altman plot is that for each pair of values the difference between them is plotted against the mean of the two values. The mean of all the differences is known as the bias. On either side of the bias line are the 95% limits of agreement (bias ± 1.96 S.D. of the differences), within which 95% of the differences between the paired measurements would be expected to fall.
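The quantities behind a Bland and Altman plot can be computed directly. This is a minimal sketch using hypothetical IOP readings: the band on either side of the bias line is taken as bias ± 1.96 S.D. of the differences (usually called the 95% limits of agreement), which assumes the differences are approximately normally distributed.

```python
from statistics import mean, stdev


def bland_altman(right, left):
    """Compute the points and reference lines of a Bland and Altman plot.

    Returns the per-subject means (x-axis), differences (y-axis), the bias
    (mean difference), and the lower and upper 95% limits of agreement.
    """
    diffs = [r - l for r, l in zip(right, left)]
    means = [(r + l) / 2 for r, l in zip(right, left)]
    bias = mean(diffs)
    sd_diff = stdev(diffs)  # sample S.D. of the differences (n - 1 denominator)
    return {
        "means": means,
        "diffs": diffs,
        "bias": bias,
        "lower": bias - 1.96 * sd_diff,
        "upper": bias + 1.96 * sd_diff,
    }


# Hypothetical IOP readings (mmHg) for right and left eyes of four subjects
right_iop = [16, 14, 18, 15]
left_iop = [15, 15, 17, 14]
result = bland_altman(right_iop, left_iop)
print(result["bias"], result["lower"], result["upper"])
```

Plotting `diffs` against `means` with horizontal lines at `bias`, `lower`, and `upper` gives the usual figure; a real analysis would need far more than four subjects.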
Table 3. Recommended procedures for the analysis of data from both eyes
|Mean, S.D. of a sample of right and left eyes||anova nested design with calculation of variance components||Armstrong et al.|
|Comparing two groups (correlated data)||Modified Wilcoxon test||Rosner et al.[36, 37]|
|Comparing proportion of eyes with a feature (two samples)||Adjust variances of the different proportions by calculating asymptotic normal distribution||Fleiss et al.|
|Measure correlation between eyes (no systematic differences between eyes)||ICC||Bland and Altman|
|Measure correlation between eyes (systematic difference between eyes)||ICC||Rosner et al.|
|Linear regression||Various regression models||Glynn & Rosner[38, 39]|
|Level of agreement between eyes||Bland and Altman test of agreement||Bland and Altman, McAlinden et al.|
|Treated eye, other as control (two-way)||paired t-test||Armstrong et al.|
|Treated eye, other as control (factorial design)||anova split-plot||Armstrong et al.|
To estimate the magnitude of a variable together with its variance from a sample of right and left eyes, a ‘random effects model’ anova could be used. In a random effects model, the objective is not to measure the fixed effect of a treatment but to estimate the degree of variation of a particular measurement and to compare different sources of variation. These designs are also called nested or hierarchical designs. The most important statistics from a random effects model are the ‘components of variance’ which estimate the variance associated with each of the sources of variation. The components of variance can be used to calculate appropriate S.D. and CI if required but S.D. can also be obtained from the anova when calculating the ICC.
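As a rough illustration of how the components of variance and the ICC fall out of a nested anova with two eyes per subject, the sketch below works from raw right/left pairs. The data and the function name are hypothetical, and the calculation assumes a balanced design (exactly two eyes for every subject).

```python
from statistics import mean


def variance_components(pairs):
    """One-way random-effects anova for (right, left) pairs, one pair per subject.

    Returns the mean squares, the between-subject and between-eye
    (within-subject) components of variance, and the ICC.
    """
    k = 2                                   # eyes per subject
    n = len(pairs)                          # number of subjects
    subject_means = [mean(p) for p in pairs]
    grand_mean = mean(subject_means)        # balanced design, so this is the overall mean

    ss_between = k * sum((m - grand_mean) ** 2 for m in subject_means)
    ss_within = sum((x - m) ** 2 for p, m in zip(pairs, subject_means) for x in p)

    ms_between = ss_between / (n - 1)       # subjects, n - 1 DF
    ms_within = ss_within / (n * (k - 1))   # eyes within subjects, n DF

    var_eyes = ms_within
    # Note: this estimate can be negative when eyes vary more than subjects
    var_subjects = (ms_between - ms_within) / k
    icc = var_subjects / (var_subjects + var_eyes)
    return {
        "ms_between": ms_between,
        "ms_within": ms_within,
        "var_subjects": var_subjects,
        "var_eyes": var_eyes,
        "icc": icc,
    }


# Hypothetical measurements for three subjects
components = variance_components([(10, 12), (20, 22), (30, 32)])
print(components)
```

The S.D. of a single measurement is then the square root of the sum of the two components, which is the quantity an analysis treating all eyes as independent tends to underestimate.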
If a hypothesis test is required of whether the proportion of eyes with a particular characteristic is similar in two groups, using data collected from right and left eyes, then the procedure of Fleiss et al., which accounts for the correlation between eyes, can be used; it is described in detail by Karakosta et al. Essentially, an asymptotic approach is adopted with variance inflation factors applied to adjust the variance of the difference in proportions and to calculate an appropriately adjusted Z statistic.
A useful method of dealing with the two-eye problem is to exploit the correlation between eyes in clinical experiments[50-52]. The simplest experimental design of this type is a two-way design in which each treatment is allocated at random to the eyes of each subject separately.[53, 54] Originally, the term ‘randomised blocks’ was applied to this type of design by Fisher because it was first used in agricultural experiments in which treatments were applied to units within blocks of land. Plots within a block, analogous to eyes within a subject, tend to respond more similarly than plots in different blocks or eyes from different subjects. If no other factors are involved, then the appropriate analysis would be a paired sample t-test or a two-way anova in randomised blocks.
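For the two-treatment case, the paired t statistic is simple to compute by hand. The sketch below uses hypothetical treated-eye and control-eye values and returns only the statistic and its degrees of freedom; the P value would then be read from a t distribution with n − 1 DF.

```python
import math
from statistics import mean, stdev


def paired_t(treated, control):
    """Paired-sample t statistic for treated vs. fellow (control) eyes.

    Each subject contributes one difference, so the between-subject
    variation is removed before the treatment effect is tested.
    """
    diffs = [t - c for t, c in zip(treated, control)]
    n = len(diffs)
    se_diff = stdev(diffs) / math.sqrt(n)   # S.E. of the mean difference
    return mean(diffs) / se_diff, n - 1     # (t statistic, degrees of freedom)


# Hypothetical measurements from four subjects
treated_eye = [5, 7, 9, 6]
control_eye = [4, 5, 8, 4]
t_stat, df = paired_t(treated_eye, control_eye)
print(f"t = {t_stat:.3f} on {df} DF")
```

With only two treatments, the F ratio for treatments in the equivalent two-way anova in randomised blocks is the square of this t statistic, so the two analyses give the same P value.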
In a more complex experimental design, different treatments could be given, at random, to the right and left eyes of human subjects employing two or more different subject groups. In such an experiment, the subject group would be regarded as a major factor while right/left eye would be regarded as a minor factor. This type of factorial design is best described as a split-plot factorial. The difference between this and an ordinary factorial design is that in a completely randomised experiment, all subjects are allocated to treatment combinations at random whereas in a split-plot design, subjects can only be allocated at random to the main treatment groups, the sub-plot treatments then being randomised to right and left eyes within each subject. Hence, in a two-factor, split-plot anova, there are two error terms, the main-plot error is used to test the main effect of subject group while the sub-plot error is used to test the main effect of eyes and the possible interaction between the factors. With reference to the design of experiments employing these analyses, it should be noted that statistical power of the analysis will vary with the degree of correlation between the eyes. In general, as the correlation decreases, a larger sample size will be needed to provide a specified power because of the increased variability. Hence, some knowledge of the ICC between eyes in a specific circumstance is useful in designing the experiment efficiently.
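The dependence of sample size on the between-eye correlation can be sketched for the simplest case, a paired comparison of treated and control eyes. Under the assumption of equal variance σ² in both eyes, the variance of a within-subject difference is 2σ²(1 − ρ), and a normal approximation gives the number of subjects needed to detect a difference δ. The function name and the numerical values below are hypothetical, and the normal approximation slightly understates the requirement for small samples.

```python
import math
from statistics import NormalDist


def subjects_needed(delta, sd, rho, alpha=0.05, power=0.80):
    """Approximate number of subjects for a paired (treated vs. fellow eye) design.

    delta : difference to be detected
    sd    : S.D. of the measurement in a single eye (assumed equal in both eyes)
    rho   : between-eye correlation (e.g., an ICC from earlier data)
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)      # two-sided significance level
    z_beta = z.inv_cdf(power)
    var_diff = 2 * sd ** 2 * (1 - rho)      # variance of a within-subject difference
    return math.ceil((z_alpha + z_beta) ** 2 * var_diff / delta ** 2)


# Hypothetical planning values: detect a difference of 1 unit when S.D. = 2 units
for rho in (0.9, 0.5, 0.1):
    print(f"rho = {rho}: {subjects_needed(delta=1.0, sd=2.0, rho=rho)} subjects")
```

The required number of subjects rises steadily as ρ falls, which is why an estimate of the ICC from pilot or published data is useful at the design stage.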
To illustrate the analyses, anova is applied to the analysis of axon counts from the right and left optic nerves of twelve control subjects and twelve subjects with Alzheimer's disease (AD) (Table 4). The density of axons was quantified using an image analysis system. Each section of the optic nerve was divided into four approximately equal quadrants. A sample field, approximately 2000 μm² in area, was located within each quadrant, as close as possible to the centre of the section, and the number of axons present in the field was counted and averaged over the four fields. Three different types of anova are illustrated. First, using the data from control subjects only, total variation was partitioned into that between subjects (σ²_S) and that between eyes nested within subjects (σ²_E). The components of variance indicate that the between-subjects variance is approximately eight times that between eyes within a subject. Second, using control subjects only, the data were analysed as a two-way anova in which the total variance was partitioned into that associated with subjects, which was highly significant (F = 15.82, P = 0.0003), and between eyes, which was not significant (F = 0.17, P = 0.67). In the third example, the data were analysed as a two-factor, split-plot anova with patient group as the main-plot factor and eyes as a sub-plot factor. The data suggested a significant reduction in axon counts in AD compared with the control group (F = 17.34, P = 0.0004) but with no significant differences between eyes (F = 1.11, P = 0.30), the interaction suggesting that the difference between control and AD was similar for right and left eyes (F = 0.33, P = 0.57).
Table 4. Mean axon densities per sample field (2000 μm²) in the right (R) and left (L) optic nerves of twelve normal subjects and twelve subjects with Alzheimer's disease (AD)
|1. Nested analysis of variance (anova) with calculation of components of variance for the control data only|
|Source||DF||SS||MS||Components of variance|
|Subjects||11||912155||82923||σ²_S = 39023|
|Eyes within subjects||12||58525||4877||σ²_E = 4877|
|2. A two-way anova for the control data only|
|Source||DF||SS||MS||F|
|Subjects||11||912155||82923||15.82 (P = 0.0003)|
|Eyes||1||876||876||0.17 (P = 0.67)|
|3. A two-factor split-plot anova comparing control with AD patients|
|Source||DF||SS||MS||F|
|Group||1||1097168||1097168||17.34 (P = 0.0004)|
|Main-plot error||22||1392056||63275|| |
|Eyes||1||8454||8454||1.11 (P = 0.30)|
|Group × Eyes||1||2509||2509||0.33 (P = 0.57)|
|Sub-plot error||22||167889||7631.3|| |
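The entries in Table 4 can be checked from the printed sums of squares (SS) and degrees of freedom (DF). The sketch below reproduces the components of variance in part 1 and the F ratios in part 3. The ICC it prints is a derived quantity, not a value reported in the table, and the variable names are chosen here for illustration.

```python
# Part 1: nested anova for the control data (SS and DF as printed in Table 4)
ms_subjects = 912155 / 11          # mean square between subjects
ms_eyes_within = 58525 / 12        # mean square between eyes within subjects

var_eyes = ms_eyes_within                          # between-eye component of variance
var_subjects = (ms_subjects - ms_eyes_within) / 2  # two eyes per subject
icc = var_subjects / (var_subjects + var_eyes)     # derived here, not printed in the table

# Part 3: split-plot anova, each effect tested against its own error term
ms_main_error = 1392056 / 22       # main-plot error: tests subject group
ms_sub_error = 167889 / 22         # sub-plot error: tests eyes and the interaction

f_group = (1097168 / 1) / ms_main_error
f_eyes = (8454 / 1) / ms_sub_error
f_interaction = (2509 / 1) / ms_sub_error

print(f"variance components: subjects = {var_subjects:.0f}, eyes = {var_eyes:.0f}")
print(f"ratio = {var_subjects / var_eyes:.1f}, ICC = {icc:.2f}")
print(f"F: group = {f_group:.2f}, eyes = {f_eyes:.2f}, interaction = {f_interaction:.2f}")
```

Testing group against the main-plot error and eyes against the sub-plot error is what distinguishes the split-plot analysis from an ordinary factorial, where a single pooled error term would be used.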