Measurements obtained from the right and left eye of a subject are often correlated, whereas many statistical tests assume that observations in a sample are independent. Hence, data collected from both eyes cannot be combined without taking this correlation into account. Current practice is reviewed with reference to articles published in three optometry journals, viz., Ophthalmic and Physiological Optics (OPO), Optometry and Vision Science (OVS), and Clinical and Experimental Optometry (CEO), during the period 2009–2012.

Recent findings

Of the 230 articles reviewed, 148/230 (64%) obtained data from one eye and 82/230 (36%) from both eyes. In the 148 one-eye articles, the eye studied was variously the right eye, the left eye, a randomly selected eye, the better eye, the worse or diseased eye, or the dominant eye. In the 82 two-eye articles, the analysis utilized data from: (1) one eye only, rejecting data from the fellow eye, (2) both eyes separately, (3) both eyes, taking into account the correlation between eyes, or (4) both eyes, using one eye as a treated or diseased eye and the other as a control. In a proportion of studies, data were combined from both eyes without correction.

Summary

It is suggested that: (1) investigators should consider whether it is advantageous to collect data from both eyes, (2) if one eye is studied and both are eligible, then it should be chosen at random, and (3) two-eye data can be analysed incorporating eyes as a ‘within subjects’ factor.

Clinical studies in optometry often collect data from either one or both eyes of a subject. A recent survey of ophthalmology journals, however, suggested a variety of different approaches both to eye selection in ‘one-eye’ studies and methods of analysis in ‘two-eye’ studies. Many studies did not describe clearly the procedures used or violated the statistical assumptions of independence of the data.[1]

There are a number of issues raised by the decision to collect data from one or both eyes. First, if one eye per subject is studied, then how is that eye to be selected? Second, if data from both eyes are collected, how should the data be analysed? Measurements obtained from right and left eyes are usually correlated[1, 2], whereas many statistical procedures, such as t-tests, analysis of variance (anova), confidence intervals (CI), or linear regression, assume that observations are an independent sample of the population.[3] An important problem in testing hypotheses is the possibility of making a Type 1 error, i.e., rejecting the null hypothesis (H_{0}) when it is true. Since the variance between eyes is usually less than that between subjects, the overall variance of a sample of measurements combined from both eyes is likely to be an underestimate of the true variance, resulting in an increased risk of a Type 1 error. Hence, data collected from both eyes from a sample of subjects cannot be combined without taking the correlation into account. If measurements are included from both eyes without consideration of their mutual correlation, there may be a significant effect on the results of the experiment.[4, 5] Third, there may be an advantage in using both eyes in a study, especially in an experimental context, as one eye may be used as a control for the other, rather than recruiting a separate control population.[6, 7] An experimental treatment can be applied in healthy subjects to one eye, selected at random (the treated eye), the other acting as a control. In addition, in diseases which are essentially monocular, with the fellow eye essentially healthy, one eye can be regarded as the ‘diseased’ eye and the other as the control.
Moreover, information concerning the diseased eye may be obtained from the fellow eye[8] either by application of the conditional model of Rosner[9] or marginal models in which one directly models the marginal probabilities of disease for each eye.[8] These methods have been little used in optometry but enable diagnostic information from both eyes to be used explicitly in diagnosis.
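The effect of pooling correlated eyes can be illustrated with the standard variance inflation factor (design effect) for clustered data. The sketch below is not taken from the reviewed articles: it assumes that every subject contributes two eyes with a common between-eye correlation `rho`, and the function names are illustrative only.

```python
def design_effect(rho: float, eyes_per_subject: int = 2) -> float:
    """Variance inflation of a pooled mean: 1 + (m - 1) * rho."""
    return 1.0 + (eyes_per_subject - 1) * rho


def effective_sample_size(n_subjects: int, rho: float, eyes_per_subject: int = 2) -> float:
    """Number of independent observations that the pooled eyes are worth."""
    n_eyes = n_subjects * eyes_per_subject
    return n_eyes / design_effect(rho, eyes_per_subject)


# 50 subjects measured in both eyes (100 eyes) at several assumed correlations.
for rho in (0.0, 0.5, 0.9, 1.0):
    n_eff = effective_sample_size(50, rho)
    print(f"rho = {rho:.1f}: 100 eyes carry the information of {n_eff:.1f} independent observations")
```

Under these assumptions, treating the 100 eyes as independent understates the standard error of the mean by a factor equal to the square root of the design effect, which is why the risk of a Type 1 error rises with the correlation.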

The purpose of this article is to provide statistical advice for authors carrying out clinical studies in optometry that involve the question of whether to collect data from one or both eyes. First, current practice is reviewed with reference to articles published in three optometric journals, viz., Ophthalmic and Physiological Optics (OPO), Optometry and Vision Science (OVS), and Clinical and Experimental Optometry (CEO), during the period 2009–2012. Second, statistical advice relevant to the analysis of data from both eyes is described in a variety of experimental contexts.

Methods

Journals

All of the articles published in three optometric journals, viz., OPO, OVS, and CEO, in the period 2009–2012 were initially reviewed. Articles involving animal or laboratory studies were then eliminated. The remaining 230 articles were divided into two groups: (1) those in which data were collected from one eye only and (2) those in which data were collected from both eyes. In the one-eye studies, articles were classified according to how the eye was selected, viz., right eye, left eye, a randomly selected eye, dominant eye, better eye, i.e., the eye with better visual acuity (VA), and worse or diseased eye. In the two-eye studies, articles were classified according to how the data were analysed: (1) using one eye only, rejecting data from the fellow eye, (2) using both eyes but analysed separately, (3) using both eyes, the analysis taking into account the correlation between eyes, (4) using both eyes in which one eye is the ‘treated’ or ‘diseased’ eye, the other acting as a control, or (5) using data combined from both eyes but without correction for correlation.

Data analysis

Differences in the distribution of frequencies were compared among the three journals (totalled over years) and the four years of the study (totalled over journals) using chi-square (χ^{2}) contingency table tests.
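As a minimal sketch of the contingency table calculation, the function below computes the χ^{2} statistic and its degrees of freedom for any table of counts. The counts in the example are hypothetical and chosen only to show the arithmetic; they are not the frequencies from this survey.

```python
def chi_square_contingency(table):
    """Return (chi-square statistic, degrees of freedom) for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Hypothetical counts: rows are two selection methods, columns are two journals.
example_table = [[10, 20], [30, 40]]
statistic, dof = chi_square_contingency(example_table)
print(f"chi-square = {statistic:.3f} with {dof} DF")
```

The statistic is then referred to the χ^{2} distribution with the stated degrees of freedom to obtain P.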

Results

Of the 230 articles reviewed for this study, published in the period 2009–2012, 148/230 (64%) obtained data from one eye and 82/230 (36%) obtained data from both eyes.

Of the 148 one-eye studies (Table 1), a variety of methods of selecting that eye were used: 52/148 (35%) selected the right eye, 3/148 (2%) the left eye, 19/148 (13%) a randomly selected eye, 34/148 (23%) the better or the worse/diseased eye, 5/148 (3%) the dominant eye, and in 35/148 (24%) no selection criteria were given. There were no significant differences in the distribution of these frequencies among journals (χ^{2} = 14.48, 12 DF, P = 0.31) or years (χ^{2} = 12.93, 15 DF, P = 0.60).

Table 1. Frequency of different methods of selecting the eye in studies employing one eye only in articles published in three optometry journals 2009–2012 (OPO, Ophthalmic and Physiological Optics; OVS, Optometry and Vision Science; CEO, Clinical and Experimental Optometry; N, number of articles)

Of the 82 two-eye studies (Table 2): (1) 18/82 (22%) made measurements on both eyes but analysed data from one eye only, most commonly the right eye, (2) 10/82 (12%) analysed data from both eyes separately, (3) 10/82 (12%) analysed data from both eyes, taking into account the correlation between eyes, (4) 15/82 (18%) analysed both eyes using one eye as a treated or diseased eye, and the other as a control, and (5) 29/82 (35%) analysed both eyes either without correction or it was unclear exactly how the data had been analysed. Where data from both eyes were analysed, a variety of methods of analysis were employed including clustered anova, nested anova, anova with right and left eye included as a ‘within subject’ factor, and the Bland and Altman test of agreement. There were no significant differences in the distribution of the frequencies of analysis in two-eye studies among journals (χ^{2} = 7.44, 8 DF, P = 0.51) or years (χ^{2} = 16.91, 15 DF, P = 0.32).

Table 2. Frequency of different methods of analysis of data employing both eyes in articles published in three optometry journals 2009–2012 (OPO, Ophthalmic and Physiological Optics; OVS, Optometry and Vision Science; CEO, Clinical and Experimental Optometry; N, number of articles)

As in previous reports reviewing clinical studies in ophthalmology[1, 10], the optometric data suggest a wide range of current practice in the design and analysis of studies involving one or both eyes. Two main problems were identified in the optometric journals, viz., too many studies either failed to use all the available data or did not analyse the data appropriately. These are the same problems identified by Murdoch et al.[10] and Karakosta et al.[1] There was no evidence that these problems varied significantly among the three optometry journals or that the methods employed had changed markedly over the years reviewed.

A significant proportion of studies chose to avoid, rather than exploit, the between-eye correlation by measuring one eye only[11, 12], a procedure which can result in a loss of statistical power.[1] In addition, there was no consistency in the procedures employed for the selection of the measured eye. Where either eye could have been chosen, the majority of studies selected the right eye, fewer chose a randomly selected eye[13-18], and even fewer the left eye. In some studies, the better or dominant eye was selected[19-23], while in others the eye was selected on clinical grounds, i.e., the eye in which the signs and symptoms of disease were most evident.[24-26] In a significant proportion of articles, the selection criteria were either not described at all or were unclear, a finding similar to that in ophthalmology.[1] Where either eye could be selected, the only statistically valid procedure is to select the eye at random, unless an alternative can be justified, because consistent selection of the right eye can result in bias. There may be systematic differences between right and left eyes: some conditions are more prevalent in either the left or the right eye, e.g., early glaucomatous defects may favour the right eye, as in certain types of migraine.[27] As a consequence, selecting the right eye may provide a random sample of right eyes but a biased sample of all eyes.

A smaller proportion of studies utilized data obtained from both eyes, and a significant proportion of these did so without correction for correlation, a result similar to that in ophthalmology.[1] Such a procedure is likely to underestimate standard errors (S.E.), yield probability (P) values that are too low, and produce imprecise CI, these problems becoming more profound as the degree of correlation between eyes increases.[4, 5] Some investigators attempt to avoid this problem by analysing data from one eye only, rejecting data from the fellow eye.[11, 28-30] This approach rejects valid data, reduces the potential power of the study, and raises the ethical question of subjecting patients to measurements that are not used in the subsequent analysis. Other investigators analyse data from each eye separately[32], which avoids the problem of rejecting useful data, or average data from both eyes, a procedure whose problems are discussed by Newcombe and Duff[31] and Murdoch et al.[10] Averaging data from both eyes can be a useful procedure if the correlation between the two eyes is high (close to unity) and if a treatment is applied which affects both eyes equally.[1] Obviously, averaging would not be recommended if the treatment is locally administered to one eye. However, as a result of averaging, the data analysis is likely to be less efficient and have less power as it does not utilize the fact that right and left eyes can be regarded as a ‘within subjects’ factor.[33-35]

A variety of statistical procedures are available to analyse data collected from both eyes in a variety of experimental circumstances (Table 3). A number of statistical tests specifically designed for correlated quantitative data have been described, including versions of non-parametric procedures such as the Wilcoxon test, which compares medians from paired data[36, 37], and of linear regression.[38, 39] Between-eye correlation can be measured using the intraclass correlation coefficient (ICC). The ICC measures the relationship between paired measurements from the same subject, i.e., right and left eyes[40], and not pairs of different measurements made on the same experimental unit, e.g., intraocular pressure (IOP) and corneal thickness made on a sample of right eyes, for which Pearson's correlation coefficient would be appropriate.[41] Various methods of calculating the ICC have been proposed[42-44], the usual method involving the calculation of the within-subject and between-subject components of variance from an anova. If, however, the two observations per subject vary in a predictable way, e.g., the dominant eye may always give higher values, then the method described by Rosner et al.[36], which takes this bias into account, can be used. Investigators may also wish to study in more detail the extent of agreement between a measurement made on the right and left eyes of a sample of subjects[45, 46], and this should be carried out using Bland and Altman's method.[47-49] The essential feature of a Bland and Altman plot is that, for each pair of values, the difference between them is plotted against the mean of the two values. The mean of all the differences is known as the bias. On either side of the bias line are the 95% limits of agreement (the bias ± 1.96 S.D. of the differences), within which 95% of the differences between the paired measurements would be expected to fall.
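The quantities behind a Bland and Altman plot can be sketched in a few lines. The example below uses the control-group axon counts listed in Table 4 purely as convenient paired data and takes the band expected to contain about 95% of the differences as the bias ± 1.96 S.D. of the differences, which assumes the differences are approximately normally distributed. The function name and the returned dictionary keys are illustrative.

```python
from statistics import mean, stdev

# Control-group axon counts from Table 4 (subjects A to L).
right = [673, 899, 616, 749, 1078, 978, 706, 1005, 1420, 1003, 818, 761]
left = [766, 956, 605, 858, 1017, 861, 569, 991, 1258, 997, 982, 701]


def bland_altman(x, y, z=1.96):
    """Bias, S.D. of the differences, agreement band, and the points to plot."""
    diffs = [a - b for a, b in zip(x, y)]
    means = [(a + b) / 2 for a, b in zip(x, y)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample S.D. (n - 1 in the denominator)
    return {
        "bias": bias,
        "sd": sd,
        "lower": bias - z * sd,
        "upper": bias + z * sd,
        "points": list(zip(means, diffs)),  # (mean of pair, difference) per subject
    }


result = bland_altman(right, left)
print(f"bias = {result['bias']:.1f}, S.D. of differences = {result['sd']:.1f}")
print(f"band = {result['lower']:.1f} to {result['upper']:.1f}")
```

Plotting `result["points"]` with horizontal lines at the bias and at the two band limits gives the usual display.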

Table 3. Recommended procedures for the analysis of data from both eyes (anova, Analysis of variance; ICC, Intra-class correlation coefficient; S.D., Standard deviation)

| Objective | Procedure | Reference |
| --- | --- | --- |
| Mean, S.D. of a sample of right and left eyes | anova nested design with calculation of variance components | |

To estimate the magnitude of a variable together with its variance from a sample of right and left eyes, a ‘random effects model’ anova[33] could be used. In a random effects model, the objective is not to measure the fixed effect of a treatment but to estimate the degree of variation of a particular measurement and to compare different sources of variation. These designs are also called nested or hierarchical designs.[35] The most important statistics from a random effects model are the ‘components of variance’ which estimate the variance associated with each of the sources of variation. The components of variance can be used to calculate appropriate S.D. and CI if required but S.D. can also be obtained from the anova when calculating the ICC.
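The nested calculation can be sketched for the special case of two eyes per subject. The example uses the control-group axon counts listed in Table 4 and estimates the between-subject component as (MS subjects − MS eyes within subjects)/2; the function name and dictionary keys are illustrative, and the ICC shown is the simple ratio of the between-subject component to the total, which is only one of the definitions in use.

```python
# Control-group axon counts from Table 4 (subjects A to L).
right = [673, 899, 616, 749, 1078, 978, 706, 1005, 1420, 1003, 818, 761]
left = [766, 956, 605, 858, 1017, 861, 569, 991, 1258, 997, 982, 701]


def nested_anova_two_eyes(right, left):
    """Nested (random effects) anova for subjects with exactly two eyes each."""
    n = len(right)
    grand_mean = (sum(right) + sum(left)) / (2 * n)
    subject_means = [(r + l) / 2 for r, l in zip(right, left)]
    # Between subjects: each subject mean represents two observations.
    ss_subjects = 2 * sum((m - grand_mean) ** 2 for m in subject_means)
    # Eyes within subjects: for a pair, the squared deviations sum to (r - l)^2 / 2.
    ss_within = sum((r - l) ** 2 for r, l in zip(right, left)) / 2
    ms_subjects = ss_subjects / (n - 1)
    ms_within = ss_within / n  # one degree of freedom per subject
    var_within = ms_within
    var_subjects = (ms_subjects - ms_within) / 2  # can be negative in other data sets
    icc = var_subjects / (var_subjects + var_within)
    return {
        "ss_subjects": ss_subjects,
        "ss_within": ss_within,
        "ms_subjects": ms_subjects,
        "ms_within": ms_within,
        "var_subjects": var_subjects,
        "var_within": var_within,
        "icc": icc,
    }


result = nested_anova_two_eyes(right, left)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

With these data the between-subject component is about eight times the within-subject component, corresponding to an ICC of roughly 0.89.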

If a hypothesis test is required that the proportions of eyes with a particular characteristic are similar in two groups, involving data collected from right and left eyes, then the procedure of Fleiss et al.[40], which accounts for the correlation between eyes, can be used and is described in detail by Karakosta et al.[1] Essentially, an asymptotic approach is adopted with variance inflation factors applied to adjust the variance of the difference in proportions and to calculate an appropriately adjusted Z statistic.

A useful method of dealing with the two-eye problem is to exploit the correlation between eyes in clinical experiments.[50-52] The simplest experimental design of this type is a two-way design in which each treatment is allocated at random to the eyes of each subject separately.[53, 54] The term ‘randomised blocks’ was originally applied to this type of design by Fisher because it was first used in agricultural experiments in which treatments were applied to plots within blocks of land. Plots within a block, analogous to eyes within a subject, tend to respond more similarly than plots in different blocks or eyes from different subjects.[35] If no other factors are involved, then the appropriate analysis would be a paired sample t-test or a two-way anova in randomised blocks.[41]
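With only two eyes per subject, the paired sample t-test and the two-way anova in randomised blocks are the same test, since the F ratio for eyes equals the square of t. The sketch below checks this on the control-group axon counts listed in Table 4, where right and left eyes stand in for the two levels of the within-subject factor; the function names are illustrative, and no P values are computed because that would need the t or F distribution.

```python
from math import sqrt
from statistics import mean, stdev

# Control-group axon counts from Table 4 (subjects A to L).
right = [673, 899, 616, 749, 1078, 978, 706, 1005, 1420, 1003, 818, 761]
left = [766, 956, 605, 858, 1017, 861, 569, 991, 1258, 997, 982, 701]


def paired_t(x, y):
    """Paired sample t statistic and its degrees of freedom."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


def randomised_blocks_f(x, y):
    """F ratio for the two-level factor in a two-way anova with subjects as blocks."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    ss_factor = sum(diffs) ** 2 / (2 * n)                 # 1 degree of freedom
    ss_error = sum(d * d for d in diffs) / 2 - ss_factor  # n - 1 degrees of freedom
    return ss_factor / (ss_error / (n - 1))


t, df = paired_t(right, left)
f_ratio = randomised_blocks_f(right, left)
print(f"t = {t:.3f} on {df} DF, t squared = {t * t:.3f}, F = {f_ratio:.3f}")
```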

In a more complex experimental design, different treatments could be given, at random, to the right and left eyes of human subjects employing two or more different subject groups.[33] In such an experiment, the subject group would be regarded as a major factor while right/left eye would be regarded as a minor factor. This type of factorial design is best described as a split-plot factorial.[35] The difference between this and an ordinary factorial design is that in a completely randomised experiment, all subjects are allocated to treatment combinations at random whereas in a split-plot design, subjects can only be allocated at random to the main treatment groups, the sub-plot treatments then being randomised to right and left eyes within each subject. Hence, in a two-factor, split-plot anova, there are two error terms, the main-plot error is used to test the main effect of subject group while the sub-plot error is used to test the main effect of eyes and the possible interaction between the factors. With reference to the design of experiments employing these analyses, it should be noted that statistical power of the analysis will vary with the degree of correlation between the eyes. In general, as the correlation decreases, a larger sample size will be needed to provide a specified power because of the increased variability. Hence, some knowledge of the ICC between eyes in a specific circumstance is useful in designing the experiment efficiently.
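The dependence of sample size on the between-eye correlation can be sketched with the usual normal-approximation formula for a paired comparison, in which the variance of the within-subject difference is 2σ^{2}(1 − ρ). The function below is an illustration under stated assumptions (equal S.D. in both eyes, a two-sided test, normal rather than t quantiles); its name, parameters, and the input values are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def subjects_needed(delta, sd, rho, alpha=0.05, power=0.80):
    """Approximate number of subjects for a paired (two-eye) comparison.

    delta: difference between eyes to be detected
    sd:    S.D. of the measurement in a single eye
    rho:   correlation between the two eyes of a subject
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    var_diff = 2 * sd ** 2 * (1 - rho)  # variance of the within-subject difference
    return ceil(var_diff * (z_alpha + z_beta) ** 2 / delta ** 2)


# Hypothetical inputs: detect a difference of one S.D. at several correlations.
for rho in (0.2, 0.5, 0.8):
    print(f"rho = {rho}: about {subjects_needed(1.0, 1.0, rho)} subjects")
```

The required number of subjects falls as the correlation rises, which is the reason an estimate of the ICC is useful at the design stage.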

To illustrate the analyses, anova is applied to the analysis of axon counts from the right and left optic nerves of twelve control subjects and twelve subjects with Alzheimer's disease (AD) (Table 4). The density of axons was quantified using an image analysis system.[55] Each section of the optic nerve was divided into four approximately equal quadrants. A sample field, approximately 2000 μm^{2} in area, was located within each quadrant, as close as possible to the centre of the section, and the number of axons present in the field counted and averaged for the four fields. Three different types of anova are illustrated. First, using the data from control subjects only, total variation was partitioned into that associated with subjects (between subjects; expected mean square σ^{2} + 2σ_{s}^{2}) and that between eyes nested within subjects (expected mean square σ^{2}). The components of variance indicate that the between subjects variance is approximately eight times that between eyes within a subject. Second, using control subjects only, the data were analysed as a two-way anova in which the total variance was partitioned into that associated with subjects, which was highly significant (F = 15.82, P = 0.0003), and between eyes, which was not significant (F = 0.17, P = 0.67). In the third example, the data were analysed as a two-factor, split-plot anova with patient group as the main-plot factor and eyes as a sub-plot factor. The data suggested a significant reduction in axon counts in AD compared with the control group (F = 17.34, P = 0.0004) but with no significant differences between eyes (F = 1.11, P = 0.30), the interaction suggesting that the difference between control and AD was similar for right and left eye (F = 0.33, P = 0.57).

Table 4. Mean axon densities per sample field (2000 μm^{2}) in the right (R) and left (L) optic nerves of twelve normal subjects and twelve subjects with Alzheimer's disease (AD)

| Subject | Control R | Control L | AD R | AD L |
| --- | --- | --- | --- | --- |
| A | 673 | 766 | 538 | 377 |
| B | 899 | 956 | 583 | 555 |
| C | 616 | 605 | 696 | 298 |
| D | 749 | 858 | 568 | 583 |
| E | 1078 | 1017 | 649 | 700 |
| F | 978 | 861 | 284 | 458 |
| G | 706 | 569 | 862 | 746 |
| H | 1005 | 991 | 848 | 774 |
| I | 1420 | 1258 | 716 | 698 |
| J | 1003 | 997 | 508 | 563 |
| K | 818 | 982 | 378 | 374 |
| L | 761 | 701 | 621 | 633 |

Analyses

1. Nested analysis of variance (anova) with calculation of components of variance for the control data only

| Source | DF | SS | MS | Components of variance |
| --- | --- | --- | --- | --- |
| Subjects | 11 | 912155 | 82923 | σ_{s}^{2} = 39023 (expected MS: σ^{2} + 2σ_{s}^{2}) |
| Eyes within subjects | 12 | 58525 | 4877 | σ^{2} = 4877 |

2. A two-way anova for the control data only

| Source | DF | SS | MS | F |
| --- | --- | --- | --- | --- |
| Subjects | 11 | 912155 | 82923 | 15.82 (P = 0.0003) |
| Eyes | 1 | 876 | 876 | 0.17 (P = 0.67) |
| Error | 11 | 57649 | 5241 | |

3. A two-factor split-plot anova comparing control with AD patients

| Source | DF | SS | MS | F |
| --- | --- | --- | --- | --- |
| Group | 1 | 1097168 | 1097168 | 17.34 (P = 0.0004) |
| Main-plot error | 22 | 1392056 | 63275 | |
| Eyes | 1 | 8454 | 8454 | 1.11 (P = 0.30) |
| Group × Eyes | 1 | 2509 | 2509 | 0.33 (P = 0.57) |
| Sub-plot error | 22 | 167889 | 7631.3 | |
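The sums of squares and F ratios of the split-plot analysis (analysis 3) can be reproduced from the counts in Table 4. The sketch below exploits the fact that, with two eyes per subject, the main-plot stratum depends only on each subject's sum (R + L) and the sub-plot stratum only on each subject's difference (R − L). Function and variable names are illustrative, and P values are omitted because they would need the F distribution.

```python
# Axon counts from Table 4 (subjects A to L in each group).
control_right = [673, 899, 616, 749, 1078, 978, 706, 1005, 1420, 1003, 818, 761]
control_left = [766, 956, 605, 858, 1017, 861, 569, 991, 1258, 997, 982, 701]
ad_right = [538, 583, 696, 568, 649, 284, 862, 848, 716, 508, 378, 621]
ad_left = [377, 555, 298, 583, 700, 458, 746, 774, 698, 563, 374, 633]


def one_way_ss(samples):
    """Between-sample SS, within-sample SS, and grand mean for a list of samples."""
    values = [v for s in samples for v in s]
    grand = sum(values) / len(values)
    between = sum(len(s) * (sum(s) / len(s) - grand) ** 2 for s in samples)
    within = sum((v - sum(s) / len(s)) ** 2 for s in samples for v in s)
    return between, within, grand


def split_plot_two_eyes(groups):
    """Split-plot anova; groups is a list of (right, left) lists, one pair per subject group."""
    sums = [[r + l for r, l in zip(right, left)] for right, left in groups]
    diffs = [[r - l for r, l in zip(right, left)] for right, left in groups]
    n_subjects = sum(len(s) for s in sums)
    k = len(groups)
    between_s, within_s, _ = one_way_ss(sums)
    between_d, within_d, mean_d = one_way_ss(diffs)
    # Dividing by 2 converts sums/differences back to the single-eye scale.
    ss = {
        "group": between_s / 2,
        "main_error": within_s / 2,
        "eyes": n_subjects * mean_d ** 2 / 2,
        "interaction": between_d / 2,
        "sub_error": within_d / 2,
    }
    df_error = n_subjects - k  # the same in both strata when there are two eyes
    ms_main = ss["main_error"] / df_error
    ms_sub = ss["sub_error"] / df_error
    f = {
        "group": ss["group"] / (k - 1) / ms_main,
        "eyes": ss["eyes"] / ms_sub,
        "interaction": ss["interaction"] / (k - 1) / ms_sub,
    }
    return ss, f


ss, f = split_plot_two_eyes([(control_right, control_left), (ad_right, ad_left)])
for name in ("group", "eyes", "interaction"):
    print(f"{name}: SS = {ss[name]:.0f}, F = {f[name]:.2f}")
```

The group effect is tested against the main-plot error and the eye and interaction effects against the sub-plot error, which is what distinguishes the split-plot analysis from an ordinary two-factor anova.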

Concluding remarks and advice

A flow chart summarising the major points and relevant advice is given in Figure 1.

In any study, consider whether it is advantageous to collect data from both eyes, which may reduce the number of subjects that have to be recruited and potentially increase the power of the study.

If only one eye is included and if both eyes are eligible, then the eye should be selected at random unless an alternative can be justified. A sample of such eyes can be analysed using conventional statistics.[41]

If one eye is chosen on the basis of clinical criteria, then investigators should consider whether the alternate eye could be used as a control rather than recruiting a separate group of subjects as a control. If one eye is chosen and a separate control group recruited, then the data can be analysed using conventional statistics.

If both eyes are included in a study, then the correlation between eyes should be assessed using the ICC. If the correlation is close to one, then data from both eyes could be averaged or one eye selected at random for analysis using conventional statistics.

If the correlation is less than one, a variety of statistical procedures are available to analyse data collected from both eyes in a variety of experimental circumstances, including versions of the Wilcoxon test[36, 37] and linear regression[38, 39], and the Bland and Altman method of measuring agreement.[47, 49]

If the objective is to assess the magnitude and variability of a measurement, then a nested classification anova should be considered which includes the calculation of variance components, viz., between eyes within a subject and between subjects. Suitable S.D. can also be obtained when calculating the ICC.

If a hypothesis test is required that the proportions of eyes with a particular characteristic are similar in two groups, involving data collected from right and left eyes, then the procedure of Fleiss et al.[40] can be used.

If eyes are used as a ‘within subject’ variable in an experiment, the data can be analysed using a paired sample t-test or two-way anova in randomised blocks (single factor) or a factorial split-plot anova (two or more factors).

Investigators should clearly describe the design of their study, provide a rationale for their choice of one or both eyes, the selection criteria applied if one eye is chosen, and describe the appropriate data analysis.

Richard Armstrong was educated at King's College London (1968–1971) and subsequently at St. Catherine's College Oxford (1972–1976). His early research involved the application of statistical methods to problems in botany and ecology. For the last 36 years, he has been a lecturer successively in Botany, Microbiology, Ecology, Neuroscience, and Optometry at the University of Aston. His current research interests include the application of quantitative methods to the study of neuropathology of neurodegenerative diseases with special reference to vision and the visual system.