Visual mechanisms governing the perception of auto-stereograms

Authors


Dr J Antonio Aznar-Casanova, Faculty of Psychology, University of Barcelona, Passeig de la Vall d'Hebron, 171, E08035, Barcelona, SPAIN. E-mail: jaznar2@ub.edu

Abstract

Background:  Single image random dot stereograms (SIRDS) have been used to study diverse visual parameters and skills. The aim of the present study was to identify the main optometric factors involved in the perception of SIRDS and to obtain a discriminant model to categorise our participants in terms of their skill in perceiving SIRDS.

Methods:  Response time was determined to assess the ability of 69 participants to perceive the hidden three-dimensional shape in an auto-stereogram presented under controlled conditions, whereupon three skill level groups were defined. The same participants were administered a battery of optometric tests to evaluate various aspects of accommodation and convergence, as well as stereopsis and phoria. Linear discriminant analysis, which served to examine the relationship between response times and the evaluated visual parameters and skills, provided a set of discriminant functions (or model), thus allowing for the categorisation of participants according to their skill to perceive SIRDS.

Results:  Two discriminant functions were obtained, which allowed for an overall predictive accuracy of 66.67 per cent (p = 0.024), with a higher predictive accuracy for groups 1 (minimum time less than 10 seconds, 78.26 per cent) and 2 (minimum time greater than 10 seconds, 75.86 per cent) than for group 3 (SIRDS not perceived, 35.29 per cent). Stereoacuity, negative relative convergence, phoria at near and, to a lesser extent, the accommodative convergence and accommodation ratio were found to be the most relevant discriminant variables, although between-group statistically significant differences were only disclosed for stereoacuity (p = 0.001) and negative relative convergence (p = 0.003).

Conclusion:  The ability to perceive SIRDS was related to many visual parameters and skills, including, but not limited to, stereoacuity and negative relative convergence. It is uncertain whether SIRDS might be considered a useful tool in clinical practice.

One of the most amazing characteristics of the binocular visual system is the ability to integrate two two-dimensional (2D) images into a single stereoscopic three-dimensional (3D) image. The importance of binocular vision in guiding our interaction with the environment is shown in almost every aspect of daily life. Stereoscopic vision might be evaluated through the dichoptic presentation of stimuli, as in random dot stereograms (RDS), first developed by Julesz in 1960.1–3

A single image random dot stereogram (SIRDS) consists of a single two-dimensional image containing almost identical horizontally repeating patterns. When viewed with the proper vergence, a hidden three-dimensional scene appears to emerge from the plane of the stereogram.4,5 In contrast to random dot stereograms, SIRDS do not require any external device to present two images dichoptically to the observer. In addition, SIRDS might be designed with larger disparities than random dot stereograms.6

The observation of SIRDS is partly based on the well known wallpaper illusion first documented in 1844 by Brewster,7 who described how a horizontally repetitive pattern appeared to shift in depth, either behind (parallel fixation) or forward (crossed fixation) with reference to the plane of the wallpaper. The observer attempts to match two consecutive patterns, which appear to originate in the same object, through the left and right eyes, thus creating a shift in the perceived depth of the object.8 A geometrical explanation of this phenomenon in the case of parallel fixation is shown in Figure 1. This figure shows an observer with the plane of accommodation coincident with the plane of the target (repetitive pattern), while, at the same time, the plane of convergence is located behind the plane of accommodation. This dissociation between accommodation and convergence serves to reveal the apparent or hidden object.

Figure 1.

Observation of a horizontally repeating pattern might lead, in uncrossed disparity, to the perception of an ‘apparent’ or virtual object located behind the plane of accommodation.

As depth perception depends on the angle of convergence, the location of the ‘apparent’ or virtual object is determined by the distance between identical elements in the auto-stereogram. The repeating pattern in SIRDS consists in a vertical strip containing a random dot map,4,9,10 thus allowing for a multiplicity of possible distances between repeated objects (individual dots) to be erroneously matched by the observer's visual system (the well-known false matching reported by Marr in 198211), resulting in many possible planes in depth and leading to the impression of volume (Figure 2).
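The dependence of the virtual plane on the spacing of the repeated elements can be illustrated with simple ray geometry. The sketch below is not taken from the study: it assumes pinhole eyes, a fronto-parallel stereogram and parallel (uncrossed) fixation, in which the left and right eyes fixate two adjacent repeats. The function name and the example values (40 cm viewing distance, 6 cm interocular distance) are hypothetical.

```python
def virtual_plane_distance(viewing_distance_cm, interocular_cm, repeat_cm):
    """Distance from the eyes to the point where the two lines of sight cross.

    Assumes parallel (uncrossed) fixation: the left eye fixates one element
    and the right eye fixates the identical element repeat_cm to its right.
    By similar triangles, Z = D * I / (I - s).
    """
    if not 0 < repeat_cm < interocular_cm:
        raise ValueError("repeat separation must be positive and smaller "
                         "than the interocular distance")
    return viewing_distance_cm * interocular_cm / (interocular_cm - repeat_cm)


# Hypothetical values: a wider spacing between repeats pushes the virtual
# plane further behind the stereogram.
near_plane = virtual_plane_distance(40.0, 6.0, 2.0)
far_plane = virtual_plane_distance(40.0, 6.0, 3.0)
```

Under these assumptions, locally varying the spacing between matched dots yields different virtual planes, which is consistent with the impression of volume described above.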

Figure 2.

A: Single image random dot stereogram (SIRDS), in which a DOG-2D (difference of two two-dimensional Gaussians) function might be observed. B: Luminance map needed to generate the corresponding disparity or depth map.

When viewing a real scene, accommodation and convergence normally function as an integrated, coupled mechanism. For the correct visualisation of SIRDS, a disassociation of the planes of accommodation and convergence must occur,12,13 as displayed in Figure 1.

It is common practice for optometrists to examine the accommodation/convergence relationship by measuring the amplitude of the zone of clear single binocular vision (ZCSBV), which provides information about the ability to uncouple the planes of accommodation and convergence.14 The ZCSBV is assessed through the conjoint evaluation of the positive and negative relative convergence (PRC and NRC), that is, the amount of base-out or base-in prisms until blur, respectively, and positive and negative relative accommodation (PRA and NRA), that is, the maximum ability to stimulate or relax accommodation, respectively, while maintaining clear, single binocular vision. In addition, the relationship between accommodative convergence and accommodation (AC/A ratio)13 and the presence of a small heterophoria might also influence the visualisation of SIRDS, as well as the natural stereoacuity of the observer.

The ability to perceive SIRDS has been investigated with reference to the visual skills of the observer,15,16 mainly as a means to develop new visual tests. The self-reported skill to visualise SIRDS has been found to be highly predictive of stereoacuity, as measured by the TNO test.17 Other investigators have explored the association between the time required to perceive SIRDS and a variety of visual skills, noting that most observers can correctly see the hidden stereo-image in less than 20 seconds provided they initiate or maintain the proper amount of divergence.18 The inability of some observers to perceive SIRDS, even after several attempts lasting longer than one minute, has been explained by their ignorance of the proper viewing strategy, that is, their lack of information or practice regarding the need to disassociate accommodation and convergence.18–20

To the best of our knowledge, it is not clear why some observers with good stereoacuity fail to perceive SIRDS. Therefore, a study was designed to identify the visual mechanisms involved in the ability to perceive the hidden stereo-image in SIRDS, as well as to obtain a discriminant model to categorise our participants in terms of their skill in perceiving SIRDS. Participants were grouped into three different categories according to their ability to perceive SIRDS and the predictive value of various visual aspects was examined.

METHODS

Participants

A total of 69 healthy volunteers (15 men, 54 women) aged between 21 and 28 years (mean age 23.43 ± 4.85 years) participated in the study. Participants were recruited from the student population of the Technical University of Catalonia. Only participants with monocular and binocular distance and near visual acuity equal to 1.0 or better (decimal notation) were included in the study. Exclusion criteria were manifest binocular visual imbalance, colour vision anomalies, existing ocular pathology, ongoing ocular treatment and a history of ocular or refractive surgery.

All participants provided written informed consent after the nature of the study was explained to them. The study was conducted in accord with the Declaration of Helsinki tenets of 1975 (as revised in Tokyo in 2004) and was approved by the institutional ethical board of the Technical University of Catalonia.

Experimental setting

The same random dot auto-stereogram from the book ‘Magic Eye: A New Way of Looking at the World’21 was used as the target stimulus throughout the study. This auto-stereogram, which showed a repeating pattern of red roses with a heart as the hidden 3D shape, measured 19 cm by 25.2 cm and was placed on a lectern in front of the observer. Luminance of the stimulus remained constant at 100 cd/m².

Procedure

Once the sample was defined by adhering to the mentioned inclusion and exclusion criteria, all participants were administered a complete visual examination in terms of the visual parameters and skills assumed to govern the perception of SIRDS. This part of the study was conducted by a single experienced optometrist, who was unaware of the aims of the investigation.

The following aspects were examined:

  1. near vision interpupillary distance at 40 cm (NID), with a pupillometer (HX-400 PD Meter) (Lianyungang Z&H Trading Co., Ltd., Lianyungang City, China)
  2. stereoacuity, measured with the TNO test22 at 40 cm
  3. phoria in distance and near vision, measured with the cover test and a handheld prism bar
  4. positive and negative relative accommodation
  5. positive and negative relative convergence (measured at 40 cm)
  6. the AC/A ratio, evaluated with the gradient method.

These testing procedures are well described in published literature.23

Participants were seated in front of a table provided with a head and chin rest to prevent unwanted head movements. The auto-stereogram was placed on a non-fixed lectern allowing the participants to adjust their observation distance. This lectern was initially moved to a very short distance from the observers, and the participants were instructed to slowly and progressively increase the observation distance until the hidden 3D shape became visible. A time limit of 90 seconds was considered adequate to decide whether the observation of SIRDS was possible or not, and all participants were encouraged to keep trying, even allowing for the observation distance to be decreased, until this time limit was reached. Once the hidden shape was visible, participants had to press a button, thus registering the minimum time required to perceive the auto-stereogram, whereupon they were asked to describe the 3D shape they saw. As previously reported,24 this length of time helped categorise all participants in terms of their ability to perceive SIRDS.

It should be noted that none of our participants had any previous experience with SIRDS, nor were they given any specific training prior to the testing session. In addition, participants did not receive any instructions on how to perceive the stereogram.

Data analysis

Data were analysed by means of linear discriminant analysis (LDA). This procedure is useful for classifying a set of subjects into several predetermined categories or groups according to a set of independent variables called predictors or discriminant variables (x1, x2, . . . , xn). The groups are determined by the values of the dependent or grouping variable. The model is based on a set of participants (‘training set’) with known category and known values of the discriminant variables, whereupon the LDA constructs one or more linear functions of the predictors or discriminant functions (f=a0+a1x1+a2x2+ . . . +an xn), which allow for the categorisation of new subjects with known values of their discriminant variables. The number of discriminant functions is defined as (g − 1), where g is the number of categories. As is recommended when the sample size is relatively small,25 discriminant functions were developed for the entire sample and then used to classify the participants of the same study group into the established categories.
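The way discriminant functions are used to categorise a subject can be sketched as follows. This is a minimal illustration, not the SPSS procedure used in the study: each function is evaluated as f = a0 + a1x1 + . . . + anxn and the subject is assigned to the group whose centroid is nearest in the space of the discriminant scores. The nearest-centroid rule, the coefficients, the centroids and all names are assumptions made for the example.

```python
import math


def discriminant_score(intercept, coefficients, predictors):
    """Evaluate one linear discriminant function f = a0 + sum(a_i * x_i)."""
    if len(coefficients) != len(predictors):
        raise ValueError("one coefficient is needed per predictor")
    return intercept + sum(a * x for a, x in zip(coefficients, predictors))


def classify(predictors, functions, centroids):
    """Assign the group whose centroid is closest to the subject's scores.

    functions: list of (intercept, coefficients), one per discriminant function
    centroids: dict mapping group label -> tuple of mean scores for that group
    """
    scores = [discriminant_score(a0, coeffs, predictors)
              for a0, coeffs in functions]
    return min(centroids, key=lambda group: math.dist(scores, centroids[group]))


# Hypothetical model: g = 3 groups, hence g - 1 = 2 discriminant functions,
# here with only two predictors to keep the example short.
FUNCTIONS = [(-1.0, [0.5, 0.0]), (-1.0, [0.0, 0.25])]
CENTROIDS = {1: (1.0, 0.0), 2: (-1.0, 1.0), 3: (-1.0, -1.0)}

predicted_group = classify([4.0, 4.0], FUNCTIONS, CENTROIDS)
```

Statistical packages may additionally weight the assignment by prior group probabilities, so the rule shown here should be read as a simplification.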

Three skill levels or categories were defined according to the minimum time (RT) required to perceive the hidden 3D shape in the auto-stereogram (dependent variable): group 1 (minimum time less than 10 seconds); group 2 (minimum time more than 10 seconds); and group 3 (SIRDS were not perceived). We opted for 10 seconds as our cut-off value, as it corresponded to the median of our study sample (time interval values were not normally distributed) (Figure 3). Discriminant or predictor variables were NID, stereoacuity, phoria in distance and near vision, PRC, NRC, PRA, NRA and the AC/A ratio.
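The grouping rule can be written as a small function. This is only a sketch of the rule as described: the function name is hypothetical, a missing time (None) stands for a participant who did not perceive the shape within the 90-second limit, and the handling of a time of exactly 10 seconds (assigned here to group 2) is an assumption, as the text does not specify it.

```python
def skill_group(minimum_time_s, cutoff_s=10.0, limit_s=90.0):
    """Return the skill level group (1, 2 or 3) for one participant.

    minimum_time_s is None when the hidden shape was never reported.
    """
    if minimum_time_s is None or minimum_time_s > limit_s:
        return 3  # SIRDS not perceived within the time limit
    if minimum_time_s < cutoff_s:
        return 1  # perceived in less than the median time
    return 2      # perceived, but at or above the cut-off (assumption at == 10)


# Hypothetical response times in seconds
groups = [skill_group(t) for t in (4.2, 25.0, None)]
```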

Figure 3.

Minimum time required to perceive a single image random dot stereogram (SIRDS). The median time of 10 seconds was selected as a cut-off value to categorise the participants.

The LDA was performed with the SPSS software (version 17.0, SPSS Inc., Chicago, IL, USA).

RESULTS

A summary of the discriminant variables for each skill level group is displayed in Table 1. It might be noted that negative relative accommodation values are higher than should be expected for a testing distance of 40 cm, which might suggest that our study group included some low, non-corrected hyperopes. Non-linear relationships are not reflected in the discriminant functions unless specific variable transformations are made to represent non-linear effects. This was the case with both near and distant phorias. Therefore, it was assumed that participants with a slight heterophoria (phoria = -1) required less time to perceive SIRDS and a new variable was defined as the absolute value of the phoria plus 1 [abs(phoria + 1)].
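The transformed phoria predictor can be illustrated with a one-line function. The function name and the example values are hypothetical; the sketch only shows that the new variable is zero for a phoria of -1 and grows symmetrically as the phoria departs from that value in either direction, which lets a linear model represent a V-shaped relationship.

```python
def phoria_predictor(phoria):
    """Transformed predictor abs(phoria + 1), minimal when phoria == -1."""
    return abs(phoria + 1)


# Hypothetical phoria values, in the units in which phoria was recorded
transformed = [phoria_predictor(p) for p in (-4, -1, 0, 2)]
```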

Table 1. Descriptive statistics (means and standard deviations) of all predictor values for each group

Discriminant variables        Group 1            Group 2            Group 3
                              Mean      SD       Mean      SD       Mean      SD
NID                           56.32     2.75     55.51     2.27     56.20     2.64
Stereoacuity                  55.00    43.09     55.86    30.23    154.41   167.51
Abs (Phoria + 1) at near       2.04     1.66      3.65     3.28      3.00     3.46
Abs (Phoria + 1) at far        1.95     1.49      3.24     4.28      2.47     2.47
PRA                           -3.37     1.48     -3.92     1.82     -3.75     2.10
NRA                            3.21     0.64      3.51     0.84      3.41     0.81
PRC                           27.65    10.56     28.31    11.16     25.23     9.99
NRC                           13.65     3.11     17.69     6.66     12.82     3.81
AC/A                           2.30     1.26      2.37     1.26      2.67     1.97

AC/A: accommodative convergence/accommodation ratio, NID: near interpupillary distance at 40 cm, NRA: negative relative accommodation, NRC: negative relative convergence, PRA: positive relative accommodation, PRC: positive relative convergence, SD: standard deviation

The relative role of each discriminant variable in the categorisation of our participants (Table 1) revealed that stereoacuity values could discriminate between group 3 and groups 1 and 2 but failed to differentiate between group 1 and group 2, whereas negative relative convergence and phoria at near could discriminate between group 1 and group 2 (a preliminary analysis revealed a shorter minimum time in exophoria than in esophoria). When submitted to an ANOVA, differences between groups were found to reach statistical significance only for stereoacuity (p = 0.001) and NRC (p = 0.003).

Table 2 summarises the standardised coefficients of the discriminant functions f1 and f2 resulting from the LDA. The relative values of these coefficients show that stereoacuity (-0.633 and 0.746), phoria at near (0.683 and 0.664) and NRC (0.651 and 0.424) were the most relevant predictors for the categorisation of the participants in the present study. To a lesser extent, the AC/A ratio (-0.232 and -0.330) could also be considered a relevant variable.

Table 2. Standardised coefficients of the canonical discriminant functions

Variables                     Coefficients f1    Coefficients f2
NID                                 0.016             -0.174
Stereoacuity                       -0.633              0.746
Abs (Phoria + 1) at near            0.683              0.664
Abs (Phoria + 1) at far            -0.139             -0.212
PRA                                -0.056             -0.260
NRA                                -0.025              0.242
PRC                                 0.034             -0.285
NRC                                 0.651              0.424
AC/A ratio                         -0.232             -0.330

AC/A: accommodative convergence/accommodation ratio, NID: near interpupillary distance at 40 cm, NRA: negative relative accommodation, NRC: negative relative convergence, PRA: positive relative accommodation, PRC: positive relative convergence, SD: standard deviation

The localisations of the participants according to the values obtained from both discriminant functions are shown in Figure 4. The group centroid for each of the three skill level categories is also represented. The horizontal and vertical separation between group centroids reflects the higher discrimination power of f1 in comparison to f2.

Figure 4.

Location of each observer according to the values obtained from both canonical discriminant functions. The group centroid for each of the three skill level categories is also shown.

The discriminant functions served as a model to assign a group to each participant in accordance with the values of their respective predictor variables. Table 3 compares the actual classification of all participants with the one predicted by the model. It might be observed that, overall, 66.67 per cent of participants were correctly categorised in one of the three skill level groups, although the discriminant functions were found to present a higher predictive accuracy for groups 1 (78.26 per cent) and 2 (75.86 per cent) than for group 3, in which case only 35.29 per cent of participants previously identified as unable to perceive SIRDS were correctly allocated through discriminant analysis of the predictor variables.

Table 3. Model prediction versus actual group allocation in number and percentage of participants

Actual group    Group size    Predicted group
                              1              2              3
1               23            18 (78.26%)    2 (8.70%)      3 (13.04%)
2               29            7 (24.14%)     22 (75.86%)    0 (0.00%)
3               17            7 (41.18%)     4 (23.53%)     6 (35.29%)

Overall correct categorisation: 66.67%
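The percentages in Table 3 follow directly from the counts, as the short sketch below shows. The counts are those reported in Table 3; the function and variable names are hypothetical. The last quantity is a derived figure: the share of participants who did perceive the shape (groups 1 and 2) and were not assigned to group 3 by the model.

```python
# Rows: actual group; columns: predicted groups 1, 2 and 3 (counts from Table 3)
CONFUSION = {
    1: [18, 2, 3],
    2: [7, 22, 0],
    3: [7, 4, 6],
}


def per_group_accuracy(confusion):
    """Percentage of each actual group that was assigned to its own group."""
    return {group: 100.0 * row[group - 1] / sum(row)
            for group, row in confusion.items()}


def overall_accuracy(confusion):
    """Percentage of all participants on the diagonal of the table."""
    correct = sum(row[group - 1] for group, row in confusion.items())
    total = sum(sum(row) for row in confusion.values())
    return 100.0 * correct / total


def perceivers_not_sent_to_group_3(confusion):
    """Percentage of groups 1 and 2 that the model did not assign to group 3."""
    rows = [confusion[1], confusion[2]]
    kept = sum(row[0] + row[1] for row in rows)
    return 100.0 * kept / sum(sum(row) for row in rows)


accuracy_by_group = {g: round(v, 2)
                     for g, v in per_group_accuracy(CONFUSION).items()}
overall = round(overall_accuracy(CONFUSION), 2)
perceivers = round(perceivers_not_sent_to_group_3(CONFUSION), 2)
```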

It must be noted that our statistical analysis revealed a certain overlap between groups, as shown by small eigenvalues of the discriminant functions and moderate values of the canonical correlation of the discriminant functions (Table 4), as well as Wilks' lambda values close to 1 (Table 5). Contrast of functions disclosed a statistically significant p-value of 0.024 for the whole model (f1 to f2), indicating that the model (including both functions) could be considered as discriminating.

Table 4. Eigenvalues, relative percentage of variance explained by each discriminant function and canonical correlation of the discriminant functions (small eigenvalues of the discriminant functions and moderate values of the canonical correlation of the discriminant functions reveal a certain overlap between groups)

Discriminant function    Eigenvalue    Relative percentage    Canonical correlation
f1                       0.386         65.65                  0.528
f2                       0.202         34.35                  0.410

Table 5. Wilks' lambda and p-values testing the discriminating power of our model (Wilks' lambda values close to 1 reveal a certain overlap between groups; a p-value of 0.024 for the whole model (f1 to f2) indicates statistically significant discriminating power)

Contrast of functions    Wilks' lambda    p-value
f1 to f2                 0.600            0.024
f2                       0.832            0.180
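The quantities in Tables 4 and 5 are linked by standard relations of canonical discriminant analysis, which can be checked from the two reported eigenvalues. The sketch below is an illustration under that assumption rather than a reproduction of the SPSS output: the canonical correlation is sqrt(eigenvalue / (1 + eigenvalue)), the relative percentage is each eigenvalue divided by their sum, and Wilks' lambda for a set of functions is the product of 1 / (1 + eigenvalue) over that set. Function names are hypothetical.

```python
import math

EIGENVALUES = [0.386, 0.202]  # f1 and f2, as reported in Table 4


def canonical_correlation(eigenvalue):
    """Canonical correlation associated with one discriminant function."""
    return math.sqrt(eigenvalue / (1.0 + eigenvalue))


def relative_percentages(eigenvalues):
    """Share of the between-group variance captured by each function."""
    total = sum(eigenvalues)
    return [100.0 * e / total for e in eigenvalues]


def wilks_lambda(eigenvalues):
    """Wilks' lambda for the functions whose eigenvalues are given."""
    return math.prod(1.0 / (1.0 + e) for e in eigenvalues)


correlations = [round(canonical_correlation(e), 3) for e in EIGENVALUES]
percentages = [round(p, 2) for p in relative_percentages(EIGENVALUES)]
lambda_f1_to_f2 = round(wilks_lambda(EIGENVALUES), 3)
lambda_f2 = round(wilks_lambda(EIGENVALUES[1:]), 3)
```

With the reported eigenvalues, these relations reproduce the tabulated canonical correlations, relative percentages and Wilks' lambda values to the precision shown.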

DISCUSSION

The present study aimed to explore the predictive accuracy of diverse visual parameters and skills, mainly related to accommodation and convergence, to correctly categorise observers according to their ability to perceive SIRDS. Categorisation per se was not a goal of the present investigation. Data analysis through LDA was considered a useful tool to examine the relative contribution of each predictor to the explanation of high intersubject variability in SIRDS perception.

The ability to perceive SIRDS was found to be described within an accuracy of 66.67 per cent by the discriminant functions resulting from LDA. Whereas the discriminant functions offered a high predictive accuracy for groups 1 and 2, only one-third of participants previously allocated to group 3 were properly categorised. Of a total 52 participants from groups 1 and 2 (23 plus 29), only three observers were assigned to group 3, that is, 94.23 per cent of those participants capable of discovering the hidden 3D image in the auto-stereogram were correctly identified by our discriminant model.

The inability of some observers with normal stereovision to perceive SIRDS has been documented previously,18,19 suggesting that a certain amount of practice is required to achieve the desired dissociation of the automatic and natural coupling between accommodation and convergence. In addition, these authors reported higher success rates when observers were informed of the most appropriate viewing strategy to perceive SIRDS.20 Therefore, it might be speculated whether, with adequate information and proper training, many participants erroneously classified in group 3 would have been correctly allocated to group 1 or group 2.

Viewing distance, which involves three different depth cues (binocular disparity, accommodation and convergence), is of critical importance for the correct visualisation of SIRDS. Even if our results failed to disclose any significant association between the minimum time and viewing distance, it might be interesting to mention that measurements could have been, at least to some extent, confounded by the personality of each subject (some might be prepared to respond very quickly, while others might deliberate for longer, even after they have perceived the stereo-image) or by difficulty in moving the lectern (although it presented a smooth movement, some participants might have experienced difficulties in adjusting it to the proper distance at which the target could be seen).

The relative values of the standardised coefficients of the discriminant functions f1 and f2 revealed stereoacuity, negative relative convergence, phoria at near and, to a lesser extent, the AC/A ratio as the most relevant predictors for the categorisation of our participants. This finding is in agreement with the previously documented association between stereoacuity and the self-reported skill in perceiving Magic Eye stereograms, as measured by the standard clinical TNO test,17 and would suggest that, apart from stereoacuity, divergence is the most important factor to consider when exploring the ability to perceive SIRDS. An ANOVA analysis of all discriminant variables disclosed statistically significant between-group differences only for stereoacuity (p = 0.001) and NRC (p = 0.003). It might be noted that, from an optometric standpoint, this finding is not unexpected, as all participants perceived SIRDS with parallel fixation, that is, uncrossed disparity.

To further explore the predictive value of stereoacuity and negative relative convergence, a stepwise discriminant analysis approach was implemented, in which only stereoacuity and NRC were taken into consideration in the categorisation process. The results of this analysis revealed that, although statistical significance was attained (p < 0.001), the simplified model only allowed for the correct categorisation of 52.17 per cent of participants, compared with a predictive accuracy of 66.67 per cent with the complete model (nine discriminant variables). Therefore, it might be concluded that even though stereoacuity and NRC play a significant role in explaining the high intersubject variability in SIRDS perception, other visual parameters and skills are necessary to gain a better understanding of this particularly interesting visual phenomenon.

Finally, it is important to note that, although some of the differences between the numerical values of the visual parameters and skills under evaluation led to fair accuracy in discriminating our participants according to their ability to perceive SIRDS, many of the same differences could not be considered clinically significant. This fact might account for some of our findings being unexpected or difficult to explain. In effect, NRC values were expected to be higher in the participants requiring less time to perceive SIRDS and lowest in the participants in group 3, in which SIRDS were not perceived; however, the opposite trend was observed, even though there was no association between negative relative convergence and viewing distance. In addition, our analysis disclosed a moderate between-group overlap and, although not investigated in the present study, training and knowledge of viewing strategies might influence SIRDS perception (which could partially account for those participants failing to perceive the hidden shape altogether). Taken together, these factors probably advise against the implementation of SIRDS as a new tool for practitioners to use in their daily visual examination routine.

In conclusion, the ability to perceive SIRDS is related to many visual parameters and skills, including, but not limited to, stereoacuity and negative relative convergence. It is uncertain whether SIRDS might be considered a useful tool in clinical practice.

ACKNOWLEDGEMENTS

The authors would like to acknowledge the kind assistance of Miquel Ralló, from the Department of Applied Mathematics of the Technical University of Catalonia, in the statistical analysis of the data.

GRANTS AND FINANCIAL SUPPORT

This work was supported in part by a grant from the Science and Technology Ministry of Spain (Ref. PSI-2009-11062/PSIC).
