• Visual displays;
  • Visual perception;
  • Metacognition;
  • Folk fallacies;
  • Intuition;
  • Realism;
  • Spatial fidelity



Often implicit in visual display design and development is a gold standard of photorealism. By approximating direct perception, photorealism appeals to users and designers by being both attractive and apparently effortless. The vexing result from numerous performance evaluations, though, is that increasing realism often impairs performance. Smallman and St. John (2005) labeled misplaced faith in realistic information display Naïve Realism and theorized it resulted from a triplet of folk fallacies about perception. Here, we illustrate issues associated with the wider trend towards realism by focusing on a specific current trend for high-fidelity perspective view (3D) geospatial displays. In two experiments, we validated Naïve Realism for different terrain understanding tasks, explored whether certain individuals are particularly prone to Naïve Realism, and determined the ability of task feedback to mitigate Naïve Realism. Performance was measured for laying and judging a concealed route across realistic terrain shown in different display formats. Task feedback was either implicit, in Experiment 1, or explicit in Experiment 2. Prospective and retrospective intuitions about the best display formats for the tasks were recorded and then related to task performance and participant spatial ability. Participants generally intuited they would perform tasks better with more realism than they actually required. For example, counter to intuitions, lowering fidelity of the terrain display revealed the gross scene layout needed to lay a well-concealed route. Individuals of high spatial ability calibrated their intuitions with only implicit task feedback, whereas those of low spatial ability required salient, explicit feedback to calibrate their intuitions about display realism. Results are discussed in the wider context of applying perceptual science to display design, and combating folk fallacies.

1. Introduction


Designers and developers create visual displays to support work when the information needed to conduct that work is separate from (distal to) users (Sanders & McCormick, 1993). With visual displays, information sensing and coding are all indirect. This paper explores a tension in this most general framing of visual display use. A complex mix of factors is driving a trend to display information in more proximal, realistic ways than ever before, with photorealism as the implicit gold standard (Daukantas, 2009; Ferwerda, 2003). Many users and designers enthuse about realistic depiction, finding it familiar and intuitive and believing it to “minimize interpretive effort” (Dennehy, Nesbitt, & Sumey, 1994, p. 111). To them, highly realistic displays offer a perfect blend of form and function.

The literature on controlled performance evaluation of displays, though, shows mixed benefits at best from increasing realism. Different tasks pose different requirements and are best served by different display formats. For example, St. John, Cowen, Smallman, and Oonk (2001) showed that tasks requiring shape understanding, such as judging the rough layout of a scene, are performed best with realistically shaded, perspective view (three-dimensional or 3D) displays. On the other hand, tasks requiring relative position information, such as precisely judging distances and angles, are performed best with less realistic, two-dimensional (2D) topographic map displays. We review more examples of the effect of realism on task performance below.

Smallman and St. John (2005) coined the term Naïve Realism for users’ misplaced blanket faith in realistic information display that they reviewed in a number of studies. Further, Smallman and St. John laid out a theory which hypothesized that Naïve Realism resulted from folk fallacies, or misconceptions, about visual perception. Through everyday subjective experience, folk knowledge and intuitions have developed about how perception works and what it delivers. When these are mistaken, they are termed folk fallacies. For example, children and a sizeable proportion of adults believe extra-mission by light rays is the mechanism of sight. That is, they harbor the folk fallacy, espoused since antiquity, that seeing entails rays shooting from the eye to contact and perceive objects (Winer & Cottrell, 2004).

The Naïve Realism theory says that a triplet of folk fallacies, that perception is easy, accurate, and complete, underlies the misplaced faith in realistic information display. The folk fallacy that perception feels easy and effortless follows from the “inner screen” fallacy that image formation is sufficient for perception (Frisby, 1980; Pylyshyn, 2003). Rather than passively creating an inner screen, of course, over a century of perceptual science and physiological optics has revealed the complexities of perception’s active image interpretation. The folk fallacy that perception feels accurate is related to the “illusion of objectivity” that the inner perceptual world mirrors and re-creates an accurate, metric representation of the external world (Loomis, 1992; MacLeod & Willen, 1995). Again, modern perceptual science is rife with examples of perception’s inaccuracies, “illusions,” and nonmetric approximations to reality (Cavanagh, 2005). The folk fallacy that perception feels complete is related to the “illusion of visual bandwidth” that perception continually updates everything in view (Varakin, Levin, & Fidler, 2004). The recent explosion of interest in change blindness and related attentional phenomena has debunked this myth with myriad demonstrations of the sparse sampling of the visual scene and the filling in of material not actively fixated (Simons & Rensink, 2005).

The Naïve Realism theory explains both the preference for, and the pattern of performance with, realistic displays. The theory explains the preference for realism because the folk fallacies lead directly to the mistaken intuition that an externally presented realistic display will be effortlessly and directly transformed into a complete, accurate internal representation kept updated on an inner screen. In reality, perception transforms realistic displays into an incomplete, flawed approximation of reality that is sparsely updated. The theory explains the pattern of performance on different tasks with realistic displays because their internal representation is only useful for gross judgments of scene layout, explaining why 3D displays can support shape understanding tasks. When tasks require any precision or metric judgment, then the imperfect, distorted representation from 3D displays will not suffice, and an undistorted display that makes scene dimensions explicit, such as absolute altitude with topographic contours, is required, explaining why 2D displays can support relative position tasks. See Smallman and St. John (2005) for more examples and details of the Naïve Realism theory.

1.1. Naïve Realism and display design trends

Here, we offer a new account of how the Naïve Realism theory explains trends in display design. We illustrate the wider trends and issues associated with realistic information display using geospatial displays as an example. Geospatial displays mediate perception of parts of the 3D physical world for users that may be separated from it. For example, air traffic control, military and civilian emergency operations, the control of unmanned vehicles, and meteorology all entail comprehending aspects of the physical 3D world from a geospatial display. Maps are perhaps the most familiar geospatial display, as they are in ubiquitous use for navigation tasks (for review, see MacEachren, 1995). The use of geospatial displays is growing, driven in part by their increased accessibility over the Internet and low-cost computers with fast graphical processors to generate them.

Fig. 1 is central to understanding the implications of the Naïve Realism theory for wider trends in geospatial display design discussed in this section. The figure is also used to contextualize and motivate the independent variables that were manipulated in two new experiments (situated at the yellow “increasing realism” banner). There are excellent, detailed descriptions of the history of geospatial display design and cartography available (e.g., MacEachren, 1995; McLeary, Jenks, & Ellis, 1991). Fig. 1 does not address all the fine distinctions raised in those histories. Instead, it attempts to account for the main developmental sequence they describe. Starting from unaided perception with the user’s visual system, the figure uses the Naïve Realism theory to account for the general order and progression of the main, natural categories of displays that have been created, and that are either in lab prototype or have been touted. There exist detailed taxonomies that have teased apart fine distinctions between the attributes of visual displays (e.g., Milgram & Kishino, 1994). Instead of maintaining all the differences in those taxonomies, Fig. 1 organizes displays by reproduction fidelity, which Naïve Realism posits as the driver of design trends.


Figure 1.  The illusory march of progress in display development towards photorealism. Over time, realism has increased from the addition of depth cues and higher spatial fidelity. Yet display designers and users are unwittingly marching towards an endpoint that would return users to their unaided, flawed perception of the world.


In this paper, the terms fidelity and realism are used interchangeably. Both refer to accuracy in detail of the reproduction of a scene, with high fidelity, realistic displays showing a scene akin to a natural view of it. There are many attributes of display fidelity, including contrast range and color reproduction. The two aspects of fidelity focused on here are depth cue fidelity and spatial fidelity. By taking a circuit around Fig. 1’s staircase, we review the trend and implications for increasing depth cue fidelity, spatial fidelity, and finally for the ostensibly unrelated trend for increasing user control of display content. This discussion naturally motivates five objectives for the new experiments.

1.1.1. Increasing depth cue reproduction fidelity

Fig. 1 caricatures trends in geospatial display design as “progress” on an Escher-like staircase that loops back on itself to end where it began. A user is shown following a display designer up the staircase past a parade of displays. The order of displays around the staircase represents design trends over time. Over time, display realism has increased from the addition of depth cues.

The starting point of the staircase for the user without a display is unaided perception. The first displays developed historically are the first encountered on the staircase. The first displays were 2D maps, which were top-down, 2D geospatial views designed to provide spatial appreciation of scenes. These 2D maps were later shown on flat visual displays driven by computers. Terrain height and relief in these 2D views was most commonly given by equal altitude, topographic, or “topo,” lines, to create so-called topo maps. It was long recognized, though, that topo maps are challenging to rapidly and accurately mentally reconstruct in three dimensions (e.g., Pick et al., 1995). Accordingly, with the advent of widely available fast rendering technologies from the digital computing revolution, display designers next created realistic perspective views of terrain—what were originally termed “3D maps” (Jenks & Brown, 1966)—which appear to dramatically and intuitively convey three-dimensional structure and shape needed to appreciate a scene. These “3D,” or “perspective view,” geospatial displays show perspective projections of scenes on flat screens from shallow viewing angles in realistically shaded depth relief. With increases in the speed of rendering technologies, and new sophisticated viewing control interfaces, static 3D displays gave way to dynamic 3D displays that enabled users to smoothly change viewing angles and positions to seamlessly explore depicted worlds.

At this point, let us pause to consider how the addition of depth cues meshes with the depth perception of the user. The human visual system uses about nine discrete cues to re-create the spatial layout and depth of objects (Cutting & Vishton, 1995). These depth cues have traditionally been listed from the simpler, so-called monocular cues that are available in each eye as a by-product of image formation, such as occlusion, to the more complex “binocular” and “ocular-motor” cues that result from the geometry and optics of imaging the world through laterally offset eyes. Note how the trend for greater realism in geospatial displays reviewed to this point recapitulates this list. The original 2D topo maps were not realistic. They possessed only the monocular depth cue of occlusion: objects shown in a topo map lay on, or in front of, the terrain they occluded. Next, the static 3D perspective views added the rest of the monocular depth cues to the display: linear perspective, foreshortening, relative size, and height in the visual field.

Proceeding along the staircase, the user is next shown passing the dynamic 3D display, typical of those currently marketed for desktop PC applications, that adds the depth cue of motion perspective (or parallax) to the static 3D map. The designer is shown further up the staircase, heading towards displays further along the research and development pipeline that are now maturing from prototype to commercial systems. The dynamic stereo 3D displays add disparity cues to support binocular stereopsis. The designer is rounding the corner on the staircase to point towards future displays, presently available only as lab prototypes, which add the ocular-motor depth cues to complete the “fill in” of the remaining missing depth cues for the fullest fidelity, immersive 3D experience. For review of these new volumetric and immersive, dynamic 3D display technologies, see Bowman, Kruijff, LaViola, and Poupyrev (2005).

The Naïve Realism theory explains the motivation to achieve full fidelity depth cue realism and it exposes the logical flaw in chasing this gold standard. When one reaches the gold standard, one throws the user back on his or her perceptual apparatus, which is geared towards providing an imperfect, just good enough representation of the world. In other words, one returns the user to the starting point of unaided perception. The idea that increasing depth cue fidelity could impair performance seems anathema to users, designers, and even researchers. How could adding more depth cues possibly impair performance? For example, there was a burst of studies evaluating the addition of the cue of binocular disparity when stereoscopic display technologies began maturing in the late 1980s and early 1990s. Authors of studies that found no benefit for stereo in improving situational awareness often chastised themselves for failing to make the new cue strong enough, or failing to choose the right evaluation metric (e.g., Steiner & Dotson, 1990). The more likely culprit was the cue itself, as the best psychophysical data show that binocular fusion for stereopsis inherently degrades spatial localization performance (McKee, Levi, & Bowne, 1990). A similar tone of surprised disappointment pervades the literature failing to find evidence for temporal realism (in the form of animation) supporting diagram comprehension and instruction (Tversky, Morrison, & Betrancourt, 2002). For other examples and discussion, see Smallman and St. John (2005).

For another example of the ways in which adding depth cues can impair performance, consider the move from 2D displays to static 3D displays, the subject of the experiments reported below. The situation is intricate. On the downside, perspective projection in the 3D views integrates and conflates three spatial dimensions into a two-dimensional image, resulting in massive line of sight ambiguity (Sedgwick, 1986). On the upside, the perspective projection in the 3D display adds the monocular depth cues of linear perspective, foreshortening, relative size, and height in the visual field to the distance of objects to help resolve this ambiguity. Another downside, though, is that linear perspective and foreshortening differentially compress the scene view. Geometrically, widths in perspective projections decay into the scene in inverse proportion to distance (linear perspective), while depths decay faster, in inverse square with distance (foreshortening) at shallow viewing angles. Psychologically, participants appear to employ a simplifying heuristic that all scene elements scale at the same slower rate as linear perspective, and they scale depth estimates by width alone (Smallman, St. John, & Cowen, 2002). This “cross-scaling” results in consistent distance underestimation errors, even though several additional depth cues have been added to the scene in the 3D perspective views, contributing to the poor relative position performance observed with these displays (St. John et al., 2001). The same pattern of distance underestimation is made both with unaided and display-mediated perception of the world (Loomis & Knapp, 2003), underscoring the fact that it is inherent to the way the visual system combines depth cues.
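
The geometric argument above can be made concrete with a small numerical sketch. It assumes a simple pinhole projection with the eye at a fixed height above a flat ground plane, which is an illustrative simplification and not the rendering model used in the experiments. The function names, the reference distance, and the scene values are all hypothetical, and the "observer" is only a caricature of the cross-scaling heuristic.

```python
def projected_width(width, distance, focal=1.0):
    """Image size of a ground segment lying across the line of sight."""
    return focal * width / distance


def projected_depth(depth, distance, eye_height=1.0, focal=1.0):
    """Image size of a ground segment running away from the viewer.

    The segment spans distance .. distance + depth on a flat ground plane.
    """
    near_edge = focal * eye_height / distance
    far_edge = focal * eye_height / (distance + depth)
    return near_edge - far_edge


def cross_scaled_estimate(depth, distance, reference_distance):
    """Depth reported by a hypothetical observer who scales depth like width.

    Assumption: the observer judges the segment correctly at
    reference_distance and expects its image to shrink only as 1 / distance.
    """
    actual_shrink = (projected_depth(depth, distance)
                     / projected_depth(depth, reference_distance))
    assumed_shrink = (projected_width(1.0, distance)
                      / projected_width(1.0, reference_distance))
    return depth * actual_shrink / assumed_shrink


# Hypothetical scene: a 1-unit segment viewed from 10 and then 20 units away.
width_ratio = projected_width(1.0, 20.0) / projected_width(1.0, 10.0)
depth_ratio = projected_depth(1.0, 20.0) / projected_depth(1.0, 10.0)
estimate = cross_scaled_estimate(1.0, 20.0, reference_distance=10.0)
```

Under these assumed values, doubling the distance halves the projected width but shrinks the projected depth to roughly a quarter, so an observer who scales depth at the width rate reports only about half of the true depth. This is the direction of error the text describes, not a fit to any reported data.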

Users and their displays form a joint cognitive system, with information presentation and extraction shared between them (Woods & Roth, 1988). Naïve Realism highlights another downside of increasing display realism with regard to this point. There is a hitherto unrecognized shift in the burden of extracting information from the displays to the users in the progression of Fig. 1. The extraction burden shifts from the display to the user as depth cues are added and information is either taken away or masked. For example, the topo contours of the 2D display extracted and presented absolute altitude information about the scene for users. In the next, static 3D display, users must extract terrain altitude themselves from the display, with their own eyes. And when they must metrically extract distance from such 3D views, they may use heuristics such as cross-scaling to only achieve approximate estimates. The first motivation of the experiments reported below is to systematically validate the Naïve Realism of experimental participants performing these simple metric relative position estimation tasks with 2D and 3D displays.

1.1.2. Increasing spatial reproduction fidelity

The second major aspect of reproduction fidelity considered here is spatial fidelity (accuracy in spatial detail). To the lay public, the desire for high spatial fidelity is compelling. At the time of writing there is zealous marketing of “high definition” television (HD TV), media, and PC displays. As computing power has increased, it has allowed the creation of digital terrain models in higher spatial fidelity, and the raster scanning of more and more pixels to create higher fidelity images of them. At some point, though, if enough pixels are packed into an image, spatial filtering in the visual system of the user’s eye and brain must reduce depicted and real images of a scene to indistinguishable spatial metamers of each other. The rendering literature currently celebrates techniques to capture the translucency of skin and the smallest subtleties of haze and inter-reflection of objects in complex 3D scenes, all at such fine pixelation as to become indistinguishable from the real-world scene referent (Daukantas, 2009). The gold standard of spatial photorealism thus mirrors and reinforces that for depth cue fidelity.

The Naïve Realism theory proposes that the desire for high spatial fidelity is based in the “inner screen” fallacy that perception is easy. Instead, perception is complex. Realism can clutter and mask task-relevant information. For example, elaborately detailed realistic icons may mask information about identity. Simplifying and caricaturing objects improves performance by preserving task-relevant information and removing task-irrelevant information. We found support for this design approach with a new symbology of caricatured icons of Navy ships and planes called “Symbicons” (Smallman, St. John, Oonk, & Cowen, 2001). In related experiments, a majority of participants mistakenly predicted they would identify high fidelity realistically rendered 3D icons of ships and planes faster than either unrealistic symbols or Symbicons.

Other recent work has highlighted the misguided effort of striving for full spatial fidelity. This work has shown the perceptual insensitivity to deviations from realism in scenes. Cavanagh (2005) pointed out that the liberties taken in shading and perspective by Renaissance and other painters are surprisingly hard to spot. To him, the insensitivity to transgressions of realism suggests that “our visual brain uses a simpler, reduced physics to understand the world” (Cavanagh, 2005, p. 301). The effect has not only been exploited in art but also in design, specifically, in cartography. In several of his maps from the early 1960s, the famous Swiss cartographer Imhof used a technique of locally changing lighting direction away from the standard northwest direction to prevent notable terrain features from being concealed in shade (Imhof, 1982).1 Again, these lighting changes in the maps are not noticeable. In addition to local illumination inconsistencies, structural inconsistencies may also go unnoticed. It was Escher’s genius to exploit perception’s local computations of shape and structure to create globally impossible scene configurations, such as in our impossibly looped staircase in Fig. 1. Of course, if the brain uses local computation and simpler physics to understand the world, that implies that much of the effort of creating high-fidelity photorealistic, perspective renderings in geospatial displays is wasted.

Here, we demonstrate the utility of a new terrain simplification concept that enables terrain to be caricatured by low-pass spatial filtering (smoothing) of the underlying digital elevation model (see Fig. 2, right). By obliterating fine details and reducing clutter, gross scene structure and layout are revealed. Note how the canyons and main terrain arteries stand out in the smoothed terrain view in Fig. 2, right. A different example of unmasking scene structure is shown on the left of Fig. 2. This shows a recent Physics Today cover graphic of a new geospatial laser mapping technology that can penetrate tree cover (Carter, Shrestha, & Slatton, 2007). The authors themselves were surprised that the resulting visual representation revealed more underlying topography by showing less information. The second motivation of the two experiments is to test the Naïve Realism prediction that participants will intuit laying a route better with high spatial fidelity 2D and 3D terrain displays, when, in fact, simplifying the terrain and lowering its fidelity should improve the shape understanding of the terrain necessary to do the task.
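
As a rough illustration of low-pass filtering an elevation grid, the sketch below applies a simple moving-average (box) filter. The filter type, the kernel radius, the function name, and the sample grid are assumptions made for illustration; the text does not specify the filter actually applied to the digital elevation models.

```python
def smooth_elevation(dem, radius=1):
    """Return a low-pass filtered copy of a 2D elevation grid.

    Each cell becomes the mean of the cells within `radius` rows and
    columns of it, with the window clipped at the grid edges.
    """
    rows, cols = len(dem), len(dem[0])
    smoothed = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            window = [
                dem[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            out_row.append(sum(window) / len(window))
        smoothed.append(out_row)
    return smoothed


# Hypothetical 4 x 4 elevation grid with one sharp spike (fine detail).
dem = [
    [100, 100, 100, 100],
    [100, 180, 100, 100],
    [100, 100, 100, 100],
    [100, 100, 100, 100],
]
smoothed = smooth_elevation(dem, radius=1)
```

A larger radius removes more fine detail, so in this sketch it plays the role of the fidelity manipulation: the isolated spike is spread into its neighbors while the broad plateau is left unchanged.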


Figure 2.  Seeing more with less. Left, a recent Physics Today cover illustrating a new geodetic laser scanning technique for revealing terrain topography normally concealed by foliage and other clutter (Carter et al., 2007; reprinted with permission from Physics Today, Volume 60, Issue 12, December 2007. Copyright 2007, American Institute of Physics). Right, terrain fidelity reduction by low-pass spatial filtering, from high fidelity unfiltered (top) to low fidelity filtered (bottom), tested in Experiment 1. In both cases, gross terrain structure is unveiled.


1.1.3. Increasing user control of display format

The third trend in display design is for users to be able to customize and tailor the content of their displays (Cruz, 1996). There are two design philosophies governing the content and presentation format of visual displays. The first philosophy emerging from the field of cognitive engineering is to analyze the information requirements of a work domain and then decide and design for users what should be represented on their displays (Bennett, Nagy, & Flach, 1997; Woods & Roth, 1988). Nuclear power plant and air traffic control displays have often been designed according to that philosophy. The second design philosophy is to let users decide for themselves what they want to have on their display by making the display customizable and tailorable (Cruz, 1996). For example, meteorological displays and geographic information systems allow users to see a scene in a variety of different formats, each with customizable color-coding and tailorable overlays. Because it reduces design burden, applies across work domains, appeals to user preference, and is enabled by interactive visualization technology, the second design philosophy is currently ascendant.

The design philosophy of user customization is premised on the notion that users know what is good for them, or, in technical parlance, that they possess “meta-representational competence” (diSessa, 2004). But do they? In recent studies on user-configurable meteorological displays, a third of participants intuited that the addition of task-irrelevant meteorological variables and realism would improve performance in both simple read-off and inference tasks when, in fact, it slowed them down (Hegarty, Smallman, Stull, & Canham, 2009). The third motivation of the experiments reported here, consistent with Hegarty et al., was to systematically manipulate several aspects of display format while measuring individuals’ intuitions for which formats would support different tasks. The results speak to how well displays might be configured and customized in real-world use.

1.2. Individual differences in Naïve Realism and factors that may moderate it

To this point, Naïve Realism has been portrayed as something universal, fixed, and immutable. This is not from dogmatism but rather because the theory is immature and requires refinement. Of course, realism is not the sole driving force behind display design, innovation, and user preference. The final impetus of the experiments reported here is to provide empirical data to refine the theory. What factors determine how likely one is to exhibit Naïve Realism and remain Naïvely Realistic after experience with visual displays? Can we identify these factors with psychometrics? Individuals differ in their spatial ability and style of problem solving, and both have been shown to affect performance with visual displays. For example, spatial ability correlates with performance extracting information from maps (e.g., Scholl & Egeth, 1982). In addition, those of lower spatial ability whose style is to solve problems visually (so-called low-spatial visualizers) have difficulty interpreting abstract spatial representations, such as graphs, and constructing problem representations that extract only the relevant information from the problem (Hegarty & Kozhevnikov, 1999). Finally, in recent studies, we found that Navy weather forecasters of lower spatial ability put more extraneous variables in their forecasting displays than needed to perform the task (Smallman & Hegarty, 2007). That is, they made them overly complex. The fourth motivation for the experiments is to relate the spatial ability of participants to their intuitions and performance with geospatial displays to determine whether individuals of low spatial ability are particularly Naïvely Realistic.

The maintenance of Naïve Realism after performing a task is particularly intriguing. Naïve Realism implies a psychological dissonance between continued positive intuitions for realistic displays that must be maintained in the face of continued negative experience performing with them. Originally, Smallman and St. John (2005) argued that the maintenance of Naïve Realism results from a combination of misunderstanding and misestimating task demand characteristics, and being oblivious to perception’s subtleties and to how perception’s just-in-time, just-good-enough character cocoons one in a bubble of unawareness of its failings. To pierce this “bubble” of unawareness may require some combination of salient closed-loop feedback on task performance, extended experience with a visual display format, or heightened sensitivity to feedback. There is a large and complex aptitude-by-treatment interaction literature attempting to draw out the relationships and dependencies between individual differences and training styles and regimes. One particular study by Kyllonen, Lohman, and Snow (1984) provides some of the best data that spatial ability may have a moderating role in the effectiveness of task feedback. The fifth and final motivation for the experiments, therefore, is to explore what can mitigate Naïve Realism after performing a task: Are those of higher spatial ability more sensitive, receptive, and calibrated to their experience with visual displays? For those of lower spatial ability, can explicit task feedback compensate for spatial ability in calibrating intuitions?

2. Experiment 1


2.1. Method

2.1.1. Participants

Thirty-one college students or graduates (15 male, 16 female) with a mean age of 30.5 (range 18–60 years) were recruited from and were paid $30 for their participation.

2.1.2. Design

In a fully repeated-measures design, the 31 participants performed four terrain understanding tasks on terrain views shown in each of eight different display formats. The eight formats were created by manipulating three independent variables (IVs): 2 (depth relief format) × 2 (viewing angle) × 2 (terrain fidelity). Two of these variables manipulated the key differences between the natural categories of 2D and static 3D displays (see Fig. 1) by varying depth relief format (shaded vs. topo) and viewing angle (top-down 90° vs. shallow 45°). A third IV contrasted two levels of terrain spatial fidelity (high vs. low). In Fig. 3, the same piece of terrain is shown in each of the eight views, henceforth interchangeably referred to as “display formats.”
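
The crossing of the three independent variables can be enumerated in a few lines. This is only a sketch of the 2 × 2 × 2 structure described above; the variable names and level labels are chosen here for illustration.

```python
from itertools import product

# Levels of the three independent variables, as named in the design.
depth_relief = ["shaded", "topo"]
viewing_angle_deg = [45, 90]
terrain_fidelity = ["high", "low"]

# Fully crossing the three two-level variables yields the display formats.
display_formats = [
    {
        "depth_relief": relief,
        "viewing_angle_deg": angle,
        "terrain_fidelity": fidelity,
    }
    for relief, angle, fidelity in product(
        depth_relief, viewing_angle_deg, terrain_fidelity
    )
]
```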


Figure 3.  The eight display formats formed from the intersection of the depth relief format (shaded vs. topo), viewing angle (45° vs. 90°), and terrain fidelity (sharp vs. smoothed) IVs. The same terrain is shown in the eight display formats, with the starting route shown as a blue dotted line. The small vertical insets show the altitude legends for shaded and topo formats. Average realism scores are beneath each display format, with the highest for the top left format and the lowest for the bottom right. The terrain views shown to participants were equal in size.


Two of the terrain understanding tasks were shape understanding (concealed route laying) and two were relative position (route altitude estimating). Performance on these tasks was subsequently related to intuitions, on a participant-by-participant basis, to test predictions from the Naïve Realism theory. To determine the realism of the eight display formats, a subset of participants rank ordered the eight display formats in terms of how realistically each depicted terrain, yielding an average realism score for each display format (intra-class correlation, r = .73) shown in Fig. 3. By rank ordering the displays in terms of perceived realism, the realism score for the display that actually supported the best performance could be compared with the realism score of the display that participants intuited they would do the task best with. The realism score method allowed us to quantify the display realism needed and the realism intuited, so that the two could be compared. The Naïve Realism theory predicts that participants will intuit needing more display realism than is actually necessary to support a task. Note that the realism scores were used only for analysis, and had no impact on which displays were presented.
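
The comparison between realism needed and realism intuited can be sketched as follows. The sketch assumes that a higher rank means a more realistic format and that the realism score is the plain mean of the ranks; the format labels and rankings are hypothetical values for illustration, not data from the study.

```python
from statistics import mean


def realism_scores(rankings):
    """Average each format's rank across raters (higher = more realistic)."""
    formats = rankings[0].keys()
    return {fmt: mean(ranking[fmt] for ranking in rankings) for fmt in formats}


def excess_realism(scores, intuited_format, best_format):
    """Realism intuited minus realism needed; positive means over-intuited."""
    return scores[intuited_format] - scores[best_format]


# Hypothetical rankings from two raters over three formats.
rankings = [
    {"format_a": 3, "format_b": 2, "format_c": 1},
    {"format_a": 3, "format_b": 1, "format_c": 2},
]
scores = realism_scores(rankings)

# Hypothetical participant who intuits format_a but performs best with format_c.
difference = excess_realism(scores, "format_a", "format_c")
```

A positive difference corresponds to the pattern the theory predicts: the participant intuits needing a more realistic display than the one that actually supported the best performance.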

Though not directly bearing on the theory, it is illuminating to consider how the different IVs contributed to judgments of realism. Referring to the shaded displays in Fig. 3 (four left-hand displays), participants assessed the sharp displays (top row) as more realistic than their smoothed counterparts (bottom row) for both 90° and 45° views: sharp 90° shaded (M = 6.5, SD = 1.9) vs. smoothed 90° shaded (M = 3.5, SD = 1.8), t(5) = 4.4, p < .01; sharp 45° shaded (M = 7.0, SD = 2.0) vs. smoothed 45° shaded (M = 4.7, SD = 2.2), t(5) = 2.9, p < .05. Referring to the sharp displays in Fig. 3 (top row), participants assessed the shaded displays as more realistic than their topo counterparts for both 45° and 90° views: sharp 45° shaded (M = 7.0, SD = 2.0) vs. sharp 45° topo (M = 4.8, SD = 1.0), t(5) = 3.1, p < .05; sharp 90° shaded (M = 6.5, SD = 1.9) vs. sharp 90° topo (M = 3.3, SD = 2.2), t(5) = 2.9, p < .05. Viewing angle did not lead to any significant differences in perceived realism.

A different piece of terrain was shown for each display format to prevent any carry-over of terrain knowledge. The eight pieces of terrain were always presented in the same order, but the order of the display formats, and therefore the display format to terrain pairing, was counterbalanced using a Latin Square design.
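The counterbalancing scheme can be illustrated with a short sketch. The text does not say which Latin square was used, so the cyclic square, the format labels, and the rule for assigning participants to rows below are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical labels for the 2 x 2 x 2 display formats.
FORMATS = [
    f"{fidelity} {angle} {relief}"
    for relief, angle, fidelity in product(
        ("shaded", "topo"), ("45", "90"), ("sharp", "smoothed")
    )
]


def cyclic_latin_square(items):
    """Return an n x n Latin square: row r is the item list rotated by r."""
    n = len(items)
    return [[items[(r + c) % n] for c in range(n)] for r in range(n)]


def format_order(participant_index, square):
    """Assumed assignment rule: participants cycle through the rows."""
    return square[participant_index % len(square)]


square = cyclic_latin_square(FORMATS)

# Terrain pieces keep a fixed order, so position c in a row pairs the
# format at that position with terrain piece c + 1.
for terrain, fmt in enumerate(format_order(0, square), start=1):
    print(f"terrain {terrain}: {fmt}")
```

Each format appears exactly once in every row and every column, so across eight consecutive participants each format is paired once with each terrain piece.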

2.1.3. Stimuli

Eight swaths of mountainous terrain were cut from a selection of digital elevation models from the U.S. Geological Survey. Identical starting diagonal routes were designated across each piece of terrain, from lower left to upper right. Using real terrain increased external validity but complicated experimental control. Thus, several steps were taken to equate task difficulty as much as possible across the terrain set. First, each piece of terrain was graded against six criteria to ensure the presence of similar features (e.g., mountainous regions) across the set. A terrain swath was rejected if it lacked these features. Second, the digital elevation models were normalized to possess the same altitude range. Third, the starting routes were constructed to be initially visible from the same amount of surrounding terrain (i.e., equally concealed). This was accomplished through minimally adjusting waypoints away from the diagonal straight-line route to attain the same starting exposure score in each terrain piece (see Fig. 3). Finally, pilot testing confirmed that starting routes in each piece of terrain had approximately the same room for improvement in reducing their exposure to the surrounding terrain. The terrain pieces were then presented in different display formats according to the IVs, which are reviewed in turn.

First, terrain spatial fidelity was either high, or lowered by spatially filtering the terrain digital elevation models. To lower fidelity, custom software convolved the digital elevation models with a Gaussian low-pass spatial filter with a fixed space constant of 3 pixels (a value that pilot testing determined appropriate for the specific terrain pieces used). The spatial filtering had the effect of smoothing the terrain. Henceforth, we refer to the unfiltered, high-fidelity terrain as sharp, and the low-fidelity terrain as smoothed. The filtered or unfiltered digital elevation models were then meshed for visualization.
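A minimal sketch of this kind of smoothing is given below. It is not the custom software described above. It assumes that the 3-pixel space constant corresponds to the Gaussian standard deviation, that the kernel is truncated at three standard deviations, and that edges are handled by reflection; none of these details are stated in the text.

```python
import numpy as np


def gaussian_kernel(sigma_px, truncate=3.0):
    """1-D Gaussian kernel, normalized to sum to 1 (truncation is assumed)."""
    radius = int(truncate * sigma_px + 0.5)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma_px ** 2))
    return kernel / kernel.sum()


def smooth_dem(dem, sigma_px=3.0):
    """Low-pass filter a 2-D elevation grid with a separable Gaussian."""
    kernel = gaussian_kernel(sigma_px)
    radius = len(kernel) // 2
    # Reflect-pad so the output keeps the input shape (assumed edge rule).
    padded = np.pad(np.asarray(dem, dtype=float), radius, mode="reflect")
    # Filter along rows, then along columns.
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded
    )
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, rows
    )


# Synthetic stand-in for a "sharp" elevation model.
rng = np.random.default_rng(0)
sharp = rng.random((64, 64)) * 1000.0
smoothed = smooth_dem(sharp, sigma_px=3.0)
print(sharp.std(), smoothed.std())
```

Because the kernel sums to 1, the filter preserves the mean elevation while removing fine-scale variation, which is the sense in which the smoothed terrain has lower spatial fidelity.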

Second, depth relief format was manipulated by showing the terrain mesh in either shaded or topo depth relief. To accommodate viewing from different viewing angles, the two reliefs were built with a “texture draping” procedure. Shaded relief was created by draping a gray matte texture lit from the conventional northwest (top-left) direction. Topo relief was created by draping a white texture over the terrain mesh and then adding appropriate color-coded contour lines of equal altitude increments. There was no shading in the topo format. A color legend and scale were added to the side of the topo view; see Fig. 3. In both reliefs, colored dots were added at the locations of the highest and lowest altitudes in the scenes in the appropriate color from the topo legend to facilitate scene interpretation.

Third, the scene views were rendered in perspective from either 45° or 90° viewing angles using standard camera geometry. The computer interface interactions for the resulting 90° and 45° views were equated as much as possible while respecting the inherent line of sight ambiguities of the two views (Sedgwick, 1986). In both views, the mouse cursor changed into a thin crosshair when a waypoint was selected and dragged. This allowed participants to clearly pinpoint the new location where the waypoint would be placed.

Finally, for each terrain display the start and finish route locations were indicated by large dark blue dots, with start at lower left, and finish at upper right. The starting route was defined by four equally spaced adjustable waypoints. These were shown as large light blue dots placed diagonally between the start and finish locations. Strings of smaller blue dots defined the segments (route “legs”) between the waypoints; see Fig. 3. The large waypoints could be dragged and dropped with the mouse to define and adjust routes. All waypoints lay on the terrain surface.

2.1.4. Procedure

After informed consent and Ishihara plate color vision screening, participants completed the Vandenberg Mental Rotation Test (MRT) of spatial ability (Vandenberg & Kuse, 1978). Next, participants were asked to role-play a military surveyor whose primary mission was to lay concealed routes through unfamiliar terrain.

The route-laying mission was divided into four tasks that supported a natural, goal-directed sequence of first laying and gauging a coarse initial route, and then refining and defining it (see Fig. 4). The design of these four tasks allowed us to separately study shape understanding and relative position aspects of terrain appreciation. The objective of the shape understanding tasks was to create a route that was concealed from as much of the surrounding terrain as possible. These tasks required understanding the shape of terrain for gauging lines of sight to and from the route with a richer, more continuous, and global task than the more localized (can A see B?) line of sight judgments used previously (St. John et al., 2001). The objective of the relative position tasks was to estimate the altitude of the waypoints defining the routes as accurately as possible. These tasks required understanding the precise altitudes of route waypoints. The route-laying and altitude-estimating tasks were performed for both an initial route, coarsely defined by four waypoints, and a final route, more precisely defined by 14 waypoints.


Figure 4.  The four experimental tasks in order. The route-laying shape-understanding Tasks 1 and 3 (top row), and the altitude-estimating relative position Tasks 2 and 4 (bottom row). The panel below each terrain view shows the estimated altitude profile of the route.


To evaluate route-laying performance, we developed a new shape-understanding metric that assessed the concealment of a route. Overall route concealment was operationalized as the average length of lines of sight extending from the route to all surrounding terrain. Since these lines extended from the route itself to the surrounding terrain, they measured, and showed, the amount of terrain to which the route was exposed; see Fig. 5. We termed the line of sight metric the exposure score, and the corresponding visualization of these lines-of-sight superimposed on the terrain the exposure envelope. Illustrations of the exposure envelope, corresponding exposure score, and the impacts on both resulting from route adjustment were used to explain the goal of minimizing route exposure to the participants; see Fig. 5.


Figure 5.  Illustration of the exposure envelope (yellow), exposure score metric, and change score demonstrating a route made less (right) or more (left) exposed. These were used to explain the concept of route exposure and the scoring metric to participants in both experiments, and as actual feedback aids in Experiment 2.


To derive an exposure score, a route was always graded against unfiltered, high-fidelity digital elevation models, in order to use the same standard of performance measurement across trials. Further, this method of grading performance ensured that the task was not simply easier when terrain fidelity was low.
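To make the idea of an exposure score concrete, the sketch below computes a simplified version on a gridded elevation model. It is an illustration under stated assumptions and not the metric used in the experiments. It assumes that each route point casts a fixed number of horizontal rays, that a terrain cell is visible when its elevation angle is at least as large as that of every nearer cell on the ray, and that a line of sight ends at the farthest visible cell. The eye height, the ray count, and the function names are hypothetical.

```python
import math

import numpy as np


def sight_length(dem, row, col, angle, eye_height=1.0, step=1.0):
    """Distance from (row, col) to the farthest visible cell along one ray."""
    n_rows, n_cols = dem.shape
    eye = dem[row, col] + eye_height
    d_row, d_col = math.sin(angle), math.cos(angle)
    max_slope = -math.inf
    farthest = 0.0
    dist = step
    while True:
        r = int(round(row + d_row * dist))
        c = int(round(col + d_col * dist))
        if not (0 <= r < n_rows and 0 <= c < n_cols):
            break
        slope = (dem[r, c] - eye) / dist
        if slope >= max_slope:  # not hidden behind nearer terrain
            max_slope = slope
            farthest = dist
        dist += step
    return farthest


def exposure_score(dem, route, n_rays=36):
    """Mean line-of-sight length over all route points and ray directions."""
    lengths = [
        sight_length(dem, r, c, 2.0 * math.pi * k / n_rays)
        for r, c in route
        for k in range(n_rays)
    ]
    return sum(lengths) / len(lengths)


# Toy terrain: an open plain versus a plateau cut by a canyon along row 20.
flat = np.zeros((41, 41))
canyon = np.full((41, 41), 50.0)
canyon[20, :] = 0.0
route = [(20, c) for c in range(5, 36, 5)]
print(exposure_score(flat, route), exposure_score(canyon, route))
```

In this toy example the same route scores lower inside the canyon than on the open plain, because the canyon walls cut most sight lines short. Grading always against the unfiltered elevation model, as described above, would correspond to passing the sharp grid as `dem` regardless of which display the participant saw.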

For each display format, participants completed the four tasks in the order illustrated in Fig. 4. These tasks are listed below with the specific terrain appreciation (shape understanding vs. relative position) measured by each task specified:

  1. Speeded initial route laying (shape understanding)
  2. Initial route altitude estimating (relative position)
  3. Self-paced final route laying (shape understanding)
  4. Final route altitude estimating (relative position)

The details of each task are covered in turn.

1. Initial route laying: First, in the speeded initial route-laying task, participants were instructed to reduce the starting route’s exposure to the surrounding terrain. This required participants to quickly search for canyons and valleys through which to lay promising concealed routes. Participants were advised to lay the initial route in a way that it could maximally benefit from the fine adjustments made later in the final route-laying task. To create this initial route, participants used the mouse to drag and drop the four adjustable waypoints (see the coarsely adjusted initial route in Fig. 4, Task 1). The waypoint adjustment range was restricted by the software to prevent the creation of routes with unfeasibly sharp turns or grossly unevenly spaced waypoints. If one of these constraints was violated by a waypoint movement, the software popped the waypoint back to its starting location to prevent that movement. To assess initial route-laying performance, the percent difference between the starting and initial route exposure scores was calculated.

2. Initial altitude estimating: Second, in the initial altitude-estimating task, participants were asked to re-create, as accurately as possible, an altitude profile of the route from the terrain view above. On an altitude panel presented beneath the terrain view, the top and bottom lines represented the highest and lowest points on the terrain, and were marked with the corresponding colored dots from the altitude legend and terrain display. The altitude of the route’s fixed start and finish waypoints was correctly positioned on the panel, whereas the altitude of the four adjustable waypoints was set in the middle of the altitude panel as a default. Using the mouse, participants vertically adjusted each waypoint’s altitude in the panel to reflect their estimates of waypoint altitude (see Fig. 4, Task 2). The selected waypoint was highlighted in both the scene view and altitude panel to clarify which waypoint was being estimated and adjusted. To assess initial altitude estimating performance, altitude estimation error for the initial route was calculated.

3. Final route laying: Third, in the self-paced final route-laying task, participants were instructed to further reduce the exposure of their initial route. To allow for finer adjustments, the initial route was redefined with 14 evenly spaced adjustable waypoints. Unlike the speeded initial route-laying task, participants were instructed to take care and time on this task. To make fine route adjustments, participants adjusted the final route using the mouse to drag and drop waypoints and could also use the keyboard arrow keys for precise single pixel waypoint adjustments (see finely-adjusted route in Fig. 4, Task 3). Since the purpose of this final routing task was to precisely adjust the initial route, rather than to create an entirely new route, the adjustment range of the final route waypoints was further limited by the software to prevent radical departures from the initially laid route. To assess final route laying performance, the percent difference between the initial and final route exposure scores was calculated. Furthermore, initial and final route-laying performance, when summed, yielded a useful, net metric of route laying performance across the entire experiment.

4. Final altitude estimating: Fourth, in the final altitude-estimating task, participants were asked to re-create the altitude profile of the carefully adjusted final route. Participants vertically adjusted the 14 waypoints in the altitude panel to estimate waypoint altitude, just as they had in the initial altitude-estimating task (see Fig. 4, Task 4). To assess final altitude-estimating performance, altitude estimation error for the final route was calculated as in Task 2.

Participants were instructed that they had 7 min to complete all four tasks for each display format. They were told to quickly complete the initial route-laying and altitude-estimating tasks, and to spend the majority of their time and focus on the final route-laying task. To help participants adhere to these instructions, the proctor reset a large timer to 7 min at the beginning of each new terrain view. Participants performed all four tasks on a practice piece of terrain shown in their first display format before proceeding to the eight experimental views.

To test the Naïve Realism predictions about intuitions for each task, participants’ prospective intuitions of which display format would best support their performance were probed after instructions and before the tasks. After the experiment, retrospective intuitions about which display format did support the best performance were also gathered.

The entire procedure, including consent, instructions, data collection, and debrief, took about 2 h.

Several predictions were tested in Experiment 1. First, shaded depth relief was expected to support the route-laying shape-understanding tasks (1 and 3) better than topo, while topo relief was predicted to support the altitude-estimating relative position tasks (2 and 4). Second, lowering terrain fidelity was predicted to support the route-laying shape-understanding task by unmasking scene structure necessary to locate promising regions to conceal the route. Third, the Naïve Realism theory predicted that participants would intuit needing more display realism than necessary across tasks. Finally, we predicted that individuals of lower spatial ability would be more Naïvely Realistic than those of higher spatial ability before and after the experiment.

2.2. Results

Because of the complex multifaceted nature of this multiphase experiment, reporting of the results is summarized and synthesized wherever possible. First, we report route-laying and altitude-estimating performance, followed by results for spatial ability and intuitions. Response times were recorded for all tasks, but for brevity of exposition are only reported for cases of speed–accuracy relationships.

2.2.1. Performance

Net route-laying performance: Route-laying performance in each route-laying task was measured as the percent change in route exposure. A greater reduction in exposure indicated better route concealment and thus better task performance. Net percent change in route exposure summed over the two route-laying tasks yields an objective, overall metric to compare display formats from the beginning of the experiment to the end of the final route-laying task. Though we focus here on the results for Experiment 1, to facilitate interpretation and comparison of results across experiments, net route-laying performance for both Experiments 1 and 2 is graphed in Fig. 6.
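The percent-change measure and its net sum can be sketched as follows. The text does not specify the denominator, so the sketch assumes each task's change is expressed relative to the exposure score at the start of that task. The starting score of 174 is the value reported for Experiment 2; the initial and final scores are hypothetical.

```python
def percent_reduction(before, after):
    """Percent reduction in exposure, relative to the score before the task."""
    return 100.0 * (before - after) / before


start_score = 174.0    # starting-route exposure score reported for Experiment 2
initial_score = 140.0  # hypothetical score after Task 1
final_score = 120.0    # hypothetical score after Task 3

task1 = percent_reduction(start_score, initial_score)
task3 = percent_reduction(initial_score, final_score)
net = task1 + task3    # net metric: sum over the two route-laying tasks

print(f"Task 1: {task1:.1f}%  Task 3: {task3:.1f}%  Net: {net:.1f}%")
```

Under this assumption the summed value differs slightly from the single-step reduction from start to final score, because the two percentages use different baselines.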


Figure 6.  Net route-laying performance summed across initial (Task 1) and final route-laying (Task 3) for Experiments 1 and 2 for the eight display formats.


For all route-laying and altitude-estimating measures, 2 (depth relief format) × 2 (viewing angle) × 2 (terrain fidelity) repeated measures ANOVAs were conducted. As predicted for a shape-understanding task, net route exposure reduction was greater for shaded than topo depth relief (shaded: M = 36.4%, SD = 11.6; topo: M = 25.9%, SD = 13.8, F(1, 30) = 17.1, p < .001, η² = 0.364; see Fig. 6). Shading more readily conveys the shape of canyons and valleys in the scene, necessary for mentally computing lines of sight, to support laying a less exposed route. Such regions are harder to mentally extract, reconstruct, and exploit from topo contours. No significant effects involving viewing angle or fidelity were found for net routing performance.

Initial and final route-laying performance: Performance for the initial and final route-laying tasks was analyzed separately to examine how the different task demands for the two phases impacted performance across display formats. When analyzed separately, the shaded over topo advantage persisted for both initial and final route laying (shaded initial: M = 20.6%, SD = 7.2; topo initial: M = 14.6%, SD = 8.1, F(1, 30) = 10.1, p < .01, η² = 0.251; shaded final: M = 15.8%, SD = 9.6; topo final: M = 11.2%, SD = 10.2, F(1, 30) = 6.1, p < .05, η² = 0.170). The terrain fidelity manipulation had different effects on initial and final route-laying performance, interacting differently in each task with the other attributes of the display format; see Fig. 7. Participants appear to have subtly exploited different cues revealed by lower terrain fidelity according to the different time and task constraints of the two route-laying tasks.


Figure 7.  Effects of terrain fidelity on initial (left) and final (right) route-laying performance in Experiment 1.


In the speeded initial route-laying task, there was a marginal interaction between fidelity and depth relief, F(1, 30) = 3.7, p = .06, η² = 0.111; see Fig. 7, left. Post-hoc tests revealed a significant shaded versus topo advantage only for smoothed and not sharp terrain displays (shaded smoothed: M = 22.6%, SD = 9.0; topo smoothed: M = 14.0%, SD = 11.9, t(30) = 3.6, p < .01). When routing was time-pressured, smoothing the terrain may have accentuated the shading cue that supported the shape understanding necessary to quickly identify useful terrain features such as valleys and canyons to lay an initial coarse route. For example, smoothed shaded regions were better defined and more contiguous than smoothed topo regions. This coarse cue was useful when time pressure was high, and the requirements for accuracy were lower.

However, in the self-paced final route-laying task, terrain fidelity interacted with viewing angle, F(1, 30) = 5.0, p < .05, η² = 0.144, but not depth relief; see Fig. 7, right. Post-hoc analyses revealed that performance was significantly improved for 45° vs. 90° views when terrain was smoothed, but not when it was sharp (45° smoothed: M = 16.4%, SD = 11.9; 90° smoothed: M = 10.7%, SD = 12.2, t(30) = 2.5, p < .05). Though presenting the scene in 45° compared to 90° did not improve performance overall, it did improve performance when the terrain was smoothed. When routing was self-paced, smoothing the terrain may have enabled participants to exploit additional cues available in the 45° view to extract the shape understanding necessary for detailed final route laying. For example, the waypoint and crosshair slightly diverged in the 45° view as altitude increased because of the shallow viewing angle and may have been useful for finding depressions and canyon bottoms necessary for optimizing route exposure during the final route-laying task, when time pressure was low, and the requirement for accuracy was higher.

Altitude-estimating performance: Altitude-estimating performance was calculated using a root mean squared error metric, with

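$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{a}_i - a_i\right)^2}$$

where $\hat{a}_i$ is the estimated altitude of waypoint $i$, $a_i$ is its actual altitude, and $n$ is the number of waypoints estimated.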

Lower values represent less error in estimating the altitude profile of the route across all its waypoints and therefore better task performance.
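A root mean squared error over the waypoints of a route can be sketched as below. This uses the standard definition; the altitude units and any normalization applied in the original analysis are not stated in the text, and the example values are hypothetical.

```python
import math


def altitude_rmse(estimated, actual):
    """Root mean squared error between estimated and actual waypoint altitudes."""
    if len(estimated) != len(actual) or not estimated:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [(e - a) ** 2 for e, a in zip(estimated, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical altitudes for the four adjustable waypoints of an initial route.
actual_profile = [40.0, 55.0, 30.0, 62.0]
estimated_profile = [50.0, 50.0, 50.0, 50.0]  # the default mid-panel setting
print(altitude_rmse(estimated_profile, actual_profile))
```

Squaring before averaging penalizes a single badly misjudged waypoint more heavily than several small errors of the same total size.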

As predicted, for the precise relative position altitude tasks, participants estimated initial and final waypoint altitude more accurately with topo relative to shaded displays (topo initial: M = 20.5, SD = 4.2; shaded initial: M = 23.9, SD = 4.8, F(1, 30) = 10.7, p < .01, η² = 0.263; topo final: M = 12.8, SD = 3.8; shaded final: M = 20.3, SD = 5.4, F(1, 30) = 73.7, p < .001, η² = 0.711). Shading gradations support only coarse, relative altitude judgments needed for shape understanding, whereas topo lines support the precise, objective altitude judgments required of the relative position task. No effects involving fidelity or viewing angle were found.
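The effect sizes reported alongside the F tests can be related to the F values with a short calculation. The sketch assumes the reported values are partial eta squared, which the text does not state; the assumption is motivated by the close numerical agreement with the reported figures.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values and degrees of freedom as reported for the altitude-estimating tasks.
for label, f_value in (("initial", 10.7), ("final", 73.7)):
    print(label, round(partial_eta_squared(f_value, 1, 30), 3))
```

Small discrepancies in the third decimal place are expected, because the F values in the text are themselves rounded to one decimal.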

2.2.2. Spatial ability and intuitions

Mean MRT scores were 32.9 (SD = 18.0, range 8–72) across participants. Split by gender, mean MRT scores were 38.6 for males and 23.8 for females, replicating the classically observed gender difference in spatial ability (see Hegarty & Waller, 2005).

The Naïve Realism theory predicted that participants would intuit needing more display realism than necessary to perform the experiment tasks. Spatial ability was predicted to have a moderating impact on both prospective intuitions (judged before the task) and retrospective intuitions (judged after the task), with those of high spatial ability being less Naïvely Realistic. To test these predictions, it was necessary to operationalize both the realism of the eight different display formats, and Naïve Realism (desiring more realism than necessary for a task). Display realism was defined with realism scores, calculated as the mean ranking of how realistically each display format depicted terrain, described above in the 2.1.2 Method section (see Fig. 3).

The realism score of the display format that participants intuited would support their best performance operationalized the amount of display realism participants thought they needed for a task. For example, participants who predicted doing initial route laying best with the sharp 45° shaded display format thought they needed high display realism (realism score = 7.0) for the task. The actual realism needed to support their best task performance was the realism score of their best performing display for that task. Naïve Realism was operationalized as a significantly higher realism score for the display that participants intuited would support their best performance. Thus, if the same participant actually performed initial route laying best with the smoothed 45° shaded display with realism score = 4.7, then they were deemed Naïvely Realistic (because the 7.0 realism score for the display they predicted is greater than the 4.7 realism score for the display they actually did best with).
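The per-participant comparison described above can be sketched as a simple difference of realism scores. The six scores are those reported in the Method section; scores for the two smoothed topo formats are not given in this part of the text and are therefore omitted. The format labels and function name are hypothetical, and the group-level significance test is not shown.

```python
# Mean realism scores reported in the text (smoothed topo formats not reported here).
REALISM_SCORE = {
    "sharp 45 shaded": 7.0,
    "sharp 90 shaded": 6.5,
    "smoothed 45 shaded": 4.7,
    "smoothed 90 shaded": 3.5,
    "sharp 45 topo": 4.8,
    "sharp 90 topo": 3.3,
}


def intuition_error(intuited_format, best_format):
    """Realism intuited minus realism needed; positive values mean over-intuited realism."""
    return REALISM_SCORE[intuited_format] - REALISM_SCORE[best_format]


# Worked example from the text: predicted sharp 45° shaded, did best with smoothed 45° shaded.
error = intuition_error("sharp 45 shaded", "smoothed 45 shaded")
print(round(error, 1))
```

A positive difference for one participant is only suggestive; Naïve Realism as operationalized above requires the intuited realism to be significantly higher than the realism needed across the group.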

Participants were asked for their intuitions about which display format would support the best performance for each task both before (prospective intuitions) and after doing the task (retrospective intuitions). These two intuition judgments provide usefully different information: Prospective intuitions are based on assumptions and expectations about the task demands and relative utility of the different display formats, while retrospective intuitions reflect participants’ experiences with the task and the display set. In general, in the metacognition literature retrospective judgments have been found to be more accurate than prospective (Glenberg & Epstein, 1985; Maki, Foley, Kaher, Thompson, & Willert, 1990). Here, we separately assessed both types of judgments, to examine how spatial ability may moderate susceptibility to Naïve Realism before performing a task, and maintenance of Naïve Realism after performing a task.

Participants were split by median MRT score into two groups of low and high spatial ability. Prospective and retrospective intuition errors were separately calculated for the low and high spatial ability groups, for all experimental tasks. Fig. 8 shows the results of the novel methodology we have developed to test the Naïve Realism theory. To save space, and to facilitate interpretation of the results across experiments, we integrated the results of Experiment 2 into the figure. Fig. 8 has 16 panels, formed from the intersection of two participant groups (low vs. high spatial ability), two feedback conditions (Exp 1: implicit vs. Exp 2: explicit), and four terrain understanding tasks (initial and final route laying and altitude estimating).
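A median split of this kind can be sketched as follows. The participant identifiers and scores are hypothetical, and the handling of scores that fall exactly on the median (assigned here to the low group) is an assumption, since the text does not describe it.

```python
from statistics import median


def median_split(scores):
    """Split participants into low and high groups at the median score."""
    cut = median(scores.values())
    low = [pid for pid, score in scores.items() if score <= cut]
    high = [pid for pid, score in scores.items() if score > cut]
    return low, high


# Hypothetical MRT scores keyed by participant ID.
mrt_scores = {"p1": 8, "p2": 20, "p3": 35, "p4": 72}
low_group, high_group = median_split(mrt_scores)
print(low_group, high_group)
```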


Figure 8.  Effects of spatial ability in the calibration of intuitions about display realism and performance. Route laying task feedback was either implicit (No-Feedback, Exp 1) or explicit (Feedback, Exp 2). Realism score extends from 1 (low) to 8 (high), truncated here to save space. Actual display realism needed to perform the task is shown with a black triangle pointer, and prospective (PI) and retrospective intuitions (RI) about realism are shown as error distances from actual realism. Gray bars indicate significant intuition error; white bars indicate no intuition error. Calibration after performing the task manifests as prospective intuition error before the experiment (gray bar) and no retrospective intuition error after the experiment (white bar).


Each panel of Fig. 8 has the realism score along the x-axis, and plots the actual realism needed to support the best performance at that task, averaged across participants, as a bold black vertical line with triangle pointer beneath. The prospective intuitions and retrospective intuitions, averaged across participants, are also plotted along the realism scale, and horizontal bars show their distance away from the black-line actual realism needed. Bars are gray if realism intuited and actual realism required to perform the task with are significantly different; bars are white if realism intuited and realism needed are not significantly different. Thus, white bars indicate calibrated judgments, and gray bars indicate uncalibrated judgments which are invariably in the direction of Naïve Realism—intuiting more realism than necessary prospectively, or retrospectively, to perform the task.

Focusing on rows 1 and 2 in Fig. 8, four key patterns are evident from the analyses of intuitions in Experiment 1. First, Naïve Realism occurred across route-laying and altitude-estimating tasks, both for prospective and retrospective intuitions. Prospective and retrospective intuitions were significantly greater than actual realism needed (indicated by gray bars) in 7 of 16 intuition judgments (and prospective and retrospective intuitions were numerically greater than actual realism needed in 14 of 16 intuition judgments). Second, the actual realism needed to support the best performance across tasks (M = 4.3, SD = 1.4 for both Experiments 1 and 2) fell near the middle of the realism scale (4.5). It was not the case that the most realistic displays were optimal for task performance. Rather, displays of intermediate realism supported the best performance. Third, both those of high and low spatial ability were prone to Naïve Realism before performing the tasks, as evidenced by prospective intuition error for both groups. Fourth, and of most interest, when committing significant prospective intuition errors, those of higher spatial ability consistently calibrated their retrospective intuitions, while those of lower spatial ability did not. This pattern occurred primarily for the final route-laying and altitude-estimating tasks. However, it was these final tasks that participants were instructed to focus and spend their time on, perhaps explaining the failure to find support for the prediction in both initial speeded tasks where participants may not have been as mindful about their performance, and judgments about that performance.

To summarize the results for Experiment 1, our first prediction based on the shape understanding/relative position task dichotomy (St. John et al., 2001) was confirmed with different display attributes supporting the two terrain task types: Shading supported shape understanding needed for route laying, while topo supported precise relative position judgments needed for altitude estimating. With respect to the second prediction, we found that lowering terrain fidelity did improve shape understanding specifically for route laying, as predicted, but in subtly different ways for initial and final route laying. As predicted by the Naïve Realism theory, and confirming our third experimental prediction, participants intuited that more realistic displays would support better task performance across all terrain tasks. In line with our fourth prediction, after performing the tasks, only those of high spatial ability appropriately calibrated their retrospective intuitions about the displays. However, contrary to prediction, it was not the case that only those of low spatial ability were Naïvely Realistic before the experiment; those of high spatial ability were, too.

3. Experiment 2


Experiment 1 demonstrated pervasive Naïve Realism across both shape-understanding and relative position terrain-understanding tasks. However, only those of high spatial ability appropriately calibrated their retrospective intuitions about task performance. Perhaps only those of high spatial ability were sensitive to the implicit feedback they obtained when performing the experimental tasks with the different displays. If so, then making task feedback explicit and salient might moderate Naïve Realism for those of low spatial ability. Experiment 2 was designed to test this prediction by focusing on the effect of feedback on performance and the calibration of retrospective intuitions in the final routing task.

There are many ways to provide feedback, and the effects of feedback on task performance are far from straightforward. For example, feedback can even degrade task performance in certain situations (see Kluger & DeNisi, 1996). The approach we took to maximize the likely positive impact of feedback was to make it salient, understandable, immediate, and as task relevant as possible. This was accomplished through the addition of two aids for global and local feedback on route laying.

3.1. Method

3.1.1. Participants

Thirty college students or graduates (17 male, 13 female) with a mean age of 34.2 (range 19–58 years) were recruited from and were paid $30 for their participation. This group of new participants did not differ in mean age or spatial ability from the group in Experiment 1.

3.1.2. Design, stimuli, and procedure

The design, stimuli, and procedure from Experiment 1 were repeated for Experiment 2, with two additions to the routing tasks to give salient and immediate global and local performance feedback, respectively. First, a continuously available exposure score was shown next to the display, as shown in Fig. 5, along with a color-coded change score to provide a global index of the current route’s exposure compared to exposure at the beginning of the task. Both scores updated upon each waypoint movement. The exposure score of all starting routes was 174. If a participant adjusted a route to reduce its exposure, by 47 units, for example, the performance improvement was indicated by an update to the exposure score (127 = 174 − 47) and to the change score, labeled “better by 47” in green text. An exposure increase of 47 units resulted in a higher exposure score (221 = 174 + 47) and a change score labeled “worse by 47” in red text to indicate the performance decrement.
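The global feedback readout described above can be sketched as a small function. The starting score of 174, the wording of the labels, and the green and red colors come from the text; the function name, the return format, and the behavior when the score is unchanged are assumptions.

```python
START_EXPOSURE = 174  # exposure score of every starting route, as reported


def global_feedback(current_score, start_score=START_EXPOSURE):
    """Return (exposure score, change label, label color) for the current route."""
    change = start_score - current_score
    if change > 0:
        return current_score, f"better by {change}", "green"
    if change < 0:
        return current_score, f"worse by {-change}", "red"
    # Not described in the text: assumed neutral readout for an unchanged score.
    return current_score, "no change", "neutral"


print(global_feedback(127))  # route exposure reduced by 47 units
print(global_feedback(221))  # route exposure increased by 47 units
```

In the experiment both readouts updated after every waypoint movement, which would correspond to calling such a function each time the route's exposure score is recomputed.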

Second, the exposure envelope visualization, as shown in Fig. 5, was available to participants to superimpose on the terrain display, on demand. Pressing the spacebar showed the envelope, which provided feedback on the local exposure along the course of the route. For example, this enabled a participant to focus her efforts on a particularly visible subsection of the route where the envelope was fat. The envelope changed to reflect the route’s line of sight exposure as participants adjusted the waypoints. To ensure adequate visibility and contrast of the envelope against each display format, the exposure envelope lines were yellow on the achromatic shading (as in Fig. 5) and achromatic gray on the colored topo format.

Intuition questioning, time allocation, overall experiment duration, and instructions were the same as in Experiment 1, with the sole addition of instructions on how to interpret and use the two new feedback features.

The key predictions for Experiment 2 were that feedback would improve the route-laying performance pattern observed in Experiment 1 equally across formats, and that the explicit performance feedback in Experiment 2 would reduce the moderating effects of spatial ability on retrospective intuitions observed in Experiment 1.

3.2. Results

Performance, spatial ability, and intuition results are reported, including comparisons between the Experiments (no-feedback, Exp 1; feedback, Exp 2) where appropriate.

3.2.1. Performance

Net route-laying performance: For all performance analyses, mixed-design ANOVAs were conducted, with depth relief format, viewing angle, and terrain spatial fidelity again as within-subjects variables, and feedback (no-feedback, Exp 1; feedback, Exp 2) as a between-subjects variable. Overall, net route-laying performance was significantly improved by feedback (see Fig. 6 above; feedback: M = 47.9%, SD = 6.4; no-feedback: M = 31.1%, SD = 10.6), F(1, 59) = 55.6, p < .001, η2 = 0.485.

Feedback also interacted with depth relief format, F(1, 59) = 7.2, p < .01, η2 = 0.109. Specifically, the shaded over topo advantage was considerably lessened by feedback (see Fig. 6; feedback: F(1, 29) = 10.4, p < .01, η2 = 0.264; no-feedback: F(1, 30) = 17.1, p < .001, η2 = 0.364). The shading advantage over topo was 3.2% in Experiment 2 compared to 10.5% in Experiment 1, probably because the presence of feedback could largely compensate for the inferior shape understanding conveyed by topo lines. Across all display formats, the performance range for net route laying was reduced to 7.2% with feedback, from 15.6% with no feedback.

Initial and final route-laying performance: The feedback advantage was found for both initial and final route-laying performance (feedback initial: M = 29.4%, SD = 7.1; no-feedback initial: M = 17.6%, SD = 5.6, F(1, 59) = 82.7, p < .001, η2 = 0.584; feedback final: M = 18.4%, SD = 4.6; no-feedback final: M = 13.5%, SD = 8.5, F(1, 59) = 7.9, p < .01, η2 = 0.118).
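The effect sizes reported alongside each F test can be checked against the F value and its degrees of freedom. The sketch below assumes that the reported η2 values are partial eta squared, which the text does not state explicitly; the function name `partial_eta_squared` is hypothetical. Under that assumption, partial η2 equals F × df_effect divided by (F × df_effect + df_error).

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from an F statistic and its degrees of freedom.

    Assumes the effect size in question is partial eta squared:
    SS_effect / (SS_effect + SS_error) = F * df_effect / (F * df_effect + df_error).
    """
    scaled_f = f_value * df_effect
    return scaled_f / (scaled_f + df_error)


# Feedback x depth relief format interaction, F(1, 59) = 7.2
interaction = round(partial_eta_squared(7.2, 1, 59), 3)

# Feedback effect on initial route laying, F(1, 59) = 82.7
initial = round(partial_eta_squared(82.7, 1, 59), 3)
```

Here `interaction` comes out at 0.109 and `initial` at 0.584, which agree with the values reported for those two tests.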

However, none of the significant effects involving fidelity, format, or viewing angle found in Experiment 1 were found for initial or final route-laying performance when feedback was provided in Experiment 2. Feedback had the unanticipated effect of leveling away all the display format effects and interactions found in Experiment 1. One reason for the leveling effects of feedback across display format may be that it changed the way participants approached the task. Because feedback was constantly available, participants may have adopted a strategy of improving exposure to a similar criterion (score) across display formats. Future research could limit the availability of feedback or track the usage of the score separately from the envelope to study the pattern of feedback usage across display formats. In the current experiment, because both sources of feedback were constantly available, we were unable to distinguish the usage and utility of global feedback of the score from the local feedback afforded by the exposure envelope.

Though routing performance for shaded and topo displays was equalized by feedback for initial and final route laying, there was a cost of extra time required for route laying in topo displays (topo initial: M = 72.9 s, SD = 24.4; shaded initial: M = 62.0 s, SD = 21.7, F(1, 29) = 6.8, p < .05, η2 = 0.190; topo final: M = 219.2 s, SD = 49.3; shaded final: M = 208.2 s, SD = 45.0, F(1, 29) = 4.5, p < .05, η2 = 0.135). It is possible that participants could not readily identify promising terrain features with topo displays, as they could with shaded, and hence spent more time exploring unprofitable areas of the terrain in topo displays. Because participants received feedback, these unprofitable forays could be corrected, solely at a cost in time spent, not in goodness of the ultimate route produced.

Altitude estimating performance: As in Experiment 1, participants in Experiment 2 estimated initial and final route altitude significantly better with topo compared to shaded displays (initial topo: M = 19.7%, SD = 3.1; initial shaded: M = 24.5%, SD = 3.8, F(1, 29) = 29.6, p < .001, η2 = 0.505; final topo: M = 14.8%, SD = 4.3; final shaded: M = 20.5%, SD = 5.1, F(1, 29) = 48.4, p < .001, η2 = 0.625). Note that the route-laying task feedback interventions in Experiment 2 were of negligible utility to the altitude-estimating task, explaining why the same pattern of performance obtained as in Experiment 1.

To summarize to this point, as predicted, feedback improved performance in Experiment 2. However, it reduced or eliminated the display format effects observed in Experiment 1. Though feedback equalized performance for topo and shaded displays in initial and final route laying, it could not completely compensate for the mental challenge of reconstructing shape understanding from topo displays, as suggested by the extra time that was required to exploit the feedback when routing with topo displays.

3.2.2. Spatial ability and intuitions

Mean MRT scores were 30.6 (SD = 20.4, range 0–68), and again were higher for males (38.3) than for females (20.5).

The spatial ability and intuition results for Experiment 2 are shown in Fig. 8, above, alongside the results from Experiment 1. Three key points for Experiment 2 are evident in Fig. 8. First, Naïve Realism again occurred across route-laying and altitude-estimating tasks, both for prospective and retrospective intuitions. Prospective and retrospective intuitions again were significantly greater than actual realism needed (shown by gray bars) in 6 of 16 intuition judgments for Experiment 2, and they were numerically greater than actual realism needed in 14 of 16 intuition judgments for Experiment 2. Second, when committing significant prospective intuition errors, those of higher spatial ability again consistently calibrated their retrospective intuitions in Experiment 2. Third, of most importance, those of lower spatial ability, when committing significant prospective intuition errors, also calibrated their retrospective intuitions (for the route-laying task on which feedback was provided).

As in Experiment 1, participants of higher spatial ability in Experiment 2 appropriately calibrated their retrospective intuitions. Comparing across the two experiments, explicit and salient task feedback was needed to calibrate the retrospective intuitions of those of low spatial ability for final route laying.

4. Discussion

In the Science paper that gave birth to 3D terrain maps over 40 years ago, Jenks and Brown (1966) noted, presciently, that “the cartographer…must choose between realism and practicality” (p. 857). Sound advice, indeed. But what Jenks and Brown may not have realized was that the technology they were pioneering would one day see wide use by users unaware of any such trade-off. Only a year before their Science paper, Sutherland (1965) had defined a vision for virtual reality technology that relegated displays to seamless lenses on the world. The burden of practicality in Sutherland’s vision was all on technologists to create the seamless displays. It was taken as a given that the implied realism was practical as far as the user was concerned. And so the “march of progress” that we caricatured in Fig. 1 towards an implicit gold standard of photorealism began, in full spatial and depth cue fidelity. The Naïve Realism theory explains this march and it throws into question the meta-representational competence of users at a time when they are increasingly given control over their displays and able to populate them with realistic content that is particularly attractive, yet particularly problematic, from a perceptual standpoint.

In two fairly elaborate experiments in which we related psychometrics to intuition and performance measures, and designed and validated new terrain simplification and visual feedback aids, we validated several concepts and gained new insights into others. We observed the Naïve Realism–predicted decoupling of intuition and performance with high-fidelity, realistic terrain views. Participants intuited needing more display realism than required, both for the unfamiliar and fairly complex task of laying a concealed route and for the simple task of judging altitude. Increasing terrain fidelity did not improve route-laying performance. In Experiment 1, the fine detail of high-fidelity views masked and cluttered detection of gross features needed for route concealment. One does not need to see the boulders along the jagged edges of a canyon to detect the canyon itself. Similarly, neither realistic shading nor high terrain fidelity was needed for estimating altitude. Rather, altitude could be simply read off from unrealistic topo contours.

The pattern of intuition for realism, and superior performance for simplification, mirrors that previously observed for realistic icons and caricatures of Navy vessels and aircraft (Smallman, St. John, Oonk, & Cowen, 2000; Smallman et al., 2001) and for forecasting with meteorological displays (Hegarty et al., 2009). An older literature on the factors determining general aesthetic appeal found complexity to be a strong driver of preference (Berlyne, 1970). For example, Jacobsen and Höfel (2002) showed that the attribute of complexity was the second highest rated aesthetic attribute, after symmetry, for a set of novel icons. We are not the first to caution against complexity in display design, although we may be the first to do so from the perspective of perceptual science. Other practitioners and researchers have previously emphasized the perils of complexity in design. In addition to Jenks and Brown’s implicit warning, mentioned above, Bertin (1983) stated that “simplification is an obligation of the communication process” (p. 166). Lowe (1994) warned against “unnecessary detail that might reduce the clarity of the visual argument” (p. 468). And Tufte (1983) stressed maximizing the “ratio of data to ink” as a guiding principle in design to avoid what he termed superfluous “chart junk” whenever possible.

In the experiments, we began to investigate what characteristics may predispose individuals to Naïve Realism, and how it can be maintained. By collecting intuitions about task performance and relating those to spatial ability, in Experiment 1 we found that the implicit feedback of performing a task was sufficient to calibrate the intuitions of those of high spatial ability. In contrast, those of low spatial ability required the salient, immediate, and conspicuous task feedback made available in Experiment 2, to calibrate their intuitions. These results are a touch point to the vast treatment-aptitude literature, where spatial ability has been implicated as a mediator of sensitivity to feedback (Kyllonen et al., 1984). The results also mirror recent initial observations of the naturalistic preference and use of graphical weather maps by Navy weather forecasters, where those of higher spatial ability configured their displays to just depict task-relevant information (Smallman & Hegarty, 2007).

Despite a growing interest in relating performance to metacognitive judgments (e.g., Hegarty et al., 2009; Tractinsky, Katz, & Ikar, 2000), the general methodological issues of how best to measure and relate the two have not been settled. Here, we have advanced the methodological state of the art, but it could still be improved. First, retrospective judgments could be probed closer in time to task performance. We imposed a burden on the memory of participants by asking for their retrospections after eight separate blocks with different displays. The absence of a clean high/low spatial ability calibration pattern in the first two speeded phases may have been due to inattention and an inability to encode accurate metacognitions about task performance under those conditions (Fig. 8). Second, the prospective and retrospective judgments could be made as a series of local confidence judgments about each display, instead of the global judgment about the intuited single best display employed here. For example, the feedback available in Experiment 2 leveled away many of the performance differences by display format. This made global retrospective intuition judgments more difficult, as they were based on a restricted performance range and required finer discriminations between performance across display conditions. The ideal combination might be a series of local confidence judgments for reduced memory burden and increased sensitivity, supplemented by global judgments for context.

Those of high spatial ability were more sensitive to their display-related performance, although they too expressed Naively Realistic intuitions before performing the tasks. Perhaps there is a universal draw to what is apparently easy. Could preference for realistic 3D perspective views be another instance of the “paradox of the active user” (Carroll & Rosson, 1987)? The paradox is the classically observed stable suboptimal behavior that users are willing to engage in, for example, with inefficient text editing strategies, because it is apparently too hard to learn a more optimal pattern of behavior. This opens the fascinating question of the relationship of melioration to Naïve Realism, and vice versa. Users may be unwilling to learn either new symbols or apparently counterintuitive, less realistic display formats, judging that the reward in terms of enhanced performance does not justify the effort and labor of learning them (see Neth, Sims, & Gray, 2006). It would be interesting to see whether melioration patterns show the same breakdown by spatial ability observed in our studies.

As for the applied implications of the work, we developed and validated new display concepts with potential application to operational tasks. Past research has only used terrain simplification as an experimental control condition. Eley (1991) smoothed terrain views to make control displays for testing whether memory encodes only the gist of terrain. Here, we propose applying terrain simplification to actual task display designs. Our spatial filtering technique offers a general approach to simplifying terrain views for a variety of task needs and environments. Finally, these experiments and the related recent work of Hegarty et al. (2009) highlight the pitfalls of making display format user-configurable. It has been known for a while that users may not know what is best for them (Andre & Wickens, 1995). Naïve Realism now explains why. Care must be taken not just to equip users with tools, such as the new envelope visualization and terrain simplification developed here, that provide them backup for, and feedback on, their display-related behavior, but also to give them guidance on the appropriate use of those tools.

Overall, we hope the discussion here of the implicit trends in display design, and the misconceptions about realism and its impact on perception, opens a new dialogue between basic and applied science. Applied display design stands to benefit from more involvement and integration of perceptual science.

Note 1. We thank Prof. Sara Fabrikant for bringing Imhof’s maps and lighting technique to our attention.


Acknowledgments

This work was sponsored by the Office of Naval Research (ONR), program officers Drs. Astrid Schmidt-Nielsen and Paul Bello, through a contract from Space and Naval Warfare Systems Center Pacific, managing scientist Dr. Michael B. Cowen. The work was also supported directly by ONR, program officer Dr. Gerald Malecki. Thanks to Daniel Manes and Frank Lacson of PSE for technical assistance, Geoffrey Williams for graphic artistry, and Chiesha Stevens, Caitlin Couey, and Kathryn Imler for research assistance on the project. Thanks also to Dr. Mark St. John for helpful comments on a previous version of the manuscript. Portions of this work were reported at the 2007 and 2008 annual meetings of the Human Factors and Ergonomics Society. Any opinions, findings, conclusions, or recommendations expressed herein are those of the authors and do not necessarily reflect the views of the Department of Defense.


References

  • Andre, A. D., & Wickens, C. D. (1995). When users want what’s NOT best for them. Ergonomics in Design, 3, 10–14.
  • Bennett, K. B., Nagy, A. L., & Flach, J. M. (1997). Visual displays. In G. Salvendy (Ed.), Handbook of human factors and ergonomics (2nd ed.) (pp. 659–696). New York: John Wiley & Sons.
  • Berlyne, D. E. (1970). Novelty, complexity and hedonic value. Perception & Psychophysics, 8, 279–286.
  • Bertin, J. (1983). Semiology of graphics: Diagrams, networks, maps (translated by W. J. Berg). Madison, WI: University of Wisconsin Press.
  • Bowman, D. A., Kruijff, E., LaViola, J. J. Jr, & Poupyrev, I. (2005). 3D user interfaces: Theory and practice. Boston: Addison-Wesley.
  • Carroll, J. M., & Rosson, M. B. (1987). Paradox of the active user. In J. M. Carroll (Ed.), Interfacing thought: Cognitive aspects of human-computer interaction (pp. 80–111). Cambridge, MA: MIT Press.
  • Carter, W. E., Shrestha, R. L., & Slatton, K. C. (2007). Geodetic laser scanning. Physics Today, 60, 41–47.
  • Cavanagh, P. (2005). The artist as neuroscientist. Nature, 434, 301–307.
  • Cruz, I. F. (1996). Tailorable information visualization. ACM Computing Survey, 28 (4es), December 1996.
  • Cutting, J. E., & Vishton, P. M. (1995). Perceiving layout and knowing distances: The integration, relative potency, and contextual use of different information about depth. In W. Epstein & S. Rogers (Eds.), Handbook of perception and cognition, Vol. 5 (pp. 69–117). San Diego, CA: Academic Press.
  • Daukantas, P. (2009). Photorealistic rendering: Making the virtual into reality. Optics & Photonics News, 20, 34–39.
  • Dennehy, M. T., Nesbitt, D. W., & Sumey, R. A. (1994). Real-time three-dimensional graphics display for antiair warfare command and control. Johns Hopkins APL Technical Digest, 15, 110–119.
  • Eley, M. G. (1991). Selective encoding in the interpretation of topographic maps. Applied Cognitive Psychology, 5, 403–422.
  • Ferwerda, J. A. (2003). Three varieties of realism in computer graphics. In B. E. Rogowitz & T. N. Pappas (Eds.), Proceedings SPIE human vision and electronic imaging VIII, Vol. 5007 (pp. 290–297). Santa Clara, CA: SPIE press.
  • Frisby, J. P. (1980). Seeing. Oxford, England: Oxford University Press.
  • Glenberg, A. M., & Epstein, W. (1985). Calibration of comprehension. Journal of Experimental Psychology: Learning, Memory, and Cognition, 11, 702–718.
  • Hegarty, M., & Kozhevnikov, M. (1999). Types of visual-spatial representations and mathematical problem solving. Journal of Educational Psychology, 91, 684–689.
  • Hegarty, M., Smallman, H. S., Stull, A. T., & Canham, M. (2009). Naïve cartography: How intuitions about display configuration can hurt performance. Cartographica, 44, 171–186.
  • Hegarty, M., & Waller, D. (2005). Individual differences in spatial abilities. In P. Shah & A. Miyake (Eds.), Handbook of visuospatial thinking (pp. 121–169). New York: Cambridge University Press.
  • Imhof, E. (1982). Cartographic relief presentation. Berlin and New York: Walter de Gruyter.
  • Jacobsen, T., & Höfel, L. (2002). Aesthetic judgments of novel graphic patterns: Analysis of individual judgments. Perceptual and Motor Skills, 95, 755–766.
  • Jenks, G. F., & Brown, D. A. (1966). Three dimensional map construction. Science, 154, 857–864.
  • Kluger, A. N., & DeNisi, A. (1996). Effects of feedback intervention on performance: A historical review, a meta-analysis, and a preliminary feedback intervention theory. Psychological Bulletin, 119, 254–284.
  • Kyllonen, P. C., Lohman, D. F., & Snow, R. E. (1984). Effects of aptitudes, strategy training, and task facets on spatial task performance. Journal of Educational Psychology, 76, 130–145.
  • Loomis, J. M. (1992). Distal attribution and presence. Presence, 1, 113–119.
  • Loomis, J. M., & Knapp, J. M. (2003). Visual perception of egocentric distance in real and virtual environments. In L. J. Hettinger & M. W. Hass (Eds.), Virtual and adaptive environments (pp. 21–46). Hillsdale, NJ: Lawrence Erlbaum.
  • Lowe, R. K. (1994). Selectivity in diagrams: Reading between the lines. Education Psychology, 14, 467–491.
  • MacEachren, A. M. (1995). How maps work: Representation, visualization, and design. London: Guilford Press.
  • MacLeod, D. I. A., & Willen, J. D. (1995). Is there a visual space? In R. D. Luce, M. D. D’Zmura, D. D. Hoffman, G. J. Iverson, & A. K. Romney (Eds.), Geometric representations of perceptual phenomena: Papers in honor of Tarow Indow on his 70th (pp. 47–60). Mahwah, NJ: Lawrence Erlbaum Associates.
  • Maki, R. H., Foley, J. M., Kaher, W. K., Thompson, R. C., & Willert, M. G. (1990). Increased processing enhances calibration of comprehension. Journal of Experimental Psychology: Learning, Memory, and Cognition, 16, 609–616.
  • McKee, S. P., Levi, D. M., & Bowne, S. F. (1990). The imprecision of stereopsis. Vision Research, 30, 1763–1779.
  • McLeary, G. F., Jenks, G. F., & Ellis, S. R. (1991). Cartography and map displays. In S. R. Ellis, M. Kaiser, & A. J. Grunwald (Eds.), Pictorial communication in virtual and real environments (pp. 76–96). London: Taylor & Francis.
  • Milgram, P., & Kishono, F. (1994). A taxonomy of mixed reality virtual displays. IECE Transactions on Information Systems, E77-D(12), 1321–1329.
  • Neth, H., Sims, C., & Gray, W. (2006). Melioration dominates maximization: Stable suboptimal performance despite global feedback. In R. Sun & N. Miyake (Eds.), Proceedings of the 28th annual meeting of the cognitive science society (pp. 627–632). Hillsdale, NJ: Lawrence Erlbaum Associates.
  • Pick, H. L., Heinrichs, M. R., Montello, D. R., Smith, K., Sullivan, C. N., & Thompson, W. B. (1995). Topographic map reading. In P. A. Hancock, J. M. Flach, J. Caird, & K. J. Vicente (Eds.), Local applications of the ecological approach to human-machine systems, Vol. 2 (pp. 255–284). Hillsdale, NJ: Lawrence Erlbaum.
  • Pylyshyn, Z. W. (2003). Seeing and visualizing: It’s not what you think. Cambridge, MA: MIT Press Bradford Books.
  • Sanders, M. S., & McCormick, E. J. (1993). Human factors in engineering and design (7th ed.). New York: McGraw-Hill, Inc.
  • Scholl, M. J., & Egeth, H. E. (1982). Cognitive correlates of map reading ability. Intelligence, 6, 215–230.
  • Sedgwick, H. A. (1986). Space perception. In K. R. Boff, L. Kaufman, & J. P. Thomas (Eds.), Handbook of Perception and human performance, Vol. 1 (pp. 2101–2157). New York: Wiley.
  • diSessa, A. A. (2004). Metarepresentation: Native competence and targets for instruction. Cognition and Instruction, 22, 293–331.
  • Simons, D. J., & Rensink, R. A. (2005). Change blindness: Past, present and future. Trends in Cognitive Science, 9, 16–20.
  • Smallman, H. S., & Hegarty, M. (2007). Expertise, spatial ability and intuition in the use of complex visual displays. In Proceedings of the 51st annual meeting of the human factors and ergonomics society (pp. 2000–2004). Santa Monica, CA: Human Factors and Ergonomics Society.
  • Smallman, H. S., & St. John, M. (2005). Naïve Realism: Misplaced faith in the utility of realistic displays. Ergonomics in Design, 13, 6–13.
  • Smallman, H. S., St. John, M., & Cowen, M. B. (2002). Use and misuse of linear perspective in the perceptual reconstruction of 3-D perspective view displays. In Proceedings of the 46th annual meeting of the human factors and ergonomics society (pp. 1560–1564). Santa Monica, CA: Human Factors and Ergonomics Society.
  • Smallman, H. S., St. John, M., Oonk, H. M., & Cowen, M. B. (2000). When beauty is only skin deep: 3-D realistic icons are harder to identify than conventional 2-D military symbols. In Proceedings of the 44th annual meeting of the human factors and ergonomics society (pp. 480–483). Santa Monica, CA: Human Factors and Ergonomics Society.
  • Smallman, H. S., St. John, M., Oonk, H. M., & Cowen, M. B. (2001). ‘SYMBICONS’: A hybrid symbology that combines the best elements of SYMBols and ICONS. In Proceedings of the 45th annual meeting of the human factors and ergonomics society (pp. 110–114). Santa Monica, CA: Human Factors and Ergonomics Society.
  • St. John, M., Cowen, M. B., Smallman, H. S., & Oonk, H. M. (2001). The use of 2D and 3D displays for shape understanding vs. relative position tasks. Human Factors, 43, 79–98.
  • Steiner, B. A., & Dotson, D. A. (1990). The use of 3-D stereo display of tactical information. In Proceedings of the 34th annual meeting of the human factors society (pp. 36–40). Santa Monica, CA: Human Factors and Ergonomics Society.
  • Sutherland, I. F. (1965). The ultimate display. In Proceedings of the international federation of information processing (IFIP) congress (pp. 506–508). New York: IFIP.
  • Tractinsky, N., Katz, A. S., & Ikar, D. (2000). What is beautiful is usable. Interacting with Computers, 13, 127–145.
  • Tufte, E. R. (1983). The visual display of quantitative information. Cheshire, CT: Graphics Press.
  • Tversky, B., Morrison, J. B., & Betrancourt, M. (2002). Animation: Can it facilitate? International Journal of Human-Computer Studies, 57, 247–262.
  • Vandenberg, S. G., & Kuse, A. R. (1978). Mental rotations, a group test of three-dimensional spatial visualization. Perceptual and Motor Skills, 47, 599–604.
  • Varakin, D. A., Levin, D. T., & Fidler, R. (2004). Unseen and unaware: Implications of recent research on failures of visual awareness for human-computer interface design. Human-Computer Interaction, 19, 389–422.
  • Winer, G. A., & Cottrell, J. E. (2004). The odd belief that rays exit the eye during vision. In D. T. Levin (Ed.), Thinking and seeing: Visual metacognition in adults and children (pp. 97–119). Cambridge, MA: MIT Press.
  • Woods, D. D., & Roth, E. M. (1988). Cognitive systems engineering. In M. Helander (Ed.), Handbook of human-computer interaction (pp. 1–41). Amsterdam: Elsevier Science Publishers B. V. (North-Holland).