Marden's (2013) reanalysis of Knecht et al. (2011), which suggests that specimen SEMC-F97 is the result of the skimming behavior of a neopteran insect and, more importantly, fossil evidence of “… surface skimming as a precursor to the evolution of flight in insects” (Marden 2013), is found to be deficient on three fronts: (1) the principal specimen was never viewed firsthand, which led to significant morphological misinterpretations; (2) poorly designed and executed neoichnological experiments led to implausible results; and (3) the specimen is assumed to be fossil evidence supporting the surface-skimming hypothesis of the origin of insect flight, despite the fact that, since its introduction into the literature, that hypothesis has been refuted on the basis of significant paleontological, phylogenetic, genetic, and developmental evidence.

The progress of science is contingent upon healthy debate, ideally with an objective view of the data, resulting in a better approximation of the truth. We consider disagreement with the conclusions presented in our own article (Knecht et al. 2011) a welcome and normal part of discovery, research, and discussion. Here we question the overall methodology by which Marden (2013) drew his conclusions, and ultimately their validity. Marden's (2013) portrayal of the data gleaned from the trace fossil specimen from the Pennsylvanian (Westphalian) Wamsutta Formation of southeast Massachusetts (Knecht et al. 2011; University of Kansas, Entomological Collections SEMC-F97) effectively results in misinformation regarding the fossil itself, and subsequently its interpretation.

This comment addresses the technical issues in Marden's (2013) comment that directly affect the integrity of the fossil data, as well as the issues surrounding the surface-skimming hypothesis of the origin of insect flight (Marden and Kramer 1994) and the wealth of paleontological, phylogenetic, and developmental evidence against it. Specifically, these issues include the following: (1) errors involving specimen elements, (2) misinformation provided in figures, (3) misunderstanding of the nature of trace fossils, (4) misunderstanding of the experimental reproduction of trace fossils, and (5) scientific arguments regarding the evolution of flight and phylogenetic reconstructions of stem Pterygota.

Trace fossils, as fossil evidence of behavior, preserve the direct imprint or action of some portion of an animal's body in a medium, such as sediment, rock, or other substrate. Trace fossils are an indication of what the animal was doing at the time the imprint (trackway, burrow, bite-mark, etc.) was produced. In most trace fossils, the maker is unknown, and interpretation requires a thoughtful and careful analysis at the intersection of several fields (i.e., ethology, biology, taphonomy, sedimentology). Depending on the granularity of the evidence, varying levels of interpretive detail are possible. To interpret a producer as an insect, for example, imprints and trackways exhibiting the presence of three leg pairs are sufficient. To interpret a producer as a dasyleptid, a greater number of characters are required, such as number of abdominal segments, presence of abdominal styli imprints, location of mandibular palp impressions, and so on. Essentially, when it comes to trace fossil interpretation, details matter.

Marden (2013) performed a reinterpretation of photographs rather than a reanalysis of the original specimen. Reanalysis involves a level of critical analysis of an entire dataset (in this case, including the actual specimen) that was not evident in the comment on our prior publication (Knecht et al. 2011). All of the actual impressions visible on the part and counterpart specimens should have been considered in context. By relying solely on photographs, Marden (2013) failed to grasp the detail visible in the specimen and the spatial relationships of the elements. Using photographs for ground-up reanalysis is common practice neither in ichnology nor in paleontology in general, and is fraught with pitfalls including, but not limited to, misperception of relief (negative or positive), incomplete knowledge of the characters outside the frame of the photograph, reduction in detail, and a poor understanding of the overall orientation and setting of the specimen. Reinterpretations are more common from photographs, but typically only when fine detail is lacking in the specimen itself, which is not the case in SEMC-F97. Semantics aside, to refute our findings without examining the actual fossil is to risk telescoping error from one misinterpretation to another. The following points are the major shortcomings of Marden's (2013) method of analysis.

Marden Makes the Interpretation that Marks Leading from the Femora are Due to the Impression of Laterally Folded Wing Pairs Impacting the Substrate

To support this assertion, Marden (2013) identified paired, parallel linear impressions made by the folded wings of modern stoneflies during neoichnological experimentation (see section Marden Uses Live Stoneflies in an Experiment to Attempt to Prove that Stoneflies Can Produce Traces Like the Fossil in Question) in approximately the same location (relative to the body) as thin linear grooves visible in the photo of the original fossil specimen. He further commented that paired impressions were visible behind the mesothoracic legs in the experimentally produced traces, but did not offer photographic evidence. In the published figures (Fig. 1 in Marden 2013), the neoichnological stonefly evidence is shown (Fig. 1C, 1D in Marden 2013) adjacent to the specimen photo (Fig. 1B, 1E in Marden 2013). This assertion is critical to Marden's thesis because it would indicate that the trace fossil represents a winged insect that, when at rest, held its paired wings folded, draped along its back and sides, not aloft as in mayflies.

Figure 1.

Detail views of linear grooves made by flexible protrusions from the femora of the tracemaker in specimen SEMC-F97. (A) Area between mesothoracic (MsF) and metathoracic (MtF) femoral imprints with arrows highlighting three distinct curved incisions. (B) Area behind the metathoracic femoral imprint (MtF) and adjacent to abdominal imprint (Abd) with arrows highlighting at least three grooves with arcuate paths. Note change in distance between grooves from posterior to anterior. Both images show the positive relief from counterpart specimen. Scale bars in (A) and (B) each represent 2 mm.

A critical analysis of the original specimen would have revealed that the best-preserved sets of these striae (posterior of one mesothoracic femur and posterior of the same side's metathoracic femur) are not paired, but appear as sets of three, and that they are not parallel. In fact, each of these striae has an independent curvature (“sub-parallel” in Knecht et al. 2011) that would indicate three individual, flexible protrusions that produced them. These features are minute and difficult to see, and for this reason we published an inset microphotograph (Fig. 2A in Knecht et al. 2011) to show those preserved behind the mesothoracic femur. Here we offer a republication of that inset photo (Fig. 1A), and another closeup (Fig. 1B) of the striae behind the metathoracic femur that was enlarged directly from the published photo of Knecht et al. (2011), showing the likely presence of three striae and their arcuate path. It is unclear why this evidence of at least three grooves instead of two, and of their curving, not parallel, path, was not presented in Marden's (2013) article.

Figure 2.

Detail photo of impressions lateral to the body axis purported by Marden (2013) to be made by wings but were actually made by legs. Black arrows indicate locations of clearly discernable tarsal impressions. Note width of repeated linear imprints is similar to the tibia imprint width in main impression. Image is negative relief from the part specimen lit at very low angle to show shallow impressions. Scale bar is 10 mm.

Marden Makes the Interpretation that the Repeated Impressions Distal to the Main Body Impression are the Result of Wing Impact

Marden (2013) presented multiple misleading lines of data to assert this point, which is significant because his thesis requires a flying insect (with flapping wings) that could make such marks at those locations on the specimen. First, Marden neglected to ascertain that the “faint marks” in the area distal to the body axis are actually leg impressions, as shown in our original article. As Knecht et al. (2011) point out, these impressions are elongate, and those that are deeply impressed have the same width as the leg impressions proximal to the body. In addition, as drawn in Knecht et al. (2011), some of them retain the bulging impression of the tarsus at their most distal points. One photograph is included with this comment that details these impressions (Fig. 2). In the comment by Marden, our original drawing was reproduced in such a small frame (Fig. 5 in Marden 2013) that it suffered from reduced detail and the reader could not clearly see the impressions as originally drawn. Based on this evidence in the fossil specimen, Knecht et al. (2011) interpreted these marks as repeated impressions of the tibia and tarsi of the insect as it palpated the substrate.

Figure 3.

Reproduction of Marden's (2013) Figure 5 illustrating failings of the overlay analysis. In the upper illustration, dashed straight lines highlight convergence of the posterior linear imprints. These indicate rotation about a fixed point in the location indicated by the arrow, not on the body axis. A circle in the lower diagram highlights an area where the linear imprints have been erased. In both diagrams, lines have been drafted to show a discrepancy in rotation between the body axis in the fossil (above) and the body axis of the plecopteran as placed on the drawing. The total difference is about 9°. A discrepancy in scale is shown by the scale line drawn from body axis to the same location in the drawing in upper and lower diagrams. The lower diagram shows the body axis of the plecopteran is shifted about −15% laterally. Scale bar represents 20 mm in original sketch.

To support his own assertion that these could be marks made by paired, flapping wings as they touched the substrate, Marden (2013) overlaid a photograph of a stonefly captured as its wings reached maximum downbeat onto the drawing of Knecht et al. (2011). Three errors were made in the production and interpretation of this overlay. First, the axis of the stonefly's body in the overlay has been rotated about 9° counterclockwise and shifted about 3 mm to the right (as measured on the publication's page) from the orientation and location of the body axis in the original specimen (see Fig. 3). The amount of rotation and translation applied to the overlay makes Marden's (2013) interpretation of these marks as wing impressions an inaccurate comparison with the fossil data. Second, if straight lines were drawn through the median of each of the distal posterior limb impressions in the actual specimen, their convergence (they are radially arranged) would indicate a pivot point of the producing structure somewhere distal to the body axis, adjacent to the midsection of the abdomen, not on the thorax where wings would articulate (see Fig. 3). Third, in the same overlay figure, Marden eliminated all impressions made by other limbs in the original specimen to add emphasis to the two sets of impressions distal to the body axis (see Fig. 3) that seemed to fit the stonefly model. Although that is convenient to his interpretation, it is another inaccurate representation of the fossil data when compared to the stonefly overlay.
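The two geometric checks described above (the rotation between body axes and the convergence point of the radially arranged imprints) can be expressed numerically. The following Python sketch is illustrative only and was not part of our analysis: the function names are hypothetical, the coordinates in the usage example are invented rather than measured from SEMC-F97, and a least-squares intersection is assumed here as one reasonable way to estimate a pivot from lines that do not meet at exactly one point.

```python
import math


def axis_angle_deg(p, q):
    """Orientation, in degrees, of the line running from point p to point q."""
    return math.degrees(math.atan2(q[1] - p[1], q[0] - p[0]))


def rotation_between(axis_a, axis_b):
    """Signed rotation in degrees from axis_a to axis_b (counterclockwise positive).

    Each axis is a pair of (x, y) points; the result is wrapped to [-180, 180).
    """
    d = axis_angle_deg(*axis_b) - axis_angle_deg(*axis_a)
    return (d + 180.0) % 360.0 - 180.0


def least_squares_pivot(lines):
    """Point minimizing the summed squared perpendicular distance to all lines.

    Each line is a pair of (x, y) points lying on it. For a unit direction d,
    the projector (I - d d^T) measures perpendicular offset, so the normal
    equations are  sum(I - d d^T) x = sum((I - d d^T) p).
    """
    a11 = a12 = a22 = b1 = b2 = 0.0
    for (x0, y0), (x1, y1) in lines:
        dx, dy = x1 - x0, y1 - y0
        norm = math.hypot(dx, dy)
        if norm == 0.0:
            raise ValueError("a line needs two distinct points")
        dx, dy = dx / norm, dy / norm
        m11, m12, m22 = 1.0 - dx * dx, -dx * dy, 1.0 - dy * dy
        a11 += m11
        a12 += m12
        a22 += m22
        b1 += m11 * x0 + m12 * y0
        b2 += m12 * x0 + m22 * y0
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("lines are parallel; no unique pivot")
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)


if __name__ == "__main__":
    # Hypothetical coordinates (mm) for illustration only.
    fossil_axis = ((0.0, 0.0), (0.0, 10.0))
    overlay_axis = ((3.0, 0.0), (3.0 - 1.0, 0.0 + 6.3))
    print("rotation (deg):", round(rotation_between(fossil_axis, overlay_axis), 1))
    imprint_lines = [((3.0, -2.0), (5.0, 0.0)), ((3.0, -2.0), (3.0, 4.0))]
    print("estimated pivot:", least_squares_pivot(imprint_lines))
```

With real measurements, the estimated pivot could then be compared with the position of the body axis and of the thorax to test whether the imprints radiate from a point where wings would articulate.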

Marden Attempts to Show a Close Correspondence between the Ventral Anatomy of the Producer in the Fossil Specimen with the Arrangement and Shape of Sternal Plates in the Stonefly

In another example of the misuse of photo overlay, Marden (2013) reproduced the ventral anatomy of the stonefly as colored shapes and placed similar colored shapes on top of the original specimen photo in an attempt to show similarity of the specimen to stonefly ventral anatomy, a point critical to his thesis. As a standalone figure, it misrepresents and obscures what is actually visible in the fossil specimen. Included with this comment is a high-resolution closeup of the sternal area in question (Fig. 4). In this area, we deemed it impossible to discern individual sternal elements, and that is why they were not described in detail or outlined in figures within our original publication (Knecht et al. 2011). We do not believe he has reliably identified these elements in the specimen photograph; in fact, the shapes drawn in Marden (2013) have little to no correspondence with the actual specimen.

Figure 4.

Photo enlargement of the sternal impression. Anterior direction as indicated on photograph. Note unclear boundaries between sterna. Scale bar is 2 mm.

Marden Argues that the Ratio of Thorax to Abdomen Width in the Specimen is too Small to Indicate a Flying Insect such as a Mayfly, Which Typically Has an Enlarged Thoracic Cavity

Marden (2013) argues that because crown group mayflies and other modern flying insects have enlarged thoracic cavities to allow for flight muscle mass, if the insect that made the specimen were capable of flight, as opposed to skimming, the ratio of thorax to abdomen width should be above about 1.25. This point is critical to Marden's (2013) thesis because the hypothesis regarding a surface-skimming origin of flight requires primitive winged insects that exhibit the skimming behavior. We do not argue the point that flying insects have enlarged thoracic cavities, nor do we argue that thorax:abdomen width metrics might be useful when complete insect bodies are available for study. However, a significant assumption was made in the decision to use such a metric and it indicates a misunderstanding of the nature of trace fossils and a misrepresentation of thoracic morphology.

First, Marden (2013) assumed that the entire width of the thorax is represented in the trace fossil. We doubt this is the case. Not only is it clear, based on inflection of segments, that the abdomen was pressed deeply into the sedimentary medium, but it is also clear that the anterior of the insect was not as deeply impressed; the legs held this portion of the body higher in the sediment. Because this is the case, only the sternal area of the thorax is impressed, and not likely to its full width. Therefore, a direct measurement of this area of the trace fossil would not represent the full width of the thorax.
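The effect of partial impression on the ratio can be illustrated with simple arithmetic. The Python sketch below is a hypothetical illustration, not a measurement: the function name, the widths, and the impressed fraction are all assumed values chosen only to show that a ratio measured from an incompletely impressed thorax is a lower bound on the true ratio.

```python
def measured_ratio(true_thorax_width, abdomen_width, impressed_fraction):
    """Thorax:abdomen width ratio as it would be read from an impression.

    impressed_fraction is the (assumed) share of the true thorax width that
    actually contacted the sediment; the abdomen is taken as fully impressed.
    """
    if true_thorax_width <= 0 or abdomen_width <= 0:
        raise ValueError("widths must be positive")
    if not 0 < impressed_fraction <= 1:
        raise ValueError("impressed_fraction must be in (0, 1]")
    return impressed_fraction * true_thorax_width / abdomen_width


if __name__ == "__main__":
    # Invented values: a true ratio of 1.4 with only 80% of the thorax impressed.
    print(round(measured_ratio(1.4, 1.0, 0.8), 2))
```

Under these assumed values, an insect whose true ratio lies above the proposed 1.25 threshold would yield a measured ratio below it, so the trace alone cannot place the tracemaker on either side of that threshold.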

Second, the mayfly thorax is strongly humped dorsally to accommodate flight musculature. Even if the insect responsible for the trace fossil was similar to primitive crown group mayflies like Siphlonurus (a comparison used by Marden 2013), we would not expect that the dorsal enlargement of the thorax would be visible in a ventral impression, or that the true volume of the thoracic cavity be accurately estimated by a simple width measurement.

Third, the thorax:abdomen width ratio metric Marden used in a summary of Permian flying insects (Marden 2013, S1) is supposedly derived from work that measured muscle mass. In the previous studies referenced by Marden to support this metric, it was the ratio of muscle mass to body weight of various flying animals from a number of groups that showed a strong correlation with the ability to fly (Marden 1987) and the ratio of thorax to abdomen mass of lepidopterans (Srygley and Chai 1990) that correlated to flight performance (not the ability to fly). Using a simple metric, such as width of the thoracic cavity, might seem like a sensible proxy of flight performance if the mass (and therefore thoracic volume) was evenly distributed within the thorax, but that clearly is not the case in many flying insects, including modern Ephemeroptera, and as mentioned above, this is not possible to measure in the trace fossil.

Finally, the summary data (N = 14) compiled to make this point in Marden's (2013) comment come solely from reconstructions of Permian flying insects. Reconstructions are not considered reliable sources of data from the fossil record, mainly because they are the product of a number of partial specimens (especially in the case of insects) that may have suffered significant taphonomic effects and that may not have contained all of the elements presented in the reconstruction. In other words, they are a plausible idea of what the insect may have looked like. Marden (2013) states that data for S1 came from the website by R. Beckemeyer, a compilation of many reconstructions by Tillyard, Carpenter, and others, which provides a warning to this effect and is not peer-reviewed. Moreover, many of the reconstructions by Tillyard were quite fanciful, biased by his own preconceived notions of how fragmentary specimens were purportedly related to the extant fauna, and independent of any application of phylogenetic method. Even contemporaneous authors seriously questioned his hypotheses (Carpenter 1930a, b), and today many of his taxa have been dramatically reinterpreted (Carpenter 1992; Grimaldi and Engel 2005). Basing conclusions on such “evidence” is erroneous, and any credible study would critically evaluate the material much as modern paleontologists are doing. In sum, Marden's (2013) assertion that the distribution of data indicates the ratio for flight ability should be about 1.25 is not currently based in fact, has never been tested or verified, and is not applicable to the trace fossil in question.

Marden Uses Live Stoneflies in an Experiment to Attempt to Prove that Stoneflies Can Produce Traces Like the Fossil in Question

Experimental neoichnology, or the experimental production of traces using modern analogs, is a growing area of interest to paleontologists. It must be performed under consideration of a good number of caveats, however. Many of the caveats are direct corollaries of the principles of paleoichnology: (1) the same animal can produce many different traces using different behaviors; (2) different animals can produce similar-looking traces using similar or different behaviors; (3) the sedimentary media have a strong influence on the behavior of the organism and the resulting trace; and (4) taphonomic factors outside of experimental control could change the appearance of the trace.

Marden (2013) used captured Taeniopteryx sp. (Plecoptera) as a model animal for an experiment that involved allowing the stonefly to skim across the surface of a pool of water propelled by wing power, “dock” at the edge of the pool, and crawl away over a sedimentary medium. Although this observation proved nothing new (stoneflies are known to be able to skim, dock, and crawl away while flapping their wings), Marden (2013) implied that it was evident that the trace fossil in question was the result of the same behavior by a similar animal. Because a full-body impression like fossil SEMC-F97 was not created by the natural behavior of the stonefly in the experiment, Marden (2013) extended the experiment by gluing a pin to the dorsal surface of the stonefly thorax and pressing the animal into wet kaolinite to make an impression. This “rubber stamp” method to produce a full-body impression is wholly inappropriate. Not only does it imply the producer had to be pushed into the sedimentary medium to create an impression, but it also does not reflect the natural behavior of the organism. In this particular case, the natural behavior and posture of the organism in question are of critical importance.

It should further be noted that neoichnological experimentation using only stoneflies, without mayflies as a method of contrast to the original hypothesis proposed by Knecht et al. (2011), suggests prejudice toward a desired outcome. This also highlights a misunderstanding of one of the aforementioned caveats of experimental neoichnology: that different tracemakers can make similar traces. Identifying a tracemaker is a complicated process and involves understanding the integral relationships among ethology, anatomy, sedimentology, and taphonomy.

Finally, the use of wet kaolinite as a sedimentary medium for the experiment is not representative of the kind of sediment with which the original organism would have interacted. “Mud,” like that in which the trace fossil in question was made, is a combination of silt and clay, and the proportion of silt to clay in the mixture is an important determining factor in the production and taphonomic alteration of the trace. Pure clay (i.e., kaolinite), as used by Marden (2013), has more adhesive properties and lower permeability than a typical mud with silt content. These differences will affect how the model animal behaves and how its tracks are preserved. Proper experimental technique would have included microscopic determination of the grain size distribution of the sediment in the fossil specimen, and the subsequent production of an experimental medium of similar distribution.
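As a rough illustration of the kind of tabulation such a determination would feed, the Python sketch below bins grain diameters into clay and silt using the standard Wentworth boundaries (clay finer than 1/256 mm, silt finer than 1/16 mm). The function names and the diameters in the usage example are hypothetical, no grain measurements from SEMC-F97 are implied, and the result is a proportion by grain count, which is an assumption of this sketch and is not equivalent to a proportion by mass or volume.

```python
# Wentworth boundaries expressed in micrometres.
CLAY_MAX_UM = 1000.0 / 256.0   # 1/256 mm, about 3.9 um
SILT_MAX_UM = 1000.0 / 16.0    # 1/16 mm, 62.5 um


def classify_grain(diameter_um):
    """Return 'clay', 'silt', or 'coarser' for a grain diameter in micrometres."""
    if diameter_um <= 0:
        raise ValueError("diameter must be positive")
    if diameter_um < CLAY_MAX_UM:
        return "clay"
    if diameter_um < SILT_MAX_UM:
        return "silt"
    return "coarser"


def size_fractions(diameters_um):
    """Fraction of grains (by count) falling in each size class."""
    if not diameters_um:
        raise ValueError("no grains measured")
    counts = {"clay": 0, "silt": 0, "coarser": 0}
    for d in diameters_um:
        counts[classify_grain(d)] += 1
    total = len(diameters_um)
    return {name: n / total for name, n in counts.items()}


if __name__ == "__main__":
    # Invented diameters (um) for illustration only.
    print(size_fractions([1.0, 2.0, 10.0, 30.0, 100.0]))
```

An experimental medium could then be mixed to approximate the silt-to-clay proportion observed in the fossil matrix rather than defaulting to pure clay.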

Marden Suggests that Specimen SEMC-F97 Supports “… Surface Skimming as a Precursor to the Evolution of Flight in Insects”

Although of considerable significance, trace fossil specimen SEMC-F97 does not illuminate the origins of powered flight, an event now understood to have taken place at least 80–90 million years earlier (Engel and Grimaldi 2004; Grimaldi and Engel 2005). Plecoptera, even stem-group Plecoptera, are phylogenetically nested within Neoptera and by some accounts within Polyneoptera (Terry and Whiting 2005; Grimaldi and Engel 2005; Yoshizawa 2011), such that even a cursory examination and understanding of insect phylogeny reveals the vast gulf between stoneflies and the common ancestor of pterygote insects, not to mention that skimming is not a plesiomorphic trait of stoneflies (Will 1995).

The fact that stem groups of lineages, such as Plecoptera, Ephemeroptera, and perhaps even Odonatoptera, had terrestrial immatures, not aquatic immatures like their modern counterparts (e.g., immature Protereismatidae lacked gills and were terrestrial, immature Palaeodictyoptera were the same, as were immature Lemmatophoridae of the stem lineage to Plecoptera: Grimaldi and Engel 2005; Engel et al., in press), renders any hypothesis reliant on their being aquatic moot. In addition, recent reviews of molecular developmental and paleontological work on the formation of insect wings have demonstrated that they are not derived from gill structures (Engel et al., in press), further distancing skimming behavior from anything associated with a gill-like structure.

Finally, while attempts have been made to push Plecoptera back into the Carboniferous (Béthoux et al. 2011; a fact that if true would still not alter any of these conclusions), they have been based on recircumscribing the order to encompass a broader suite of putative taxa (similar to recircumscribing Homo sapiens to encompass those taxa which fall along its stem lineage) rather than discovering any species which preserves definite synapomorphies currently known to characterize species within crown-group Plecoptera.

The time period and cladistic position of any plecopteran (living or fossil) rule it out as a “proto-flyer.” Seeking a stonefly to reveal flight origins among insects is the equivalent of using an antelope or orangutan to understand the origins of tetrapods, and violates even basic tenets of evolutionary biology and phylogenetic reasoning. Skimming stoneflies, while considerably fascinating in and of themselves (we do not deny the remarkable significance of this behavior among stoneflies nor those elegant biomechanical studies which have characterized it), are nothing more than adaptive storytelling in relation to the origins of insect flight (Gould 1979; Engel et al., in press).

In summary, we feel that all of the aforementioned inaccuracies misinform the reader about the data available from specimen SEMC-F97 and the evolutionary history of the Plecoptera. At best Marden (2013) is a naïve approach to reinterpretation. It is phenomenally peculiar that a restudy of this material should fail to ever examine the specimens firsthand. The material is openly available in a public institution for any qualified researcher to examine and is a critical part of the dataset. This suggests a desire to make data conform to a predetermined hypothesis rather than to critically evaluate or reconsider original observations. This approach coupled with poor ichnological and neoichnological practices suggests the need for a more critical evaluation of Marden (2013), specifically the validity of the methods used within and the questionable nature of its conclusions.