Impact of a student–teacher–scientist partnership on students' and teachers' content knowledge, attitudes toward science, and pedagogical practices



Engaging K-12 students in science-based inquiry is at the center of current science education reform efforts. Inquiry can best be taught through experiential, authentic science experiences, such as those provided by Student–Teacher–Scientist Partnerships (STSPs). However, very little is known about the impact of STSPs on teachers' and students' content knowledge growth or changes in their attitudes about science and scientists. This study addressed these two areas by examining an STSP called “Students, Teachers, and Rangers and Research Scientists” (STaRRS). STaRRS was incorporated into the existing long-standing education program “Expedition: Yellowstone!” For teachers, a pre-test, intervention, post-test research design was used to assess changes and gains in content knowledge, attitudes, and pedagogical practices. A quasi-experimental, pre-test–post-test, comparison group design was used to gauge gains in students' content knowledge and attitudes. Data analyses showed significant positive shifts in teachers' attitudes regarding science and scientists, and shifts in their pedagogical choices. Students showed significant content knowledge gains and increased positive attitudes regarding their perceptions of scientists. The findings indicate that STSPs might serve as a promising context for providing teachers and students with the sort of experiences that enhance their understandings of and about scientific inquiry, and improve their attitudes toward science and scientists. © 2013 Wiley Periodicals, Inc. J Res Sci Teach 51: 84–115, 2014

For the past three decades, national science education reform efforts have consistently and strongly endorsed inquiry-based learning as the most effective way of connecting students to science (American Association for the Advancement of Science [AAAS], 1989). The National Science Education Standards (NSES; National Research Council [NRC], 1996) defined inquiry in two ways. First, inquiry was framed as a set of integrated processes representing the “diverse ways in which scientists study the natural world and propose explanations based on the evidence derived from their work” (p. 23). The NSES called for enabling students to develop the skills needed to do inquiry, as well as develop understandings about the nature of inquiry. Second, inquiry was conceived as a teaching strategy and learning approach, which could frame science education.

Reform efforts continued to redefine and broaden the scope of how science education should look (e.g., NRC, 2000, 2007). Taking Science to School: Learning and Teaching Science in Grades K-8 (NRC, 2007) summarized more than a decade of research on science teaching and learning, and proposed four strands of proficiency in science education: (a) Knowing, using, and interpreting scientific explanations of the natural world, (b) generating and evaluating scientific evidence and explanations, (c) understanding the nature and development of scientific knowledge, and (d) participating in scientific practices and discourse (p. 36). The report Learning Science in Informal Environments: People, Places, and Pursuits (NRC, 2009) added two other strands for students, namely, to: (a) Experience excitement, interest, and motivation to learn about phenomena in the natural and physical world, and (b) think about themselves as science learners and develop an identity as someone who knows about, uses, and sometimes contributes to science (p. 4) (also see Krajcik, Czerniak, & Berger, 2003). A more recent reform document, A Framework for K-12 Science Education (NRC, 2011), laid a roadmap for future science curriculum development focused on student performance expectations that are tightly aligned with three critical dimensions: scientific and engineering practices, disciplinary core ideas, and crosscutting concepts. Drawing on this framework, The Next Generation Science Standards [NGSS] (Achieve, Inc., 2013) emphasized student engagement with scientific practices across the K-12 curriculum.

The aforementioned reports (e.g., AAAS, 1989; NRC, 2000, 2007, 2009, 2011) have emphasized the need for multiple ways to engage students with scientific inquiry in the sense of “participating in normative scientific practice akin to those that take place and govern scientific work” (NRC, 2009, p. 70). These are sometimes called authentic science experiences (Chinn & Malhotra, 2002), which aim to engage students in finding evidence-based answers to questions and problems in natural contexts with no pre-determined solutions (McKay & McGrath, 2006). The present study focused on Student–Teacher–Scientist Partnerships (STSPs), which provide a meaningful context for engaging students with actual or authentic scientific investigations.


It is well documented that “student learning of science depends on teachers having adequate knowledge of science” (NRC, 2007, p. 296). “Knowledge of science,” in this sense, is more than just understanding science content. Only when teachers become more comfortable with both science content and the processes through which claims to scientific knowledge are generated and validated, will they be able to enact the vision of the science education reforms outlined above (NRC, 1996). That is, science teachers need to develop pedagogical content knowledge (PCK) for teaching, which addresses both the substantive and syntactic dimensions of their disciplines (Schwab, 1978; Shulman, 1986, 1987, 1992). Toward enabling science teachers to develop their PCK, the NRC (2007) reiterated the need to provide teacher professional development (PD) that includes experiential learning opportunities with time to: (a) generate answerable questions, (b) think scientifically, (c) analyze phenomena, (d) interpret evidence, and (e) engage in meaningful discourse about the validity of generated claims. The latter behaviors are typical of research science culture. Partnering with scientists or basing a partnership on aspects of scientific field research could provide a powerful context to engage teachers with such scientific practices. Indeed, providing such direct experiences for teachers, which are modeled after the ways we want them to teach their own students, is what Taking Science to School highlighted as “an important trend” (NRC, 2007, p. 311).

Student–scientist partnerships are one strategy that employs authentic, inquiry-based learning to provide students and teachers with access to the scientific community and allows for their engagement with actual scientific research. The benefits to students (i.e., engagement with authentic science) and scientists (i.e., securing additional resources for data collection efforts) often are highlighted in such partnerships (Harnik & Ross, 2003b; Wormstead, Becker, & Congalton, 2002). However, the crucial engagement and intermediary role of science teachers often seems to be either taken for granted or subsumed by the assumptions underlying the partnerships. In this study, we use the label STSPs to highlight the central role that teachers play both as “students” (learning about scientific research) and as “teachers” (transforming what they learn into pedagogical enactments that progressively approximate scientific practice). In this sense, STSPs are defined as partnerships in which students, teachers, and scientists work together to answer real-world questions about a phenomenon or problem the scientists are studying. STSPs are based on conducting scientific research that is enhanced by pre-college student participation (Tinker, 1997), and are believed to provide teachers and students with meaningful firsthand experiences of scientific practices.

This study was undertaken in the context of an STSP titled “Students, Teachers, and Rangers & Research Scientists: Investigating Earth Systems at Mammoth Hot Springs” (STaRRS). STaRRS is an “embedded” STSP, meaning the components of the partnership were designed for integration into an existing educational program. Consistent with the aforementioned centrality of science teachers to such endeavors, an additional key feature of this partnership was its focus on teachers' needs, which were met by providing extensive, research-based, on-going PD. Moreover, on-going, site-based, year-long support for participant teachers was built into the partnership, and intended to address problems with content, inquiry processes, and logistical issues (including those with equipment and protocols developed for the partnership) so that teachers would be better equipped to address related challenges. In the context of student and teacher engagement with STaRRS, this study aimed to answer the following questions: (1) What is the impact of participation in the STaRRS partnership on teachers' science content knowledge, attitudes toward science and scientists, and pedagogical strategies? and (2) What is the impact of participation in the STaRRS partnership on students' science content knowledge and attitudes toward science and scientists?

Literature Review and Theoretical Framework

The literature attributes several benefits to STSPs, which fall into two categories: Benefits for education (participating teachers and their students) and benefits for scientists (the scientist or research group in need of specific data). Stated educational benefits include providing authentic experiences (Donahue et al., 1998; Harnik & Ross, 2003b; Moss, Abrams, & Kull, 1998; Tinker, 1997), which in turn give students increased understanding of the scientific research process (Evans, Abrams, Rock, & Spencer, 2001; Finarelli, 1998; Harnik & Ross, 2003b; Ross et al., 2003; Wurstner, Herr, Andrews, & Alley, 2005). STSPs also have been described as vehicles for changing students' attitudes toward and interest in science (Caton, Brewer, & Brown, 2000; Comeaux & Huber, 2001; Ross et al., 2003; Wormstead et al., 2002; Wurstner et al., 2005). Some studies found that particular partnerships led to perceived increases in students' understanding of specific content. The latter perceived gain was considered to be an important feature even though no supporting empirical data were collected (Finarelli, 1998; Gilmer, 1997). Benefits for teachers, including gains in content knowledge and an increased use of inquiry-based instructional strategies, have been noted as well (Caton et al., 2000; Comeaux & Huber, 2001; Evans et al., 2001; Ross et al., 2003; Wormstead et al., 2002).

For scientists, the benefits of STSPs are twofold. First, many studies found that STSPs give scientists the ability to collect data that would be difficult or impossible to acquire without extra help (Lawless & Rock, 1998; Ross et al., 2003; Tinker, 1997; Wormstead et al., 2002; Wurstner et al., 2005). Second, partnerships provide a vehicle to engage with K-12 education in a way that brings more effective teaching strategies to college level instructors through scientists' personal engagement with K-12 educators (Caton et al., 2000; Donahue et al., 1998).

Experiential Learning and Authenticity

To a major extent, the impact of STSPs is premised on the effectiveness of experiential learning, which Kolb (1984) described as “knowledge that results from the combination of grasping and transforming experience” (p. 41). The roots of experiential learning can be traced to Dewey's (1943) progressive education with its emphasis on learning through experience. Rogers (1959) further posited that experiential education is meaningful when there is personal involvement, self-initiation, and the freedom to explore. These scholars felt it was important to create educational experiences that resemble or, better, provide immersion within, disciplinary practice. Students thus become immersed in a discipline beyond its substantive content, also learning about its syntactic nature, that is, the inquiry procedures used to determine the validity of substantive disciplinary content (Schwab, 1978).

While inquiry-based science learning can provide some insight into the work of scientists, it does not necessarily entail engagement with authentic scientific practice (Chinn & Malhotra, 2002). The word authentic is frequently used in the education literature to describe experiential activities and contexts that mirror activities conducted by practitioners (e.g., Harnik & Ross, 2003a; Wormstead et al., 2002). When used in reference to science experiences, definitions of authenticity differ, ranging from modeling what scientists do (NRC, 1996) to focusing on student-designed investigations that produce artifacts representing student learning (Marx, Blumenfeld, Krajcik, & Soloway, 1997).

Experiential learning can connect the process of authentic scientific inquiry with inquiry-based learning. STSPs make such a connection possible. Usually, STSPs are built and maintained between university scientists and K-12 teachers and students. Authenticity for STSPs is ensured by the fact that a major aim is the generation of new scientific knowledge, with K-12 students participating to make meaningful contributions that are central to scientists' work (Barstow, 1996). A major driver for STSPs is a “real” scientific problem, which could benefit from a distributed data collection effort both to mitigate resource limitations and to enhance the reach of scientists (e.g., geographically or temporally). Data quality often is paramount. Thus, if required, the research scientists' granting organization sometimes funds the provision of sophisticated equipment to students. Moss et al. (1998) endorsed the benefits to science, but also highlighted the reciprocal benefits to science education. They argued, “that in order for students to be involved in the process of doing scientific research [and, thus, serve the science], they must first begin to develop an understanding of what that process entails” (p. 150). Indeed, Moss et al. found that limiting student involvement to specified data collection protocols, which aimed to answer scientists' questions in an STSP, also limited “the scope of the project for the students” (p. 159). They recommended that students also be allowed to explore questions they developed themselves. Links between student data and the research scientists' questions in the partnership were considered to be less important than student participation in the processes of science:

Whether scientists make use of data from student-generated areas of inquiry is unimportant. What is important is that students will be both contributing to authentic research, by following provided protocols, and will be experiencing a broader range of what the research process entails by exploring their own questions. (p. 159)

The sorts of educational benefits advocated for by, among others, Moss et al. (1998), it turns out, are synergistic with scientists' needs because STSPs can, thus, serve as vehicles to fulfill “outreach” and “broader impact” requirements for scientific research, which often are mandated by U.S. federal funding agencies. Michaels, Shouse, and Schweingruber (2008) also emphasized these broader educational benefits to learners accruing from STSPs, noting that:

When students engage in science as practice, they develop knowledge and explanations of the natural world as they generate and interpret evidence. At the same time, they come to understand the nature and development of scientific knowledge while participating in science as a social process. (p. 35)

While STSPs appear to be ideal contexts that provide all elements necessary for mutual benefits for science and science education—including interactive, authentic experiences for students—they have had a history fraught with challenges. Developing and sustaining effective and reciprocally beneficial partnerships is rather difficult.

Challenges Facing STSPs

Definitions of STSPs, such as those advocated by Barstow (1996), essentially suggest or assume that students involved in “authentic” science experiences would or should perform tasks in a manner that corresponds to that of scientists. Such definitions, nonetheless, are problematic because, given their age and training, K-12 students are not in a position to perform professional scientific inquiry. Instead, Lee and Songer (2003) argued, there is a need to make distinctions between professional and student scientific inquiry. For instance, “authentic” skills learned by students may be conceived of as those that will enable students to cope with everyday life problems beyond classroom settings (Brown, 1993). Another approach to addressing this definitional problem is to adopt the notion of hybrid activities (Brown, Collins, & Duguid, 1989). Brown et al. (1989) argued that students must wrestle with emergent problems that contain authentic activity. They used the term hybrid activities to describe specific types of authentic, experiential, science activities, which are framed by one culture but attributed to another. For example, such experiences include scientific investigations in which the practitioner already knows the outcome, but the student does not. Still, even after addressing issues associated with defining authenticity within STSPs, such partnerships are faced with substantial challenges.

To start with, the need for extensive, wide-ranging data collection has raised several concerns related to the quality and use of student data. Indeed, data quality has been the focus of much of the literature on STSPs (Dolen & Tanner, 2005; Evans et al., 2001; Harnik & Ross, 2003b; Lawless & Rock, 1998; Ross et al., 2003; Tanner, Chatman, & Allen, 2003). Other challenges for STSPs have been identified by a small body of literature and include similarities and differences between the culture of science and school science (Barstow, 1996; Carr, 2002; Moreno, 2005; Tinker, 1997), and the identification of good questions, projects, or studies for partnerships (Doubler, 1996; Tinker, 1997). Carr (2002), Barstow (1996), Caton et al. (2000), Ledley, Haddad, Lockwood, and Brooks (2003), Moreno (2005), and Tomanak (2005) identified many cultural challenges facing STSPs, which at times are invisible to each set of participants. There are basic differences in the knowledge base among participant students, teachers, and scientists, as well as disparities in the ways conflict is viewed and dealt with. For example, while unrelenting and critical scrutiny of data protocol implementation is considered part and parcel of scientific discourse, it could be considered “off-putting” from a pedagogical perspective. These and other differences can create misunderstanding between the partnering university research science and education cultures.

Challenges beyond cultural differences that often appear in the literature fall loosely into five categories: (a) content knowledge background needs of teachers and scientists, (b) accuracy and relevance of student data, (c) securing and negotiating resources for both scientists and teachers (materials, time, and personnel), (d) communication needs and barriers, and (e) outside factors affecting both the educational and research communities. In many cases, existing literature tells us that if the above challenges inherent in STSPs are not addressed, they can seriously impede a partnership (Evans et al., 2001; Ledley et al., 2003; Moreno, 2005; Tanner et al., 2003).

The bulk of extant studies have focused on identifying and addressing challenges associated with STSPs. However, systematic empirical studies aimed at gauging the often-stated benefits of STSPs for participating students and teachers are virtually absent. The present study aimed to address this gap in the research literature by investigating whether STSPs actually result in increased content knowledge and improved attitudes toward science among middle school students and their science teachers, as well as desired shifts in participating teachers' instructional practices. The present study also shed light on components of STSPs that may facilitate growth among students and teachers.

Addressing Challenges Facing STSPs

Along with identifying the abovementioned challenges, researchers have put forth a number of recommendations for mitigating the impact of such challenges (e.g., Carr, 2002; Caton et al., 2000; Doubler, 1996; Harnik & Ross, 2003b; Lawless & Rock, 1998; Ledley et al., 2003; Moreno, 2005; Tinker, 1997). These recommendations can be condensed into seven domains: (a) True partnerships need to be structured with an eye toward addressing hierarchical issues and power imbalances between scientists, teachers, and students; (b) partnerships must include open and frequent communication among the partners; (c) the needs of all participants must be incorporated into the partnership design and activities, including those of the research scientists and the students; (d) data quality and use must be addressed; (e) research questions pursued by students need to be carefully selected so that they are appropriate for a partnership's context; (f) long-term relationships must be actively developed with attention to sustainability; and (g) a third-party liaison should be included in the partnership. This liaison is a person who is familiar with the worlds and cultures of both the education and scientific communities. The liaison acts as a facilitator to help mediate relationships in the partnership and raise the scientists' and educators' understanding of each other's goals and needs. As will become evident below, these recommendations were given due consideration when designing and implementing the STaRRS partnership.


This study used a quasi-experimental design with two components. Impact of the STaRRS partnership on participant students' science content knowledge and attitude toward science and scientists was assessed using a pre-test–post-test, non-randomized, comparison group design. A pre-test–post-test single group design was used to assess impact on teachers' science content knowledge, attitudes, and pedagogical strategies. The STaRRS partnership served as the intervention for students and teachers in the treatment group.
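As a minimal illustration of how data from a pre-test–post-test comparison group design might be summarized, the sketch below computes the mean gain (post-test minus pre-test) for each group and the difference between the two gains. It is an assumption-laden example, not the analysis used in this study: the function names are hypothetical, the scores are invented purely for illustration, and an actual analysis would also require an appropriate inferential test.

```python
from statistics import mean


def mean_gain(pre, post):
    """Mean of per-participant gains (post minus pre).

    The two lists are assumed to be paired by position, i.e. pre[i] and
    post[i] belong to the same participant.
    """
    if len(pre) != len(post):
        raise ValueError("pre and post scores must be paired")
    return mean([after - before for before, after in zip(pre, post)])


def difference_in_gains(treat_pre, treat_post, comp_pre, comp_post):
    """Mean gain of the treatment group minus mean gain of the comparison group."""
    return mean_gain(treat_pre, treat_post) - mean_gain(comp_pre, comp_post)


# Hypothetical scores, invented for illustration only (NOT study data).
treatment_pre = [10, 12, 9, 11]
treatment_post = [15, 16, 13, 16]
comparison_pre = [10, 11, 9, 12]
comparison_post = [12, 12, 10, 14]

print("Treatment mean gain:", mean_gain(treatment_pre, treatment_post))
print("Comparison mean gain:", mean_gain(comparison_pre, comparison_post))
print(
    "Difference in gains:",
    difference_in_gains(
        treatment_pre, treatment_post, comparison_pre, comparison_post
    ),
)
```

Because group assignment in such a design is non-randomized, a difference in gains computed this way describes the groups but does not by itself rule out pre-existing differences between them.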

Development of the STaRRS STSP

The STaRRS STSP was created to attend to some of the challenges noted in the literature. The primary challenge was to ensure that students did more than just collect data for the scientists' research project, because data collection alone offers limited opportunities for them to be involved in the process of doing scientific research. The activities were designed to help students engage in authentic science experiences grounded in the tools, techniques, attitudes, and processes of scientific inquiry. Additionally, STaRRS provided students with opportunities to explore questions of their own interest, collect scientific data, and communicate their results to audiences beyond their classrooms.

Three scientific fieldwork components at Yellowstone National Park (YNP) were developed for STaRRS. These included the use of (a) whole group photographic data collection at specific locations to monitor change in YNP hot springs geomicrobiology over time; (b) small group descriptive hot springs data collection using specified protocols to provide students with experiences collecting extensive sets of data about a system at a single point in time; and (c) small group, student-generated research investigations in which students used their knowledge of the system to develop relevant, answerable research questions. These research activities provided a full inquiry cycle and research experience for students and connected them to a larger scientific research project (Chinn & Malhotra, 2002).

Population and Participants

Expedition: Yellowstone! (E:Y!) is a long-standing, curriculum-based residential environmental education program located in YNP. E:Y! provides 4- or 5-day experiences for fourth through eighth grade students focusing on the areas of geology, ecology, and human history. The present partnership was designed to enhance and supplement the geology portion of the E:Y! experience for students and teachers.

The target population for the study was teachers who bring their students to E:Y! These teachers are dedicated to providing extraordinary experiences for their students. The recruitment pool included 45 experienced E:Y! teachers who were planning to bring students to YNP during the 2008–2009 school year. From this pool, nine teachers from eight schools in six U.S. states volunteered to participate in STaRRS. Volunteering to be in the treatment group required the ability to participate in an initial summer workshop and be scheduled to participate in E:Y! during the 2008–2009 school year. The nine treatment teachers had between 5 and 21 years of experience. Five of the schools were public and three were private. Five teachers brought 5th graders, one brought a combined group of 5th and 6th graders, one brought 7th graders, and one brought 8th graders, for a total of 193 students in the intervention group. Five groups were self-contained with one teacher instructing students in all subject areas. Each of these five groups brought all students from that particular grade level to E:Y! Of the remaining three groups, one was self-contained, with all students of that age from the school attending E:Y! However, the STaRRS teacher was not their regular classroom teacher. The final two groups had only a portion of the school's students attend E:Y! In both cases, students applied to attend and spent extra time outside of regular classes preparing for the trip. The STaRRS teachers for the latter two groups were far removed from their students' everyday classes; one was a second grade teacher, the other a resource reading specialist.

Eleven more E:Y! teachers, most of whom could not participate in the initial summer workshop for various reasons, from nine schools in three states volunteered as participants in the comparison group. The comparison group teachers had between 5 and 25 years of teaching experience. All of the students in these groups attended E:Y! with their primary teacher. The comparison group comprised a total of 187 students: Four 6th grade classes, four 4th grade classes, and one combined 5th and 6th grade class. Only one of the comparison schools was private, and this was the only school whose students applied to participate in E:Y! and spent time outside of the regular school day to prepare for their expedition. Every teacher who volunteered for the study was accepted.


The STaRRS partnership was developed with the intention of embodying essential characteristics for success, as well as addressing some of the STSP challenges identified in the literature, including: (a) data accuracy issues, (b) communication difficulties, (c) resource availability, (d) participants' needs, and (e) cultural differences (i.e., school science versus scientific practice) through the use of a third-party liaison (Carr, 2002; Caton et al., 2000; Doubler, 1996; Harnik & Ross, 2003a; Lawless & Rock, 1998; Ledley et al., 2004; Moreno, 2005; Tinker, 1997).

Components and Goals of the Partnership

STaRRS had two main components. These components consisted of a summer PD workshop for teachers and on-going school-year support for teachers and their students. The workshop was a 4-day, 45-hour intensive workshop that took place in YNP and covered four areas: (a) hot springs geomicrobiology, (b) specific field tools, (c) hot springs field science, and (d) integration and transfer to classrooms.

There were three main goals developed by the partnership for participant teachers. The first goal was to provide teacher PD that included content enabling teachers to make explicit connections to how the study of Mammoth Hot Springs (MHS) helps the broader scientific community understand early earth environments, ancient and modern coral reef systems, and the search for life on other planets. The second goal focused on ways of making science more accessible to students by involving them in data collection processes. Finally, the workshop was designed to model for teachers the field experiences in the same way their students would be involved in them during their expeditions. This approach provided an opportunity for teachers to work on integrating the target content and processes within their own school contexts.

Field science was defined by the partnership as investigations (using scientific practices) that could only be conducted in situ (or on location). Questions developed by teachers and students had specific parameters. These parameters prevented the development of questions that could be answered by other research methods, including Internet, journal, or personal communication sources. Specific field tools, in this case, were selected for their use at the hot springs, though they were also appropriate for classroom and playground settings. Finally, questions were developed such that they made use of the data collection tools.

To accomplish these goals, the university PD team, including geologists, rangers, and education specialists, provided field experiences in YNP, in which teachers learned content about the hot springs in the field and how to use the specifically chosen tools. Teachers were able to directly connect with the hot springs systems and prior research, and begin to ask and answer their own questions through the planned inquiry sessions. After the on-site field experiences, additional assistance was provided to STaRRS teachers by the primary author in the form of school visits, electronic communications, and conference calls.

STaRRS Pre-Expedition Preparation

STaRRS teachers spent approximately two additional weeks of class time, beyond the normal E:Y! preparation time, getting their students ready for their STaRRS expeditions. During this additional preparation time, they taught the students how to use the various tools, some basic hot springs systems content, and scientific inquiry skills, mostly related to asking questions that lend themselves to investigation using data available through project tools. Teachers did this initially using content the students were familiar with, and then they expanded it to hot springs content. During this pre-expedition preparation time, many teachers utilized their STaRRS peers and participated in on-line discussions. The primary author, as the liaison, spent 1–3 days in each classroom prior to the expedition to support teachers during this process. Communication among the teachers, students, and scientists, as well as among the STaRRS teachers themselves, was facilitated throughout the school year using on-line discussion boards, weekly emails, and video conferencing.

Differences Between STaRRS and Comparison Group Expeditions

At E:Y!, STaRRS students spent more time studying MHS geology than regular E:Y! groups (up to 8 hours vs. 3). Table 1 illustrates the differences between a regular experience (Comparison groups) and one that involved the STaRRS intervention (Treatment groups) during their time at Yellowstone. The STaRRS expedition strayed from the regular E:Y! schedule on days 1 and 2. Instead of dividing the second day between Norris Geyser Basin and MHS, STaRRS groups spent the entire day at the hot springs. At MHS, rangers led some of the usual E:Y! geology activities and then teachers guided their students through the three components of the partnership in the field. Students collected photo point data, grid transect data (creating a picture of a specific point of the hot springs at a given moment in time), and data to answer questions that they had developed prior to leaving for the field. Eliminating the driving time between the locations provided more time for data collection at three different hot springs. At the end of this field experience, students synthesized their findings and presented their preliminary results to their peers. On several occasions, a visiting scientist or other Yellowstone Park staff attended these presentations.

Table 1. Summary of instructional focus and time spent during E:Y! and STaRRS expeditions
| Days and Focus | Instructional Activities | Time Spent (E:Y!) | Time Spent (STaRRS) |
| --- | --- | --- | --- |
| Day 1: Stewardship | Stewardship content | 1–2 hours | 1–2 hours |
| Evening | Geology content | 2 hours | 1 hour + 1 hour STaRRS |
| Day 2: Geology | Geology content | 1 hour | 1 hour STaRRS |
| | Norris Geyser Basin hike | 2–3 hours | N/A |
| Evening | Mammoth Hot Springs hike | 2–3 hours | 5–6 hours including field work, +1–2 hours presenting |
| | Ecology content | 2 hours | 2 hours |
| Day 3: Ecology | Ecology + hike | 6–7 hours | 6–7 hours |
| Evening | Human history content | 2 hours | 2 hours |
| Day 4: Human history | Human history + hikes | 6–7 hours | 6–7 hours |
| Evening | Campfire | 2 hours | 2 hours |
| Day 5: Wrap up | Cleaning up and checking out | 2 hours | 2 hours |

Note. Shading indicates differences.

STaRRS Post-Expedition Follow-Up

Following E:Y!, STaRRS teachers spent approximately two additional weeks guiding students through further analysis, processing, and data preparation for formal presentations of their findings within their community. Some groups presented these findings to their families and friends during a special evening event; others presented their results to school board members at a regular board meeting. One school submitted its presentations to the regional science fair.

A 3-day follow-up workshop for STaRRS teachers was conducted at the end of the study. This final workshop gave teachers, rangers, and scientists a chance to reflect on the experiences, learn new content, and plan for future work.

Comparison Group Participation

Comparison groups attended their regular E:Y! expeditions with no additional time spent prior to, at, or after E:Y! on geology or field science concepts. Comparison group teachers did not attend any PD or receive any school year support from the STaRRS staff. For the data collection, they administered the same instruments to their students prior to and following their expeditions. In return for submitting these assessments, each teacher received a digital camera and an infrared thermometer. Later they were also given access to the STaRRS curriculum and invited to participate in data collection of photo points the following year.

Procedures and Instruments

As described earlier, as a part of the intervention, treatment students experienced the STaRRS partnership in their classrooms and embedded in their E:Y! experience, while comparison students engaged in typical E:Y! expedition activities during the same school year. Hereafter, treatment students will be referred to as STaRRS students and comparison students as E:Y! students.

To answer the first research question, data were collected from the teachers prior to their initial and final PD workshops. Three assessments were used. The first covered earth science content, the second surveyed teacher attitudes, and the third was an extensive survey of enacted curriculum. The second research question was answered by collecting pre- and post-expedition data from both E:Y! and STaRRS students using two assessments. The first focused on specific earth science content and the second surveyed students' attitudes regarding science and scientists.

Assessing Participants' Science Content Knowledge

The Geoscience Concept Inventory (GCI) was used to assess teachers' science content knowledge. The validity and reliability of the GCI were established through the work of Libarkin and Anderson (2008) using Rasch analysis. They reported a KR-20 classical test reliability of 0.69 with an item separation reliability of 0.99 (Libarkin & Anderson, 2008). For this study, 25 questions were selected for use with STaRRS teachers. Initially, the GCI sub-test construction guidelines were followed in order to ensure reliability of this measure. In addition to the required 15 items, 10 other questions that were closely aligned with E:Y! content were selected. Next, another Rasch analysis, using Winsteps® and the test results from a pilot group of Rangers and teachers (n = 15), was conducted on this item subtest. This analysis indicated the questions fell along the continuum identified by the original assessment developers in the pathway map. Thus, all of the items for the subtest were used in the analyses of changes in STaRRS teachers' content knowledge.

A modified version of the GCI (Libarkin & Anderson, 2006) was developed for use with participant middle school students. Questions were selected from the original instrument and adapted for younger students, and new questions were developed to match E:Y! and STaRRS content. The final instrument covered three geoscience content areas: general knowledge, E:Y! content, and STaRRS content. The GCI for Middle Level Students (GCI-MLS) consisted of 22 questions (11 multiple choice and 11 short-answer). Table 2 includes examples of questions from both the GCI and the GCI-MLS. Two scientists, two middle level science teachers, E:Y! rangers, and the researchers established the content validity of the instrument by matching the items to NSES earth science content standards and state standards for Idaho, Wyoming, and Montana. The E:Y! content subsection was composed of concepts taught by Rangers during a typical expedition, including general hot springs formation, tectonic plate movement, and erosion (causes and effects). STaRRS content items assessed specific geoscience concepts taught during STaRRS expeditions, including types of typical environments that supported early life and the Hot Springs Facies Model (Fouke et al., 2000), a specific scientific model used to study hot springs systems.

Table 2. GCI for teachers and GCI-MLS for students—sample questions
| Instrument | Question Category | Sample Questions | Response Choices With Correct Responses Highlighted |
| --- | --- | --- | --- |
| GCI (a) | STaRRS | (1) Some scientists claim that they can determine when the Earth first formed as a planet. Which technique(s) do scientists use today to determine when the Earth first formed? Choose all that apply. | (A) Comparison of fossils found in rocks; (B) Comparison of different layers of rock; (C) Analysis of uranium and lead in rock (b); (D) Analysis of carbon in rock; (E) Scientists cannot calculate the age of the Earth |
| GCI and GCI-MLS (b) | General knowledge | (18) What is groundwater? | (A) All liquid water that resides beneath the Earth's surface; (B) Muddy mixture of water and dirt that lies beneath the Earth's surface; (C) Only the water found in underground lakes and rivers that is clean enough to drink; (D) Only water that is moving beneath the Earth's surface; (E) Only water that is not moving beneath the Earth's surface |
| GCI and GCI-MLS (b) | E:Y! | (14) Fossils are studied by scientists interested in learning about the past. Which of the following can become fossils? Circle all that apply. | (A) Bones; (B) Plant material; (C) Marks left by plants; (D) Marks left by animals; (E) Animal material (like scat, i.e., animal feces) |
| GCI-MLS (a) | General knowledge | (13) Are rocks and minerals alive? | (A) Yes, rocks and minerals grow; (B) Yes, rocks are made up of minerals, and minerals are like plant cells; (C) Yes, rocks and minerals are always changing; (D) No, rocks and minerals do not reproduce |

(a) Questions were unique to each instrument.
(b) Question and answer choices were identical on both instruments.

The GCI-MLS reliability was established in two ways. First, a test–retest reliability rating was obtained using assessments from a group of 36 E:Y! students who took the measure right before and after their expedition. The Spearman reliability coefficient for the instrument was 0.69. Due to the non-linearity of the questions, the GCI-MLS was also subjected to a Rasch analysis using students' pre-tests (n = 187). This analysis indicated that the test was well matched to the sample and generally fit the model developed by Libarkin and Anderson (2008). The pathway map showed most of the items fell on the path, close to the vertical center. Because the purpose of the test was to measure improvement, the spread of the items from easy to difficult was considered carefully, and it was determined that a progression from easy to difficult items was evident in the GCI-MLS. The Wright map, item statistics, and case statistics were checked for problems, and the GCI-MLS was determined to be an adequate measure for growth.

Assessing Participants' Attitude Toward Science and Scientists

Five 10-question scales from the Test of Science Related Attitudes (ToSRA) developed by Fraser (1981) were chosen for teachers. The scales were: (a) Social Implications of Science (SIS), (b) Normality of Scientists (NS), (c) Attitude to Scientific Inquiry (INQ), (d) Adoption of Scientific Attitudes (AD-ATT), and (e) Leisure Interest in Science (LEI). The ToSRA was originally developed for use with middle and high school students. However, it has been used in several studies with adults, including undergraduates (Newbill, 2005). Newbill reported reliability coefficients for the scales at 0.82, which she determined to be sufficiently close to Fraser's (1981) original reliability (0.84). A small pilot study using 15 volunteer E:Y! teachers and Rangers produced an average reliability of 0.68 for the five scales being used with the teachers. Individual values for each scale were: SIS = 0.77; NS = 0.66; INQ = 0.90; AD-ATT = 0.32; and LEI = 0.75. The low reliability for AD-ATT is consistent with earlier data on this instrument: across Fraser's (1981) work, this scale shows an average reliability of 0.67, and the construct has typically shown a lower reliability coefficient than the other scales. When an item analysis was conducted on this scale, two of the questions (#19 and #39) had higher than average SDs. Both of these were phrased in the negative. Item 19 states, “Finding out about new things is unimportant” and item 39 states, “I am unwilling to change my ideas when evidence shows that the ideas are poor.” Both of these statements require selecting what would be double negatives for a positive attitude score, which may have created confusion for the survey takers. Given the low reliability coefficient, the results from this particular scale should be interpreted with caution.

Participant students were administered four ToSRA scales: NS, INQ, LEI, and Enjoyment of Science Lessons (ENJ). Only four scales were used with students in order to prevent assessment fatigue. It was anticipated that it would take 30 minutes to administer the GCI-MLS and another 20 minutes to complete the ToSRA with the four chosen scales, which were deemed the most relevant of the six available scales. Pre-test data (n = 366) were used to calculate Cronbach's alpha. The four scales were found to have a high degree of internal consistency, with the following values: NS = 0.67; INQ = 0.77; ENJ = 0.93; and LEI = 0.87. Their average (0.81) matched the averages reported by Fraser (1981) of 0.80 for Year 7 Australian students and 0.78 for 9th grade U.S. students. Table 3 includes sample questions from each of the scales used.
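For readers less familiar with the statistic, the following minimal sketch shows how Cronbach's alpha can be computed from item-level responses. The function name and the response data are hypothetical and are not drawn from the study; only the four reported alpha values at the end come from the text.

```python
from statistics import variance

def cronbach_alpha(responses):
    """responses: one list of item scores per respondent (equal lengths)."""
    k = len(responses[0])                      # number of items on the scale
    items = list(zip(*responses))              # regroup scores by item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_var_sum / total_var)

# Hypothetical 5-point responses from six students on a four-item scale
demo = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [3, 3, 4, 3],
    [5, 5, 5, 4],
    [1, 2, 1, 2],
    [4, 4, 3, 4],
]
print(round(cronbach_alpha(demo), 2))

# Average of the four student-scale alphas reported in the text
reported = {"NS": 0.67, "INQ": 0.77, "ENJ": 0.93, "LEI": 0.87}
average_alpha = round(sum(reported.values()) / len(reported), 2)
print(average_alpha)  # 0.81
```

The final two lines simply confirm that the four reported coefficients average to 0.81.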

Table 3. ToSRA scales for teachers and students—description and examples of positive and negatively phrased statements
| Scale | Description | Example Phrase (Positive) | Example Phrase (Negative) |
| --- | --- | --- | --- |
| Social Implications of Science (SIS) (a) | Manifestation of attitudes toward the role of science in society | Money spent on science is well worth spending | Scientific discoveries are doing more harm than good |
| Normality of Scientists (NS) | Manifestation of attitudes toward scientists as “normal people” | Scientists like sports as much as other people do | Scientists are LESS friendly than other people |
| Attitude to Scientific Inquiry (INQ) | Acceptance of inquiry as a scientific way of thinking | I would prefer to find out why something happened by doing an experiment than by being told | Doing experiments are not as good as finding out the information from teachers |
| Enjoyment of Science Lessons (ENJ) (b) | Enjoyment of learning experience in science classes | Science lessons are fun | I do NOT like science activities |
| Adoption of Scientific Attitudes (AD-ATT) (a) | Adoption of scientific attitudes and habits of mind | In science reports I report unexpected results as well as expected ones | I am unwilling to change my ideas when evidence shows the ideas are poor |
| Leisure Interest in Science (LEI) | Interest in science-related activities outside of school | I would enjoy visiting a science museum on the weekend | Listening to a talk on the radio about science would be boring |

(a) Scale is present on the teacher version of the ToSRA only.
(b) Scale is present on the student version of the ToSRA only.
All other scales are present on both teacher and student versions of the ToSRA.

Assessing Teachers' Pedagogical Strategies

The Surveys of Enacted Curriculum (SEC) are a large multiple-choice inventory that assesses teacher instructional decision-making. They are used to provide educators, administrators, and researchers with information on indicators of classroom practice. The surveys were developed by the Council of Chief State School Officers and have been thoroughly field tested to ensure validity and reliability (Blank, Porter, & Smithson, 2001). The science surveys contain more than 150 questions in three areas: (a) instructional practice, (b) subject content, and (c) teacher characteristics. STaRRS teachers were instructed to complete the SEC surveys at the outset and conclusion of the study, keeping in mind their most recent school year and corresponding set of students. The two administrations were undertaken 1 year apart. While the student groups were different, all of the teachers except one had taken their prior group of students to E:Y!, so the two groups were considered to be as equivalent as possible for the purposes of this instrument.

The surveys asked teachers to first determine the amount of time spent throughout the entire school year in 27 broad content areas. Then they identified the amount of class time spent eliciting each of five student expectations: (a) memorization and recall, (b) performing procedures, (c) communicating understanding, (d) analyzing information, and (e) applying concepts. The overall instructional time spent, content areas, and expectations, called cognitive demands, are then represented by three-dimensional graphics that resemble topographic maps. They show the amount of time spent crossed with content topics and corresponding areas of cognitive demand. When compared, these maps provide a visual overview of the changes teachers reported in their teaching practice from pre- to post-STaRRS intervention. The five topic areas of focus for STaRRS were measurement in science, nature of science, ecology, science and technology, as well as acids, bases, and salts. Examples of questions and response matrices can be found in Table 4.

Table 4. SEC—sample questions and response choices
| Category and Instructions for Rating | Response Choices | Sample Questions/Items |
| --- | --- | --- |
| Instructional activities in science: How much of the science instructional time in the target class do students use to engage in the following tasks? | Amount of instructional time: 1 = Little (less than 10% of instructional time for the school year); 2 = Some (10–25% of instructional time for the school year); 3 = Moderate (26–50% of instructional time for the school year); 4 = Considerable (more than 50% of instructional time for the school year) | (25) Listen to the teacher explain something about science to the class as a whole; (28) Write about science in a report or paper on science topics; (33) Do a science activity with the class outside the classroom or science laboratory (e.g., field trips or research) |
| Time on topic: Indicate the amount of time spent on each topic covered in this class | Response codes: 0 = None (not covered); 1 = Slight coverage (less than one class/lesson); 2 = Moderate coverage (one to five classes/lessons); 3 = Sustained coverage (more than five classes/lessons) | Nature of science: (102) Nature of scientific inquiry/method; Science and technology: (204) Design or implement a solution or product; Measurement and calculation in science: (412) Data displays (e.g., tables, charts, maps, graphs) |
| Expectations for students: Indicate relative emphasis of each student expectation for every topic taught: 1. Memorize facts/definitions/formulas; 2. Conduct investigations/perform procedures; 3. Communicate understanding of science concepts; 4. Analyze information; 5. Apply concepts/make connections | Response codes: 0 = No emphasis (not a performance goal for this topic); 1 = Slight emphasis (less than 25% of time on this topic); 2 = Moderate emphasis (25–33% of time on this topic); 3 = Sustained emphasis (more than 33% of time on this topic) | Animal biology: (807) Structure and function; Ecology: (1,301) Food webs/chains; Earth systems: (2,006) Erosion and weathering |

Note. Survey of Instructional Practices Teacher Survey Grades K-12 Science (Blank et al., 2001).

Data Analysis

The primary researcher scored the teachers' GCI and ToSRA items using keys provided by the instrument developers (Fraser & Butts, 1982; Libarkin & Anderson, 2008). All GCI and ToSRA pre- and post-assessments and spreadsheets were re-scored and checked by two independent scorers, producing an inter-rater reliability of >0.99 for scoring and data entry. The students' GCI-MLS and ToSRA were scored by a single research assistant, under the supervision of the primary researcher, using keys developed by the primary researcher (GCI-MLS) and instrument developers (Fraser & Butts, 1982). Ten percent of the GCI-MLS and ToSRA pre- and post-assessments were randomly selected and re-scored by the researchers, producing an inter-rater reliability of 0.97 for the pre-test and 0.99 for the post-test for the GCI-MLS, and 0.99 for both the pre-test and post-test for the ToSRA. The SEC data were scored by the instrument providers and rechecked by the researchers.

Teacher GCI pre- and post-test scores were analyzed in three groups. The first covered the entire test, including all 25 items. Then each of the two subsections (general knowledge and E:Y! content) was analyzed. Since the hypothesis being tested was directional, a one-tailed dependent sample t-test model was used to compare teacher scores prior to and at the conclusion of the study. Descriptive statistics were generated for each of the five teacher ToSRA scales. Next, pre- and post-test differences were analyzed using dependent sample t-tests. Each of the ToSRA scales measures a different construct, and combining them does not produce a meaningful score (Fraser & Butts, 1982). Thus, analyses focused on comparing scale scores rather than total scores.
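As an illustration of the dependent sample t-test described above, the sketch below computes the test statistic from paired pre- and post-test scores. The scores are hypothetical and the function name is ours; the study's raw teacher data are not reproduced here.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(pre, post):
    """Return (t, df) for a dependent sample t-test on paired scores."""
    diffs = [after - before for before, after in zip(pre, post)]
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)   # SE of the mean difference
    return mean(diffs) / standard_error, n - 1

# Hypothetical scores for four teachers
pre_scores = [10, 12, 14, 16]
post_scores = [11, 14, 15, 18]
t_stat, df = paired_t(pre_scores, post_scores)
print(round(t_stat, 3), df)
# A one-tailed p-value would then be read from the t distribution with
# df degrees of freedom (for example with scipy.stats.t.sf, if SciPy is available).
```

Because the hypothesis is directional, only the upper tail of the t distribution is used when converting the statistic to a p-value.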

Analysis of covariance (ANCOVA) was used to analyze student pre- to post-test score differences for the GCI-MLS and ToSRA using the pre-test scores as the covariate. This approach controlled for any differences in pre-test scores for the treatment and comparison groups. In addition, the GCI-MLS data were analyzed in four sections (the total test and three subsections) and the ToSRA was analyzed as four separate scales.
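The logic of an ANCOVA with a pre-test covariate can be sketched as follows. This is a minimal illustration of the classical one-covariate ANCOVA, assuming a common within-group regression slope; it is not the authors' software or procedure, and the group labels and scores are hypothetical.

```python
def ancova(groups):
    """groups: {name: [(pre, post), ...]}.
    Returns (adjusted post-test means, F for the group effect, (df_between, df_within))."""
    pairs = [p for members in groups.values() for p in members]
    n_total, k = len(pairs), len(groups)
    grand_x = sum(x for x, _ in pairs) / n_total
    grand_y = sum(y for _, y in pairs) / n_total

    # Total sums of squares and cross-products
    sxx_t = sum((x - grand_x) ** 2 for x, _ in pairs)
    syy_t = sum((y - grand_y) ** 2 for _, y in pairs)
    sxy_t = sum((x - grand_x) * (y - grand_y) for x, y in pairs)

    # Pooled within-group sums of squares and cross-products
    sxx_w = syy_w = sxy_w = 0.0
    means = {}
    for name, members in groups.items():
        mx = sum(x for x, _ in members) / len(members)
        my = sum(y for _, y in members) / len(members)
        means[name] = (mx, my)
        sxx_w += sum((x - mx) ** 2 for x, _ in members)
        syy_w += sum((y - my) ** 2 for _, y in members)
        sxy_w += sum((x - mx) * (y - my) for x, y in members)

    slope = sxy_w / sxx_w                      # common within-group slope
    adjusted = {name: my - slope * (mx - grand_x) for name, (mx, my) in means.items()}

    ss_within = syy_w - sxy_w ** 2 / sxx_w     # residual after removing the covariate
    ss_between = (syy_t - sxy_t ** 2 / sxx_t) - ss_within
    df_between, df_within = k - 1, n_total - k - 1
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return adjusted, f_stat, (df_between, df_within)

# Hypothetical (pre, post) scores
data = {
    "comparison": [(1, 2.5), (2, 2.5), (3, 3.5), (4, 5.5)],
    "treatment": [(2, 5.5), (3, 5.5), (4, 6.5), (5, 8.5)],
}
adjusted_means, f_value, dfs = ancova(data)
print(adjusted_means, round(f_value, 2), dfs)
```

The adjusted means show what each group's post-test average would be if both groups had started from the same pre-test average, which is the sense in which the covariate controls for pre-existing differences.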


Impact on Teachers' Content Knowledge, Attitudes, and Pedagogical Strategies

Content Knowledge

Although the differences for the GCI scores on the total test (TT) were found to be statistically significant (see Table 5), the difference represented an average gain of one point on the 25-item test. This finding could be of questionable practical significance. The pre-test average GCI score in Libarkin and Anderson's (2005) original study with undergraduate students (n = 2,215) stood at 41%. Libarkin and Anderson reported that undergraduates in their study who pre-tested high (which they defined as above 60%) exhibited no change in their post-test scores. The STaRRS teachers' pre-test average was 71%. Thus, it is very likely that the instrument was not sensitive enough to show gains in teacher content knowledge because of a ceiling effect. STaRRS teachers were a self-selected group with solid understandings of geological and earth science concepts related to the E:Y! experience. The reader is reminded that the geology and earth science topics addressed in the GCI used in the present study were most closely aligned with E:Y! and STaRRS content.

Table 5. Descriptive statistics and dependent samples t-tests for STaRRS teachers' GCI findings
| Subsection (# of Items) | Pre-Test M (SD) | Range | Post-Test M (SD) | Range | Difference M (SD) | Significance |
| --- | --- | --- | --- | --- | --- | --- |
| TT (25) | 17.76 (2.87) | 14–22 | 18.82 (2.80) | 15–23 | 1.06 (1.72) | 0.042* |
| GK (19) | 13.40 (2.54) | 10–17 | 13.77 (2.38) | 11–17 | 0.37 (0.96) | 0.130 |
| E:Y! (6) | 4.36 (1.29) | 1–6 | 5.05 (0.66) | 4–6 | 0.69 (1.19) | 0.050 |

n = 9.
*p < 0.05, one-tailed.


Attitudes

In the area of attitudes toward science and scientists, STaRRS teachers demonstrated statistically significant gains on four of the five ToSRA scales (see Table 6). The greatest gains were on the NS and LEI scales, with effect sizes of 0.97 and 1.09, respectively. Smaller gains were detected for SIS and AD-ATT. These findings suggest a possible impact of STaRRS participation on teachers' attitudes. The differences also were practically significant, since each of the observed attitude differences was greater than one third of a standard deviation, which was defined by Cohen (1988) as a moderate effect size. In fact, the effect size for NS was nearly a whole standard deviation, and that for LEI was greater than 1.0 SD. INQ showed no change. This latter finding could be due to the fact that this self-selecting group of STaRRS teachers joined the program specifically to increase the amount of inquiry-based science they were teaching, so that this scale was not affected by this project.

Table 6. Dependent samples t-tests for STaRRS teachers' ToSRA pre-post-assessment differences by scale
| Scale | Pre-Test M (SD) | Range | Post-Test M (SD) | Range | Difference M (SD) | Significance |
| --- | --- | --- | --- | --- | --- | --- |
| SIS | 40.1 (3.28) | 35–45 | 41.9 (3.00) | 37–46 | 1.80 (2.15) | 0.014* |
| NS | 37.3 (3.97) | 31–43 | 40.1 (4.60) | 32–48 | 2.80 (2.90) | 0.007** |
| INQ | 39.6 (2.88) | 33–42 | 39.8 (2.25) | 37–44 | 0.20 (2.78) | 0.413 |
| AD-ATT | 40.6 (2.07) | 37–43 | 42.4 (1.51) | 40–45 | 1.80 (2.39) | 0.021* |
| LEI | 39.4 (3.10) | 34–44 | 41.4 (2.59) | 38–45 | 2.00 (1.83) | 0.004** |

n = 9.
*p < 0.05, one-tailed.
**p < 0.01, one-tailed.
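The text does not state how the teacher effect sizes were computed. One plausible reading, offered here only as an assumption, is the mean paired difference divided by the standard deviation of the differences. Applied to the Table 6 values, this reproduces the two effect sizes reported for NS and LEI:

```python
# (mean difference, SD of differences) taken from Table 6
table6 = {
    "SIS": (1.80, 2.15),
    "NS": (2.80, 2.90),
    "INQ": (0.20, 2.78),
    "AD-ATT": (1.80, 2.39),
    "LEI": (2.00, 1.83),
}

# Assumed formula: standardized mean difference for paired scores
effect_sizes = {scale: round(m / sd, 2) for scale, (m, sd) in table6.items()}

for scale, d in effect_sizes.items():
    print(f"{scale}: {d}")
# NS gives 0.97 and LEI gives 1.09, matching the values reported in the text.
```

Under the same assumption, the SIS and AD-ATT values fall below those of NS and LEI, and the INQ value is close to zero, which is consistent with the pattern described above.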

Pedagogical Strategies

The impact of STaRRS participation on pedagogical strategies was assessed using the SEC. These surveys provide a wealth of information on the domains, as well as depth and breadth of teaching practice. Only a small portion of the data has been selected for presentation in Figures 1–5.

Content maps developed using the SEC software allow for viewing these data in three-dimensional space. The maps show how science content topics align with the cognitive demand expectations of the teachers. These data are then overlaid with shading and contour lines representing the percentage of instructional time dedicated to corresponding domains. The maps resemble topographic maps and are read in a similar manner. The horizontal grid lines correspond with the topic areas and the vertical ones correspond with six categories of student cognitive expectations. These expectations correlate with Bloom's Taxonomy (Bloom, Hastings, & Madaus, 1971). Lower level thinking skills are on the left, and more complex and higher level thinking skills are on the right. The locations where the grid lines intersect are called “measurement nodes.” At each of the nodes, the shaded bands of color represent increases or decreases in reported teaching time.
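To make the idea of comparing measurement nodes concrete, the sketch below stores a content map as a nested dictionary (topic by cognitive expectation) and computes the node-by-node change between two administrations. The data structure, function name, and all values are hypothetical illustrations; they do not represent the SEC software or the study's data.

```python
def node_changes(pre_map, post_map):
    """Return {(topic, expectation): post - pre} for every measurement node."""
    return {
        (topic, expectation): post_map[topic][expectation] - pre_map[topic][expectation]
        for topic in pre_map
        for expectation in pre_map[topic]
    }

# Hypothetical percentages of instructional time at each node
pre_map = {
    "temperature": {"memorize": 2.0, "perform procedures": 1.0, "apply concepts": 0.5},
    "data displays": {"memorize": 1.0, "perform procedures": 1.5, "apply concepts": 0.5},
}
post_map = {
    "temperature": {"memorize": 2.0, "perform procedures": 2.5, "apply concepts": 1.0},
    "data displays": {"memorize": 0.5, "perform procedures": 2.0, "apply concepts": 1.5},
}

changes = node_changes(pre_map, post_map)
largest_increase = max(changes, key=changes.get)
print(largest_increase, changes[largest_increase])
```

Positive values correspond to nodes where reported teaching time increased, and negative values to nodes where it decreased.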

Comparisons of the SEC pre- and post-content maps provided evidence of favorable pedagogical shifts in the STaRRS teachers' classroom practice. These shifts were evident in both time dedicated to various science topics and teachers' cognitive expectations of students related to both E:Y! and STaRRS curriculum content areas. The following sections highlight five of these specific areas with examples.

Measurement in Science

Figure 1 shows teacher-reported shifts in instructional focus for the subtopic areas covered under “measurement in science” over the course of the study (one academic year) in both content and student expectations. The most noticeable shifts are in three areas. The first is found at the nodes where mass, weight, and length intersect with performing procedures (A1) and (A2). The second applies to temperature (B1) and (B2) across all expectation areas. The final area is related to data displays (C1) and (C2). In the last two topic areas, teachers' reports show a broader distribution of cognitive expectations.

Figure 1.

SEC—Measurement and calculation in science.

Nature of Science

On the nature of science maps (Figure 2), three measurement nodes stand out. The first two can be seen where nature of scientific inquiry/methods intersects with performing procedures and applying concepts at (D1) and (D2). Teachers reported increased time spent in both areas. The largest shift, however, was reported by teachers at the node where scientific habits of mind intersects with communicating understandings (E1) and (E2).

Figure 2.

SEC—Nature of science.


Ecology

Three areas of interest (Figure 3) are found in the center of the ecology maps along F1 and F2. The pre-test map supports the claim that this topic was already present in the STaRRS teachers' curriculum (YAI/YNP, 2004), and teacher expectations were mostly focused on engaging students in communicating their understanding. However, these maps demonstrate that, by the end of the study, teachers reported substantial increases in time spent having students communicate their understandings (F2) in three subtopic areas: food webs and chains, ecosystems, and adaptations and variations. Additionally, teachers reported spending more time having students apply concepts about food webs and chains (G1) and (G2) and ecosystems (H1) and (H2). All of the concepts displayed on this map are specific to E:Y! and its Ecology Day curriculum (the day that followed the Geology Day/STaRRS curriculum during the expedition), not to the STaRRS components of the expedition. This suggests that the impact of the STaRRS intervention spilled into additional areas of participant teachers' curriculum.

Figure 3.


Science and Technology

The findings apparent in the science and technology subsection indicate a greater distribution and increase in time reported at the end of the study in this area (Figure 4). At the outset, STaRRS teachers reported their strongest area of focus to be at (K1) where performing procedures intersects with laboratory tools and safety. The increase in this subtopic area (K2) could represent an increase in emphasis on the safe use of tools in the field and student behavior expectations while collecting data at hot springs. Both safety and behavior are covered in detail at all expeditions, but field notes from the STaRRS expeditions match the teachers' reported increases in durations dedicated to this area because more time was spent at the hot springs conducting field research.

Figure 4.

SEC—Science and technology.

Another area showing an increased emphasis is found at the node where the relationship between scientific inquiry and technological design and communicating understanding intersect (J1) and (J2). This finding may represent the increased time the teachers spent with the students developing answerable questions and designing field research projects prior to and during the expeditions. These data are reinforced by qualitative data, including those from field notes, as well as teacher and student interviews.


Acids, Bases, and Salts

The final map (Figure 5) shows the topic areas related to acids, bases, and salts. This is a content area usually covered in more depth above the eighth grade level (NRC, 1996). At the outset of the study, pH was an area of minimal focus, most likely related to the regular E:Y! curriculum (YAI/YNP, 2004). Small increases by the end of the study represented additional time spent across all cognitive demand areas in acids, bases, behaviors and strengths (L2) and pH (M2), which may have been due to attention to pH as a result of the development and selection of field research questions by students. Both STaRRS teachers and their students were especially interested in colors produced by microbial mats observed at MHS. Because of this interest, every teacher had at least one student group explore the relationships between the colors and pH levels of the spring water, which would have necessitated extra instructional time and focus on pH.

Figure 5.

SEC—Acids, bases, and salts.

It is important to note that of the nine teachers only seven reported teaching pH within their classrooms. This is most likely an artifact of the difference in the grade level and subject matter taught by the teachers versus the groups they brought to Yellowstone. For example, one STaRRS teacher taught much younger students in the regular classroom (second grade) and pH is not an appropriate topic for primary grades. Therefore, this teacher did not report data for either year.

In summary, most of the STaRRS teachers reported changes in the amount of time spent teaching and the demands they made of their students, not only in topic areas related to the STaRRS content, but in some cases, in more general science content areas. This was evidenced by the movement along the continuum of cognitive demand as defined by the SEC, from teaching lower level thinking skills to more complex ones, as appropriate for the specific content area. In some cases, such as the use of specific scientific tools and measurements, this meant moving from memorization and recall to performing procedures. In other areas, such as inquiry, logic and reasoning, students spent more time communicating understanding and applying concepts.

Impact on Students' Content Knowledge and Attitudes


Content Knowledge

Since Rasch analysis determined that the items on the GCI-MLS were a good fit and an adequate measure for growth, the raw data were used in the analyses. Table 7 shows the differences, ranges, and percentage gains in E:Y! and STaRRS students' pre- and post-test GCI-MLS scores. One reason to present the data in this format is that it more closely matches the way assessment data are usually presented in classrooms. Analysis revealed (see Table 8) that STaRRS group students made statistically significant gains (p < 0.01) in all areas as compared to E:Y! students. Cohen's d was calculated for the results using Thalheimer and Cook's (2002) methodology. The effect sizes for gains on the total test (0.91) and the STaRRS content subsection (1.33) were very large compared with those for the E:Y! content subsection (0.43) and the general knowledge subsection (0.23), which showed moderate and small gains, respectively.

Table 7. GCI-MLS—E:Y! and STaRRS students' pre-test/post-test/difference scores and percentage gains
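| Groups | Pre-Test M (SD) | Range | Post-Test M (SD) | Range | Difference M (SD) | % Gains |
| --- | --- | --- | --- | --- | --- | --- |
| TT (42 items) | | | | | | |
| E:Y! | 11.60 (4.47) | 2–24 | 13.68 (4.81) | 3–25 | 2.09 (4.18) | 4.8 |
| STaRRS | 13.18 (4.87) | 2–28 | 20.12 (6.66) | 4–35 | 6.93 (6.18) | 16.7 |
| GK (21 items) | | | | | | |
| E:Y! | 8.33 (3.27) | 2–17 | 9.28 (3.44) | 1–17 | 0.95 (3.36) | 4.5 |
| STaRRS | 9.24 (3.33) | 1–19 | 10.98 (3.68) | 3–18 | 1.74 (3.61) | 8.2 |
| E:Y! content (7 items) | | | | | | |
| E:Y! | 1.64 (1.24) | 0–5 | 1.93 (1.33) | 0–6 | 0.29 (1.46) | 4.1 |
| STaRRS | 1.77 (1.37) | 0–5 | 2.75 (1.50) | 0–6 | 0.98 (1.75) | 14.0 |
| STaRRS content (14 items) | | | | | | |
| E:Y! | 1.63 (1.37) | 0–6 | 2.47 (1.62) | 0–6 | 0.84 (1.71) | 6.0 |
| STaRRS | 2.16 (1.64) | 0–6 | 6.39 (3.18) | 0–12 | 4.23 (3.16) | 30.0 |

Table 8. ANCOVA GCI-MLS with pre-test covariates

| Source | Sum of Squares | df | Mean Square | F | p |
| --- | --- | --- | --- | --- | --- |
| TT pre-test covariate | | | | | |
| TT pre-test | 3,185.46 | 1 | 3,185.46 | 127.01 | 0.000 |
| GK pre-test covariate | | | | | |
| GK pre-test | 1,079.31 | 1 | 1,079.31 | 110.92 | 0.000 |
| E:Y! content pre-test covariate | | | | | |
| E:Y! pre-test | 63.84 | 1 | 63.84 | 34.93 | 0.000 |
| STaRRS content pre-test covariate | | | | | |
| STaRRS pre-test | 195.53 | 1 | 195.53 | 32.97 | 0.000 |

E:Y! n = 180, STaRRS n = 186.
**p < 0.01, two-tailed.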
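The exact inputs used for the student effect sizes are not given in the text. As a check, and only under the assumption that Cohen's d was computed on the difference (gain) scores in Table 7 with a pooled standard deviation and the group sizes from the Table 8 note, the reported values can be recovered to two decimal places:

```python
from math import sqrt

def cohens_d(m_t, sd_t, n_t, m_c, sd_c, n_c):
    """Cohen's d for two independent groups using the pooled standard deviation."""
    pooled_sd = sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / (n_t + n_c - 2))
    return (m_t - m_c) / pooled_sd

N_STARRS, N_EY = 186, 180  # group sizes from the Table 8 note

# Difference M (SD) from Table 7: (STaRRS), (E:Y!)
gain_scores = {
    "Total test": ((6.93, 6.18), (2.09, 4.18)),
    "General knowledge": ((1.74, 3.61), (0.95, 3.36)),
    "E:Y! content": ((0.98, 1.75), (0.29, 1.46)),
    "STaRRS content": ((4.23, 3.16), (0.84, 1.71)),
}

student_d = {
    section: round(cohens_d(mt, st, N_STARRS, mc, sc, N_EY), 2)
    for section, ((mt, st), (mc, sc)) in gain_scores.items()
}
for section, d in student_d.items():
    print(f"{section}: d = {d}")
```

This yields 0.91, 0.23, 0.43, and 1.33 for the total test, general knowledge, E:Y! content, and STaRRS content, respectively, in agreement with the effect sizes reported above.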


Attitudes

STaRRS students' results differed significantly from those of E:Y! students on the ToSRA in two areas: NS and LEI (see Table 9). The percentage increase on the NS scale and decrease on the LEI scale are shown graphically in Figure 6. STaRRS students exhibited a positive change (p < 0.01) on the scale measuring their views of scientists as regular people. On the leisure interest in science scale, E:Y! students showed an increase in negative attitude that was significantly larger (p < 0.05) than that of STaRRS students. That is, although E:Y! and STaRRS students all exhibited more negative attitudes toward engaging in science-type activities in their leisure time, the decrease among STaRRS students was significantly smaller than that among E:Y! students. This may imply that the experience of STaRRS helped to slow a known pattern of declining interest in science as students go up the K-12 ladder (e.g., American Association of University Women, 1994).

Table 9. ANCOVA ToSRA scales with pre-test covariate


  • E:Y! n = 182, STaRRS n = 187.
  • *p < 0.05, two-tailed.
  • **p < 0.01, two-tailed.
NS—pre-test covariate
INQ—pre-test covariate
ENJ—pre-test covariate
LEI—pre-test covariate
Figure 6. E:Y! and STaRRS students' change (%) on ToSRA subscales: normality of scientists and leisure interest in science.

There were no differences on either the INQ or ENJ scales. The INQ result matched the lack of change in the teachers' ToSRA scores. It is possible that inquiry was already a part of teaching in both treatment and comparison classrooms, so students did not perceive any differences during the school year in which the study was undertaken. In a sense, all participant teachers, who make the effort to bring groups of 12–32 students on a 4- to 5-day hands-on field expedition at YNP, are likely not representative of the typical population of teachers (Bob Fuhrmann, Director of E:Y!, personal communication, March 2006).

Discussion and Conclusions

The evidence indicates that participation in STaRRS did impact students' and teachers' geoscience content knowledge and attitudes toward science and scientists, as well as the self-reported pedagogical practices of teachers. This research study is, to the best of our knowledge, the first to provide empirical evidence showing a direct impact of an STSP on valued educational outcomes for science teachers and their students. The present findings stand as a contribution to the research literature on STSPs. But why did engaging teachers and students with park rangers and scientists in authentic science research activities result in measurable change for teachers and students on particular outcomes? How can these results be connected to current reform efforts in science education? What are the implications for future partnerships, particularly when framed in the context of current reform efforts? Two emergent themes serve both to frame and inform a discussion of these questions. The first relates to STSPs as a vehicle and catalyst for change in K-12 science education. The second theme speaks to the tangible benefits of situating teacher PD through experiential learning within an STSP, and establishing both functional and strategic links between the PD and STSP.

Scientific Practice in Science Education

Engaging K-12 students with authentic inquiry experiences that progressively approximate scientific practice has been a consistent and major theme in science education reforms for the past half century (Abd-El-Khalick et al., 2004). While some progress has been achieved in this regard, much remains to be desired: K-12 science instruction largely continues to be incommensurate with how scientists conduct their practice (Anderson, 2007; Talanquer, Tomanek, & Novodvorsky, 2013). Toward achieving this goal, reforms have called for, explicitly or implicitly, a three-step process for effecting change. The first step is to promote science teachers' understandings of scientific content and inquiry through engagement with the sort of experiences we expect them to enact in their own classrooms. The NRC (2007) called for engaging teachers with “ongoing opportunities to learn science… [that] should mirror the opportunities they will need to provide for their students” (p. 7). In the case of Taking Science to School (NRC, 2007), these opportunities would be related to the generation and evaluation of explanations of the natural world, understanding nature and the development of scientific knowledge, and engagement in scientific practices and discourse. The Framework (NRC, 2011) states that science teacher PD should “prepare teachers to meet the challenges of the Next Generation Science Standards” in terms of disciplinary core ideas, crosscutting concepts, and scientific practices (Wilson, 2013, p. 312). The sort of PD opportunities dedicated to this end and closest in their approach to STSPs, have taken the form of teacher authentic research apprenticeships (Sadler, Burgin, McKinney, & Ponjuan, 2010).

The second step is to support teachers as they engage with the process of transferring their newly acquired understandings and skills for the purpose of transforming their own instructional practice. In this regard, the reforms and literature on best practice have called for long-term on-site support and PD, as well as teacher coaching and mentoring (e.g., Darling-Hammond, Wei, Andree, Richardson, & Orphanos, 2009; NRC, 2001), in addition to “just-in-time assistance” that draws on the potential of “new technologies and social media to make high-quality science PD available to all teachers” (Wilson, 2013, p. 312). The extent to which these two steps are interlocking has varied across interventions, which has translated into differential impacts in terms of effect on science teachers' disciplinary knowledge, understandings of nature of science and the development of scientific knowledge, and/or instructional practice (Bell, Blair, Crawford, & Lederman, 2003; NRC, 2001; Sadler et al., 2010). Obviously, the more meaningfully coordinated the two steps are, the more potent their effect.

The third step, and arguably the most important, is often assumed to follow necessarily from the first two: the assumption is that the combined impact of the first two steps eventually transforms students' experiences in science classrooms to include engagement with approximations of authentic inquiry or scientific practice. This assumption has proven to be problematic, to say the least. Anticipated large-scale transformations in terms of teacher instructional practices and student learning experiences are yet to be realized (Arzi, 2012; Jones & Carter, 2007). It is the very nature of this third step that sets high-quality STSPs apart from high-quality science teacher authentic research internships coupled with meaningful classroom support.

STSPs as Catalysts for Change

Rather than assuming change in student experiences will automatically follow as a result of working at the level of teachers (i.e., improving teacher knowledge and skills and supporting their efforts to bring about change), STSPs build student agency into the very process of change. Thus, the aforementioned two interlocking steps with an assumed impact on the third are replaced with a three-interlocking-step process, which intersects the domains of scientific research, teacher PD and pedagogical practice, and student learning experience. In other words, STSPs create a transformative space at the intersection of these three domains that increases the likelihood of impacting student and teacher outcomes, thus bolstering the process of changing classroom instruction. Next, we discuss a number of episodes documented in the course of implementing STaRRS, which shed light on how the incorporation of students as a third axis in the partnership transformed the nature of interactions within this shared space and contributed to the documented positive impacts on students' and teachers' knowledge and attitudes, and teacher self-reported instructional practice.

Breaking Down Barriers

First, the inclusion of students in STaRRS (and, by extension, similar STSPs) seemingly resulted in elevating the status of science teachers to “equals” as fully recognized partners with practicing scientists. Often, by virtue of substantial gaps in terms of training and scientific expertise, teachers engaged in authentic scientific internships necessarily find themselves locked in a hierarchical relationship with scientists. These relationships are associated with the extent to which the teachers can or, more often, cannot make meaningful contributions to the scientific tasks at hand. In comparison, the introduction of students and the necessity of attending to their needs in an STSP make the pedagogical expertise of teachers and their deep knowledge of students an indispensable resource to the partnership. Thus, teachers' voices and contributions now have a significant place in the course of decisions associated with the implementation of not only pedagogical matters, but scientific ones as well.

An illustrative example transpired when scientists presented the transect grids (to be used for the second set of data collection by STaRRS students) at the first teacher workshop. Almost immediately teachers (and rangers) subjected these protocols to critical revision due to factors that would inhibit data collection as originally conceived. Scientists had envisioned a one-by-one meter transect grid, which may have been an appropriate size for an adult to carry into the field. Teachers highlighted the importance of taking the stature of a 10–14 year-old into consideration and suggested a revised 50 cm × 50 cm frame for the grid, which the scientists accepted. Similarly, because of teacher and ranger input, other aspects of the data collection protocol, which involved skills such as sketching, taking photographic evidence, and communicating with scientists, were also revised to better match student abilities while still meeting the scientists' need for the collection of reliable data.

Such intense interactions with the science research team, carried out as peer-to-peer discussions, might have helped reshape some of the teachers' attitudes about scientists. Since the protocols shared during the workshop were not finalized, the involvement of teachers and rangers in the revision process allowed them to take active part in this aspect of doing the science. In this sense, the interactions between teachers, rangers, and scientists were quite collegial, with mutual respect and appreciation for the varied sets of knowledge and expertise that each group brought to the partnership. Although these kinds of critical discussions and design iterations are common in science, students, teachers, and the general public often do not experience them. Most of the latter audiences receive science in the form of a finalized product: cleaned-up sound bites. Participating in the protocol revision process may have given the teachers new insights not only into how field research is planned and set up, but also into the social dimensions of doing science as a collective and dynamic endeavor. Such interactions and insights most likely were behind the marked improvement in participant teacher attitudes toward science and scientists as measured by the ToSRA. Indeed, the largest gains for teachers were evident in the case of the NS and LEI sub-scales, with respective effect sizes of 0.97 and 1.09. Teachers now experienced scientists as “normal” folk and as colleagues with whom they can interact at ease. The observed increase in the leisure interest in science could be attributed to the specific nature of the STaRRS experience, being situated in outdoor and naturally appealing settings as compared, for instance, to indoor research laboratories.

Evidence of Change

Second, the inclusion of students in the partnership made proximal any improvements in student outcomes, which might have facilitated some of the observed changes in teachers' instructional practices. The abovementioned model of bringing about change in teacher practice is premised on first changing their knowledge and beliefs in the hope of getting them to eventually change their instruction. In a sense, teachers are asked to suspend judgment about the alternative instructional models (e.g., reform-oriented instruction) with which they are being presented, to learn how to implement these strategies, and then to actually undergo the process of relegating their own approaches, which might have “worked” well for them in the past, in favor of these alternatives, because the latter will eventually bring about improvements in student outcomes. It is not hard to see that, in this case, student outcomes are distal and far removed from the context and timeframe, sometimes by several months if not longer, where teachers experience the new and much desired strategies. Guskey (2002) argued against such a model, which had proven time and again to be less than optimal (Jones & Carter, 2007). Instead, Guskey suggested that change in teacher instructional practice follows evidence of change in student outcomes: this is exactly what STaRRS and carefully designed STSPs can provide.

At several stages in the STaRRS experience, teachers and students were working, literally, side by side on “scientific inquiry” both in the sense of collecting data on behalf of the science research team (genuine inquiry mostly from the perspective of teachers, and some students) and collecting data to answer student-generated questions (genuine inquiry mostly from the perspective of students, and some teachers). Along the way, teachers were constantly receiving formative feedback from students about their experiences and a firsthand account of how student learning was unfolding. This formative feedback provided initial proximal assessments of the impact of STaRRS on student outcomes.

In the context of attempting to answer scientific questions about the hot springs systems at Yellowstone, there were three main research activities connecting the students, teachers, rangers, and scientists: Gathering photo point data, collecting specific transect data, and generating student-driven field research studies. These activities were designed to give teachers and students experience with a full cycle of scientific inquiry, while at the same time connecting teachers and their classroom communities to a broader scientific research project. This full cycle of scientific inquiry was heavily encouraged at all expeditions, even within an already packed schedule. In spite of time constraints, analysis of field data and presentations of findings were carried out on site by students at the end of Geology Day. These presentations served two purposes. First, they helped students refine their thinking and make connections across their activities. Many students reported in post-evaluation interviews that it was through giving their own, and watching others', presentations that they really understood what they were doing and how scientific inquiry worked. Second, the presentations served to provide feedback to teachers, and gave them evidence of student learning long before the post-test was administered. The following anecdote serves to illustrate the impact of such feedback on teachers.

One of the STaRRS groups brought teachers, instead of parents, as chaperones. Two of these teachers approached the first author at the end of the data collection portion of the field experience, just prior to student data analysis and presentation of findings. They indicated their concern about the amount of time spent on the project, including the extra week spent prior to the expedition teaching STaRRS content and processes, and the similarly extended time spent on these activities on site. They felt that after more than 6 hours in the field the students did not seem to “get it.” However, right after student data processing and presentations, the two teachers returned to report that now they thought the students “got it! And now, so do we!” Evidence of student understanding during the presentations was overwhelming and had produced clear change in these chaperoning teachers' perspective and attitudes toward extended engagement with inquiry that builds on students' own research questions. One could only assume that STaRRS teachers must have experienced such shifts at much deeper and lasting levels, which could have translated into changes in their instructional practices.

Indeed, the present evidence indicated major changes and shifts in teachers' self-reported pedagogical strategies as reported on the SEC. To be sure, these changes could be attributed to teachers' compliance with the STaRRS curriculum and the requirements of the partnership. Nonetheless, evidence of changes in instruction related to a content area, namely ecology, which was not a part of the STaRRS curriculum, adds confidence to the assertion that participation in the partnership impacted teachers' pedagogical strategies beyond the STaRRS curriculum.

Addressing Student Needs

Third, it was attention to student needs that might have resulted, at least to a significant extent, in the measured impact on their attitudes and content knowledge, as well as on teachers' content knowledge. Based on the arguments of Moss et al. (1998), STaRRS did not limit student involvement to just scientist-prescribed data collection and analysis activities. STaRRS students explored questions related to their areas of interest, which served to make experiences at YNP E:Y! more authentic for them. Developing answerable questions, nonetheless, was a novel skill for students and most of the STaRRS teachers. For these teachers, their only experience developing their own questions was at the STaRRS workshop. They had not previously used this skill or taught it to their students, yet they now found themselves having to do both. This is not surprising, as most science curricula, including those written for guided inquiry, often provide the actual questions for teachers to use in student exploration. The teachers' need for support in this area was addressed through the development of a set of activities. The activities were fleshed out in the STaRRS classrooms during the year and shared with the rest of the group through bimonthly communications and web-conferences.

Being able to develop answerable research questions is a specific skill, and one that is not always well understood. In order to develop good questions, prior knowledge must include not just the concepts, but also an understanding of the relevant tools (including their limitations) and procedures. Hansson and Yarden (2012) explored the possible influences of separating procedural knowledge from the development of research questions. They found that as participant teachers in their study improved their ability to perform laboratory procedures using specific tools and protocols, their ability to develop appropriate research questions also improved. In other words, engagement with thinking about and putting forth answerable questions could have served as both an incentive and meaningful context for students and teachers to internalize the requisite science concepts and inquiry skills. In the context of STaRRS, learning about all these aspects was meaningfully intertwined. In a sense, the benefits for STaRRS teachers and students did not simply arise just from the research experience itself. Rather, having scaffolded opportunities to learn about the geobiology content of the university research group, the uses and limitations of several tools, and the characteristics of the hot springs system provided teachers and their students a foundation on which to ask and research their own questions. In turn, the very act of engagement with trying to answer these questions further deepened student and teacher understandings of these disciplinary concepts and inquiry processes, and better refined their inquiry-related skills.

STaRRS student content knowledge gains showed both anticipated and less anticipated outcomes. On the STaRRS content subsection of the GCI-MLS, comparison students were not expected to do as well, and they did not. After all, comparison teachers were not privy to the new science content, tools, and techniques of studying hot springs. However, STaRRS students did significantly better on all sections of the GCI-MLS including the portion attributed to E:Y! content. This subsection addressed content that is taught at all expeditions. STaRRS students also performed better on the general knowledge portion of the test. This subsection was matched to NSES (NRC, 1996) and State standards and represented knowledge that is aligned with curricula in most schools. These findings indicate that the STaRRS experience enhanced students' science content knowledge beyond areas directly presented to them in pre-expedition class work and in expedition experiences and instruction. As noted above, students' experiences as a whole may have led them to learn other geoscience topics with greater understanding.

Furthermore, the authenticity of the research site added to the richness of students' and teachers' experiences and helped them to build rather flexible skills. Weather, safety, and the quality of the hot springs (which can be incredibly variable in their flow rate) at the time of data collection sometimes limited the use of particular questions that students and teachers had developed before coming to YNP. When STaRRS groups arrived at the study site, over half of them had to change their questions to fit the current conditions. The first STaRRS expedition shared these glitches with the upcoming expeditions, and teachers spent more time developing questions than was originally planned. It made a difference. Students who came up with their first set of questions prior to their field experience were able to do so again in the field with relative ease. The STaRRS experience highlights the importance of actively and explicitly teaching such a skill and providing activities and support to help both students and teachers negotiate what many consider one of the most difficult inquiry skills to learn and teach (Abd-El-Khalick et al., 2004; NRC, 1996).

Connections to Current Science Education Reform Efforts

By now, the connections of STSPs, such as the STaRRS partnership, to the most recent wave of reform efforts in science education as embodied in the Framework documents (NRC, 2011) and associated NGSS (Achieve, 2013) should have come into sharp focus. When carefully structured and executed, STSPs can provide valuable PD activities for science teachers and learning opportunities for K-12 students to engage with scientific practices and develop deep understandings of disciplinary ideas. The latter are two of the three major axes targeted in these reforms. The third axis, related to crosscutting concepts, would follow closely when it comes to concepts such as patterns, scales and proportion, systems and systems models, and stability and change, which could be targeted in an STSP like STaRRS. Indeed, several crosscutting concepts were explored by scientists, teachers, rangers, and students using the Hot Springs Facies Model (Fouke et al., 2000), a conceptual systems model that aids the understanding of a hot springs sedimentary environment. Calcium carbonate is deposited and “grows” (forming the rock called travertine) in very predictable patterns at a variety of scales. The scales used in this partnership ranged from millimeters to meters and were studied and correlated to abiotic (non-living) parameters, such as spring water flow rate and temperature, and biotic (living) factors, such as ranges of colors exhibited in microbial communities.

In this regard, it does not escape us that while using STSPs as a context for teaching disciplinary content and scientific practices is very promising, the implementation of such partnerships could be challenging. Effective STSPs require planning, attention, and flexibility. Key attributes include the ability of these partnerships to: (a) connect teachers and students to research science, while at the same time support core conceptual science learning through the integration of scientific practices and cross-cutting concepts; (b) use accompanying experiential, research-based PD that addresses the needs of scientists, teachers, and students; and (c) provide student ownership of knowledge through asking and answering their own research questions within the partnership. Developers of future STSPs should consider the critical importance of connecting the two types of research activities by having accurate scientific data collection lead to the development and implementation of student-driven research. Further research on these key attributes will enable us to develop more solid and sustainable models for future STSPs.

This work was supported by the National Science Foundation Biocomplexity in the Environment Program (grant number EAR-0221743) and a supplementary Research Experience for Teachers Program award. The products are those of the authors and do not necessarily reflect those of the funding agency. The authors wish to acknowledge the important contributions of the Fouke Lab at UIUC, including Amanda Oehlert, Holly Vescogni, Kathy Yang, Dr. Robert Sanford, and Dr. Bruce Fouke, the Division of Resource Education and Youth Programs at Yellowstone National Park, including Rangers Bob Fuhrmann, Ellen Petrick, Melanie Condon, Trudy Patton, Michael Breis, Matt Ohlen, and Sabrina Diaz, and the E:Y! and STaRRS teachers and their students, who must remain unnamed in this forum.