This article concerns how noncognitive constructs—personality and motivation—can be assessed and developed to increase students' readiness for college. We propose a general framework to account for personality and motivational differences between students. We review numerous studies showing that personality and motivational factors are related to educational outcomes, from early childhood to adulthood. We discuss various methods for assessing noncognitive factors, ranging from self-assessments to performance tests. We consider data showing that personality and motivation change over time and find that particular interventions have proven successful in changing particular personality facets, leading to increased achievement. In a final section we propose a strategy for implementing a comprehensive psychosocial skills assessment in middle and high school, which would include setting proficiency standards and providing remedial instruction.
We review an extensive list of constructs and frameworks to motivate the development of a unifying general framework, or common framework, into which other systems can fit. Doing so provides several advantages: a common framework allows the widest range of researchers to contribute, using standard terminology, to advances in our knowledge of how to assess students, intervene, and foster their development. A common framework also means that results and findings will be readily accepted by both the scientific and user communities.
As a general framework we propose the five-factor model (FFM) of personality to account for trait-level differences between students. We propose in addition a more process-oriented description of goals and motives to account for other noncognitive aspects of school performance. Much of the research in education uses the self-regulatory learning framework for describing goals and motivational processes. These two levels of personality description—trait and process—may work together in effecting personality change.
Empirical Evidence for the Importance of Noncognitive Constructs
We examined evidence that personality and related noncognitive factors are related to educational outcomes. We found considerable evidence that such relationships exist and in many cases are fairly strong. Conscientiousness in particular, and especially its facets of achievement striving, self-discipline, and diligence, has been shown repeatedly to predict academic success from the early grades through graduate school. Conscientiousness predicts academic outcomes even after controlling for prior academic history and standardized test scores. Other factors of the Big 5 have been less consistent in their prediction of school outcomes, but there is some evidence that neuroticism, particularly its anxiety and impulsiveness facets, may impair learning, and openness may enhance it. There were also numerous suggestions for factors that might mediate the relationship between these personality factors and achievement. For example, conscientiousness may cause greater achievement by increasing the expenditure and regulation of effort, leading to greater persistence and higher perceived ability; by influencing class attendance; or even by leading to a more regular sleep cycle. Neuroticism may cause poor study attitudes that in turn can lead to decreased achievement. Both factors may underlie the development of good study habits, study skills, study attitudes, and study motivation, which as a group have been found to be powerful determinants of academic achievement. Other factors, such as time management, self-efficacy, and academic self-concept, along with academic discipline, commitment to college, and interpersonal and intrapersonal behaviors, have also been found to relate to academic achievement.
We discuss a wide variety of both conventional and novel methods for assessing noncognitive skills. Self-assessments are the most common and are likely to be useful in any kind of noncognitive assessment system, particularly when the stakes are not high. Situational judgment tests are an increasingly popular way to measure noncognitive factors. They have been used in so many studies over the past decade that the methodology for developing them is now fairly affordable, and the measures are becoming increasingly reliable and valid. It is probably useful to supplement self-assessments with a different kind of assessment, such as a situational judgment test, at the very least to reduce common method bias. Other assessments, such as teacher ratings and interviews, are also quite useful, and they are currently the most viable for high-stakes selection applications. However, they place a high burden on the rater or interviewer. Where that cost is too high, one strategy might be to use them as an occasional assessment, examining their relationship to self-assessments and situational judgment tests for a subset of participants.
Other assessments reviewed, such as conditional reasoning and the implicit association test (IAT), are intriguing and potentially quite useful. This is also true of time-use methods and word classification methods. However, all of these are still at the research stage and may need additional evaluation before being used operationally.
We examined the malleability of psychosocial or personality factors and evaluated the evidence for established methods for improving them. There is a widespread perception that personality factors are fixed over the lifespan—we have the personality we were born with. Two meta-analyses (B. W. Roberts & DelVecchio, 2000; B. W. Roberts, Walton, & Viechtbauer, 2006) demonstrated quite convincingly that this is not the case. The correlation between personality scores measured a year or more apart is only moderate, suggesting that while there is some consistency in personality, there is also change: Some individuals increase on some personality factors, others decrease, but in general there is change in the rank order of people over time. Also, there are mean-level changes in personality over the lifespan—we tend to become more conscientious, considerate of others, socially dominant, and emotionally stable as we grow through adolescence and into adulthood. This change suggests that personality in some sense may be thought of as a skill that can be developed like other skills. If so, then principles that govern cognitive skill change—such as practice makes perfect and it is easier to change narrow domains than broad domains—may prove useful in personality development efforts.
We also reviewed the evidence that already exists for whether and how personality and psychosocial factors could be improved, which suggests interventions and policies that could be implemented in the schools. For each of the five factors, specific interventions have proven successful. These include exercises and training in critical thinking (openness), study skills (conscientiousness), test and math anxiety reduction (neuroticism), teamwork and leadership (extraversion and agreeableness), and attitudes. Interventions along the lines of those described here could be evaluated in conjunction with a comprehensive psychosocial assessment system.
Recommendations for Future Research
We consider the findings from the literature review and our own experiences to suggest how a comprehensive psychosocial assessment system could be developed and how it might best be used. The core assessment constructs would be the Big 5 factors, along with particular facets that have proven important in education, such as the achievement striving and dependability facets of conscientiousness and the anxiety facets of emotional stability. The key purpose of the core constructs would be to enable comparisons among schools, districts, and even states, as well as trend comparisons over time.
The primary assessment types would be self-assessments and situational judgment tests. In addition, teacher ratings could be used for comparison with self-assessments and situational judgment tests. Developmental scales could be made available to students or institutions to assist in monitoring student progress from middle school to high school graduation. Developmental scales with norms could be presented for each of the factors, along with proficiency standards (basic, proficient, and advanced) for different target groups, such as 2-year and 4-year college students and workers in the various workforce sectors. To supplement assessments, it would be useful to provide specific suggestions in the form of feedback and action plans that might enable students to engage in self-help or assisted improvement programs. One way interventions could be structured would be to provide feedback on the psychosocial dimensions themselves and on the student's strengths and weaknesses; instructions on how to set improvement goals and monitor progress; and exercises, feedback, and experiential learning activities. Interventions such as these have already been developed, particularly to teach time management, teamwork, coping with test anxiety, and test-taking strategies, among others.
Psychosocial factors are important in education. Numerous studies have shown that psychosocial factors are correlated with school achievement. As early as preschool, personality (conscientiousness) predicts achievement (Abe, 2005). In middle school, psychosocial factors—mostly self-efficacy, self-concept, and confidence—have been shown to predict reading, science, and math achievement on several large-scale domestic and international assessments even after controlling for demographics, school attendance, and home educational materials (Campbell, Voelkl, & Donahue, 1997; Connell, Spencer, & Aber, 1994; J. Lee, Redman, Goodman, & Bauer, 2007). Self-discipline was found to predict academic achievement (grades and test scores) beyond IQ for eighth graders (Duckworth & Seligman, 2005). In college, several recent meta-analyses have shown that psychosocial factors add to grades and test scores in predicting both achievement and retention. The psychosocial factors include conscientiousness (Noftle & Robins, 2007; O'Conner & Paunonen, 2007; Wagerman & Funder, 2007), academic discipline, social activity, emotional control (Chamorro-Premuzic & Furnham, 2003; Robbins, Allen, Casillas, Hamme Peterson, & Le, 2006), and study habits, skills, and attitudes (Crede & Kuncel, 2008). In several studies of middle school, high school, and community college students, several psychosocial factors—conscientiousness, time management, test anxiety, communication, and teamwork skills—have been found to predict both standardized test scores and grades (MacCann, Minsky, & Roberts, 2008; R. D. Roberts, Schulze, & MacCann, 2007; R. D. Roberts, Schulze, & Minsky, 2006; Zhuang, MacCann, Wang, Liu, & Roberts, 2008).
The effects of psychosocial skills do not end in school but continue on through the transition to the workforce. This trend can be seen in studies that have shown the effects of psychosocial skills, particularly (but not exclusively) conscientiousness and ethics (integrity), on job performance (Schmidt & Hunter, 1998) and labor economic outcomes such as wages, employment, and incarceration rates (e.g., Heckman, Malofeeva, Pinto, & Savelyev, 2007; Heckman & Rubinstein, 2001).
The purpose of this prospectus is to explore the feasibility of creating a comprehensive, psychosocial (noncognitive) assessment of college readiness for secondary students and to show how noncognitive skills can be improved. We review findings based on diverse literatures and methods attesting to the importance of psychosocial factors in education. We attempt to make the case that psychosocial factors are important, that we know how to measure and develop them, and that we can improve educational achievement, particularly for underserved students, by doing so.
This prospectus is organized into sections, each addressing a key issue, as follows:
Framework. Is it possible to develop a comprehensive framework identifying the key psychosocial factors related to school success? What are these key factors (e.g., work ethic, dependability, teamwork, resilience)? Is there a rationale for a common terminology to describe those factors?
Evidence. What empirical evidence (correlational or experimental) is there that these factors are related to educational outcomes, particularly in high school? What is the empirical evidence for the relationship between the different psychosocial factors and various academic outcomes, such as school grades, standardized test scores, and staying in school, as well as affective outcomes such as having a positive attitude, being interested and engaged in school, and overall well-being?
Methods. What are the best methods for measuring the various psychosocial factors (e.g., self-reports, others' ratings, situational judgment tests)? Are these methods equally valid? How can these psychosocial factors be measured in a comprehensive, academic psychosocial assessment system, which might include a common scale of performance across grades (methods would include self-assessments, ratings by others such as teachers and principals, and situational judgment tests)?
Improvement. Can psychosocial factors be improved? If so, is there any evidence that improving psychosocial factors will result in improvements in educational outcomes?
Recommendations for future research. Is it conceivable that a comprehensive, academic psychosocial assessment system could be developed? How could it be used (e.g., high-stakes admissions, policy monitoring, outcomes evaluations, self-help)? Could a common psychosocial scale of performance be established across grades to enable studying developmental trajectories and to determine whether growth is on track? Finally, how should researchers study the development of psychosocial skills? How can assessment guide understanding as to how these skills can be improved?
Psychosocial Skills Framework
Many studies purport to identify noncognitive factors important for school and workplace success. But different studies identify different factors, and findings are often inconsistent. Part of this inconsistency is due to studies from different disciplines and perspectives using different terminology to describe the same thing (e.g., conscientiousness as a synonym for responsibility or for noncognitive skill) and using the same terminology to describe different things (e.g., integrity ranging in meaning from intellectual integrity to absenteeism). There is a benefit to standardizing terminology as a way of ensuring cumulative progress and facilitating the identification of key findings in the literature. This is the purpose of this section—to propose a general, standardized, psychosocial skills framework. The strategy here is to review factors and frameworks from a variety of disciplines—industry, education, personality psychology, and industrial–organizational psychology—with the goal of organizing them into a common framework with standardized terminology.
A Note on Terminology
The factors that are the focus of this prospectus are distinguishable from what are sometimes called cognitive factors, or intelligence, or knowledge, skills, and abilities. They go by a variety of names. They are commonly called noncognitive factors in the psychology and economics literatures and O factors (for "other," the O in knowledge, skills, abilities, and other characteristics [KSAOs]) in industrial–organizational psychology, as well as psychosocial skills, nonacademic skills, socio-affective skills, personality, personal skills, personal qualities, attitudes, dispositions, and character traits.
But these noncognitive factors themselves may be further subdivided into two categories. Snow and Farr (1980), borrowing from English and English (1958), suggested a tripartite distinction between cognition (perceiving, recognizing, conceiving, judging, reasoning, and sensing), affection (feeling, emotion, mood, and temperament), and conation (a conscious tendency to act, associated with impulse, desire, volition, and purposeful striving). (See also Cattell, who suggested a division between ability, temperament, and dynamic traits.) The terminology has not stuck, but the idea of a tripartite distinction seems to remain viable. Several personality researchers (McAdams, 1996; B. W. Roberts, 2009; B. W. Roberts & Wood, 2006) suggest a distinction between ability and personality traits on the one hand and motives, goals, and aspirations on the other—being organized is distinguishable from the aspiration to get organized (B. W. Roberts, 2009), for example. This distinction was honored in a recent attempt to link personality and economics (Borghans, Duckworth, Heckman, & ter Weel, 2006), in which intelligence and personality (patterns of thought, feeling, and behavior, that is, how we think, feel, and act) were contrasted with the expectations, motivations, values, drives, interests, and attitudes (how we wish to think, feel, and act) that give rise to them.
Terminology is undoubtedly a problem in this field. In this article we will use the terms cognitive and noncognitive (or psychosocial), acknowledging the inadequacies of this terminology, and we will further subdivide noncognitive factors into personality traits versus states, goals, and motivational processes.
Personality Assessment and the Big 5
Personality assessment has a long history in psychology. Hundreds and maybe thousands of personality traits or constructs have been suggested over the years. But in the last 20 years the field has essentially reached a consensus—there is a much smaller number of independent dimensions, only five, underlying the myriad of constructs suggested (Digman, 1990; Goldberg, 1993; John, 1990).
The fundamental idea of the Big 5 is based on the lexical hypothesis, which is that language has evolved to characterize the most salient distinctions between people. Therefore, if people are asked to describe themselves (or others) using adjectives sampled from the language (e.g., using a Likert scale or an adjective checklist), then a factor analysis of the resulting data should reveal the basic personality dimensions. This methodology has led to the development of the FFM or Big 5 model in personality psychology. The five factors are extraversion, agreeableness, neuroticism, conscientiousness, and openness. The finding of these five factors has been shown to generalize across ages, to include children and adolescents as well as adults (Digman, 1997). And it has been replicated across at least 14 different languages (Saucier & Goldberg, 2006). Replications are not based on translations of English into other languages. Rather they involve sampling adjectives from the native language dictionary, having people rate themselves (or others) on those adjectives, and conducting factor analyses of the data. Such studies have shown that an FFM typically produces a good representation of the data.
The basic FFM has faced several challenges. Probably the most significant is that in certain languages, such as Hungarian, Italian, French, Korean, and Turkish, a sixth factor emerges (e.g., Ashton et al., 2004). Saucier (2008; see also De Raad, 2006) suggested more inclusive rules for item selection and conducted an analysis of numerous cross-language datasets, supporting a six-factor model. The model was also supported in an English-speaking sample. The six factors are similar to the Big 5 but with the addition of a factor that is essentially an honesty (vs. negative valence) factor (this factor is a facet of agreeableness in the Big 5). Another difference is that the emotionality factor (neuroticism in the Big 5) has a "better defined positive pole," including traits such as courage and self-assurance (Saucier, 2008). Nevertheless, the FFM, almost universally accepted in the psychology research literature, is now being extended into the economics literature (Borghans et al., 2006), and for this article we adopt it as part of the basic framework for psychosocial factors.
Facets of the Big 5
Another level of specification in the FFM is the level of facets, which are subcategories of the five factors. A facet is a lower order factor or item cluster in the FFM hierarchy, reflecting the fact that a set of items or indicators can have some commonality (shared variance) that is independent of the higher order factor and that any given factor can have several correlated facets (in principle there could also be subfacets and sub-subfacets, but this idea has not been systematically pursued). Facets are considerably less stable than factors, and there have been many different proposals for facets of the Big 5. Table 1 presents the facets from one of the most popular assessments of the FFM, the NEO PI-R (Personality Inventory-Revised; the NEO reflects the instrument's original construct makeup of neuroticism, extraversion, and openness). However, there is no claim for any special status for this particular set of facets—the NEO PI-R is simply a widely used instrument.
Table 1. Big 5 Factors, Neuroticism-Extraversion-Openness Personality Inventory-Revised (NEO PI-R) Facets, and Example Items

NEO PI-R facet | IPIP scale name | Positive/negative example items from the IPIP

Conscientiousness
Competence | Self-Efficacy | Complete tasks successfully/misjudge situations
Order | Orderliness | Like order/leave a mess
Dutifulness | Dutifulness | Follow the rules/break rules
Achievement striving | Achievement-Striving | Work hard/do just enough to get by
Self-discipline | Self-Discipline | Get chores done right away/waste my time
Deliberation | Cautiousness | Avoid mistakes/rush into things

Neuroticism (emotional stability)
Anxiety | Anxiety | Worry about things/relaxed most of the time
Angry hostility | Anger | Get angry easily/rarely get irritated
Depression | Depression | Often feel blue/feel comfortable with myself
Self-consciousness | Self-Consciousness | Am easily intimidated/am not embarrassed easily
Impulsiveness | Immoderation | Often eat too much/easily resist temptations
Vulnerability | Vulnerability | Panic easily/remain calm under pressure

Extraversion
Warmth | Friendliness | Make friends easily/am hard to get to know
Gregariousness | Gregariousness | Love large parties/prefer to be alone
Assertiveness | Assertiveness | Take charge/wait for others to lead the way
Activity | Activity Level | Am always busy/like to take it easy
Excitement seeking | Excitement-Seeking | Love excitement/dislike loud music
Positive emotions | Cheerfulness | Radiate joy/am seldom amused

Agreeableness
Trust | Trust | Trust others/distrust people
Straightforwardness | Morality | Would never cheat on taxes/use flattery to get ahead
Altruism | Altruism | Make people feel welcome/look down on others
Compliance | Cooperation | Am easy to satisfy/have a sharp tongue
Modesty | Modesty | Dislike being center of attention/think highly of myself
Tender-mindedness | Sympathy | Sympathize with the homeless/believe in eye for eye

Openness
Fantasy | Imagination | Have a vivid imagination/seldom daydream
Aesthetics | Artistic Interests | Believe in the importance of art/do not like poetry
Feelings | Emotionality | Experience emotions intensely/seldom get emotional
Actions | Adventurousness | Prefer variety to routine/dislike changes
Ideas | Intellect | Like complex problems/avoid philosophical discussions
Values | Liberalism | Tend to vote for liberals/believe in one true religion

Note. IPIP = International Personality Item Pool.
The second column, International Personality Item Pool (IPIP) scale names, presents clone scales for the NEO PI-R facets. The IPIP is a website (http://ipip.ori.org/ipip/; Goldberg, 1999; Goldberg et al., 2006) that provides public domain personality items that form clone scales for most commercially published and research-based personality scales. (Clone scales are established by identifying items that have the highest correlations with the commercial scale scores based on a sample of participants who have been administered both the public domain items and the commercial items.) The items listed in Table 1 are also from the IPIP.
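The item-selection step behind clone scales can be made concrete with a small sketch. The data, the `clone_scale` helper, and the item names below are all invented for illustration; real clone scales are built from large samples and item pools:

```python
# Hypothetical sketch of assembling a "clone scale": correlate each
# public-domain candidate item with the commercial scale score in a sample
# that took both, then keep the highest-correlating items.

from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return sxy / (sx * sy)

def clone_scale(commercial_scores, candidate_items, k=2):
    """Return the k item names correlating most strongly (in absolute
    value) with the commercial scale score."""
    r = {name: pearson(resp, commercial_scores)
         for name, resp in candidate_items.items()}
    return sorted(r, key=lambda name: abs(r[name]), reverse=True)[:k]

# Six respondents' scores on a fictitious commercial conscientiousness scale
commercial = [10, 14, 9, 17, 12, 15]
# Candidate public-domain items (1-5 Likert responses, invented)
items = {
    "work_hard":    [2, 3, 2, 5, 3, 4],   # tracks the scale closely
    "like_order":   [3, 4, 2, 5, 3, 5],
    "love_parties": [4, 1, 5, 2, 3, 1],   # extraversion item, should drop out
}
print(clone_scale(commercial, items, k=2))
```

In practice the retained items are then scored as a new scale, and its correlation with the commercial scale serves as the check that the clone succeeded.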
Table 1 is only meant to be illustrative of the kinds of facets proposed for the Big 5. The appendix presents all the associated Big 5 facets we could identify from the IPIP (Toker, 2008). The scale names or facets have been ones identified in different studies or different commercial instruments. (Letters in parentheses show cross-listed facets; negative signs indicate reverse keyed facets.) The point of the appendix is to show the wide range of facets associated with the Big 5 factors. Their being listed in the appendix is not meant to imply empirical independence; rather, they are simply scale names for scales from commercial instruments that have been categorized into the Big 5 factors based on a content matching process (Toker). Some of these could even be compound scales (mixtures of facets or Big 5 factors; e.g., Hough & Ones, 2001) rather than facets.
Because facets reflect specific variance independent of Big 5 factor variance, facet scores may provide incremental validity over Big 5 factors in predicting various external criteria such as behavioral outcomes (Paunonen & Ashton, 2001). Evidence also shows that facets of the same factor may show divergent relations with criteria—for example, the achievement-striving facet of conscientiousness may correlate positively with achievement, while tidiness, another facet of conscientiousness, may be uncorrelated or correlate negatively. This suggests that facets, rather than Big 5 factors per se, may often prove to be the most appropriate level for measurement.
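Incremental validity is typically checked with a hierarchical regression: fit the outcome on the broad factor score alone, then add the facet score and examine the change in R². A minimal sketch with invented scores and a hypothetical `fit_r2` helper:

```python
# Toy illustration (invented numbers) of incremental validity: does a facet
# score add to the broad factor score in predicting an outcome? We fit an
# intercept-plus-predictors least-squares model and compare R² values.

def fit_r2(ys, columns):
    """R² of ordinary least squares with an intercept, via normal equations
    solved by Gaussian elimination. `columns` is a list of predictor lists."""
    n = len(ys)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Build X'X and X'y
    A = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(p)]
         for a in range(p)]
    b = [sum(X[i][a] * ys[i] for i in range(n)) for a in range(p)]
    # Forward elimination with partial pivoting
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        b[c], b[piv] = b[piv], b[c]
        for r in range(c + 1, p):
            f = A[r][c] / A[c][c]
            for k in range(c, p):
                A[r][k] -= f * A[c][k]
            b[r] -= f * b[c]
    # Back substitution
    beta = [0.0] * p
    for r in range(p - 1, -1, -1):
        beta[r] = (b[r] - sum(A[r][k] * beta[k]
                              for k in range(r + 1, p))) / A[r][r]
    yhat = [sum(X[i][a] * beta[a] for a in range(p)) for i in range(n)]
    ybar = sum(ys) / n
    ss_res = sum((ys[i] - yhat[i]) ** 2 for i in range(n))
    ss_tot = sum((y - ybar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Invented scores for eight students
gpa           = [2.8, 3.4, 2.5, 3.9, 3.1, 3.6, 2.9, 3.3]
conscient     = [3.0, 3.5, 2.8, 4.0, 3.2, 3.7, 3.1, 3.4]  # broad factor
achv_striving = [2.5, 3.8, 2.2, 4.5, 3.0, 4.0, 2.6, 3.5]  # facet

r2_factor = fit_r2(gpa, [conscient])
r2_both   = fit_r2(gpa, [conscient, achv_striving])
print(f"R² change for the facet: {r2_both - r2_factor:.3f}")
```

Because R² can only increase when a predictor is added, the substantive question is whether the change is large and reliable enough to matter, which is what the cited studies test.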
Beyond the Big 5
There is also some evidence for additional factors beyond the Big 5. If items other than strictly personality items are included in an analysis, then it is possible to identify additional personal attributes, such as religiosity, honesty, deceptiveness, conservativeness, conceit, thrift, humorousness, sensuality, and masculinity–femininity (Paunonen & Jackson, 2000). However, even these overlap with the Big 5 (Saucier & Goldberg, 1998).
Another approach to identifying personal factors beyond the Big 5 was used by Saucier (2000). In an attempt to identify social attitude factors, he presented definitions of -ism words (e.g., empiricism, altruism, perfectionism) to participants, who rated their degree of endorsement of each concept on a Likert scale; from the correlation matrix of those ratings a four-factor structure emerged, and in subsequent research with additional items a six-factor structure was found. The factors were alphaisms (religious orthodoxy), betaisms (unmitigated self-interest, including materialism and ethnocentrism), gammaisms (protection of the civil order associated with Western democracy), deltaisms (mysticism and subjective spirituality), government interventionism, and harshness toward outsiders. Something like the first factor (alphaisms) often appears in social attitude studies and is closely aligned with a liberalism–conservatism dimension. This finding is consistent with the broader literature on the topic, which tends to find a general conservatism–authoritarianism–dogmatism cluster as a dominant social attitude factor. However, this factor is not truly separate from the Big 5, as it correlates with conscientiousness and openness (Carney, Jost, Gosling, & Potter, 2008).
Cross-cultural studies on values have also produced attitude and values factors. For example, Schwartz and Bardi's work on values (Schwartz & Bardi, 2001) suggested a number of factors, such as achievement, power, and security. On the basis of item content, it seems that values could be accounted for by the Big 5 framework, as shown in Table 2. However, at least one analysis found unidimensionality among values measured by Schwartz's scale (Stankov, 2007), and there was little overlap with the Big 5. It is possible, though, that this result is due to the response scale format itself, as three of the factors identified in that study were defined by response format.
Table 2. Correspondence Between Big 5 Personality Factors and Other Factors
[Table 2 aligns each Big 5 factor with corresponding constructs from other frameworks: self-regulated learning factors (e.g., intrinsic/extrinsic motivation), Schwartz values (e.g., self-directedness, spirituality), applied skills from the "Are they really ready to work?" survey (e.g., written & oral communication, works with diversity), and job performance dimensions (e.g., job-specific task proficiency, creating & conceptualizing, analyzing & interpreting).]
Note. SCANS = Secretary's Commission on Achieving Necessary Skills.
In another study, Hofstede (2001), who surveyed IBM employees around the world, suggested five cultural dimensions: power distance (expectations regarding the distribution of power), uncertainty avoidance (degree of threat felt in uncertain situations), individualism versus collectivism (degree to which one is expected to look after oneself only), masculinity versus femininity (degree to which gender roles are distinct), and long-term orientation (degree to which society rewards perseverance and thrift). No study, other than the inconclusive Stankov (2007) study, has attempted to account for these factors with the Big 5.
Vocational and avocational interests refer to one's attitudes (likes and dislikes) toward activities and occupations. The dominant model here is Holland's (1959, 1997) model of vocational interests. On the basis of extensive factor analytic studies, he proposed six dimensions: realistic (e.g., mechanical interests), artistic (e.g., writing, musical, and artistic activities), investigative (e.g., mathematics and science), social (e.g., teaching and counseling), enterprising (e.g., involving leadership and communication skills, such as sales), and conventional (e.g., involving clerical and arithmetic abilities, such as bookkeeping). Meta-analyses have shown that overlap exists between interests and personality (Larson, Rottinghaus, & Borgen, 2002). The overlaps are shown in Table 2. This table shares common themes with Ackerman and Heggestad (1997), who identified such links (including those shared with cognitive ability) as trait complexes.
Youth development programs (Eccles & Gootman, 2002; Moore & Lippman, 2005), positive youth development programs (Lerner, 2005), and character education projects have assembled a wide range of noncognitive scales and assessments for evaluation purposes. Many of these programs are associated with the positive psychology movement (e.g., Seligman & Csikszentmihalyi, 2000; Snyder, Rand, & Sigmon, 2005). To take one example, Peterson and Seligman (2004) proposed 24 character strengths based on a set of criteria (leads to excellence, involves deliberate reflection, distinguishable from talents and abilities, has a positive pole, recognized cross-culturally, morally valued, etc.), producing the following list:
Appreciation of beauty and excellence, bravery, citizenship, creativity, curiosity, fairness, forgiveness and mercy, gratitude, hope, humor, integrity, judgment, kindness, leadership, love, love of learning, modesty and humility, persistence, perspective, prudence, self-regulation, social intelligence, spirituality, and zest.
This system has not received the kind of extensive psychometric analysis that the other systems reviewed here have, but it is worth mentioning because of its prominence in educational discussions and the status of the positive psychology movement (see Zeidner, Matthews, & Roberts, 2009). However, it is clear that many of the character traits listed can be classified into a Big 5 framework; indeed, many of the terms are themselves used as Big 5 facet names. Table 2 provides suggestions for overlaps.
Goals and Motivational Processes
To this point we have argued that the FFM is a good framework for capturing differences between students on a wide variety of noncognitive factors—personality, attitude, interests, values, and character strengths. As we suggested above, these are commonly referred to as personality traits, the relatively enduring patterns of thoughts, feelings, and behavior. However, personality traits might not be the sum total of personality. The tripartite distinction discussed above implies a separate realm of personality—its dynamic aspect, or conation, or goals and motives—states or processes that are triggered in response to particular situations. McAdams (1996) suggested that these two realms represent different levels of personal description, with personality traits (the Big 5) residing at Level I, the level of dispositional traits, and “tasks, goals, projects, tactics, defenses, values and other developmental, motivational, and/or strategic concerns that contextualize a person's life in time, place, and role” (p. 295) residing at Level II, the level of personal concerns. Similarly, B. W. Roberts and Wood (2006) proposed a neo-socioanalytic model of personality, which attempts to marry trait and social-cognitive perspectives on personality to accommodate this distinction.1 In B. W. Roberts and Wood's (2006) scheme, traits are separate from goals and motives. The evidence for their separability is that they are only weakly correlated (B. W. Roberts & Robins, 2000) and that they display separate developmental trajectories (traits tend to go up with age, goals tend to go down; B. W. Roberts, O'Donnell, & Robins, 2004).
Thus we see that a general framework for noncognitive factors in education ought to accommodate both trait and process or state (goal-and-motive) level descriptions of individuals. Latent state-trait theory (LST; Steyer, Schmitt, & Eid, 1999) provides a way to model responses to situations as a function of traits and states (and trait-state interactions) and shows the unity of these two levels of personality description. In the education literature, much of the process-level research falls into the category of self-regulated learning.
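To make the LST idea concrete, the decomposition can be sketched as follows. The notation here is ours and deliberately simplified; Steyer et al. (1999) present the full measurement model:

```latex
% Illustrative latent state-trait (LST) decomposition for an observed
% score Y of person i on occasion t (simplified; notation assumed):
\begin{align}
Y_{it} &= \lambda_t \xi_i + \zeta_{it} + \varepsilon_{it}
  && \text{(trait, state residual, measurement error)} \\
\operatorname{Var}(Y_{it}) &= \lambda_t^2 \operatorname{Var}(\xi_i)
  + \operatorname{Var}(\zeta_{it}) + \operatorname{Var}(\varepsilon_{it}) \\
\text{consistency} &= \frac{\lambda_t^2 \operatorname{Var}(\xi_i)}{\operatorname{Var}(Y_{it})},
\qquad
\text{specificity} = \frac{\operatorname{Var}(\zeta_{it})}{\operatorname{Var}(Y_{it})}
\end{align}
```

The consistency coefficient indexes how much of a response reflects the stable trait, whereas the specificity coefficient indexes how much reflects the occasion (state and situation), which is what lets LST models show the unity of the trait- and process-level descriptions.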
Self-regulated learning is an important and rather general concept within educational psychology. Teaching students self-regulated learning skills is increasingly seen as a goal on par with teaching students subject-matter skills. An indication of the importance of self-regulated learning is that it served as the guiding framework for the background questionnaire for the influential Program for International Student Assessment (PISA; Baumert et al., 2006). In some significant ways, self-regulated learning has become the unifying concept for thinking about the effects of noncognitive factors on academic learning.
Many definitions of self-regulated learning appear in the literature, but what they seem to have in common is the idea that learning is maximized when students are motivated, have available a repertoire of learning strategies, and are metacognitively aware of what they are doing while learning or performing an academic task. The self-regulated learning process is also commonly broken down into activities occurring before (forethought), during (monitoring and control), and after (reflection) learning (Pintrich & De Groot, 1990; Pintrich, Wolters, & Baxter, 2000; Weinstein, Husman, & Dierking, 2000; Weinstein, Zimmerman, & Palmer, 1988; Zimmerman, 1990, 1998, 2000, 2002). Engaging in self-regulated learning enhances academic performance, confidence and self-efficacy, and insight (e.g., Zimmerman & Bandura, 1994).
Self-regulatory activities occurring during forethought include self-efficacy judgments, goal orientation, and time and effort planning. Monitoring involves being metacognitive—that is, being aware of one's thinking, motivation, and effort levels and being conscious of time and task conditions. Control consists of selecting strategies for directing thinking, maintaining motivation and affect, and regulating effort. Reflection includes cognitive judgments, affective reactions, and evaluations of task and context (Pintrich & Zusho, 2002). In some schemes this stage also takes into account participation by others, such as teachers, peers, and parents, who regulate students' behavior by encouraging them, providing them with tools and techniques, and showing how and when to do a certain task (Pintrich, 2000). From these descriptions it can be seen that there are several key process concepts—self-efficacy, goal setting and goal orientation, and control beliefs.
Self-Efficacy (Competency Beliefs)
One's beliefs and expectancies about one's own competence and effectiveness influence performance as well as future learning and personal development (e.g., Bandura, 1997). In the academic setting, self-efficacy refers to students' beliefs in their capabilities to master challenging academic demands by organizing and executing the courses of action (i.e., cognitive, behavioral, and social) necessary for successful academic performance (Bandura, 1982). Self-efficacy is distinct from (but related to) general outcome expectancies, beliefs about causality and controllability (investigated in studies of causal attributions), and self-esteem. Self-efficacy is presumed to affect academic performance by increasing persistence, goal setting, management of work time, and flexibility in testing problem-solving strategies (Schunk, 1984; Zimmerman & Bandura, 1994).
Bandura (e.g., 1997) viewed self-efficacy as dynamic and context specific, but it may have a more general character. Self-efficacy is listed as a facet of neuroticism (appendix), and a review of the definition of the concept makes it clear why. Bandura (1982) argued that students with high academic self-efficacy persist longer and exert greater effort and that this happens because “in the face of difficulties, people who entertain serious doubts about their capacities slacken their efforts or give up altogether, whereas those who have a strong sense of efficacy exert greater effort to master the challenges” (p. 25). And further, “people who judge themselves ineffective in coping with environmental demands tend to generate high emotional arousal, become excessively preoccupied with personal deficiencies, and cognize potential difficulties as more formidable than they really are: Such self-referent concerns undermine effective use of the competencies (that) people possess” (pp. 25–26).
Goal Setting and Mastery Versus Performance Goal Orientation
Self-regulated learning emphasizes the importance of goals for quantity, quality, or rate of performance (Schunk, 1990). Goals are involved across all phases of self-regulation, from goal setting to employing goal-directed actions, to evaluating goal progress and modifying strategies, to successfully completing a task (Zimmerman, 1998). Goals improve self-regulation by affecting students' self-evaluations of progress, self-efficacy, and motivation (Schunk, 1995). Goals direct individuals' attention to relevant task characteristics, help in selecting and applying appropriate strategies, and help in monitoring progress by comparing current status to a final goal. A difference between present performance and the desired goal may cause dissatisfaction, leading to either increased effort or quitting, depending on whether students believe they can succeed—that is, whether goals are attainable (Schunk, 1995). Goal attainment builds self-efficacy and guides individuals to select new, more challenging goals (Schunk, 1990).
A dichotomy of goal orientation distinguishes learning or mastery goals, which reflect students' desire to learn by trying to gain a better understanding of a topic, from performance or ego-achievement goals, which refer to students' interest in how they perform relative to other people (Archer, 1994; Butler, 1993; Greene & Miller, 1996). Although performance goals may serve as powerful motivators, learning goals are believed to be especially effective in enhancing individuals' self-efficacy and self-regulation (Schunk, 1995). Students with learning goals test themselves more often and process the information more deeply by applying sophisticated techniques, whereas students with performance goals tend to study superficially (Archer, 1994). Students who approach learning with a mastery goal orientation show increased deep cognitive processing, memory recall, better text comprehension, and greater use of self-regulatory strategies (Graham & Golan, 1991; Pintrich & De Groot, 1990). Self-efficacious individuals guided by learning goals are more likely than those with performance goals to use self-regulated strategies and are more likely to succeed academically (Ee, Moore, & Atputhasamy, 2003). Use of learning goals relates to various adaptive outcomes, including performance, interest, and positive affect, whereas performance goals have been linked to less adaptive outcomes (Pintrich, 2000).
Control Beliefs: Attributions, Locus of Control, and Beliefs About Intelligence
Dweck and colleagues (Dweck, 2000; Dweck & Leggett, 1988) have suggested that students hold fundamental beliefs about the nature of intelligence and that these beliefs affect academic outcomes. One perspective, the entity view, is that intelligence is fixed. Students with this perspective believe their intelligence is a fundamental part of their makeup and there is not much they can do about it. Consequently they tend to attribute successes and failures to their intelligence or lack thereof, which they see as a factor outside their control. A more productive view of intelligence is the incremental belief, which sees intelligence more as a skill that can be increased with study and effort. Students holding this view attribute successes and failures to factors they have control over, such as amount of study, effort, and preparation put into a task, the effective use of strategies, and other self-regulatory processes. There is some evidence that students can change from holding entity to holding incremental views (Blackwell, Trzesniewski, & Dweck, 2007). The incremental versus entity perspective seems to map fairly nicely onto the constructs of locus of control (internal vs. external) and attribution theory (locus of control, stability, controllability).
What Is Motivation?
It seems obvious that school achievement is related both to content knowledge and to motivation, but motivation is a rather amorphous construct in psychology. Is motivation a separate construct or an epiphenomenon—something that is caused by something else? Within a self-regulatory learning framework, motivation seems mostly to be seen as the latter. Motivation is thought to increase when students attribute success to controllable (e.g., effort) rather than uncontrollable (e.g., ability) factors, have high self-efficacy, are mastery rather than performance goal oriented, and are intrinsically rather than extrinsically motivated (Schunk & Zimmerman, 2006). Anderman and Wolters (2006) suggested that in addition to goals, values (including interests) and affect may influence achievement motivation.
Mapping Process-Level to Trait-Level Constructs
How do constructs emerging from the motivation and self-regulatory learning traditions fit within an FFM framework? The FFM has not played a significant role in education thus far. For example, Division 15 of the American Psychological Association recently published a Handbook of Educational Psychology (Alexander & Winne, 2006), and only one chapter even touched on the topic, while several dealt with process-level and motivational issues.
However, although we are presuming that traits and processes are different levels of personality description, empirically they can still overlap. Traits and states overlap—extraverts are more likely to be in a positive mood, for example, and neurotics in a negative one (Watson & Clark, 1992). Similarly, trait-process relationships may exist: traits may predispose individuals toward certain process-level activities.
Although there are certainly cognitive components to self-regulation, such as knowledge about effective strategies, there are probably personality ones as well. The beneficial effects of conscientiousness on performance may be attributable to goal setting and goal commitment (Barrick, Mount, & Strauss, 1993) and to mastery goals (Heggestad & Kanfer, 2000). Test anxiety correlates with adoption of performance-avoidance goals, which may mediate some of its detrimental effects on affect and behavior (McGregor & Elliott, 2002). Agreeableness is negatively related to motivations toward competitive excellence (Heggestad & Kanfer, 2000). Personality may also affect goals qualitatively, with, for example, social affiliation goals relating to agreeableness and dominance goals to extraversion (Matthews, Deary, & Whiteman, 2003).
Self-esteem, locus of control, and self-efficacy have been analyzed using factor analysis approaches. The finding, based on a meta-analysis (Judge & Bono, 2001), is that attributing outcomes to effort (as opposed to uncontrollable, external factors, such as intelligence or luck), self-esteem, and self-efficacy all tend to be most highly related to the neuroticism factor in the Big 5. This has led to the proposal of a "core self-evaluation" trait, which might be seen as a broader version of neuroticism (e.g., see Bono & Judge, 2003). An alternative formulation suggests that self-efficacy is better thought of as a mediator of the conscientiousness–performance relationship (Chen, Casper, & Cortina, 2001).
Noncognitive Factors Important for Educational Success: Interview Studies
One way to identify factors important to educational success is to ask teachers, faculty members, and other school and university administrative staff members. Several studies conducted by ETS have asked experts to identify the factors most important to educational success. A benefit of this approach is that it captures the language educators use to describe the key noncognitive factors, which is useful in communicating the findings.
In one study, 15 graduate school scholars and mentors were interviewed by phone and in group discussions and were asked what qualities they would like to see in students (Enright & Gitomer, 1989). Seven "general competencies" were identified, four of which were noncognitive qualities. An appendix table cross-lists these factors with the Big 5 factors.
Several years later, the GRE® Board commissioned the Horizons project to understand how the GRE was used and how it might be improved (Briel et al., 2000). The project interviewed graduate school deans and faculty at 61 institutions, with more in-depth telephone interviews with 21 faculty members and deans aimed at defining the successful graduate student (Walpole, Burton, Kanji, & Jackenthal, 2002). A qualitative analysis (essentially, counting the number of mentions of each quality and weighting by the enthusiasm of the mention) yielded a list of qualities, with certain noncognitive factors among the most frequently and enthusiastically mentioned. Table 2 cross-lists these with the Big 5 factors.
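The mentions-weighted-by-enthusiasm idea can be sketched computationally. This is an illustration only: the actual Horizons analysis was qualitative, and the qualities, the 1–3 enthusiasm scale, and the scoring rule below are all hypothetical.

```python
from collections import defaultdict

# Hypothetical mention data: (quality named by an interviewee,
# coded enthusiasm of that mention on an assumed 1-3 scale).
mentions = [
    ("persistence", 3), ("persistence", 2), ("creativity", 3),
    ("communication", 1), ("persistence", 3), ("creativity", 2),
]

def rank_qualities(mentions):
    """Rank qualities by (number of mentions) x (summed enthusiasm)."""
    counts = defaultdict(int)
    enthusiasm = defaultdict(int)
    for quality, level in mentions:
        counts[quality] += 1
        enthusiasm[quality] += level
    scores = {q: counts[q] * enthusiasm[q] for q in counts}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

print(rank_qualities(mentions))
# -> [('persistence', 24), ('creativity', 10), ('communication', 1)]
```

Whether enthusiasm is summed or averaged before multiplying is a design choice; summing (as here) rewards frequently mentioned qualities twice, which mirrors the "most frequently and enthusiastically mentioned" criterion in the text.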
We have conducted several additional studies in recent years that have yielded similar qualities. In one, the Enhance project, we interviewed 46 K–12 school administrators, who provided a list of noncognitive factors that could be organized into student engagement, learning skills, and school climate variables. The engagement and learning skills variables are listed in Table 2 (the school climate variables—student–teacher relationship, learning environment, school safety, and teacher isolation—are omitted from the table). Also as part of that project we interviewed 21 undergraduate, 19 graduate, and 19 business school faculty members, who provided a list of factors, also presented in Table 2. Interview studies with qualitative analyses have their limitations, such as inconsistent terminology and cognitive biases (e.g., the availability heuristic), but the terminology, taken across these studies, does present a picture of the qualities educators say are important for being a successful student.
Noncognitive Factors in the Workforce
There have been several significant activities identifying workforce competencies, which could prove useful in identifying the noncognitive factors important for educational success.
One of the first of these was the U.S. Army's Project A, a large-scale effort to overhaul the personnel selection and classification process used by the military services. The project was a response to a request by the U.S. Congress to validate the Department of Defense's Armed Services Vocational Aptitude Battery, a cognitive test battery used to select and place applicants into the various job specialties within the military. The project involved the development and analysis of both new predictor tests and various outcome criteria against which to validate the new and the old measures. A wide variety of both novel and conventional measures of cognitive and noncognitive skills were administered to tens of thousands of military personnel in numerous studies. There were a number of technical issues associated with the project, but for the purposes of this perspective, the key finding was the identification of eight major outcome dimensions (Campbell, 1990; Campbell, McCloy, Oppler, & Sager, 1993), referred to as the "taxonomy of higher order performance components." The eight taxonomic factors are intended to be distinct and comprehensive and to represent the range of outcomes that organizations value in their employees. Two were cognitive proficiency factors, but the remaining six were primarily noncognitive. Table 2 cross-lists these factors with the Big 5.
Although the system was developed for military occupations, the concepts and principles underlying the taxonomy have proved useful beyond the military. The system has been widely adopted and adapted by industrial–organizational psychologists for the workforce in general. Also, Kuncel, Hezlett, and Ones (2001) presented a variant on this model to accommodate undergraduate student performance based on critical incidents. Reeve and Hakel (2001) presented an "armchair" adaptation of the Kuncel et al. (2001) model to accommodate graduate student performance.
The Great 8
A model inspired by the Project A model, although based on a different methodology, was developed by Bartram and colleagues to represent competencies for the general workforce (Bartram, Robertson, & Callinen, 2002; Kurz & Bartram, 2002); they proposed the Great 8 competency factors. Competencies were derived from factor analysis and multidimensional scaling of supervisor, self-, and overall job-performance ratings. This contrasts with Project A, which relied on data obtained through cognitive measures and motivation and personality questionnaires; Bartram, Kurz, and Baron (2003) accordingly referred to theirs as a criterion-centered, as opposed to a predictor-centered, model. The Great 8 factors are listed in Table 2. The cross-listing with the Big 5 follows the authors' suggestions (Bartram et al., 2003), based on 33 studies with over 5,000 participants in different job sectors and cultures.
Secretary's Commission on Achieving Necessary Skills
Somewhat contemporaneously with the Department of Defense's endeavors to map out workforce competencies, the U.S. Department of Labor undertook its own investigation and developed a framework. The Secretary of Labor appointed a commission to determine the skills needed to succeed in the workplace. This resulted in a "three-part foundation" of basic literacy and computation skills, thinking skills, and personal qualities (Department of Labor, 1991). Basic skills included reading, writing, arithmetic, listening, and speaking. Thinking skills included creative thinking, decision making, problem solving, seeing things in the mind's eye, knowing how to learn, and reasoning. Most important for our purposes here are the personal qualities, which are listed in Table 2 and mapped against the Big 5 factors. In addition, the commission identified five workplace competencies, one of which—interpersonal skills—included the six factors listed in Table 2 (prefaced by I:).
The major significance of the Secretary's Commission on Achieving Necessary Skills (SCANS) is not the methodology by which the commission arrived at these competencies but rather the recognition of the importance of noncognitive skills in the workplace. The SCANS efforts had several policy implications, including, for example, the passage of the School-to-Work Opportunities Act of 1994 (PL 103-239), which provided assistance to schools and employers to, among other things, instruct students on general workforce competencies, including noncognitive ones such as positive work attitudes.
Are They Really Ready to Work?
Fairly natural extensions of the SCANS work can be seen in efforts undertaken to get employers to evaluate both (a) the importance of and (b) graduates' readiness with respect to various workforce competencies. One of these is a recent study, “Are They Really Ready to Work?,” which was produced by a consortium of The Conference Board, Partnership for 21st Century Skills, Society for Human Resource Management, and Corporate Voices for Working Families (Casner-Lotto & Barrington, 2006). The project involved interviewing 400-plus employers as to (a) the importance of various applied skills and (b) high school, community college, and college graduates' readiness with respect to those applied skills. The applied skills are listed in Table 2.
Several key findings emerged from the study. First, these applied skills, mostly noncognitive, were judged to be as important as or more important than the academic skills (not listed, but similar to the basic skills from SCANS). Second, employers for the most part considered graduates unprepared for the workforce with respect to many of the most important competencies.
Putting It All Together
The purpose of reviewing this extensive list of constructs and frameworks is to motivate the development of a unifying general framework, or common framework, into which other systems can fit. Doing so provides several advantages. First, using a common framework means that the maximum number of researchers can contribute to advances in our knowledge of how to assess, how to intervene, and how to develop students, using standard terminology. A unique, proprietary system excludes researchers and practitioners either due to cost barriers or to disputes over the viability of the proprietary system. Second, using a common framework means that results and findings will be readily accepted by both the scientific and user (educational practice) communities. Terminology may have to be tailored to fit the using segment—scientific versus practice, for example—but if the underlying dimensions are the same then terminology can be seen as separate from the constructs themselves.
As a general framework, we propose the FFM of personality to account for trait-level differences between students. We propose in addition a more process-oriented description of goals and motives to account for other noncognitive aspects of school performance. As we showed, much of the research in education uses the self-regulatory learning framework for describing goals and motivational processes. In a later section on developing noncognitive skills we show how these two levels of personality description—trait and process—may work together in effecting personality change.
There are two important issues to address. One is the personality—skills/competencies continuum. There is a widespread belief that personality is a stable characteristic of individuals and that creating a system for monitoring, intervening, and improving personality is therefore futile. We later review data that contradict this widely held belief. But even setting the malleability question aside, it is important to distinguish basic personality tendencies from what we might call noncognitive skills. For example, consider time management. At a basic personality trait level, time management might be seen as a facet of the personality factor of conscientiousness. Nevertheless, one can learn to become a better time manager, and courses are available that teach just that. Also, consider typical personality items, such as "get angry easily" or "am afraid of many things." Although they are indicators of personality traits, they can also be thought of as skill deficiencies, and therapies and training courses have been developed specifically to remedy such personality limitations. This idea is explored more thoroughly in the Improvement section of this report.
Another issue is what might be called the predictor-criterion distinction. This is not always a clear-cut distinction, but personality factors, interests, and the like are often thought of as predictors, and the Great 8 and the Project A performance factors are often treated as criteria (outcomes). Partly this reflects what we want to argue is an inaccurate understanding of personality in which personality factors are relatively immutable, whereas outcomes reflect environmental influences (e.g., training, interventions) on the individual. For particular purposes, it may be useful to distinguish predictors and outcomes, but we think of this distinction more like a continuum than like a binary distinction. This topic, too, is addressed more thoroughly in the Improvement section in this report.
Empirical Evidence for the Importance of Noncognitive Constructs
Any attempt to understand the complete causal chain associated with educational attainment must include the effects of noncognitive factors, such as personality and motivational processes, in concert with ability and social and economic factors at home and in the community (Matthews, Zeidner, & Roberts, 2007; Saklofske & Zeidner, 1995). This section of our article presents an overview of relations between noncognitive constructs and academic outcomes (see also O'Conner & Paunonen, 2007; Noftle & Robins, 2007, for reviews and meta-analyses covering some of the same territory). We focus here on the links between personality traits and educational outcomes (we covered the relationships between motivational processes and educational outcomes in the previous section). We first focus on the broad factors of the FFM of personality. Then we review the effects of facets, compound traits, and interstitial traits related to the Big 5 that have received special attention—achievement striving, time management, academic self-concept, and self-efficacy. Following that, we consider the relationships between the Big 5 and two formative constructs, student engagement and college readiness. Our review includes research on populations from preschool to graduate school.
Evidence for Relations Between the Big 5 and Academic Achievement
Many different personality traits have been correlationally linked to academic performance. But given the prominence of the Big 5 framework in the personality literature (Digman, 1990) as well as its emerging recognition in the economics literature (Borghans et al., 2006), we believe it is most productive to organize findings around the Big 5. We review these findings, one factor at a time.
Ackerman and Heggestad's (1997) meta-analysis revealed a positive relation between openness (O) and standardized measures of knowledge and achievement. They suggested that crystallized intelligence (i.e., acquired cognitive skills) may be one potential mediating factor in the relationship between O and scholastic ability. O is modestly correlated with cognitive ability; correlations typically range between 0.20 and 0.30. Of the Big 5, O has the highest correlations with SAT Verbal scores, also falling in the 0.20–0.30 range (Noftle & Robins, 2007). (Interestingly, O did not correlate with SAT Math.) O has been positively associated with final grades, even when controlling for intelligence (Farsides & Woodfield, 2003). O may facilitate the use of efficient learning strategies (e.g., critical evaluation), which, in turn, affects academic success (Mumford & Gustafson, 1988). But a meta-analysis (Crede & Kuncel, 2008) found O to correlate with study attitudes (r = .30) but not study habits (r = .08).
The correlation between O and academic achievement is not always found (e.g., Busato, Prins, Elshout, & Hamaker, 2000; O'Conner & Paunonen, 2007), leading to the suggestion that the creative and imaginative nature of open individuals may sometimes be a disadvantage in academic settings, particularly when students are required to reproduce curricular content rather than produce novel responses or engage in creative problem solving (De Fruyt & Mervielde, 1996). Some of the ambivalence in findings concerning O may be due to the loose nature of the O factor (De Raad, 2006; Hong, Paunonen, & Slade, 2008), which reflects both intellectual orientation and openness, weighted according to the items included to measure it in a particular study.
Conscientiousness (C) has consistently been found to predict academic achievement from preschool (Abe, 2005) through high school (Noftle & Robins, 2007), the postsecondary level (O'Conner & Paunonen, 2007), and adulthood (Ackerman & Heggestad, 1997; De Fruyt & Mervielde, 1996; Shiner, Masten, & Roberts, 2003). C measured in school children was found to predict academic achievement at age 20 and eventual academic attainment at age 30 (Shiner & Masten, 2002). C predicts college grades even after controlling for high school grades and SAT scores (Conard, 2006; Noftle & Robins, 2007), suggesting it may compensate for lower cognitive ability (Chamorro-Premuzic & Furnham, 2003). High C may be associated with personal attributes necessary for learning and academic pursuits such as being organized, dependable, efficient, striving for success, and exercising self-control (Matthews & Deary, 1998). For example, C was found to predict early completion of independent credit assignments and signing up early to participate in a study (Dollinger & Orf, 1991). C might even affect achievement through its effect on the sleep schedule—C is related to morningness (Randler, 2008; R. D. Roberts & Kyllonen, 1999), and high C individuals experience earlier rising and retiring times (Gray & Watson, 2002). The effects of C on academic performance may be mediated by motivational processes such as expenditure of effort, persistence, perceived intellectual ability (Boekaerts, 1996; Noftle & Robins, 2007), effort regulation (Bidjerano & Yun Dai, 2007), and attendance (Conard, 2006). There is some evidence that particular facets of conscientiousness—achievement-striving, self-discipline, diligence, achievement via independence—may be particularly strong predictors of academic achievement, perhaps stronger than the broad C trait itself (Kuncel et al., 2005; Noftle & Robins, 2007; O'Conner & Paunonen, 2007).
In early studies, neuroticism (N) was shown to predict poorer academic performance among school-aged children. For example, Entwistle and Cunningham (1968) used data from an almost complete age group of 3,000 13-year-olds and reported that emotional stability was related to academic success. Shiner and Masten (2002) reported results for a longitudinal study of 205 participants who were assessed around ages 10, 20, and 30. Negative emotionality at age 20 was correlated with poor adaptation concurrently and 10 years previously. A meta-analysis has suggested a correlation of around −0.20 between N and academic achievement measures (e.g., Seipp, 1991), and there is evidence for the particular importance of the anxiety and impulsiveness facets of N (Kuncel et al., 2005; O'Conner & Paunonen, 2007). A meta-analysis suggested that the relationship may be due to N's correlation with study attitudes (−.40; Crede & Kuncel, 2008). However, some studies of both school children (Heaven, Mak, Barry, & Ciarrochi, 2002) and university students (Busato et al., 2000) have failed to find any significant correlations between N and attainment. Two reviews and meta-analyses (Noftle & Robins, 2007; O'Conner & Paunonen, 2007) also did not find a consistent relationship. Such inconsistencies may reflect the role of moderator factors. For example, McKenzie and Tindell (1993) showed that N was related to lower achievement only in students with weak superegos. Self-control and focusing of motivation may compensate for negative emotionality.
In general there does not seem to be a relationship between extraversion (E) and college performance (Kuncel et al., 2005; Noftle & Robins, 2007), although some studies have found evidence for a small, negative correlation (O'Conner & Paunonen, 2007). Age may moderate the effect of E on academic success. Before the age of 11–12 years, extraverted children outperform introverted children (Entwistle & Entwistle, 1970); among adolescents and adults, some research has shown that introverts achieve more than extraverts (e.g., Furnham & Chamorro-Premuzic, 2004). This change in the direction of the correlation has been attributed to the move from the sociable, less competitive atmosphere of primary school to the more formal atmospheres of secondary school and higher education, in which introverted behaviors such as avoidance of intensive socializing become advantageous. Extraverts and introverts also differ in parameters of information processing, such as speech production, attention, and reflective problem solving (Zeidner & Matthews, 2000), with performance varying along meaningful dimensions. For example, extraverts have been shown to be better at oral contributions to seminars but poorer at essay writing than introverts (Furnham & Medhurst, 1995).
Although the temperamental precursors of agreeableness (A), such as prosocial orientation, relate to better social adjustment, relations between A and academic attainment are consistently nonsignificant (Kuncel et al., 2005; Noftle & Robins, 2007; O'Connor & Paunonen, 2007; Shiner et al., 2003). However, antisocial personality traits associated with low A may have detrimental effects (see below).
In summary, generalized personality traits constitute one of several noncognitive factors that may impact classroom learning and academic performance. As explored further in the third section, personality assessment may also be informative about a student's strengths and weaknesses at the process level. For example, high N students may need help with stress management, low C students with maintaining interest, and high E students with managing social distractions. Indeed, studies of anxiety-by-treatment interaction in education imply that educators ought to consider designing personalized learning environments matched with key personality factors (Snow, Corno, & Jackson, 1997). For example, students high in trait anxiety should benefit more from structured learning–teaching environments, whereas students low on trait anxiety (as well as those higher on E or O) should benefit from unstructured learning–teaching environments (Zeidner, 1998).
Evidence for the Validity of Personality Facets, Compound Traits, and Interstitial Constructs
We covered some of the evidence for the validity of facets of the Big 5 in the preceding section. A recurrent finding is that facets often relate more strongly to particular outcomes than do the Big 5 factors themselves (O'Connor & Paunonen, 2007), although there may be a Brunswikian symmetry explanation for this (i.e., narrow predictors predict narrow outcomes; broad predictors predict broad outcomes; Wittmann, 1988). In this section, we focus on several constructs that may be particularly important in educational achievement and therefore warrant further discussion. It is unclear whether these are best classified as facets, compound traits, or interstitial constructs, but the key point is that they have received particular attention for their relevance to educational outcomes.
Poor time management (e.g., not allocating time properly for work assignments, cramming for exams, and failing to meet deadlines) is cited as a source of stress and poor academic performance (e.g., Gall, 1988; Longman & Atkinson, 2004; Macan, Shahani, Dipboye, & Phillips, 1990). Several studies report correlations between time management and academic achievement in college students (e.g., Britton & Tesser, 1991; Macan et al., 1990).
Depending on how it is operationalized, time management can also be treated as a compound construct comprising multiple subscales, some of which may be conceptualized as facets of conscientiousness (MacCann, Duckworth, & Roberts, 2009). Certain subscales of time management have been found to significantly predict grades and student engagement in samples from both community and 4-year colleges (R. D. Roberts et al., 2007; R. D. Roberts, Schulze, & Minsky, 2006).
Anxiety is a robust and well-established facet of neuroticism (e.g., McCrae & Costa, 1994, 1999; Schulze & Roberts, 2006). The term test anxiety refers to the negative affect, worry, physiological arousal, and behavioral responses that accompany concern about failure or lack of competence on an exam or similar evaluative situation (Zeidner, 1998). Research has found that test anxiety has a detrimental effect on academic performance (Chappell et al., 2005; Keogh, Bond, French, Richards, & Davis, 2004). Results from meta-analyses have shown a consistent and moderate negative relationship between test anxiety and academic performance (r = −.21; e.g., Hembree, 1988; Seipp, 1991; Zeidner, 1998). Recent studies with students at community and 4-year colleges exhibit similar effect sizes (R. D. Roberts et al., 2007). Almost a third of American primary and secondary students may be affected by test anxiety (e.g., Lufi, Okasha, & Cohen, 2004).
Academic self-concept refers to a student's values, attitudes, and beliefs toward academics (Schwarzer & Jerusalem, 1989). Although the related concepts of self-efficacy and self-esteem are related to neuroticism (Judge & Bono, 2001), it is not clear whether academic self-concept is. Academic self-concept is consistently related to higher academic attainment (e.g., Zeidner & Schleyer, 1999). A substantial correlation (r = .56) has been observed between academic self-concept and school grades (Byrne & Shavelson, 1986). Academic self-concept also relates to strategic planning (Howard-Rose & Winne, 1993). Students who perceive themselves as competent are more likely to employ sophisticated planning strategies than those who view themselves as inept. However, for all of these relationships it is not clear which variable causes the other; the relationship may be reciprocal (cf. Trzesniewski, Donnellan, & Robins, 2003).
Self-efficacy was shown to be a moderate correlate of neuroticism (ρ = −.35) and extraversion (ρ = .33) in Judge and Ilies's (2002) meta-analysis. Many studies have examined the relation between self-efficacy and academic outcomes. Schunk (1989) found that students with a high sense of academic self-efficacy demonstrated greater persistence and effort in their academic learning. Students with low academic self-efficacy were more prone to becoming discouraged by challenging tasks. Besides persistence and effort, high self-efficacy may be related to the use of cognitive strategies when encountering difficult and demanding problems. Zimmerman (1998) suggested that knowledge of various self-regulation strategies may not be sufficient for successful goal attainment; students also need high self-efficacy to apply these strategies (Zimmerman, Bandura, & Martinez-Pons, 1992).
Evidence for the Importance of Formative Constructs in Academic Success
To this point the discussion has been on reflective constructs (Edwards & Bagozzi, 2000), which are the latent psychological constructs (e.g., personality, intelligence) that drive behavior; that is, they are the common cause underlying performance on items or indicators. Another kind of construct is the formative construct (or composite construct; or compound trait [Hough & Ones, 2001]), which is simply a composite or sum of a set of components, which do not necessarily even have to be correlated (Bollen & Lennox, 1991). The classic example of a formative construct is socioeconomic status, which is the sum of parental occupational status, parental education, and family wealth. (Other examples cited by Bollen and Lennox include exposure to discrimination, with indicators race, sex, age, and disability, and life stress, with indicators such as job loss, divorce, and recent bodily injury; examples from industrial–organizational psychology include integrity and customer-service orientation.) In this section, we review two formative constructs: student engagement and college readiness. These constructs are looser and perhaps more multifaceted than the personality constructs discussed thus far, but they are widely recognized in educational circles and are likely to be related to the five factors in some way.
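The arithmetic of a formative construct is simple enough to sketch in code. The following Python fragment is illustrative only: the indicator values are invented, and the simple sum of standardized indicators is just one of several possible weighting schemes. It builds an SES-like composite from three indicators that need not be correlated:

```python
import statistics

def zscores(values):
    """Standardize raw scores to mean 0, SD 1."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

def formative_composite(*indicator_columns):
    """Sum standardized indicators; unlike a reflective scale,
    the components need not share a common cause."""
    standardized = [zscores(col) for col in indicator_columns]
    return [sum(vals) for vals in zip(*standardized)]

# Hypothetical SES indicators for five families
occupation_status = [40, 55, 60, 75, 90]
parent_education = [10, 12, 16, 16, 20]
family_wealth = [20, 35, 30, 80, 120]

ses = formative_composite(occupation_status, parent_education, family_wealth)
```

Because each standardized indicator sums to zero, the composite does too; the family highest on all three indicators receives the highest composite score.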
Student engagement is a broad formative construct referring to “participating in the activities offered as part of the school program” (Natriello, 1984, p. 14). It is widely thought to be related to achievement and to staying in school (Brewster & Fager, 2000). There are several large-scale surveys of student engagement, beginning with the National Survey of Student Engagement (NSSE), which has now expanded to versions for high schools, community colleges, beginning college students, faculty, and law schools (Kuh, 2007). The surveys are designed to determine the degree to which students interact with their teachers, how they spend their time inside and outside the classroom, and how they react to the school experience.
J. Lee et al. (2007) reviewed the literature on student engagement in relation to academic achievement for the K–12 population. They organized the findings around the three components of engagement—behavioral, cognitive, and emotional—suggested by Fredricks, Blumenfeld, and Paris (2004).
Behavioral engagement includes measures tapping into facets of conscientiousness: whether students follow school rules, arrive at school on time, attend class, turn homework in on time, and avoid fights (Finn, 1993; Finn, Pannozzo, & Voelkl, 1995; Finn & Rock, 1997; Fredricks et al., 2004). Additional behaviors that may be considered indicators of behavioral engagement include paying attention in class, working hard for good grades, attempting to do work thoroughly, seeking information on one's own, trying to overcome difficulties, and actively participating in class discussions (J. Lee et al., 2007). At a more advanced level, behavioral engagement indicators include initiating academic discussions with teachers and other students, participating in school governance, taking part in subject-related extracurricular activities, and joining school athletics teams (e.g., Fredricks & Eccles, 2006).
Cognitive engagement is demonstrated by students' decisions concerning the degree of effort that they put toward schoolwork and students' expectations for academic achievement (J. Lee et al., 2007). It is indicated by preferences for challenging work, persistence in the face of failure, and intrinsic motivation toward learning beyond the desire to attain good grades (Bandura, 1997; Connell & Wellborn, 1994; Fredricks et al., 2004; Newman & Schwager, 1992).
Emotional engagement includes students' affective reactions and feelings toward school and learning in general, such as their expressed levels of interest, boredom, happiness, enthusiasm, curiosity, and anxiety (see e.g., Alexander, Entwisle, & Dauber, 1993; Connell et al., 1994; Fincham, Hokoda, & Sanders, 1989; Fredricks et al., 2004; V. E. Lee & Smith, 1995; Skinner & Belmont, 1993; Stipek, 2002). J. Lee et al. (2007) asserted that feeling proud of academic success and accomplishments, as well as a sense of belonging or identification with the school, are important indicators of emotional engagement.
This entire article is about college readiness in a broad sense, but there has also been research that can be understood as treating college readiness as a formative construct. Oswald, Schmitt, Kim, Ramsay, and Gillespie (2004) identified 12 dimensions of college performance by sorting the components of mission statements drawn from college catalogs. The sort identified components dealing with intellectual behaviors (knowledge, learning, and artistic), interpersonal behaviors (multicultural, leadership, interpersonal, and citizenship; a combination of extraversion and agreeableness facets), and intrapersonal behaviors (health, career, adaptability, perseverance, and ethics; a combination of extraversion, neuroticism, and agreeableness). They developed two noncognitive measures, a biodata inventory and a situational judgment inventory, to measure these factors. These were then administered to nearly 3,000 students at 10 different colleges and universities (Camara, Sathy, & Mattern, 2007). Results demonstrated that adding the new intellectual, interpersonal, and intrapersonal measures yielded significant incremental validity over SAT scores for all outcomes (first-year grade point average [GPA], self-rated performance, and absenteeism).
Le, Casillas, Robbins, and Langley (2005) developed the Student Readiness Inventory (SRI) and found that motivational and skill constructs clustered together in ways that parallel Big 5 facets (particularly conscientiousness: academic discipline, commitment to college, general determination, and goal striving). Each of the measure's 10 subscales had a positive relationship with GPA, retention, or both. Evidence of curvilinear relationships was also found for emotional control (a proxy for neuroticism) and social activity (a proxy for extraversion), with extreme values of these factors being detrimental.
Several studies conducted by ACT (Robbins, Allen, & Sawyer, 2007; Robbins et al., 2004, 2006) found that motivational factors provided up to a 10% increase in variance explained for predicting academic performance and up to 4% for college persistence. Other motivation and skill scales also showed positive associations with academic performance. A one standard deviation increase in academic discipline and commitment to college increased the odds of sophomore retention by over 30%. This was true even after controlling for institutional variation and academic achievement.
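As a rough illustration of the arithmetic behind such odds statements, consider a logistic regression with a standardized predictor. The coefficient below is hypothetical, chosen only so that the resulting odds increase is in the same ballpark as the "over 30%" figure cited above; it is not an estimate from the ACT studies:

```python
import math

def odds_increase_per_sd(beta):
    """Percent change in odds for a one-SD increase in a standardized
    logistic-regression predictor: exp(beta) is the odds ratio."""
    return (math.exp(beta) - 1) * 100.0

def retention_probability(base_prob, beta, sd_change=1.0):
    """Shift a baseline retention probability by sd_change standard
    deviations on the predictor, working on the log-odds scale."""
    log_odds = math.log(base_prob / (1 - base_prob)) + beta * sd_change
    return 1 / (1 + math.exp(-log_odds))

# Illustrative coefficient: beta = 0.27 corresponds to roughly a 31%
# increase in the odds of retention per SD of academic discipline.
beta = 0.27
```

Note that a 31% increase in odds is not a 31% increase in probability: a student with a 70% baseline retention probability moves to roughly 75%, because probability compresses near the ceiling.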
The Third Pillar: Study Habits, Skills, and Attitudes
Crede and Kuncel (2008) conducted an extensive meta-analysis of the relationship of a wide variety of study habits, study skills, and attitude measures (SHSA) with collegiate GPA and grades in individual classes. They found strong relationships of SHSA and study motivation with those outcomes, independent of high school grades and standardized test scores. Crede and Kuncel concluded that this formative construct, SHSA, was such a powerful predictor of college achievement that it ought to be considered the third pillar, alongside grades and test scores, in college admissions. SHSA overlapped with personality factors in meaningful ways, but there were not sufficient data to test a full model linking personality with SHSA and achievement outcomes. However, Crede and Kuncel did suggest that SHSA might mediate the relationship between personality and achievement.
This section examined evidence that personality and related noncognitive factors are related to educational outcomes. We found considerable evidence that such relationships exist and in many cases are fairly strong. Conscientiousness in particular, and especially its facets of achievement striving, self-discipline, and diligence, has been shown repeatedly to predict academic success from early grades through graduate school. Conscientiousness predicts academic outcomes even after controlling for prior academic history and standardized test scores. Other factors of the Big 5 have been less consistent in their prediction of school outcomes, but some evidence exists that neuroticism, particularly its anxiety and impulsiveness facets, may impair learning, and that openness may enhance it. Numerous suggestions pointed to factors that might mediate the relationship between these personality factors and achievement. For example, conscientiousness may cause greater achievement by increasing the expenditure and regulation of effort, leading to greater persistence and higher perceived ability, by influencing class attendance or even by leading to a more regular sleep cycle. Neuroticism may cause poor study attitudes that in turn can lead to decreased achievement. Both factors may underlie the development of good SHSA and study motivation, which have as a group been found to be powerful determinants of academic achievement. Other factors, such as time management, self-efficacy, and academic self-concept, along with academic discipline, commitment to college, and interpersonal and intrapersonal behaviors, have also been found to relate to academic achievement.
The evidence reviewed here is primarily correlational, as personality traits cannot readily be manipulated experimentally. Still, the correlational evidence seems sufficiently well established and large in magnitude to warrant moving forward with psychosocial assessments, and taking the next steps to determine whether modifying psychosocial factors will result in improved achievement and other outcomes. We return to this topic in the Improvement section of this report. But first, we review ways to assess noncognitive factors.
Noncognitive characteristics can be assessed in many different ways, with self-assessments, interviews, and behavioral observations representing just a few of them. In this section, we briefly describe both well-established and recently developed assessment tools. A commonly used classification of assessment methods is Block's (1971) LOTS system of organizing assessment techniques: L = life data (e.g., biodata, letters of recommendation, transcripts); O = observer data (behavioral observations, projective tests, open-ended text); T = test data (situational judgment tests, implicit association tests [IATs], the go/no-go association test); S = self-report (standard personality inventories). However, we find it useful to organize assessments in a two-by-two table, by source (self or other) and type (ratings or performance), as shown in Table 3. This is not an exhaustive list of assessments, but it probably covers the most commonly used ones.
Table 3. Source × Type Organization of Assessment Methods
Different assessments do not always give the same score on a trait. For example, self-ratings of cognitive ability were found to have a correlation of only r = 0.25 (average over 55 studies) with actual ability test performance (Mabe & West, 1982), although this can be increased with certain methodological procedures. A meta-analysis found that the correlation between IAT and self-assessment measures was only about 0.24 (Hofmann, Gawronski, Gschwendner, Le, & Schmitt, 2005). It is not necessarily the case that one method is better than the other. Sometimes, two measures independently predict outcome criteria, each adding variance to the other. For example, Bratko, Chamorro-Premuzic, and Saks (2006) found that both self-reported and peer-rated conscientiousness predicted school performance (controlling for intelligence) independent of one another.
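The point that two modestly correlated measures can each add predictive variance can be made concrete with the standard two-predictor regression formula. The correlation values in this Python sketch are hypothetical, not the actual estimates from Bratko, Chamorro-Premuzic, and Saks (2006):

```python
def r_squared_two_predictors(r1, r2, r12):
    """Criterion variance explained by two predictors jointly, given
    their criterion correlations (r1, r2) and their intercorrelation
    (r12), from the standard two-predictor multiple-R formula."""
    return (r1**2 + r2**2 - 2 * r1 * r2 * r12) / (1 - r12**2)

# Hypothetical values: self-rated and peer-rated conscientiousness
# each correlate .30 with grades but only .40 with each other.
r2_both = r_squared_two_predictors(0.30, 0.30, 0.40)
r2_self_only = 0.30**2
incremental = r2_both - r2_self_only  # variance added by the peer rating
```

With these illustrative numbers, the two ratings together explain about 13% of the variance in grades, versus 9% for the self-rating alone; the lower the intercorrelation between the two measures, the larger the increment.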
Faking is also an issue in noncognitive assessment, particularly when used in high-stakes applications. We will devote special attention to the fakability of each of the assessment methodologies. Additionally, we will discuss psychometric topics related to the prediction of educational outcomes.
Self-assessments are the most widely used approaches for capturing students' noncognitive characteristics. Most insights concerning the relationship between noncognitive qualities and educational or work-related outcomes stem from research conducted with questionnaires. Self-assessments usually ask individuals to describe themselves by answering a series of standardized questions. The answer format is mostly a Likert-type rating scale, but other formats may also be used (such as yes/no or open answer). Typically, questions assessing the same construct are aggregated; this aggregated score serves as an indicator of the relevant personality domain.
Self-assessments are a relatively easy, cost-effective, and efficient way of gathering information about the individual. However, many issues need to be taken into account when developing a psychometrically sound questionnaire, and there is a large literature on issues such as the number of points on a scale, scale point labels, the neutral point, alternative orderings, and others (Krosnick, Judd, & Wittenbrink, 2005). For instance, response scale format influences responses (Rammstedt & Krebs, 2007). Whether one should use positively and negatively keyed questions (to counter acquiescence) is still controversial (e.g., Barnette, 2000; DiStefano & Motl, 2006). Respondents also vary in their use of the scale: for example, young males tend to use extreme answer categories (Austin, Deary, & Egan, 2006), as do Hispanics (Marin, Gamba, & Marin, 1992), and large cultural effects on response style are apparent in general (Harzing, 2006).
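The basic aggregation logic described above, including the reverse-scoring of negatively keyed items, can be sketched as follows (the item content, scale length, and responses are hypothetical):

```python
def score_item(response, reverse=False, scale_min=1, scale_max=5):
    """Score one Likert response; reverse-keyed items are flipped so
    that a higher score always indicates more of the trait."""
    return (scale_max + scale_min - response) if reverse else response

def scale_score(responses, reverse_keys):
    """Aggregate item scores into a single scale score (here, the mean)."""
    scored = [score_item(r, rev) for r, rev in zip(responses, reverse_keys)]
    return sum(scored) / len(scored)

# Hypothetical 4-item conscientiousness scale on a 1-5 Likert format;
# items 2 and 4 are negatively keyed (e.g., "I often leave tasks
# unfinished"), so a raw answer of 2 scores as 4, and 1 scores as 5.
responses = [4, 2, 5, 1]
reverse = [False, True, False, True]
trait_score = scale_score(responses, reverse)
```

This aggregated score is what then serves as the indicator of the relevant personality domain in validity analyses.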
Respondents can also fake their responses to appear more attractive to a prospective employer or institution (e.g., Griffith, Chmielowski, & Yoshita, 2007; Viswesvaran & Ones, 1999; Zickar, Gibby, & Robie, 2004), resulting in decreased validity (Pauls & Crost, 2005). We have conducted several mini-conferences addressing the faking problem and have identified several promising methods for collecting self-assessments: giving real-time warnings (Sackett, 2006), using a multidimensional forced-choice format that pits equally attractive noncognitive statements, such as “works hard” and “works well with others,” against each other (Stark, Chernyshenko, & Drasgow, 2005), and using respondents' estimates of how others will respond to help control for faking (Prelec, 2004; Prelec & Weaver, 2006). However, the effectiveness of these methods in controlling for faking remains to be demonstrated unequivocally (Converse et al., 2008; Heggestad, Morrison, Reeve, & McCloy, 2006).
Others' Ratings and Letters of Recommendation
Other-ratings are assessments in which others (e.g., supervisors, trainers, colleagues, friends, faculty advisors, coaches) rate individuals on various noncognitive qualities. This method has a long history, and countless studies have employed it to gather information (e.g., Tupes & Christal, 1961/1992). Other-ratings have an advantage over self-ratings in that they preclude socially desirable responding by the ratee, although they do permit rating biases. Self- and other-ratings do not always agree (Oltmanns & Turkheimer, 2006), but other-ratings are often more predictive of outcomes than are self-ratings (Kenny, 1994; Wagerman & Funder, 2007).
Letters of recommendation can be seen as a more subjective form of others' ratings and have been extensively used in a broad range of situations (Arvey, 1979). Letters of recommendation provide stakeholders with detailed information about the applicant's past performance, with the writer's opinion about the applicant being expressed in the form of an essay. In response to a major drawback of letters of recommendation—namely, their nonstandardized format—a more structured system, initially coined the Standardized Letter of Recommendation (e.g., Walters, Kyllonen, & Plante, 2003, 2006), and now the Educational Testing Service Person Potential Index (ETS® PPI, 2009), has been developed. This assessment system prompts faculty members to respond to specific items using a Likert scale, in addition to eliciting comments. It has been used operationally at ETS for selecting summer interns and fellows (Kim & Kyllonen, 2008; Kyllonen & Kim, 2004) as well as through Project 1000 for the selection of graduate student applicants (Liu, Minsky, Ling, & Kyllonen, 2007), and will supplement the GRE beginning in 2009 (see ETS, 2009; Kyllonen, 2008).
Situational Judgment Tests
A situational judgment test (SJT) is one in which participants are asked how best to deal with, or how they would typically deal with, some kind of situation. For example, a situation might be a group project in which one member did not help out, and the possible responses are to talk to the nonparticipating member in private or in front of the group, or to let the incident pass without comment. Situations can be described in words or videotaped, and responses can be multiple choice, constructed response, ratings (how good would this response be?), and so forth (McDaniel, Morgeson, Finnegan, Campion, & Braverman, 2001). As such, SJTs can be regarded as fairly simple, economical simulations of job tasks (Kyllonen & Lee, 2005).
These SJTs may be developed to reflect more subtle and complex judgment processes than are possible with conventional tests. The methodology of SJTs enables the measurement of many relevant attributes of individuals, including leadership, the ability to work with others, achievement orientation, self-reliance, dependability, sociability, agreeableness, social perceptiveness, and conscientiousness (e.g., Oswald et al., 2004; Waugh & Russell, 2003). Numerous SJTs, ranging from print-based measures of business analysis and problem solving (Kyllonen & Lee, 2005) to video-based measures of communication skills (Kyllonen, 2005), have been developed.
Also, SJTs have been shown to predict many different criteria, such as college success (Lievens & Coetsier, 2002; Oswald et al., 2004), army leadership (Krokos, Meade, Cantwell, Pond, & Wilson, 2004; Legree, 1995), and managerial performance (Howard & Choi, 2000). Though applications in education have been relatively limited, the use of SJTs as predictors in educational domains has received increased interest (Lievens, Buyse, & Sackett, 2005a; Oswald et al., 2004), partly because evidence shows that SJTs have both predictive (see Sternberg et al., 2000) and consequential (see Etienne & Julian, 2005) validity.
Research on SJTs has revealed that respondents are able to improve their scores on a retest (Lievens, Buyse, & Sackett, 2005b) or after coaching (Cullen, Sackett, & Lievens, 2006), although the improvement may be small (d = .25). SJTs nevertheless appear less susceptible to faking than self-assessments, for which score improvement due to incentives can be up to a full standard deviation.
Biographical data (biodata) have been or are being explored for college admissions use in the United States (Oswald et al., 2004) and Chile (Delgalarrando, 2008). Biodata are typically obtained by asking standardized questions about individuals' past behaviors, activities, or experiences. A sample question could be: How often in the last two weeks have you eaten fast food? Respondents are given multiple-choice answer options or are requested to answer in an open format (e.g., a frequency). ETS (Baird & Knapp, 1981; Stricker, Rock, & Bennett, 2001) developed a biodata (documented accomplishments) measure that produced scores for six scales: academic achievement, leadership, practical language, esthetic expression, science, and mechanical. For the leadership scale, a sample item was “Was on a student–faculty committee in college. Yes/No. If yes: position, organization, and school?”
Measures of biodata have been found to be incrementally valid beyond SAT and the Big 5 in predicting students' performance in college (Oswald et al., 2004). Obviously, biodata can be faked, but faking can be minimized in several ways (e.g., Dwight & Donovan, 2003; Schmitt, Oswald, Kim, Gillespie, & Ramsay, 2003). Asking students to verify with details, for example, can minimize faking.
Transcripts contain information on the courses students have taken, earned credits, grades, and GPA. As official records, transcript information can be taken as more accurate than self-reports. Transcript data can be standardized and used in validity studies. For example, the U.S. National Center for Educational Statistics supports an ongoing collection of transcripts (the NAEP High School Transcript Study, http://nces.ed.gov/nationsreportcard/hsts/), which classifies courses, computes GPA, and links resulting data to NAEP achievement scores (Shettle et al., 2007).
Interviews are the most frequently used method of personnel selection in industry (Ryan, McFarland, Baron, & Page, 1999), but they are also used for admissions, promotions, scholarships, and other awards. Interviews vary in their content and structure. In a structured interview, questions are prepared before the interview starts. An unstructured interview simply represents a free conversation between an interviewer and interviewee giving the interviewer the freedom to adaptively or intuitively switch topics. Research has shown that unstructured interviews lack predictive validity (Arvey & Campion, 1982) or show lower predictive validity than structured interviews (Schmidt & Hunter, 1998). Best practices for conducting interviews are summarized as follows (from Schuler, 2002):
High degree of structure
Selection of questions according to job requirements
Assessment of aspects that cannot be better assessed with other methods
Scoring with pretested, behaviorally anchored rating scales
Empirical examination of each question
Rating only after the interview
Training of interviewers
Structured interviews can be divided into three types: the behavioral description interview (BDI; Janz, Hellervik, & Gillmore, 1986), situational interview (SI; Latham, Saari, Pursell, & Campion, 1980), and multimodal interview (MMI; Schuler, 2002). The BDI (also referred to as the job-related interview) involves questions that refer to past behavior in real situations. The situational interview uses questions that require interviewees to imagine hypothetical situations (derived from critical incidents) and state how they would act in such situations. The multimodal interview combines the two approaches and adds unstructured parts to ensure high respondent acceptance.
Meta-analyses of the predictive validity of interviews for job performance (Huffcutt, Conway, Roth, & Klehe, 2004; Marchese & Muchinsky, 1993; McDaniel, Whetzel, Schmidt, & Maurer, 1994; Schmidt & Hunter, 1998) show that structured interviews (a) are good predictors of job performance (corrected correlation coefficients range from .45 to .55), (b) add incremental validity above and beyond general mental ability, and (c) show higher validity when conducted as BDIs than as SIs. Interviews are less predictive of academic than of job-related outcomes (Hell, Trapmann, Weigand, & Schuler, 2007). Predictive validity probably also depends on the content of the interview, but the meta-analyses cited here aggregated interviews with different contents.
Behavioral observations entail watching observable activities of individuals and keeping records of the relevant activities (Cohen & Swerdlik, 2005). Records can vary from videos, photographs, and cassette recordings to notes taken by the observer. The general assumption behind this method is that individuals vary in observable behaviors; this variation is stable over time and across different situations and, thus, can be regarded as an indicator of a personality trait (Stemmler, 2005).
One form of behavioral observation often used in selection is the assessment center. Assessment centers can comprise many different methods (including achievement tests), but they feature role play and presentation tasks. In these tasks, participants are asked to act in a simulated situation. These situations are designed in such a way that a certain behavior can be highlighted (e.g., assertiveness). A meta-analysis showed that assessment centers moderately predict job and training performance but do not add incremental validity beyond general mental ability (Schmidt & Hunter, 1998).
A strength of assessment centers for measuring personality is that they are performance based rather than opinion based, as self-assessments are. As such, they are less easily faked than self-assessments. A drawback, however, is that they assess maximum performance, which may not be representative of typical behavior.
In this section, we present newly developed, innovative ways for measuring personality traits. The methods presented below do not require self-reports. Rather, noncognitive qualities are inferred from other variables such as reaction times. The measurement objective is not obvious to the participants, and thus, these measures may be less susceptible to faking (e.g., Ziegler, Schmidt-Atzert, Bühner, & Krumm, 2007). For this and other reasons, these methods may be of potential use in various assessment situations. So far, validity evidence is relatively sparse and more research examining their usability in high-stakes assessment situations is needed.
Implicit Association Tests
The IAT (Greenwald, McGhee, & Schwartz, 1998) has become a remarkably popular method for researching noncognitive factors, particularly attitudes, having been examined in more than 250 empirical studies (Greenwald, Nosek, & Sriram, 2006). IATs record the time it takes to classify pairs of stimuli (e.g., words, pictures), which is treated as an indirect measure of whether a participant sees the stimuli as naturally associated. Thus, IATs measure the strength of implicit associations, for example, to gauge attitudes, stereotypes, self-concepts, and self-esteem (Greenwald & Farnham, 2000; Greenwald et al., 2002).
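The logic of reaction-time scoring can be illustrated with a simplified version of the IAT's D measure. This sketch omits the error penalties and latency trimming of the full published scoring algorithm, and the latencies below are invented for illustration:

```python
import statistics

def iat_d_score(compatible_rts, incompatible_rts):
    """Simplified IAT D score: the difference in mean response latency
    between the incompatible and compatible pairings, divided by the
    SD of all latencies pooled across both blocks. A larger positive
    value indicates a stronger association in the compatible direction."""
    pooled_sd = statistics.stdev(compatible_rts + incompatible_rts)
    mean_diff = statistics.mean(incompatible_rts) - statistics.mean(compatible_rts)
    return mean_diff / pooled_sd

# Hypothetical classification latencies (ms) for one respondent
compatible = [610, 650, 590, 700, 640]
incompatible = [820, 760, 900, 850, 780]
d = iat_d_score(compatible, incompatible)  # positive: compatible block faster
```

Dividing by the respondent's own latency variability, rather than using the raw millisecond difference, is what makes scores comparable across fast and slow responders.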
The IATs generally exhibit reasonably good psychometric properties. Meta-analyses have shown they have high internal consistencies (.8 to .9; Hofmann et al., 2005), although somewhat lower test-retest reliabilities (.5 to .7), a common finding in reaction time research. The IATs predict a wide variety of criteria, particularly spontaneous (as opposed to controlled) behavior (Bosson, Swann, & Pennebaker, 2000; Gawronski & Bodenhausen, 2006; McConnell & Leibold, 2001). However, they have not been used in studies of educational outcomes. The Hofmann et al. (2005) meta-analysis estimated the correlation between implicit (IATs) and explicit (self-reports) measures of personality to be .24, with about half of the variation in that correlation attributable to moderating variables.
The promise of the IAT is that, as a performance measure, it should be less susceptible to faking. However, research has shown that the IAT is fakeable to a certain extent (Fiedler, Messner, & Bluemke, 2006). Given that, given that controversy still exists about what the IAT measures (Rothermund & Wentura, 2004), and given that a great deal of method-specific (construct-irrelevant) variance is associated with IATs (Mierke & Klauer, 2003), it is clear that more research is needed before IATs (and their cousins, such as the Go/No-Go Association Task; Nosek & Banaji, 2001) can be regarded as viable tools in applied assessment contexts.
Conditional Reasoning Tests
Conditional reasoning tests (CRTs) are multiple-choice tests consisting of items that look like reading comprehension or logical reasoning items but that actually measure worldview, personality, biases, and motives (James, 1998; LeBreton, Barksdale, & Robin, 2007). Following a passage and a question, the CRT presents two or three logically incorrect alternatives and two logically correct alternatives that reflect different worldviews. Participants are asked to state which alternative seems most reasonable based on the information given in the text. Thus, respondents believe that they are solving a problem by reasoning about it, not realizing that there are two correct answers and that their selection is guided by the implicit assumptions underlying the answer alternatives.
Participants select one of the logically correct alternatives, presumably according to their underlying beliefs, rationalizing the selection through the use of justification mechanisms. For example, an examinee might select an aggressive response to a situation, justifying it as an act of self-defense or retaliation (LeBreton et al., 2007). These justification mechanisms serve to reveal hidden or implicit elements of personality. To illustrate this idea, consider the example from LeBreton et al. (2007) in Figure 1.
Alternatives A and D can be ruled out on logical grounds. Both B and C could be considered logically correct (or at least not incorrect), but they reflect different perspectives. Of the two responses, selecting C is taken as an indicator of aggression, because doing so reflects a justification that the spouse has hostile intentions. A score reflecting an individual's level of aggression is obtained by aggregating the answers to several such items.
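The aggregation step described above can be sketched as a simple keyed count. The item numbers, answer keys, and responses below are entirely hypothetical and are not actual CRT content.

```python
# Hypothetical answer key for a conditional reasoning test of
# aggression: for each item, the alternative keyed as reflecting
# an aggressive (vs. non-aggressive) justification mechanism.
AGGRESSIVE_KEY = {1: "C", 2: "B", 3: "D", 4: "C"}

def aggression_score(responses):
    """Count how often the respondent chose the alternative keyed
    as aggressive. (Choices of logically incorrect alternatives
    simply fail to match the key and contribute nothing.)"""
    return sum(1 for item, answer in responses.items()
               if AGGRESSIVE_KEY.get(item) == answer)

responses = {1: "C", 2: "A", 3: "D", 4: "B"}  # one respondent
print(aggression_score(responses))  # 2 of 4 aggressive choices
```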
The CRT for aggression has been shown to be unrelated to cognitive ability, reliable, and valid for predicting different behavioral manifestations of aggression in the workplace (average r over 10 studies = .44; James, McIntyre, & Glisson, 2004). Most of the research on CRTs has been in measuring aggression or achievement motivation (James, 1998). However, the method has proven difficult to replicate (Gustafson, 2004). Also, as with IATs, the promise of resistance to faking has not been established (LeBreton et al., 2007), and thus it seems that CRTs may need further work before being used in high-stakes assessment situations.
Objective Personality Tests
Objective personality tests (OPTs) can be defined as “personality tests that assess an individual's behavior in a standardized situation without requiring the individual to rate his/her behavior in a self-report” (Schmidt & Wilson, 1975, p. 19). They were proposed and evaluated by Cattell (1957, 1973) in a programmatic effort in the 1950s. Although little attention has been given to OPTs since that time, a minor revival has occurred recently because of the ease with which computer-based versions can be developed. An example is the Objective Achievement Motivation Test (Schmidt-Atzert, 2006). Participants are asked to perform a challenging task involving pressing buttons as fast as possible to move an object along a road displayed on the computer screen. In the next trial, the same task is performed, but participants must specify their performance goals in advance. Next, a computerized competitor is displayed. The change in performance between these trials serves as an indicator of different aspects of achievement motivation.
To date, few OPTs have been developed, and those that have been developed have not yielded sufficient validity evidence. The promise is that they are resistant to faking (Ziegler et al., 2007), but the lack of success in this realm suggests that they are not yet a viable personality assessment method.
Assessing Emotions With Writing/Speaking
On the basis of the idea that what we write and say and how we write and say it reflects our personality, Pennebaker and colleagues (Chung & Pennebaker, 2007) have embarked on a research program involving correlating words and word types from open-ended writing (e.g., emails) with personality and behavioral measures. They have found that the use of particular function words (e.g., pronouns, adjectives, articles) is related to individuals' affective states, reactions to stressful life events, social stressors (see also Mehl & Pennebaker, 2003), demographic factors, and biological conditions. For example, in email correspondence, the use of “I” is associated with depression and with writing to a superior. Moreover, word choices can be used to detect deception (Hancock, Curry, Goorha, & Woodworth, 2004; Newman, Pennebaker, Berry, & Richards, 2003). The volume of material available, for example, over the Internet, and the availability of inexpensive automated classification tools provide plenty of research opportunities to continue to identify these kinds of correlations. The magnitudes of the relationships found tend to be small, but the method is nevertheless intriguing, and its low cost and unobtrusive nature suggest that it may lead to applied assessment applications in the future.
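At its core, this word-classification approach counts the relative frequency of words falling into predefined categories. A minimal sketch follows; the category word lists here are illustrative stand-ins for the validated dictionaries used by Pennebaker and colleagues.

```python
import re

# Illustrative (not validated) function-word categories.
CATEGORIES = {
    "first_person": {"i", "me", "my", "mine", "myself"},
    "articles": {"a", "an", "the"},
}

def category_rates(text):
    """Return each category's share of all word tokens in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return {name: sum(w in wordset for w in words) / len(words)
            for name, wordset in CATEGORIES.items()}

sample = "I think my results show the effect I expected."
rates = category_rates(sample)
print(rates["first_person"])  # proportion of first-person pronouns
```

In practice, such rates would be computed over large corpora and correlated with personality or behavioral measures, which is where the (typically small) relationships noted above emerge.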
Time Use: Day Reconstruction Method
A relatively new behavioral science domain concerns how people use their time. An assessment technique is the Day Reconstruction Method (DRM; Kahneman, Krueger, Schkade, Schwarz, & Stone, 2004). The DRM assesses how people spend their time and how they experience the various activities and settings of their lives. It combines features of two other time-use techniques: time-budget measurement (the respondent estimates how much time is spent on various categories of activities) and experience sampling (the respondent records his or her current activities when prompted to do so at random intervals throughout the day). The DRM requires that participants systematically reconstruct their activities and experiences of the preceding day with procedures designed to reduce recall biases.
When using the DRM, a respondent first recreates the previous day by producing a confidential diary of events. Confidentiality encourages respondents to include details they may not want to share. Next, respondents receive a standardized response form and use their confidential diary to answer a series of questions about each event, including (a) when the event began and ended, (b) what they were doing, (c) where they were, (d) with whom they were interacting, and (e) how they felt on multiple affect dimensions. The response form is returned to the researcher for analysis. In addition, respondents answer a number of demographic questions.
Respondents complete the diary before they are informed about the content of the standardized response form, so as to minimize biases. A study of 909 employed women showed that the DRM corresponds closely with experience sampling methods (Kahneman et al., 2004). The DRM is a time-consuming and intrusive form of assessment that requires significant effort from respondents, and more research is needed to establish the psychometric qualities of the method. However, initial evidence suggests that this method is effective in assessing characteristics that are otherwise difficult to capture, such as sense of well-being and mood (Belli, 1998; Kahneman et al., 2004).
A wide variety of both conventional and novel methods for assessing noncognitive skills has been presented in this section. Self-assessments are the most common and are likely to be useful in any kind of noncognitive assessment system, particularly when the stakes are not high. Also, SJTs are becoming an increasingly popular way to measure noncognitive factors. They have been used in so many studies over the past 10 years or so that the methodology for developing them is now fairly well established, and the measures are becoming increasingly reliable and valid. It is probably useful to supplement self-assessments with a different kind of assessment, such as an SJT, if only to reduce measurement method bias. Others' assessments, such as teacher ratings and interviews, are also quite useful, and as discussed, they are currently the most viable for high-stakes selection applications. However, they place a high burden on the rater or the person conducting the interviews. Where that cost is too high, a strategy in some applications might be to use them as an occasional assessment, examining their relationship to self-assessments and SJTs for a subset of participants.
Other assessments reviewed here, such as CRTs and the IATs, are intriguing and may potentially be quite useful. This is also true of time-use methods and word classification methods. However, all of these are still in a research status, and may have to undergo additional evaluations before being employed operationally.
Does Personality Change?
Freud suggested that the basic tendencies of personality were set by age 5, and since then there has been a broadly held assumption that personality remains stable across the lifespan. This assumption has had some empirical support. For example, McCrae and Costa (1994) summarized the results of a score of longitudinal studies concluding that (a) individual differences are stable from early childhood but especially after age 30, (b) personality level is fixed by age 30, (c) these findings are true for all the Big 5 factors, and (d) with the exception of those suffering from dementia and other psychiatric conditions, these findings are true for just about everyone. Depending on one's personality, this could either be good or bad news, but as McCrae and Costa (1994) put it, “Individuals who are anxious, quarrelsome, and lazy might be understandably distressed to think that they are likely to stay that way” (p. 9).
More recently, two meta-analyses have provided us with a more complete and perhaps more hopeful picture of how personality develops over the lifespan. Consider that there are two basic ways in which personality change can be observed with age. Personality level can increase (or decrease), and rank orderings of personality can be preserved or can change. By analogy, consider the trait of height. People get taller with age, to a certain point, and so in that sense, height changes with age—there is a mean-level change in height. But the rank ordering of people is probably fairly stable—2-year-olds who are at the 95th percentile in height for their age are likely to be tall as adults. These two forms of change—rank order and mean-level change—are in principle independent. Personality might change with age but rank orderings could remain fairly stable. Or personality could change in one direction for some people, and in the opposite direction for others, leading to no mean-level change but instability in the rank ordering of people.
The rank ordering stability of personality was investigated by B. W. Roberts and DelVecchio (2000) who examined 152 longitudinal studies in which personality was assessed twice, with a lag of at least 1 year between assessments. Studies varied in the age at which participants were assessed (ranging from 6 weeks to 73 years old), and in the lag between the two assessments (on average the lag was 7 years, but it ranged from 1 to 53 years). A total of over 3,000 test-retest correlations were analyzed. They found that the rank order consistency (test-retest correlation) was .31 in childhood, .54 in college, .64 by age 30, and .74 by ages 50–70. (In another analysis they also statistically controlled for lag, but this did not make much difference for these estimates.) Interestingly, there was not much of an effect for whether it was a self-assessment or observer rating, suggesting that the findings were not due to individuals simply remembering how they had answered in the previous assessment. Overall these results suggest some consistency in the rank order of people in personality across the lifespan, but certainly considerable room exists for change, particularly through early adulthood.
Mean-level change in personality over the lifespan was investigated in a follow-up meta-analysis (B. W. Roberts, Walton, et al., 2006), based on longitudinal studies selected using criteria similar to those from the previous study (B. W. Roberts & DelVecchio, 2000). They examined 92 studies with over 50,000 participants and 1,600 estimates of change (with a median lag of 6 years). They examined change on each of the Big 5 factors, but with extroversion divided into two facets—social dominance (independence, self-confidence) and social vitality (sociability, positive affect, gregariousness, energy level)—following a suggestion that these two facets develop in opposite directions (Helson & Kwan, 2000). For each sample and for each personality factor, B. W. Roberts, Walton, et al. (2006) computed change scores as the difference between the Time 2 and Time 1 mean scores divided by the standard deviation of the Time 1 scores. A plot of their findings (cumulative change across the lifespan) is shown in Figure 2.
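The change metric just described, a standardized mean difference using the Time 1 standard deviation, can be written out directly; the scores below are hypothetical.

```python
from statistics import mean, stdev

def mean_level_change(time1_scores, time2_scores):
    """Standardized mean-level change: (M2 - M1) / SD1.
    Positive values indicate an increase in the trait over time."""
    return (mean(time2_scores) - mean(time1_scores)) / stdev(time1_scores)

# Hypothetical conscientiousness scores for one longitudinal sample
t1 = [3.1, 3.4, 2.9, 3.6, 3.0]
t2 = [3.5, 3.6, 3.3, 3.8, 3.4]
print(round(mean_level_change(t1, t2), 2))
```

Note that such a sample could show substantial mean-level change while preserving the rank order of individuals, or vice versa, which is why the two meta-analyses address logically independent questions.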
It can be seen that personality changed throughout the lifespan, particularly in young adulthood. Individuals became more socially dominant, conscientious, agreeable, and emotionally stable, particularly in adolescence and early adulthood. Change over the lifespan was up to a full standard deviation, a large change. Social vitality did not grow and in fact seems to have trended downward. Openness showed a curvilinear relationship with age.
Taking the findings from both meta-analyses together, we can conclude that personality changes in two ways over the course of the lifespan: the rank order of individuals changes (some individuals increase and some decline), and on top of that there are mean-level changes for all the Big 5 factors. This suggests that Freud's idea that personality is fixed at age 5 is incorrect, and that McCrae and Costa's (1994) conclusions concerning the “recognition of the inevitability of his or her one and only personality” and the claim that “few findings in psychology are more robust than the stability of personality” were perhaps overly pessimistic.
How Does Personality Change?
The two meta-analyses suggest that over time people tend to get more conscientious, emotionally stable, agreeable, and socially dominant, on average, but that some people go in the opposite direction. Why would that be? One proposal (B. W. Roberts, Walton, et al., 2006) is based on the fact that personality change occurs particularly during early adulthood. This might suggest that life experience or social role changes during that period are responsible. For example, transitioning from being in school to getting a job or going to college or transitioning from living at home to living on one's own means that one has to learn to be more conscientious (showing up for work on time) and agreeable (able to work with others) in order to be successful in those environments. Different experiences and different social role transitions might then be partly responsible for different personality trajectories. In addition to changing roles, watching ourselves (i.e., watching how others react to what we do), watching others (i.e., modeling others' behavior), and listening to others who provide feedback on how we should change can lead to personality changes (B. W. Roberts, Wood, & Caspi, 2008; Table 14.3).
Thus, with respect to the framework we outlined in Section II, personality change can be seen as a response to changes in the goals and motives we experience. The advantage of goals and motives for effecting personality change is that they are easier to manipulate than is personality per se. In the sociogenomic model (B. W. Roberts, 2009; B. W. Roberts & Caspi, 2003), environments or situations affect thoughts, feelings, and behaviors, but traits serve as a countervailing force inhibiting personality change. Situations do not immediately lead to dramatic personality change. Rather, environments affect traits to the degree to which those environments are persistent, that is, the situation stays the same for a long time. Certain situations lead to goal-and-motive changes, which in turn lead to personality changes, over time.
It is likely that personality and cognitive abilities are similar in this respect. Both are affected by goals and motivational processes—self-regulated learning programs are designed to increase ability by manipulating goals and motivational processes. In addition, consider two principles—in regard to practice and narrow domains—that govern changes in cognitive ability.
Practice Makes Perfect
Skill acquisition goes through a series of stages (Anderson, 1983; Fitts, 1964) from cognitive (conscious, declarative, deliberate) to associative (procedural, response retrieval) to autonomous (automatic, effortless), and expertise is thought to require up to 10,000 hours of practice (Ericsson, Krampe, & Tesch-Römer, 1993). Applying this principle to personality suggests that personality change occurs with practice, lots of practice, which may be provided by persistent environments; at first it may require conscious attention but over time become increasingly automatic.
Changes in Narrow Domains Are Easier Than Changes in Broad Domains
For example, Venezuela's Project Intelligence, which attempted to increase the overall intelligence of the country's school children, was more successful in teaching specific skills than in boosting broad measures of IQ (Herrnstein, Nickerson, de Sanchez, & Swets, 1986). Applying this principle to personality suggests that personality change will be easier in narrow domains, such as time management, rather than broader domains, such as conscientiousness.
We now consider the literature on interventions designed to change personality.
Interventions Aimed at Personality Change
There is evidence from a wide range of promotion, prevention, and treatment interventions that youth can be taught personal and social skills (Beelman, Pfingsten, & Lösel, 1994; Cartledge & Milburn, 1995; Collaborative for Academic, Social, and Emotional Learning, 2003; Commission on Positive Youth Development, 2005; Greenberg et al., 2003; L'Abate & Milan, 1985; Lösel & Beelman, 2003). These programs may prove particularly useful for children living in poverty and at high risk of failing in school (Heckman et al., 2007).
The personal and social skills taught cover such areas as self-awareness and self-management (e.g., self-control and self-efficacy), social awareness and social relationships (e.g., problem solving, conflict resolution, and leadership skills), and responsible decision making. Furthermore, although personality change is not the explicitly stated purpose of these interventions, in most cases it is a result. Eccles and Gootman (2002) provided a general framework describing the features of youth development programs that work including physical and psychological safety, opportunities to belong, appropriate structure, supportive relationships, positive social norms, support for efficacy, opportunities for skill building, and integration of family, school, and community efforts.
Below, we summarize interventions that have the potential to improve each personality dimension of the FFM. Specifically, we discuss existing interventions and remediation programs designed to improve academic achievement, personal skills, and social skills, and map these interventions to specific FFM factors and facets. Our mapping process will be guided by the items drawn from the IPIP (Goldberg et al., 2006) that measure constructs similar to those in the 30 NEO PI-R facet scales.
In recent years, there has been an increased demand for interventions that enhance individuals' critical thinking skills. Critical thinking can be thought of as having both skill (e.g., identifying unstated assumptions) and dispositional (e.g., being open-minded) components (Kennedy, Fisher, & Ennis, 1991). In the FFM, the dispositional component of critical thinking corresponds to the openness factor, and perhaps particularly to its intellect facet. Items such as “Like to solve complex problems,” “Enjoy thinking about things,” and “Have difficulty understanding abstract ideas” (−) measure this facet.
Both academic institutions and the Internet have become active in providing guidelines aimed at increasing and improving critical thinking skills in everyday life. Cohen, Freeman, and Thompson (1998) noted that critical thinking skills training should involve instruction, demonstration, and practice in order to most effectively help individuals identify and handle different types of information. Research has indicated that critical thinking skills successfully learned in class or through training can effectively be transferred to different domains, situations, and contexts (e.g., Fong, Krantz, & Nisbett, 1986; Kosonen & Winne, 1995; Lovett & Greenhouse, 2000). Overall, research in this area has demonstrated that critical thinking skills are learned best and successfully enhanced when diverse skills and domains are addressed throughout the teaching period.
Several studies have been conducted that demonstrate the value of teaching critical thinking skills (see Cotton, 1991, for a review). For example, the Project Intelligence intervention discussed above (Herrnstein et al., 1986) was essentially a critical thinking intervention for seventh graders, comprising a year's worth of weekly lessons on deductive and inductive reasoning, hypothesis testing, problem solving, and decision making. Students receiving this instruction demonstrated higher gains than a control group in their critical thinking skills. A meta-analysis by Haller, Child, and Walberg (1988) investigated the effect of classroom interventions that taught the metacognitive critical thinking skills of awareness (being aware of one's own cognitive activities), monitoring (checking one's own reading for comprehension), and regulating (compensating for lack of comprehension) on improved reading comprehension. Students who received instruction targeting these skills demonstrated greater gains in reading comprehension than those who were not taught these skills. Several other programs have been shown to be effective at improving critical thinking skills. Some of these include programs that focus on teaching elementary logic, inference, and transfer (applying something learned in one setting to another setting), creative thinking, and philosophical thinking (Cotton, 1991). Although these studies did not treat openness as an outcome variable per se, the findings suggest that openness and its facets may be modified through direct instruction and well-designed interventions.
Increases in conscientiousness from first year to senior year in college have been shown to correlate with a higher GPA (Robins, Noftle, Trzesniewski, & Roberts, 2005). Study skills interventions and related approaches have been used to increase the achievement striving, orderliness, self-discipline, and self-efficacy facets of conscientiousness. The achievement-striving facet is measured through such items as “Work hard,” “Do more than what's expected of me,” and “Demand quality.” Interventions geared toward the enhancement of individuals' achievement striving offer techniques that increase motivation to learn and to perform, and present strategies that help individuals set proper and realistic goals. Examples of orderliness items include “Like order,” “Do things according to a plan,” and “Love order and regularity.” Self-discipline is measured through items such as “Am always prepared,” “Carry out my plans,” and “Get to work at once.” Improving the facets of orderliness and self-discipline may aid in increasing both time management and study skills, and, ultimately, academic achievement. The facet of self-efficacy is assessed through items that include “Complete tasks successfully,” “Excel in what I do,” and “Handle tasks smoothly.” Self-efficacy has been shown to be both a consequence and an antecedent of performance. As a consequence, beliefs about one's competence on a task are influenced by individuals' prior outcomes, and therefore, self-efficacy is molded by their performance. As an antecedent, higher self-efficacy is consistently shown to lead to greater use of diverse learning strategies, increased effort, sustained persistence, and higher attainment on a variety of tasks (Bandura, 1997; S. Lee & Klein, 2002; Linnenbrink & Pintrich, 2003; Schunk, 1990, 1995).
Providing students with extensive feedback and helping them to gain initial levels of competence in a certain domain will lead to increased self-efficacy and, consequently, to higher achievement on future tasks.
There is now a growing body of research indicating that after-school programs that focus on the development of personal and social skills result in improved academic performance (Collaborative for Academic, Social, and Emotional Learning, 2003; Greenberg et al., 2003; Weissberg & Greenberg, 1998; Zins, Bloodworth, & Weissberg, 2004). Well-run academic components improve students' academic achievement, and when these components are coupled with well-run personal and social skill components, students' achievement is enhanced even more (Durlak & Weissberg, 2007). Interventions that recognize the interdependence between youths' personal and social development and their academic development can be effective. College freshmen on academic probation improved their grades and effort (academic hours attempted and earned) after enrolling in a study skills course that took this integrated approach (Lipsky & Ender, 1990). This trend persisted over two years and increased the probability of students staying in the program.
Study skills remediation programs that take into account the interaction of behavior, cognition, and various personal and environmental factors lead to the most significant improvement in students' academic performance and motivation (Dignath, Buettner, & Langfeldt, 2008). The most effective training programs are ones that provide students with feedback about their strategic learning. In addition, the instruction of action control strategies positively influences students' strategy use and, consequently, their performance. In summary, numerous intervention programs have been shown to successfully alter students' conscientiousness and thus enhance their academic achievement.
Interventions for neuroticism may involve techniques to help improve one's self-esteem, coping skills, and level of anxiety (e.g., test or math anxiety, which are highly correlated; r = .61, Hembree, 1990). In the FFM, self-esteem, coping, test anxiety, and math anxiety are represented by the facet of anxiety. Anxiety is measured with items that include “Worry about things,” “Fear for the worst,” and “Get caught up in my problems.” Coping also involves the facet of vulnerability. Examples of items that measure vulnerability include “Panic easily,” “Can't make up my mind,” and “Get overwhelmed by emotions.” Finally, a facet of neuroticism, anger, is important to consider when working on improving coping skills. Anger is measured with items such as “Get angry easily,” “Lose my temper,” and “Am often in a bad mood.”
Numerous efforts geared toward reducing students' stress and anxiety and thus promoting achievement have been undertaken by researchers and practitioners (Hembree, 1988, 1990). Several classroom interventions have attempted to reduce mathematics and science anxiety within whole classes, based on Fennema's (1989) model, which views subject-specific anxiety as caused by lack of competence in the subject domain. However, interventions that directly treat the anxiety per se appear to be more successful. Hembree's (1988) meta-analysis found a lasting reduction in test anxiety from behavioral treatments (that reduced emotionality) and cognitive-behavioral treatments (that reduced worry). For students low in test-taking skills, testwiseness training helped reduce test anxiety. The meta-analysis also found that test-anxiety reduction led to improved test performance and grades. A follow-up meta-analysis focusing on mathematics anxiety arrived at similar conclusions (Hembree, 1990). Curriculum changes focusing on improving mathematics instruction were not effective. The most effective treatments were behaviorally based, focusing on reducing emotionality. Treatment effects were long lasting and led to improved test scores.
Zeidner (1998) has indicated that, in order for an intervention to be effective, an individual must have at least some level of skill (i.e., problem solving and test taking); have at least moderate interest and motivation to participate; and be given the opportunity not only to practice the skills taught in the intervention, but also to apply them to real-world situations and evaluate them realistically. Thus, such interventions must be tailored to the individual, as no single intervention will be successful for everyone. Among the behavioral interventions the meta-analyses found to be successful, systematic desensitization was deemed most effective, whereas relaxation techniques did not lead to lower anxiety levels (Hembree, 1990; Udo, Ramsey, & Mallow, 2004). Although effective, individualized interventions are very expensive and time-consuming, and so the cost of such treatments is an issue.
Mixed approaches may be more efficient and less demanding in terms of time and economic resources. Smith, Arnkoff, and Wright (1990) suggested a multidimensional approach to intervention (e.g., focused on cognitive, emotional, academic, and social skills) as more effective than approaches with a singular focus.
In the case of coping, interventions are often designed to teach individuals how to manage the cognitive and behavioral aspects that are perceived as controllable by an individual (Compas, Connor-Smith, Saltzman, Harding Thomsen, & Wadsworth, 2001). These interventions typically include techniques that help the individual deal with and handle stress, such as positive reappraisal, problem solving, and stress avoidance (see Ayers, Sandler, West, & Roosa, 1996; Compas, 1998; Ebata & Moos, 1991; Lengua & Long, 2002; Rudolph, Dennig, & Weisz, 1995). Existing coping resources, such as optimism, self-esteem, and social support, can improve an individual's ability to manage stress and anxiety, as well as his or her ability to use appropriate coping strategies (Taylor & Stanton, 2007). Together, these studies suggest that neuroticism, particularly anxiety, may be modifiable to some extent and doing so leads to enhanced achievement and general life functioning.
Several interventions in use can influence individuals' extraversion. One such intervention is training in leadership and teamwork. In the FFM, leadership corresponds largely to the assertiveness facet of extraversion and is measured with items such as “Try to lead others,” “Can talk others into doing things,” and “Wait for others to lead the way (−).” Teamwork is represented in the facet of friendliness and is measured with items such as “Feel comfortable around people,” “Act comfortably with others,” and “Avoid contacts with others (−).” It is also represented in the facet of gregariousness and is measured with items such as “Enjoy being part of a group,” “Involve others in what I am doing,” and “Prefer to be alone (−).”
Students' leadership and teamwork potential can be improved through programs geared specifically toward leadership development. The goal of these programs is to increase student personal development and academic achievement. Cress, Astin, Zimmerman-Oster, and Burkhardt (2001) used longitudinal data from 875 students to assess whether student participation in leadership education and programs had an impact on educational and personal development. The results indicated that participants showed growth in civic responsibility, leadership skills, multicultural awareness, understanding of leadership theories, and personal and societal values.
Three common elements have been found to affect student development: having the opportunity to volunteer, being exposed to internships, and participating in group projects in the classroom (Cress et al., 2001). Buckner and Williams (1995) used a theoretical model of organizational effectiveness and leadership to examine leadership programs. They found that student leaders most often acted as mentors to others within their organization or club and least often as brokers to individuals outside their immediate unit. Furthermore, the type of organization or club, student classification, and gender produced significant differences in the leadership roles performed. Recommendations put forth by the researchers include training in areas where student leaders express self-perceived leadership role deficiencies, providing additional opportunities to perform the broker leadership role, interacting with university administrators, and including peer education with seniors and underclassmen.
Another effective way to improve leadership skills is through after-school programs (ASPs). Not only do they improve leadership skills, but ASPs can also positively affect other facets of extraversion. For instance, ASPs can increase the amount of positive interactions students have with others. Positive interactions with others are represented by the friendliness facet of extraversion, which is measured by items such as “Make friends easily,” “Act comfortably with others,” and “Am hard to get to know (−),” and also by the gregariousness facet, which is measured by items such as “Enjoy being part of a group,” “Involve others in what I am doing,” and “Want to be left alone (−).” Furthermore, ASPs can also influence students' assertiveness, another facet of extraversion.
Durlak and Weissberg (2007) conducted a meta-analysis of ASPs that seek to enhance the personal and social development of children and adolescents. Their analysis revealed significant increases in youths' positive social behaviors, which included leadership behaviors, positive social interactions, and assertiveness. At the same time, significant reductions occurred in problem behaviors and drug use. Substantial differences emerged between programs that used evidence-based approaches for skill training and those that did not. The former programs consistently produced significant improvements among participants in all of the above outcome areas (mean effect sizes ranged from 0.24 to 0.35), whereas the latter programs did not produce significant results in any outcome category. These findings have important implications for future research, practice, and policy. The first is that ASPs should contain components to foster the personal and social skills of youth, because participants can benefit in multiple ways if these components are offered. The second is that such components are effective only if they use evidence-based approaches. When it comes to enhancing personal and social skills, successful programs are SAFE, an acronym for sequenced, active, focused, and explicit. In summary, ASPs and other programs show promise in influencing students' extraversion for the benefit of important academic and life outcomes.
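For readers unfamiliar with the metric, the mean effect sizes cited above (0.24 to 0.35) are standardized mean differences. A common form (Cohen's d, shown here as a general illustration rather than as Durlak and Weissberg's exact computation) is:

```latex
d = \frac{\bar{X}_{\text{program}} - \bar{X}_{\text{control}}}{s_{\text{pooled}}},
\qquad
s_{\text{pooled}} = \sqrt{\frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2}}
```

By conventional benchmarks, values in the 0.24 to 0.35 range represent small-to-moderate effects, meaning the average participant in an evidence-based program scored roughly a quarter to a third of a standard deviation above the control-group mean.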
The interventions that influence extraversion can also have the effect of influencing many of the facets of agreeableness. The teamwork and leadership interventions described previously influence several facets of agreeableness. Specifically, these interventions influence the agreeableness facets of trust, morality, altruism, cooperation, and sympathy. Trust is measured with items such as “Trust what people say,” “Believe that others have good intentions,” and “Distrust people (−).” Morality is measured with items such as “Use others for my own ends (−),” “Put people under pressure (−),” and “Take advantage of others (−).” Altruism is measured with items such as “Anticipate the needs of others,” “Love to help others,” and “Am indifferent to the feelings of others (−).” Cooperation is measured with items such as “Can't stand confrontations,” “Hate to seem pushy,” and “Yell at people (−).” Finally, sympathy is measured with items such as “Value cooperation over competition,” “Am not interested in other people's problems (−),” and “Can't stand weak people (−).” As such, any intervention that influences leadership and teamwork should have a large influence on agreeableness.
Furthermore, ASPs have been found to influence agreeableness through their influence on fostering students' positive interactions with others and by increasing the amount of cooperative behaviors students demonstrate (Durlak & Weissberg, 2007). ASPs increase agreeableness by influencing the same five agreeableness facets that are the focus of the teamwork and leadership intervention.
An attitude is best defined as a positive or negative evaluation of a particular entity (Eagly & Chaiken, 1993), and recent research has demonstrated that student attitudes are effective predictors of valued academic outcomes (e.g., Lipnevich, Krumm, MacCann, & Roberts, 2008). For example, Lipnevich et al. (2008) found that a significant amount of variance in math scores can be explained by theory of planned behavior (TpB) components (Ajzen, 1991). Specifically, according to the TpB, attitude (positive or negative evaluation of math), subjective norms (important others' evaluations of math), and perceived control (self-efficacy toward math) predict the intention to perform a behavior. The TpB lends itself to the creation of interventions in that the model can be used to identify which component should be targeted to encourage a desired behavior (Ajzen, 2006). Once the critical attitudinal components are identified, interventions that target those components should be effective in promoting behavioral change. Adding attitude assessments and interventions to those aimed at personality change could significantly improve student academic and life outcomes, and researchers developing intervention strategies should at the very least consider including an attitude component.
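Schematically, the TpB components described above are often expressed as a weighted combination. The following is an illustrative form only: the weights are estimated empirically for each behavior and population, and this sketch omits the belief-based subcomponents of Ajzen's full model:

```latex
\text{Intention} \;=\; w_1 \cdot \text{Attitude} \;+\; w_2 \cdot \text{SubjectiveNorm} \;+\; w_3 \cdot \text{PerceivedControl}
```

Intention, together with perceived control, then predicts the behavior itself. In intervention terms, this means estimating which weighted component contributes most to intention for the target population and focusing the intervention there.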
The purpose of this section was to examine the malleability of psychosocial or personality factors and to determine whether there are established methods for improving them. A widespread perception holds that personality factors are fixed over the lifespan—we have the personality we were born with. Two meta-analyses (B. Roberts & DelVecchio, 2000; B. W. Roberts, Walton, et al., 2006) demonstrated quite convincingly that this is not the case. The correlation between personality scores measured a year or more apart is only moderate, suggesting that while there is some consistency in personality, there is also change: Some individuals increase on some personality factors, others decrease; but in general, there is change in the rank order of people over time. In addition, mean-level changes in personality occur over the lifespan—we tend to become more conscientious, considerate of others, socially dominant, and emotionally stable as we grow through adolescence and into adulthood. This growth suggests that personality in some sense may be thought of as a skill that can be developed like other skills. If so, then principles that govern cognitive skill change—such as that practice makes perfect and that narrow domains are easier to change than broad ones—may prove useful in personality development efforts.
In this section, we also reviewed the evidence that already exists for whether and how personality and psychosocial factors could be improved, which suggests interventions and policies that could be implemented in the schools. For each of the five factors, specific interventions have proven successful. These include exercises and training in critical thinking (openness), study skills (conscientiousness), test and math anxiety reduction (neuroticism), and teamwork and leadership (extraversion and agreeableness), as well as attitude change. Interventions along the lines of those described here could be evaluated in conjunction with a comprehensive psychosocial assessment system. We now turn to a specific suggestion for how we could imagine doing that.
Recommendations for Future Research
In this section we consider the findings from the literature review and our own data collections and experiences, and we synthesize them to suggest how a comprehensive psychosocial assessment scale could be developed and how it might best be used. We consider several possible uses and evaluate the prospects for each. One use would be a high-stakes assessment to supplement grades and cognitive test scores as components of the application package for admission into higher education. Another would be a noncognitive assessment for monitoring student status for institutional use, which could include a developmental scale. A third would be a noncognitive assessment that would be part of a system for developing or improving noncognitive skills.
Our experience has suggested several reasons why schools and districts might be interested in participating in a comprehensive psychosocial assessment program. Some districts are interested in accreditation issues. Presenting empirical data on student noncognitive factors rather than relying on anecdotes can assist in this process. Some districts are responding to concerns expressed by parents and the school board and need to monitor student psychosocial variables for various purposes, such as monitoring school policies. In both of these kinds of cases the availability of an institutional report characterizing student levels on various psychosocial factors (e.g., percentages in the basic, proficient, and advanced categories) is a useful incentive. In other districts, the focus is more specifically on student performance, and the provision of feedback and action plans following assessments may be part of a student readiness or college preparedness sequence.
The core assessment constructs would be the Big 5 constructs of conscientiousness, extraversion, emotional stability, agreeableness, and openness. Particular facets of these constructs could be emphasized, such as the achievement striving and dependability facets of conscientiousness as well as the anxiety facets of emotional stability. Our review provides many suggestions for which factors and facets provide the strongest correlations with academic achievement and which factors may be most susceptible to change.
Supplementary constructs could also be developed as part of a catalog available for custom assessments. These could include both applied facets (e.g., time management, test anxiety, bullying, test-taking strategies), and other factors (e.g., attitudes toward school and learning, vocational interests, outcome factors such as citizenship). These custom assessments would be used both as incentives for schools to participate, and for other purposes that did not necessitate trend or other comparisons (e.g., evaluations of one-off school policy changes or interventions).
The primary assessment types used would be self-assessments and SJTs. Self-assessments are fairly straightforward to develop. They typically can be administered at the rate of three to four items per minute, enabling a 10-minute questionnaire of approximately 30–40 items. The SJTs require slightly more time to develop. Also, SJTs take longer to administer, but 10 minutes would permit 10 items or so (e.g., almost two items per Big 5 factor). All testing likely would be by paper and pencil. In addition, teacher ratings could be used sparingly (either on a subset of students or with specific classes or schools) as a quality control mechanism against which to evaluate data from self-assessments and SJTs.
Several experimental methods were reviewed in this prospectus, but these are not yet ready to be used in an operational testing program. However, such measures could be administered in conjunction with the activities outlined here, or as separate activities, in a research and development context.
Although not the focus of this review, high-stakes noncognitive assessments are a potential application. Other than letters of recommendation, high-stakes assessments of noncognitive factors have not been implemented in any large-scale sense in education thus far. One barrier has been that the most common noncognitive assessments—self-assessments—are fairly easily faked and coached. However, ratings by others are less susceptible to this validity threat. With the growing appreciation of the importance of noncognitive factors in education, it is not surprising that a large-scale, high-stakes noncognitive assessment, based on faculty ratings, will be implemented this year for graduate school admissions (ETS, 2009). If successful, it is reasonable to expect that a similar application for college could follow. A benefit of noncognitive high-stakes assessments is that they signal to the student the importance of noncognitive factors.
One can imagine developmental scales, based on annual noncognitive assessments, made available to students or institutions to assist in monitoring student progress from middle school to high school graduation. Developmental scales with norms could be presented for each of the core psychosocial factors and perhaps additional factors. These kinds of data would enable comparisons between schools, districts, and even states, and enable trend comparisons.
It would be important to present these constructs and facets to educational users (teachers as well as school, district, and state administrative personnel) to find language that would be most useful for score reporting. For example, terms like neuroticism might not be readily accepted within the context of the public school system. Terminology could be developed in focus groups and telephone surveys.
A useful supplement to developmental scales would be the provision of proficiency standards for different target groups. Proficiency standards (e.g., basic, proficient, advanced) could be established based on contrasts with populations from various workforce sectors (e.g., healthcare, personal services, retail, manufacturing, professional) or 2-year and 4-year college populations. Alternatively, they could be established with other methodologies commonly used in standard setting (Cizek & Bunch, 2007).
Probably the most sensible way to implement such a scheme would be to begin with a few school networks, to pilot test and improve the measures iteratively for a couple of years, and then to scale up. Incentives for school and district participation would initially be student payments, but over time the usefulness of the institutional reports would be a sufficient incentive. Another incentive would be the availability of custom assessments measuring factors that particular schools or districts might find useful, such as test anxiety, time management, bullying, and student engagement. The core assessment could be used for school, district, and state comparisons, both contemporaneously and for trends over time. Custom assessments would be used as needed by schools and districts.
The benefit of developmental scales or proficiency reports to the institution and the school is primarily the knowledge of where that institution or student stands with respect to what have proven to be the most important and general psychosocial factors. As with any large-scale assessment, such as PISA or NAEP, the provision of reliable, high quality data on these factors provides the background from which policies and interventions, carried out at the school, district, state, or even national level, can be tried out and their effects evaluated using assessment scores as the basis for the evaluation. However, it may be useful additionally to provide specific suggestions in the form of feedback and action plans that might enable students to engage in self-help or assisted improvement programs.
In industrial–organizational psychology, a literature has emerged on how such intervention programs might be developed in what has come to be known as developmental assessment centers (e.g., Thornton & Rupp, 2005). In analogy to what employees learn in such centers, one can imagine students, as a result of, and as a supplement to participating in the assessment, learning more about the psychosocial dimensions themselves, learning about their strengths and weaknesses, learning how to set goals to improve, learning how to monitor their progress in improvement, and being provided with exercises, feedback, and experiential learning activities.
In fact, we have developed programs along these lines to teach community college students how to improve their time management skills and test-taking strategies, reduce their test anxiety, and transfer these skills to their school work (R. D. Roberts et al., 2007). The program is a set of assessments and specific interventions designed to enhance psychosocial skills. In the initial version of the program, the assessments were of time management, teamwork, coping with test anxiety, and test-taking strategies. These topics were chosen because discussions with high school and community college students, teachers, faculty members, and administrators suggested that these four areas were both important and perceived as being amenable to improvement through instruction and practice. (Others under consideration were communication skills, study skills, critical thinking, problem solving, ethics, and leadership.) Interventions were designed by interviewing and conducting focus groups with experts—teachers, faculty advisors, and guidance counselors. Experts were presented with profiles of hypothetical students (e.g., displaying high test anxiety and poor time management skills) and then asked what kind of feedback, advice, and exercises they might suggest to such students. We captured and aggregated these suggestions, then rewrote them in the form of feedback and action plans that could be given to students in an article format or in an online system. What is unknown at this time is the effectiveness of systems like this in developing students' psychosocial skills.
Several risks are associated with a comprehensive psychosocial assessment effort such as the ones outlined here. The first risk is that sufficient literature does not exist to provide a sound basis for moving forward with the development of such an assessment. However, in this review we have tried to make the case that enough is currently known to move forward in a productive direction.
A second risk is that serious discrepancies are present between high-performing and high-challenge schools—so serious that efforts to address one do not inform efforts to address the other. For example, the language used by students in the two types of schools may differ, or the psychosocial factors needed to thrive in the two environments may differ. We believe that this is a serious challenge but not an insurmountable one. The literature already tells us that there is a common core of important factors ranging from work ethic to time management skills. We believe that it should be possible to develop a common scale for these factors that can include benchmarks ranging from very low to very high proficiency and can be expressed in language that is developmentally appropriate.
Several metrics can be suggested for evaluating the sustainability of a comprehensive psychosocial assessment system such as the one envisioned here. One is continued growth in acceptance by schools. This growth is likely to be a leading indicator of sustainability in that schools are likely to recognize the value of the system before clear, unambiguous “what works” style evidence can be obtained demonstrating its worth. The next metric would involve additional empirical demonstrations of the value of the system. This could be accomplished in some kind of randomized trial design (with schools or districts as the unit of analysis), with psychosocial scores as criteria (near transfer). If psychosocial skills were changed, then an important question would be whether doing so results in educational improvements, such as higher grades, higher standardized test scores, and increased retention. The correlations between psychosocial skills and achievement suggest that this may be possible, but there is a shortage of controlled studies establishing this kind of causal relationship. In addition, supplemental outcome measures, such as some of the performance criterion variables used in the College Board criterion work or in the Great 8 performance outcome work reviewed here, could be employed to provide evidence for the effects of the assessments and of the interventions and policies based on them.
We thank Stefan Krumm, Jihyun Lee, Waverly VanWinkle, and Matthew Ventura for their help on earlier versions of this article. We thank Brent Roberts, Laura Lippman, Nathan Kogan, Larry Stricker, Dan Eignor, Rich Coley, and Gerry Matthews for providing reviews of earlier drafts. We thank Mary Lucas for administrative support. This project was supported by a grant from the Bill and Melinda Gates Foundation and by Educational Testing Service Research Allocation funding.
Both McAdams (1996) and B. W. Roberts and Wood (2006) suggested a third level, personal narrative, an even more fine-grained level of personality description, but we ignore the measurement implications of that level in this paper.
Appendix: Big 5 Factors and Lower Level Facets and Scales
Note. Lower level facet and scale names are from the IPIP website http://ipip.ori.org/ipip/. Their being listed here is not meant to suggest a specific facet model (e.g., empirical independence). Rather, they are simply scale names for scales from commercial instruments. Categorization into the Big 5 factors was based on a content matching process (Toker, 2008). Letters within parentheses (e.g., [O]) indicate facets and scales cross-listed in more than one Big 5 factor category; the letter refers to which other Big 5 factor the construct is cross-listed in. A minus sign (−) indicates a reverse keyed facet or scale.