Meta-Analysis of Dark Side Personality Characteristics and Critical Work Behaviors among Leaders across the Globe: Findings and Implications for Leadership Development and Executive Coaching



This paper concerns critical work behaviors for leaders across the globe and how scores on dark side personality measures predict those behaviors. Using a global archive of job analytic data, we first identify the work behaviors most critical for performance in managerial jobs across organisations, industry sectors, and countries. Next, we identify criterion-related validation research studies including dark side personality measures and performance ratings for at least one of these critical work behaviors. Based on meta-analytic results, we examine relations between scores on dark side personality measures and critical leader work behaviors. Also, we examine evidence of potential moderators of these relationships. Finally, we consider the implications of our results for I/O professionals engaged in using personality assessment for leadership development and executive coaching.


Management influences organisational performance, and some managers are more effective than others. Bloom and Van Reenen's (2007) research on over 700 manufacturing firms in the US, France, Germany, and the UK shows that companies that use effective management practices are more profitable than those that do not. Because senior leadership drives management practices, leadership ultimately determines the fate of organisations (R. Hogan, 2007). In other words, when leadership is effective everyone benefits.

Likewise, everyone suffers from ineffective or destructive leadership (Einarsen, Aasland, & Skogstad, 2007), which is of concern given research suggesting that between 33 per cent and 61 per cent of leaders act destructively (Aasland, Skogstad, Notelaers, Nielsen, & Einarsen, 2010). For example, former Hewlett-Packard CEO Carly Fiorina lavished herself with bonuses and perks while terminating thousands to reduce costs. Dick Fuld's leadership resulted in the collapse of Lehman Brothers, triggering a nationwide financial panic (Portfolio, 2009). Tony Hayward's repeated stumbles in the wake of the explosion and spill at the Deepwater Horizon rig led to his ousting as BP CEO (Coster, 2010). Each of these leaders made destructive decisions that resulted in disaster for individuals and organisations under them.

Researchers speculate about the cost of ineffective leadership. Lombardo, Ruderman, and McCauley (1988) estimated a cost of $500,000 per failed leader, which amounts to over $1 million today when adjusted for inflation. More recent estimates range from $1.5 to $2.7 million for each failed senior manager (DeVries & Kaiser, 2003). However, these estimates exclude hidden costs such as golden parachutes, severance packages, missed corporate objectives, lost intellectual and social capital, and costs resulting from a disengaged workforce (J. Hogan, Hogan, & Kaiser, 2010).

How widespread is the problem of ineffective leadership? In their review of 12 published estimates of the base rate of managerial failure, Hogan et al. (2010) concluded that 50 per cent of managers will fail, and half of those will be fired. Organisations recognise this crisis: the leadership development and coaching industry exceeds $2 billion annually (Hoagland-Smith, 2009), with over 30,000 coaches operating worldwide (International Coach Federation, 2007).

Leadership Performance

Defining effective leadership is difficult. Numerous leadership performance models exist, which vary in both the number of work behaviors described and the generality or specificity of those behaviors (Tett, Guterman, Bleier, & Murphy, 2000). Adding to this difficulty, modern leaders across the globe are expected to serve a variety of roles. For example, Campbell and Lee (1988) noted that different organisational constituencies (e.g. subordinates, customers, peers, shareholders) hold varying conceptualisations of effective leader performance based on various informational, cognitive, and affective constraints. Similar research demonstrates that different parties may hold dissimilar perceptions of a leader's performance due to varying opportunities to observe the leader's behavior (Lance, Teachout, & Donnelly, 1992; Murphy & Cleveland, 1995).

Furthermore, research has identified a number of differences in leadership behavior and style across cultures (Byrne & Bradley, 2007; Goh, 2009; Munley, 2011). Holt, Bjorklund, and Green (2009) asked participants from 19 different cultures to express their perceptions of effective leadership. Regardless of culture, respondents associated qualities such as responsibility, charisma, authenticity, and vision with effective leadership. However, the rank-order of these ratings varied by region, with Western respondents placing more emphasis on authenticity and Eastern respondents favoring vision.

Javidan, Dorfman, de Luque, and House (2006) used the Global Leadership and Organizational Behavior Effectiveness (GLOBE) research program to examine worldwide leadership differences. The authors used a sample of 62 cultures to identify 10 culture clusters: Anglo, Confucian Asia, Eastern Europe, Germanic Europe, Latin America, Latin Europe, Middle East, Nordic Europe, Southern Asia, and Sub-Saharan Africa. Studying leader behavior by country in these clusters, they found a number of culturally contingent leader attributes, including being individualistic, status conscious, and prone to taking risks. However, the authors also noted a number of universal impediments and facilitators of leadership effectiveness. Being asocial, irritable, and dictatorial were viewed as detrimental to leadership effectiveness across countries and cultures, whereas being trustworthy, visionary, charismatic, inspirational, and an effective team-builder were viewed as universal leadership facilitators.

To identify relationships between scores on dark side personality measures and leadership behaviors, we first sought to identify specific behaviors rated as important for leadership performance across a range of managerial jobs representing multiple organisations, industries, and countries. As such, we explore the following:

  • Research Question 1: Which leadership behaviors will consistently be rated as critical to leadership effectiveness across organisations, industries, and countries?

Personality and Leadership

Despite the prevalence of coaching, little agreement exists on optimal activities, goals, and evaluation methods. Consequently, results of these efforts are mixed. Bono, Purvanova, Towler, and Peterson (2009) note that leadership development can be affected by coaches' training and background, tools used in the process, and whether outcomes focus on individual goals or business needs. Coaches without psychological training tend to come from human services and helping professions and use single methods and sources to accomplish goals focused on individual needs. In contrast, those with psychological training tend to use multiple methods and sources to inform strategic leadership development (Bono et al., 2009).

Peterson (2010) identifies accurate insight into one's development needs as a condition that facilitates effective coaching. For this reason, coaches often use personality assessments to provide strategic self-awareness to leaders regarding how they and others perceive the positive and negative aspects of their personalities (e.g. Dawdy, 2004; Major, Turner, & Fletcher, 2006; Mansi, 2007; Peterson, 1993). Because personality predicts managerial effectiveness, promotion, and level (Hough, Ones, & Viswesvaran, 1998), leader emergence and effectiveness (Bono & Judge, 2004; Judge, Bono, Ilies, & Gerhardt, 2002), and overall job performance for managers and executives (Barrick, Mount, & Judge, 2001; J. Hogan & Holland, 2003), the use of personality assessment in these contexts can have considerable returns.

The “Bright Side” of Personality

Socioanalytic theory helps explain the connection between leader characteristics and leadership outcomes (R. Hogan, 1983; R. Hogan & Shelton, 1998). Based on this theory, because (a) people work in groups and (b) groups are hierarchical, individuals are driven by motives to get along with and get ahead of others. Task performance, focused on structuring work and accomplishing goals, fits the “getting ahead” motive, whereas the facilitative and socially oriented nature of contextual performance corresponds with the “getting along” motive (Oh & Berry, 2009). Different personality characteristics, which elicit different motives and goals (Judge, Piccolo, & Kosalka, 2009), are often expressed through behaviors aimed at achieving one or both of these goals.

For example, getting along is primarily predicted by Emotional Stability (because emotionally stable individuals are positive and rewarding to deal with), Conscientiousness (because conscientious individuals are predictable), and Agreeableness (because agreeable individuals are sensitive toward others). However, components of Extraversion concerned with sociability also suggest a focus on getting along with others. Likewise, getting ahead is primarily predicted by Emotional Stability (because emotionally stable individuals are confident), Extraversion (because extraverted individuals are driven and ambitious), and Openness to Experience (because individuals open to new experiences are curious and eager to learn). However, components of Conscientiousness concerned with need for achievement also suggest a focus on getting ahead of others (J. Hogan & Holland, 2003; Oh & Berry, 2009). Together, these characteristics are representative of the Five-Factor Model (FFM; Digman, 1990; Goldberg, 1992; John, 1990; McCrae & Costa, 1987) of personality. This model describes the positive characteristics individuals exhibit in everyday life.

The “Dark Side” of Personality

Although positive personality characteristics can contribute to managerial success, other personal attributes such as arrogance, volatility, and distrust can lead to failure (Dotlich & Cairo, 2003; Judge et al., 2009; Resick, Whitman, Weingarden, & Hiller, 2009). Characterised as either “dark side” factors or “derailers”, these attributes represent normally advantageous strategies that individuals may over-use in stressful or ambiguous situations that challenge self-regulation and social vigilance (Baumeister, Muraven, & Tice, 2000; McCall & Lombardo, 1983). Bright side and dark side characteristics coexist, although other individual differences and situational/organisational factors also influence dysfunctional leadership behavior (Tett & Burnett, 2003).

V. Jon Bentz (1967, 1985a, 1985b, 1990) can be credited with some of the first investigations into dark side personality dimensions among managers. In a 30-year study of failed managers at Sears, Bentz observed that otherwise intelligent and skilled managers failed due to “overriding personality defects” including difficulties building teams, delegating to subordinates, dealing with complexity, and maintaining relationships. Other problems concerned failures to learn from experience, being overly reactive, and making emotional decisions. McCall and Lombardo (1983) replicated these findings in interviews contrasting successful versus failed executives across three US-based industrial organisations.

Adding empirical rigor to these efforts, Hogan and colleagues (cf. Arneson, Milliken-Davies, & Hogan, 1993; R. Hogan & Hogan, 2001; R. Hogan, Raskin, & Fazzini, 1990) found that scores on dark side personality measures predicted performance in professional and leadership jobs above and beyond FFM measures, but in a negative direction. Others followed their lead, linking dark side personality dimensions to job performance outcomes. Moscoso and Salgado (2004) investigated relationships between dark side personality attributes and task and contextual performance, finding seven dysfunctional personality styles (i.e. suspicious, shy, sad, pessimistic, sufferer, eccentric, risky) that negatively predicted job performance. Others found a relationship between narcissism and counterproductive work behaviors (Judge, LePine, & Rich, 2006; Lee, Ashton, & Shin, 2005; Penney & Spector, 2002). However, as noted by R. Hogan (2007), dark side personality attributes often represent an over-use of otherwise beneficial strategies, suggesting that relationships between scores on dark side personality measures and performance may be positive for some behaviors. Therefore, we explore the following:

  • Research Question 2: Are relationships between scores on dark side personality measures and job performance consistently negative across a range of critical leadership behaviors?

Potential Moderators

When examining relationships between dark side personality measures and job performance, it is important to consider potential moderators. For example, research examining personality differences across cultures suggests that country or culture may serve as potential moderators. Western conceptions of personality are rooted in the model of the independent self, whereas Asian cultures stress the notion of interdependence (Markus & Kitayama, 1998). McCrae and colleagues (see McCrae, Costa, & Yik, 1996; McCrae, Bond, Yik, Trapnell, & Paulhus, 1998) found that participants in Hong Kong are more introverted than American participants, and that Eastern cultures inhibit imaginative fantasy, need for variety, liberal values, and cheerful optimism, resulting in higher levels of practicality, conservatism, and serious-mindedness. McCrae and Terracciano (2005) later noted that Europeans and Americans generally score higher in Extraversion than do Asians and Africans. These efforts demonstrate that mean scores on personality dimensions can differ across cultures.

Despite these differences, however, the general structure of personality remains intact across cultures. In their research examining the generalisability of the FFM of personality, McCrae and Costa (1997) examined data from the English form of the NEO-PI-R and six translations (Costa & McCrae, 1992). Using Varimax rotation of five factors, they demonstrated the similarity between factor structures from American, German, Portuguese, Hebrew, Chinese, Korean, and Japanese data. When targeted rotations were used, the American factor structure was closely reproduced even at the level of secondary loadings. Because their samples represented diverse cultures across five distinct language families, McCrae and Costa (1997) concluded that the trait structure of personality represents a human universal.

Finally, meta-analytic research examining personality and performance has generally drawn similar conclusions in domestic and international settings. Specifically, researchers across the US, Canada, and Europe (cf. Barrick & Mount, 1991; Dudley, Orvis, Lebiecki, & Cortina, 2006; J. Hogan & Holland, 2003; Hough, 1992; Hough, Eaton, Dunnette, Kamp, & McCloy, 1990; Judge et al., 2002; Salgado, 1997, 1998; Tett, Jackson, & Rothstein, 1991) have all found general support for Conscientiousness and Emotional Stability and partial support for other FFM dimensions in predicting job performance. In contrast with these Euro-American findings, however, Oh (2009) demonstrated that Conscientiousness and Extraversion best predict performance in East Asian cultures (i.e. South Korea, Japan, China, Taiwan, and Singapore) based on the critical role that interpersonal relationships play in determining career success in Asia. Similar meta-analyses using scores on dark side personality measures to predict leader performance have not previously been conducted. For that reason, we explore the following:

  • Research Question 3: Do results from a diverse set of criterion-related validity studies representing multiple countries provide evidence of potential moderators of the relationships between scores on dark side personality measures and managerial job performance?



Job Evaluation Tool (JET)

To identify critical work behaviors for managerial jobs across the globe, we used archival data from the Job Evaluation Tool (JET; Hogan Assessment Systems, 2009a). The JET represents one of the most extensively researched, reliable, and valid worker-oriented job analysis tools available (Foster, Gaddis, & Hogan, 2012). Included in the JET is the Competency Evaluation Tool (CET; Hogan Assessment Systems, 2009b), used to measure a taxonomy of work behaviors mapped to the Domain Model of Organizational Performance (R. Hogan & Warrenfeltz, 2003). This model provides a broad framework for organising work behaviors into four domains: (a) intrapersonal skills, including self-awareness, self-control, emotional maturity, and integrity; (b) interpersonal skills, including social skills, empathy, and developing and sustaining relationships; (c) leadership skills, including influencing, persuading, and building/maintaining a high performing team; and (d) business/technical skills, including planning, organising, and resource allocation. In the CET, Subject Matter Experts (SMEs)—high-performing incumbents or supervisors who provide a representative sample of occupational and demographic strata—are asked to indicate the extent to which each of 56 specific behaviors relates to performance for a target job. Raters evaluate each item using a 5-point scale ranging from 0 (Not associated with job performance) to 4 (Critical to job performance).

Hogan Development Survey (HDS)

The Hogan Development Survey (HDS; R. Hogan & Hogan, 2009) measures 11 dark side personality dimensions associated with job derailment. The 11 HDS scales and their definitions appear in Table 1.

Table 1. HDS Factors, Scales, and Definitions
HDS Factor | HDS Scale | Concerns seeming …
Moving Away | Excitable | moody and inconsistent, being enthusiastic about new persons or projects and then becoming disappointed with them.
Moving Away | Skeptical | cynical, distrustful, overly sensitive to criticism, and questioning others' true intentions.
Moving Away | Cautious | resistant to change and reluctant to take even reasonable chances for fear of being evaluated negatively.
Moving Away | Reserved | socially withdrawn and lacking interest in or awareness of the feelings of others.
Moving Away | Leisurely | autonomous, indifferent to other people's requests, and becoming irritable when they persist.
Moving Against | Bold | unusually self-confident and, as a result, unwilling to admit mistakes or listen to advice, and unable to learn from experience.
Moving Against | Mischievous | to enjoy taking risks and testing the limits.
Moving Against | Colorful | expressive, dramatic, and wanting to be noticed.
Moving Against | Imaginative | to act and think in creative and sometimes unusual ways.
Moving Toward | Diligent | careful, precise, and critical of the performance of others.
Moving Toward | Dutiful | eager to please, reliant on others for support, and reluctant to take independent action.

A 1997 principal components analysis of the HDS scales resulted in the extraction of three global factors accounting for 62 per cent of variance in the matrix. The first factor, defined by the Excitable, Skeptical, Cautious, Reserved, and Leisurely scales, resembles the theme of “moving away from people” in Horney's (1950) model of flawed interpersonal tendencies. Together, these scales describe a pattern of managing one's insecurities by avoiding others. The second factor, including the Bold, Mischievous, Colorful, and Imaginative scales, corresponds with Horney's theme of “moving against people”, or managing self-doubt by dominating and intimidating others. Finally, the Diligent and Dutiful scales comprise Horney's theme of “moving toward people”, defined by managing one's insecurities by building alliances to minimise the perceived threat of criticism (R. Hogan & Hogan, 2009). This factor structure has been replicated multiple times using both US and global normative samples. The most recent results, obtained in 2008 using both Varimax and Direct Oblimin rotations, are available in the Hogan Development Survey manual (R. Hogan & Hogan, 2009).

The HDS is available in over 40 languages. To ensure that the assessment functions properly across languages and cultures, a rigorous adaptation process is followed (Hogan Assessment Systems, 2011b). This process begins with forward- and back-translations consistent with practices recommended in academic literature (e.g. Geisinger, 1994; Hambleton & Patsula, 1999; van de Vijver & Hambleton, 1996) and the International Test Commission's Test Translation and Adaptation Guidelines (Hambleton, 2001). This rigorous translation process ensures that translated items (a) read correctly in terms of grammar and syntax, (b) measure the same construct as original items, (c) possess cultural sensitivity and relevance, and (d) maintain the same strength of wording as original items (Hogan Assessment Systems, 2011b).

Following translations, equivalence analyses are conducted to compare English and translated forms (van de Vijver & Leung, 1997). First, Classical Test Theory bootstrapping analyses (Manly, 2007) are used to examine properties of items and scales. At the item level, endorsement rates and corrected item–total correlations for translated items are examined against 95 per cent confidence intervals from bootstrapped samples estimating true population parameters. For scale-level analyses, these methods are used to examine scale means and coefficient alpha reliability estimates. Item and scale parameters that fall outside these confidence intervals suggest that translation revisions may be needed. Together, these analyses ensure the functional equivalence of translated forms at the item and scale level.
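The item-level check described above can be sketched in a few lines. This is only an illustration under stated assumptions: it uses a percentile bootstrap (the text does not say which bootstrap interval is used), and the response vector, the translated endorsement rate, and the function names are hypothetical.

```python
import random

def bootstrap_ci(values, statistic, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for statistic(values)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    estimates = sorted(
        statistic([values[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    lower = estimates[int(n_boot * alpha / 2)]
    upper = estimates[int(n_boot * (1 - alpha / 2)) - 1]
    return lower, upper

def endorsement_rate(responses):
    """Proportion of respondents endorsing a dichotomous item (1 = endorsed)."""
    return sum(responses) / len(responses)

# Hypothetical English-form responses to one item: 60 of 100 endorse it.
english_item = [1] * 60 + [0] * 40
# Hypothetical endorsement rate for the same item on a translated form.
translated_rate = 0.40

lower, upper = bootstrap_ci(english_item, endorsement_rate)
# A translated value outside the interval flags the item for review.
needs_review = not (lower <= translated_rate <= upper)
```

The same helper could be applied to corrected item–total correlations, scale means, or coefficient alpha by passing a different `statistic` function.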

Moreover, to verify the structural equivalence of scales, global factors, and the overall assessment, Procrustes analyses (Schönemann, 1966) are used to find congruence coefficients between English and translated forms of the HDS. With this analysis, researchers rotate a matrix of factor loadings from the translated assessment to assess similarity with a matrix of factor loadings from the original English assessment. Congruence coefficients are computed to determine the degree of similarity between the two structures, ranging from −1.0 (maximum inverse similarity) to +1.0 (maximum similarity). Some researchers (e.g. McCrae, Zonderman, Costa, & Bond, 1996; Mulaik, 1972) argue that a coefficient of .90 is indicative of acceptable congruence, with others (e.g. Lorenzo-Seva & ten Berge, 2006) suggesting that coefficients as low as .85 demonstrate fair similarity. Because this approach is widely applied for its ease of use and straightforward interpretation (e.g. McCrae & Terracciano, 2005; McCrae et al., 1996; Rodriguez-Fornells, Lorenzo-Seva, & Andres-Pueyo, 2001; Rolland, Parker, & Stumpf, 1998; Schmitt, Allik, McCrae, & Benet-Martinez, 2007), it provides a sound method for ensuring that the factor structure of the HDS remains intact across languages (Hogan Assessment Systems, 2011b).
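To make the congruence coefficient concrete, the sketch below computes Tucker's coefficient for one factor from two columns of loadings. It assumes the translated loadings have already been rotated toward the English target (the Procrustes rotation itself is omitted), and the loading values are hypothetical.

```python
from math import sqrt

def congruence_coefficient(loadings_a, loadings_b):
    """Tucker's congruence coefficient between two columns of factor loadings."""
    if len(loadings_a) != len(loadings_b):
        raise ValueError("loading vectors must have the same length")
    cross = sum(a * b for a, b in zip(loadings_a, loadings_b))
    norm = sqrt(sum(a * a for a in loadings_a) * sum(b * b for b in loadings_b))
    return cross / norm

# Hypothetical loadings of four scales on one factor (not HDS results).
english = [0.80, 0.70, 0.60, 0.10]
translated = [0.75, 0.72, 0.55, 0.20]

phi = congruence_coefficient(english, translated)
meets_090 = phi >= 0.90  # stricter benchmark mentioned in the text
meets_085 = phi >= 0.85  # more lenient benchmark mentioned in the text
```

Identical loading patterns give +1.0 and exactly reversed patterns give −1.0, matching the range described above.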

Leaders can use the HDS to gain strategic self-awareness about how to improve their performance and relationships with others at work. The HDS measures characteristics that leaders often consider strengths. However, a manager may rely on these tendencies excessively when under stress, leading to detrimental effects on performance (Kaiser & Hogan, 2011). For example, a manager with a high score on the Diligent scale may view him- or herself as detail-oriented, but an over-reliance on these tendencies may lead his/her supervisor, co-workers, and subordinates to view these behaviors as nit-picky, perfectionistic, and micro-managing.

Research using the HDS illustrates its value in predicting performance. For example, in a sample of 290 incumbent managers, Benson and Campbell (2007) found negative relationships between elevated scores on Excitable, Skeptical, Cautious, Leisurely, Mischievous, and Imaginative scales and ratings of leader performance. In a multi-wave, multi-method longitudinal study of military cadets, Harms, Spain, and Hannah (2011a) found that scores on the Skeptical and Imaginative scales negatively predicted trajectories of leader development. In other words, young officers who questioned others' motives or were eccentric and odd in their approach to solving problems were rated poorly by superior officers across a variety of performance dimensions including judgment, communication, leadership, fairness, sense of duty, response to feedback, courage, conduct, unselfishness, army values, conscientiousness, and fitness. Some mixed results were also noted, with scores on other HDS scales (i.e. Cautious, Bold, Colorful, Diligent, Dutiful) relating positively with leader development over time and across multiple leadership dimensions. Researchers also note that the dark side personality dimensions measured by the HDS explain incremental variance in job performance beyond that explained by FFM personality measures (Harms, Spain, & Hannah, 2011b).


We obtained job analysis data from an archive that includes CET data from hundreds of managerial jobs representing multiple countries, industry sectors, and organisations (Hogan Assessment Systems, 2011a). Studies included in our sample used the CET to examine work behaviors required in managerial jobs in the US, Australia, Kenya, Germany, and those spanning multiple countries. Respondents (N = 4,372) took the JET between 1995 and 2010, completing the CET in English, Brazilian Portuguese, Spanish, and German. Inter-rater agreement for CET ratings in each study met a minimum criterion of .70. The industry sectors that were represented included aeronautics, construction, energy, finance, hospitality, manufacturing, medical, and transportation among others. SMEs who reported demographic data were 69 per cent male and 31 per cent female with an average age of 42.70 years (SD = 7.99 years). Most of these SMEs were White (70%), with Black/African-American (12%), Two or More Races (7%), Hispanic (5%), and Asian (2%) representing other racial/ethnic groups. SMEs reported an average of 7.40 years' incumbent experience in their managerial or executive role (SD = 7.87 years). Because the number of respondents per job ranged from 1 to 1,212 in our sample (M = 23.86, SD = 90.21), we conducted analyses at the job level to avoid over-representation of any one job. As a result, our final sample consisted of average CET ratings for 256 managerial and executive jobs.
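The reason for analysing at the job level can be illustrated with a small sketch: ratings are first averaged within each job, so a job with many respondents counts once. The data layout, job and item identifiers, and function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def job_level_means(ratings):
    """ratings: iterable of (job_id, item_id, rating) tuples from individual SMEs.
    Returns {job_id: {item_id: mean rating within that job}}."""
    grouped = defaultdict(lambda: defaultdict(list))
    for job_id, item_id, rating in ratings:
        grouped[job_id][item_id].append(rating)
    return {
        job: {item: mean(values) for item, values in items.items()}
        for job, items in grouped.items()
    }

def item_means_across_jobs(job_means):
    """Average each item over jobs, giving every job equal weight."""
    by_item = defaultdict(list)
    for items in job_means.values():
        for item, value in items.items():
            by_item[item].append(value)
    return {item: mean(values) for item, values in by_item.items()}

# Hypothetical ratings: job_a has three respondents, job_b has one.
ratings = [
    ("job_a", "item_1", 4), ("job_a", "item_1", 4), ("job_a", "item_1", 4),
    ("job_b", "item_1", 2),
]
job_means = job_level_means(ratings)
item_means = item_means_across_jobs(job_means)
```

In this toy case the job-level average for `item_1` is 3, whereas pooling all four respondents would give 3.5 and let the larger job dominate.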

To determine how scores on dark side personality measures relate to critical leader behaviors, we sought data from studies that met five criteria. Studies had to (a) include data for applicants applying to or incumbents currently in managerial jobs, (b) include HDS scale data as predictors, (c) use a concurrent or predictive validation strategy with working adults, (d) contain supervisor ratings of overall job performance and one or more critical leader behaviors, and (e) contain information on the study's country of origin. We obtained data from the Hogan archive (Hogan Assessment Systems, 2010). As the publisher of the HDS, Hogan Assessment Systems maintains an archive containing over 50 criterion-related validity studies that include the HDS as a predictor instrument. We also searched three online databases for additional samples: PsycINFO, SocINDEX, and Business Source Complete (Carroll, 2007). We used “HDS” and “Hogan Development Survey” as search terms, but found no datasets beyond those already available in the Hogan archive. We identified 12 independent samples (N = 1,279) that met each of these five criteria. All potential datasets excluded from our analyses either (a) did not include data for managerial jobs, or (b) failed to include supervisor ratings of overall job performance and one or more critical leader behaviors. The 12 studies included in our analyses were conducted from 1997 to 2011. Three samples were international, with two samples representing Australia and one representing Denmark. Also included were three multi-national samples including data from across the UK and Europe. The six remaining samples represented managerial and executive jobs in the US.


Although each sample included in our study contained at least one measure of overall job performance and ratings on critical leader behaviors, the items and response formats used to collect these data varied across samples. Examples of overall job performance ratings include “overall performance”, “summed performance ratings”, “total score”, and “overall effectiveness”. Examples for specific work behavior items include “integrity” and “keeps word” (trustworthiness), “displays good judgment” and “makes sound and defensible decisions” (decision making/problem solving), and “dependability” and “reliability” (dependability). Supervisors provided performance ratings in all samples.

Data Analyses

To examine our first research question relating to which behaviors are consistently rated as critically important for managerial performance, we computed average ratings for each CET item across all jobs. Next, we computed the mean and standard deviation of average ratings across all CET items. Finally, we identified critical behaviors as those that received average ratings at least one standard deviation above the mean across all items.
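The cutoff rule described above can be sketched as follows. The item labels and ratings are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not state which formula was used.

```python
from statistics import mean, stdev

def critical_items(item_means):
    """Return the cutoff (mean + 1 SD of item averages) and the items at or above it."""
    values = list(item_means.values())
    cutoff = mean(values) + stdev(values)  # assumption: sample SD
    flagged = sorted(item for item, value in item_means.items() if value >= cutoff)
    return cutoff, flagged

# Hypothetical average criticality ratings on the 0-4 scale (not CET results).
example_item_means = {"a": 3.0, "b": 3.0, "c": 3.0, "d": 3.0, "e": 4.0}
cutoff, critical = critical_items(example_item_means)
```

Here the mean is 3.2 and the sample standard deviation is about 0.45, so only item `e` clears the cutoff of roughly 3.65.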

To examine the consistency of these ratings across countries, we first calculated the average ratings for each CET item for all jobs within a specific country. This included average ratings from 170 jobs located in the US, seven jobs located in Australia, 17 jobs located in Kenya, two jobs located in Germany, and 15 jobs located across multiple countries. Country information was unavailable for the remaining 45 jobs. Next, we used Intra-Class Correlations (ICC; Shrout & Fleiss, 1979) to assess the reliability of CET results. We used a two-way random effects model to test for absolute agreement among ratings with average scores for each country serving as “raters” for the analyses. Two-way random effects models are appropriate when drawing raters from a larger population of potential raters (i.e. additional countries). Absolute agreement takes the magnitude of rating differences into account when estimating reliability.
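A minimal sketch of the two-way random effects, absolute agreement ICC follows, using the mean-square formulas from Shrout and Fleiss (1979). Rows stand for CET items and columns for countries acting as "raters". Because the text does not say whether the single-measure or average-measure form was reported, the function returns both, and the ratings matrix is purely illustrative.

```python
def icc_two_way_random(matrix):
    """ICC(2,1) and ICC(2,k): two-way random effects, absolute agreement.
    matrix[i][j] is the rating of target i (row) by rater j (column)."""
    n = len(matrix)      # number of targets (e.g. CET items)
    k = len(matrix[0])   # number of raters (e.g. countries)
    grand = sum(sum(row) for row in matrix) / (n * k)
    row_means = [sum(row) / k for row in matrix]
    col_means = [sum(row[j] for row in matrix) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in matrix for x in row)

    ms_rows = ss_rows / (n - 1)                                   # between targets
    ms_cols = ss_cols / (k - 1)                                   # between raters
    ms_err = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))  # residual

    single = (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )
    average = (ms_rows - ms_err) / (ms_rows + (ms_cols - ms_err) / n)
    return single, average

# Illustrative 6 x 4 ratings matrix (not CET data).
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
icc_single, icc_average = icc_two_way_random(example)
```

Because the rater mean square enters the denominator, systematic differences in rating level between columns lower the coefficient, which is what absolute agreement is meant to capture.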

To investigate our second research question relating to relationships between scores on dark side personality measures and job performance, we examined correlations across criterion-related validity samples to identify relationships between scores on each HDS scale and ratings of overall leadership performance and critical leadership behaviors. For our meta-analyses, we used zero-order product-moment correlations (r) as effect sizes and, as recommended by Hunter and Schmidt (2004), a random-effects model.
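As a rough illustration of the first step in a Hunter and Schmidt (2004) style analysis, the sketch below computes the sample-size-weighted mean correlation and the weighted observed variance across samples. The correlations and sample sizes are hypothetical and are not taken from the studies analysed here.

```python
def weighted_mean_r(samples):
    """samples: list of (r, n) pairs. Returns the sample-size-weighted mean r."""
    total_n = sum(n for _, n in samples)
    return sum(r * n for r, n in samples) / total_n

def weighted_observed_variance(samples):
    """Sample-size-weighted variance of observed correlations around their mean."""
    total_n = sum(n for _, n in samples)
    r_bar = weighted_mean_r(samples)
    return sum(n * (r - r_bar) ** 2 for r, n in samples) / total_n

# Hypothetical (correlation, sample size) pairs for one scale-criterion pairing.
example_samples = [(-0.20, 100), (-0.10, 200), (0.00, 100)]
r_bar = weighted_mean_r(example_samples)
var_obs = weighted_observed_variance(example_samples)
```

With these inputs the weighted mean correlation is −0.10 and the observed variance is 0.005.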

We examined the operational validity of each HDS scale by correcting for range restriction in each predictor variable and unreliability in performance measures. We corrected for direct range restriction using Thorndike's Case II correction (Thorndike, 1949). We chose direct range restriction corrections so our research would be consistent with prior research on personality and job performance, and because when the ratio of restricted to unrestricted standard deviations is close to 1, direct and indirect range restriction corrections yield very similar results (Schmidt, Shaffer, & Oh, 2008). For our analyses, we were able to obtain standard deviations for each HDS scale from each individual dataset (i.e. restricted standard deviations) and estimates of population variance (i.e. unrestricted standard deviations) for each HDS scale from a global normative dataset containing HDS data from 67,614 working adults (Hogan Assessment Systems, 2011a). Furthermore, this normative sample is stratified based on a number of demographic and job-related variables such as job type, industry, and assessment purpose (i.e. applicant or incumbent data). Because this normative dataset represents a large global population of working adults that mirrors those using the HDS for selection and development purposes, these estimates provide a reasonable approximation of population variance for each HDS scale. Based on these results, average range restriction ratios of restricted to unrestricted standard deviations (i.e. “ux”) across scales ranged from .84 for Diligent to .99 for Reserved (M = .93, SD = .04).
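The direct range restriction correction can be sketched as below, using the standard Thorndike Case II formula with u defined as the ratio of restricted to unrestricted standard deviations. The observed correlation of −0.10 is hypothetical; the two u values are the extremes reported above.

```python
from math import sqrt

def correct_direct_range_restriction(r, u):
    """Thorndike Case II correction for direct range restriction.
    r: observed (restricted) correlation.
    u: restricted SD divided by unrestricted SD of the predictor."""
    if u <= 0:
        raise ValueError("u must be positive")
    big_u = 1.0 / u
    return (r * big_u) / sqrt(1.0 + r * r * (big_u * big_u - 1.0))

observed_r = -0.10  # hypothetical observed correlation
corrected_diligent = correct_direct_range_restriction(observed_r, 0.84)
corrected_reserved = correct_direct_range_restriction(observed_r, 0.99)
unchanged = correct_direct_range_restriction(observed_r, 1.0)
```

With u = .84 the hypothetical correlation moves from −0.10 to about −0.12, while with u = .99 it barely changes, which illustrates why corrections are small when u is close to 1.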

Although some (e.g. Mount & Barrick, 1995; Ones, Viswesvaran, & Schmidt, 1993) argue in support of correcting for unreliability in predictor scales, we did not correct for this artifact because corrections for predictor unreliability inflate operational validities of these scales in predicting performance. To correct for criterion unreliability, we used the mean inter-rater reliability coefficient of .52 reported by Viswesvaran, Ones, and Schmidt (1996). Some researchers (e.g. Murphy & DeShon, 2000) have argued against the use of this estimate, favoring higher reliability estimates such as test–retest or internal consistency that result in more conservative corrections. However, because all samples included in the present research used single-rater job performance ratings as criterion measures, the .52 inter-rater reliability estimate provides the most appropriate correction for criterion unreliability. This correction increased estimated population parameters by a factor of 1.39 (i.e. sample-weighted average correlations were approximately 72 per cent of the corrected size). In addition, to prevent bias, we used only one criterion measure per study for each meta-analysis in line with recommendations provided by Hunter and Schmidt (2004).
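The arithmetic behind the 1.39 factor can be checked with a short sketch using the standard correction for attenuation, in which a correlation is divided by the square root of the criterion reliability. The function name and the example correlation of −.10 are hypothetical; only the .52 reliability estimate comes from the text.

```python
import math


def correct_criterion_unreliability(r, ryy):
    """Disattenuate a correlation for unreliability in the criterion only."""
    if not 0 < ryy <= 1:
        raise ValueError("ryy must be in (0, 1]")
    return r / math.sqrt(ryy)


RYY = 0.52                      # mean inter-rater reliability cited in the text
factor = 1 / math.sqrt(RYY)     # multiplier applied to each correlation
share = math.sqrt(RYY)          # uncorrected value as a share of the corrected value

# Hypothetical correlation, for illustration only.
corrected = correct_criterion_unreliability(-0.10, RYY)
```

Rounded to two decimals, `factor` and `share` reproduce the 1.39 and 72 per cent figures reported above.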

Finally, to explore our third research question concerning potential moderators of the relationships between scores on dark side personality measures and managerial performance, we examined the percentage of variance accounted for by statistical artifacts and credibility intervals from each of our meta-analyses. Because of a lack of power when identifying moderators with a small number of studies or overall sample size (Sackett, Harris, & Orr, 1986), we only examined the potential for moderators in relationships between scores on each HDS scale and overall job performance.


Critical Leadership Behaviors

To examine our first research question, we reviewed criticality ratings provided by SMEs on the CET to identify critical leadership work behaviors. The mean criticality rating across CET items was 3.23 with a standard deviation of 0.36. Seven items received average ratings at least one standard deviation above the overall mean (3.59 or greater). Table 2 presents these items and their average criticality ratings. Also, we used the Domain Model of Organizational Performance to identify which domains were represented among critical leader behaviors.

Table 2. Critical Managerial Work Behaviors

| Work behavior | Domain | K | N | M | SD |
|---|---|---|---|---|---|
| Work Attitude | Intrapersonal | 256 | 4,615 | 3.70 | 0.37 |
| Leading Others | Leadership | 256 | 4,612 | 3.68 | 0.49 |
| Decision Making/Problem Solving | Leadership | 256 | 4,586 | 3.66 | 0.37 |
| Achievement Orientation | Intrapersonal | 256 | 4,612 | 3.63 | 0.38 |
| Interpersonal Skills | Interpersonal | 256 | 4,612 | 3.49 | 0.43 |

Note: K = Number of studies; N = Total number of participants across K studies; M = Mean; SD = Standard deviation.

Of particular note is that most critical leadership behaviors fall within the intrapersonal domain. These results are counterintuitive compared to most leadership competency models (e.g. Bartram, 2005; Borman & Brush, 1993; Yukl & Lepsinger, 1992) that focus on leadership tasks such as savvy communication, managing performance, developing employees, and strategic planning. In contrast with these models, job analysis ratings for leadership roles across the globe indicate that SMEs care more about who the leader is rather than their competence in traditional leadership functions, as only leading others and decision making met this threshold from the leadership domain. The complete lack of behaviors associated with the business/technical domain underscores this point. However, due to previous research indicating the importance of interpersonal skills for effective leadership (Bass & Bass, 2008; Yukl, 2012), we also examined relationships between scores on each HDS scale and criteria measuring the ability to work well and get along with others.

To determine the consistency of these ratings across countries, we computed an intraclass correlation (ICC) as an index of inter-rater agreement, where average ratings from all available countries served as “raters” and ratings on each CET item served as “participants”. The resulting ICC was .92, indicating a high level of agreement across countries used as rater sources. This answers our first research question, indicating that regardless of country of origin, SMEs consistently rate leadership behaviors similarly in terms of both rank order and magnitude.
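For readers unfamiliar with this setup, the sketch below shows one way an ICC could be computed from an items-by-countries matrix. The text does not state which ICC form was used, so the two-way random effects, absolute agreement, single-rater form (Shrout and Fleiss ICC(2,1)) is an assumption, chosen because absolute agreement is sensitive to both rank order and magnitude. The function name and the toy matrices are hypothetical.

```python
def icc_absolute_agreement(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: list of rows, one row per target (e.g. CET item),
             one column per rater (e.g. country mean rating).
    """
    n = len(ratings)        # number of targets (rows)
    k = len(ratings[0])     # number of raters (columns)
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a rectangular matrix with at least 2 rows and 2 columns")
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    # Mean squares from a two-way ANOVA without replication.
    ms_rows = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_cols = n * sum((m - grand) ** 2 for m in col_means) / (k - 1)
    ss_err = sum(
        (ratings[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n)
        for j in range(k)
    )
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Toy matrices (hypothetical): identical raters, then a constant offset.
perfect = icc_absolute_agreement([[1, 1], [2, 2], [3, 3]])
offset = icc_absolute_agreement([[1, 2], [2, 3], [3, 4]])
```

In the second toy matrix the raters agree perfectly on rank order but differ by a constant, and the absolute agreement coefficient drops below 1, which is why a high value under this form speaks to magnitude as well as ordering.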

Dark Side Personality–Leader Behavior Relationships

Our second research question concerned using scores on a dark side personality measure to predict critical managerial work behaviors. To answer this question, we examined relationships between HDS scale scores and ratings of overall job performance and critical leadership behaviors. Table 3 presents these results. We found significant relationships between scores on each of the 11 HDS scales and at least one performance measure. Below, we discuss these results for overall managerial performance and each critical managerial work behavior.

Table 3. Meta-Analysis Results for Performance Criteria by HDS Scales
Note: Overall = Overall managerial performance; Trust = Trustworthiness; Attitude = Work attitude; Leading = Leading others; Decision = Decision making/problem solving; Achieve = Achievement orientation; Depend = Dependability; Adapt = Adaptability/flexibility; Inter = Interpersonal skills; EXC = Excitable; SKE = Skeptical; CAU = Cautious; RES = Reserved; LEI = Leisurely; BOL = Bold; MIS = Mischievous; COL = Colorful; IMA = Imaginative; DIL = Diligent; DUT = Dutiful; K = Number of studies; N = Total number of participants across K studies; r = Observed mean correlation; SDr = Standard deviation of r; ρ = Population correlation corrected for range restriction and criterion unreliability; SDρ = Standard deviation of the corrected population correlation; %VE = Percent of variance explained; L80 = Lower boundary of 80% Credibility Interval; U80 = Upper boundary of 80% Credibility Interval; L95 = Lower boundary of 95% Confidence Interval; U95 = Upper boundary of 95% Confidence Interval; *Correlation is significant at .05 level.

Overall Managerial Performance

Scores on the Leisurely (ρ = −.20), Excitable (ρ = −.16), Cautious (ρ = −.16), Skeptical (ρ = −.14), Reserved (ρ = −.11), and Imaginative (ρ = −.08) scales negatively predicted overall managerial performance. These results indicate that managers who are privately resentful, emotionally volatile, indecisive, mistrustful, insensitive, and distractible tend to perform poorly in managerial and executive roles. Scores on the Colorful (ρ = .11) scale positively predicted overall performance, indicating that managers who command attention and draw others to them tend to receive higher overall performance ratings.


Trustworthiness

Scores on the Colorful (ρ = −.29), Bold (ρ = −.21), Imaginative (ρ = −.19), Mischievous (ρ = −.16), and Skeptical (ρ = −.10) scales negatively predicted supervisory ratings of trustworthiness. These findings suggest that managers who are attention-seeking, overconfident, unpredictable, manipulative, and cynical tend to lack others' trust in the workplace. Scores on the Dutiful (ρ = .17) scale positively predicted trustworthiness, indicating that managers who are conforming and eager to please tend to be trusted. Interestingly, scores on the Excitable (ρ = .09) scale also positively predicted trustworthiness. This finding suggests that these individuals, although emotionally unpredictable, are viewed as trustworthy because of the transparency of their motives and agendas.

Work Attitude

Scores on the Leisurely (ρ = −.32), Cautious (ρ = −.28), Excitable (ρ = −.24), Skeptical (ρ = −.23), and Colorful (ρ = −.03) scales negatively predicted work attitude in managerial jobs. In other words, leaders who are privately resentful, indecisive, emotionally volatile, mistrustful, and attention-seeking tend to be viewed by others as having a poor attitude towards work.

Leading Others

Scores on the Cautious (ρ = −.23), Leisurely (ρ = −.19), Reserved (ρ = −.12), Skeptical (ρ = −.11), Dutiful (ρ = −.11), and Excitable (ρ = −.08) scales negatively predicted supervisory ratings of leading others across managerial roles. These findings suggest that managers who are reluctant to act, stubborn, aloof, cynical, ingratiating, and moody tend to be viewed as poor leaders. As with overall managerial performance, however, Colorful (ρ = .17) scale scores positively predicted leadership ratings, indicating that those who command attention and draw others to them tend to be viewed as natural leaders.

Decision Making/Problem Solving

Scores on the Cautious (ρ = −.19), Reserved (ρ = −.10), and Excitable (ρ = −.06) scales negatively predicted leader decision making and problem solving. These findings indicate that managers who are reluctant to act, are aloof, or make decisions based on emotion rather than available information tend to be viewed as poor at using sound reasoning to make decisions and implement effective solutions to problems at work.

Achievement Orientation

Scores on the Imaginative (ρ = −.19), Leisurely (ρ = −.19), and Skeptical (ρ = −.13) scales negatively predicted supervisory ratings of achievement orientation. In other words, managers who are easily distracted, are stubborn about helping others, and hold grudges tend to have difficulties in striving to meet and exceed goals for themselves and others.


Dependability

Scores on the Skeptical (ρ = −.22), Leisurely (ρ = −.22), Mischievous (ρ = −.20), Imaginative (ρ = −.17), Diligent (ρ = −.16), and Reserved (ρ = −.16) scales negatively predicted dependability in managerial roles. These findings suggest that managers who are overly cynical, privately irritable, impulsive, eccentric, unable to delegate, and aloof tend to be viewed as undependable for performing work in a consistent and timely manner.


Adaptability/Flexibility

Scores on the Imaginative (ρ = −.19) and Excitable (ρ = −.14) scales negatively predicted leader adaptability and flexibility. In other words, oddly creative and emotionally volatile managers tend to have difficulties in adapting quickly to changing circumstances, switching directions as needed, and being willing to try new methods at work.

Interpersonal Skills

Scores on the Excitable (ρ = −.14), Diligent (ρ = −.10), and Imaginative (ρ = −.09) scales negatively predicted interpersonal skills in managerial roles. These findings indicate that managers who are emotionally volatile, perfectionistic, and oddly eccentric tend to be viewed as unable to get along with others or behave appropriately in social situations.

Summary of Findings

The Excitable, Skeptical, Cautious, Reserved, and Leisurely scales comprising Horney's (1950) “moving away from others” factor were responsible for 26 of 43 statistically significant outcomes across criteria. All five scales negatively predicted overall managerial performance and leading others, and a majority of these scales negatively predicted work attitude, making effective decisions, and being perceived as a dependable leader.

The Bold, Mischievous, Colorful, and Imaginative scales comprising Horney's (1950) “moving against others” factor accounted for 13 statistically significant outcomes, with all four scales negatively predicting managerial trustworthiness and individual scales negatively predicting work attitude, achievement orientation, dependability, adaptability, and interpersonal skills. However, these tendencies showed mixed results for overall managerial performance and positively predicted leading others.

Finally, the Diligent and Dutiful scales comprising Horney's (1950) “moving toward others” factor predicted only 4 of 43 statistically significant outcomes. Scores on the Diligent scale negatively predicted dependability and interpersonal skills, with Dutiful scores showing mixed findings for trustworthiness and leading others. However, neither of these scales predicted overall managerial performance.

Potential Moderators

To investigate our third research question concerning potential moderators of the relationships between scores on dark side personality measures and leader performance, we examined the meta-analytic results for HDS scale scores and overall performance ratings. Statistical artifacts accounted for 100 per cent of the variance in results for four scales: Skeptical, Cautious, Leisurely, and Imaginative. For four of the remaining seven scales, the percentage of variance accounted for by statistical artifacts fell below the 75 per cent threshold that Hunter and Schmidt (2004) offer as a general “rule of thumb” for examining the potential for moderators: Reserved (70%), Bold (73%), Mischievous (71%), and Dutiful (71%).

It should be noted, however, that these results depend largely on the number of studies included in a meta-analysis rather than on total sample size. Even the 12 studies available for overall job performance ratings are too few to conclude that moderators do not exist when the percentage of variance accounted for is at or above 75 per cent, and too few to examine the impact of specific potential moderators such as country of origin. In comparison, an examination of 80 per cent credibility values shows that these intervals include zero for five scales: Reserved, Bold, Mischievous, Diligent, and Dutiful. As a result, we can only conclude that moderator effects are likely for some HDS scales.
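The two moderator indicators discussed above can be sketched as follows, assuming the conventional definitions: the percentage of observed variance attributable to artifacts, capped at 100, and an 80 per cent credibility interval formed as the corrected correlation plus or minus 1.2816 standard deviations. The function names and all numeric inputs are hypothetical and are not values from Table 3.

```python
Z_80 = 1.2816  # two-sided 80 per cent normal quantile


def percent_variance_explained(var_observed, var_artifact):
    """Share of observed variance attributable to artifacts, capped at 100."""
    if var_observed <= 0:
        return 100.0
    return min(100.0, 100.0 * var_artifact / var_observed)


def credibility_interval_80(rho, sd_rho):
    """Lower and upper bounds of the 80 per cent credibility interval."""
    return rho - Z_80 * sd_rho, rho + Z_80 * sd_rho


def below_75_rule(pve, threshold=75.0):
    """True when artifacts explain less than the rule-of-thumb threshold."""
    return pve < threshold


# Hypothetical scale A: some residual variance, interval spans zero.
pve_a = percent_variance_explained(0.010, 0.007)
low_a, high_a = credibility_interval_80(-0.11, 0.10)

# Hypothetical scale B: artifacts explain all variance, interval collapses.
pve_b = percent_variance_explained(0.004, 0.005)
low_b, high_b = credibility_interval_80(-0.20, 0.0)
```

In this sketch, scale A would be flagged on both indicators, whereas scale B would be flagged on neither. As the text notes, with few studies neither indicator can rule moderators out.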


Discussion

This study examined three research questions about relationships between dark side personality characteristics and critical leader work behaviors. Our results provide insights into leader performance and hold significant implications for I/O professionals working with these populations.

Our first research question concerned the identification of work behaviors critical to performance among leaders across the globe. Contrary to common notions about critical leader behaviors (e.g. employee development, building teams, managing performance), SMEs providing ratings for over 250 managerial jobs across the globe indicated that critical leader behaviors are inwardly focused, such as concerns with being trustworthy, demonstrating a positive attitude, and adapting to an ever-changing world. With respect to the Domain Model of Organizational Performance, most critical leader behaviors represented the intrapersonal domain, although the leadership domain is also represented. In short, the dimensions most critical to leaders are not technical work skills, but the basic attributes that provide a foundation for effective work behaviors and, consequently, leadership.

Our second research question focused on how scores on HDS scales predict performance and critical leader work behaviors. As previously noted, some dark side personality attributes are more toxic than others for critical leader behaviors. For example, tendencies to move away from others when stressed (i.e. Excitable, Skeptical, Cautious, Reserved, Leisurely) are consistently toxic to both overall managerial performance ratings and ratings of critical leader behaviors. Tendencies associated with manipulating others or building alliances in response to stress, however, show mixed results. One can conclude, then, that as with FFM personality dimensions, some dark side personality dimensions are more predictive of performance than others across countries, industries, organisations, and jobs. Moreover, for certain scale–behavior pairings, HDS scale scores positively predicted performance. Although dramatic and attention-seeking (i.e. Colorful) behavior negatively predicted trustworthiness, that same behavior positively predicted leading others and overall performance for managers. These observations reiterate the importance of predictor–criterion alignment, a finding previously noted for FFM personality dimensions (J. Hogan & Holland, 2003).

Our final research question concerned whether unknown variables likely moderate the relationships between scores on dark side personality measures and leader behavior. Based on an examination of credibility intervals, we found only limited evidence of potential moderators in the relationships between HDS scale scores and overall managerial performance, but we cannot examine specific moderators given the limited number of studies currently available. Additional samples would be necessary to statistically test for the presence and impact of potential moderators.

Practical Implications

This research bears significant implications for I/O professionals engaged in development and coaching efforts with leaders around the world. First, professionals should ensure that their development efforts are focused on critical leader behaviors. That is, before working with incumbent leaders on business and leadership skills related to team-building and managing resources, coaches would be well advised to maximise the leader's intrapersonal capabilities including trustworthiness, positive attitude, dependability, and adaptability. To build a high performing team and drive productivity, a leader must first be trusted and viewed as someone on whom others can depend.

Second, professionals should be mindful of the impact of dark side personality characteristics on critical aspects of leader performance. In particular, coaches should work with incumbent leaders on developing strategic self-awareness around tendencies to move away from others when stressed. For example, leaders may gain awareness of an inclination to hide behind a closed office door as a response to looming deadlines or other stressors. By using assessment of dark side personality characteristics to target developmental interventions to the specific needs and characteristics of executive clients, coaches can help mitigate the negative impact of dark side personality attributes on critical leader performance behaviors (Nelson & Hogan, 2009).

Furthermore, our results show that not all dark side personality characteristics consistently have negative relationships with different work outcomes. Executive coaches can use this information to better target behaviors likely to be impacted by an individual's specific dark side personality characteristics. For example, when working with a manager who received a high score on the HDS Colorful scale, it is important to note that others are likely to view this individual as leaderlike and charismatic, but are less likely to view him or her as trustworthy. Therefore, simply encouraging this manager to be more subdued and less attention-seeking may, in fact, have a negative impact on how others see him or her as a leader. Instead, it would be more effective to focus specifically on concerns over trustworthiness by encouraging behaviors aimed at building perceptions of trust.

Finally, our research indicates that coaching efforts aimed at reducing the effects of dark side personality characteristics on critical leader behaviors may show positive results around the globe. Despite mean score differences across cultures, the overall structure of personality generalises across languages and countries. As such, professionals can export these lessons to their work with incumbent managers and executives across continents.

Limitations and Directions for Future Research

Although this research provides unique insights into the impact of dark side personality attributes on critical leadership performance behaviors, it is not without its limitations. First, complex analyses were completed to ensure equivalence of translations of the Hogan Development Survey. In contrast, the Competency Evaluation Tool used to assess criticality of leader work behaviors was not subject to the same process. Although our research indicates consistency in ratings across countries, future efforts should focus on replicating these results using additional job analysis measures and similar equivalence analyses on a wider range of jobs from more countries.

Also, despite the fact that we derived job analysis results from multiple samples representing a variety of countries, these samples were mainly from individualistic Western cultures. As such, future research investigating these relationships should include additional international samples, especially from collectivistic Eastern cultures. Given cultural differences in mean personality scores and conceptualisations of effective leadership, these additional non-Western samples would further strengthen this line of research. Furthermore, an examination of additional datasets with both dark side personality and job performance measures from a more diverse set of countries could help better address the potential moderating effects of country on these relationships.

Finally, we propose a number of directions for future research examining relationships between dark side personality characteristics and job performance. Although the current study provides some insights into these relationships, several additional research questions have yet to be examined.

First, Connelly and Ones (2010) and Oh, Wang, and Mount (2011) provided initial meta-analytic evidence that observer ratings of personality may provide better prediction than self-ratings. Specifically, operational validities of observer ratings for Five-Factor Model personality dimensions were higher than those based on self-report ratings for predicting academic achievement or overall job performance. Moreover, observer ratings provided incremental validity above and beyond self-ratings, but the reverse was not true. Because similar research has yet to be conducted on the relationships between dark side personality characteristics and job performance, future research should explore these possibilities.

Second, prior research demonstrates that performance ratings may vary by rater groups such as supervisors, peers, and subordinates, and that between-source variance may provide a more comprehensive account of the construct of managerial performance (Oh & Berry, 2009). Future research investigating scores on dark side personality measures and performance behaviors should include multiple rater sources to investigate these potential differences.

Third, measures of counterproductive work behaviors (e.g. absenteeism, theft, safety violations) were not available in our 12 samples. Future research including dark side personality measures and counterproductive work behaviors for managers could provide additional insight into the impact of dark side personality characteristics on negatively oriented job outcomes.

Fourth, several of our datasets included data across multiple managerial levels (i.e. entry-level supervisors, middle managers, executives), which did not allow us to investigate potential moderating effects of job level on relationships between scores on dark side personality dimensions and critical performance behaviors. Future research should extend these methods to specific managerial job levels and to other job families (e.g. professionals, sales and customer support).

Fifth, although we limited our review to relationships between dark side personality characteristics and critical leader behaviors, other behaviors are also likely important for specific leadership roles. Future research should examine such relationships for additional job outcomes.

Finally, research has shown that relationships between dark side personality characteristics and job performance may be curvilinear (Ames & Flynn, 2007; Grijalva, Harms, Newman, & Gaddis, 2012; Kaiser & Hogan, 2011). We encourage the examination of non-linear relationships using multiple samples. As these potential directions for future research suggest, researchers are only beginning to fully explore and realise the complex nature of relationships between dark side personality characteristics and job performance outcomes.


Conclusion

Managerial performance profoundly impacts individuals and broader systems. Given the prevalence and cost of failed leaders, the executive coaching industry continues to flourish. Coaches often rely on personality assessment to impart strategic self-awareness to clients as part of these development initiatives. These efforts, however, typically focus only on leveraging relationships between bright side personality and leadership behavior. This research provides new insights by illustrating that critical leader behaviors focus not on technical savvy or leadership insights, but on inner characteristics that facilitate these skills. Moreover, we demonstrate that scores on dark side personality measures significantly predict critical leader behaviors, and that these relationships generalise to leaders across the globe. Taken together, this research provides coaches with new insights to mitigate the negative impact of certain personality attributes on critical performance behaviors. Given the consequences of failed or destructive leadership, I/O professionals can use these insights to monitor and mitigate these tendencies before they become toxic for individual leaders, their organisations, or society.