Renee Engeln-Maddox, Department of Psychology, Northwestern University; Steven A. Miller, Department of Psychology, California State University, Fullerton.

Portions of these data were presented at the 18th annual convention of the Association for Psychological Science. Thanks to the anonymous reviewers who provided helpful feedback on an earlier draft of this manuscript.

Address correspondence and reprint requests to: Renee Engeln-Maddox, Department of Psychology, Northwestern University, 2029 Sheridan Road, Evanston, IL 60208. E-mail:


This article details the development of the Critical Processing of Beauty Images Scale (CPBI) and studies demonstrating the psychometric soundness of this measure. The CPBI measures women's tendency to engage in critical processing of media images featuring idealized female beauty. Three subscales were identified using exploratory factor analysis and confirmed using confirmatory factor analysis. The Fake subscale assesses women's tendency to critique media images of women as being too perfect to be real. The Questioning/Accusing subscale assesses women's tendency to produce direct accusations suggesting that these types of images are harmful to women. The Too Thin subscale assesses the tendency to think models are too thin or eating disordered. Scores on all subscales demonstrated strong internal consistency and test-retest reliability. Evidence of convergent, discriminant, and predictive validity is presented. The CPBI may be useful in assessing the outcomes of media literacy efforts and explicating relationships between critical processing of beauty images and body image–related concerns.

The ideal female body, as represented by mainstream media sources, has become ever more unattainable for the average woman (e.g., Owen & Laurel-Seller, 2000; Spitzer, Henderson, & Zivian, 1999; Wiseman, Gray, Mosimann, & Ahrens, 1992). High rates of body image disturbance among Western girls and women have been much discussed (e.g., Cash & Henry, 1995; Rodin, Silberstein, & Striegel-Moore, 1984), and it is not uncommon for researchers and activists to place at least part of the blame for this discontent on the media and its heavily idealized images of female beauty (e.g., Anderson & DiDomenico, 1992; Becker & Hamburg, 1996; Thompson & Heinberg, 1999; Tiggemann & Pickering, 1996; Wegner, Hartmann, & Geist, 2000). Research examining how exposure to these images affects girls and women in Western cultures is readily available (e.g., Becker, Burwell, Gilman, Herzog, & Hamburg, 2002; Botta, 1999, 2003; Cattarin, Thompson, Thomas, & Williams, 2000; Cusumano & Thompson, 1997; Dittmar & Howard, 2004; Jones, 2001; Levine & Harrison, 2004; Tiggemann & McGill, 2004), and the link between such exposure and increases in body image disturbance is well established at this point (for a meta-analytic review, see Groesz, Levine, & Murnen, 2002). Given this focus on the media as the dominant supplier of images of female beauty, it is not surprising that a number of authors have now turned their attention to the creation and evaluation of media literacy–based prevention/intervention techniques designed to attenuate the impact of exposure to such images (for a review of these efforts, see Levine & Harrison, 2004). However, the design and evaluation of such programs have been limited by the lack of a measurement tool to assess the extent to which women are critical of female beauty ideals presented by the media.

Several authors (e.g., Becker & Hamburg, 1996; Irving, DuPen, & Berel, 1998; Milkie, 1999) have noted the importance of acknowledging that women do not just passively receive media but instead actively select and process media content. In other words, instead of being victims of media images, women can critically analyze and reject these images, subverting the very process that might normally lead them toward increased levels of dissatisfaction with their own appearance. Despite this apparent recognition, researchers have devoted little attention to exploring the ways in which women fight back in the battle to define female beauty. Instead, the implicit assumption has often been that women need to be taught to be critical of these images, and that psycho-educational interventions will encourage women to avoid using such media images as the standard when evaluating their own appearance.

The notion that critical processing of media images of women could be a protective factor in the fight against body image disturbance is generally based on applications of social comparison theory (Festinger, 1954). If media literacy education can decrease the credibility of media images/messages about female beauty, and increase the extent to which women view the images as unrealistic (Becker & Hamburg, 1996), this type of attitude change could theoretically disrupt the social comparison process that has been implicated as a key mediator in the relationship between exposure to these images and body image disturbance (Bessenoff, 2006; Shaw & Waller, 1995). Given the perceived importance of media literacy in women's attempts to reject unrealistic beauty standards, it is concerning that little empirical data regarding how and when women engage in such critical processing are available. Before assuming that women need to be taught how to argue against these images, it is essential to assess the ways in which they already do so.

Some studies have offered evidence regarding the nature of critical processing in response to images of idealized female beauty. In a survey of 214 high school females, Botta (1999) asked participants how often, when watching television, they questioned “why the characters need to have such perfect bodies” and “why the characters do not look more like how my friends and I look” (p. 29). These adolescents reported engaging in critical processing; the mean for an average of the responses to these two questions was 2.28, on a scale ranging from 1 (never) to 5 (always). In terms of attitudes (drive for thinness and body dissatisfaction), critical viewing was not found to be a protective factor. In a similar study focusing on magazine images, results suggested that critical processing was associated with increased eating disordered behaviors and drive for thinness and decreased body satisfaction (Botta, 2003). After exploring parental attempts to encourage critical processing of these images of women, Nathanson and Botta (2003) concluded that such attempts may actually increase some types of body image processing in adolescents. Together, these studies suggest that critical processing is not rare, but neither is it necessarily a protective factor. However, it is unclear whether the measure of critical processing used in these studies captured the construct as a whole because the measure appeared to be limited to thoughts suggesting that women shown in the media are too attractive to represent the appearance of the average female. Thus, conclusions about the impact of critical processing based on this two-item measure may be premature.

In a study examining the effectiveness of three different media literacy–based interventions, Posavac, Posavac, and Weigel (2001) reported that, following exposure to these interventions, participants were more likely to produce a “discounting statement” in a recall-based thought-listing task after viewing images of women from fashion magazines than were control group participants. Specifically, the authors identified three different types of discounting statements: statements mentioning deceptive techniques used by advertisers to create an artificial beauty standard, statements noting that fashion models do not represent the appearance of most women, and statements mentioning the conflict between these images and health concerns/biological realities.

After coding women's qualitative responses to print advertisements (collected via a thought-listing task), Engeln-Maddox (2005) reported that approximately 75% of college women included at least one statement indicative of critical processing (without any intervention) when viewing a print advertisement featuring an attractive model in a bathing suit. However, the number of such statements generated by participants in response to this ad varied greatly, ranging from 0 to 8 (out of 10 possible thoughts).

Other qualitative methodologies have also been employed to assess critical processing of beauty standards. In a series of in-depth interviews, Milkie (1999) found that adolescent girls were quick to note that the images of girls and women in the magazines they read were unrealistic and not representative of the general population. They often referred to the images as “too perfect” or “fake.” In this same study, African American girls were especially likely to criticize these images as promoting a beauty standard that was both irrelevant and unattainable for them. Hirschman and Thompson (1997) also conducted in-depth interviews, but with a sample of participants ranging in age from 6 to 28. These authors found that, in response to media images, girls and women frequently engaged in a strategy the authors referred to as “deconstructing and rejecting.” Several women in their sample described the media's female beauty ideal as unrealistic, or even fictional. They also expressed concerns over the media's power to teach this ideal and the harm that can result from this teaching.

Other evaluations of the effectiveness of media literacy–based interventions have included attempts to measure critical processing as part of the evaluation process. For example, Irving, DuPen, and Berel (1998) reported that a single-session media literacy program that involved watching a Jean Kilbourne video (from the Still Killing Us Softly series that deconstructs advertisers' representations of the female body) and engaging in a semi-structured discussion did result in an increased tendency to perceive media images of women as unrealistic. However, this result was based on a scale the authors created specifically for use with their high school participants (the Media Attitudes Questionnaire [MAQ]), for which no validity-related evidence was presented. In a similar study, Irving and Berel (2001) reported that, following exposure to a media literacy–focused program, college women demonstrated increased skepticism about these types of images and saw them as less realistic. However, the unvalidated MAQ was again used to assess this skepticism. Rabak-Wagener, Eickhoff-Shemek, and Kelly-Vance (1998) examined the outcomes associated with completion of a “healthful living” program conducted over four consecutive meetings of a college course. This intervention involved several components, including watching a Jean Kilbourne video, critical analysis of fashion ads, challenging fashion industry norms, and an exercise in the creation of counter advertisements. These authors also created a scale to assess the outcomes of this intervention (an 11-item survey regarding beliefs and behaviors related to fashion advertising images), but presented little detail regarding the development or validation of this scale.

The varied methodologies described above clearly suggest the need for a valid and reliable self-report scale measuring women's tendency to critically process media images of female beauty ideals. Because such a measure has not been available, researchers wishing to assess this construct have had to rely on cumbersome qualitative data or unvalidated survey questions written specifically for individual studies. The previously described MAQ comes closest to a self-report scale assessing critical processing of beauty images. However, two of the three subscales of this measure (desirability of looking like individuals in the media and positive expectancies associated with being thin) do not seem to tap into true critical processing. Such attitudes are oft-predicted outcomes related to levels of critical processing, but not critical processing per se. Furthermore, these two subscales are similar to measures of internalization of media ideals, which has been shown to be somewhat independent of critical processing (see Engeln-Maddox, 2005). Although qualitative data regarding women's critical responses to media images can be quite useful, working with such data is also labor intensive, and such data are difficult to compare across multiple studies by authors using different methodologies. Additionally, the frequent use of posttest-only designs has made it difficult to assess the extent to which women already questioned and critiqued the beauty standard represented by media images prior to participating in a program designed to encourage this type of response.

The purpose of the present series of studies is to detail the creation and validation of the Critical Processing of Beauty Images scale (CPBI), designed to measure women's tendency to critically process media images of female beauty. Such a scale could be useful in determining how common critical processing of beauty-focused media images is among various groups of women. The potential to assess this construct prior to conducting a media literacy intervention could help to shed light on the actual effects of such interventions on critical processing and the extent to which critical processing is truly related to the outcome variables of interest. These issues are especially important given that research up to this point has failed to establish clearly whether critical processing is a risk factor or a protective factor with regard to body image disturbance. This scale could aid in further exploring the relationship of critical processing to appearance-related dissatisfaction, eating disordered behaviors, self-esteem, and other variables of interest to researchers. Finally, if distinct subcategories of critical processing are identified, the CPBI could help researchers assess whether different types of criticisms have different origins and/or outcomes.

In the current studies, the process through which initial scale items were generated is first described, followed by a preliminary item analysis. Half of the data collected using the scale were subjected to exploratory factor analysis and the other half to confirmatory factor analysis. After subscales addressing different types of critical processing were identified, the relationship of these subscales to several measures of body image–related constructs was explored and test-retest reliability was assessed. A final study used scores on the CPBI to predict open-ended responses to media images featuring beauty ideals.


The purpose of Study 1 was to create a measure of critical processing of media images of the female beauty ideal and to conduct a preliminary item analysis to refine and shorten the initial version of the scale.


Generation of Initial Item Pool

No participants were allowed to complete more than one of the studies described below. Prior to the generation of items for this scale, over 300 undergraduate women participated in one of several different studies (e.g., Engeln-Maddox, 2005) during which they wrote their thoughts in response to print advertisements featuring highly attractive female models. In previous research, thoughts were examined by research assistants for evidence of critical processing (see Engeln-Maddox, 2005, for detailed coding guidelines). Based on this research, several categories of criticisms of the media's female beauty ideals were identified. These included (a) criticisms of the model for being too thin, unhealthy, or eating disordered; (b) suggestions that the model is unrealistic, too perfect, fake, airbrushed, or otherwise graphically manipulated; (c) comments indicating that the model is not representative of the general population of women; (d) suggestions that such images cause women and girls to feel bad about themselves; (e) comments that such images make the viewer angry; and (f) criticisms of the advertising industry and media for using these types of models. In an additional study, 10 participants identified as engaging in high levels of critical processing participated in individual interviews regarding their reactions to media images of ideal beauty. These interviews were transcribed to capture more fully the language these participants used when criticizing idealized media images of women. Several items were written to address each of the categories of criticisms listed above. Attempts were made to use the type of language employed by participants in these studies. Timing constraints did not allow for obtaining feedback about these items from the same participants whose data provided the language and content for their construction. However, the items were reviewed by a team of female research assistants who agreed that all items were face valid and clearly expressed.

To create a scale that would apply to media images in a variety of contexts, the following question format was used: “When you see a female model in a magazine, on television, or on a billboard, how often do you have the following types of thoughts?” The response scale included five options: 1 (I never have thoughts like this), 2 (I rarely have thoughts like this), 3 (I sometimes have thoughts like this), 4 (I frequently have thoughts like this), and 5 (I always have thoughts like this). The choice to use a frequency-based response scale was made on the basis of earlier research (Engeln-Maddox, 2005) demonstrating that the frequency with which women generated different types of thoughts in response to advertisements featuring highly attractive female models was significantly associated with several body image–related variables. This approach is also consistent with earlier research by Botta (1999) and Nathanson and Botta (2003) using a response scale ranging from never to always to measure critical body image processing.

In the qualitative research mentioned above, many participants indicated knowing how they should react to such images. In other words, these women often felt that criticizing these images was the socially appropriate, intelligent thing to do. Thus, there was a concern about creating a scale that would be unduly biased by social desirability concerns. To address such concerns, a large number of distracter items were included in the scale. These questions were also based on the previously mentioned qualitative data; they comprised other types of reactions women had to these images that were not criticisms related to beauty ideals (e.g., “I wonder how much money she makes?”). Thus, the final scale does not appear as a litany of criticisms of media images, but rather a list of a wide variety of thoughts one could have in response to such images. The first version of the scale contained 66 items, 37 of which were distracters.


Participants for the first administration of the scale were 101 female students from an introductory psychology participant pool. Participants ranged in age from 18 to 25 (M = 18.48, SD = 0.95). The majority of the participants (73%) identified themselves as White/Caucasian, 13% as South Asian/Indian/Pakistani, 5% as Hispanic/Latina, 4% as Black/African American, 4% as East Asian, and 1% as Middle Eastern/Arabic.


The initial 66-item survey was administered online via Zoomerang survey software, along with a brief demographic questionnaire (see Birnbaum, 2000, 2001; Gosling, Vazire, Srivastava, & John, 2004, for data supporting the use of online methodology). Participants were e-mailed a link to the survey that allowed them to complete the survey at any computer with Internet access. All participants were given course credit in exchange for completing the survey.

Results and Discussion

Responses to distracter questions were not included in analyses. After a review of descriptive statistics for each item and an additional review of the content of each item by a team of research assistants, two items were deleted as a result of low item–total correlations. An additional item was deleted as a result of an unusually high mean and low standard deviation, and a fourth was deleted because its wording was too similar to another item. Additionally, 12 distracter items were removed (not due to statistical considerations, but rather to shorten the scale). The revised version of the scale had 50 items, 25 of which were distracters. A Cronbach's alpha of .94 was obtained for the 25 remaining nondistracter items. Thus, results indicated that the initial version of the scale had acceptable internal consistency.
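For readers who wish to reproduce this kind of internal consistency estimate, the following is a minimal sketch of Cronbach's alpha in Python. It is not the authors' analysis code; the function name `cronbach_alpha` and the example responses are hypothetical, and the sketch assumes complete data (no missing responses) on the 1 to 5 response scale described above.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    Assumes every respondent answered every item (no missing data).
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item (column) across respondents.
    item_variance_sum = sum(variance(column) for column in zip(*rows))
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical responses from four participants to three items (1-5 scale).
responses = [[1, 2, 1], [3, 3, 4], [5, 4, 5], [2, 2, 3]]
print(round(cronbach_alpha(responses), 2))
```

Alpha approaches 1 as items covary more strongly relative to their individual variances, which is why a long set of similarly worded items (such as the 25 nondistracter items here) tends to yield a high value.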


The purpose of Study 2 was to establish evidence of the scale's convergent and discriminant validity and to evaluate its factor structure. The revised scale was administered with a number of measures predicted to be related to this type of critical processing (measures are described in detail below). It was predicted that scores on the CPBI would correlate positively with interest in women's studies, interest in media studies, and the number of women's studies courses taken because these areas of study generally focus heavily on media literacy. Additionally, to establish that this type of critical processing is not simply a manifestation of the tendency to engage in effortful cognitive processing in general, a measure of need for cognition was included. Research up to this point has generated mixed conclusions with regard to whether this type of critical processing is a protective factor (e.g., Irving et al., 1998), risk factor (e.g., Nathanson & Botta, 2003), or unrelated factor (e.g., Engeln-Maddox, 2005) relative to body image disturbance. One reason for this mixed evidence may be a failure to consider that there are several distinct types of critical processing in which one could engage, each with a potentially distinct effect. Thus, although body image–related measures were included to explore these relationships, it was not possible to make specific predictions regarding the relationships of different types of critical processing to body image variables.



Participants in this phase of the research included 393 female college students in an introductory psychology participant pool who received course credit in exchange for their participation. The mean age of participants in this sample was 18.53 (SD = 1.12). Sixty-five percent of the participants identified as White/Caucasian, 14% as Hispanic/Latina, 7% as Black/African American, 6% as South Asian/Indian/Pakistani, 4% as East Asian, 2% as biracial, and 2% as Middle Eastern/Arabic. Additionally, all women's studies majors and minors (a total of approximately 100 students) at the same university were invited to participate in the study via an e-mail announcement. In exchange for completing the survey, these participants were given the option to enter a raffle for a bookstore gift certificate. Twenty-two women from this group opted to complete the survey. This subsample was slightly older on average (M = 23.64, SD = 6.95). All but two of the women in this group identified themselves as White/Caucasian. A Fisher's exact test confirmed that the two subsamples were not significantly different in ethnic composition (p = .12). Including both the introductory psychology students and the women's studies students resulted in a total sample size of 415.


Half of the sample completed the CPBI prior to the validation measures; the other half completed the validation measures first. All measures were completed online using Zoomerang survey software (see above). Multivariate analysis of variance confirmed no significant differences in scores on any of the measures described below based on whether the CPBI was presented prior to or after the validation measures, F(12, 335) = 0.42, p = .95.


Sociocultural Attitudes Toward Appearance Questionnaire–3 (SATAQ-3).  The SATAQ-3 (Thompson, van den Berg, Roehrig, Guarda, & Heinberg, 2004) is a recently updated measure of social influences on body image. It includes subscales assessing two types of internalization of media influence (one general, e.g., “I would like my body to look like the models who appear in magazines,” and one athlete/sport figure specific, e.g., “I wish I looked as athletic as sports stars”), the use of media as an information source regarding physical appearance (e.g., “Magazine advertisements are an important source of information about fashion and ‘being attractive’”), and perceived pressure from the media to look like media ideals (e.g., “I've felt pressure from TV or magazines to have a perfect body”). Response options on this scale range from 1 (completely disagree) to 5 (completely agree); the scores for each subscale are the sum of the responses to each relevant item, after reverse scoring the appropriate items. Validity evidence for this scale includes positive correlations with measures of body image disturbance and higher scores among eating disordered subjects (compared to controls; Thompson et al., 2004). The creators of the scale reported alphas for scores on these subscales ranging from .89 to .94 (Thompson et al., 2004). For scores obtained in this study, alphas were .96 for the Internalization-General, Information, and Pressure scales. Alpha for the Internalization-Athlete scale was .89.

Eating Disorder Inventory–2, Body Dissatisfaction Subscale (EDI-BD).  This subscale of the well-validated EDI-2 (Garner, 1991) assesses dissatisfaction with the overall shape and size of specific regions of the body (e.g., “I think that my stomach is too big”). Participants indicate their level of satisfaction with various body areas on a scale ranging from 1 (always) to 6 (never). After reverse scoring the appropriate items, participants were assigned one point for each item to which they responded always, usually, or often, with higher scores indicating greater dissatisfaction. Scores on this scale are positively associated with eating-disordered behavior (Spillane, Boerner, Anderson, & Smith, 2004) and body weight (Garner, Olmstead, & Polivy, 1983) and can reliably distinguish patients with eating-disorder diagnoses from comparison group participants (Garner et al., 1983). Reported reliability coefficients for college women range from .83 to .93 (Garner et al., 1983). Cronbach's alpha was .89 in the present sample.

Multidimensional Body-Self Relations Questionnaire—Appearance Scales (MBSRQ-AS).  This is a nationally standardized, well-validated (Brown, Cash, & Mikulka, 1990; Cash, Winstead, & Janda, 1985, 1986) measure of the affective, cognitive, and behavioral components of body image (Cash, 2000). Although the measure comprises five subscales, the subscale employed in this study was the Appearance Evaluation scale (e.g., “I like my looks just the way they are”), which measures satisfaction with one's appearance (not body specific). This measure uses a response scale ranging from 1 (definitely disagree) to 5 (definitely agree). The total score for this scale is the mean of responses to the seven Appearance Evaluation subscale items (after reverse scoring). Scores on this measure are negatively correlated with depression and other measures of body satisfaction (Denniston, Roth, & Gilroy, 1992; Engeln-Maddox, 2005). Additionally, scores were shown to increase following weight loss among a sample of obese participants (Dixon, Dixon, & O'Brien, 2002). Cronbach's alpha for scores on this scale has been reported as .88 (Cash, 2000). In this sample, alpha was .87.

Body Mass Index (BMI).  Participants' self-stated height and weight (collected via a demographics questionnaire) were used to calculate BMI. The following formula was used: BMI = {[weight (lbs.)] / [height (in.)]²} × 703.
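As an illustration, the BMI formula given above can be written as a short Python function. The function name `bmi_from_imperial` and the example height and weight are hypothetical and are not drawn from the study data.

```python
def bmi_from_imperial(weight_lb, height_in):
    """BMI from weight in pounds and height in inches (conversion factor 703)."""
    if weight_lb <= 0 or height_in <= 0:
        raise ValueError("weight and height must be positive")
    return weight_lb / height_in ** 2 * 703


# Hypothetical participant: 150 lb, 65 in.
print(round(bmi_from_imperial(150, 65), 2))
```

The factor of 703 converts pounds per square inch to the conventional kilograms per square meter.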

Need for Cognition Scale (NCS).  The NCS assesses the degree to which an individual seeks to engage in and enjoys effortful cognitive processing (e.g., “I would prefer complex to simple problems”). A shortened, 18-item version of the scale was used (Cacioppo, Petty, & Kao, 1984). Response options ranged from 1 (extremely uncharacteristic of me) to 5 (extremely characteristic of me). Total scores were obtained by summing responses to each of the 18 items after reverse scoring the appropriate items. Scores on this scale obtained from a wide variety of samples have demonstrated reliability (alpha of .90) and validity (the scale predicts participants' enjoyment of complex vs. simple tasks and discriminates between those in jobs associated with high vs. low need for cognition; Cacioppo & Petty, 1982; Cacioppo et al., 1984). Alpha was .91 for this sample.

Balanced Inventory of Desirable Responding (BIDR-7).  This version of the BIDR (Paulhus, 1988) assesses both self-deceptive positivity (the tendency to give self-reports that are honest but positively biased, e.g., “Many people think that I am exceptional”) and impression management (deliberate presentation to an audience, e.g., “I never cover up my mistakes”). Response options range from 1 (totally disagree) to 7 (totally agree). After reverse scoring appropriate items, one point is added for each extreme score (6 or 7). Global scores from the BIDR (summing both subscales) have demonstrated adequate internal consistency (α = .83; Paulhus, 1991) and strong test-retest reliability. Scores on the BIDR correlate highly with other measures of social desirability such as the Marlowe-Crowne scale (Paulhus, 1991). In this study alpha was .77.
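The dichotomous scoring rule described for the BIDR (reverse score, then count one point per extreme response) can be sketched as follows. This is an illustrative sketch only: the function name `bidr_score`, the item positions treated as reverse keyed, and the example responses are all hypothetical and do not reproduce the published scoring key.

```python
def bidr_score(responses, reverse_keyed, scale_max=7):
    """Count extreme responses (6 or 7) after reverse scoring.

    responses: list of raw ratings on a 1..scale_max scale.
    reverse_keyed: set of zero-based item positions to reverse score
                   (hypothetical here; the real key comes from the measure).
    """
    score = 0
    for position, rating in enumerate(responses):
        if not 1 <= rating <= scale_max:
            raise ValueError("rating outside the response scale")
        if position in reverse_keyed:
            rating = scale_max + 1 - rating  # 1 <-> 7, 2 <-> 6, and so on
        if rating >= scale_max - 1:          # extreme score: 6 or 7
            score += 1
    return score


# Hypothetical five-item example with items 1 and 4 reverse keyed.
print(bidr_score([7, 1, 6, 4, 2], reverse_keyed={1, 4}))
```

Counting only extreme responses means that moderate agreement earns no points, so higher totals reflect more strongly exaggerated desirable responding.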

Additional measures.  Participants rated their interest in women's studies and media studies. Seven-point scales were used, ranging from 1 (not at all interested) to 7 (extremely interested). Basic demographic information was also collected.

Results and Discussion

Exploratory and Confirmatory Factor Analysis

The data from Study 2 were randomly split into two halves; the first half was used for exploratory factor analysis (198 participants) and the second for confirmatory factor analysis (182 participants after accounting for missing data). Although it was predicted that multiple factors would emerge, no specific factor structure was hypothesized. Consistent with recommendations on use of factor analysis for scale creation (e.g., Gorsuch, 1997; Preacher & MacCallum, 2003), principal axis factoring with direct oblimin rotation was used to examine the factor structure of the scale. Examination of Kaiser-Meyer-Olkin's (Kaiser, 1970, 1974) measure of sampling adequacy (MSA) revealed that these items had a high degree of common variance, falling into these authors' highest category of common variance. Additionally, each item's MSA was examined. The lowest individual MSA was .86, again indicating a good degree of factorability. Research has found that parallel analysis (i.e., Horn, 1965) works particularly well for identifying how many factors to retain when conducting exploratory factor analyses (Kahn, 2006; Zwick & Velicer, 1986). Parallel analysis was conducted using O'Connor's (2000) SPSS program; it suggested either a three- or four-factor structure for the present sample. An examination of the rescaled pattern matrix revealed that only one item was loading on the fourth factor; thus, this item was removed and a three-factor solution was forced. The pattern matrix for the three-factor solution was examined for loadings on any of the three factors of at least .30. Items with loadings greater than .30 on any factor were retained unless the items demonstrated loadings of greater than .30 on multiple factors. In these cases, if such loadings were less than .10 apart, the items were discarded. Based on these criteria, seven additional items were removed. This left 17 items (not including distracters). For these 17 items, eigenvalues prior to rotation were 9.25, 2.39, and 1.79, respectively (5.99, 4.41, and 4.42 after rotation), and the cumulative common variance accounted for was 62%. The first and second factors had a correlation of .52, the first and third a correlation of .57, and the second and third a correlation of .40. See Table 1 for rescaled pattern matrix coefficients from the exploratory factor analysis.
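The item retention rule described above can be expressed as a small Python function. This is one reading of the rule rather than the authors' procedure: it assumes that loadings are compared in absolute value (several Questioning/Accusing loadings are negative) and that "greater than .30" is a strict inequality. The function name `retain_item` is hypothetical, and the second and third example rows are invented for illustration.

```python
def retain_item(loadings, threshold=0.30, min_gap=0.10):
    """Apply the retention rule to one item's pattern matrix loadings.

    Keep the item if it loads above the threshold on exactly one factor,
    or if it loads above the threshold on several factors but the two
    largest salient loadings are at least min_gap apart.
    Assumption: loadings are compared in absolute value.
    """
    salient = sorted((abs(x) for x in loadings if abs(x) > threshold),
                     reverse=True)
    if not salient:
        return False   # no salient loading on any factor
    if len(salient) == 1:
        return True    # clean loading on a single factor
    return salient[0] - salient[1] >= min_gap


# "Nobody's that perfect." from Table 1: salient on the Fake factor only.
print(retain_item((0.41, -0.20, 0.22)))
# Invented cross-loading item: two salient loadings only .05 apart.
print(retain_item((0.45, -0.40, 0.05)))
# Invented item with no salient loading.
print(retain_item((0.25, 0.10, 0.05)))
```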

Table 1.
Critical Processing of Beauty Images Scale Rescaled Pattern Matrix Coefficients

| Item | Fake | Questioning/accusing | Too thin |
| --- | --- | --- | --- |
| She's way too thin. | .00 | .03 | .68 |
| Women like that set the bar too high for the rest of us. | .16 | −.45 | .08 |
| Images like that make women feel like they have to look perfect. | −.12 | −.81 | .07 |
| She should eat more. | −.05 | −.11 | .68 |
| Images like that make women feel badly about themselves. | −.04 | −.87 | .05 |
| Why do models have to be so perfect-looking? | .05 | −.78 | −.10 |
| She looks malnourished. | .05 | .00 | .77 |
| It takes a ton of make-up to make someone look that good. | .59 | .05 | .14 |
| She's airbrushed. | .66 | .01 | .17 |
| It's not good for women to have to look at things like this. | .23 | −.47 | .04 |
| Nobody looks like that without computer tricks. | .84 | .03 | −.02 |
| They probably used computer re-touching to make her look like that. | .80 | −.05 | .02 |
| That kind of perfection isn't real. | .78 | −.01 | .04 |
| She's too skinny to be healthy. | .15 | −.03 | .65 |
| Nobody's that perfect. | .41 | −.20 | .22 |
| It takes a lot of camera tricks to make someone look that good. | .75 | .03 | .03 |
| You have to have a make-up artist to look like that. | .57 | −.10 | −.13 |

Note. N = 198.

Confirmatory factor analysis using maximum likelihood estimation via LISREL VIII was conducted on the second subsample to compare the three-factor solution and a one-factor solution. To assess fit, Hu and Bentler's (1999) two-index strategy of presenting the standardized root mean squared residual (SRMR) and the comparative fit index (CFI) was utilized. The root mean square error of approximation (RMSEA) was also examined (Bentler, 2007). According to Hu and Bentler's (1999) guidelines, SRMR should be approximately .08 and CFI at least .95 to conclude a relatively good fit between the hypothesized model and the observed data. Steiger and Lind (1980) stated that RMSEA should be less than .08 to be considered a good fit. The first model tested included the 17 items (three factors) identified during exploratory factor analysis. Because this model did not demonstrate adequate fit with the data using RMSEA (or other fit indices such as the goodness of fit index), the lowest loading items were trimmed iteratively from the model. The final model contained 11 items. See Table 2 for fit indices (additional fit indices are available from the authors). According to these guidelines, the trimmed three-factor model demonstrated adequate fit and was superior to the one-factor model (Δχ² = 315.18, Δdf = 3, p < .001). Because items were trimmed during this stage, additional distracter items were removed as well, resulting in a 22-item scale with an equal number of active and distracter items.

Table 2.
Confirmatory Factor Analysis

| Model | χ² | df | CFI | SRMR | RMSEA |
| --- | --- | --- | --- | --- | --- |
| Untrimmed 17-item three-factor model | 278.72 | 116 | .971 | .067 | .088 |
| One-factor model | 401.28 | 44 | .864 | .118 | .229 |
| Trimmed 11-item three-factor model | 86.10 | 41 | .983 | .048 | .076 |

Note. All coefficients were statistically significant at p < .001. All R² values were greater than .55. N = 182. Phi, Theta Delta, and Lambda matrices and R² values are available upon request from the first author. CFI = comparative fit index; RMSEA = root mean square error of approximation; SRMR = standardized root mean squared residual.
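As a rough arithmetic check on the fit statistics, the sketch below recomputes the χ² difference between the one-factor and trimmed three-factor models and approximates RMSEA from the χ², df, and N reported in Table 2. The RMSEA expression used here, sqrt(max(χ² − df, 0) / (df(N − 1))), is one common textbook point estimate and is an assumption; the software used in the study may compute the index differently, so small discrepancies from the tabled values are to be expected.

```python
import math


def rmsea(chi2: float, df: int, n: int) -> float:
    """Approximate RMSEA point estimate (assumed textbook formula)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


N = 182  # confirmatory subsample size reported in Table 2

# (chi-square, degrees of freedom) as reported in Table 2
models = {
    "untrimmed_17_item_three_factor": (278.72, 116),
    "one_factor": (401.28, 44),
    "trimmed_11_item_three_factor": (86.10, 41),
}

approx_rmsea = {name: rmsea(chi2, df, N) for name, (chi2, df) in models.items()}

# Chi-square difference test inputs for the nested 11-item models
delta_chi2 = models["one_factor"][0] - models["trimmed_11_item_three_factor"][0]
delta_df = models["one_factor"][1] - models["trimmed_11_item_three_factor"][1]

for name, value in approx_rmsea.items():
    print(f"{name}: RMSEA ~ {value:.3f}")
print(f"delta chi2 = {delta_chi2:.2f}, delta df = {delta_df}")
```

Under this formula the approximations come out near .088, .212, and .078 for the three models, and the difference statistics match the reported Δχ² = 315.18 with Δdf = 3.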

Factor 1, named the Fake factor (5 items), comprises items noting the unrealistic perfection of the appearance of models and that such perfection is often artificially created. The second factor was named the Questioning/Accusing factor (3 items). The items loading on this factor criticize and question the use of such images and directly suggest that such images are harmful to women. The final factor, Too Thin (3 items), includes arguments that the women in these images are too thin and/or unhealthy. The Fake and Questioning/Accusing factors had a correlation of .52, the Fake and Too Thin factors at .69, and the Questioning/Accusing and Too Thin factors at .52. Each factor was treated as a subscale, with the score for each subscale being the mean of items loading on that factor. Cronbach's alphas for subscales were high (.92, .85, and .84, respectively). Thus, the combination of exploratory and confirmatory factor analysis was successful in identifying three meaningful and internally consistent CPBI subscales. The complete CPBI scale (including distracter questions) is located in the Appendix.
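For readers who wish to reproduce the internal consistency estimates on their own data, the following is a minimal sketch of Cronbach's alpha, α = k/(k − 1) × (1 − Σ item variances / variance of total scores). The function name and the response matrix are hypothetical illustrations, not study data.

```python
from statistics import variance
from typing import List, Sequence


def cronbach_alpha(responses: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a respondents-by-items matrix of scores."""
    k = len(responses[0])  # number of items in the subscale
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: 5 respondents x 3 items (e.g., a 3-item subscale)
example_responses: List[List[int]] = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 4, 3],
    [4, 4, 5],
    [5, 5, 4],
]
alpha = cronbach_alpha(example_responses)
print(f"alpha = {alpha:.2f}")
```

Sample variances (n − 1 denominator) are used throughout; because the same denominator appears in both the numerator and denominator of the ratio, the choice does not affect the result.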

Convergent and Discriminant Validity

Descriptive statistics for the three subscales and the correlation matrix are located in Table 3. Because 39 correlations were examined (corresponding to an uncorrected family-wise error rate of .86), a Bonferroni-type correction was applied: the family-wise error rate was divided by the number of comparisons being made, yielding an individual alpha level of .02.
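The arithmetic behind these figures can be sketched as follows. The per-test alpha of .05 is an assumption (it is not stated in the text, although it reproduces the reported family-wise rate of .86), and the function name is hypothetical. The conventional Bonferroni threshold is shown only for comparison.

```python
def familywise_error(alpha_per_test: float, n_tests: int) -> float:
    """Probability of at least one Type I error across independent tests."""
    return 1 - (1 - alpha_per_test) ** n_tests


N_TESTS = 39
ASSUMED_ALPHA = 0.05  # assumption: nominal per-test alpha before correction

fwer = familywise_error(ASSUMED_ALPHA, N_TESTS)

# Correction as described in the text: family-wise rate / number of tests
per_test_alpha = fwer / N_TESTS

# For comparison only: the conventional Bonferroni threshold (alpha / tests)
conventional_bonferroni = ASSUMED_ALPHA / N_TESTS

print(f"family-wise error rate = {fwer:.2f}")
print(f"per-test alpha (as described) = {per_test_alpha:.2f}")
print(f"conventional Bonferroni alpha = {conventional_bonferroni:.4f}")
```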

Table 3.
Correlations Between Three Critical Processing of Beauty Images Scale Subscales and Published Measures

| Measure | Fake | Questioning | Too thin |
| --- | --- | --- | --- |
| Fake subscaleᵃ | – | | |
| Questioning subscaleᵃ | .44* | – | |
| Too thin subscaleᵃ | .59* | .41* | – |
| Need for cognition scaleᶜ | .05 | .01 | .05 |
| Body Mass Indexᵈ | .00 | .11 | .17* |
| Interest in women's studiesᵉ | .13* | .19* | .08 |
| Interest in media studiesᵉ | .05 | .12* | .04 |
| Number of women's studies courses taken | .11 | .10 | .14* |
| Body dissatisfaction (EDI-2)ᶠ | .04 | .24* | .02 |
| Appearance evaluation (MBSRQ-AS)ᵍ | .01 | .26* | .03 |
| SATAQ-3: Internalization generalʰ | .01 | .36* | .06 |
| SATAQ-3: Internalization athleteʰ | .08 | .37* | .16* |
| SATAQ-3: Informationʰ | .03 | .20* | .08 |
| SATAQ-3: Pressuresʰ | .08 | .46* | .05 |
| Standard deviation | 0.98 | 1.03 | 1.00 |

Note. N = 350. The complete correlation matrix is available from the first author. BIDR = Balanced Inventory of Desirable Responding; EDI = Eating Disorder Inventory–2; MBSRQ–AS = Multidimensional Body-Self Relations Questionnaire–Appearance Scales; SATAQ-3 = Sociocultural Attitudes Toward Appearance Questionnaire-3.
ᵃ Possible scores range from 1 (infrequent critical thoughts in this area) to 5 (frequent critical thoughts in this area).
ᵇ Possible scores range from 0 to 40 (high levels of socially desirable responding).
ᶜ Possible scores range from 18 (low need for cognition) to 90 (high need for cognition).
ᵈ BMIs below 18.5 are considered underweight, 18.5–24.9 normal, 25–29.9 overweight, and 30 and above obese.
ᵉ Possible scores range from 1 (not at all interested) to 7 (extremely interested).
ᶠ Possible scores range from 0 (low levels of dissatisfaction) to 27 (high levels of dissatisfaction).
ᵍ Possible scores range from 1 (low satisfaction with appearance) to 5 (high satisfaction with appearance).
ʰ Possible SATAQ scores range from 9–45 (Internalization General and Information), 5–25 (Internalization Athlete), and 7–35 (Pressures).
*p < .001.

The pattern of correlations with body image–related variables was complex. Interestingly, higher scores on the Too Thin subscale were associated with higher BMIs. Thus, the less participants' bodies exemplified the typical media ideal in terms of size, the more likely participants were to criticize that ideal as being too thin. Additionally, the Too Thin subscale was positively correlated with the Internalization-Athlete subscale of the SATAQ-3. The Fake subscale was not significantly correlated with any of the body image–related variables. However, the Questioning/Accusing subscale demonstrated a number of significant relationships. Specifically, higher scores on this subscale were associated with increased body dissatisfaction and decreased satisfaction with appearance. Scores on the Questioning/Accusing subscale were also significantly and positively associated with all subscales of the SATAQ-3.

Overall, Study 2 provided strong evidence of the discriminant validity of CPBI scores. Specifically, CPBI scores were not significantly correlated with socially desirable responding or the more general measure of the tendency to engage in effortful cognitive processing. The three subscales demonstrated a more complex series of results with regard to convergent validity. Scores on all three subscales were associated with interest in women's studies and number of women's studies courses taken. These results are consistent with several possible interpretations. First, those concerned with media representations of women may gravitate toward women's studies courses. Second, women's studies courses may foster an increased tendency to notice and critique media representations of women that are deemed harmful by many. Additionally, individuals with feminist attitudes may be more likely both to enroll in such courses and to critique media representations. Of course, these possible interpretations are not mutually exclusive and may be related in a reciprocal manner. Interest in media studies was associated only with the Questioning/Accusing subscale. Given that this subscale comprises the most general of the media-related criticisms included in the CPBI, it makes sense that those with a more general interest in media studies (i.e., not specifically as it relates to women's studies) would have higher scores on this subscale. The pattern of correlations with scores on the Too Thin subscale also suggests multiple interpretations. Given that women with higher BMIs had higher scores on this subscale, claiming that women in media images are too thin may simply be a defensive technique, designed to reduce the negative impact of the perceived discrepancy between one's own body and the idealized female bodies often seen in the media. On the other hand, scores on this subscale were also positively correlated with the Internalization-Athlete subscale of the SATAQ-3. 
Questions on this subscale of the SATAQ focus on the desire to look athletic. Thus, it is possible that women scoring higher on the Too Thin subscale have internalized a body ideal that is more athletic (and perhaps more consistent with their own body type), such that many models or actresses really do appear too thin when evaluated via such a standard. Surprisingly, scores on the Fake subscale showed no relationships with appearance-related satisfaction or internalization of media ideals. Given that many media literacy efforts focus on demonstrating how media images of women are graphically manipulated to achieve an unrealistic level of perfection, this finding is worthy of further research. It is possible that such arguments help some to dismiss such images as inappropriate targets for social comparison; the same arguments may discourage other women because they are a reminder of how impossible the ideal is to achieve. Finally, scores on the Questioning/Accusing subscale were consistently associated with greater appearance-related dissatisfaction and higher levels of internalization. This finding is consistent with the research by Botta (1999, 2003) and Nathanson and Botta (2003), suggesting that questioning the nature of such images is related to increased body image disturbance. However, the causal direction of this relationship cannot be determined from these data. In other words, does one question these images because they are already experienced as harmful, or does questioning these images lead to greater body image disturbance, perhaps due to an increased amount of attention devoted to processing the images? Thus, more research is also warranted in this area, particularly experimental or longitudinal designs that could help to explicate the direction of this relationship.


Due to concern about the ability of participants to accurately gauge the frequency with which they have the types of thoughts listed in the CPBI, an additional study was undertaken to test whether scores on the CPBI could predict the number of critical statements related to beauty standards made in response to advertisements featuring idealized images of female beauty.



Participants were 120 female undergraduate students (mean age = 18.36, SD = 0.68) who took part in the study to receive course credit as part of an introductory psychology participant pool. Sixty-six percent of participants identified themselves as White/Caucasian, 9% as Black/African American, 8% as Hispanic/Latina, 8% as South Asian/Indian/Pakistani, 6% as East Asian, and 3% as biracial.


For the first half of this study, participants viewed three advertisements from recent women's magazines. The models in these ads were all rated as highly attractive during pilot testing. Two ads were for bathing suits and featured women in bikinis; the third was a makeup ad, featuring a close-up of a woman's face. Participants were asked to write their thoughts in response to these advertisements and were given space to write up to 10 thoughts (see Engeln-Maddox, 2005, for more details about this methodology). All thoughts listed were coded by undergraduate research assistants into one of four categories. The first three categories were descriptions of the three subscales of the CPBI listed above; the fourth category was for other thoughts not consistent with one of these three categories. Comments coded into the Fake category included those indicating that the model was airbrushed, fake, or computer-enhanced; comments indicating that her appearance was the result of makeup or photography tricks; and comments indicating that the model's appearance was too perfect to be real. The second category (consistent with the Questioning/Accusing Subscale) included comments that these types of models make women feel bad about their own appearance in comparison or make women have low self-esteem, comments questioning why advertisers use models like this or why models have to look this way, comments suggesting that such images have a negative impact on women or girls, comments suggesting that the model does not represent the average woman, and expressions of anger over the use of such images. The final category included thoughts consistent with the Too Thin Subscale. Specifically, comments indicating that the model is too skinny, needs to eat, has an eating disorder, is unattractive due to her ribs showing, and suggestions that such images cause eating disorders in others were coded into this category.

All thoughts were coded by two research assistants, who demonstrated adequate inter-rater reliability with a kappa of .81. Disagreements with regard to coding category were resolved through discussion.
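The agreement statistic reported here can be illustrated with a short sketch, assuming the kappa in question is Cohen's kappa for two raters (the text does not name the variant). The function name and the category codes below are hypothetical examples, not the study's coding data.

```python
from collections import Counter
from typing import Sequence


def cohens_kappa(rater_1: Sequence[str], rater_2: Sequence[str]) -> float:
    """Cohen's kappa: chance-corrected agreement between two raters.

    Undefined (division by zero) if expected agreement equals 1,
    e.g., when both raters use a single category throughout.
    """
    n = len(rater_1)
    observed = sum(a == b for a, b in zip(rater_1, rater_2)) / n
    counts_1, counts_2 = Counter(rater_1), Counter(rater_2)
    categories = set(counts_1) | set(counts_2)
    # Agreement expected by chance from each rater's marginal proportions
    expected = sum(counts_1[c] * counts_2[c] for c in categories) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical codes for eight listed thoughts (four coding categories)
rater_a = ["fake", "fake", "questioning", "too_thin", "other", "other", "fake", "other"]
rater_b = ["fake", "fake", "questioning", "too_thin", "other", "fake", "fake", "other"]

kappa = cohens_kappa(rater_a, rater_b)
print(f"kappa = {kappa:.2f}")
```

In this made-up example the raters agree on seven of eight thoughts, and kappa is lower than the raw agreement rate because some agreement would be expected by chance alone.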

Three weeks after completing the first phase of this study, participants were e-mailed a link to an online version of the CPBI scale and a brief demographics questionnaire. Sixteen participants were dropped from analyses for failure to complete this second phase of the study.

Results and Discussion

The data from the thought-listing task are counts and are not normally distributed; they were therefore treated as following Poisson distributions, and Poisson regressions were performed for the following analyses. For each subscale, the count of thoughts generated (in the relevant coding category) was predicted from its associated subscale score. In a second step, the other two factor scores were added as predictors of the outcome variable. Results of these analyses are contained in Table 4. In the first step, the Fake and the Questioning/Accusing subscale scores successfully predicted the number of thoughts in their respective coding categories. For both of these subscales, in the second step the coefficient for the factor score of interest remained statistically significant, whereas the coefficients of the additional two scale scores were not significant. These results provide evidence of both convergent and discriminant validity for the Fake and Questioning/Accusing subscales. However, results were more mixed for the Too Thin subscale. Scores on this subscale marginally predicted the count of related thoughts (p = .05) at Step 2 of the analysis. However, scores on the Fake subscale also marginally predicted this count (p = .06), and the Too Thin subscale did not predict related thoughts at Step 1.

Table 4.
Poisson Regression Analyses Predicting Count of Thoughts Relating to Each Subscale

| Predictor | β | SE | z | p |
| --- | --- | --- | --- | --- |
| Count of thoughts consistent with Fake subscaleᵃ | | | | |
| Step 1 | | | | |
| Fake subscale score | | | | |
| Step 2 | | | | |
| Fake subscale score | .41 | .11 | 3.61 | <.001 |
| Questioning/accusing subscale score | | | | |
| Too thin subscale | | | | |
| Count of thoughts consistent with Questioning/Accusing subscaleᵇ | | | | |
| Step 1 | | | | |
| Questioning subscale score | | | | |
| Step 2 | | | | |
| Questioning subscale score | | | | |
| Fake subscale score | | | | |
| Too thin subscale | | | | |
| Count of thoughts consistent with Too Thin subscaleᶜ | | | | |
| Step 1 | | | | |
| Too thin subscale score | | | | |
| Step 2 | | | | |
| Too thin subscale score | | | | |
| Questioning/accusing subscale score | | | | |
| Fake subscale score | | | | |

Note. N = 112. Poisson regression coefficients must be exponentiated for interpretation purposes.
ᵃ Pseudo R² = .04 for Step 1, Δ pseudo R² = .01 for Step 2, p = .22.
ᵇ Pseudo R² = .05 for Step 1, Δ pseudo R² = .01 for Step 2, p = .59.
ᶜ Pseudo R² = .01 for Step 1, Δ pseudo R² = .02 for Step 2, p = .07.
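To make the Step 1 analysis concrete, the sketch below fits a single-predictor Poisson regression (log link) by Newton-Raphson and exponentiates the slope into a rate ratio, which is the interpretation step the table note refers to. It is an illustration only: the function name, starting values, and data are hypothetical, and the Step 2 models with three predictors would in practice be fit with a dedicated statistics package rather than this hand-rolled routine.

```python
import math
from typing import List, Tuple


def fit_poisson(x: List[float], y: List[int], iterations: int = 25) -> Tuple[float, float, float]:
    """Fit log(E[y]) = b0 + b1 * x by Newton-Raphson.

    Returns (b0, b1, standard error of b1). Assumes sum(y) > 0.
    """
    b0, b1 = math.log(sum(y) / len(y)), 0.0  # start from the intercept-only fit
    for _ in range(iterations):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Gradient of the log-likelihood
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Information matrix (negative Hessian)
        h00 = sum(mu)
        h01 = sum(xi * mi for xi, mi in zip(x, mu))
        h11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    # Standard error of the slope from the inverse information matrix
    mu = [math.exp(b0 + b1 * xi) for xi in x]
    h00 = sum(mu)
    h01 = sum(xi * mi for xi, mi in zip(x, mu))
    h11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
    se_b1 = math.sqrt(h00 / (h00 * h11 - h01 * h01))
    return b0, b1, se_b1


# Hypothetical data: subscale scores (1-5) and counts of matching thoughts
subscale_scores = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
thought_counts = [0, 0, 1, 0, 1, 2, 1, 3, 2]

intercept, slope, slope_se = fit_poisson(subscale_scores, thought_counts)
rate_ratio = math.exp(slope)  # multiplicative change in expected count per scale point
print(f"b = {slope:.2f}, SE = {slope_se:.2f}, z = {slope / slope_se:.2f}, rate ratio = {rate_ratio:.2f}")
```

A rate ratio above 1 indicates that each one-point increase in the subscale score multiplies the expected number of matching thoughts by that factor.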

Overall, scores on the CPBI subscales were related to the actual number of critical thoughts generated in response to advertisements featuring idealized images of women, suggesting that participants were able to provide a realistic account of their tendency to have such thoughts in response to media images. Evidence in this regard was strongest for the Questioning/Accusing and Fake subscales. However, scores on both the Too Thin and Fake subscales predicted the number of thoughts consistent with the Too Thin theme, suggesting the need for further research to determine the overlap between these two types of responses.


To determine whether scores on the three subscales of the CPBI are relatively stable over time, an additional study was undertaken to examine test-retest reliability.



Participants were 96 college women (mean age = 19.07, SD = 1.74) who participated in the study in exchange for course credit. Participants were recruited from a psychology department participant pool. Sixty-seven percent were White/Caucasian, 9% Hispanic/Latina, 8% East Asian, 8% South Asian/Indian/Pakistani, 5% Black/African American, 1% Middle Eastern/Arabic, and 1% biracial.


Participants were randomly assigned to one of four groups. Each participant completed the CPBI scale twice (1 week apart for Group 1, 2 weeks for Group 2, 3 weeks for Group 3, and 4 weeks for Group 4). The scale was completed online each time. Two participants were dropped for failing to complete the second administration of the scale.

Results and Discussion

Test-retest reliability coefficients were strong for all four groups, ranging from .67 to .86. All test-retest coefficients are listed in Table 5. Thus, scores on the CPBI appear to be relatively consistent over time.

Table 5.
Test-Retest Coefficients

| Time between tests | Fake subscale | Questioning subscale | Too Thin subscale |
| --- | --- | --- | --- |
| 1 week | .86 | .82 | .85 |
| 2 weeks | .67 | .75 | .75 |
| 3 weeks | .80 | .84 | .81 |
| 4 weeks | .73 | .78 | .76 |

Note. For 1-week group, N = 24; for 2-week group, N = 23; for 3-week group, N = 25; for 4-week group, N = 22.
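A test-retest coefficient of this kind can be computed as the correlation between scores from the two administrations. The sketch below assumes a Pearson correlation (the text does not specify the coefficient used); the function name and the scores are hypothetical.

```python
import math
from typing import Sequence


def pearson_r(first: Sequence[float], second: Sequence[float]) -> float:
    """Pearson correlation between two administrations of the same subscale."""
    n = len(first)
    mean_1, mean_2 = sum(first) / n, sum(second) / n
    cross = sum((a - mean_1) * (b - mean_2) for a, b in zip(first, second))
    ss_1 = sum((a - mean_1) ** 2 for a in first)
    ss_2 = sum((b - mean_2) ** 2 for b in second)
    return cross / math.sqrt(ss_1 * ss_2)


# Hypothetical subscale means for five participants at two time points
time_1 = [2.0, 3.2, 4.0, 1.8, 3.6]
time_2 = [2.4, 3.0, 4.2, 2.0, 3.2]

retest_r = pearson_r(time_1, time_2)
print(f"test-retest r = {retest_r:.2f}")
```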


This series of studies provided evidence that scores generated by the CPBI scale are reliable, both in terms of internal consistency and test-retest reliability. Both exploratory and confirmatory factor analyses supported the existence of three meaningful subscales on the CPBI. Additionally, several types of validity evidence supported the construct validity of the scale and its subscales. Most important in terms of construct validity, scores on the three subscales successfully predicted the number of critical arguments participants made when viewing idealized images of women in advertisements, suggesting that participants were able to provide a relatively accurate account (via the self-report scale of the CPBI) of how often they have different types of critical thoughts when faced with the media's beauty ideal for women. Additionally, as predicted, participants who had more interest in women's studies and had taken more women's studies courses (and presumably encountered issues surrounding media images of women during these courses) scored higher on all three subscales. Scores on the CPBI scales were not positively correlated with social desirability scores, suggesting that social desirability concerns did not influence responses. Likewise, subscale scores did not correlate with need for cognition, suggesting that the tendency to critique these images is not simply a manifestation of a general willingness to engage in effortful cognition.

The CPBI scale offers researchers an efficient way to collect data regarding how their participants typically respond to media images of women and has the potential to be useful in a number of research contexts. For example, prior to implementing media literacy programs aimed at teaching women how to criticize the beauty ideals presented by these images, researchers can use this scale to gauge the ways in which they already do so. Furthermore, given the mixed results regarding whether and how critical processing influences the impact of exposure to idealized images, the scale can be of use to researchers seeking to examine the impact of increases in this type of critical processing. One possibility is that different types of critical responses to these images have different effects. Some ways of critiquing these images may work as protective factors whereas others may result in increased risk, perhaps as a result of increased cognitive processing. Such a possibility is supported by the data from Study 2. Specifically, the Questioning/Accusing subscale of the CPBI demonstrated a consistent pattern of relationships with body image–related measures, such that those most likely to accuse these images of being harmful to women were actually those with the highest scores on body dissatisfaction and internalization. Thus, those who are most influenced by these images may be the most likely to attack them. In some ways, it is not surprising that those who endorse the idea that these images hurt women may also be the most likely to be hurt by these images themselves. A woman who believes that these images lead her to evaluate her own appearance negatively may generalize her experience to other women.

On the other hand, consistent with earlier research by Engeln-Maddox (2005), believing that idealized media images of beauty are fake and unrealistic had no relationship to appearance-related dissatisfaction or internalization. Knowing that a standard is unrealistic may not necessarily diminish one's desire to emulate that standard. Thus, perhaps these types of arguments do not represent an especially powerful form of critical processing. This finding is notable, given that media literacy programs often focus heavily on the degree to which media images featuring models are fake and unrealistic. Furthermore, endorsing beliefs that the media's beauty ideal is too thin was not significantly associated with appearance-related satisfaction. Too Thin arguments were positively associated with greater internalization of the ideal represented by athletes/sports figures. Perhaps those who long for a strong athletic body are more likely to be critical of the ultra-thin standard that has become ubiquitous among models and actresses.

Although the initial evidence for the viability of this scale is strong, the limitations of this series of studies suggest several areas for additional work. The CPBI was developed and tested with college women participants who were recruited primarily from introductory psychology participant pools. Future work should explore whether the scale is appropriate for use with girls and noncollege adult female populations. The samples of participants in these studies were relatively diverse in terms of ethnicity, but sample sizes of individual ethnic/racial groups were not large enough to explore possible differences in CPBI scores by ethnicity. Given evidence that some minority groups may be especially critical of mainstream beauty ideals (e.g., Milkie, 1999), this is certainly an area for further exploration. Future research may also determine whether this scale could be used with male respondents. Given the recent focus on the impact of the muscular media ideal on men, an adaptation of this scale to reflect such concerns may also be warranted.

An additional limitation is that the strongest piece of validity evidence for this scale (discussed above in Study 3) was based on responses to only a few print advertisements. However, the CPBI scale was designed to assess responses to a wide variety of forms of media. Thus, future research should explore the potential for this scale to predict responses to idealized media images of women found in other contexts.

The results of this series of studies suggest that the CPBI scale can be a valuable tool in helping researchers explicate the relationships between exposure to idealized images of female beauty, critical processing of media-enforced beauty ideals, and body image disturbance. The results of Study 2 in particular suggest a number of avenues for future research. For example, given the findings regarding the Questioning/Accusing subscale, future research should explore whether those who are most critical of these images may also be the most influenced and distressed by them. Given our findings, there is reason to question whether media literacy campaigns can be effective in reducing the impact of these images, especially given concerns that arguing against such images is essentially giving them more air time or allowing for more ruminative processing. The initial evidence presented here suggests that such interventions could exacerbate body image disturbance and increase internalization. This interpretation is offered cautiously, however, because the direction of influence cannot be established with the methodology used in these studies. It may be that avoidance of in-depth processing of these images is more effective in reducing their impact than engaging in critical processing. The CPBI is proposed as a useful tool to address these and other questions with empirical data.


CPBI Scale

  • 1. I wish I had clothes like that.ᵈ
  • 2. Images like that make women want to shop.ᵈ
  • 3. Images like that make women feel like they have to look perfect.ᵇ
  • 4. She should eat more.ᶜ
  • 5. She probably has a fun life.ᵈ
  • 6. Why do models have to be so perfect-looking?ᵇ
  • 7. Her breasts are too big.ᵈ
  • 8. She looks malnourished.ᶜ
  • 9. She's too old.ᵈ
  • 10. I wonder how she gets her hair to look like that.ᵈ
  • 11. Images like that make women feel badly about themselves.ᵇ
  • 12. Nobody looks like that without computer tricks.ᵃ
  • 13. I wonder what her life is like.ᵈ
  • 14. She's airbrushed.ᵃ
  • 15. I like her clothes.ᵈ
  • 16. They probably used computer re-touching to make her look like that.ᵃ
  • 17. She should change her hairstyle.ᵈ
  • 18. She has nice eyes.ᵈ
  • 19. That kind of perfection isn't real.ᵃ
  • 20. It takes a lot of camera tricks to make someone look that good.ᵃ
  • 21. I wonder how much money she makes.ᵈ
  • 22. She's too skinny to be healthy.ᶜ

ᵃ Fake Subscale; ᵇ Questioning/Accusing Subscale; ᶜ Too Thin Subscale; ᵈ Distracter.
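Scoring can be sketched directly from the item key above, with each subscale score taken as the mean of its items and distracter items ignored. The function and variable names are hypothetical, and the 1-to-5 response range is an assumption based on the possible subscale score range reported in Table 3.

```python
from typing import Dict

# Item numbers per subscale, taken from the appendix key; all other
# items (1, 2, 5, 7, 9, 10, 13, 15, 17, 18, 21) are distracters.
SUBSCALE_ITEMS = {
    "fake": (12, 14, 16, 19, 20),
    "questioning_accusing": (3, 6, 11),
    "too_thin": (4, 8, 22),
}


def score_cpbi(responses: Dict[int, int]) -> Dict[str, float]:
    """Return the mean item response for each CPBI subscale.

    `responses` maps item number (1-22) to a rating.
    Assumption: ratings are on a 1-5 frequency scale.
    """
    scores = {}
    for subscale, items in SUBSCALE_ITEMS.items():
        values = [responses[item] for item in items]
        if any(not 1 <= value <= 5 for value in values):
            raise ValueError(f"Ratings for {subscale} items must be between 1 and 5")
        scores[subscale] = sum(values) / len(values)
    return scores


# Hypothetical respondent: lowest rating everywhere except the Too Thin items
example_responses = {item: 1 for item in range(1, 23)}
example_responses.update({4: 5, 8: 5, 22: 5})
example_scores = score_cpbi(example_responses)
print(example_scores)
```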