Development of a Brief Measure to Assess Quality of Life in Obesity

Authors


1004 Norwood Avenue, Durham, NC 27707. E-mail: kolot001@mc.duke.edu

Abstract

Objective: Obesity researchers have a growing interest in measuring the impact of weight and weight reduction on quality of life. The Impact of Weight on Quality of Life questionnaire (IWQOL) was the first self-report instrument specifically developed to assess the effect of obesity on quality of life. Although the IWQOL has demonstrated excellent psychometric properties, its length (74 items) makes it somewhat cumbersome as an outcome measure in clinical research. This report describes the development of a 31-item version of the IWQOL (IWQOL-Lite).

Research Methods and Procedures: IWQOLs from 996 obese patients and controls were used to develop the IWQOL-Lite. Psychometric properties of the IWQOL-Lite were examined in a separate cross-validation sample of 991 patients and controls.

Results: Confirmatory factor analysis provided strong support for the adequacy of the scale structure. The five identified scales of the IWQOL-Lite (Physical Function, Self-Esteem, Sexual Life, Public Distress, and Work) and the total IWQOL-Lite score demonstrated excellent psychometric properties. The reliability of the IWQOL-Lite scales ranged from 0.90 to 0.94 and was 0.96 for the total score. Correlations between the IWQOL-Lite and collateral measures supported the construct validity of the IWQOL-Lite. Changes in IWQOL-Lite scales over time correlated significantly with changes in weight, supporting its sensitivity to change. Significant differences in IWQOL-Lite scale and total scores were found among groups differing in body mass index, supporting the utility of the IWQOL-Lite across the body mass index spectrum.

Discussion: The IWQOL-Lite appears to be a psychometrically sound and clinically sensitive brief measure of quality of life in obese persons.

Introduction

In 1947, the World Health Organization defined health as “not merely the absence of disease, but complete physical function, social function, role function, mental health, and general health perceptions” (1). In recent years, the terms quality of life and, more specifically, health-related quality of life (HRQOL) have been used to refer to the “physical, psychological, and social domains of health, seen as distinct areas that are influenced by a person's experiences, beliefs, expectation, and perceptions” (2). With increasing frequency, clinical researchers are choosing measures of HRQOL as primary and secondary outcomes in clinical trials (3). As new anti-obesity drugs are developed, there is an increasing need to measure HRQOL in obese persons participating in clinical trials for the treatment of obesity.

It is well known that obesity may impact important aspects of HRQOL, such as physical health, emotional well-being, and psychosocial functioning (4) (5) (6) (7). HRQOL has been found to vary directly with the severity of obesity among individuals seeking weight-loss treatment, with the most obese individuals having the poorest quality of life (8) and obese persons reporting pain showing the greatest impairments (9). Obese persons seeking treatment have been shown to be significantly more physically impaired than those who are not trying to lose weight (10). Furthermore, increases in weight have been associated with deteriorated physical, but not emotional, well-being (11) (12), and weight loss has been associated with improved physical rather than mental components of HRQOL (12). HRQOL has been shown to improve after treatment (13) (14) (15).

In each of the above studies, the researchers measured HRQOL using the SF-36, a generic quality-of-life measure that assesses physical functioning, role limitations due to physical health, bodily pain, general health, vitality, social functioning, role limitations due to emotional problems, and mental health (16). Whereas generic instruments allow for comparisons of quality of life across various disease states and provide insight into improvements in general health, disease-specific instruments are designed to focus on the domains, characteristics, and complaints most relevant to a particular disease. “Disease-specific measures are clinically sensible in that patients and clinicians intuitively find the items directly relevant; their increased potential for responsiveness is particularly compelling in the clinical trial” (3). Disease-specific instruments are usually more sensitive to changes in quality of life that result from treatment (17). In clinical trials in which specific therapeutic interventions for specific diseases are being evaluated, it is generally recommended that disease-specific instruments be used in addition to generic measures (2) (3). The Impact of Weight on Quality of Life questionnaire (IWQOL) is one such disease-specific instrument (18) (19).

The IWQOL was the first instrument specifically developed to assess quality of life in obesity (18) (19). The IWQOL was developed in a clinical setting for moderate to severe obesity, and measures those aspects of quality of life that were identified by obese persons in treatment to be of greatest concern. Eight areas of functioning are assessed by the IWQOL: health, social/interpersonal, work, mobility, self-esteem, sexual life, activities of daily living, and comfort with food.

Since the initial development of the IWQOL, several other quality-of-life instruments for obesity have been in development. Mathias et al. (20) developed a 55-item HRQOL measure containing global domains (general health, comparative health), obesity-specific domains (overweight distress, depression, self-regard, and physical appearance), and an obesity-specific health state preference. Sullivan et al. (21) developed an HRQOL measure (assessing general health perceptions, mental well-being, mood disorders, social interaction, and obesity-specific symptoms) that was derived largely from existing instruments (General Health Rating Index, Mood Adjective checklist, Hospital Anxiety and Depression scale, Sickness Impact profile, and an eight-item obesity-specific module developed from a cancer survivors’ study questionnaire) and is being used in the Swedish Obese Subjects intervention trial on morbidly obese participants. An 11-item instrument that assesses physical state, vitality/desire to do things, relations with other people, and psychological state (Obesity-Specific Quality of Life questionnaire) was developed in France using a large community-based sample (22). Another instrument, the Obesity-Related Well-Being scale (ORWELL 97), was developed to examine the occurrence of physical and psychosocial symptoms as well as the subjective relevance of these symptoms in obese patients (23). The ORWELL 97 consists of 18 items and 2 factors: Factor 1 measures psychological status and social adjustment; Factor 2 measures physical symptoms and impairment. The Obesity Adjustment Survey (24) measures the psychological distress of individuals who are morbidly obese.

The IWQOL has been shown to be a reliable and valid instrument for measuring posttreatment changes in HRQOL (18) (19). However, researchers designing clinical trials have commented unfavorably on the length of the IWQOL (74 items), citing the potential for response burden to research subjects. As new anti-obesity compounds are developed, there is an increasing need for brief, psychometrically sound measures of HRQOL in obese persons undergoing clinical trials for the treatment of obesity.

This report describes the development and psychometric properties of a brief version of the IWQOL (IWQOL-Lite) that is more convenient for use as an outcome measure in obesity research and even more psychometrically sound and clinically sensitive than the original IWQOL.

Research Methods and Procedures

Subjects

Baseline data from the original 74-item IWQOL were accumulated from 1987 individuals who had previously taken the IWQOL in a variety of settings. These individuals included 211 obese subjects from an open-label study of phentermine-fenfluramine, 834 subjects who participated in an intensive, day treatment program for weight loss and lifestyle change, 668 subjects who were participants in outpatient weight reduction studies or weight control programs, 51 subjects who were undergoing gastric bypass surgery for obesity, and 223 community subjects including employees and friends of local businesses. One-year data were also available for 160 subjects from the open-label study of phentermine-fenfluramine who completed at least 1 year in the treatment program. Information on gender, age, and body mass index (BMI) is presented in Table 1. Age and BMI varied by group: mean age ranged from 37 years in the community sample to 52 years in the day treatment program, and mean BMI ranged from 27 in the community sample to 51 in the gastric bypass sample. Information on race was available on 1538 subjects (77% of the sample), with the following racial breakdown: 71% white, 16% Hispanic, 10% African American, 1.4% Asian, 1.6% Other.

Table 1.  Study demographics

| Study | Group | N | Gender | Age (mean ± SD) | BMI (mean ± SD) |
|---|---|---|---|---|---|
| Open-label phen-fen | Obese | 211 | 32 males | 47.8 ± 10.3 | 42.0 ± 8.7 |
| | | | 179 females | 44.4 ± 9.3 | 40.7 ± 7.1 |
| Day treatment | Obese | 834 | 322 males | 52.1 ± 14.3 | 41.7 ± 11.1 |
| | | | 512 females | 50.2 ± 17.6 | 37.0 ± 9.1 |
| Weight-reduction studies/programs | Obese | 668 | 91 males | 47.4 ± 9.4 | 35.5 ± 6.2 |
| | | | 577 females | 43.9 ± 11.3 | 35.0 ± 6.1 |
| Gastric bypass | Obese | 51 | 11 males | 46.6 ± 4.5 | 50.6 ± 10.0 |
| | | | 40 females | 38.5 ± 9.7 | 51.0 ± 12.9 |
| Employees/friends | Community | 223 | 159 males | 37.2 ± 11.4 | 27.7 ± 3.6 |
| | | | 64 females | 39.0 ± 14.1 | 26.5 ± 10.1 |
| Total | | 1987 | 615 males | 47.3 ± 14.1 | 37.2 ± 10.8 |
| | | | 1372 females | 45.9 ± 14.3 | 36.6 ± 9.4 |

Collateral Measures

BMI data were available for 1907 (96%) of the subjects at baseline. In addition, data from the Rosenberg Self-Esteem scale (25), the Beck Depression Inventory (BDI) (26), and the SCL-90-R (27) were available at baseline and after 1 year for 160 subjects in the open-label study of phentermine-fenfluramine.

Procedures and Statistical Analyses

Item Selection and Scale Construction.

The sample of 1987 individuals was divided into two groups through random assignment: the development sample (N = 996) and the cross-validation sample (N = 991). The development sample and the cross-validation sample were not statistically different in terms of age, BMI, gender, or race (p > 0.25). The development sample provided data on which decisions about item selection and scale construction were based. Selection of items and construction of scales in the development sample was accomplished by compiling data on each of the 74 original IWQOL items. Data included inter-item correlations, frequency distributions, α coefficients for scales and total score, item-to-scale correlations for scales and total score, baseline correlations with collateral measures, exploratory factor analysis, and 1-year change correlations between items and collateral measures. Items were selected if they were adequately distributed across item responses (“never true” to “always true”), maximized α coefficients, maximized item-to-scale correlations, and correlated significantly with relevant collateral measures, both cross-sectionally and longitudinally. Initially, we deleted items from the original 74 if they correlated poorly with the scale score, correlated poorly with the total score, or correlated poorly with BMI. Changes in items at the 1-year follow-up were then correlated with changes in BMI at the same follow-up. Additional items were deleted if they did not correlate well with changes in BMI. Exploratory factor analyses were then performed on the reduced set of items. Factor loadings were examined, and additional items were deleted if they did not load adequately on the derived factors. This process was repeated until an acceptable and interpretable factor structure was obtained. Assignment of items to scales was therefore based both on exploratory factor analysis and on the original scale composition. 
All decisions about item selection were finalized before analyzing any data in the cross-validation sample.
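The random assignment into development and cross-validation samples can be illustrated with a minimal sketch. The function name `split_sample`, the use of integer subject identifiers, and the fixed seed are assumptions made for illustration; the source does not describe how the randomization was implemented.

```python
import random


def split_sample(subject_ids, n_development, seed=0):
    """Randomly assign subjects to a development and a cross-validation sample.

    The seed is an illustrative assumption so that the split is reproducible.
    """
    ids = list(subject_ids)
    rng = random.Random(seed)
    rng.shuffle(ids)
    # The first n_development shuffled subjects form the development sample;
    # the remainder form the cross-validation sample.
    return ids[:n_development], ids[n_development:]


# Hypothetical subject identifiers 0..1986 standing in for the 1987 individuals.
development, cross_validation = split_sample(range(1987), n_development=996)
print(len(development), len(cross_validation))  # 996 991
```

Keeping the two samples disjoint is what allows the cross-validation sample to serve as an independent check on decisions made in the development sample.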

Psychometric Evaluation.

Psychometric evaluation was based entirely on the cross-validation sample. Using the cross-validation sample, a two-way ANOVA (gender by BMI classification: <25, 25 to 29.9, 30 to 34.9, 35 to 39.9, and >40) was performed for each of the newly developed scales and the total score. An α level of 0.01 was used to control for multiple comparisons. Post hoc tests for comparisons between BMI groups were performed using Tukey's honestly significant difference procedure (28) based on an α level of 0.01.
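The BMI classification used as an ANOVA factor can be sketched as a simple lookup. The function name `bmi_group` is hypothetical, and placing a BMI of exactly 40 in the top group is an assumption, because the source writes the highest category as ">40".

```python
from bisect import bisect_right

# Lower bounds of the second through fifth BMI groups.
BMI_CUTOFFS = [25.0, 30.0, 35.0, 40.0]
BMI_LABELS = ["<25", "25 to 29.9", "30 to 34.9", "35 to 39.9", ">40"]


def bmi_group(bmi):
    """Return the BMI classification label for a BMI value.

    Assumption: a BMI of exactly 40 is assigned to the top group.
    """
    return BMI_LABELS[bisect_right(BMI_CUTOFFS, bmi)]


print(bmi_group(27.7))  # 25 to 29.9
print(bmi_group(51.0))  # >40
```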

A series of confirmatory factor analyses was performed on the cross-validation sample to evaluate the hypothesized scale structure using EQS software (Multivariate Software Inc., Encino, CA) (29). Details of confirmatory factor analyses are presented in Appendix 1.

The sensitivity of the IWQOL-Lite was evaluated on the cross-validation sample using two different methods. First, effect sizes were calculated between adjacent BMI groups (<25 vs. 25 to 29.9, 25 to 29.9 vs. 30 to 34.9, 30 to 34.9 vs. 35 to 39.9, etc.) and between extreme BMI groups (<25 vs. >40). Effect sizes were calculated as the difference between group means (after adjusting for age and gender) divided by the SD for the entire sample. For example, the effect size comparing <25 with >40 would be calculated as the adjusted mean for <25 minus the adjusted mean for >40, divided by the SD for the entire group. Next, effect sizes were calculated in the 1-year longitudinal sample (N = 160) for three groups: those subjects losing <10% of baseline BMI, those subjects losing 10% to 20% of their baseline BMI, and those subjects losing >20% of their baseline BMI. Effect sizes were calculated as the difference between IWQOL-Lite scores at baseline and at 1 year (after adjusting for age and gender) divided by the SD at baseline (30). Effect sizes were calculated rather than Guyatt's responsiveness statistic because there were not sufficient cases with stable weight over the 1-year period (97% lost at least 10 pounds and 89% lost at least 20 pounds, leaving only five subjects who lost <10 pounds).
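Both effect-size definitions above reduce to a difference between two means divided by a reference SD. The sketch below shows that arithmetic only. The function name and the input numbers are hypothetical, and the adjustment for age and gender is assumed to have been done before the means are passed in.

```python
def effect_size(mean_a, mean_b, reference_sd):
    """Standardized difference: (mean_a - mean_b) / reference_sd.

    For group comparisons, reference_sd is the SD of the entire sample;
    for 1-year change, it is the SD at baseline. The means are assumed
    to be already adjusted for age and gender.
    """
    if reference_sd <= 0:
        raise ValueError("reference_sd must be positive")
    return (mean_a - mean_b) / reference_sd


# Hypothetical values: baseline mean 90, 1-year mean 80, baseline SD 20.
print(effect_size(90.0, 80.0, 20.0))  # 0.5
```

Because higher IWQOL-Lite scores indicate poorer quality of life, a positive baseline-minus-follow-up value in this sketch corresponds to improvement.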

Results

Analysis of the development sample led to the specification of a 31-item instrument (IWQOL-Lite) consisting of five scales: Physical Function (11 items), Self-Esteem (7 items), Sexual Life (4 items), Public Distress (5 items), and Work (4 items). The correlation between the new 31-item instrument (IWQOL-Lite) and the longer 74-item instrument (IWQOL) was 0.97. A description of the IWQOL-Lite items and scales follows.

As in the original IWQOL, all items are rated by the research subject as “always true,” “usually true,” “sometimes true,” “rarely true,” or “never true;” “always true” responses were given a score of 5, “never true” responses were given a score of 1. Scale scores are obtained by adding item scores, and the total score is obtained by adding scale scores. Higher scores indicate poorer quality of life. All items except for four begin with the phrase “because of my weight.” The 11-item Physical Function scale is concerned with mobility and day-to-day physical functioning (e.g., “Because of my weight, I have difficulty getting up from chairs.”). Seven of the 11 items on the Physical Function scale were originally on the Mobility scale of the longer IWQOL, and 4 were from the Health scale. The seven-item Self-Esteem scale assesses self-esteem concerns related to weight (e.g., “Because of my weight, I don't like myself.”). Five of the seven items were on the original Self-Esteem scale of the IWQOL, and two were from the Social/Interpersonal scale. The four-item Sexual Life scale assesses sexual limitations related to obesity (e.g., “Because of my weight, I have little or no sexual desire.”). All of the items on the IWQOL-Lite Sexual Life scale were on the original IWQOL Sexual Life scale. Three of the five items on the Public Distress scale pertain to fitting in public places (e.g., “Because of my weight, I worry about finding chairs that are strong enough to hold my weight.”) and were on the Activities of Daily Living scale of the IWQOL. Two of the items on the Public Distress scale pertain to negative reactions from others (e.g., “Because of my weight, I experience ridicule, teasing, or unwanted attention.”) and were on the Social/Interpersonal scale of the IWQOL. The four items on the Work scale are concerned with work performance as it relates to weight (e.g., “Because of my weight, I have trouble getting things accomplished or meeting my responsibilities.”). 
All of the items on the Work scale were on the Work scale of the 74-item IWQOL.
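The scoring rules described above can be sketched as follows. The scale names and item counts come from the text; the function name, the input layout, and the intermediate response values (4, 3, 2) are assumptions, since the source states only the scores for "always true" (5) and "never true" (1).

```python
# Scores for the two anchor responses are given in the text; the values
# for the three intermediate responses are assumed to be evenly spaced.
RESPONSE_SCORES = {
    "always true": 5,
    "usually true": 4,
    "sometimes true": 3,
    "rarely true": 2,
    "never true": 1,
}

# Number of items per scale, as reported for the 31-item instrument.
SCALE_SIZES = {
    "Physical Function": 11,
    "Self-Esteem": 7,
    "Sexual Life": 4,
    "Public Distress": 5,
    "Work": 4,
}


def score_iwqol_lite(responses):
    """Return (scale_scores, total) from a dict of scale name -> response labels."""
    scale_scores = {}
    for scale, n_items in SCALE_SIZES.items():
        answers = responses[scale]
        if len(answers) != n_items:
            raise ValueError(f"{scale} expects {n_items} items, got {len(answers)}")
        # A scale score is the sum of its item scores.
        scale_scores[scale] = sum(RESPONSE_SCORES[a] for a in answers)
    # The total score is the sum of the scale scores.
    return scale_scores, sum(scale_scores.values())


# Hypothetical respondent answering "never true" to every item.
best_case = {scale: ["never true"] * n for scale, n in SCALE_SIZES.items()}
print(score_iwqol_lite(best_case)[1])  # 31
```

Under these assumptions the total ranges from 31 (every item "never true") to 155 (every item "always true"), with higher totals indicating poorer quality of life.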

All results reported below are based on the cross-validation sample.

Reliability and Separation of Scales

Reliability coefficients (Cronbach α coefficients) for the individual scales were as follows: Physical Function, 0.94; Self-Esteem, 0.93; Sexual Life, 0.91; Public Distress, 0.90; and Work, 0.90, with the overall α coefficient equaling 0.96.
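For readers unfamiliar with the statistic, Cronbach's α for a scale of k items is k/(k − 1) multiplied by one minus the ratio of the summed item variances to the variance of the scale total. A minimal sketch follows, assuming complete data with one list of item scores per respondent; the function name and the example data are hypothetical and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for rows of item scores (one row per respondent).

    Assumes complete data, at least two items, and a non-zero variance
    of the scale total.
    """
    k = len(item_scores[0])
    columns = list(zip(*item_scores))
    item_variance_sum = sum(variance(col) for col in columns)
    total_variance = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical responses from three people on a two-item scale.
example = [[1, 2], [2, 1], [3, 3]]
print(round(cronbach_alpha(example), 2))  # 0.67
```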

Table 2 presents correlations between each item and its designated scale in bold type (corrected for the influence of that item), and the correlations between each item and the other scales in normal type. Correlations between each item and its designated scale were all significant at p < 0.001. For each item, the item-to-scale correlation was higher for the designated scale (indicated in bold) than for any other scale. In addition, for all but one item (“Worried about health”), that correlation was higher than the corrected item-to-total score correlation. Finally, for all items, the item-to-total score correlation is higher than all item-to-scale correlations except the designated scale. This pattern of results suggests excellent reliability and distinct separation of scales.

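Table 2.  Item-to-scale correlations

| Item | Physical Function | Self-Esteem | Sexual Life | Public Distress | Work | Total |
|---|---|---|---|---|---|---|
| Physical Function (α = 0.94) | | | | | | |
| Picking up objects | **0.84** | 0.43 | 0.50 | 0.61 | 0.54 | **0.75** |
| Tying shoes | **0.84** | 0.44 | 0.49 | 0.61 | 0.51 | **0.74** |
| Getting up from chairs | **0.84** | 0.44 | 0.47 | 0.64 | 0.55 | **0.75** |
| Using stairs | **0.84** | 0.46 | 0.48 | 0.63 | 0.53 | **0.76** |
| Dressing | **0.80** | 0.45 | 0.48 | 0.56 | 0.57 | **0.73** |
| Mobility | **0.84** | 0.48 | 0.49 | 0.62 | 0.61 | **0.77** |
| Crossing legs | **0.76** | 0.47 | 0.46 | 0.63 | 0.44 | **0.71** |
| Feel short of breath | **0.66** | 0.45 | 0.42 | 0.48 | 0.47 | **0.64** |
| Painful stiff joints | **0.61** | 0.32 | 0.39 | 0.38 | 0.43 | **0.55** |
| Swollen ankles/legs | **0.61** | 0.31 | 0.36 | 0.43 | 0.39 | **0.54** |
| Worried about health | **0.44** | 0.40 | 0.32 | 0.38 | 0.37 | **0.48** |
| Self-Esteem (α = 0.93) | | | | | | |
| Self-conscious | 0.47 | **0.84** | 0.53 | 0.51 | 0.48 | **0.69** |
| Self-esteem not what it could be | 0.47 | **0.86** | 0.54 | 0.49 | 0.51 | **0.69** |
| Unsure of self | 0.48 | **0.84** | 0.52 | 0.54 | 0.55 | **0.71** |
| Do not like myself | 0.41 | **0.78** | 0.54 | 0.44 | 0.47 | **0.63** |
| Afraid of rejection | 0.41 | **0.74** | 0.48 | 0.56 | 0.49 | **0.63** |
| Avoid looking in mirrors | 0.44 | **0.73** | 0.50 | 0.52 | 0.42 | **0.63** |
| Embarrassed in public | 0.50 | **0.73** | 0.54 | 0.56 | 0.51 | **0.69** |
| Sexual Life (α = 0.91) | | | | | | |
| Do not enjoy sexual activity | 0.41 | 0.48 | **0.80** | 0.33 | 0.41 | **0.55** |
| Little sexual desire | 0.51 | 0.56 | **0.79** | 0.37 | 0.48 | **0.63** |
| Difficulty with sexual performance | 0.57 | 0.54 | **0.77** | 0.49 | 0.49 | **0.68** |
| Avoid sexual encounters | 0.51 | 0.60 | **0.83** | 0.45 | 0.49 | **0.67** |
| Public Distress (α = 0.90) | | | | | | |
| Experience ridicule | 0.42 | 0.55 | 0.35 | **0.62** | 0.41 | **0.57** |
| Fitting in public seats | 0.66 | 0.48 | 0.41 | **0.83** | 0.47 | **0.70** |
| Fitting through aisles | 0.67 | 0.52 | 0.43 | **0.87** | 0.48 | **0.73** |
| Worry about finding chairs | 0.65 | 0.47 | 0.38 | **0.83** | 0.47 | **0.69** |
| Experience discrimination | 0.48 | 0.61 | 0.37 | **0.64** | 0.44 | **0.62** |
| Work (α = 0.90) | | | | | | |
| Trouble accomplishing things | 0.58 | 0.53 | 0.46 | 0.49 | **0.78** | **0.67** |
| Less productive than could be | 0.55 | 0.54 | 0.50 | 0.45 | **0.79** | **0.66** |
| Do not receive recognition | 0.57 | 0.47 | 0.49 | 0.49 | **0.85** | **0.66** |
| Afraid to go on interviews | 0.50 | 0.49 | 0.41 | 0.47 | **0.72** | **0.61** |

Correlations between each item and its designated scale are in bold type. Also in bold type are correlations between each item and total score. All correlations are corrected for overlap.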
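A correlation "corrected for overlap" is commonly computed by correlating an item with the sum of the remaining items, so that the item does not contribute to both sides of the correlation. The sketch below assumes that convention; the source does not state the exact computation, and the function names and data layout are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def corrected_item_scale_r(item_scores, item_index):
    """Correlate one item with the sum of the other items on the same scale.

    item_scores holds one row of item scores per respondent.
    """
    item = [row[item_index] for row in item_scores]
    # Remove the item's own contribution from the scale total.
    rest = [sum(row) - row[item_index] for row in item_scores]
    return pearson(item, rest)
```

The same idea applies to the corrected item-to-total correlation, with the row sum taken over all 31 items instead of the items of a single scale.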

Interscale Correlations

Interscale correlations ranged from 0.46 (Sexual Life and Public Distress) to 0.70 (Physical Function and Public Distress). Uncorrected correlations with total scale score ranged from 0.74 (Sexual Life) to 0.89 (Physical Function), whereas corrected correlations ranged from 0.66 (Sexual Life) to 0.73 (Public Distress). The equivalence of correlations between individual scales and total score corrected for overlap suggests that scales contribute comparably to the total score.

IWQOL-Lite Scores by Gender and BMI

The results of the two-way ANOVAs are shown in Figure 1. (Means and SDs for each group are available from the authors on request.) The main effect for BMI group was significant for all scales and for the total score (p < 0.001), with higher scores associated with increasing BMI. Post hoc tests revealed that all groups were significantly different (p < 0.01) from each other with the exception of the following: <25 vs. 25 to 29.9 for all five scales and the total score; 25 to 29.9 vs. 30 to 34.9 for Work; and 30 to 34.9 vs. 35 to 39.9 for Self-Esteem, Sexual Life, and Work. In addition, the main effect for gender was significant for Sexual Life, with females reporting more problems than males. Finally, both the main effect for gender and the BMI group-by-gender interaction were significant for Self-Esteem and total score. In both cases, females overall reported more problems, with the greater disparity between genders occurring for the lower BMI groups.

Figure 1.  IWQOL-Lite scores by BMI and gender

Baseline Correlations with BMI

All five scales and the total IWQOL-Lite score correlated significantly (p < 0.001) with BMI at baseline. Individual correlations were: Physical Function, 0.61; Self-Esteem, 0.34; Sexual Life, 0.30; Public Distress, 0.68; Work, 0.35; and Total Score, 0.59.

One-Year Change Correlations

We examined correlations between changes in IWQOL-Lite scores and changes in collateral measures (BMI, Rosenberg Self-Esteem scale, BDI, SCL-90-R Global Severity Index [GSI]) for the 160 patients from the open-label phentermine-fenfluramine study who had data available both at baseline and 1 year. At this point in treatment, patients had lost on average ∼18% of their body weight. Changes in all five scales and the total IWQOL-Lite were significantly (p < 0.01) correlated with changes in BMI, with Physical Function and total score correlating most highly. Four of the five scales (all except Public Distress) and the total score correlated with changes in the Rosenberg Self-Esteem scale, and, as would be expected, the correlation was highest for Self-Esteem. All five scales and the total score correlated significantly with changes in BDI and GSI, with the correlation highest in both cases with the total score.

Confirmatory Factor Analysis

Results from confirmatory factor analysis provided strong support for both the proposed scale structure of the IWQOL-Lite and for the existence of a higher order factor, presumably HRQOL. Appendix 1 provides details of the confirmatory factor analyses.

Sensitivity

Cohen (31) describes an effect size of 0.20 as small, 0.50 as medium, and 0.80 as large. The average effect size across adjacent groups (e.g., <25 vs. 25 to 29.9, 25 to 29.9 vs. 30 to 34.9, etc.) was 0.44 for the total score, 0.44 for Physical Function, 0.34 for Self-Esteem, 0.28 for Sexual Life, 0.44 for Public Distress, and 0.25 for Work. The effect size contrasting the extreme groups (i.e., <25 vs. >40) was 1.76 for the total score, 1.70 for Physical Function, 1.34 for Self-Esteem, 1.07 for Sexual Life, 1.76 for Public Distress, and 0.97 for Work. These results demonstrate that groups differing in BMI also differed in IWQOL-Lite scores.

We hypothesized that those groups experiencing greater BMI loss would exhibit greater changes (i.e., larger effect sizes) in IWQOL-Lite measures. These results are presented in Table 3. The results again clearly support the sensitivity of the IWQOL-Lite. With the exception of the Work scale, even modest losses in BMI (<10%) were associated with decreases in IWQOL-Lite scores ranging from 0.20 to 0.50 SD.

Table 3.  One-year change effect sizes in IWQOL-Lite scores by percent BMI loss group

| Scale | <10% BMI loss (n = 25) | 10 to 20% BMI loss (n = 77) | >20% BMI loss (n = 58) | Overall (n = 160) |
|---|---|---|---|---|
| Physical Function | 0.50 | 0.62 | 1.20 | 0.81 |
| Self-Esteem | 0.43 | 0.65 | 0.95 | 0.73 |
| Sexual Life | 0.20 | 0.36 | 0.73 | 0.46 |
| Public Distress | 0.28 | 0.47 | 0.62 | 0.50 |
| Work | 0.09 | 0.19 | 0.44 | 0.26 |
| Total Score | 0.46 | 0.65 | 1.12 | 0.79 |

Discussion

This report describes the development and validation of a brief version of the IWQOL (IWQOL-Lite) that consists of 31 items and five scales: Physical Function (11 items), Self-Esteem (7 items), Sexual Life (4 items), Public Distress (5 items), and Work (4 items). The IWQOL-Lite was created partly in response to feedback from clinical researchers who felt that a 74-item instrument created a response burden for their subjects in clinical trials. In addition, the original IWQOL had been developed and tested on a homogeneous sample of patients from one treatment center who had similar racial and socioeconomic characteristics; consequently, results from this original questionnaire may not have generalized well to other samples. In contrast to the development of the original IWQOL, the development of the new IWQOL-Lite used multiple data sources, including subjects from a controlled, nonpharmaceutical trial, subjects from an open-label pharmaceutical trial, participants at a day treatment program, subjects in outpatient weight-reduction treatment, patients undergoing gastric bypass surgery, and individuals from the community. The sample was racially diverse and contained a large number of males as well as females. One notable strength of this study is the use of separate samples to develop and validate the IWQOL-Lite. The use of separate samples minimizes the possibility of capitalizing on chance relationships among the variables that may occur in a single sample. The strong psychometric properties of the IWQOL-Lite found in the cross-validation sample provide strong evidence that this new instrument performs consistently across samples.

Another strength of the present study is the use of confirmatory factor analysis. Unlike exploratory factor analysis, which calculates factor loadings based on the actual data, confirmatory factor analysis compares the actual data to an a priori model of how the data should look. Thus, confirmatory factor analysis is hypothesis driven. The results of the confirmatory factor analyses provide strong support for the proposed scale structure of the IWQOL-Lite and the existence of a higher order factor, presumably HRQOL.

The five identified scales of the IWQOL-Lite and the total IWQOL-Lite score demonstrated excellent psychometric properties. Internal consistency reliabilities ranged from 0.90 to 0.94 for the five scales and equaled 0.96 for the total score. Furthermore, correlations between the IWQOL-Lite and collateral measures were found to support the validity of this new, brief instrument. To begin with, the IWQOL-Lite total score had a strong and fairly linear relationship with BMI at baseline. This finding is consistent with previous reports that the impact of obesity on HRQOL varies directly with severity of obesity (8). Changes in the IWQOL-Lite total score at the 1-year follow-up were also related strongly to changes in BMI at this same 1-year follow-up, consistent with other studies that reported improved HRQOL associated with weight reduction (13) (14) (15). The concordance between the total IWQOL-Lite change score and other measures of similar constructs provides solid evidence for the construct validity of the IWQOL-Lite total score. The particularly strong correlation between Physical Function and BMI at follow-up (0.45) is also consistent with the findings of other researchers who observed that BMI changes were associated more with the physical dimensions of HRQOL than with the psychological dimensions (11) (12). As expected, scores on the Self-Esteem scale of the IWQOL-Lite at the 1-year follow-up were associated strongly with scores on the Rosenberg Self-Esteem scale, providing support for the construct validity of this scale. BDI scores and GSI scores also changed in the expected direction after subjects lost ∼18% of their weight (i.e., indicating improved quality of life).

The IWQOL-Lite demonstrated the ability to differentiate between adjacent overweight and obese groups for all scales and the total IWQOL-Lite score. The average effect size of the IWQOL-Lite total score between adjacent BMI groups was nearly a half SD (0.44). This delineation was even more pronounced for the extreme BMI groups (effect size of 1.76 for the total score). In the longitudinal analysis, the IWQOL-Lite exhibited larger effect sizes for those groups experiencing greater decreases in BMI at 1 year. For the Physical Function scale and the IWQOL-Lite total score, even modest losses in BMI (<10%) were associated with medium effect sizes (about a one-half SD decrease). With the exception of the Work scale, the other scales demonstrated small to moderate effect sizes in the group of subjects who lost <10% of their weight. These results clearly support the sensitivity of the IWQOL-Lite instrument.

When exploring gender differences, we found that women experienced the effects of their weight more profoundly than did men in the areas of Sexual Life, Self-Esteem, and total score. In each case, females overall reported more problems than males, with the greatest disparity between genders occurring in the lower BMI groups (BMI between 25 and 35). This finding is consistent with what is generally known about gender differences and body image (32).

Although this study successfully developed a new, brief measure of obesity-related quality of life and subsequently validated this instrument, there are some limitations in this study. One limitation is that subjects completed the original, 74-item IWQOL, rather than the newer 31-item version. This study, therefore, assumes that subjects would have responded similarly to items if they had taken the 31-item questionnaire. Additional psychometric data on subjects completing the 31-item IWQOL-Lite are currently being collected. Another limitation is that no statistically significant differences were observed on the IWQOL-Lite between the group with BMIs of <25 and the group with BMIs of 25 to 29.9. However, there were consistent trends suggesting lower quality of life in the higher BMI group. The failure to detect significant differences between these groups could have been due to smaller differences between BMI groups at lower BMI levels or reduced power due to small sample sizes for the group with BMIs of <25. A further limitation is that we did not obtain test–retest data on the IWQOL-Lite. Finally, the IWQOL-Lite does not specifically ask individuals to assess the relevance of each item to themselves. Rather, the original IWQOL items were developed in a clinical setting based on complaints from moderately to severely obese patients. Thus, the issue of importance of the items has been addressed for moderately to severely obese persons as a group but not on an individual basis. Most measures of symptoms (e.g., SCL-90, BDI) do not assess the importance of those specific symptoms to the individual. Therefore, this approach is by no means unique to HRQOL. In addition, the researchers who developed the ORWELL 97 did not find any relation between relevance and BMI (23).

On the basis of the above findings, we believe that the IWQOL-Lite is a convenient, clinically sensitive, and valid instrument with strong psychometric properties. We recommend the use of the IWQOL-Lite instead of the 74-item IWQOL for the following reasons: the IWQOL-Lite was developed on a heterogeneous sample of subjects, separate development and cross-validation samples were used, the scales were based on hypothesis-driven confirmatory factor analyses, and psychometrically unsound items and scales have been omitted. Although other obesity-specific HRQOL instruments have been developed since the creation of the original IWQOL, the IWQOL-Lite is the only instrument to be cross-validated in a separate sample and to have the scale structure verified with confirmatory factor analysis. In our opinion, the IWQOL-Lite would make a useful addition to the assessment protocol of clinical trials for obesity.

Future research is being planned concerning the interpretation and application of the IWQOL-Lite in different samples. Our goals are to establish separate norms for men and women and for adolescents, to determine the psychometric properties of the IWQOL-Lite in other languages and cultures, to determine clinically meaningful changes in IWQOL-Lite scores, and to collect test–retest data.

Acknowledgments

This study was supported by a research grant from Knoll Pharmaceutical Company. We thank the following individuals for their assistance: Kathleen Meter and Colleen Mc Kendrick, research assistants, for their role in data compilation; Stan Heshka, Jim Hill, George Cowan, Cynthia Buffington, Gil Hartley, and Duncan Adams for sharing their data; and Jim Mitchell for his suggestion of the name IWQOL-Lite.

Appendix 1

A series of confirmatory factor analyses was performed on the cross-validation sample to evaluate the hypothesized scale structure using EQS software (29). Three models were compared. The first model (Model 1) was a single-factor model in which all 31 items were considered to be indicators of a single global factor. The second model (Model 2) was a correlated-factors model in which items were assigned to one of the five hypothesized scales. This model specified no a priori relationship among scales, but rather allowed the correlation among scales to vary freely according to the data. The final model (Model 3) was a second-order model in which items were assigned to scales, and scales were considered to be part of a higher order construct, presumably HRQOL. The adequacy of the models was evaluated using the χ² goodness-of-fit test, the Normed Fit Index (NFI), the Tucker-Lewis Index (TLI), the Comparative Fit Index (CFI), and the Standardized Root Mean Square Residual (SRMR). Model fit was considered adequate if the NFI, TLI, and CFI were ≥0.90 and the SRMR was ≤0.05 (33). It was anticipated that Model 1 would not provide an adequate fit of the data because of the lack of scale specification, that Model 2 would provide a test of the adequacy of the item-to-scale assignment without specifying the relationship among scales, and that Model 3 would evaluate the extent to which scales can be considered part of a more global construct. The extent to which the correlation among the five scales could be accounted for by the higher-order construct was evaluated using the target coefficient described by Marsh and Hocevar (34), calculated as the ratio of the Model 2 χ² to the Model 3 χ².

As anticipated, the fit indices for Model 1 (single-factor model) indicated that this model did not adequately characterize the data (χ² [434] = 9678, NFI = 0.60, TLI = 0.59, CFI = 0.61, SRMR = 0.07). Model 2 (correlated-factors model) resulted in an acceptable fit (χ² [424] = 2139, NFI = 0.91, TLI = 0.92, CFI = 0.93, SRMR = 0.05). The fit of Model 3 (second-order model) was also adequate (χ² [429] = 2316, NFI = 0.91, TLI = 0.92, CFI = 0.93, SRMR = 0.05). The target coefficient of Marsh and Hocevar (34) indicated that 92% of the covariation among scales could be accounted for by the second-order factor. In summary, the confirmatory factor analysis provides strong support both for the proposed scale structure and for the existence of a higher-order factor, presumably HRQOL.
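The fit criteria and the target coefficient reported in this appendix can be checked with a few lines of arithmetic. The sketch below uses only the reported fit statistics; the function names and the dictionary layout are illustrative assumptions and do not reproduce the EQS analysis itself.

```python
def fit_is_adequate(nfi, tli, cfi, srmr):
    """Apply the stated criteria: NFI, TLI, and CFI >= 0.90 and SRMR <= 0.05."""
    return min(nfi, tli, cfi) >= 0.90 and srmr <= 0.05


def target_coefficient(chi2_correlated_factors, chi2_second_order):
    """Ratio of the correlated-factors chi-square to the second-order chi-square."""
    return chi2_correlated_factors / chi2_second_order


# Fit statistics as reported for the three models.
models = {
    "Model 1": {"chi2": 9678, "nfi": 0.60, "tli": 0.59, "cfi": 0.61, "srmr": 0.07},
    "Model 2": {"chi2": 2139, "nfi": 0.91, "tli": 0.92, "cfi": 0.93, "srmr": 0.05},
    "Model 3": {"chi2": 2316, "nfi": 0.91, "tli": 0.92, "cfi": 0.93, "srmr": 0.05},
}

for name, m in models.items():
    print(name, fit_is_adequate(m["nfi"], m["tli"], m["cfi"], m["srmr"]))

t = target_coefficient(models["Model 2"]["chi2"], models["Model 3"]["chi2"])
print(f"{t:.2f}")  # 0.92
```

The ratio 2139/2316 rounds to 0.92, which matches the 92% of covariation among scales attributed to the second-order factor.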
