Expert panel report: Guidelines (2013) for the management of overweight and obesity in adults


Section 1: Background and Description of the NHLBI Cardiovascular Risk Reduction Project

A Background

Since the 1970s, the National Heart, Lung, and Blood Institute (NHLBI) has sponsored the development of clinical practice guidelines that have helped to accelerate the application of health research to strategies and programs for the prevention, detection, and treatment of cardiovascular, lung, and blood diseases. In 2005, recognizing the need to update the most recent cardiovascular guideline reports, namely those on high blood cholesterol, high blood pressure, and overweight and obesity in adults, NHLBI convened stakeholder groups to provide input on the next-generation guideline development process. The stakeholders emphasized the need to:

  • Maintain risk factor-specific cardiovascular clinical practice guidelines.
  • Take a standardized and coordinated approach to the risk factor guidelines updates.
  • Take a more evidence-based approach to guideline development and implementation.
  • Give more attention to dissemination and implementation issues and work closely with stakeholders in health care and community systems for translation and dissemination of the evidence base.

In 2008, NHLBI established three expert panels to develop updates of the guidelines for high blood cholesterol, high blood pressure, and overweight/obesity using a rigorous, systematic evidence review process across all of the groups. Three crosscutting work groups on risk assessment, lifestyle, and implementation were formed to develop their own reports and to provide crosscutting input to the expert panels. A Guidelines Executive Committee composed of all panel and work group co-chairs and NHLBI staff provided coordination for the work of the panels and work groups. The six report topics (blood cholesterol, blood pressure, overweight/obesity, lifestyle, risk assessment, and implementation) are seen as integral and complementary.

While the expert panels and work groups were undertaking a rigorous, systematic, evidence-based approach to updating the guidelines, the Institute of Medicine (IOM) also convened experts to examine the methodology for guideline development. In 2011, the IOM issued two reports that established new “best practice” standards for generating systematic evidence reviews and developing clinical practice guidelines [1, 2]. The reports underscore that these are two distinct, yet related, activities that require careful intersection and coordination.

Because of these developments and the changing approaches to developing guidelines, in June 2012, the NHLBI Advisory Council recommended that the Institute transition to a new model in accordance with the best practice standards established by IOM. In mid-2013, NHLBI adopted a new collaborative partnership model whereby it will focus on generating high-quality systematic evidence reviews and developing subsequent clinical practice guidelines by partnering with professional societies and other organizations [3]. The systematic evidence review components of the five adult clinical practice guidelines (including this systematic evidence review by the Overweight and Obesity Expert Panel) have been released as a public resource to complement the associated publication of the corresponding clinical practice guidelines in collaboration with partner organizations. The American Heart Association (AHA) and the American College of Cardiology (ACC) have agreed to spearhead the collaborative development of the CVD prevention guidelines utilizing the Adult CVD evidence reviews. The Obesity Society, as well as the AHA and ACC published “2013 AHA/ACC/TOS Guideline for the Management of Overweight and Obesity in Adults — A Report of the American College of Cardiology/American Heart Association Task Force on Practice Guidelines and The Obesity Society” [4]. This represents the condensed Executive Summary portion of “Guidelines (2013) for managing overweight and obesity in adults” which includes the full report of the Expert Panel in addition to the Executive Summary.

The Obesity Society chose to support the publication of the full report, inclusive of Evidence Review, Recommendations, Treatment Algorithm and Executive Summary. The decision to make available printed copies of this document to subscribers of Obesity was based on the following rationale: 1. The document represents a landmark in the field, by virtue of the stringent methodology and painstaking approach to produce evidence statements that are trustworthy. 2. Because the recommendations are based on evidence of the highest caliber, this full document is enormously important to everyone in the field, from scientists at the bench to clinicians who face treatment decisions every day. 3. Those who rely on the recommendations promoted by this effort need ready access to the full scope of the document. It is hoped that the print publication of this work will benefit most the patients who rely on the expertise of their primary care providers to understand their weight loss struggles and to help them lose weight safely and effectively.

B Overview of the Evidence-Based Methodology

“GUIDELINES (2013) FOR MANAGING OVERWEIGHT AND OBESITY IN ADULTS: Full Report including the Executive Summary, published by The Obesity Society with the ACC/AHA Task Force on Practice Guidelines and based on a Systematic Evidence Review supported by the NHLBI” represents the state of the art in critical appraisal of the scientific evidence in five important areas: Risks of obesity and overweight, benefits of weight loss, and three treatment modalities for achieving weight loss—diet, comprehensive lifestyle change, and bariatric surgery. This report was developed by an expert panel appointed by NHLBI to update the 1998 “Clinical Guidelines on the Identification, Evaluation, and Treatment of Overweight and Obesity in Adults” (hereafter referred to as the 1998 overweight and obesity clinical guidelines) [5]. The Obesity Expert Panel chose the five areas based on their importance and relevance to primary care providers (PCPs) and the availability of quality research in each area.

This evidence review is based on the following five critical questions (CQs):

Overweight and Obesity Panel—Critical Questions

No.   Question
CQ1.  Among overweight and obese adults, does weight loss produce CVD health benefits, and what health benefits can be expected with different degrees of weight loss?
CQ2.  What are the CVD-related health risks of overweight and obesity, and are the current cutpoints for overweight (BMI 25-29.9 kg/m2), obesity (BMI ≥30 kg/m2), and waist circumference (>102 cm (M) and >88 cm (F)) appropriate for population subgroups?
CQ3.  Which dietary strategies are effective for weight loss?
CQ4.  What is the efficacy/effectiveness of a comprehensive lifestyle intervention program (i.e., diet, physical activity, and behavior therapy) in facilitating weight loss or maintaining weight loss?
CQ5.  What is the efficacy and safety of bariatric surgery? What is the profile (BMI and comorbidity type) of patients who might benefit from surgery for obesity and related conditions?

Using a strict evidence-based methodology to ensure rigor and minimize bias, the Obesity Expert Panel formulated the five CQs from a broad and comprehensive list of 23 questions. The panel followed a prespecified methodological development process. The first step included a systematic review of the literature for a specific period of time and obtaining a quality rating for each of the papers meeting inclusion criteria. Generally, the panel used papers rated at least good or fair to develop evidence tables and summary tables for all five CQs. When papers rated good or fair were not available to address a specific component of a CQ, the panel used papers rated as poor quality to draw conclusions from the evidence. Due to resource constraints, efforts to address CQ1 and CQ2 relied on systematic reviews or meta-analyses rather than individual studies. For each of the CQs, the panel members reviewed the final list of included and excluded articles along with the quality ratings and had the opportunity to raise questions on citations that were missing from the literature search as well as appeal the quality ratings to the methodology team. The team then reexamined these papers and presented their rationale for either keeping or changing the quality rating of the papers. The panel members also played a key role in examining the evidence tables and summary tables to be certain that the data from each paper were accurately displayed. For CQ1 and CQ2, the panel created spreadsheets of the data from the systematic reviews and meta-analyses included in their evidence review.
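The rule for choosing which papers feed the evidence tables (good or fair ratings preferred, poor ratings used only when nothing better addresses the question) can be illustrated with a minimal Python sketch. The function name, the tuple layout, and the study identifiers are hypothetical and are not part of the panel's actual tooling.

```python
from typing import Iterable, List, Tuple

# (study identifier, quality rating: "good", "fair", or "poor")
Study = Tuple[str, str]


def select_studies(studies: Iterable[Study]) -> List[Study]:
    """Prefer studies rated good or fair; fall back to poor only if none exist."""
    studies = list(studies)
    preferred = [s for s in studies if s[1] in ("good", "fair")]
    if preferred:
        return preferred
    # No good or fair papers address this component of the CQ.
    return [s for s in studies if s[1] == "poor"]


# Hypothetical identifiers, for illustration only.
print(select_studies([("study-1", "good"), ("study-2", "poor")]))
print(select_studies([("study-3", "poor")]))
```

In this sketch the fallback applies per call, so it would be invoked once for each component of a CQ rather than once for the whole review.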

In the next step, the panel reviewed summary tables of the evidence to formulate evidence statements, rating them as high, moderate, or low according to the strength of evidence. To grade the body of evidence, NHLBI adapted a system developed by the U.S. Preventive Services Task Force. Throughout the process, there were strict measures to manage conflict of interest and to prevent bias, with the goal of producing reliable, trustworthy, evidence-based statements. Beginning in September 2011, the Guideline Executive Committee (GEC) for the CVD panels and work groups (P/WGs) established an approach to manage relationships with industry (RWI) and other potential conflicts of interest (COI). The GEC comprises a GEC chair and the chairs and co-chairs of the six P/WGs. The results of this evidence review may be used to establish clinical recommendations to diagnose and treat overweight and obese individuals with or without additional risk factors.

This report contains seven sections.

  • Section 1: Background and Description of the NHLBI Cardiovascular Risk Reduction Project.
  • Section 2: Process and Methods Overview.
  • Section 3: CQ1 addresses the expected health benefits of weight loss as a function of the amount and duration of weight loss.
  • Section 4: CQ2 addresses the health risks of overweight and obesity and seeks to determine if the current waist circumference cutpoints and the widely accepted body mass index (BMI) cutpoints defining persons as overweight (BMI 25.0 to 29.9 kg/m2) and obese (BMI ≥30 kg/m2) are appropriate for population subgroups.
  • Section 5: Because patients are interested in popular diets that are promoted for weight loss and see the PCP as an authoritative source for information, CQ3 asks which dietary intervention strategies are effective for weight loss.
  • Section 6: CQ4 seeks to determine the efficacy and effectiveness of a comprehensive lifestyle approach (diet, physical activity, and behavior therapy) to achieve weight loss and weight loss maintenance.
  • Section 7: CQ5 seeks to determine the efficacy and safety of bariatric surgical procedures, including benefits and risks. CQ5 also seeks to determine patient and procedural factors that may help guide decisions to enhance the likelihood of maximum benefit from surgery for obesity and related conditions.
  • Appendixes provide in-depth methods, evidence summary tables, and references:
    • A detailed description of the evidence-based approach and methods is provided in Appendix A. The appendix presents all quality assessment tools used in the development of the present systematic reviews as well as documentation for search strategies and results from the search of the published literature.
    • Appendix B provides information on the literature search strategies used for each of the CQs considered in the evidence review, their PRISMA diagrams, and the list of studies rated as poor with the rationale behind the rating. PRISMA stands for Preferred Reporting Items for Systematic Reviews and Meta-Analyses and is an evidence-based minimum set of items for reporting in systematic reviews and meta-analyses.

C Scope of the Problem

More than 78 million adults in the United States were obese in 2009-2010 [6]. Obesity raises the risk for morbidity from hypertension, dyslipidemia, type 2 diabetes, coronary heart disease (CHD), stroke, gallbladder disease, osteoarthritis, sleep apnea and respiratory problems, and some cancers. Obesity is also associated with increased risk in all-cause and CVD mortality. The biomedical, psychosocial, and economic consequences of obesity have substantial implications for the health and well-being of the U.S. population.

According to the 1998 overweight and obesity clinical guidelines [5], overweight is defined as a BMI of 25 to 29.9 kg/m2 and obesity as a BMI of ≥30 kg/m2. Current estimates are that 69 percent of adults are either overweight or obese, with approximately 35 percent obese [7]. These latest data from the National Health and Nutrition Examination Survey (NHANES) report that for both men and women obesity estimates for 2009-2010 did not differ significantly from estimates for 2003-2008 and that the increases in the prevalence rates of obesity appear to be slowing or leveling off [7]. Yet, overweight and obesity continue to be highly prevalent, especially in some racial and ethnic minority groups as well as in those with lower incomes and less education. Overweight and obesity are major contributors to chronic diseases in the United States and, as such, present a major public health challenge. Finkelstein and colleagues [8] reported that, compared with normal weight individuals, obese patients incur 46 percent higher inpatient costs, 27 percent more physician visits and outpatient costs, and 80 percent higher spending on prescription drugs. The medical care costs of obesity in the United States are staggering. In 2008 dollars, these costs totaled about $147 billion [8].

Since the publication of the 1998 overweight and obesity clinical guidelines [5], the rates of overweight (BMI 25.0 to 29.9 kg/m2) and obesity (BMI ≥30 kg/m2) among U.S. adults have not diminished although there may be a slowing in the trajectory of increase. There are continuing adverse shifts in the distribution of BMI among the U.S. population. From 1998 to 2008, overweight rates were stable and obesity prevalence showed no significant increasing trend among women (adjusted odds ratio for 2007-2008 vs. 1999-2000) while the rates of obesity in men have significantly increased [9]. Figure 1 shows the changes in prevalence of overweight, obesity, and extreme obesity (BMI 40 kg/m2 or greater) from 1960 through 2008, using measured body weight and height from NHANES [10]. Furthermore, the latest available rates [6] indicate no decline in obesity rates in the United States; the age-adjusted rates for U.S. adults for 2010 indicate that 35.7 percent are obese, with women aged 60 and older having the highest rates of obesity (42.3 percent). Perhaps of greatest concern is the shift in the obese BMI distribution to a higher prevalence of BMI ≥40 kg/m2, which was 6.6 percent in years 2009-2010 [11] and appears to be stabilizing over the last 5 years.

Figure 1.

Trends in Overweight, Obesity, and Extreme Obesity Among Adults Aged 20 to 74 years: United States, 1960–1962 Through 2009–2010

Note: Age-adjusted by the direct method to the year 2000 U.S. Bureau of the Census using age groups 20–39, 40–59 and 60–74 years. Pregnant females were excluded. Overweight defined as a BMI of 25 or greater but less than 30; obesity is a BMI greater than or equal to 30; extreme obesity is a BMI greater than or equal to 40. Source: CDC/NCHS. National Health and Nutrition Examination Survey 1988–1994, 1999–2000, 2001–2002, 2003–2004, 2005–2006, 2007–2008, and 2009–2010.

D History

The 1998 overweight and obesity clinical guidelines addressed the identification of and risks associated with overweight and obesity and the health effects of weight loss. They also presented various treatment strategies for weight loss, including diet, physical activity, behavior therapy, pharmacotherapy, and surgery. Unlike the previous obesity guidelines, the 2013 evidence review is not intended to be comprehensive; instead, it is focused on five CQs based on current knowledge. Notably, the 1998 guidelines accomplished a number of major objectives and presented recommendations on assessment and classification that remain valuable. First, the guidelines classified overweight and obesity according to BMI, which is calculated by dividing the weight in kilograms by the square of the height in meters (kg/m2). Overweight is classified as a BMI of 25.0 to 29.9 kg/m2, and obesity is classified as Class I (BMI 30.0 to 34.9 kg/m2), Class II (BMI 35.0 to 39.9 kg/m2), and Class III or extreme obesity (BMI ≥40 kg/m2). This terminology replaces the term “morbid obesity”; because that term has derogatory connotations, the panel recommends that health care practitioners avoid using it.
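As a minimal sketch of the BMI calculation and classification described above, the following Python fragment computes BMI and maps it to the 1998 categories. The function names and the label used for values below 25 kg/m2 are illustrative assumptions, not part of the guidelines, and the published ranges (for example, 25.0 to 29.9) are read here as half-open intervals so that no value falls between classes.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kilograms divided by height in meters squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def classify_bmi(bmi_value: float) -> str:
    """Map a BMI (kg/m2) to the overweight and obesity classes named in the text.

    Assumption: ranges are treated as half-open intervals, so a value such as
    29.95 counts as overweight rather than falling into a gap.
    """
    if bmi_value >= 40:
        return "obesity class III (extreme obesity)"
    if bmi_value >= 35:
        return "obesity class II"
    if bmi_value >= 30:
        return "obesity class I"
    if bmi_value >= 25:
        return "overweight"
    # This passage does not name categories below 25; the label is a placeholder.
    return "below overweight range"


# Hypothetical adult: 95 kg, 1.75 m tall.
value = bmi(95, 1.75)
print(round(value, 1), classify_bmi(value))
```

For the hypothetical adult in the example, the BMI is about 31.0 kg/m2, which falls in Class I obesity.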

Second, the 1998 guidelines linked body fat location to health risks. BMI correlates fairly well with total body fat on a population basis; however, it has limitations in predicting excess body fat associated with health risk on an individual basis. The panel identified excess abdominal fat as associated with greater health risks (abnormal lipid, prothrombotic, and proinflammatory risk factors as well as organ infiltration by fat) than that in peripheral regions. Waist circumference is the most practical measure of abdominal fat. The panel identified waist circumference cutpoints of 40 inches for men and 35 inches for women to aid in individual risk assessment. These guidelines on assessment and classification enabled physicians to identify and treat high-risk patients in their practice. Figure 2 summarizes the 1998 guidelines. This 2013 evidence review reevaluates the data underlying the association of BMI and waist circumference cutpoints with the risk for CVD, its risk factors, and overall mortality.
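The waist circumference check can be sketched in the same way. This illustration uses the centimeter cutpoints quoted in CQ2 (>102 cm for men, >88 cm for women) as the working thresholds; the function and constant names are hypothetical. Because 35 inches converts to 88.9 cm rather than exactly 88 cm, choosing the centimeter values over the inch values is an assumption of this sketch, not something the text specifies.

```python
# Waist circumference cutpoints in centimeters, as quoted in CQ2.
WAIST_CUTPOINT_CM = {"M": 102.0, "F": 88.0}
CM_PER_INCH = 2.54


def inches_to_cm(inches: float) -> float:
    """Convert a measurement in inches to centimeters."""
    return inches * CM_PER_INCH


def waist_above_cutpoint(waist_cm: float, sex: str) -> bool:
    """Return True when waist circumference exceeds the sex-specific cutpoint."""
    try:
        cutpoint = WAIST_CUTPOINT_CM[sex.upper()]
    except KeyError:
        raise ValueError("sex must be 'M' or 'F'") from None
    return waist_cm > cutpoint


# Hypothetical patient: a man with a 41-inch waist.
print(waist_above_cutpoint(inches_to_cm(41), "M"))
```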

Figure 2.

The 1998 Clinical Guidelines: Classification of Overweight and Obesity by BMI, Waist Circumference, and Associated Disease Risk.

*Disease risk for type 2 diabetes, hypertension, and CVD

+Increased waist circumference can also be a marker for increased risk even in persons of normal weight. Source: http://www.nhlbi.nih.gov/guidelines/obesity/ob_home.htm

E Critical Questions on Overweight and Obesity

The first two CQs address weight-related CVD health risks and benefits of weight loss associated with detectable improvements in CVD risk factors/events while the other three address treatments for overweight and obesity. See the Overweight and Obesity Panel—Critical Questions for a complete listing of the five CQs.

E.1 Critical Questions on Weight-Related Health Risks and Benefits of Weight Loss

The panel chose CQ1 and CQ2 to help health care practitioners determine when to recommend weight loss. CQ1 asks if weight loss affects CVD risk factors and events and what cardiovascular health benefits can be expected with different degrees of weight loss. The association of weight loss with increased mortality in many epidemiologic studies is difficult to explain. Many scientists think this may be due to measurement of unintentional weight loss in those studies. Still, this association mandates caution in prescribing weight loss unless patients are at significant risk for comorbidity and there is evidence that patients will benefit from weight loss.

CQ2 addresses the CVD-related health risks of overweight and obesity. This question asks if the current, widely accepted cutpoints for overweight (BMI 25.0 to 29.9 kg/m2) and obesity (BMI ≥30 kg/m2) and waist circumference (>102 cm (M) and >88 cm (F)) are appropriate for identifying elevated risk for CVD, diabetes, hypertension, dyslipidemia, and all-cause mortality in the overall population and key subgroups. This is an important topic because PCPs need to know when to recommend weight loss. All recommended weight loss interventions should be based on an assessment of benefits and risks.

E.2 Critical Questions on Treatments for Overweight and Obesity

Patients are interested in popular weight loss diets and often see the health care practitioner as an authoritative source on such diets. CQ3 asks which dietary strategies are effective for weight loss.

The 1998 overweight and obesity clinical guidelines' approach to obesity treatment begins with lifestyle modification—changes in dietary intake and physical activity—and can be facilitated by key behavioral strategies. Typically, comprehensive weight loss programs employ all three components but may vary in mode of delivery, setting, and implementation strategies. CQ4 seeks to examine evidence related to the efficacy and effectiveness of a comprehensive approach. This question asks how much weight loss can be achieved and how long it can be sustained when these state-of-the-art approaches are used and what is the relative impact of varying some key characteristics of how comprehensive programs are delivered to patients.

Surgery for obesity is an increasingly accepted and accessible option. In fact, Medicare and many insurers now reimburse for this type of surgery. The most frequently used surgical procedures are the laparoscopic gastric band, laparoscopic or open Roux-en-Y gastric bypass, sleeve gastrectomy, and biliopancreatic diversion. CQ5 asks about the efficacy and safety of these procedures by evaluating long- and short-term benefits (risk factors, morbidity, and mortality) and safety. CQ5 also explores data related to the profile of patients (BMI and comorbidity type) who might benefit from this surgery. Answers to these questions will guide PCPs on appropriate recommendations for obese patients who may be surgical candidates.

The panel decided not to address pharmacotherapy for chronic obesity management with a specific CQ. When the panel selected CQs, only two medications were available and approved for chronic use (orlistat and sibutramine). In addition, neither was prescribed widely in primary care, and sibutramine was removed from the market in 2010. The panel did, however, address the effect of orlistat on weight loss and risk factors in CQ1 since the question dealt with the effect of weight loss on risk via a variety of methods, and several meta-analyses covered this topic. Other medications were in later stages of development, but there were insufficient published data to conduct a systematic review. In the interim, two recently approved medications for weight loss, the combination of phentermine and topiramate [12-14] and lorcaserin [15-17], have a growing evidence base. There are also several systematic reviews of pharmacotherapy [18-21].

F Challenges of Achieving Weight Loss in Primary Care Practice

Patients face many challenges in achieving weight loss, including learning a certain set of skills and behaviors. Part of a PCP's role is to help patients learn and practice these skills. CQ4 presents evidence that a comprehensive approach to lifestyle change for weight loss is achievable, and CQ3 underscores the efficacy of many alternative dietary interventions for healthy weight loss when implemented by qualified nutrition professionals. PCPs may also prescribe weight loss medications as an adjunct or refer appropriately selected patients for different kinds of bariatric surgery, which CQ5 examines.

PCPs also must be knowledgeable about the underlying biology that fights weight loss and promotes weight regain. Since publication of the 1998 guidelines, research has shown that for a given environment, body size is predicted largely by genetic factors. In fact, there are strong physiologic mechanisms that resist weight loss and promote regain after weight loss: Changes in fat, gut, and neural signals that regulate appetite and metabolism. Dynamic physiological adaptations occur with decreased body weight, which may alter the time course of individual weight change in response to behavioral interventions [21].

Understanding obesity as a complex, chronic disease is essential for providing effective health care for overweight and obese patients. The pathway to effective weight loss and weight loss maintenance is through long-term changes to eating and physical activity behaviors. Respect for patients and their autonomy and practitioner skills in coaching and motivating patients are qualities that promote successful obesity management. Increasingly, these skills are being addressed in post-graduate training programs. An important responsibility for PCPs is to develop their skills in these areas so that they can better assist their patients, including referral to trained interventionists when appropriate.

A final note on the challenges of weight loss must address the potential harms of weight loss itself. In addition to the risks inherent in pharmacological or surgical treatments for obesity, there are other weight loss risks, such as cholelithiasis. Improvements in fertility with weight loss may result in unplanned pregnancy. In addition, dietary restriction and weight loss may lead to hypoglycemia or hypotension in those on medication, requiring careful monitoring and dose adjustment as needed. Finally, for older individuals, weight loss is advised with caution and is not advisable for those older than 80 years; reduced muscle mass and frailty may occur.

Section 2: Process and Methods Overview

A Overview of Evidence-Based Methodology

To continually improve the quality and impact of NHLBI guidelines, the evidence review process was updated to ensure rigor and minimize bias. Part of this process included using a rigorous, evidence-based methodology and developing evidence statements based on systematic reviews of the biomedical literature for specific periods of time.

The development process followed most of the standards from the IOM report [2] “Clinical Practice Guidelines We Can Trust,” which states that trustworthy guidelines should:

  • Be based on a systematic review of the existing evidence.
  • Be developed by a knowledgeable, multidisciplinary panel of experts and representatives of key affected groups.
  • Consider important patient subgroups and patient preference, as appropriate.
  • Be based on an explicit and transparent process that minimizes distortion, biases, and conflicts of interest.
  • Provide a clear explanation of logical relationships between alternative care options and health outcomes and provide ratings of both the quality of evidence and the strength of the recommendations.
  • Be reconsidered and revised as appropriate when important new evidence warrants modifications of recommendations.

The Obesity Expert Panel included individuals with specific expertise in a range of areas: Psychology, nutrition, physical activity, bariatric surgery, epidemiology, internal medicine, and other clinical specialties. All panels and work groups followed the same methods, with variations as needed to reflect the evidence in the field as well as time and resource constraints. The methodology included numerous components and followed a prespecified development process. Directed by NHLBI, with support from a methodology contractor and a systematic review and general support contractor, the expert panels and work groups:

  • Developed an evidence model.
  • Constructed CQs most relevant to clinical practice. CQs followed the “PICOTS” (patient population, intervention/exposure, comparison group, outcome, timing, and setting) format.
  • Identified (a priori) inclusion/exclusion (I/E) criteria for each CQ.
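As a hedged illustration of the PICOTS format, a critical question can be held in a small record with one field per element. The class name, field names, and the way CQ3 is filled in are assumptions made for this sketch: the population is inferred from the wording of CQ1, and the comparison, timing, and setting are left unset because this section does not state them.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class CriticalQuestion:
    """One critical question laid out in the PICOTS format."""
    label: str
    population: str
    intervention_or_exposure: str
    outcome: str
    comparison: Optional[str] = None  # not stated in this section
    timing: Optional[str] = None      # not stated in this section
    setting: Optional[str] = None     # not stated in this section

    def unspecified(self) -> List[str]:
        """Names of the PICOTS elements that have been left unset."""
        return [name for name in ("comparison", "timing", "setting")
                if getattr(self, name) is None]


# Illustrative reading of CQ3 ("Which dietary strategies are effective for weight loss?").
cq3 = CriticalQuestion(
    label="CQ3",
    population="overweight and obese adults",  # assumed from the wording of CQ1
    intervention_or_exposure="dietary strategies",
    outcome="weight loss",
)
print(cq3.unspecified())
```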

Directed by NHLBI, with input from the panels and work groups, the contractor staff:

  • Developed a search strategy based on I/E criteria for each CQ.
  • Executed a systematic electronic search of the published literature from relevant bibliographic databases for each CQ. The date for the overall literature search was from January 1998 to December 2009. Because CQ1 and CQ2 used systematic reviews and meta-analyses, the literature search included publications from January 2000 to October 2011. CQ3 and CQ4 added major randomized controlled trials (RCTs) published after 2009 with greater than 100 individuals per treatment arm; and CQ5 added some major studies published after 2009 that met the I/E criteria.
  • Screened, by two independent reviewers, thousands of abstracts and full-text articles to identify relevant original articles, systematic reviews, and/or meta-analyses. They applied rigorous validation procedures to ensure that the selected articles met the preestablished I/E criteria before they were included in the final review results.
  • Determined, by two independent raters, the quality of each included study. The methodology staff, with NHLBI input, adapted study rating instruments and trained study raters on the use of these instruments. Six quality assessment tools were designed to assist reviewers in the critical appraisal of a study's internal validity.
  • Reviewers used the study ratings to judge each study to be of “good,” “fair,” or “poor” quality. The reviewers used the ratings to assess the risk of bias in the study due to flaws in study design or implementation.
  • Abstracted relevant information from the included studies into an electronic database, called the Central Repository. Templates with lists of data elements pertinent to the established I/E criteria were constructed and used to support abstraction.
  • Constructed detailed evidence tables to organize the data from the abstraction database.
  • Analyzed the evidence tables and constructed summary tables, which display the evidence in a manageable format to answer specific parts of the CQ.
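The publication-date windows given in the search step above can be expressed as a simple eligibility check. This is a sketch under stated assumptions: the function and constant names are hypothetical, the month-level dates in the text are read as covering whole months, and the CQ5 exception (some major studies after 2009 that met the I/E criteria) is not modeled because the text gives no numeric rule for it.

```python
from datetime import date

# Search windows as (first day, last day), from the months given in the text.
OVERALL_WINDOW = (date(1998, 1, 1), date(2009, 12, 31))
REVIEW_WINDOW = (date(2000, 1, 1), date(2011, 10, 31))  # CQ1 and CQ2


def in_search_window(cq: str, published: date) -> bool:
    """True if the publication date falls inside the search window for this CQ."""
    start, end = REVIEW_WINDOW if cq in ("CQ1", "CQ2") else OVERALL_WINDOW
    return start <= published <= end


def qualifies_as_late_rct(cq: str, published: date, smallest_arm: int) -> bool:
    """CQ3/CQ4 exception: RCTs published after 2009 with more than 100 per arm."""
    return (cq in ("CQ3", "CQ4")
            and published > OVERALL_WINDOW[1]
            and smallest_arm > 100)


# Hypothetical trial published in March 2011 with 150 participants in its smallest arm.
print(in_search_window("CQ4", date(2011, 3, 1)))
print(qualifies_as_late_rct("CQ4", date(2011, 3, 1), 150))
```

Passing the date check is only one part of eligibility; the I/E criteria themselves are applied separately by the reviewers.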

The expert panels and work groups:

  • Used summary tables to develop evidence statements for each CQ. The quality of evidence for each evidence statement was graded as high, moderate, or low based on scientific methodology, scientific strength, and consistency of results. (See discussion below.) For CQ1 and CQ2, spreadsheets with relevant data from systematic reviews and meta-analyses rather than summary tables were developed.
  • Drafted a report that underwent review by representatives of key Federal agencies, a group of experts selected by NHLBI, and a subgroup of the NHLBI Advisory Council.

See Appendix A for more details on the evidence-based process and Appendix B for literature search strategies used for CQs.

System for Grading the Body of Evidence

NHLBI adapted a system developed by the U.S. Preventive Services Task Force to grade the body of evidence. The panels graded the evidence statements for quality as high, moderate, or low (see Table 1). They then graded the recommendations as Strong Recommendation (Grade A), Moderate Recommendation (Grade B), Weak Recommendation (Grade C), Recommendation Against (Grade D), Expert Opinion (Grade E), or No Recommendation for or Against (Grade N) (see Table 1A). The grades provide guidance to PCPs and other practitioners on how well the evidence supports the evidence statements and recommendations. The strength of the body of evidence represents the degree of certainty, based on the overall body of evidence, that an effect or association is correct. Appendix A2-6 describes how four domains of the body of evidence—risk of bias, consistency, directness, and precision—were used to grade the strength of evidence.

Table 1. Evidence Quality Grading System
[Table image not reproduced]
Table 1A. Grading the Strength of Recommendations
[Table image not reproduced]

A.1 Peer-Review Process

A formal peer-review process was undertaken that included inviting several scientific experts and representatives from multiple Federal agencies to review and comment on the draft documents. NHLBI selected scientific experts with diverse perspectives to review the reports. Potential reviewers were asked to sign a confidentiality agreement, but NHLBI did not collect COI information from the reviewers. DARD staff collected reviewers' comments and forwarded them to the respective panels and work groups for consideration. Each comment received was addressed—either by a narrative response and/or a change to the draft document. A compilation of the comments received and the panels' and work groups' responses was submitted to the NHLBI Advisory Council working group; individual reviewers did not receive responses.

B Critical Question-Based Approach

The body of this report is organized by CQ. For each CQ:

  • The rationale for its selection is provided and methods described.
  • The body of evidence is summarized, and evidence statements, which include a rating for quality, are presented. A narrative summary also supports each evidence statement.

A detailed description of the evidence-based approach and methods is provided in Appendix A. The appendix presents all tools used in the development of the present systematic reviews as well as documentation for search strategies and results from the search of the published literature. See Appendix B for information on the literature search strategies used for each of the CQs considered in the evidence review, their PRISMA diagrams, and their list of studies rated as poor with the rationale behind the rating. (PRISMA stands for Preferred Reporting Items for Systematic Reviews and Meta-Analyses and is an evidence-based minimum set of items for reporting in systematic reviews and meta-analyses.)

B.1 Critical Questions on Overweight and Obesity

The Obesity Expert Panel began the process of selecting CQs by collecting proposed questions and topic areas, prioritizing questions based on resource constraints, and ranking the questions through discussion and voting. From the 23 identified questions, the panel chose 5 CQs to address. The topics considered but not selected included genetics, binge eating disorders, physical activity, pharmacotherapy, and cost effectiveness of interventions to treat and manage obesity.

The first two CQs address weight-related health risks of obesity and benefits of weight loss while the other three address treatments for overweight and obesity.

The panel chose CQ1 and CQ2 to help health care practitioners determine when to recommend weight loss. CQ1 asks if weight loss affects CVD risk factors and events and what CVD-related health benefits can be expected with different degrees of weight loss. The association of weight loss with increased mortality in many epidemiologic studies is difficult to explain. Many scientists believe this association may reflect unintentional weight loss measured in those studies. Still, this association suggests that practitioners should be cautious in prescribing weight loss unless patients are at high risk for comorbidity or there is evidence that the patient will benefit from weight loss.

CQ2 addresses the CVD-related health risks of overweight and obesity. This question asks if the widely accepted cutpoints defining individuals as overweight (BMI 25.0 to 29.9 kg/m²) and obese (BMI ≥30 kg/m²) and the current waist circumference cutpoints (>102 cm (M) and >88 cm (F)) are appropriate for identifying elevated risk for CVD, diabetes, hypertension, dyslipidemia, and all-cause mortality in population subgroups. This is an important topic on which to comment because PCPs need to know when to recommend weight gain prevention or weight loss.
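The cutpoints under examination in CQ2 can be written out as a short calculation. The sketch below is illustrative only and is not a clinical tool: the function names are hypothetical, BMI is computed with the standard formula of weight in kilograms divided by height in meters squared, and values between 29.9 and 30.0 are assumed to fall in the overweight band because the text does not say how they are handled.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index in kg/m^2 (standard formula)."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def bmi_category(bmi_value: float) -> str:
    """Apply the BMI cutpoints quoted in the text (25.0 and 30)."""
    if bmi_value >= 30.0:
        return "obese"
    if bmi_value >= 25.0:
        # Assumption: anything from 25.0 up to, but not including, 30 is overweight.
        return "overweight"
    return "below overweight cutpoint"


def waist_above_cutpoint(waist_cm: float, sex: str) -> bool:
    """True if waist circumference exceeds the quoted cutpoint (>102 cm M, >88 cm F)."""
    cutpoints_cm = {"M": 102.0, "F": 88.0}
    key = sex.strip().upper()
    if key not in cutpoints_cm:
        raise ValueError("sex must be 'M' or 'F'")
    return waist_cm > cutpoints_cm[key]
```

For instance, a person weighing 90 kg at a height of 1.75 m has a BMI of about 29.4 and falls in the overweight band under these cutpoints.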

CQ3 asks which dietary strategies are effective in achieving weight loss. Patients are interested in the popular weight loss diets and view the PCP as an authoritative source on such diets. To achieve weight loss, most practitioners recommend a comprehensive approach: diet, physical activity, and behavior therapy. CQ4 seeks to determine the efficacy and effectiveness of a comprehensive approach. Specifically, this CQ asks how much weight loss can be achieved and how long it can be sustained when these state-of-the-art approaches are used, and what impact each component of the comprehensive programs has.

Since publication of the overweight and obesity clinical guidelines in 1998 [5], bariatric surgery has evolved and is now being used more frequently. Surgical procedures most often used include the laparoscopic gastric band, laparoscopic or open Roux-en-Y gastric bypass, sleeve gastrectomy, and biliopancreatic diversion. CQ5 explores the efficacy and safety of these procedures, including their short- and long-term benefits (risk factors, morbidity, and mortality). In addition, CQ5 asks what profile (BMI and comorbidity type) of patients might benefit from bariatric surgery. Answers to these questions will help guide PCPs in advising and referring obese patients for this surgery.

The five CQs on overweight and obesity will help practitioners identify patients who need intervention and determine which weight loss techniques to recommend. Importantly, the questions target areas in which recent research has yielded discoveries. They also highlight important topics in which informed practitioners can impact public health.

Section 3: Critical Question 1

A Statement of the Question

  1. Among overweight and obese adults, does achievement of reduction in body weight with lifestyle and pharmacological interventions affect cardiovascular disease (CVD) risk factors, CVD events, morbidity, and mortality?

    a. Does this effect vary across population subgroups defined by the following demographic and clinical characteristics:

      i. Age
      ii. Sex
      iii. Race/ethnicity
      iv. Baseline BMI
      v. Baseline waist circumference
      vi. Presence or absence of comorbid conditions
      vii. Presence or absence of CVD risk factors
    b. What amount (shown as percent lost, pounds lost, etc.) of weight loss is necessary to achieve benefit with respect to CVD risk factors, morbidity, and mortality?

      i. Are there benefits on CVD risk factors, CVD events, morbidity, and mortality from weight loss?
      ii. What are the benefits of more significant weight loss?
    c. What is the effect of sustained weight loss for 2 or more years in individuals who are overweight or obese on CVD risk factors, CVD events, and health and psychological outcomes?

      i. What percent of weight loss needs to be maintained at 2 or more years to be associated with health benefits?

By Population Subgroups

  • Age
  • Sex
  • Socioeconomic status (no evidence anticipated)
  • Race/ethnicity
  • Baseline BMI (overweight (BMI 25.0 to 29.9) vs. obese (BMI ≥30.0))
  • Baseline waist circumference
  • Presence or absence of comorbid conditions

    • Diabetes
    • Metabolic syndrome
    • Depression
    • Quality-of-life issues
  • Presence or absence of CVD risk factors

    • Smoking
    • More than one risk factor
    • Baseline (not necessarily pretreatment) low-density lipoprotein cholesterol (LDL-C) >100 mg/dL
    • Triglycerides ≥200 mg/dL
    • High-density lipoprotein cholesterol (HDL-C) <40 mg/dL
    • Hypertension
    • Diminished cardiorespiratory fitness
    • Previous CVD event
    • Elevated C-reactive protein (CRP)

A.1 By Amount of Weight Loss

  • Different cutpoints
  • Significant weight loss

A.2 By Weight Loss Maintenance

  • Different cutpoints

B Selection of the Inclusion/Exclusion Criteria

Panel members developed eligibility criteria, based on a population, intervention/exposure, comparison group, outcome, time, and setting (PICOTS) approach, for screening potential studies for inclusion in the evidence review. The first six criteria were the PICOTS elements; the remaining criteria addressed study design, language, publication type, and publication timeframe. Table 2 presents the details of the PICOTS approach for CQ1.

Table 2. Criteria for Selection of Publications for CQ1

C Introduction and Rationale for Question and Inclusion/Exclusion Criteria

CQ1 addresses the health benefits of weight loss in overweight and obese adults in terms of reduction in cardiovascular risk factors and events, morbidity, and mortality. The goal for this CQ was to determine whether risk reduction varied as a function of pre-weight loss risk factors, degree of overweight, age, sex, ethnicity, and waist circumference. An additional goal was to assess what degree of weight loss is associated with detectable improvements in CVD risk factors/events, whether there is evidence for greater improvements with greater weight loss, and the benefits of prolonged (≥2 years) weight loss. This is an important topic with respect to providing evidence that could support judgments about the relative benefits of reducing weight and being able to explain these benefits to patients considering a weight loss program.

D Methods for Critical Question 1

The Obesity Expert Panel formed work groups for each of its five CQs. For CQ1, the work group was chaired by a physician and was composed of physicians and investigators representing academic institutions across the United States.

CQ1 addresses the relationship between weight loss and reductions in CVD risk factors and events as a function of the preexisting status of the patients being treated. The methodology team assisted by applying the PICOTS criteria. The methodology team also worked with the CQ1 panel members to develop and refine the detailed I/E criteria. CQ1 was initially intended to be a de novo systematic review of original studies plus systematic reviews and meta-analyses. In 2011, the CQ was re-scoped and restricted to systematic reviews and meta-analyses only. In order to accomplish the goal within the allocated resources, NHLBI staff and panel members decided that CQ1 and CQ2 would focus primarily on evidence available from systematic reviews, meta-analyses, and a limited number of individual articles that represent studies with impact equal to systematic reviews and meta-analyses. This approach allowed the CQ1 members to address some, but not all, elements of CQ1.

The literature search for CQ1 included an electronic search of the Central Repository for systematic reviews and meta-analyses published in the literature from January 2000 to October 2011. The Central Repository contains citations pulled from seven literature databases (PubMed, CINAHL, EMBASE, PsycINFO, EBM, Biological Abstracts, and Wilson Social Sciences Abstracts). The search produced 1,633 citations, including 3 additional citations identified from nonsearch sources (i.e., by the panel members) [22-24]. Figure 3 outlines the flow of information from the literature search through the various steps used in the systematic review process.

Figure 3.

PRISMA Diagram Showing Selection of Articles for Critical Question 1

The titles and abstracts of 1,630 publications were screened against the I/E criteria independently by 2 reviewers, which resulted in 697 publications being retrieved for full-text review to further assess eligibility and the remainder being excluded. These full-text publications were independently screened by 2 reviewers who assessed eligibility by applying the I/E criteria; 669 of these publications were excluded based on one or more of the I/E criteria (see specified rationale as noted in the PRISMA diagram).

Of the 697 full-text publications, 42 met the criteria and were included. The quality (internal validity) of these 42 publications was assessed using the quality assessment tool developed to assess systematic reviews and meta-analyses or RCTs (see Appendix tables A-1 and A-2). Of these, 14 publications were rated as poor quality [25-38]. The rationales for rating all of the poor-quality studies are included in Appendix table B-10. The remaining 28 publications were rated good or fair quality [22, 23, 39-64] and included in the evidence base that was used to formulate the evidence statements. The panel members reviewed these 28 articles along with their quality ratings and had the opportunity to raise questions. The review of evidence for CQ1 was based largely upon systematic reviews and meta-analyses of randomized controlled trials and observational cohort studies that were published between the years 2001 and 2011. Results from selected individual RCTs that included approximately the same number of participants/observations as were available in the systematic reviews and meta-analyses within topic areas (diabetes/glucose, lipids, and blood pressure) were also used.
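The full-text counts reported for CQ1 can be cross-checked with simple arithmetic. The sketch below uses only the numbers stated in the text (697 full-text publications, 669 exclusions, 42 meeting criteria, 14 rated poor, 28 rated good or fair). It assumes, as one possible reading that the text does not state explicitly, that the 669 full-text exclusions include the 14 publications later rated as poor quality. The function names are hypothetical.

```python
def evidence_base_size(met_criteria: int, rated_poor: int) -> int:
    """Publications left for evidence statements after dropping poor-quality ones."""
    return met_criteria - rated_poor


def remaining_after_exclusion(full_text_reviewed: int, excluded: int) -> int:
    """Publications remaining after full-text exclusions."""
    return full_text_reviewed - excluded


# Counts quoted in the text.
FULL_TEXT_REVIEWED = 697
EXCLUDED_AT_FULL_TEXT = 669
MET_CRITERIA = 42
RATED_POOR = 14
RATED_GOOD_OR_FAIR = 28

# Both routes arrive at the 28 publications used for the evidence statements,
# under the assumption that the 669 exclusions include the 14 poor-quality items.
by_quality = evidence_base_size(MET_CRITERIA, RATED_POOR)
by_exclusion = remaining_after_exclusion(FULL_TEXT_REVIEWED, EXCLUDED_AT_FULL_TEXT)
print(by_quality, by_exclusion, by_quality == RATED_GOOD_OR_FAIR)
```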

Approval was received from NHLBI to use relevant data from an RCT (i.e., Look AHEAD (Action for Health in Diabetes)) based on the following rationale. Look AHEAD is a prospective, multicenter, randomized controlled trial that examined the effects of an intensive lifestyle intervention versus usual diabetes care, referred to as diabetes support and education (DSE), on cardiovascular morbidity and mortality in 5,145 overweight or obese participants with type 2 diabetes. This single trial provides data on more patients than the two meta-analyses by Norris et al. [53] and Norris et al. [50] (N = 4,659) and almost as many as Norris et al. [51] (N = 5,956) and Orozco [65] (N = 5,956). The Look AHEAD investigators provided 4-year comparison outcome data [23] and, more important, ≥1-year dose-response data that relate the amount of weight loss to predefined CVD risk factors [22].

Subsequent to receiving approval to include relevant data from Look AHEAD, an additional search was made of the de novo citations included during the early screening stages for RCTs of similar size to Look AHEAD (≥5,000); through this process, no additional relevant studies were found.

For CQ1, spreadsheets containing key information from the systematic reviews, meta-analyses, and the Look AHEAD studies were created by the panel members; these spreadsheets, cross-checked by the methodology and systematic review teams for accuracy, formed the basis for panel deliberations.

To examine the possible effects of weight loss on mortality, longitudinal, prospective cohort studies were used to ensure enough events were recorded for a reasonably accurate estimate of effect. These types of studies are, by necessity, different from prospective randomized controlled trials, in which for ethical reasons the control group must receive the standard of care for cardiovascular risk factors. In observational cohort studies, the participants may or may not receive community standard care for risk factors.

E Evidence Statements and Summaries

E.1 Weight Loss and Risk for Diabetes: Spreadsheets 1.1.−1.4b

Diabetes outcomes were derived from nine systematic reviews and meta-analyses and two primary publications from Look AHEAD. The literature available did not specifically address whether age, sex, ethnicity, or waist circumference influence the response to weight loss in terms of CVD risk reduction.

ES1. In overweight and obese adults at risk for type 2 diabetes, average weight losses of 2.5 to 5.5 kg at 2 or more years, achieved with lifestyle treatment (with or without orlistat), reduce the risk for developing type 2 diabetes by 30 to 60 percent.

Strength of evidence: High

Rationale: Systematic reviews and meta-analyses [39, 41, 51, 52], largely using the same database, consistently find that intentional weight loss reduces the risk for developing type 2 diabetes in at-risk populations. Typically, at-risk populations are overweight/obese and have glucose intolerance, a family history of type 2 diabetes, and often other comorbidities such as hypertension and dyslipidemia. The estimates of risk reduction are quite consistent between studies.

ES2. In overweight and obese adults with type 2 diabetes, 2- to 5-percent weight loss achieved with 1 to 4 years of lifestyle treatment (with or without orlistat) results in modest reductions in fasting plasma glucose concentrations and lowering of HbA1c by 0.2 to 0.3 percent.

Strength of evidence: High

Rationale: Some of the meta-analyses included in the evidence base used pooled results from studies dating from the late 1970s through the early 2000s [41, 42, 53]. As a result, these authors included noncomprehensive weight loss approaches and studies with widely varying degrees of success in terms of weight loss. Modest average weight loss was reported in many older studies, which was associated with insignificant reductions in fasting blood glucose. Furthermore, some meta-analyses combined glucose and HbA1c data from persons with and without type 2 diabetes [42]. The concern was that this analytical approach would not truly reflect the impact of interventions on improvements in HbA1c in type 2 diabetes because a person's normal glucose and HbA1c values do not become “more normal” with weight loss. Thus, some of the pooled data from systematic reviews and meta-analyses were difficult to interpret with regard to the question of whether and how much weight loss is needed to affect diabetes-related outcomes. One advantage of examining outcomes from older studies, however, is that the control groups generally received weak interventions, both in terms of support and pharmacotherapy. The evolving evidence that pharmacotherapy for hyperglycemia, hyperlipidemia, and hypertension had clear medical benefits required changes in subsequent study designs: ethically, the control groups for lifestyle treatment must be provided with aggressive pharmacotherapy for these CVD risk factors. As a result, improvements in CVD risk factors in lifestyle treatment relative to control groups in more recent studies are less impressive than in older studies. Unfortunately, in the literature base for this CQ, only the Look AHEAD papers provided data as to the confounding effects of greater use of medications in control groups. The between-group differences in medication use were not addressed in systematic reviews and meta-analyses in a manner that could be assessed.

ES3. In overweight and obese adults with type 2 diabetes, those who achieve greater weight loss at 1 year with lifestyle therapy (with or without orlistat) have greater improvements in HbA1c. Weight loss of 5 to 10 percent is associated with HbA1c reductions of 0.6 to 1.0 percent and reduced need for diabetes medications.

Strength of evidence: High

Rationale: This pattern is seen both in meta-analyses examining different studies with different amounts of weight loss [50, 51] and within a large, prospective randomized controlled trial [22]. As noted, the probability of achieving a clinically meaningful reduction in HbA1c is increased with weight loss of 2 to 5 percent, and the probability increases further as the amount of weight loss increases. The relationship between the amount of weight loss and the improvement in HbA1c between different studies was commented upon by Norris et al. [50, 51]. The reports of Wing [22, 23] and Korhonen, Heller, Uusitupa, and Zapotoczky (all cited and referenced in Norris et al. [50, 51]) and the Norris et al. meta-analyses [19, 50-52] included data on the average weight loss and average change in HbA1c. Those studies in which patients had the greatest weight loss reported the greatest decline in HbA1c. Look AHEAD [49] provided data regarding the relationship between weight loss and improvement in glycemic control, blood pressure, and blood lipids. They found a strong relationship between the amount of weight loss and the improvement in these risk factors irrespective of the group to which the participants were assigned (intensive lifestyle or diabetes support and education (DSE)). In Look AHEAD [22], there was a dose-response relationship between weight loss and the likelihood of achieving a clinically meaningful improvement in HbA1c (a reduction of at least 0.5 percent). A 2- to 5-percent weight loss results in a statistically significant increase in the likelihood of achieving a reduced HbA1c compared with the weight-stable (gained ≤2 percent or lost <2 percent) group. However, on average the improvements in glucose and HbA1c with 2- to 5-percent weight loss are modest.

The Look AHEAD investigators [22] found that the dose-response relationships between weight loss and the average reduction in fasting glucose and HbA1c were such that those losing ≥15 percent of body weight over 1 year had an average reduction in fasting glucose of ∼35 mg/dL and an average decrease of ∼0.9 percent in HbA1c. These improvements in fasting glucose and HbA1c were seen despite a significantly reduced need for antidiabetic medications in the group treated with intensive lifestyle intervention (ILI) compared with the control group [23].

ES4. In overweight and obese adults with type 2 diabetes treated for 1 year with lifestyle therapy (with or without orlistat), those who lose more weight achieve greater reductions in fasting plasma glucose concentrations. Those who achieve weight losses of 2 to 5 percent are more likely to have clinically meaningful (>20 mg/dL) reductions in fasting glucose than those who remain weight stable (defined as gaining ≤2 percent or losing <2 percent).

Strength of evidence: High

Rationale: The Look AHEAD investigators and Avenell et al. [41] examined the dose-response relationship between weight loss and lowering of fasting plasma glucose concentrations. Both found a dose-response relationship, such that greater degrees of weight loss were associated with greater reductions in fasting glucose. The Look AHEAD investigators examined the relationship between weight loss and weight loss categories and the likelihood of achieving a clinically meaningful improvement in fasting blood glucose (a priori defined as >20 mg/dL decrease). This group reported that a 2- to 5-percent weight loss results in an ∼70 percent increase in the likelihood of achieving a 20 mg/dL reduction in fasting glucose compared to being weight stable (gained ≤2 percent or lost <2 percent). In addition, those who lost 2 to 5 percent of body weight were less likely to require antidiabetic medications than those who remained weight stable. However, the odds of achieving this ≥20 mg/dL glucose reduction goal in the weight-stable group were not reported in a manner that allowed us to determine the absolute likelihood of significant glucose lowering with 2 to 5 percent weight loss. It appears there were no significant differences in average fasting plasma glucose between the weight-stable and 2 to 5 percent weight loss groups in the Look AHEAD participants at 1 year. The reductions in HbA1c with weight loss may be more apparent than reductions in fasting plasma glucose because HbA1c reflects the integrated glycemic response. Lifestyle intervention (with or without orlistat) may be effective in improving insulin action/secretion, such that post-prandial blood glucose levels may be more improved than fasting blood glucose. In addition, the day-to-day variability in fasting blood glucose in type 2 diabetes will make it more difficult to detect improvements in glycemia using this outcome than if HbA1c is used.
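The weight change categories used in these evidence statements can be illustrated with a short calculation. This is a minimal sketch, not the Look AHEAD analysis code. The weight-stable definition (gained ≤2 percent or lost <2 percent) and the 2 to 5 percent, 5 to 10 percent, and ≥15 percent bands come from the text. The 10 to <15 percent band, the treatment of exact band edges, and the function names are assumptions made here for illustration.

```python
def percent_weight_change(baseline_kg: float, followup_kg: float) -> float:
    """Percent change from baseline; negative values indicate weight loss."""
    if baseline_kg <= 0:
        raise ValueError("baseline weight must be positive")
    return (followup_kg - baseline_kg) * 100.0 / baseline_kg


def weight_change_category(pct_change: float) -> str:
    """Assign a weight change band; band edges are an assumption of this sketch."""
    if pct_change > 2.0:
        return "gained >2%"
    loss = -pct_change
    if loss < 2.0:
        # Gained <=2 percent or lost <2 percent, as defined in the text.
        return "weight stable"
    if loss < 5.0:
        return "lost 2 to <5%"
    if loss < 10.0:
        return "lost 5 to <10%"
    if loss < 15.0:
        # Assumed intermediate band; not named in this section.
        return "lost 10 to <15%"
    return "lost >=15%"
```

For example, a participant who goes from 100 kg to 97 kg has a change of -3 percent and falls in the 2 to <5 percent loss band, whereas a change from 100 kg to 99 kg is classified as weight stable.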

ES5. As comprehensive lifestyle treatment of overweight and obese adults with type 2 diabetes continues over 4 years, some weight regain will occur on average; partial weight regain is associated with an increase in HbA1c, but HbA1c remains below preintervention levels and the reduction remains clinically meaningful [23].

Strength of evidence: Moderate

Rationale: Look AHEAD enrolled more than 5,000 patients with type 2 diabetes and has achieved a followup rate of 93 and 94 percent in the ILI and DSE groups, respectively. Although only a single study, the number of observations is approximately equal to that obtained in the available systematic reviews and meta-analyses, none of which addressed part “c” of this question, “What is the effect of sustained weight loss for 2 or more years in individuals who are overweight or obese on CVD risk factors, CVD events and health and psychological outcomes?” The ILI cohort had maintained a mean weight loss of 4.7 percent at 4 years, compared with a 1.1 percent weight loss in the DSE group. The pattern of weight regain between 1 and 4 years in the ILI group was mirrored by gradual increases in HbA1c although the need for antidiabetic medication remained lower in the ILI group over all 4 years. At the end of 4 years of treatment, those in the ILI group were more likely to meet goals for HbA1c and LDL-C than those receiving DSE. Those receiving intensive lifestyle treatment were also less likely to have started antidiabetic medication (including insulin) and more likely to have discontinued diabetes medications. ILI patients were more likely to have discontinued antihypertensive medications and less likely to have started lipid-lowering medication than DSE patients.

ES6. In observational cohort studies, overweight and obese adults with type 2 diabetes who intentionally lost 9 to 13 kg had a 25-percent decrease in mortality rate compared to weight-stable controls [39, 57].

Strength of evidence: Low

Rationale: One aspect of this CQ was to address whether and how much weight loss is associated with reduced mortality rates in those with CVD risk factors. Poobalan et al. [57] examined the literature for evidence that weight loss reduces mortality. There was evidence that intentional weight loss in both men and women with diabetes reduced mortality rates. One of the studies included in this systematic review indicated that women with obesity-related illnesses who intentionally lost >20 pounds of weight had reduced mortality rates within 1 year, whereas this was not seen for men who intentionally lost weight. Because none of the studies included were prospective, randomized trials of lifestyle treatment to achieve weight loss, these findings were considered to have low strength of evidence. After these evidence statements were developed, Look AHEAD was stopped because of the low likelihood of a difference in cardiovascular events between the ILI group and the DSE control group. Both groups received aggressive medical management of cardiovascular risk factors, a situation not comparable to the observational studies reviewed by Poobalan et al.

ES7. In overweight and obese adults with type 2 diabetes, orlistat compared to placebo, both with lifestyle treatment, results in a 2 to 3 kg greater weight loss at 1 and 2 years. The addition of orlistat is associated with greater reductions in fasting blood glucose, averaging 11 and 4 mg/dL at 1 and 2 years, respectively, as well as an average greater reduction in HbA1c of 0.4 percent at 1 year [18, 42, 51].

Strength of evidence: High

Rationale: One aspect of this question was to address whether “reduction in body weight with lifestyle and pharmacological interventions affect CVD risk factors, CVD events, morbidity, and mortality”; however, the only agent that was FDA approved for long-term treatment of obesity at the time of the literature review was orlistat. Therefore, the review focused on systematic reviews and meta-analyses that examined the published orlistat results. Four publications analyzed the effects of orlistat on fasting blood glucose and HbA1c. Avenell et al. [41, 42] reported that orlistat at the standard prescribed dose of 120 mg three times daily with meals resulted in an average extra 1 to 3 kg of weight loss at 1 and 2 years compared with placebo and that this was associated with greater reductions in fasting blood glucose of 11 mg/dL and 4 mg/dL at 1 and 2 years, respectively, and a 0.3-percent greater reduction in HbA1c at 1 year. Norris et al. [51] reported that 1 year of orlistat therapy resulted in an average 2 kg greater weight loss than placebo, a 13 mg/dL greater reduction in fasting blood glucose, and a 0.4-percent greater reduction in HbA1c compared with placebo. Similar findings were reported by O'Meara et al. [18]. Standard orlistat therapy for 1 year resulted in an average 2.4 kg greater weight loss, 11 mg/dL greater fasting blood glucose reduction, and a 0.4-percent greater HbA1c reduction compared with placebo.

F Weight Loss and Impact on Cholesterol/Lipid Profile—Spreadsheet 1.5a-1.5c

Seven systematic reviews and meta-analyses and three reports from the Look AHEAD research group were used to examine the effects of weight loss on lipid outcomes achieved in overweight and obese adults with lifestyle interventions or weight loss drugs combined with lifestyle modification. The literature available to us did not specifically address whether age, sex, ethnicity, or waist circumference influence the response to weight loss in terms of CVD risk reduction. The Look AHEAD trial provides evidence of the effect of weight loss on lipids and lipid-lowering medication use at 1 to 4 years of followup achieved by comprehensive lifestyle intervention in overweight and obese individuals with type 2 diabetes.

ES1. In overweight or obese adults with or without elevated CVD risk, there is a dose-response relationship between the amount of weight loss achieved by lifestyle intervention and the improvement in lipid profile [22, 58]. The level of weight loss needed to observe these improvements varies by lipid.

  • At a 3 kg weight loss, a weighted mean reduction in triglycerides of at least 15 mg/dL is observed.
  • At a 5 to 8 kg weight loss, LDL-C reductions of approximately 5 mg/dL and increases in HDL-C of 2 to 3 mg/dL are achieved [22, 23, 41, 43, 45].
  • With a less than 3 kg weight loss, more modest and more variable improvements in triglycerides, HDL-C, and LDL-C are observed [64].

Strength of evidence: High

Rationale: Systematic reviews, meta-analyses, and selected reports from Look AHEAD were used to determine if there is a dose-response relationship between the amount of weight loss achieved by lifestyle intervention and the improvement in lipid profile in overweight or obese adults with or without elevated CVD risk [22, 58] and the level of weight loss needed to observe improvements in lipids [22, 23, 41, 43, 45, 64]. Some of the meta-analyses reported the weighted mean difference (WMD) between lifestyle intervention and control, yet the WMD in lipids was based on a subsample of the studies reporting weight loss. Thus, it was difficult to directly match the weight loss with the changes in lipid in some papers. When possible, the weight loss and lipid data were matched from studies identified in these meta-analyses, and in those cases the range of weight loss and lipid change was examined to address this CQ. In situations where the weight loss and lipid data were not able to be matched, data from those meta-analyses were not used to support the evidence statement or recommendation made by the panel.

The systematic review conducted by Poobalan et al. [58] and the report from Look AHEAD [22] were used to determine if there was a dose-response relationship between the amount of weight loss achieved by lifestyle intervention and the improvement in lipid profile in overweight or obese adults with or without elevated CVD risk. While Poobalan et al. [58] included studies that reported on weight loss from either lifestyle or surgical approaches, the work group was able to identify the lifestyle studies on scatterplots illustrating the relationship between weight loss and change in lipids. These scatterplots showed a significant positive association between the mean difference in weight change and the change in LDL-C and triglycerides, with no identifiable association with change in HDL-C. However, the Look AHEAD investigators showed a clear dose-response relationship between the amount of weight loss and the increase in HDL-C, with no relationship between weight loss and change in LDL-C [22].

The amount of weight loss resulting in detectable improvements varied by lipid. With regard to triglycerides, Avenell et al. reported that weight losses of approximately 3 to 12 kg compared to control over a period of 12 months reduced fasting plasma triglyceride concentrations by approximately 15 to 50 mg/dL [41, 43]. A similar magnitude of change in triglycerides in response to weight loss was reported by Galani et al. [45]. In a study by Witham and Avenell (2010) of overweight and obese adults ≥60 years of age, a smaller weight loss (1.5 to 2.0 kg) resulting from lifestyle modification was not associated with a significant reduction in triglycerides [64].

With regard to LDL-C, Avenell et al. reported that a weight loss of 5 to 8 kg over 12 months was associated with reductions in LDL-C of approximately 5 to 8 mg/dL [41, 43]; similar findings were reported by Galani et al. [45]. Among overweight and obese adults ≥60 years of age, lifestyle modifications that produced modest average weight loss of 1.5 to 2.0 kg over a period of 12 months compared to control did not change LDL-C [64]. Moreover, among overweight and obese adults with type 2 diabetes aged 45 to 75 years, 8 percent weight loss at 1 year and 5.3 percent weight loss over 4 years did not reduce LDL-C relative to controls. However, this difference in weight loss resulted in less frequent initiation of lipid-lowering medication [23, 49].

Among overweight and obese adults, lifestyle modification that produced weight loss of approximately 3.0 to 12.0 kg compared to control over a period of 12 months resulted in an increase in HDL-C of 2 to 4 mg/dL [41, 43]. However, among overweight and obese adults ≥60 years of age, lifestyle modifications that produced weight loss of only 1.5 to 2.0 kg over a period of 12 months compared to control resulted in no change in HDL-C [64]. Moreover, among overweight and obese adults with type 2 diabetes aged 45 to 75 years, 8 percent weight loss at 1 year and 5.3 percent weight loss over 4 years compared to control increased HDL-C by an additional 2 mg/dL and 1.6 mg/dL, respectively [23].

ES2. Among overweight and obese adults with type 2 diabetes, 8 percent weight loss at 1 year and 5.3 percent weight loss over 4 years compared to usual care control results in greater average increases (2 mg/dL) in HDL-C and greater average reductions in triglycerides.

Strength of evidence: Moderate

Rationale: Look AHEAD enrolled more than 5,000 patients with type 2 diabetes and has achieved a followup rate of 93 and 94 percent in the ILI and DSE groups, respectively. Although only a single study, the number of observations is approximately equal to that obtained in the available systematic reviews and meta-analyses, none of which specifically addressed the effect of weight loss on changes in lipids in overweight and obese adults with type 2 diabetes. The ILI group achieved a weight loss of 8.6 percent of initial body weight at 1 year compared to 0.7 percent in the DSE group, which served as the usual care control in this study [49]. Across 4 years of intervention, the mean weight loss was 6.2 percent in the ILI group versus 0.9 percent in the DSE group [23]. These magnitudes of weight loss resulted in HDL-C increases of 3 mg/dL in the ILI group and 1 mg/dL in the DSE group at 1 year, with a mean increase of 4 mg/dL in the ILI group versus 2 mg/dL in the DSE group when averaged across the 4 years of intervention. Triglycerides decreased by 30 mg/dL in the ILI group and 15 mg/dL in the DSE group at 1 year, with the mean decrease across 4 years being 26 mg/dL in the ILI group versus 20 mg/dL in the DSE group.
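
The between-group contrasts quoted in the evidence statement can be recovered from the group means in this rationale by simple subtraction. The following is a minimal sketch of that arithmetic; the dictionary layout and the function name `net_difference` are illustrative assumptions, and the inputs are the rounded values reported above rather than study-level data.

```python
# Group means as reported in the rationale above (Look AHEAD, rounded values).
# The dictionary layout and function name are illustrative, not from the study.
LOOK_AHEAD = {
    "weight_loss_pct_1y": {"ILI": 8.6, "DSE": 0.7},
    "weight_loss_pct_4y": {"ILI": 6.2, "DSE": 0.9},
    "hdl_increase_mg_dl_1y": {"ILI": 3, "DSE": 1},
    "tg_decrease_mg_dl_1y": {"ILI": 30, "DSE": 15},
    "tg_decrease_mg_dl_4y": {"ILI": 26, "DSE": 20},
}


def net_difference(measure: str) -> float:
    """Return the ILI-minus-DSE difference for one reported measure."""
    groups = LOOK_AHEAD[measure]
    return round(groups["ILI"] - groups["DSE"], 1)


for name in LOOK_AHEAD:
    print(f"{name}: {net_difference(name)}")
```

Under these inputs, the net weight loss is about 8 percent at 1 year and 5.3 percent across 4 years, which matches the figures used in the evidence statement.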

ES3. A mean 5-percent weight loss achieved over 4 years by lifestyle intervention in overweight or obese adults with type 2 diabetes is associated with a reduction in newly prescribed lipid-lowering medications compared with controls.

Strength of evidence: Moderate

Rationale: The Look AHEAD research group reported that the ILI produced significantly greater weight loss at 1 year and across 4 years of intervention compared to DSE. However, the reduction in LDL-C was not significantly different between the ILI and DSE groups at 1 year [49] or, after adjusting for lipid-lowering medication, across 4 years [23]. Yet, the percentage of participants not prescribed lipid-lowering medication at baseline who then initiated lipid-lowering medication across the 4 years of intervention was significantly lower in the ILI group (47.2 percent) than in the DSE group (53.2 percent) [23]. There was no difference between the ILI and DSE groups with regard to the percentage of participants who were prescribed lipid-lowering medication at baseline and continued to be prescribed lipid-lowering medication at 4 years (ILI = 90.9 percent; DSE = 90.4 percent) [23].

ES4. Among overweight and obese adults with type 2 diabetes, there is a dose-response relationship between the amount of weight loss and the increase in HDL-C that is most pronounced in those who are the least overweight at baseline.

Strength of evidence: Low

Rationale: As described above, Look AHEAD enrolled more than 5,000 patients with type 2 diabetes. The Look AHEAD research group reported that there was an interaction between baseline weight and weight change categories for HDL-C in patients with type 2 diabetes, such that the slope (increase in HDL-C as a function of weight loss) was steepest in those who weighed least at baseline [22].

ES5. Compared to placebo, the addition of orlistat to lifestyle intervention in overweight and obese adults results in an average 3 kg greater weight loss together with an 8 to 12 mg/dL reduction in LDL-C, a 1 mg/dL reduction in HDL-C, and variable changes in triglycerides.

Strength of evidence: High

Rationale: Among overweight and obese adults, an intervention that included lifestyle intervention plus orlistat versus placebo produced weight loss of approximately 1 to 4 kg over a period of at least 1 year and resulted in a decrease in LDL-C of approximately 11 mg/dL [41, 51]. This type of intervention over a period of 2 years that produced weight loss of approximately 3 to 4 kg compared to placebo decreased LDL-C by approximately 8 mg/dL [41]. A similar magnitude of change in LDL-C was reported by Rucker et al. [59] when examining studies that were at least 1 year in duration. In patients with type 2 diabetes, Hutton and Fergusson [47] reported that a WMD for weight loss of 2.5 kg was associated with a reduction in LDL-C of approximately 10 mg/dL. Moreover, Norris et al. [51] reported that in patients with type 2 diabetes, an intervention that included orlistat and produced weight loss of approximately 1 to 4 kg compared to control over a period of 12 to 57 weeks resulted in a decrease in LDL-C of approximately 12 mg/dL.

Among overweight and obese adults, an intervention that included lifestyle intervention plus orlistat and produced weight loss of approximately 1 to 4 kg compared to placebo over a period of at least 1 year resulted in a nonsignificant decrease in HDL-C of approximately 1 mg/dL [41]. An intervention that included orlistat and produced weight loss of approximately 3 to 4 kg compared to control over a period of 2 years resulted in a decrease in HDL-C of approximately 1 mg/dL [41]. In patients with type 2 diabetes, Hutton and Fergusson [47] reported that a WMD for weight loss of 2.5 kg was associated with a reduction in HDL-C of approximately 1 mg/dL.

There are variable changes in triglycerides associated with weight loss resulting from an intervention that included orlistat. Among overweight and obese adults, an intervention that included orlistat and produced weight loss of approximately 1 to 4 kg compared to control over a period of at least 1 year resulted in a decrease in triglycerides of approximately 3 mg/dL [41, 51]. An intervention that included orlistat and produced weight loss of approximately 3 to 4 kg compared to placebo over a period of 2 years resulted in a decrease in triglycerides of approximately 4 mg/dL [41]. A similar magnitude of change in triglycerides was reported by Rucker et al. [59] when examining studies that were at least 1 year in duration. In patients with type 2 diabetes, Hutton and Fergusson [47] reported that a WMD for weight loss of 2.5 kg was associated with a reduction in triglycerides of approximately 17 mg/dL. Moreover, in patients with type 2 diabetes, an intervention that included orlistat and produced weight loss of approximately 1 to 4 kg compared to control over a period of 12 to 57 weeks resulted in a decrease in triglycerides of approximately 20 mg/dL [51].

The Look AHEAD investigators reported that there was an interaction between baseline weight and weight change categories for HDL-C in patients with type 2 diabetes, such that the slope (increase in HDL-C as a function of weight loss) was steepest in those who weighed least at baseline.

G Weight Loss and Hypertension Risk—Spreadsheets 1.6a-f

Eight systematic reviews and meta-analyses and three reports from the Look AHEAD research group were used to examine the effects on blood pressure outcomes of weight loss achieved by diet or lifestyle interventions, or by weight loss drugs combined with calorie-restricted diets or lifestyle modification, in overweight and obese adults with elevated CVD risk (including diagnosis of hypertension and type 2 diabetes). The literature available to address this question did not specifically examine whether age, sex, gender, ethnicity, BMI, or waist circumference influences the effect on blood pressure of weight loss achieved by alternative, nonsurgical methods. The Look AHEAD trial provides evidence of the effect of weight loss achieved by comprehensive lifestyle intervention on blood pressure medication use at 1 to 4 years of followup in overweight and obese individuals with type 2 diabetes.

ES1. In overweight or obese adults with elevated CVD risk (including type 2 diabetes and hypertension), there is a dose-response relationship between the amount of weight loss achieved for up to 3 years by lifestyle intervention alone or combined with orlistat and the lowering of blood pressure.

  • At a 5-percent weight loss, a weighted mean reduction in systolic and diastolic blood pressure of approximately 3 and 2 mmHg, respectively, is observed.
  • At less than 5-percent weight loss, there are more modest and more variable reductions in blood pressure.

Strength of evidence: High

Rationale: Eight systematic reviews and meta-analyses [40-42, 46, 48, 50, 51, 53, 54, 59, 61] and the Look AHEAD study [22, 23, 49] provided evidence on the effect of weight loss achieved by diet, physical activity, and orlistat combined with energy-restricted diets on systolic and diastolic blood pressure levels in overweight and obese adults with elevated CVD risk, including individuals with type 2 diabetes and hypertension. Three of the reports [40-42] formally modeled the linear relationships between weight loss achieved by lifestyle or orlistat and blood pressure outcomes in overweight and obese adults with elevated CVD risk. The studies reviewed in the systematic reviews and meta-analyses varied considerably in research design, including study subject characteristics and quality ratings; nonetheless, the focused nature of the reviews allowed conclusions regarding weight loss effects on blood pressure in subjects with elevated CVD risk, including the presence of diagnosed hypertension or type 2 diabetes. Further distinctions on the relative effectiveness of weight loss on blood pressures of subjects with specific combinations of risk factors or comorbidities were not feasible from this literature. Examination of weight loss drug trials was limited to those involving orlistat since other weight loss drugs were not in clinical use at the time of this review. Surgical interventions for weight loss are not reviewed here because the effects of bariatric surgery are addressed in CQ5. The authors of the systematic reviews and meta-analyses noted that bias may have been introduced in certain trials due to noncompliance with protocols or loss to followup. Despite this, relatively consistent, modest, and favorable effects on systolic and diastolic blood pressure levels were demonstrated across this literature as a result of weight loss by nonsurgical interventions in overweight and obese adults with elevated CVD risk.

Aucott et al.'s [39] Health Technology Assessment report, which addresses obesity treatments and health improvements, examined 8 trials involving 4,533 overweight and obese adults with high CVD risk; all trials involved orlistat combined with energy-restricted diets (with or without physical activity or other lifestyle behavioral interventions). Weight losses ranged from 1.3 kg to 4.2 kg at 12 to 24 months and resulted in a weighted mean reduction of 2.02 mmHg and 1.64 mmHg in SBP and DBP, respectively. Four trials of lifestyle intervention alone involving more than 550 overweight and obese subjects with elevated CVD risk also demonstrated that weight losses ranging from 2 to 8 kg at 12 to 24 months resulted in a mean 0 to 9 mmHg lowering of SBP and 1 to 12 mmHg reductions in DBP. Formal modeling of the combined orlistat and lifestyle intervention effects suggested linear relationships between weight reduction and blood pressure; a 5-percent change in weight was associated with a decline of 3 mmHg in SBP and 2 mmHg in DBP. Rucker et al. [59] and Padwal et al. [54] reviewed 30 original studies that examined lifestyle intervention alone or drug trials, typically combined with lifestyle intervention, for weight loss in 10,631 overweight and obese adults. WMDs in weight loss at 12-month followup or longer were 1.3 percent to 4.3 percent and resulted in WMDs of 1.5 mmHg in SBP and 1.4 mmHg in DBP. Subgroup analyses in subjects with diabetes suggested that weight loss may be more modest. Johansson et al. [48] examined 12 trials of weight loss drugs combined with lifestyle interventions involving 5,540 overweight and obese subjects with elevated CVD risk; WMD in weight loss of −2.8 kg was achieved at 12 months in nondiabetic and diabetic subjects. In nondiabetic subjects, WMDs in SBP and DBP were 2.2 mmHg and 1.6 mmHg, respectively; blood pressure effects in adults with type 2 diabetes were more modest. Norris et al. [51] examined 8 weight loss trials of orlistat combined with lifestyle intervention involving 2,036 overweight and obese subjects with type 2 diabetes. A subset of four trials with combined weight loss and blood pressure outcomes demonstrated WMD weight losses ranging from about 1 to 4 kg, which resulted in WMDs of 3 mmHg in SBP and 4 mmHg in DBP.

Aucott et al. [40] reviewed 11 trials of orlistat combined with energy-restricted diets in 489 overweight or obese adults with hypertension; at 2 years, WMD in weight (compared to placebo) was about 3 kg and was associated with a 3 mmHg improvement in SBP and a 0 to 2 mmHg improvement in DBP. Formal modeling of the weight loss effects indicated that a 5 kg reduction in weight in overweight or obese adults with hypertension was associated with a 3 mmHg reduction in SBP and a 2 mmHg lowering of DBP. In subgroup analyses of 4 lifestyle interventions for weight loss of up to 5 years duration in 670 overweight and obese adults with elevated CVD risk, higher levels of weight loss (up to 12 kg) were consistently associated with improvements in SBP and DBP. Siebenhofer et al. [61] and Horvath et al. [46] reviewed drug trials for weight loss involving 3,132 adults with hypertension and noted that a 4 kg weight loss is needed to achieve a 2.5 mmHg reduction in SBP and a 2 mmHg reduction in DBP. Horvath et al. [46] also conducted a subgroup analysis of dietary interventions for weight loss of 6- to 36-month duration in 2,219 adults with hypertension. The observed WMDs in body weight of 5 to 6 kg that were associated with a 6 mmHg reduction in SBP and a 3 mmHg reduction in DBP led the authors to conclude that dietary intervention alone for weight loss may be more effective in lowering blood pressures than weight loss drugs combined with energy-restricted diets.

The Look AHEAD investigators [22, 23, 49] examined the 1- to 4-year outcomes associated with comprehensive lifestyle intervention for weight loss in 5,145 overweight and obese adults with type 2 diabetes. At 12 months, the 8-percent mean weight loss (minus controls) was associated with a 4 mmHg reduction in SBP and a 1 mmHg reduction in DBP. At 4 years, a 5-percent weight loss was retained, but blood pressure effects were attenuated (a reduction of 2 mmHg in SBP and 0.4 mmHg in DBP). Norris et al. [51, 52] examined nonpharmacological interventions for weight loss in 4,699 adults with type 2 diabetes and found that WMDs ranging from 2.8 to 4 kg at 1 to 2 years were associated with a 2 mmHg reduction in SBP and no change in DBP.

ES2. A 5-percent mean weight loss difference achieved over 4 years by intensive lifestyle intervention in overweight or obese adults with type 2 diabetes is associated with a lower prevalence of patients who are prescribed antihypertensive medications compared with controls.

Strength of evidence: Moderate

Rationale: The Look AHEAD investigators [22, 23, 49] provided evidence at 1- to 4-year followup in 5,145 overweight and obese adults with type 2 diabetes that comprehensive lifestyle intervention for weight loss results in reduced blood pressure medication use. Fewer adults involved in intensive interventions initiated or continued antihypertensive medications over 1 to 4 years.

Recommendations

The panel decided that the recommendations from CQ1 should follow the recommendations made by CQ2; thus, the following recommendation is numbered “Recommendation 2”. Recommendation 1 is included in the CQ2 section of the report, 3.8.1.

Recommendation 2: Counsel overweight and obese adults with cardiovascular risk factors (high blood pressure, hyperlipidemia, and hyperglycemia) that lifestyle changes that produce even modest, sustained weight loss of 3%-5% produce clinically meaningful health benefits, and greater weight losses produce greater benefits.

  1. Sustained weight loss of 3%-5% is likely to result in clinically meaningful reductions in triglycerides, blood glucose, HbA1C and the risk of developing type 2 diabetes;
  2. greater amounts of weight loss will reduce blood pressure, improve LDL-C and HDL-C and reduce the need for medications to control blood pressure, blood glucose and lipids as well as further reduce triglycerides and blood glucose.

Recommendation Grade: A (strong)

Rationale: The body of evidence was clearly in favor of a dose–response relationship between intentional weight loss and reduction in cardiovascular risk factors. By focusing on outcomes at 1 or more years after the beginning of treatment we were more confident that the reported improvements were related to the reduced weight/body fat, not due to the acute or sub-acute effects of negative energy balance. The amount of weight loss needed to detect a clinically meaningful improvement was not the same for all risk factors. Glycemia-related risk and triglyceridemia were responsive to modest (3-5 percent), sustained weight loss. For those at risk of developing type 2 diabetes, 3-5 percent sustained weight loss reduced the incidence of diabetes. This finding was perhaps the most clinically meaningful. Patients at risk for developing type 2 diabetes can substantially reduce that risk by sustaining a modest weight loss over time. Given the morbidity and cost of treatment for type 2 diabetes, and given that this degree of weight loss is readily achievable with lifestyle treatment, this group seems particularly suited to benefit from participating in comprehensive lifestyle interventions. On average, 3-5 percent weight loss also is associated with clinically meaningful reductions in serum triglyceride concentrations, as well as lowering of fasting glucose and HbA1C in patients with type 2 diabetes. Greater degrees of weight loss result in greater reductions in fasting glucose and HbA1C, despite the need for less anti-diabetic medication, and further lowering of serum triglyceride concentrations. On average, clinically meaningful reductions in systolic and diastolic blood pressure, lowering of LDL-C and increases in HDL-C are seen in those who sustain weight losses of ≥5 percent of body weight. These improvements are greater in those who achieve greater amounts of weight loss via lifestyle interventions, despite the need for less medication to treat hyperlipidemia and hypertension. As the panel was completing its work it was announced that the Look AHEAD trial would be discontinued because, after up to 11 years of lifestyle intervention, it was judged that the likelihood of detecting a significant difference in cardiovascular mortality between the lifestyle and control groups was too low. Both the intensive lifestyle group and the diabetes support and education group had many fewer cardiovascular events than had been previously reported in populations with type 2 diabetes, possibly due to the aggressive pharmacotherapy for known cardiovascular risk factors. The group treated with intensive lifestyle intervention required fewer medications, had less sleep apnea, better quality of life and greater physical mobility. Our interpretation of this announcement is that pharmacotherapy, together with diabetes support and education interventions, is equal to comprehensive lifestyle interventions in reducing cardiovascular events. The data indicate that ILI reduces medication need and improves quality of life compared to a control intervention.

H Gaps in Evidence and Future Research Needs

The literature available to answer CQ1 did not specifically address whether age, sex, race, or baseline BMI/waist circumference modify the beneficial effects of weight loss with regard to cardiovascular risk factors. Likewise, the systematic reviews and meta-analyses did not specifically address the issue of how baseline comorbid conditions and cardiovascular risk factors modify the response to weight loss. Thus, although the work group was able to address parts “b” and “c,” they could not address all of part “a.” Because only systematic reviews and meta-analyses and the Look AHEAD data were used, however, it is possible that there is high-quality literature that does address these issues. Given that caveat, future research in this area should address the following questions:

  1. Do the observed improvements in cardiovascular risk factors, need for medication, and quality of life associated with weight loss differ by age, sex, race, and BMI/waist circumference?
  2. What is the cost effectiveness of modest weight loss as a preventive strategy for those at risk for developing type 2 diabetes?
  3. What is the best approach to identify and engage those who can benefit from weight loss?

Section 4: Critical Question 2

A Statement of the Question

CQ2 has three parts:

  1. Are the current cutpoint values for overweight (body mass index (BMI) 25.0 to 29.9) and obesity (BMI ≥30) compared with BMI 18.5 to 24.9 associated with elevated cardiovascular disease (CVD)-related risk (defined below)? Are the waist circumference cutpoints of >102 cm (M) and >88 cm (F) associated with elevated CVD-related risk (defined below)? How do these cutpoints compare with other cutpoints in terms of elevated CVD-related risk and overall mortality?

    1. Fatal and nonfatal CHD, stroke, and CVD (CHD and stroke)
    2. Overall mortality
    3. Incident type 2 diabetes
    4. Incident dyslipidemia
    5. Incident hypertension
  2. Are differences across population subgroups in the relationships of BMI and waist circumference cutpoints with CVD, its risk factors, and overall mortality sufficiently large to warrant different cutpoints? If so, what should they be?

    1. Fatal and nonfatal CHD, stroke, and CVD
    2. Overall mortality
    3. Incident type 2 diabetes
    4. Incident dyslipidemia
    5. Incident hypertension

    Groups being considered include:

    1. Age
    2. Sex (both male and female)
    3. Race/ethnicity (African American, Hispanic, Native American, Asian, non-Hispanic White/Caucasian)

  3. What are the associations between maintaining weight and weight gain with elevated CVD-related risk in normal weight, overweight, and obese adults?

Subgroup Analyses

  • By Population Subgroups

    • Age
    • Sex
    • Socioeconomic status (no evidence anticipated)
    • Race/ethnicity
    • BMI cutpoints (overweight (BMI 25.0 to 29.9) vs. obese (BMI ≥30.0) vs. normal (BMI 18.5 to 24.9) or whatever the evidence dictates)
    • Waist circumference cutpoints
  • By CVD Risk Factors

    • Fatal and nonfatal CHD, stroke, CVD
    • Overall mortality
    • Incident dyslipidemia
    • Elevated blood pressure, hypertension
    • Incident cases of type 2 diabetes
  • By Amount of Weight Loss

    • Different cutpoints
  • By Weight Loss Maintenance

    • Different cutpoints
  • Modifiers To Take into Account

    • Smoking status (as an effect modifier only)
    • Diminished cardiorespiratory fitness (as an effect modifier only)
    • Depression (as an effect modifier only)
    • Metabolic syndrome (as a mediator)

B Selection of the Inclusion/Exclusion Criteria

Panel members identified inclusion/exclusion (I/E) criteria in 10 categories for CQ2 (see Table 3). The criteria included the population, intervention/exposure, comparison group, outcome, time, and setting (PICOTS) criteria as the first six as well as several others related to study design, type of publication, and timeframe for publication:

  • Population
  • Intervention
  • Comparator
  • Outcomes
  • Time
  • Setting
  • Study Design
  • Language
  • Publication Type
  • Publication Timeframe
Table 3. Criteria for Selection of Publications for CQ2

For each of these criteria, the panel members developed detailed specifications related to each component. The population of interest for CQ2 is American adults. For this CQ, intervention studies were not included.

C Introduction and Rationale for Question and Inclusion/Exclusion Criteria

Overall, CQ2 evaluates the utility of two well-established measures in obesity—BMI and waist circumference. Specifically, CQ2 addresses the CVD health risks associated with overweight (BMI 25.0 to 29.9 kg/m2) and obesity (BMI ≥30 kg/m2) defined by the current cutpoints. These cutpoints were established in the 1998 overweight and obesity clinical guidelines [5] and have been widely established as the standard in clinical practice and research settings. As a result, the classification for BMI has been broadly applied across the population. CQ2 also seeks to determine if the current cutpoints defining persons as overweight and obese are equally appropriate for key subgroups within the U.S. population. Lastly, CQ2 attempts to address the issue of elevated waist circumference, as defined by current cutpoints, and its association with CVD health risks. Waist circumference cutpoints of >102 cm (>40 in.) for men and >88 cm (>35 in.) for women were recommended in the 1998 overweight and obesity clinical guidelines to identify “increased risk in most adults with a BMI of 25 to 34.9 kg/m2.” A 2008 report of a WHO Expert Consultation [66] identified these waist circumference cutpoints as associated with “substantially increased” risk while cutpoints of >94 cm in men and >80 cm in women were identified as associated with “increased” risk. Other alternative waist circumference cutpoints include those by the International Diabetes Federation: ≥94 cm for men and ≥80 cm for women (Europids), ≥90 cm for men and ≥80 cm for women (South Asians and Chinese), and ≥85 cm for men and ≥90 cm for women (Japanese). The panel searched for evidence on all of these cutpoints as they relate to elevated CVD risk.
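
To make the cutpoints above concrete, the following is a minimal sketch of how a measured BMI and waist circumference map onto the categories named in this section. It is an illustration only, not a clinical tool: the function names, the sex codes "M" and "F", the label for values at or below the lower waist cutpoint, and the handling of BMI values falling between the stated ranges (e.g., 24.95) are assumptions. Only the 1998 guideline BMI cutpoints and the WHO Expert Consultation waist cutpoints quoted above are encoded.

```python
# Waist circumference cutpoints in cm as quoted above (WHO Expert Consultation).
# A value must be strictly greater than the cutpoint to fall in the category.
WAIST_CUTPOINTS_CM = {
    "M": {"increased": 94.0, "substantially increased": 102.0},
    "F": {"increased": 80.0, "substantially increased": 88.0},
}


def bmi_category(bmi: float) -> str:
    """Classify BMI (kg/m2) using the current cutpoints.

    Assumption: range boundaries are treated as < 25.0 and < 30.0.
    """
    if bmi < 18.5:
        return "below 18.5 (not examined in CQ2)"
    if bmi < 25.0:
        return "normal"
    if bmi < 30.0:
        return "overweight"
    return "obese"


def waist_risk(waist_cm: float, sex: str) -> str:
    """Return the risk label attached to a waist circumference."""
    cutpoints = WAIST_CUTPOINTS_CM[sex]
    if waist_cm > cutpoints["substantially increased"]:
        return "substantially increased"
    if waist_cm > cutpoints["increased"]:
        return "increased"
    return "not elevated by these cutpoints"  # hypothetical label


print(bmi_category(27.45), "/", waist_risk(103.0, "M"))
```

The ethnicity-specific International Diabetes Federation cutpoints could be added as further entries in the same table; they are left out here to keep the sketch short.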

The utility of BMI and waist circumference cutpoints is of interest because it is important for PCPs to be able to understand how to use easily obtained measures that serve as surrogates of body fatness (BMI) and the distribution of that body fat (waist circumference) in decision making. It is important to know whom to identify as a potential candidate for weight reduction therapy or further evaluation of other CVD risk factors. Ultimately, the goal is to produce the basis for clear guidance for the practitioner to efficiently advise those patients likely to be at high risk and thus most likely to benefit from a weight control intervention. Note that the panel did not review the literature evaluating the diagnostic performance of BMI compared to more valid measures of percent body fat (e.g., dual energy x-ray absorptiometry); these are not as simple or inexpensive to use as BMI in clinical settings.

D Methods for Critical Question 2

The Obesity Expert Panel formed work groups for each of its five CQs. For CQ2, the work group was chaired by an epidemiologist and included physicians and researchers representing universities, NHLBI, and the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK).

CQ2 addresses the association of BMI and waist circumference with CVD events and CVD-related outcomes such as mortality, type 2 diabetes, dyslipidemia, and hypertension. Other indices, such as waist-to-hip ratio, waist-to-height ratio, or sagittal abdominal diameter, were not examined here. The methodology team assisted by applying the PICOTS criteria and also worked with CQ2 panel members to develop and refine the detailed I/E criteria. Due to resource limitations, the CQ2 work group limited its literature search and evidence review for CQ2 to systematic reviews, meta-analyses, and pooled analyses to reduce the number of individual articles to be searched, reviewed, and quality rated. The evidence review was limited to studies that were published between the years 2000 and 2011. Thus, the evidence statements and rationales for this CQ reflect the status of the literature as of the dates of the search, but some conclusions may change or be refined as new data become available. Panel members excluded studies that focused on specific subpopulations with a disease or condition (e.g., women with breast cancer, adults on maintenance hemodialysis) and constructed spreadsheets from the identified articles. Then the methodology team reviewed and checked the spreadsheets for accuracy.

The literature search for CQ2 included an electronic search of the Central Repository for systematic reviews and meta-analyses published in the literature from January 2000 to October 2011. The Central Repository contains citations pulled from seven literature databases: PubMed, CINAHL, EMBASE, PsycINFO, EBM, Biological Abstracts, and Wilson Social Sciences Abstracts. The literature search produced 1,566 citations, with 5 additional citations identified from nonsearch sources; i.e., by the panel members. Three of the five citations met the criteria and were eligible for inclusion in the CQ2 evidence base [67-69]. In contrast, the other two citations did not meet the criteria and were excluded from the CQ2 evidence base [70, 71]. Figure 4 outlines the flow of information from the literature search through the various steps used in the systematic review process.

Figure 4. PRISMA Diagram Showing Selection of Articles for Critical Question 2

Two reviewers independently screened the titles and abstracts of 1,571 publications against the I/E criteria, which resulted in exclusion of 1,089 publications and retrieval of 482 publications for full-text review to further assess eligibility. Next, two reviewers independently screened the 482 full-text publications and assessed eligibility by applying the I/E criteria; 467 of these publications were excluded based on one or more of the I/E criteria (see specified rationale as noted in Figure 4).

Fifteen of the 482 full-text publications met the criteria and were included. The quality (internal validity) of these 15 publications was assessed using the quality assessment tool developed to assess systematic reviews and meta-analyses (see Appendix table A-2). Of these, 12 publications were rated as poor quality [67-69, 72-80]; however, they were used as part of the evidence base since NHLBI policy indicated that poor studies could be used as part of this evidence base if the majority of included studies were not rated good or fair. Rationales for all of the poor-quality ratings are included in Appendix table B-12. Panel members reviewed the final articles on the list, along with their quality ratings, and had the opportunity to raise questions. They appealed some pooled analyses previously deemed to be of poor quality [81, 82], which were subsequently upgraded to fair quality upon closer review by the methodology team, who made the final decision. Three systematic reviews and meta-analyses were ultimately rated good or fair quality [81-83] and included in the evidence base that was used to formulate the evidence statements. Panel members created spreadsheets containing key information from the systematic reviews and meta-analyses; these were cross-checked by the methodology and systematic review teams and formed the basis for panel deliberations.
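
The counts reported for the screening process can be checked against one another with simple arithmetic. The sketch below tallies the flow from the literature search to the final quality ratings using only the numbers stated in this section; the function and key names are illustrative assumptions.

```python
def cq2_screening_flow() -> dict:
    """Tally the CQ2 article flow from the counts stated in the text."""
    from_search = 1566            # citations from the Central Repository search
    from_panel = 5                # citations identified by panel members
    screened = from_search + from_panel
    excluded_title_abstract = 1089
    full_text = screened - excluded_title_abstract
    excluded_full_text = 467
    included = full_text - excluded_full_text
    rated_poor = 12
    rated_good_or_fair = included - rated_poor
    return {
        "screened": screened,
        "full_text": full_text,
        "included": included,
        "rated_good_or_fair": rated_good_or_fair,
    }


for stage, count in cq2_screening_flow().items():
    print(f"{stage}: {count}")
```

Each derived count (1,571 screened, 482 retrieved for full-text review, 15 included, 3 rated good or fair) agrees with the figure given in the text.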

Some papers included in this evidence review examined BMI using the current cutpoints, and therefore, the panel was able to evaluate the performance of those cutpoints for CVD risk prediction. Other included papers evaluated BMI as a continuous variable, and the results were used to support categorical analyses, but these results were not used in isolation to evaluate the current cutpoints. To give some basis of comparison to categorical studies, when continuous BMI was used the panel calculated the risk estimate for a BMI of 27.45 (midpoint of overweight range) and 34.95 (midpoint of obese class I and II ranges) compared to 21.70 (midpoint of normal BMI range) as the reference. To calculate these risk estimates, the panel took the natural log of the risk ratio (RR) or hazard ratio (HR) reported in the study of continuous BMI and divided it by the number of BMI units that it represented to estimate the beta coefficient per 1 unit of BMI. Then the panel multiplied the result by 5.75, which is the distance between the midpoint of normal weight (21.70) and the midpoint of overweight (27.45), to estimate the contrast between overweight and normal weight. Similarly, the panel multiplied the beta coefficient per 1 unit of BMI by 13.25, which is the distance between the midpoint of normal weight (21.70) and the midpoint of obesity class I and II (34.95). Lastly, the panel exponentiated these values to convert them to RRs. For example, Wormser et al. [82] reported an HR of 1.29 per 4.56 units of BMI for incident CHD. The panel calculated risk estimates for overweight and obesity compared to normal weight as follows [84].

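$$\mathrm{RR}_{\text{overweight vs. normal}} = \exp\!\left(\frac{\ln(1.29)}{4.56} \times 5.75\right) \approx 1.38$$

$$\mathrm{RR}_{\text{obesity vs. normal}} = \exp\!\left(\frac{\ln(1.29)}{4.56} \times 13.25\right) \approx 2.10$$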
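The calculation described above can be sketched in Python. This is a minimal illustration rather than the panel's own code: the function and constant names are hypothetical, and the sketch assumes, as the method does, that the log of the risk ratio is linear in BMI. The inputs (HR of 1.29 per 4.56 BMI units, and the three midpoints) are taken from the text.

```python
import math

# Midpoints of the BMI ranges as stated in the text (kg/m2).
NORMAL_MID = 21.70
OVERWEIGHT_MID = 27.45
OBESE_MID = 34.95


def rescale_ratio(ratio, units_reported, units_target):
    """Rescale an RR or HR reported per `units_reported` BMI units to a
    contrast of `units_target` BMI units, assuming log-linearity."""
    beta_per_unit = math.log(ratio) / units_reported  # beta per 1 BMI unit
    return math.exp(beta_per_unit * units_target)


def midpoint_contrasts(ratio, units_reported):
    """Return (overweight, obesity) risk estimates versus normal weight."""
    overweight = rescale_ratio(ratio, units_reported, OVERWEIGHT_MID - NORMAL_MID)
    obesity = rescale_ratio(ratio, units_reported, OBESE_MID - NORMAL_MID)
    return overweight, obesity


# Wormser et al. [82]: HR 1.29 per 4.56 BMI units for incident CHD.
overweight_rr, obesity_rr = midpoint_contrasts(1.29, 4.56)
print(f"overweight vs normal: {overweight_rr:.2f}")
print(f"obesity vs normal: {obesity_rr:.2f}")
```

Under these assumptions the sketch gives approximately 1.38 for overweight and 2.10 for obesity, both relative to the midpoint of the normal weight range.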

If interactions were tested, the results were used to determine effect modification. Plots and figures were studied carefully but interpreted with caution unless the results shown used current cutpoints and included confidence intervals (CIs). BMI below 18.5 kg/m2 was not examined as underweight was not part of CQ2.

Preference was given to estimates from studies that used measured rather than self-reported height and weight. Self-reported body weight is highly correlated with measured body weight, with many studies showing correlations between the two measures of greater than 0.9 [85-88]. Nevertheless, adults tend to underreport their weight, and the amount of underestimation is greater in heavier than in lighter individuals [86, 87, 89]. Also, since height is often overreported, especially in men [90], BMI calculated from self-reported measures can underestimate measured BMI. It has been shown that self-reported and measured BMI values are equally correlated with CVD risk factors in cross-sectional studies [86]. Nevertheless, when the goal is to examine risk at a specified BMI cutpoint, as is the case here, the bias in self-reported data results in misclassification and could be problematic [91]. Some papers in the search included only studies that used measured weights and heights, while other meta-analyses, pooled analyses, or systematic reviews included results from some individual studies that did not. The panel used both self-reported and measured data for evaluation of BMI as a continuous variable; however, when addressing cutpoints, the panel did not use the evidence from a study if it used measured height and weight to calculate BMI in less than 85 percent of studies included in a meta-analysis or less than 85 percent of individuals in a pooled analysis. The criterion value of 85 percent was arbitrary and based only on expert opinion.
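The 85 percent rule can be written as a small eligibility check. The sketch below is illustrative only: the function names are hypothetical, the fraction is assumed to be the share of studies (for a meta-analysis) or of individuals (for a pooled analysis) with measured height and weight, and the 0.60 value is a made-up example rather than a figure from any included study.

```python
# Criterion stated in the text; the panel describes it as arbitrary.
MEASURED_THRESHOLD = 0.85


def usable_for_cutpoints(measured_fraction):
    """True if a study may inform cutpoint questions.

    measured_fraction: share of studies (meta-analysis) or individuals
    (pooled analysis) whose BMI came from measured height and weight.
    """
    return measured_fraction >= MEASURED_THRESHOLD


def usable_for_continuous(measured_fraction):
    """Continuous-BMI analyses accepted self-reported and measured data."""
    return True


# McGee et al. [77] is described in the text as 100 percent measured.
print(usable_for_cutpoints(1.00))   # meets the criterion
# Hypothetical pooled analysis with 60 percent measured BMI.
print(usable_for_cutpoints(0.60))   # does not meet the criterion
print(usable_for_continuous(0.60))  # still usable for continuous BMI
```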

Crude, unadjusted results and results adjusted for mediators of the effect of BMI or waist circumference on outcomes were not used. Thus, the panel did not use the included study by Guh et al. [74] because that study only presented unadjusted risk estimates. This review included studies that adjusted their analysis at least for age; analyses that adjusted for age, gender, and smoking were used when available.

The panel was cautious in the interpretation of ratio estimates of risk (such as HRs or RRs) used to compare obesity-associated risk by age group. It is well known that ratio estimates are inflated in groups in which the incidence of the outcome is rare in the reference group compared to groups in which the incidence is more common in the reference group [92, 93]. Thus, even if the absolute increase in the number of events is higher in older than younger obese adults, ratio estimates of obesity risk can be higher in young adults than older adults if younger adults of normal weight are much less likely to have the event than older adults of normal weight (and therefore, the denominator of the ratio can be much smaller in the younger group). This circumstance applies to CVD, mortality, and type 2 diabetes since older normal weight adults are much more likely to experience events than younger normal weight adults.
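A small numeric illustration may make this point concrete. The event counts below are entirely hypothetical (they are not drawn from any study in this review) and were chosen only to show how a ratio measure and an absolute difference can point in opposite directions across age groups.

```python
# Hypothetical events per 1,000 adults over a fixed period.
# These numbers are illustrative assumptions, NOT data from any study.
HYPOTHETICAL_RISK_PER_1000 = {
    "younger": {"normal_weight": 5, "obese": 15},
    "older": {"normal_weight": 100, "obese": 150},
}


def ratio_and_difference(group):
    """Return (risk ratio, excess events per 1,000) for obese vs normal weight."""
    reference = group["normal_weight"]
    exposed = group["obese"]
    return exposed / reference, exposed - reference


results = {
    age: ratio_and_difference(group)
    for age, group in HYPOTHETICAL_RISK_PER_1000.items()
}

for age, (ratio, excess) in results.items():
    print(f"{age}: risk ratio {ratio:.1f}, excess events per 1,000: {excess}")
```

With these assumed numbers, the younger group has the larger risk ratio (3.0 versus 1.5) even though the older group has five times as many excess events (50 versus 10 per 1,000).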

Because the CQ2 search was limited to systematic reviews, pooled analyses, or meta-analyses of observational studies rather than RCTs, the strength of evidence was not considered to be high. The panel also limited its analysis to studies that were conducted in countries with predominantly Western cultures, limiting conclusions in some race groups.

Areas of insufficient evidence. The panel was not able to address parts of CQ2 due to lack of systematic reviews, meta-analyses, and pooled analyses identified in the systematic search. The panel members were aware of a large body of literature from individual studies examining the associations between BMI or waist circumference and hypertension or dyslipidemia, but they have not been summarized in meta-analyses, pooled analyses, or systematic reviews that met the panel's criteria. In addition, there were no studies in the search that compared alternative cutpoints to current cutpoints as they relate to risk for CHD, stroke, CVD, overall mortality, and diabetes. There were no studies identified that examined current waist circumference cutpoints as they relate to the risk for all outcomes addressed in CQ2; however, the panel examined studies that used waist circumference as a continuous variable. For several outcomes, there were no analyses in retrieved studies that examined current BMI and waist circumference cutpoints stratified by age, gender, and race/ethnicity. Finally, there was a lack of these types of studies examining the associations between maintaining weight and weight gain with elevated CVD risk in normal weight, overweight, and obese adults. For this reason, the panel did not develop evidence statements addressing questions related to these areas.

The methodology team and systematic review team worked closely with panel members to ensure the accuracy of data and the application of systematic, evidence-based methodology.

E Evidence Statements and Summaries

This section describes the evidence statements and their rationale for CQ2a and 2b. Fifteen studies met the final I/E criteria (Spreadsheet 2.1) [67-69, 72-83]. CQ2c could not be addressed in this report due to lack of systematic reviews, pooled analyses, or meta-analyses examining the associations between maintaining weight and weight gain with elevated CVD risk in normal weight, overweight, and obese adults.

E.1 BMI Cutpoints and CVD-Related Risk

E.1.1 a. Fatal and Nonfatal Coronary Heart Disease

Areas of Insufficient Evidence. The available evidence from systematic reviews, pooled analyses, and meta-analyses is not sufficient to adequately address all sections of CQ2 related to the relationship between BMI and fatal and nonfatal CHD. Specifically, the panel was unable to address the adequacy of current BMI cutpoints for overweight and obesity in comparison to alternative cutpoints. In addition, the panel was unable to determine if age-specific or race-specific BMI cutpoints for overweight and obesity are warranted to delineate elevated risk for fatal and nonfatal CHD. Therefore, the panel could not prepare evidence statements addressing these areas.

Current BMI Cutpoints—Spreadsheet 2.2.1a.-b.

ES1. Among overweight and obese adults, analyses of continuous BMI show that the greater the BMI, the higher the risk for fatal CHD and combined fatal and nonfatal CHD. The current cutpoints for overweight (BMI ≥25.0 kg/m2) and obesity (BMI ≥30 kg/m2) compared with normal weight (BMI 18.5 to < 25.0 kg/m2) are associated with elevated risk for combined fatal and nonfatal CHD.

Strength of evidence: Moderate

Rationale: Associations between continuous BMI and combined fatal and nonfatal CHD were studied in two pooled analyses [72, 82] and two meta-analyses (Spreadsheet 2.2.1a) [78, 80]. Pooled analyses using studies from predominantly Western, developed countries were conducted by Bogers et al. in 21 cohorts and the Emerging Risk Factors Collaboration (ERFC) in 58 cohorts. The meta-analyses by Owen et al. included 15 cohorts, and those by Whitlock et al. included 46 cohorts. The pooled studies were adjusted for age, gender, and smoking; the meta-analyses included adjustments for age and gender. The studies showed risk estimates associated with a 5 kg/m2 [72], a 1 standard deviation (SD) (4.56 kg/m2) [82], a 1 kg/m2 or 1 SD (2.5 kg/m2) [78], and a 2 kg/m2 [80] increase in BMI. All four studies indicated an increased risk for incident CHD with increasing BMI. Using continuous analyses from these four studies, the panel calculated risk estimates for the midpoints of overweight (BMI = 27.45) and obesity (BMI = 34.95) compared to the midpoint of the normal weight (BMI = 21.7). Compared to the midpoint of the normal weight range, the calculated risk estimate for the midpoint of overweight ranged from 1.32 to 1.56 while the calculated risk estimate for the midpoint of obesity ranged from 1.88 to 2.77. These estimates were consistent with the estimates from the categorical analyses.

Table Spreadsheet 2.2.1a. Combined Fatal and Nonfatal CHD—Results for BMI

The Prospective Studies Collaboration (PSC) presented pooled analyses on BMI associations with fatal CHD that included 57 studies and adjusted for age, gender, and smoking (Spreadsheet 2.2.1b) [81]. In the PSC study examining BMI as a continuous variable (per 5 kg/m2), the authors found positive associations with fatal CHD. Using continuous analyses from Whitlock et al. [81], the panel calculated risk estimates for the midpoints of overweight (BMI = 27.45) and obesity (BMI = 34.95) compared to the midpoint of the normal weight (BMI = 21.7) (see Methods for Critical Question 2). The calculated risk estimates for overweight and obesity were 1.46 and 2.39, respectively, as compared to the midpoint of the normal BMI range. The PSC also examined the association between BMI and fatal CHD within 2 BMI strata and found that the slope was steeper for BMI above 25 than for BMI below 25.

Table Spreadsheet 2.2.1b. Fatal CHD—Results for BMI

Only a meta-analysis by Whitlock et al. investigated fatal and nonfatal CHD separately and found that the risk estimates for the two outcomes were similar (Spreadsheets 2.2.1a and 2.2.1b) [80]. Using the continuous analyses by Whitlock et al. [80], the panel calculated risk estimates for the midpoints of overweight (BMI = 27.45) and obesity (BMI = 34.95) and compared them to the midpoint of the normal weight (BMI = 21.7) (see Methods for Critical Question 2). The calculated risk estimates were 1.49 for overweight and 2.52 for obesity when compared to normal weight.

Two pooled analyses examined BMI using current cutpoints [77, 82]. The study by Wormser et al. indicated increased risk for combined fatal and nonfatal CHD at BMI levels higher than 25 compared to lower levels. The risk tended to be higher in the obese (BMI ≥30) when compared to the overweight group (BMI 25.0 to 29.9) (Spreadsheet 2.2.1a) [82]. McGee et al. [77] presented a pooled analysis of 26 studies with 100-percent measured BMI that examined BMI associations with fatal CHD (Spreadsheet 2.2.1b). This study adjusted for age and smoking and stratified the analysis by gender. McGee et al. [77] found a small increase in the BMI 25 to 30 category and a larger increase in the BMI ≥30 category when compared to a BMI from 18.5 to 24.9 in both men and women. The corresponding risk estimates were 1.097 (95 percent CI: 1.001 to 1.201) and 1.624 (95 percent CI: 1.459 to 1.806) for women; and 1.159 (95 percent CI: 1.088 to 1.235) and 1.508 (95 percent CI: 1.362 to 1.670) for men [77].

Age-, Gender-, and Race-Specific BMI Cutpoints—Spreadsheets 2.2.1a.-b.

None of the studies included in the search examined current BMI cutpoints in relation to CHD risk stratified by age or race.

ES2. Among overweight and obese adults, analyses of continuous BMI show that the greater the BMI, the higher the risk for fatal CHD and combined fatal and nonfatal CHD in both men and women. The current BMI cutpoints for overweight (BMI ≥25.0 kg/m2) and obesity (BMI ≥30.0 kg/m2) compared with normal weight (BMI 18.5 to <25.0 kg/m2) are associated with elevated risk for fatal CHD in both sexes.

Strength of evidence: Moderate

Rationale: Wormser et al. and Whitlock et al. examined the associations between continuous BMI and combined fatal and nonfatal CHD and fatal CHD, respectively (Spreadsheets 2.2.1a and 2.2.1b) [81, 82]. Both studies found that risk was significantly higher with increasing BMI in both men and women, and BMI-gender interactions were null (P = .643 and P = .2, respectively). In the study by Wormser et al., the HRs per 1 SD (4.56 kg/m2) were 1.24 (95 percent CI: 1.14 to 1.35) in women and 1.26 (95 percent CI: 1.18 to 1.34) in men. In the study by Whitlock et al., the HRs per 5 kg/m2 were 1.35 (95 percent CI: 1.28 to 1.43) in women and 1.42 (95 percent CI: 1.35 to 1.48) in men. Using continuous analyses from these two studies, the panel calculated risk estimates for the midpoints of overweight (BMI = 27.45) and obesity (BMI = 34.95), comparing them to the midpoint of the normal weight (BMI = 21.7) (see section D, Methods for Critical Question 2). Compared to the midpoint of the normal weight range, the calculated risk estimates for the midpoints of overweight and obesity were 1.34 and 1.96 in men and 1.31 and 1.87 in women, respectively, in the study by Wormser et al. The respective calculated risk estimates were 1.50 and 2.53 in men and 1.41 and 2.22 in women in the study by Whitlock et al.
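As a check on the figures in this paragraph, the sex-specific HRs can be rescaled to the midpoint contrasts using the method from section D. The sketch below is an illustration under the same log-linear assumption. The dictionary layout and function name are hypothetical, while the HRs and BMI increments are those quoted above.

```python
import math

# Distances between BMI-range midpoints, as defined in the methods (kg/m2).
GAP_OVERWEIGHT = 27.45 - 21.70
GAP_OBESITY = 34.95 - 21.70

# (study, sex): (reported HR, BMI units the HR refers to), from the text.
REPORTED_HRS = {
    ("Wormser", "women"): (1.24, 4.56),
    ("Wormser", "men"): (1.26, 4.56),
    ("Whitlock", "women"): (1.35, 5.0),
    ("Whitlock", "men"): (1.42, 5.0),
}


def contrast(hr, units, gap):
    """Rescale an HR per `units` of BMI to a contrast of `gap` BMI units."""
    return math.exp(math.log(hr) / units * gap)


estimates = {
    key: (contrast(hr, units, GAP_OVERWEIGHT), contrast(hr, units, GAP_OBESITY))
    for key, (hr, units) in REPORTED_HRS.items()
}

for (study, sex), (overweight, obesity) in estimates.items():
    print(f"{study} {sex}: overweight {overweight:.2f}, obesity {obesity:.2f}")
```

The values this produces agree, to within rounding, with the calculated risk estimates reported in the paragraph.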

McGee et al. [77] was the only study that investigated current BMI cutpoints in relation to fatal CHD stratified by gender (Spreadsheet 2.2.1b). The authors did not formally test gender interactions, but the CIs for the gender-specific risk estimates overlapped, indicating that the estimates were not different from each other. In women, the risk estimates were 1.10 (95 percent CI: 1.00 to 1.20) for overweight and 1.62 (95 percent CI: 1.46 to 1.81) for obesity. The corresponding risk estimates for men were 1.16 (95 percent CI: 1.09 to 1.24) and 1.51 (95 percent CI: 1.36 to 1.67), respectively. Thus, evidence does not indicate a need for gender-specific BMI cutpoints for CHD.
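The overlap comparison can be expressed as a simple interval check. This sketch uses the rounded CIs quoted in this paragraph; the function name is hypothetical. As the paragraph notes, no formal interaction test was performed, and interval overlap is an informal comparison rather than a substitute for one.

```python
def intervals_overlap(ci_a, ci_b):
    """Return True if two (lower, upper) confidence intervals share any value."""
    low_a, high_a = ci_a
    low_b, high_b = ci_b
    return low_a <= high_b and low_b <= high_a


# 95 percent CIs for fatal CHD from McGee et al. [77], as quoted in the text.
MCGEE_CHD_CIS = {
    "overweight": {"women": (1.00, 1.20), "men": (1.09, 1.24)},
    "obesity": {"women": (1.46, 1.81), "men": (1.36, 1.67)},
}

overlaps = {
    category: intervals_overlap(cis["women"], cis["men"])
    for category, cis in MCGEE_CHD_CIS.items()
}

for category, overlap in overlaps.items():
    print(f"{category}: women and men CIs overlap = {overlap}")
```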

E.1.2 b. Fatal and Nonfatal Stroke

Areas of Insufficient Evidence: The available evidence from systematic reviews, pooled analyses, and meta-analyses is not sufficient to adequately address all sections of CQ2 related to the relationship between BMI and fatal and nonfatal stroke. Specifically, the panel was unable to address issues related to changing current BMI cutpoints for overweight and obesity when compared to alternative cutpoints. In addition, the panel was unable to determine if age-, sex-, or race-specific BMI cutpoints for overweight and obesity are warranted to delineate elevated risk for fatal and nonfatal stroke. Therefore, there will not be evidence statements addressing these areas.

Current BMI Cutpoints—Spreadsheets 2.2.2a.-b.

ES3. Among overweight or obese adults, analyses of continuous BMI show that the greater the BMI the higher the risk for fatal stroke overall as well as ischemic and hemorrhagic stroke. The same relationship holds for combined fatal and nonfatal ischemic stroke across the entire BMI range, not just in overweight and obese adults. There is no evidence from meta-analyses, pooled analyses, and systematic reviews to change current BMI cutpoints as they relate to risk for stroke.

Strength of evidence: Moderate

Rationale: One pooled analysis examined associations between BMI and stroke mortality from 57 cohorts: 894,576 adults predominantly from Europe, Israel, United States (including at least one study in Hawaii), and Australia with less than 10 percent from Japan (Spreadsheet 2.2.2b) [81]. The underlying cause of death was obtained from death certificates, and confirmation was sought from other sources (e.g., autopsy findings or medical records) in some but not all studies. There were 6,128 stroke deaths overall, 1,395 deaths from ischemic stroke, and 1,107 deaths from hemorrhagic stroke reported over a mean 13 years of followup; the first 5 years were excluded to limit reverse causality. All but three studies used measured height and weight. This analysis adjusted for study, age at risk, and smoking status. For BMI in the range of 25 to less than 50 kg/m2, each 5 kg/m2 increase in BMI was associated with an increased risk for overall stroke mortality (1.39; 95 percent CI: 1.31 to 1.48) and death from ischemic (1.38; 95 percent CI: 1.23 to 1.56) and hemorrhagic (1.53; 95 percent CI: 1.32 to 1.78) stroke. However, for BMI in the range of 15 to less than 25, there was no increased risk for overall stroke (0.92; 95 percent CI: 0.82 to 1.03) and death from ischemic (0.87; 95 percent CI: 0.68 to 1.10) or hemorrhagic (0.75; 95 percent CI: 0.58 to 1.00) stroke. Findings did not differ for either BMI range when analyses were restricted to participants who had never smoked; however, the magnitude of risk for hemorrhagic stroke was less for BMI 25 to 50 (1.37; 95 percent CI: 1.09 to 1.73) although still significant.

Another pooled analysis [82] examined the relationship of BMI with combined fatal and nonfatal ischemic stroke but did not examine overall stroke or other stroke subtypes (Spreadsheet 2.2.2a). This study used data from 58 studies including 85,169 participants (2,431 cases) in 17 countries, predominantly European, with 4 percent of participants each from Australia and Japan. There were 2,906 ischemic stroke outcomes; 43 of 58 studies reported diagnosis of stroke on the basis of brain imaging and attributed stroke subtype. In ischemic stroke analyses, 21 studies were used with a total of 2,582 outcomes over a median of 5.7 years. Almost all studies used measured height and weight, but the exact number for stroke analyses was not provided. Each ∼5 kg/m2 increase in BMI was associated with an increased risk for fatal and nonfatal ischemic stroke (HR 1.20; 95 percent CI: 1.12 to 1.28) after adjusting for age, sex, and smoking status.

Table Spreadsheet 2.2.2a. Combined Fatal and Nonfatal Stroke—Results for BMI
Table Spreadsheet 2.2.2b. Fatal Stroke—Results for BMI

Both studies examined risk for stroke by BMI in figures. In the 2009 study by Whitlock et al. [81], yearly death rates per 1,000 indicated greater risk for overall stroke mortality among participants with obesity (rates read from graphs were about 1.5 to 3) than overweight and normal weight participants (rates about 1.0 to 1.5), with nonoverlapping CIs (Spreadsheet 2.2.2b). These relationships were similar for ischemic and hemorrhagic stroke. Wormser et al. presented risk for combined fatal and nonfatal ischemic stroke for BMI quintiles but not for other stroke outcomes (Spreadsheet 2.2.2a) [82]. Compared to the lowest BMI quintile (between 20 and 25 kg/m2), a higher risk was observed in the quintiles overlapping the current overweight category (HR ∼1.2 to 1.4) after adjustment for age, sex, and smoking status. The risk for the highest quintile (BMI >30) was higher (HR ∼1.5 to 1.6) than those in the overweight category, but the 95 percent CI overlapped one of the quintiles in the overweight category. Although the findings are not entirely consistent, there is no evidence from meta-analyses, pooled analyses, and systematic reviews to change current BMI cutpoints as they relate to risk for stroke.

Age-, Gender-, and Race-Specific BMI Cutpoints

None of the studies included in the search examined current BMI cutpoints in relation to stroke risk stratified by age, gender, or race.

E.1.3 c. Fatal and Nonfatal CVD

Areas of Insufficient Evidence: The available evidence from systematic reviews, pooled analyses, and meta-analyses is not sufficient to adequately address all sections of CQ2 related to the relationship between BMI and fatal and nonfatal CVD. Specifically, the panel was unable to address issues of the adequacy of current BMI cutpoints for overweight and obesity in comparison to alternative cutpoints. Given the lack of absolute risk estimates, the panel was unable to determine if age-specific BMI cutpoints for overweight and obesity are warranted to delineate elevated risk for fatal and nonfatal CVD.

Current BMI Cutpoints—Spreadsheets 2.2.3a.-b.

ES4. Among overweight and obese adults, analyses of continuous BMI show that the greater the BMI, the higher the risk for combined fatal and nonfatal CVD. The current cutpoint for obesity (BMI ≥30 kg/m2) compared with normal weight (BMI 18.5 to <25.0 kg/m2) is associated with an elevated risk for fatal CVD in men and women.

Strength of evidence: Moderate

Rationale: Only the ERFC pooled analysis gave results for combined fatal and nonfatal CVD (Spreadsheet 2.2.3a) [82]; it analyzed BMI as a continuous variable. The HR associated with a 4.56 kg/m2 or 1 SD increase in BMI was 1.23 (95 percent CI: 1.17 to 1.29). The panel calculated risk estimates for the midpoints of overweight (BMI = 27.45) and obesity (BMI = 34.95) as compared to the midpoint of normal weight (BMI = 21.7) as 1.30 and 1.82, respectively (see Methods for Critical Question 2).

Table Spreadsheet 2.2.3a. Combined Fatal and Nonfatal CVD—Results for BMI

No study showed overall associations between BMI and fatal CVD, but one study showed analyses stratified by age and race in women [67]. Another showed associations stratified by gender (Spreadsheet 2.2.3b) [77]. Abell et al. [67] examined African American and White women <60 and ≥60 years of age (n = 2,843), adjusting their analysis for age and smoking. The RR of obesity compared to normal weight ranged from 1.18 in African American women ≥60 years to 2.49 in White women <60 years. No associations were detected for overweight compared to normal weight [67].

Table Spreadsheet 2.2.3b. Fatal CVD—Results for BMI

The analysis of the McGee study, stratified by gender, is presented above. The RR of obesity compared to normal weight was 1.529 (95 percent CI: 1.381 to 1.692) in women and 1.453 (95 percent CI: 1.327 to 1.590) in men. Overweight was associated with an elevated risk for fatal CVD in men (HR = 1.10; 95 percent CI: 1.03 to 1.16) but not in women [77].

Age-, Gender-, and Race-Specific BMI Cutpoints—Spreadsheet 2.2.3b.

None of the studies included in the panel's search examined current BMI cutpoints in relation to CVD risk stratified by age alone.

ES5. In men but not women, the current BMI cutpoint for overweight (BMI 25.0 to 29.9 kg/m2) compared to normal weight (BMI 18.5 to <25.0 kg/m2) is associated with an elevated risk for fatal CVD. In both men and women, obesity (BMI ≥30.0 kg/m2) compared with normal weight is associated with an elevated risk for fatal CVD.

Strength of evidence: Low

Rationale: The pooled analysis by McGee et al. [77] investigated the effect of current BMI cutpoints on fatal CVD stratified by gender and estimated similar RRs for men and women for overweight and obesity (Spreadsheet 2.2.3b). The RRs were 1.096 (95 percent CI: 1.034 to 1.163) and 1.453 (95 percent CI: 1.327 to 1.590) in men and 1.029 (95 percent CI: 0.948 to 1.116) and 1.529 (95 percent CI: 1.381 to 1.692) in women for overweight and obesity, respectively. However, the RR for overweight among women was not significant. The gender-BMI interaction was not tested. Only one study included in the systematic review by Lenz et al. [83] was relevant to answering this question. It calculated CVD mortality rates (standardized mortality rates [SMRs], standardized to the overall German population) for high BMI levels (36.0 to 39.9 and ≥40.0) for men and women separately (Spreadsheet 2.2.3b) [83]. The SMRs were larger in men than women; however, the levels of BMI investigated are above the current overweight and obesity cutpoints.
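The statement that the RR for overweight among women was not significant follows from its 95 percent CI including 1. A minimal sketch of that reading, using the RRs and CIs quoted above from McGee et al. [77], is shown below. The function name and data layout are hypothetical, and treating "CI excludes 1" as the significance criterion is an assumption about how the intervals are being read.

```python
def excludes_null(lower, upper, null_value=1.0):
    """True if a confidence interval lies entirely above or below the null."""
    return lower > null_value or upper < null_value


# (sex, category): (RR, lower 95 percent CI, upper 95 percent CI), from the text.
MCGEE_FATAL_CVD = {
    ("men", "overweight"): (1.096, 1.034, 1.163),
    ("men", "obesity"): (1.453, 1.327, 1.590),
    ("women", "overweight"): (1.029, 0.948, 1.116),
    ("women", "obesity"): (1.529, 1.381, 1.692),
}

significant = {
    key: excludes_null(lower, upper)
    for key, (_, lower, upper) in MCGEE_FATAL_CVD.items()
}

for (sex, category), flag in significant.items():
    print(f"{sex}, {category}: CI excludes 1 = {flag}")
```

Only the interval for overweight women spans 1, which matches the pattern described in ES5.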

ES6. Using current BMI cutpoints, the relative risk for fatal CVD is higher in obese White than in obese African American women compared to normal weight women. In overweight women, there is no increase in risk for fatal CVD compared to normal weight women in either race group.

Strength of evidence: Low

Rationale: Abell et al. examined the effect of measured BMI on fatal CVD for White and African American women stratified by age (<60 and ≥60 years; Spreadsheet 2.2.3b) [67]. They found that there was no increase in risk in overweight compared to normal weight women of either race. However, the risk associated with obesity was higher in White women than in African American women in both age categories. In women <60 years, risk for fatal CVD associated with obesity was 2.49 (95 percent CI: 1.91 to 3.22) in White women and 1.46 (95 percent CI: 1.07 to 2.01) in African American women. For women ≥60 years, the respective estimates were 1.44 (95 percent CI: 1.25 to 1.65) and 1.18 (95 percent CI: 0.90 to 1.55). Race interactions were not tested in women, and there was no evidence in men.

E.1.4 d. Incident Type 2 Diabetes

Areas of Insufficient Evidence: The available evidence from systematic reviews, pooled analyses, and meta-analyses is not sufficient to adequately address all sections of CQ2 related to the relationship between BMI and diabetes. Specifically, the panel was unable to address issues of the adequacy of current BMI cutpoints for overweight and obesity in comparison to alternative cutpoints. In addition, the panel was unable to determine if age-, gender-, or race/ethnic-specific BMI cutpoints for overweight and obesity are warranted to delineate an elevated risk for type 2 diabetes. Therefore, there will not be evidence statements addressing these areas.

Current BMI Cutpoints—Spreadsheet 2.2.4.

ES7. Analyses of continuous BMI across the entire BMI range show that the greater the BMI, the higher the risk for type 2 diabetes, without an indication of a threshold effect.

Strength of evidence: Moderate

Rationale: Two meta-analyses [75, 79] examined the association between BMI and incident type 2 diabetes (Spreadsheet 2.2.4). Vazquez et al. conducted a meta-analysis of 32 prospective cohort studies (9 from Europe, 12 from the United States, 4 from Asia, and 7 others). The pooled RR for incident type 2 diabetes was 1.92 (95 percent CI: 1.70 to 2.17) per SD of BMI (4.3 kg/m2). In a meta-analysis of 31 prospective cohort studies, Hartemink et al. found a linear association between increasing BMI and type 2 diabetes risk; the pooled RR was 1.19 (95 percent CI: 1.17 to 1.21) per one unit increase in BMI. Neither study examined BMI as a categorical variable.
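Because the two meta-analyses report RRs on different BMI scales (per SD of 4.3 kg/m2 and per 1 kg/m2), a reader may find it helpful to put them on a common scale. The sketch below does this under the assumption that log risk is linear in BMI; it is an illustration only, not an analysis the panel reports, and the function name is hypothetical. The input RRs are those quoted in this paragraph.

```python
import math


def rescale_rr(rr, units_reported, units_target):
    """Rescale an RR per `units_reported` BMI units to `units_target` units,
    assuming a log-linear relationship between BMI and risk."""
    return math.exp(math.log(rr) / units_reported * units_target)


SD_BMI = 4.3  # kg/m2, the SD quoted for Vazquez et al.

# Vazquez et al.: RR 1.92 per SD (4.3 kg/m2), expressed per 1 kg/m2.
vazquez_per_unit = rescale_rr(1.92, SD_BMI, 1.0)

# Hartemink et al.: RR 1.19 per 1 kg/m2, expressed per 4.3 kg/m2.
hartemink_per_sd = rescale_rr(1.19, 1.0, SD_BMI)

print(f"Vazquez per 1 kg/m2: {vazquez_per_unit:.2f}")
print(f"Hartemink per {SD_BMI} kg/m2: {hartemink_per_sd:.2f}")
```

On a per-unit scale the two pooled estimates are roughly 1.16 and 1.19, which suggests they are of broadly similar magnitude once expressed in the same units.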

Table Spreadsheet 2.2.4. Incident Diabetes—Results for BMI

These meta-analyses indicate a linear relationship between BMI and type 2 diabetes, with no indication of any threshold effects. Even within the normal range, higher BMI is associated with increased type 2 diabetes risk compared with lower BMI.

Age-, Gender-, and Race-Specific BMI Cutpoints—Spreadsheet 2.2.4.

None of the studies included in the search examined current BMI cutpoints in relation to type 2 diabetes stratified by age, gender, or race.

E.1.5 e. All-Cause Mortality

Areas of Insufficient Evidence: The available evidence from systematic reviews, pooled analyses, and meta-analyses is not sufficient to adequately address all sections of CQ2 related to the relationship between BMI and all-cause mortality. Specifically, the panel was unable to address issues of the adequacy of existing BMI cutpoints for overweight and obesity to delineate elevated risk for all-cause mortality for adults above the age of 65 or ethnic minority groups. Therefore, there will not be evidence statements addressing the need for age- or race-specific BMI cutpoints for overweight and obesity.

Current BMI Cutpoints—Spreadsheet 2.2.5.

ES8. Among overweight and obese adults, analyses of continuous BMI show that the higher the BMI, the greater the risk for all-cause mortality. The current category for overweight (BMI 25.0 to 29.9 kg/m2) is not associated with elevated risk for all-cause mortality, but a BMI at or above the current cutpoint for obesity (BMI ≥30 kg/m2) is associated with an elevated risk for all-cause mortality compared with normal weight (18.5 to 24.9 kg/m2).

Strength of evidence: Moderate

Rationale: The relationship between continuous BMI and all-cause mortality was examined in two pooled analyses (Spreadsheet 2.2.5) [68, 81]. The pooled analyses included adult populations of 1.5 million [68] and 894,576 individuals [81] and were adjusted for age, sex, and smoking status. Weight was self-reported in all but 1 of the 19 cohorts from the Berrington de Gonzalez et al. [68] pooled analysis. In the PSC analysis, Whitlock et al. [81] examined mortality risk associated with BMI among both men and women with a BMI >25 kg/m2 and found a risk ratio of 1.29 (95 percent CI: 1.27 to 1.32) per 5 kg/m2 increase in BMI. This indicates that, among overweight and obese individuals, the higher the BMI, the greater the risk for all-cause mortality. These findings were confirmed with similar point estimates of risk for higher BMI from the Berrington de Gonzalez et al. [68] study that used primarily self-reported weight. Between a BMI of 15 and 25, BMI was inversely associated with all-cause mortality risk (0.79; 95 percent CI: 0.77 to 0.82). In smoking-stratified models shown in a figure from the paper, mortality was higher in BMI categories over 30 kg/m2 compared to <30 kg/m2, but this association was more consistent in the never smokers than in the current smokers. Results in the smokers may have been confounded.

Table Spreadsheet 2.2.5. Overall Mortality—Results for BMI

The relationship between the categories of overweight and obesity defined by the current BMI cutpoints and all-cause mortality was examined based on data from three studies, including one systematic review [76] and two pooled analyses (Spreadsheet 2.2.5) [69, 77]. Weight was self-reported in 5 of the 13 studies included in the Heiat et al. systematic review. The pooled analysis by McGee et al. included 388,622 adults and was adjusted for age, sex, and smoking status [77].

The European Prospective Investigation into Cancer and Nutrition (EPIC) study by Pischon et al. [69], which included 359,387 individuals, stratified by gender and used age as the underlying time variable; in addition, models were adjusted for smoking. The studies included a predominance of European and American Whites but also included African Americans, Asians, and Hispanics. The BMI category defined as obese (BMI ≥30 kg/m2) was clearly associated with an increased risk for all-cause mortality. Pischon et al. [69] compared mortality risk in men and women using six BMI categories. They found that risk was increased in both genders in two BMI categories (30 to <35 kg/m2 and ≥35 kg/m2) compared to a BMI of 23.5 to 25.0 kg/m2. McGee et al. [77] reported a significant increase in mortality risk in the obese BMI category (≥30 kg/m2) compared to the normal weight category in both men (1.201; 95 percent CI: 1.119 to 1.289) and women (1.275; 95 percent CI: 1.183 to 1.373). Taken together, this evidence supports the conclusion that a BMI at or above the current cutpoint for obesity (BMI ≥30 kg/m2), compared with a BMI of 18.5 to <25.0 kg/m2 (i.e., normal weight), is associated with an elevated risk for all-cause mortality.

While the association between obesity and increased risk for all-cause mortality is well supported by the available evidence, there was no clear association between being in the overweight category (BMI 25.0 to 29.9 kg/m2) and an increased risk for all-cause mortality. Pischon et al. did not find an increased risk in three categories in the overweight range (25 to <26.5, 26.5 to <28, 28 to <30) compared to a BMI of 23.5 to 25.0 kg/m2. Using current cutpoints for overweight and normal weight, McGee et al. [77] did not find that overweight was associated with an increased risk for all-cause mortality in men and women separately. McGee et al. [77] found the lowest risk for all-cause mortality to be a BMI between 25.0 and 29.9 in women and between 18.5 and 29.9 in men. The findings from Heiat et al. [76], which included a significant proportion of self-reported weights, were consistent with these results, showing no effect of overweight on mortality in a study population limited to adults who were age 65 and older. The EPIC study [69] indicated that a BMI between 23.5 and 28.0 tended to be associated with lowest risk for mortality in women and men. In the study by Whitlock et al. [81], the lowest mortality appeared to be in categories between 20.0 and 27.5 in never smokers, thus having some overlap between the normal weight and overweight BMI categories.

The normal weight range (BMI 18.5 to 24.9 kg/m2) appears to be a transition zone where the risk for all-cause mortality associated with BMI reaches a nadir. However, the nadir is part of a J-shaped relationship between BMI and all-cause mortality, where all-cause mortality risk rises as BMI decreases or increases beyond this point. As noted above, the lowest risk for all-cause mortality was often inclusive of the normal weight category but not completely so in all studies reviewed. In the PSC [81], the point of lowest risk for all-cause mortality appears to be between a BMI of 22.5 and 24.9 kg/m2 for both men and women, according to a figure presented in the paper. Risk for all-cause mortality appears to rise in a linear fashion as BMI decreases below 22.5 kg/m2. Conversely, there is a fairly linear increase in all-cause mortality risk as BMI moves above 25 kg/m2. Therefore, the normal weight category appears to consistently be associated with the lowest risk for all-cause mortality but begins a transition to increased risk that goes through the overweight category. The increase in risk that occurs in the overweight category relative to the normal weight category does not appear to be a significant increase. However, once BMI reaches the obese category, the increased risk is significant and consistent for this category. Although there is some evidence that significant elevation in mortality risk is not observed until a BMI of 27.5, there is a lack of compelling evidence from this set of meta-analyses, pooled analyses, and systematic reviews to recommend new BMI cutpoints for normal weight, overweight, and obesity.

Age-, Gender-, and Race-Specific BMI Cutpoints—Spreadsheet 2.2.5.

None of the studies included in the panel's search examined BMI cutpoints in relation to all-cause mortality risk stratified by age or race.

ES9. Sex-specific analyses of continuous BMI among overweight and obese men and women show that the greater the BMI, the higher the risk for all-cause mortality. The risk for all-cause mortality associated with the current cutpoints of obesity is similar for men and women.

Strength of evidence: Moderate

Rationale: Whitlock et al. [81] examined mortality risk associated with BMI as a continuous variable among men and women with a BMI of 25 to 50 kg/m2 and found RRs of 1.32 (95 percent CI: 1.29, 1.36) in men and 1.26 (95 percent CI: 1.23, 1.30) in women per 5 BMI units (Spreadsheet 2.2.5). Between a BMI of 15 and 25, BMI was inversely associated with risk in both genders, and risk estimates were similar (0.79; 95 percent CI: 0.76 to 0.82 in men and 0.80; 95 percent CI: 0.75 to 0.85 in women). The study by Berrington de Gonzalez et al. [68], which used self-reported weights, reported findings for men and women similar to those seen in the Whitlock study, which used primarily measured weights (HR = 1.28; 95 percent CI: 1.26 to 1.31 for women and HR = 1.36; 95 percent CI: 1.32 to 1.40 for men for a BMI between 25 and 50) (Spreadsheet 2.2.5).

Pischon et al. [69] compared mortality risk in men and women using six BMI categories (Spreadsheet 2.2.5). They found that risk was increased in both genders in two BMI categories (30 to <35 kg/m2 and ≥35 kg/m2) compared to a BMI of 23.5 to 25.0 kg/m2. The authors did not formally test the gender-BMI interaction; the CIs overlapped. Similarly, CIs overlapped in the study by McGee et al. [77], who reported a significant increase in mortality risk in the obese BMI category (≥30 kg/m2) compared to the normal weight category in both men (1.201; 95 percent CI: 1.119 to 1.289) and women (1.275; 95 percent CI: 1.183 to 1.373) (Spreadsheet 2.2.5). These studies indicate that there is no need for sex-specific BMI cutpoints based on the association with mortality risk.
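The informal comparison described above, checking whether sex-specific CIs overlap, can be illustrated with a minimal sketch. The estimates are the McGee et al. [77] values quoted in the text; the `Estimate` type and `intervals_overlap` function are hypothetical names introduced here for illustration. Overlapping intervals are only a rough heuristic and are not a substitute for a formal interaction test, which the text notes was not performed.

```python
from typing import NamedTuple


class Estimate(NamedTuple):
    """A relative risk estimate with its 95 percent CI (hypothetical helper type)."""
    point: float
    lower: float
    upper: float


def intervals_overlap(a: Estimate, b: Estimate) -> bool:
    """Return True when the two confidence intervals share at least one value."""
    return a.lower <= b.upper and b.lower <= a.upper


# Obese vs. normal weight category, as quoted from McGee et al. [77].
men = Estimate(point=1.201, lower=1.119, upper=1.289)
women = Estimate(point=1.275, lower=1.183, upper=1.373)

print(intervals_overlap(men, women))
```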

E.2 Waist Circumference Cutpoints and CVD-Related Risk

Areas of Insufficient Evidence: The available evidence from systematic reviews, pooled analyses, and meta-analyses is not sufficient to adequately address the relationship between current waist circumference cutpoints and any of the outcomes in CQ2. Specifically, the panel was unable to address issues of the adequacy of current waist circumference cutpoints for overweight and obesity in comparison to alternative cutpoints. The panel was also unable to determine if age-, gender-, or race-specific waist circumference cutpoints for overweight and obesity are warranted to delineate elevated risk for all outcomes examined in CQ2. However, evidence from systematic reviews, pooled analyses, and meta-analyses did address the relationship between continuous waist circumference and several CQ2 outcomes. Because the panel was unable to address the adequacy of current waist circumference cutpoints for overweight and obesity in comparison to alternative cutpoints, the choice of cutpoints to apply in patient evaluation is somewhat arbitrary. The panel acknowledges that the absence of evidence from the available systematic reviews, meta-analyses, and pooled analyses for waist circumference cutpoints does not mean that waist circumference does not provide useful information. This evidence was summarized by the panel but not linked to any evidence statements per se; it did not directly address the questions in CQ2.

E.2.1 a. Fatal and Nonfatal CHD

Rationale: One pooled analysis [82] investigated the effect of continuous waist circumference on combined fatal and nonfatal CHD overall and stratified by age, sex, and race/ethnicity (Spreadsheet 2.3.1). This study estimated a significant increase in CHD risk per 1 SD (12.6 cm) increase in waist circumference (HR = 1.31, 95 percent CI: 1.24 to 1.37) in unstratified analyses. This result was not adjusted for BMI. The authors presented HRs for continuous waist circumference associations with combined fatal and nonfatal CHD in three age groups (40 to 59, 60 to 69, and ≥70 years); risk estimates declined with age [82]. The HRs in the three age groups were 1.50 (95 percent CI: 1.37 to 1.63); 1.28 (95 percent CI: 1.20 to 1.37); and 1.13 (95 percent CI: 1.06 to 1.21), respectively; the interaction was significant (P < .0001). The HRs among men and women were 1.24 (95 percent CI: 1.17 to 1.32) and 1.31 (95 percent CI: 1.21 to 1.43), respectively, with a borderline significant (P = .056) gender-waist circumference interaction. Risk was significantly elevated in both Whites and non-Whites with HRs of 1.35 (95 percent CI: 1.27 to 1.44) and 1.33 (95 percent CI: 1.17 to 1.51), respectively, per 1 SD (12.6 cm) increase in waist circumference. However, the interaction between waist circumference and race for combined fatal and nonfatal CHD was null but not BMI [82].

Table Spreadsheet 2.3.1. Combined Fatal and Nonfatal CHD—Results for Waist Circumference
image
E.2.2 b. Fatal and Nonfatal Stroke: Spreadsheet 2.3.2

One pooled analysis [82] investigated the association of waist circumference and combined fatal and nonfatal ischemic stroke (Spreadsheet 2.3.2) but did not present data for overall stroke or other stroke subtypes. Each 12.6 cm increase in waist circumference was associated with an increased risk for ischemic stroke (1.25; 95 percent CI: 1.18 to 1.33) after adjusting for age, sex, and smoking status; the risks were increased among both men (1.32; 95 percent CI: 1.22 to 1.42) and women (1.27; 95 percent CI: 1.12 to 1.43). Estimates of risk for combined fatal and nonfatal ischemic stroke for continuous waist circumference (per 12.6 cm) in three age strata (40 to 59, 60 to 69, ≥70 years) were 1.45 (95 percent CI: 1.30 to 1.60), 1.29 (95 percent CI: 1.20 to 1.40), and 1.10 (95 percent CI: 1.03 to 1.18), respectively, with a significant interaction (P = .001) [82].

Table Spreadsheet 2.3.2. Combined Fatal and Nonfatal Stroke—Results for Waist Circumference
image

Wormser et al. presented HRs for continuous waist circumference with combined fatal and nonfatal ischemic stroke stratified by sex (Spreadsheet 2.3.2). Risk estimates for continuous waist circumference (per 12.6 cm) were not different for men (1.33; 95 percent CI: 1.21 to 1.46) as compared to women (1.20; 95 percent CI: 1.05 to 1.37; P = .43). The authors also presented risk estimates for ischemic stroke in waist circumference quintiles in a supplemental figure. Among men, HRs ranged from 1.1 to 1.4 between 90 cm and 100 cm and were about 1.75 at about 110 and 115 cm compared to men with waist circumference of about 80 cm. Among women, HRs ranged from 1.3 to 1.5 between 80 and 95 cm and about 1.75 around 110 cm compared to women with a waist circumference of about 70 cm. These data show a graded relationship for waist circumference with ischemic stroke, but because 95 percent CIs overlap across sex-specific quintiles, no clear cutpoints were indicated.

E.2.3 c. Fatal and Nonfatal CVD: Spreadsheet 2.3.3

Only one meta-analysis of 15 cohort studies examined the effect of continuous waist circumference on combined fatal and nonfatal CVD [73], and it estimated an HR of 1.27 (95 percent CI: 1.20 to 1.33) per 1 SD (12.6 cm) increase in waist circumference, adjusted for age, gender, and smoking (Spreadsheet 2.3.3). The authors also showed RRs for the association between continuous waist circumference and combined fatal and nonfatal CVD for men and women separately [73]. They reported that the RR was 1.02 (95 percent CI: 0.99 to 1.04) in men and 1.05 (95 percent CI: 1.00 to 1.09) in women for a 1 cm increase in waist circumference. A formal interaction test was not presented, but CIs overlapped, indicating a similar slope in the association between men and women.
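The overall estimate above is expressed per 1 SD (12.6 cm), whereas the sex-specific estimates are expressed per 1 cm. As a rough aid to reading the two scales side by side, the sketch below rescales a ratio from one increment to another. It assumes a log-linear relationship between waist circumference and risk across the whole range, which is an assumption made here for illustration and not a calculation reported by the authors; `rescale_ratio` is a hypothetical helper name.

```python
import math


def rescale_ratio(ratio: float, from_units: float, to_units: float) -> float:
    """Rescale a hazard or risk ratio to a different exposure increment.

    Assumes log(ratio) is proportional to the size of the increment
    (a log-linear dose-response), which is an illustrative assumption.
    """
    return math.exp(math.log(ratio) * to_units / from_units)


# HR of 1.27 per 12.6 cm (from the text), re-expressed per 1 cm.
per_cm = rescale_ratio(1.27, from_units=12.6, to_units=1.0)
print(round(per_cm, 3))
```

Under that assumption the overall estimate corresponds to roughly 1.02 per cm, which sits on the same scale as the sex-specific figures quoted above.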

Table Spreadsheet 2.3.3. Combined Fatal and Nonfatal CVD—Results for Waist Circumference
image
E.2.4 d. All-Cause Mortality: Spreadsheet 2.3.4

Pischon et al. [69] present data from the EPIC study, which examined the relationship between waist circumference and all-cause mortality (Spreadsheet 2.3.4). The EPIC study examined waist circumference cutpoints stratified by gender but did not use current cutpoints. In this study, waist circumference (in cm) was analyzed in quintiles, and thus, the authors used lower cutpoints for women (<70.1; 70.1 to <75.6; 75.6 to <81.0; 81.0 to <89.0; and ≥89.0) than for men (<86.0; 86.0 to <91.5; 91.5 to <96.5; 96.5 to <102.7; and ≥102.7). In models that included adjustment for BMI, RR increased consistently with greater waist circumference. At the highest quintile for women, a waist circumference ≥89 cm was associated with an all-cause mortality risk of 1.78 (95 percent CI: 1.56 to 2.04); for men in the highest quintile (waist circumference ≥102.7 cm), risk for all-cause mortality was 2.05 (95 percent CI: 1.80 to 2.33). In both men and women, risk was higher with increasing waist circumference cutpoints, and risk estimates were similar between men and women despite the different cutpoints used.

Table Spreadsheet 2.3.4. Overall Mortality—Results for Waist Circumference
image
image
Table Spreadsheet 2.3.5. Incident Diabetes—Results for Waist Circumference
image
E.2.5 e. Incident Type 2 Diabetes: Spreadsheet 2.3.5

One meta-analysis examined the association between continuous waist circumference and incident type 2 diabetes [79] and found that the RR for type 2 diabetes was 1.87 (95 percent CI: 1.58 to 2.20) per SD of waist circumference (11.6 cm; Spreadsheet 2.3.5). This meta-analysis indicates a linear relationship between waist circumference and type 2 diabetes risk. The authors also reported pooled RRs stratified by age [79]. The pooled RRs of type 2 diabetes per SD of waist circumference (11.6 cm) were 1.6 (95 percent CI: 1.4 to 1.9) and 2.0 (95 percent CI: 1.6 to 2.7) in cohorts with a mean age <50 and ≥50 years, respectively. Although the point estimate appears to be higher for the older age group, the 95 percent CIs of the two groups largely overlap. The pooled RRs for a 1 SD increase in waist circumference (11.6 cm) were 2.3 (95 percent CI: 2.0 to 2.6) in women and 2.9 (95 percent CI: 1.8 to 4.9) in men. Although the RR is larger in men than in women, the CIs overlap, and the study did not formally test for an interaction between sex and waist circumference. The RRs for type 2 diabetes per SD of waist circumference (11.6 cm) were 2.4 (95 percent CI: 1.5 to 4.0) for studies in Asians, 1.9 (95 percent CI: 1.4 to 2.5) for studies in the United States (largely Caucasian participants), and 2.1 (95 percent CI: 1.7 to 2.6) for studies in Europe (largely Caucasian participants). Although the RRs appear to be higher for Asians than for U.S. or European populations, the 95 percent CIs overlap.

F Recommendations

The panel decided that the recommendations from CQ1 should follow the recommendations made by CQ2, thus the following recommendation is numbered “Recommendation 2”. Recommendation 1 is included in the CQ2 section of the report.

Recommendation 1a: Measure height and weight and calculate BMI at annual visits or more frequently.

Recommendation Grade: E (expert opinion)

Rationale: An essential component of office visits is to use routinely measured height and weight to calculate BMI and discuss with patients the disease risks associated with overweight and obesity. In a recent nationally representative survey of primary care physicians, only 49 percent reported recording BMI regularly, and fewer than 50 percent reported always providing guidance on diet, physical activity, or weight control [91]. BMI is a simple tool that uses data already being measured and can be easily calculated using widely available, downloadable programs. BMI also is calculated as part of electronic medical records systems and its use is likely to become widespread as those systems come into use.

Recommendation 1b: Use the current cutpoints for overweight (BMI 25.0 to 29.9 kg/m2) and obesity (BMI ≥30 kg/m2) to identify adults who may be at elevated risk of CVD and the current cutpoints for obesity (BMI ≥30 kg/m2) to identify adults who may be at elevated risk of mortality from all causes.

Recommendation Grade: A (strong)

Rationale: To identify adults 18 years and older who have an elevated risk of developing CVD, the panel recommends the continued use of cutpoints for overweight and obesity that were recommended in the 1998 Obesity Clinical Practice Guidelines [4]. The 1998 Guidelines classified normal weight as 18.5 to 24.9 kg/m2, overweight as 25.0 to 29.9 kg/m2, and obesity as ≥30 kg/m2. The current review found that overweight and obesity, as defined by these cutpoints, are associated with an elevated risk of combined fatal and non-fatal CHD and stroke as well as fatal CVD [64, 69, 70, 74, 75, 77-80]. The panel found few or no SRs, MAs, and pooled studies that explored alternative cutpoints that might be better at predicting elevated CVD risk. Thus, the panel concludes that there is currently no evidence to change the cutpoints for overweight and obesity to identify individuals who may have elevated CVD risk. Further, these cutpoints are used internationally to define overweight and obesity and are well-accepted in both clinical and research settings.

The Panel's review using only SRs, MAs, and pooled studies found that obesity as currently defined (BMI ≥30 kg/m2) is associated with an elevated risk of mortality from all causes compared with a normal weight [65, 66, 73, 74, 80]. The panel found few or no SRs, MAs, and pooled studies that explored alternative cutpoints that might be better at predicting elevated risk of dying from all causes. There was no difference in the association of obesity, as defined by a BMI ≥30, with an elevated mortality risk between sexes, leading the panel to conclude that there is no need for sex-specific cutpoints. For those in the overweight category, an increase in risk of mortality from all causes was not seen in the evidence reviewed. However, as previously noted, the overweight category is associated with increased risk of CVD.

The Panel also suggests that the same cutpoints continue to be used for all age, sex, and race/ethnic subgroups, given that the studies generally included data from various age groups, both sexes, and a variety of countries (predominantly Western but including African Americans, Asians, and Hispanics) and thus, appear to be generally applicable. However, in this review using only SRs, MAs, and pooled studies there was insufficient evidence to evaluate whether different cutpoints based on age, sex, and race/ethnicity were better at predicting elevated CVD risk or all-cause mortality than the current ones.
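As an illustration of Recommendations 1a and 1b, the following is a minimal sketch of calculating BMI from measured weight and height and assigning the 1998 categories retained by the panel. The function names are hypothetical. The text defines the categories to one decimal place (for example, 24.9 and 25.0), so the handling of values that fall between those bounds, and the label used for values below 18.5, are assumptions made here.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index in kg/m2: weight divided by height squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def classify_bmi(value: float) -> str:
    """Assign the BMI category using the cutpoints cited in the text.

    Assumption: boundaries are treated as half-open intervals
    (18.5 to <25.0, 25.0 to <30.0, >=30.0).
    """
    if value < 18.5:
        return "below normal weight range"  # label is an assumption
    if value < 25.0:
        return "normal weight"
    if value < 30.0:
        return "overweight"
    return "obesity"


# Hypothetical patient measurements, for illustration only.
print(classify_bmi(bmi(weight_kg=70.0, height_m=1.75)))
```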

Recommendation 1c: Advise overweight and obese adults that the greater the BMI, the greater the risk of CVD, type 2 diabetes, and all-cause mortality.

Recommendation Grade: A (strong)

Rationale: The evidence among adults 18 years and older from SRs, MAs, and pooled studies consistently showed a continuous relationship between increasing BMI and increasing risk: the greater the BMI, the greater the risk of CVD, diabetes, and all-cause mortality [64].

Recommendation 1d: Measure waist circumference at annual visits or more frequently in overweight and obese adults. Advise adults that the greater the waist circumference, the greater the risk of CVD, type 2 diabetes, and all-cause mortality. The cutpoints currently in common use (from either NIH/NHLBI or WHO/IDF) may continue to be used to identify patients who may be at increased risk until further evidence becomes available.

Recommendation Grade: E (expert opinion)

Rationale: The 1998 Obesity Clinical Practice Guidelines [4] recommended that a waist circumference >102 cm (>40 in.) among men and >88 cm (>35 in.) among women be used to identify “increased risk in most adults with a BMI of 25 to 34.9 kg/m2”. The WHO Expert Consultation [63] concluded that these same waist circumference cutpoints were associated with “substantially increased” risk and recommended using lower cutpoints (>94 cm in men, >80 cm in women) to identify adults at “increased” risk. The same lower cutpoints were also recommended by the International Diabetes Federation to identify Europids with central abdominal obesity, but the Federation suggested that different cutpoints be used among South Asians and Chinese (>90 cm for men, >80 cm for women) and for Japanese (>85 cm for men and >90 cm for women). This search, using only SRs, MAs, and pooled studies, found that there was no evidence on any of the waist circumference cutpoints in categorical analyses as they relate to an elevated risk of CVD, all-cause mortality, and type 2 diabetes in adults. For this reason, the panel did not formulate evidence statements on specific waist circumference cutpoints to identify elevated risk of CVD, diabetes, and all-cause mortality.

However, there is clear evidence supporting the linear, continuous relationship between abdominal adiposity as measured by waist circumference and risk for CVD, type 2 diabetes and all-cause mortality. The SRs, pooled analyses, and MAs reviewed by CQ2 provided evidence on the continuous relationship between increasing waist circumference and increasing risk for CVD, type 2 diabetes and all-cause mortality [70, 76, 78, 80]. This evidence was summarized by the panel but not linked to any evidence statements as it did not directly address the questions about waist circumference cutpoints in CQ2. The panel made this recommendation because of the consistency of the continuous relationship between increasing waist circumference and increased risk of CVD, diabetes, and all-cause mortality.
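The cutpoints named in the rationale for Recommendation 1d can be written down as a small lookup, sketched below. The values are those quoted in the text for the 1998 Guidelines and for the lower WHO "increased" risk threshold; the scheme names and the `exceeds_cutpoint` function are hypothetical. Because the panel found no evidence from SRs, MAs, or pooled studies on any specific cutpoint, a flag produced this way should be read as a prompt for discussion of a continuous risk, not as an evidence-based threshold.

```python
# Waist circumference cutpoints in cm, as quoted in the text.
# The dictionary keys are hypothetical labels chosen for this sketch.
CUTPOINTS_CM = {
    "NHLBI_1998": {"men": 102.0, "women": 88.0},
    "WHO_increased": {"men": 94.0, "women": 80.0},
}


def exceeds_cutpoint(waist_cm: float, sex: str, scheme: str = "NHLBI_1998") -> bool:
    """Return True when waist circumference is strictly above the chosen cutpoint."""
    return waist_cm > CUTPOINTS_CM[scheme][sex]


# A hypothetical measurement of 100 cm in a man is below the 1998 cutpoint
# but above the lower WHO cutpoint.
print(exceeds_cutpoint(100.0, "men"))
print(exceeds_cutpoint(100.0, "men", scheme="WHO_increased"))
```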

G Gaps in Evidence and Future Research Needs

Evidence-based BMI and waist circumference cutpoints are essential for the identification of patients with elevated risk for CVD (including fatal and nonfatal CHD, stroke, and CVD), mortality, incident type 2 diabetes, dyslipidemia, and hypertension. Because the panel's review of the evidence was limited to systematic reviews, meta-analyses, and pooled analyses, this section likewise focuses only on the research gaps in these study types.

The panel's literature review indicated that more research is needed to compare current BMI and waist circumference cutpoints to alternative cutpoints for predicting CVD risk. In particular, studies need to compare simultaneously the predictive value of the NIH [5] and World Health Organization [66] waist circumference cutpoints. Research should clearly explicate the methods and logic for decisionmaking to guide the choice of cutpoints for adiposity-related variables such as BMI and waist circumference. From a practical perspective, assigning risk using categorical classification schemes based on predictor-outcome relationships that are linear without obvious thresholds can be useful for informing decisions about cost/benefit or risk/benefit balance. The panel's current classification schemes for BMI and waist circumference have a supporting evidence base, but additional research is needed to optimize the specificity of these cutpoints for higher risk for CVD. Future research should also examine the independent and combined effects of BMI and waist circumference to determine whether waist circumference adds to the prediction of chronic disease incidence and mortality by BMI and identify BMI levels at which waist circumference is most informative for disease prediction. The combined effect between BMI and waist circumference has been hypothesized to distinctly affect CVD risk, and both might be essential to correctly identify patients at elevated CVD risk.

Studies that use more valid measures of percent body fat may help optimize the use of measures of BMI and waist circumference in clinical settings. Associated research on percent body fat and changes in percent body fat on CVD risk could improve the fundamental understanding of the risk associated with waist circumference and simple-to-use BMI in the overall population and in subgroups. In addition, studies using BMI and waist circumference compared to more valid measures of percent body fat are needed to examine the predictive role of various adiposity measures. Further, the development and validation of new tools that are easy to use in clinical settings and more accurately measure body fat are needed.

The panel found that studies on appropriate cutpoints for BMI and waist are needed that show analyses stratified by age, gender, or race/ethnic groups. Studies that compare associations in different age groups using absolute risk measurements (such as events per persons at risk in a defined timeframe) would be useful, and this work would be facilitated by the development of software to more easily estimate covariate-adjusted absolute risk estimates and CIs for time-to-event analyses. There is a critical lack of studies on race/ethnic differences in Western countries to determine whether different cutpoints for subgroups might be appropriate. In this context, the lack of work is most striking in Asian Americans and Hispanic Americans.

There is an absence of systematic reviews, meta-analyses, and pooled analyses examining the associations between maintaining or gaining weight and risk for CVD, all-cause mortality, diabetes, hypertension, and dyslipidemia among normal weight, overweight, and obese adults. Research on methods to better identify the intentionality of weight change in observational studies would be an important contribution. Studies that test how weight change or maintenance modifies the association of baseline BMI status with the outcomes addressed in CQ2 are also needed. Likewise, studies are needed to examine whether changes in waist circumference over time, as a marker of changes in fat distribution, predict subsequent disease outcomes, independent of weight change. For studies using mortality as an outcome, special attention should be paid to address potential biases due to confounding by smoking and reverse causation by preexisting chronic diseases.

There are a substantial number of published individual studies examining the associations between BMI or waist circumference and hypertension, dyslipidemia, and diabetes, yet no systematic reviews, meta-analyses, or pooled analyses on these topics were identified during the literature search. This absence constitutes a lost opportunity to provide combined estimates and a means of better understanding appropriate BMI and waist circumference cutpoints and their clinical implications. Future research should include efforts to conduct systematic reviews, meta-analyses, and pooled studies to provide broader and more comprehensive evidence on the associations highlighted above as well as the relationship between waist circumference cutpoints and all outcomes examined in CQ2.

In reviewing the evidence, the panel and methodology team identified few well-designed, well-executed systematic reviews and meta-analyses. The majority of studies were rated as poor using the quality rating tool for systematic reviews and meta-analyses (see Appendix table A-2). This indicates a need for more rigorous research that complies with standard criteria for assessing the quality of such studies, including systematically rating the quality of the original individual component studies and applying other established criteria. In addition, improved methods for evaluating the quality of studies based on pooled individual-level data need to be developed. Given the distinct methodology and research approach of pooled analyses, the panel believes that their evaluation may benefit from the development of rating criteria different from those used for meta-analyses or systematic reviews.

Section 5: Critical Question 3

A Statement of the Question

CQ3 has two parts:

  1. In overweight or obese adults, what is the comparative efficacy/effectiveness of diets of differing forms and structures (macronutrient content, carbohydrate and fat quality, nutrient density, amount of energy deficit, dietary pattern) or other dietary weight loss strategies (e.g., meal timing, portion-controlled meal replacements) in achieving or maintaining weight loss?
  2. During weight loss or weight maintenance after weight loss, what are the comparative health benefits or harms of the above diets and other dietary weight loss strategies?

B Selection of the Inclusion/Exclusion Criteria

Panel members developed eligibility criteria, based on a population, intervention/exposure, comparison group, outcome, time, and setting (PICOTS) approach, for screening potential studies for inclusion in the evidence review. Table 4 presents the details of the PICOTS approach for CQ3.

Table 4. Criteria for Selection of Publications for CQ3
image
image
image

C Introduction and Rationale for Question and Inclusion/Exclusion Criteria

Patients are interested in many types of popular diets that are promoted for weight loss and turn to their primary care providers as authoritative sources for information and referral to evidence-based intervention and treatment. Primary care providers play critical roles as advocates of sound preventive weight management in clinical practice. CQ3 asks which types of diet and dietary strategies are helpful and efficacious to achieve weight loss. The rationale for the panel's inclusion and exclusion (I/E) criteria was to find evidence relevant to dietary habits prevalent in the United States and to be sure that the evidence was relevant to a typical dietary intervention prescribed by U.S. practitioners. In addition, to evaluate the dietary component of the intervention per se, only dietary intervention comparisons were allowed; the other components of the intervention (exercise, behavioral tools) had to be held constant across treatment comparisons. Thus, all trials included in the evidence review compared dietary interventions; some were comprehensive, including a diet component along with a physical activity and/or behavioral component. In those studies, the comparators had the same additional components; i.e., treatment groups differed by diet alone.

The panel chose a search strategy that was broad and included descriptors for popular diets and for strategies that employed all conceivable approaches. The search was to identify diets that might be broadly applicable to the overweight and obese U.S. population who are trying to lose weight with dietary approaches. The targeted evidence was around weight loss and included assessments of risk factors and health benefits in the context of weight-reducing diets. The search excluded diets for children as the focus was on reviewing the evidence relevant to weight reduction in the adult overweight and obese U.S. population. When this process began, the 2008 U.S. Department of Health and Human Services Physical Activity Guidelines for Americans [94], based on a comprehensive review of the evidence, had recently been released. Therefore, the panel focused their efforts on other questions that needed that level of review.

D Methods for Critical Question 3

The Obesity Expert Panel formed work groups for each of its five CQs. For CQ3, the work group was chaired by a physician and comprised physicians and investigators representing academic institutions across the United States.

As noted in section 2, Process and Methods Overview, a standardized approach to systematically reviewing the literature was conducted for all CQs. Panel members participated in developing the CQ and its I/E criteria and reviewed the included/excluded papers and their quality ratings. Contractor staff worked closely with panel members to ensure the accuracy of data abstracted into evidence and summary tables and the application of systematic, evidence-based methodology.

The literature search for CQ3 included an electronic search of the Central Repository for randomized controlled trials or controlled clinical trials published in the literature from January 1998 to December 2009. The Central Repository contains citations from seven literature databases: PubMed, CINAHL, EMBASE, PsycINFO, EBM, Biological Abstracts, and Wilson Social Sciences Abstracts. The search produced 1,416 citations, with 6 additional citations identified from nonsearch sources; i.e., by the panel members or hand search of systematic reviews/meta-analyses (obtained through the electronic search). Two of the six citations were published after December 31, 2009. Per NHLBI policy, certain lifestyle and obesity intervention studies published after the closing date could be allowed as exceptions. To qualify, a study had to be an RCT in which each study arm contained at least 100 participants, and it had to be identified by experts knowledgeable about the literature. One of the two citations published after December 2009 met the criteria and was eligible for inclusion in the CQ3 evidence base [95]. In contrast, the other citation did not meet the criteria and was excluded from the CQ3 evidence base [96]. The remaining 4 citations were identified through nonsearch sources (i.e., hand search) by cross-checking the references listed in 28 systematic reviews/meta-analyses. The systematic reviews/meta-analyses were only used for manual searches and were not part of the final evidence base. This manual cross-check was done to ensure that major studies were not missing from the evidence base. As a result of this cross-check, two of six studies were screened and found eligible for inclusion [97, 98]. Subsequently, the quality of the studies was rated as poor. Figure 5 outlines the flow of information from the literature search through various steps used in the systematic review process for CQ3.

Figure 5. PRISMA Diagram Showing Selection of Articles for Critical Question 3

The titles and abstracts of 1,422 publications were independently screened against the I/E criteria by 2 reviewers, resulting in 984 publications being excluded and 438 publications being retrieved for full-text review to further assess eligibility. These 438 full-text publications were independently screened by 2 reviewers, who assessed eligibility by applying the I/E criteria; 361 of these publications were excluded based on 1 or more of the I/E criteria (see specified rationale as noted in Figure 5). Furthermore, the CQ3 work group noted that since the focus of CQ3 is solely on the effect of different dietary approaches to weight loss, other possible interventions could not differ. Therefore, studies were excluded if treatment arms differed in their behavioral approach; i.e., the amount of participant contact and amount or method of prescribed physical activity.

Of the 438 full-text publications, 77 met the criteria and were included. The quality (internal validity) of these 77 publications was assessed using the quality assessment tool developed to assess RCTs (see Appendix table A-1). Of these, 54 publications were excluded because they were rated as poor quality [97-150]; 52 of these were rated poor because of problems with intent-to-treat (ITT) analysis and attrition rates. Rationales for all of the poor-quality ratings are included in Appendix table B-14. The remaining 17 RCTs (23 articles) were rated good or fair quality [95, 151-172] and included in the evidence base that was used to formulate evidence statements. Panel members reviewed the 17 RCTs, along with their quality ratings, and had the opportunity to raise questions. Some trials previously deemed to be of fair or good quality were downgraded to poor quality upon closer review of evidence tables. These trials used completers analyses rather than ITT analysis and had overall attrition rates exceeding 10 percent. If the study reported only an analysis of completers and had attrition at <10 percent, it was allowed in the evidence base. Methodologists worked with the systematic review team to reevaluate these trials and make a final decision. Evidence tables and summary tables consisted only of data from the original publications of eligible RCTs; these tables formed the basis for panel deliberations.
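The screening counts reported above can be tallied step by step. The short sketch below only restates the figures given in the text (the variable names are illustrative and not part of the report) and checks that each stage follows from the one before it.

```python
# Counts as reported in the text for the CQ3 literature search.
search_hits = 1416            # citations from the electronic search
nonsearch_hits = 6            # citations from panel members or hand search

screened = search_hits + nonsearch_hits              # titles/abstracts screened
excluded_at_abstract = 984
full_text = screened - excluded_at_abstract          # retrieved for full-text review

excluded_at_full_text = 361
met_criteria = full_text - excluded_at_full_text     # publications quality-assessed

rated_poor = 54
evidence_base = met_criteria - rated_poor            # good- or fair-quality articles

print(screened, full_text, met_criteria, evidence_base)  # 1422 438 77 23
```

The final figure, 23 articles, corresponds to the 17 RCTs that make up the CQ3 evidence base.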

E Evidence Statements and Summaries

A total of 17 trials (23 articles) satisfied the final inclusion criteria for CQ3 and were rated fair or good quality [95, 145, 151-153, 155-170, 173]. Most trials compared dietary interventions [95, 145, 151, 152, 155, 156, 158, 159, 162, 163, 165-170] and some compared the use of meal replacements or liquid diets [153, 157, 160, 161, 173]. Some of the dietary interventions were comprehensive [95, 145, 151, 152, 155, 156, 158, 159, 162-170], including a diet component along with a physical activity and/or behavioral component. In those studies, the comparators had the same behavioral and physical activity components; treatment groups differed by diet alone. From these trials, three overarching evidence statements may be made regarding counseling to achieve dietary intervention. Additional statements address comparative effectiveness/efficacy of specific dietary approaches. The stated strength of evidence applies to the overall evidence statement, including any bulleted items.

Summary tables 3.1 through 3.9 present summary data on the 17 included studies. Summary table 3.1 provides the dietary interventions that form the basis for overarching evidence statements. Summary tables 3.2 through 3.9 are organized around dietary form, structure, or pattern. Some studies appear in more than one summary table because they address more than one framework of analysis; e.g., macronutrient or dietary composition.

F Overall Dietary Intervention and Composition—Summary Table 3.1

i Creating Reduced Dietary Energy Intake

ES1. To achieve weight loss, an energy deficit is required. The techniques for reducing dietary energy intake include the following:

  • Specification of an energy intake target that is less than that required for energy balance, usually 1,200 to 1,500 kcal/day for women and 1,500 to 1,800 kcal/day for men (kcal levels are usually adjusted for the individual's body weight and physical activity levels);
  • Estimation of individual energy requirements according to expert guidelines [174-176] and prescription of an energy deficit of 500 kcal/day or 750 kcal/day or 30 percent energy deficit; and
  • Ad libitum approaches where a formal energy-deficit target is not prescribed but lower calorie intake is achieved by restriction or elimination of particular food groups or provision of prescribed foods.

Strength of evidence: High

Rationale: A total of 12 trials (18 articles) [95, 110, 145, 151, 152, 155, 156, 158, 159, 162-165, 167-170] provided evidence on dietary intervention and weight loss. Summary table 3.1 summarizes the design characteristics and results of these trials. The diets included a range of macronutrient compositions or patterns. Three were rated good quality; nine were rated fair quality. The 12 studies described in Summary table 3.1 all confirm that to lose weight, a reduction in caloric intake is required. The energy balance equation requires that for weight loss, one must consume less energy than one expends or expend more energy than one consumes. Most weight loss programs reduce dietary intake by lowering energy consumption by 500 to 1,000 kcal per day (3,500 to 7,000 kcal per week) and increasing energy expenditure with moderate levels of physical activity, which will result in a weight loss of 1 to 2 lb./week.
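The arithmetic behind the figures above can be sketched as follows. This is a minimal illustration, assuming the static conversion of roughly 3,500 kcal per pound of weight lost that the quoted numbers imply (3,500 kcal/week for about 1 lb/week); the function name and constant are illustrative, not taken from the report.

```python
# Assumption: ~3,500 kcal of cumulative deficit per pound lost, as implied
# by the text's pairing of 3,500-7,000 kcal/week with 1-2 lb/week.
KCAL_PER_LB = 3500


def projected_weekly_loss_lb(daily_deficit_kcal: float) -> float:
    """Convert a daily energy deficit into a projected weekly loss in pounds."""
    weekly_deficit_kcal = daily_deficit_kcal * 7
    return weekly_deficit_kcal / KCAL_PER_LB


for daily_deficit in (500, 1000):
    print(daily_deficit, daily_deficit * 7, projected_weekly_loss_lb(daily_deficit))
# 500 3500 1.0
# 1000 7000 2.0
```

This projection is a planning estimate only; as the rationale later notes, the observed weight loss trajectory is not linear.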

Several strategies can achieve an energy deficit. First, one can assume that a total daily energy intake of 1,200 to 1,500 kcal for women and 1,500 to 1,800 kcal for men (with levels varying by the individual's body weight) will produce that deficit, and the dieter need only aim for that prescribed intake. That approach was used in some of the reports listed in Summary table 3.1 [95, 163, 167, 170].

The second approach tailors the prescription further, using an equation for daily energy requirements based on sex, weight, and age, such as the Harris Benedict Equation [174], the Mifflin-St Jeor Equation [175], or the formula promoted by the World Health Organization (WHO) [177]. The WHO formula promoted the following process: (1) calculate daily caloric requirements by estimating an individual's energy requirement at rest (total calories) and then adjusting for habitual physical activity and (2) subtract a sufficient number of calories from the daily calorie requirement to achieve the desired caloric deficit and weekly weight loss goal. This technique was used in two studies listed in Summary table 3.1 [145, 168]. In one study [169], resting energy expenditure was measured via indirect calorimetry to calculate an individual's daily energy requirement, which was then adjusted for activity to set the weight loss calorie deficit.
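The two-step process described above (estimate the requirement at rest, adjust for activity, then subtract a deficit) can be sketched as follows. The sketch uses the published Mifflin-St Jeor coefficients, which also take height as an input. The activity factor of 1.4, the example person, and the function names are assumptions for illustration only and are not drawn from the report or the included trials.

```python
def mifflin_st_jeor_kcal(sex: str, weight_kg: float, height_cm: float, age_yr: float) -> float:
    """Resting energy requirement (kcal/day) from the Mifflin-St Jeor equation."""
    base = 10.0 * weight_kg + 6.25 * height_cm - 5.0 * age_yr
    if sex == "male":
        return base + 5.0
    if sex == "female":
        return base - 161.0
    raise ValueError("sex must be 'male' or 'female'")


def target_intake_kcal(resting_kcal, activity_factor, deficit_kcal=None, deficit_fraction=None):
    """Daily intake target after adjusting for activity and applying one deficit rule.

    Exactly one of deficit_kcal (e.g., 500 or 750) or deficit_fraction
    (e.g., 0.30) must be given.
    """
    if (deficit_kcal is None) == (deficit_fraction is None):
        raise ValueError("specify exactly one of deficit_kcal or deficit_fraction")
    requirement = resting_kcal * activity_factor
    if deficit_kcal is not None:
        return requirement - deficit_kcal
    return requirement * (1.0 - deficit_fraction)


# Hypothetical example: 45-year-old woman, 90 kg, 165 cm.
resting = mifflin_st_jeor_kcal("female", 90, 165, 45)
# Assumed activity factor of 1.4 (illustrative value only).
print(round(resting, 2))
print(round(target_intake_kcal(resting, 1.4, deficit_kcal=500), 2))
print(round(target_intake_kcal(resting, 1.4, deficit_fraction=0.30), 2))
```

The fixed and percentage deficits mirror the options listed in ES1 (500 kcal/day, 750 kcal/day, or a 30 percent energy deficit); which rule is used is a choice for the prescriber.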

None of the studies examined by the CQ3 work group directly compared one method of estimating the targeted calorie deficit with another. These studies are all randomized controlled trials and compare a test dietary intervention implemented by highly trained or professional staff with a control diet. All dietary approaches were associated with weight loss when dietary energy deficits were achieved. To maintain dietary compliance, subjects in all trials participated in educational and/or behavioral therapy of varying intensity. In addition, they were to monitor food and calorie intake and physical activity.

Some of the studies examined did not require dieters to achieve a set calorie deficit target; in these trials, however, the approaches incorporated recommendations to avoid specific groups or classes of foods, which led to voluntary reductions in energy intake with resulting weight loss [95, 155, 162, 164-166]. In addition, several studies provided subjects with foods required for the prescribed diet, with either an energy deficit [110, 164] or ad libitum approach [155, 158]. Whether the latter approaches could be applied widely in free-living environments was not tested.

The weight loss trajectory in the studies examined is not linear, and after a few weeks, the pounds lost no longer reflect the targeted energy deficit. This is the effect of both metabolic adaptation and suboptimal dietary adherence [178]. As weight loss occurs, energy requirements decrease out of proportion to the reduction accounted for by the lower body weight. Consequently, targeted energy intake needs to be decreased if continued weight loss is to be achieved.

From the dietary approaches used to create energy deficits detailed in the studies shown in Summary table 3.1, the panel concludes that all can be successful in promoting weight loss. None offers superior short- or long-term success relative to the comparator energy deficit diet. The existing literature, however, does not exhaustively compare all strategies against each other. Most existing randomized trials of fair and good quality compare test diets to an energy-restricted American Heart Association (AHA) Step 1 or 2 diet or the NHLBI Adult Treatment Panel (ATP) III dietary protocols. Each approach that reduced food and calorie intake was associated with weight loss, but none achieved greater benefits when tested against the energy-restricted AHA or ATP III diets when assessing weight, metabolic, or CVD risk factor outcomes.

ii Diets of Differing Forms and Structures (Macronutrient Content, Carbohydrate and Fat Quality, Nutrient Density, Amount of Energy Deficit, Dietary Pattern) or Other Dietary Weight Loss Strategies (e.g., Meal Timing, Portion-Controlled Meal Replacements)

ES2. A variety of dietary approaches can produce weight loss in overweight and obese adults. The following dietary approaches (listed in alphabetical order below) are associated with weight loss when a reduced dietary energy intake is achieved:

  • A diet from the European Association for the Study of Diabetes (EASD) guidelines, which focuses on targeting food groups rather than formal, prescribed energy restriction while still achieving an energy deficit;
  • Higher protein diet (25 percent of total calories from protein, 30 percent of total calories from fat, 45 percent of total calories from carbohydrate) with provision of foods that realized energy deficit;
  • Higher protein Zone®-type diet (5 meals/day, each with 40 percent of total calories from carbohydrate, 30 percent of total calories from protein, 30 percent of total calories from fat) without a formal prescribed energy restriction diet but with realized energy deficit;
  • Lacto-ovo vegetarian-style diet with prescribed energy restriction;
  • Low-calorie diet with prescribed energy restriction;
  • Low-carbohydrate diet (initially less than 20 g/day carbohydrate) without formal prescribed energy restriction but with a realized energy deficit;
  • Low-fat, vegan-style diet (10 to 25 percent of total calories from fat) without prescribed energy restriction but with realized energy deficit;
  • Low-fat diet (20 percent of total calories from fat) without formal prescribed energy restriction but with realized energy deficit;
  • Low-glycemic load diet either with formal prescribed energy restriction or without formal prescribed energy restriction but with realized energy deficit;
  • Lower fat (≤30 percent fat), high-dairy (4 servings/day) diets with or without increased fiber and/or low-glycemic index/load foods (low-glycemic load) with prescribed energy restriction;
  • Macronutrient-targeted diets (15 percent or 25 percent of total calories from protein; 20 percent or 40 percent of total calories from fat; 35 percent, 45 percent, 55 percent, or 65 percent of total calories from carbohydrate) with prescribed energy restriction;
  • Mediterranean-style diet with prescribed energy restriction;
  • Moderate-protein diet (12 percent of total calories from protein, 58 percent of total calories from carbohydrate, 30 percent of total calories from fat) with provision of foods that realized energy deficit;
  • Diet of high-glycemic load or low-glycemic load meals with prescribed energy restriction; and
  • The AHA Step 1 diet (with prescribed energy restriction of 1,500 to 1,800 kcal/day, <30 percent of total calories from fat, <10 percent of total calories from saturated fat).

Strength of evidence: High

Rationale: The 12 studies described in 18 reports [95, 110, 145, 151, 152, 155, 156, 158, 159, 162-165, 167-170] provided evidence about different dietary interventions and weight loss. Summary table 3.1 summarizes the design, characteristics, and results of these trials. The diets included a range of macronutrient compositions or dietary patterns. Three studies were rated good quality, and nine were rated fair quality. The 12 studies in Summary table 3.1 inform evidence statements regarding macronutrient content (fat, carbohydrate, and protein), some dietary patterns, and carbohydrate quality (glycemic index/load). However, adequate numbers of good- or fair-quality studies were not found to make statements about the following dietary weight loss approaches: fat quality; simple or complex carbohydrates; diets of varying nutrient density (including energy density); or alternative levels of energy deficits, meal timing, or meal replacements. Well-executed, large-scale studies (with high subject retention and dietary compliance levels) in overweight and obese free-living individuals of varying age ranges and ethnic diversity, analyzed on an ITT basis, are needed to inform future evidence reviews. Further research is needed on optimal dietary patterns, both for high-risk populations and the general population.

For the different dietary approaches (either with or without comprehensive lifestyle change) that the CQ3 work group evaluated, it is evident that all prescribed diets that achieved an energy deficit were associated with weight loss. There was no apparent superiority of one approach when behavioral components were balanced in the treatment arms.

The availability of such a wide range of options with established efficacy offers health care practitioners many evidence-based strategies to suggest to their patients who are overweight and obese. Notably, these approaches were all found effective only under conditions where multidisciplinary teams of medical, nutrition, and behavioral experts and other highly trained professionals worked intensively with individuals on weight loss management. With a similar level of attention to patient education and counseling, practitioners should expect comparable success (independent of a diet's effect on other aspects of health factors) regardless of the energy-restricted dietary approach or targeted food-based method.

Pattern of Weight Loss Over Time With Dietary Intervention

ES3. With dietary intervention in overweight and obese adults, average weight loss is maximal at 6 months with smaller losses maintained for up to 2 years while treatment and followup taper. Weight loss achieved by dietary techniques aimed at reducing daily energy intake ranges from 4 to 12 kg at a 6-month followup. Thereafter, slow weight regain is observed, with a total weight loss of 4 to 10 kg at 1 year and 3 to 4 kg at 2 years.

Strength of evidence: High

Rationale: The characteristics of the 12 studies [95, 110, 145, 151, 152, 155, 156, 158, 159, 162-165, 167-170] that form the evidence basis for statements on the duration of the dietary intervention are displayed in Summary table 3.1. Of the studies, three were rated good quality while nine were rated fair quality. All 12 studies that were evaluated and displayed in Summary table 3.1 produced maximal weight loss at 6 months following initiation of the intervention, with some weight regain occurring up to 2 years but with some level of weight loss retention achieved from baseline. Of note, the amount of weight loss at these time points was variable, and the interventions often had physical activity components. These studies did not evaluate the mechanisms of weight regain after initial weight loss; both behavioral and biologic factors contribute to weight regain. Over time, dieters can grow fatigued with the dietary prescriptions and find it difficult to maintain interest and commitment. Behavioral factors, metabolic adaptation, and changes in neurohormonal regulation can thwart maintenance of lost weight. Overweight and obese individuals who lose weight can have disproportionately reduced energy requirements (including resting energy expenditure) and increased appetitive signals compared to those of the same age, sex, and weight who have not lost weight. Future studies are needed to identify strategies that prevent or minimize weight regain after successful dieting.

G Low-Fat Approaches—Summary Table 3.2

ES4a. In overweight and obese adults, there is comparable weight loss at 6 to 12 months with instruction to consume a calorie-restricted (500 to 750 kcal deficit per day), lower fat diet (<30 percent of total calories from fat) compared to a higher fat (>40 percent of total calories from fat) diet. Comprehensive programs of lifestyle change were used in all trials. Comparator diets had 40 percent or more of total calories from fat, either with a low-carbohydrate or low-glycemic load diet or one that targets higher fat with either average or low protein.

Strength of evidence: Moderate

ES4b. With moderate weight loss, lower fat, higher carbohydrate diets, compared to higher fat, lower carbohydrate diets, have the following differential effects:

  • Greater reduction in LDL-C;
  • Lesser reduction in serum triglycerides; and
  • Lesser increases in HDL-C.

Strength of evidence: Moderate

ES4c. There is inconsistent evidence regarding blood pressure differences between lower fat, higher carbohydrate diets and higher fat, lower carbohydrate diets.

Strength of evidence: Low

Rationale: Three trials (two good quality and one fair quality) address interventions with low-fat approaches [95, 165, 169]. Summary table 3.2 summarizes these trials. Weight loss and CVD outcomes were reported in all three studies over 6 to 24 months. In addition to diet, all included a behavioral or counseling component. The three studies that examine low-fat diets are displayed in Summary table 3.2. During the 1980s and 1990s, an evidence base emerged on the efficacy of a lower fat diet for chronic disease risk reduction. Various expert guidelines advocated the adoption of such protocols for disease prevention and health promotion [179-182]. Since fat is energy dense with 9 kcal/g compared to protein and carbohydrate with 4 kcal/g, high-fat foods tend to be high calorie. Individuals who eat lower fat diets tend to consume a greater volume and weight of food than those who eat higher fat diets. However, the satiety of lower fat or fat-free foods might wane over time since these foods may not offer the flavor or same hedonic attributes of fat-containing foods. Similarly, higher fat diets may become monotonous, making long-term compliance more difficult.

This review yielded three good- or fair-quality studies [95, 165, 169] comparing the prescription of low-total and low-saturated fat diets to low-glycemic load or low-carbohydrate approaches. A low-fat diet is generally defined as containing 20 to 30 percent of total calories from fat; those levels were used in the studies. The lower carbohydrate approaches used in comparative studies consist of <45 percent of calories from carbohydrates (considerably higher than the very low-carbohydrate approaches). One 6-month study [165] prescribed an ad libitum approach, albeit with restriction of certain food choices, which resulted in an energy restriction of approximately 400 to 500 kcal/day. Two longer studies [95, 169] prescribed a calorie restriction (deficit of 750 kcal/day or 1,200 to 1,500 kcal/day for women and 1,500 to 1,800 kcal/day for men). All studies demonstrate comparable weight loss with lower or higher fat dietary approaches given that other factors (food restrictions, instructions on amount of calorie deficit) are held constant.
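The energy densities cited in the rationale (9 kcal/g for fat, 4 kcal/g for protein and carbohydrate) allow a percent-of-calories prescription to be translated into approximate daily gram amounts. The sketch below is illustrative only; the function name is hypothetical, the 1,500 kcal/day level is one value from the range prescribed in the trials, and the 55/20/25 split is the low-fat composition reported for one included study.

```python
# Energy densities as stated in the text.
KCAL_PER_GRAM = {"fat": 9, "protein": 4, "carbohydrate": 4}


def macro_grams(total_kcal: float, pct_of_kcal: dict) -> dict:
    """Convert percent-of-total-calories targets into grams per day."""
    if abs(sum(pct_of_kcal.values()) - 100) > 1e-9:
        raise ValueError("macronutrient percentages must sum to 100")
    return {
        name: total_kcal * pct / 100 / KCAL_PER_GRAM[name]
        for name, pct in pct_of_kcal.items()
    }


# Illustrative prescription: 1,500 kcal/day at 55% carbohydrate, 20% fat, 25% protein.
low_fat = macro_grams(1500, {"carbohydrate": 55, "fat": 20, "protein": 25})
for name, grams in low_fat.items():
    print(name, round(grams, 1))
```

Because fat carries more than twice the energy per gram, a 20 percent fat target corresponds to far fewer grams of food than an equal share of calories from carbohydrate or protein.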

H Higher Protein (25 to 30 Percent of Energy) Approaches—Summary Table 3.3

ES5a. In overweight and obese adults, recommendations to increase dietary protein (25 percent of total calories) as part of a comprehensive weight loss intervention result in equivalent weight loss as compared with a typical protein diet (15 percent of total calories) when both diets are calorie restricted (500 to 750 kcal/day deficit).

Strength of evidence: High

ES5b. In overweight and obese adults, when compared to typical protein diets (15 percent of total calories), high-protein diets (25 percent of total calories) do not result in more beneficial effects on CVD risk factors in the presence of weight loss and other macronutrient changes.

Strength of evidence: Low

ES5c. Based on studies conducted in settings where all food provided delivers increased protein (25 percent of total calories), either as part of caloric restriction or with ad libitum energy consumption, there is insufficient evidence to inform recommendations for weight loss interventions in free-living overweight or obese individuals.

Rationale: Five RCTs (10 articles) included interventions with higher protein (25 to 30 percent of total calories) approaches [110, 114, 151, 152, 156, 158, 162, 164, 168, 169]. Summary table 3.3 summarizes the design, characteristics, and results of these trials. One trial was rated good quality; four were rated fair quality. In two trials [110, 151, 152, 156, 158, 164], all food was provided. All trials reported outcomes for weight change and CVD risk factors. Duration of followup ranged from 6 months to 2 years.

Physiologic experiments and human diet studies [182, 183] point to dietary protein as promoting satiety with a potential increase in resting energy expenditure. One strategy to improve weight loss and maintenance of lost weight would be to promote satiety and resting energy expenditure while on reduced caloric intake. Thus, some investigators have increased dietary protein from levels typically seen in the American diet (15 percent of total calories as protein) as a pathway to more efficacious dieting strategies for weight loss. In real-world settings, however, prescription of increased protein can be difficult to achieve due to the wide availability of palatable foods and snacks that are low in protein and high in carbohydrates and fat. Therefore, to follow a diet of increased protein consumption, one must simultaneously reduce consumption of fat and carbohydrate, primary elements of many lower cost and convenient foods and snacks.

These studies took two different approaches to testing the effect of increased dietary protein on weight loss: (1) prescribing an energy deficit and specific macronutrient targets with increased protein to 25 percent of total calories [168, 169] or 30 percent of total calories [162] and (2) providing most or all foods that met specified macronutrient targets (25 percent of total calories as protein) [110, 156, 158] or 30 percent [110, 164].

When a prescription was given for both energy deficit and specific macronutrient targets (increasing protein to 25 percent of total calories) as in the POUNDS LOST and SMART studies, the macronutrient targets for protein were not reached. In addition, there was no difference in weight loss between groups assigned to energy-restricted lower or higher protein intake diets; however, both groups successfully achieved weight loss. A study by McAuley et al. [127, 162] tested the Zone® diet popularized by the diet book [184]. Each of five meals per day was required to have 30 percent of calories as protein. This diet was tested against an Atkins-type low-carbohydrate diet and an EASD-endorsed food group diet. The study did not test higher or lower protein levels per se but rather the two diets described in the Zone® diet book and the updated Atkins diet book [185].

CALERIE [110, 164] tested 30 percent dietary energy restriction in a metabolic ward setting and tested changes in glycemic load and protein. There was no significant difference in weight loss between groups assigned to high-glycemic load and 20 percent of total calories as protein versus low-glycemic load and 30 percent of total calories as protein. These were not free-living individuals, and the enforced total calorie restriction (all food provided in a metabolic ward) of 30 percent of estimated requirements, whether assigned to the group with 20 percent or 30 percent of the reduced total calories as protein, was likely the major determinant of weight loss.

A Danish study [151, 152, 156, 158] tested increasing dietary protein and reducing glycemic load using the unique methodology of providing all foods in a university setting supermarket where dietitians instructed participants in selecting foods with appropriate macronutrient content. Participants could eat ad libitum from the selected foods and were not given an energy deficit target. The higher protein, lower glycemic load approach demonstrated a weight loss benefit. This study, while intriguing, cannot be translated to real-world guidance without evidence that free-living individuals can achieve the dietary regimen and that the regimen produces weight loss.

As to the effect of higher and lower protein levels on CVD risk factors, these studies did not show ascertainable differences in lipids or blood pressure due to protein alterations alone, although manipulations of fat content and carbohydrate content will result in changes in lipids.

I Low-Carbohydrate (<30 g/Day) Approaches—Summary Table 3.4

ES6a. In overweight and obese adults, there are no differences in weight loss at 6 months with instructions to consume a carbohydrate-restricted diet (20 g/day for up to 3 months, followed by increasing levels of carbohydrate intake up to a point at which weight loss plateaus) in comparison to instruction to consume a calorie-restricted, low-fat diet. The comparator diets on which this statement is based were either calorie-restricted, higher carbohydrate, and lower protein (55 percent of total calories from carbohydrate, 30 percent of total calories from fat, 15 percent of total calories from protein) or a lower fat EASD food group dietary pattern (40 percent of total calories from carbohydrate, 30 percent of total calories from fat, and 30 percent of total calories from protein).

Strength of evidence: Low

ES6b. There is insufficient evidence to comment on the CVD risk factor effects of low carbohydrate diets.

Rationale: Two RCTs addressed interventions with low-carbohydrate approaches [95, 162]. Summary table 3.4 presents a brief description of these trials. The first, a good-quality trial conducted in the United States, reports on weight loss and CVD risk factors at 6 and 12 months [95]. The second, a fair-quality trial, was conducted in an academic setting in New Zealand for 24 weeks [162]. Low-carbohydrate approaches to dieting, as endorsed by the Atkins Diet books [185], have been popular among consumers for many decades. This is, in part, due to the initial rapid weight loss, which occurs in the first few weeks of the low-carbohydrate diet because of glycogen depletion. Glycogen is stored with two molecules of water; fat is anhydrous. As glycogen is depleted, water is released, and weight loss is amplified. The low-carbohydrate diet prescribes an initial period of only 20 grams of daily carbohydrate intake. In the studies examined [95, 162], this was sustained for 3 weeks to 3 months. Then, carbohydrates are reintroduced in increments of 5 grams per day each week until stable and desirable weight loss is achieved. Another effect of the low-carbohydrate diet is to eliminate certain food types or food groups, especially packaged and processed foods, from the diet. Diet books using the low-carbohydrate approach emphasize what one can eat (such as steak, chicken, fish, shrimp, eggs, hollandaise sauce, asparagus, lettuce, whipped cream) rather than what one cannot eat (breads, sweets, chips, potatoes, rice, apples).
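The carbohydrate prescription described above (an induction period at 20 g/day, then weekly increases of 5 g/day) can be laid out as a week-by-week schedule. The sketch below is illustrative; the function name and the fixed number of weeks are assumptions, since in practice the increases stop when weight loss stabilizes rather than at a preset week.

```python
def carb_schedule(induction_weeks: int, total_weeks: int,
                  start_g: int = 20, step_g: int = 5) -> list:
    """Daily carbohydrate target (g/day) for each week of the diet.

    Weeks in the induction period stay at start_g; each later week adds
    step_g to the daily target. The stopping rule (weight loss plateau)
    is not modeled; total_weeks is an assumed fixed horizon.
    """
    schedule = []
    for week in range(1, total_weeks + 1):
        weeks_past_induction = max(0, week - induction_weeks)
        schedule.append(start_g + step_g * weeks_past_induction)
    return schedule


# Example: 3-week induction followed over 8 weeks in total.
print(carb_schedule(induction_weeks=3, total_weeks=8))
# [20, 20, 20, 25, 30, 35, 40, 45]
```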

The first [95] of two studies examined was rated good quality and compared the low-carbohydrate diet with a low-fat (<30 percent of total calories as fat), energy-restricted approach. The other aspects of the intervention (contact time and exercise instruction) were controlled across groups. There was no difference in weight loss between groups at 6 and 12 months; in fact, both groups lost more than 10 kg, on average, at 1 year. The second study [162] was rated fair quality and compared a low-carbohydrate diet to the Zone® diet [184] and to an EASD food group diet [186] in obese, insulin-resistant women. Although the study reports greater weight loss at 6 months with the EASD diet compared to the other two, the amount does not reach the study's a priori definition of clinical significance (−3 kg).

The number of good- or fair-quality studies to address low-carbohydrate approaches is limited. Maintaining long-term compliance can be difficult because of the restriction in food choices; retention is often problematic. Since the two included studies show that the low-carbohydrate dietary approach does not produce weight loss greater than calorie-restricted, low-fat approaches over 6 to 12 months, it would be useful to have more observations to better evaluate this popular approach.

J Complex Versus Simple Carbohydrates—Summary Table 3.5

ES7. There is insufficient evidence to comment on the value of substituting either simple or complex carbohydrates for dietary fat for overweight or obese adults for achieving weight loss.

Rationale: Only one fair-quality randomized trial comparing ad libitum intake of either complex or simple carbohydrates satisfied eligibility criteria for inclusion [155]. Summary table 3.5 summarizes the design, characteristics, and results of this trial. This study included a control arm; however, the attrition rate in this arm was substantially higher than in the two test diet arms. Only the results from the test diet arms were included as part of this panel's evidence base. Since the study provided food constituting more than 60 percent of energy requirements, the generalizability to free-living conditions is limited. Weight loss was greater at 6 months with the low-fat, high-complex carbohydrate diet than the low-fat, high-simple carbohydrate diet. Because there is only one study, no evidence statement can be made.

K Glycemic Load Dietary Approaches—Summary Table 3.6

ES8. In overweight and obese adults, both high- and low-glycemic load diets produce a comparable weight loss with a similar rate of loss over 6 months.

Strength of evidence: Low

Rationale: Two randomized trials, described in three papers [110, 164, 165], met the inclusion criteria; both trials were conducted in the United States and were rated fair quality. One trial [110, 164] was conducted for 12 months; however, only the 6-month results are discussed. The 12-month results were not included due to the use of completers analysis and an overall withdrawal rate exceeding 10 percent. The second trial reports results on a number of CVD risk factors at 6, 12, and 18 months [165]. The two trials that tested low-glycemic load approaches are shown in Summary table 3.6.

Diets that manipulate glycemic load have become popular because they attempt to modify insulin secretion. The available evidence suggests that insulin secretion can be stimulated by foods that contain rapidly absorbed carbohydrates. Low-glycemic load foods tend to be higher in fiber and have higher levels of complex carbohydrates and lower levels of simple carbohydrates. The rationale for low-glycemic load diets is that they will produce a lower and more moderated insulin response and result in less hunger in the long term, although this has never been convincingly shown. The low-glycemic load approach is popular in patients with diabetes and prediabetes.

Because of the widespread prescription of low-glycemic load diets for individuals with insulin resistance, the panel sought studies comparing low- and high-glycemic load approaches. Unfortunately, the available evidence from good- or fair-quality studies is sparse; it shows that both high- and low-glycemic load diets can be successful for weight loss over 6 to 18 months. One study [110, 164] conducted in a metabolic ward with all food provided and an imposed caloric deficit resulted in no difference in weight loss among diets of high- and low-glycemic load. Another study [165] of ad libitum food intake among free-living individuals showed no difference in weight loss between a low-glycemic load diet (40 percent carbohydrate, 35 percent fat, 25 percent protein) and a low-fat diet (55 percent carbohydrate, 20 percent fat, 25 percent protein). However, this study did demonstrate significant differences at 6 and 18 months in LDL-C, favoring the low-fat diet, and in HDL-C and triglycerides, favoring the low-glycemic load diet. There were no differences in blood pressure, glucose, or insulin between diets.

Because of the widespread use of diets that manipulate glycemic load in populations with diabetes and prediabetes, the panel identified the need for studies of glycemic load (or glycemic index) in free-living individuals, with attention to retention, compliance, and ITT analysis of outcomes.

L Dietary Pattern (Mediterranean Style and Vegetarian and Other Dietary Pattern) Approaches—Summary Table 3.7

ES9. In overweight and obese adults, a variety of calorie-restricted dietary patterns (i.e., Mediterranean-style, lower fat lacto-ovo vegetarian or vegan style, or lower fat high dairy/calcium with added fiber and/or low-glycemic index/load foods) produce weight loss and cardiovascular benefits that are comparable to an energy-restricted, lower fat (25 to 30 percent of total calories from fat, ATP III, or AHA Step 1) dietary pattern.

Strength of evidence: Low

Rationale: Four RCTs (one good quality, three fair quality) described in six articles met inclusion criteria for strategies focusing on alternative types of dietary patterns for weight loss [145, 159, 163, 166, 167, 170]. Summary table 3.7 summarizes the design, characteristics, and results of the four studies and six articles. The good-quality trial compared a Mediterranean-style diet to a low-fat dietary pattern [167] and followed subjects in a university setting for 4 years. Two fair-quality trials compared a vegetarian-style dietary pattern intervention [159, 163, 166, 170] to a lower fat dietary pattern. Only two of the trials reported on CVD risk factor outcomes in addition to weight change. The fourth trial evaluated lower fat, higher dairy/calcium, and higher fiber and lower glycemic index/load dietary pattern approaches [145].

Recent epidemiological evidence from population-based prospective studies suggests that healthier overall dietary patterns are associated with lower rates of obesity and reduced chronic disease risk, including CVD and diabetes. Dietary patterns are generally defined in two different ways: [1] in terms of the population's habitual eating practices (also referred to as empirical or a posteriori patterns), such as Mediterranean or vegan/vegetarian dietary patterns, or [2] as patterns that are specifically designed to target certain foods, improve macronutrient profiles, and achieve higher nutrient density and overall improved dietary quality based upon existing expert evidence (also referred to as theoretical or a priori patterns), such as the DASH [187], DASH-Sodium [188], AHA Step 1 [179], and National Cholesterol Education Program ATP III dietary patterns [180]. The randomized controlled trials were designed to test the effectiveness of dietary patterns that targeted specific foods and nutrients, based on evidence suggesting that they might promote weight loss and potentially improve other health outcomes; e.g., decrease CVD risk factors. In all the trials, the test dietary patterns were compared to lower fat (25 to 30 percent) protocols, which typically advocated fat restriction, higher complex carbohydrates, and whole grains. Energy restriction was either explicitly prescribed or achieved voluntarily by subjects. Trial arms were balanced in terms of physical activity recommendations and intensity of behavioral therapy managed by highly trained professionals.

For weight loss outcomes, observed advantages of the Mediterranean-style; low-fat, lacto-ovo vegetarian-style; vegan; or low-fat, high-dairy/calcium with and without high fiber (low-glycemic index/load) foods are inconsistent across available research studies and modest, at best [167]. At followup ranging from 6 to 18 months, only one trial indicated that the Mediterranean-style dietary pattern resulted in greater weight losses (2 kg) at 12 months and greater reductions in waist circumference (1.3 cm) compared to the AHA Step 1 dietary pattern. One trial indicated that at 6 and 18 months, the energy-restricted, low-fat, lacto-ovo vegetarian-style diet did not result in differences in weight, BMI, or waist circumference changes compared to the low-fat dietary pattern [163, 170]. One other trial suggested that the low-fat, vegan dietary pattern resulted in greater reductions in weight (4.9 kg and 3.1 kg at 12 and 24 months, respectively) compared to the low-fat National Cholesterol Education Program pattern (1.8 kg and 0.8 kg, respectively). In one study, changes in weight loss and secondary weight outcomes (body fat, trunk fat, waist and hip circumferences) did not differ between a high-dairy diet and a diet high in dairy and fiber and with a low-glycemic index [145]. At the 4-year followup, one trial demonstrated that shorter term differences in weight outcomes between the Mediterranean-style and AHA Step 1 dietary patterns were attenuated [167].

For CVD risk factors, like weight loss outcomes, the evidence of modest favorable benefits of certain dietary patterns also is inconsistent. One study indicated that the Mediterranean-style dietary pattern compared to the AHA Step 1 dietary pattern at the 12-month followup resulted in modest but favorable changes in glucose control, lipids, and blood pressure, with a decrease of 0.6 percent in HbA1c, 1.2 mmol/L in plasma glucose, 0.22 mmol/L in triglycerides, 3.1 mmHg in SBP, and 1.0 mmHg in DBP and an increase of 0.08 mmol/L in HDL-C [167]. At year 4, the Mediterranean-style diet changes remained favorable for glucose control and lipids, but the blood pressure impacts were attenuated [167]. One study [170] indicated that LDL-C decreased by 0.16 mmol/L at 6 months in a low-fat, lacto-ovo vegetarian dietary group versus the low-fat dietary pattern group, in which it increased by 0.05 mmol/L at 6 months. However, at 18 months, the LDL-C levels no longer differed in vegetarian and low-fat groups.

Diets that differ substantially from an individual's habitual dietary patterns may be difficult to adhere to and maintain long term and appear to result in lower dietary treatment compliance. Since calorie-restricted diets appear to be similar in their short- and long-term effects on weight, metabolic, and CVD outcomes with no consistent, strong evidence of the benefits of one dietary pattern over another, practitioners are advised to tailor dietary interventions for weight loss to the individual's habitual eating practices, when possible [163, 167, 170].

M Meal Replacements and Adding Foods to Liquid Diets—Summary Table 3.8

ES10a. In overweight and obese women, the use of liquid and bar meal replacements is associated with increased weight loss at up to 6 months in comparison to a balanced deficit diet utilizing only conventional food. Longer term evidence of continued weight loss advantage is lacking.

Strength of evidence: Low

ES10b. There is insufficient evidence to comment on the value of adding various types of foods to a low-calorie liquid diet.

Rationale: Three studies (rated fair quality) used meal replacements with differing counseling approaches [160, 173] and with adding specific solid food to a liquid diet [157]. The two studies that evaluated meal replacements compared dietitian-led counseling with and without the use of liquid or bar meal replacements. Those studies evaluated only women, and neither trial reported data on CVD risk factors. The study that evaluated the addition of various types of foods to a liquid diet allowed male and female subjects to add either almonds or a food of equivalent caloric content from a food list. Summary table 3.8 describes the three studies evaluating meal replacements.

Meal replacements are commercial, portion-controlled products. They are packaged as powders that can be reconstituted as shakes or soups, liquids, bars, or packaged or frozen entrees or snacks. In practice, meal replacements are usually employed to enforce caloric restraint, and they were used in this way in the Look AHEAD study [160]. In the two studies described in Summary table 3.8, the use of liquid and bar meal replacements resulted in greater weight loss at 6 months [154] and 20 weeks [160], although no significant difference in weight loss was documented by 40 or 60 weeks in one of the studies [160].

Although most meal replacement diets replace one to two meals and are usually prescribed as part of a 1,200 to 1,500 kcal/day low-calorie diet, one fair-quality study [157] addressed the strategy of liberalizing the diet while on a liquid diet for all three meals. That study involved adding targeted foods to an energy-restricted liquid formula. The issue of helping patients sustain compliance while on meal replacement diets has practical implications. However, just one small study addresses the issue [157].

N Very Low-Calorie Diet Approaches—Summary Table 3.9

ES11a. There is insufficient evidence to comment on the value of liquid protein supplementation following the VLCD induction of weight loss as an aid to weight loss maintenance.

ES11b. There is insufficient evidence to comment on strategies to provide more supervision of VLCD adherence or to liberalize VLCD therapy with the addition of conventional foods as an aid to the induction of weight loss.

Rationale: Two studies (rated fair quality) addressing VLCDs for weight loss met inclusion criteria and are characterized in summary table 3.9 [153, 161]. They evaluated VLCDs as an initial phase for weight loss maintenance programs. In addition, CQ4 addressed VLCDs as part of behavioral interventions, assessing different behavioral approaches as additions to VLCDs [189-192].

VLCDs are usually defined as diets providing <800 kcal/day (<3,347 kilojoules) and are designed to produce rapid weight loss while preserving lean body mass. The macronutrient content of these diets, therefore, typically consists of 0.8 to 1.5 grams of protein/kg of ideal body weight per day. The protein usually is provided as a milk-, soy-, or egg-based powder, which is mixed with water and ingested as a liquid. These powders also contain carbohydrate (up to 80 g/day) and fat (up to 15 g/day) and include 100 percent of the recommended daily allowance for essential vitamins and minerals [193]. Another method of obtaining a VLCD is to use the protein-sparing modified fast (PSMF), which consists of lean meat, fish, and fowl. The PSMF must be supplemented with vitamins and minerals as well as large amounts of water and noncaloric fluids [5, 194]. VLCDs are considered safe and effective when used by individuals under carefully supervised medical monitoring.
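The unit conversion and protein range quoted above can be checked with a short calculation. This is an illustrative sketch only: the function names and the 70 kg example weight are assumptions, and the factor of 4.184 kJ per kcal is the standard thermochemical conversion rather than a figure taken from this report.

```python
KJ_PER_KCAL = 4.184  # standard conversion factor; an assumption, not stated in the report


def kcal_to_kj(kcal):
    """Convert kilocalories to kilojoules."""
    return kcal * KJ_PER_KCAL


def vlcd_protein_range_g(ideal_body_weight_kg):
    """Daily protein range (g) at 0.8 to 1.5 g per kg of ideal body weight."""
    return (round(0.8 * ideal_body_weight_kg, 1),
            round(1.5 * ideal_body_weight_kg, 1))


# The <800 kcal/day threshold expressed in kilojoules.
print(round(kcal_to_kj(800)))  # 3347

# Protein range for a hypothetical ideal body weight of 70 kg.
print(vlcd_protein_range_g(70))
```

The same conversion shows that a 500 kcal/day formula corresponds to roughly 2,100 kJ/day, the rounded figure used for the trials discussed in this section.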

Despite the popularity of VLCDs, only two studies using liquid diets in obese subjects [153, 161] were rated of adequate quality (fair) to be included in CQ3. They used VLCD without intensive lifestyle intervention (ILI), and both had weight loss and weight maintenance phases. Lejeune et al. [161] used a 2,100 kJ (500 kcal)/day VLCD (Modifast powders that were reconstituted with water into a milkshake, pudding, soup, or cereal and supplemented with fruits and vegetables) during 4 weeks of weight loss followed by a 6-month maintenance phase where groups consumed a “usual diet” and one group received an additional liquid, reconstituted pure protein powder supplement (30 g protein/day). Weight differences at 6 months favored the protein supplement group. Torgerson et al. [153] used a 2,100 kJ (500 kcal)/day VLCD for 16 weeks using three different approaches: inpatient liquid-only diet, outpatient liquid-only diet, and outpatient liquid diet plus fruit and vegetable supplements. The study found no differences in weight losses.

The 1998 overweight and obesity clinical guidelines state that long-term (>1 year) weight loss with VLCDs is not different from that of low-calorie diets, despite superior initial weight loss [5]. The equivalence of long-term weight losses was attributable to greater weight regain among the VLCD-treated subjects. Two studies [191, 192] evaluated under CQ4 in this evidence review met inclusion criteria and addressed VLCDs for weight maintenance in the context of lifestyle intervention. Borg et al. [190] used an intervention that included Nutrilett (2,100 kJ or 500 kcal)/day versus control. Fogelholm et al. [192] used the same Nutrilett program and had a similar duration. Each study started with a weight loss program, which was 2 and 3 months in length, respectively, for the VLCD, followed by maintenance and followup. The modest short-term weight loss was not sustained over the longer term (33 months), with average weight regains between 5.9 and 9.7 kg at 33 months. Exercise maintenance seemed to attenuate the weight regain.

O Recommendations—Dietary Strategies for Weight Loss

Recommendation 3a. Prescribe a diet to achieve reduced calorie intake for obese or overweight individuals who would benefit from weight loss, as part of a comprehensive lifestyle intervention. Any one of the following methods can be used to reduce food and calorie intake:

  1. Prescribe 1,200-1,500 kcal/day for women and 1,500-1,800 kcal/day for men (kcal levels are usually adjusted for the individual's body weight);
  2. Prescribe a 500 kcal/day or 750 kcal/day energy deficit; or
  3. Prescribe one of the evidence-based diets that restricts certain food types (such as high-carbohydrate foods, low-fiber foods or high-fat foods) in order to create an energy deficit by reduced food intake.

Recommendation Grade: A (strong)

Rationale: Foundational to weight loss is the necessity of creating a negative energy balance during the active weight loss period. To do this, the emphasis must be on reducing energy intake from food. This is a requirement because creating a substantial energy deficit by increasing energy expenditure through physical activity alone is, for most Americans, very difficult. Thus, for the active weight loss phase of weight management, the emphasis in lifestyle counseling is on constructing a healthy low-calorie diet that can produce weight loss of 1 to 2 lb/wk. We provide three recommended ways to achieve this aim, but health care providers must understand that the key to achieving them is to give patients not only instructions but also tools to implement them. In the majority of the studies included in our review, registered dietitians provided the behavioral counseling. Therefore, practitioners who are not proficient and willing to devote the considerable time required should refer patients who would benefit from weight loss to a nutrition professional or to counselors trained in nutrition intervention, so that patients may benefit from behavioral intervention.

Evidence Statement 1 identified three pathways (used in the studies from the evidence base for CQ3) to achieve negative energy balance through reduced food intake, and those pathways form the basis for Recommendation 3a. The three pathways described in Recommendation 3a are all documented to produce weight loss. The provider should consider the patient's preference, ability, and health status to select a pathway to achieve negative energy balance. The most common technique is to limit intake to 1,200 to 1,500 kcal/day for women or 1,500 to 1,800 kcal/day for men [92, 160, 164, 167]. The higher limits are for individuals who have greater weights at baseline. In another technique, the baseline energy requirement is calculated with a formula, described above [169, 170, 172], and modified for patients' habitual exercise pattern. This “tailors” the estimation of baseline energy requirements; the calorie goal is determined by subtracting 500 to 750 kcal/day. Of course, the goal may be readjusted depending on actual weight loss. For both of these techniques, patients are then given tools and strategies to achieve daily caloric targets and monitor daily intake. As variants of these approaches, sometimes patients are provided with tools to monitor “points” that correspond to a calorie limit. Some diets are so-called “ad libitum” approaches [92, 152, 159, 161-163]. However, these diets restrict certain food types or food groups. “Ad libitum” refers only to food intake of certain prescribed foods. It must be emphasized that the “ad libitum” is illusory. In these approaches, reduced caloric intake is well documented, and the weight loss achieved is due to negative energy balance.
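The first two pathways in Recommendation 3a reduce to simple arithmetic, sketched below for illustration. The function names, the dictionary keys, and the 2,400 kcal/day example requirement are hypothetical; the formula for estimating baseline energy requirement is not reproduced here, so its result is passed in as an input.

```python
# Fixed daily ranges from Recommendation 3a, item 1 (kcal/day).
FIXED_RANGES_KCAL = {"women": (1200, 1500), "men": (1500, 1800)}


def fixed_range_kcal(group):
    """Pathway 1: return the (low, high) prescribed range for 'women' or 'men'."""
    return FIXED_RANGES_KCAL[group]


def deficit_target_kcal(baseline_requirement_kcal, deficit_kcal=500):
    """Pathway 2: subtract a 500 to 750 kcal/day deficit from the estimated requirement."""
    if not 500 <= deficit_kcal <= 750:
        raise ValueError("deficit is outside the 500 to 750 kcal/day range in the text")
    return baseline_requirement_kcal - deficit_kcal


print(fixed_range_kcal("women"))       # (1200, 1500)
print(deficit_target_kcal(2400, 750))  # 1650, using a hypothetical 2,400 kcal/day requirement
```

Either result is a starting point only; as the text notes, the goal may be readjusted depending on actual weight loss.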

Recommendation 3b. Prescribe a calorie-restricted diet for obese and overweight individuals who would benefit from weight loss, based on the patient's preferences and health status, and preferably refer to a nutrition professional for counseling. A variety of dietary approaches can produce weight loss in overweight and obese adults, as presented in CQ3, Evidence Statement 2.

Recommendation Grade: A (strong)

Rationale: A myriad of dietary approaches to weight loss can be successful. The evidence supports all the approaches listed in Evidence Statement 2, above. Diets recommended for Americans by the AHA and American Diabetes Association (ADA) can produce weight loss and are nutritionally balanced [174-176, 190]. Weight loss can be achieved with vegetarian or vegan diets [142, 156, 163] and with dietary patterns modeled after certain traditional cultures [164]. Diets employed in the popular diet books can induce weight loss [179, 180] but may not be nutritionally balanced. Although not considered in this evidence review, health care practitioners must consider overweight and obese patients with hypertension as good candidates for a calorie-restricted DASH diet, with nine servings of fruits and vegetables and three servings of low-fat dairy products [182, 183, 191]. The charge to practitioners is to support patients' preferences and strong aversions and to guide their patients in selecting healthy dietary patterns that can be sustained over the longer term.

The message to the practitioner is threefold. First, there are many options/choices that can work to help patients lose weight and achieve health benefits. Second, when selecting a weight-loss diet, consider the contribution of the diet to management of other risk factors or diseases (e.g., type 2 diabetes, hypertension, gout). Also consider the long-term nutritional adequacy and sustainability of the diet, and tailor the dietary intervention to the needs, habitual patterns, and preferences of the individual. Third, no diet will be effective for weight loss without calorie reduction. Losing weight requires reduction in calorie intake, whether patients are tracking calories, points, grams of a macronutrient, or eating from a limited list of food choices. If a diet “doesn't work,” then an analysis should reveal an excessive consumption of calories, relative to energy expenditure, and a modification of approach is indicated.

P Gaps in Evidence and Future Research Needs

Dietary interventions are a critical element of any attempt to lose weight and maintain weight loss. While this review of the evidence supports that there are a variety of dietary patterns and alternative dietary forms, structures, or composition to achieve and sustain weight loss, further research is still needed.

Because long-term dietary adherence is problematic in weight management, studies should test pragmatic approaches to diet intervention delivery in free-living individuals for at least 2 years. What works over 6 months may not be durable over 2 years. The long-term outcome is of utmost importance to determine the best dietary approach to sustain weight loss over the long term.

Additionally, studies are needed that test the impact of tailoring choice of dietary interventions to the individual's ability to adhere long term. One of the findings of this assessment of the longer term diet studies was that a minority of participants were adhering to dietary recommendations at 2 years. The diet can only have an effect if an individual will follow it.

Last, to fully understand and develop remedies for the challenges of long-term weight maintenance, studies are needed that evaluate the physiologic and biologic adaptations to weight loss; i.e., studies of metabolic response to weight reduction. Understanding the physiologic response to weight reduction might enable the field to define better dietary methods of caloric restriction during weight reduction and maintenance.

Section 6: Critical Question 4

A Statement of the Question

CQ4 has two parts:

  1. Among overweight and obese adults, what is the efficacy/effectiveness of a comprehensive lifestyle intervention program (i.e., comprising diet, physical activity, and behavior therapy) in facilitating weight loss or maintenance of lost weight?
  2. What characteristics of delivering comprehensive lifestyle interventions (e.g., frequency and duration of treatment, individual versus group sessions, onsite versus phone/e-mail contact) are associated with greater weight loss or weight loss maintenance?

B Selection of the Inclusion/Exclusion Criteria

Panel members developed eligibility criteria, based on a population, intervention/exposure, comparison group, outcome, time, and setting (PICOTS) approach, to use for screening potential studies for inclusion in the evidence review. Table 5 presents the details of the PICOTS approach for CQ4. Only randomized controlled trials (RCTs) were considered.

Table 5. Criteria for Selection of Publications for CQ4

C Introduction and Rationale for Question and Inclusion/Exclusion Criteria

Prior national and international expert panels (e.g., [5, 195-199]) have independently recommended that overweight and obese adults be provided a comprehensive lifestyle intervention to achieve weight loss. Comprehensive programs employ diet, physical activity, and behavior therapy in combination. CQ4 seeks to determine the short- and long-term weight losses that can be achieved with a comprehensive lifestyle intervention.

Traditionally, comprehensive lifestyle interventions have been delivered onsite, in frequent face-to-face meetings; i.e., high-intensity, onsite treatment. This approach is generally considered the state of the art for lifestyle intervention. In the past decade, however, comprehensive programs delivered by electronic means, including the Internet, e-mail, and text messaging, as well as by person-to-person telephone counseling have been emerging. Comprehensive interventions also are being delivered in new settings that diverge from the academic centers in which most RCTs have been conducted. For example, health professionals who work in primary care settings have been implementing lifestyle interventions in these settings. Some commercially based programs also have incorporated the components of a comprehensive lifestyle intervention into their programs, which they offer to the public through face-to-face and telephone- and electronically based contact. CQ4 describes the short- and long-term weight losses from RCTs that have examined the results of comprehensive interventions delivered through these different modalities and venues. In most cases, the efficacy of each comprehensive lifestyle intervention was compared with usual care; i.e., minimal treatment, attention control group.

Few RCTs have directly compared the efficacy of comprehensive lifestyle interventions as delivered by one modality versus another or as offered in one setting versus another. For example, only one trial has directly compared the efficacy of the same lifestyle intervention delivered onsite (i.e., face to face) versus by Internet. This prevented the panel, in several instances, from drawing definitive conclusions about the relative efficacy of the different interventions examined. This limitation also applied to the panel's efforts to draw conclusions about several characteristics of traditional onsite comprehensive interventions that the panel thought might influence short- or long-term weight losses. These characteristics included the intensity of the intervention (i.e., how frequently participants had counseling contacts), the duration of care, and whether participants received individual or group counseling. In the absence of RCTs that directly tested these issues, the panel examined the difference in mean weight loss between participants who were assigned to the intervention and those assigned to the usual care groups. Net-of-control differences for one group of trials, such as those that offered high-intensity onsite treatment, were then compared with net-of-control differences for a second group of trials, such as those that provided low-intensity onsite treatment. Large differences between two groups of studies in their net-of-control differences suggested that one intervention approach was potentially superior (or inferior) to another. These comparisons, however, were not subjected to statistical analysis, thus limiting definitive conclusions in some areas.
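The net-of-control comparison described above can be illustrated with a small calculation. This is a sketch under stated assumptions: the trial values are invented for illustration and are not drawn from the evidence base, the function names are hypothetical, and averaging across trials is one simple way to summarize a group, not necessarily the panel's procedure.

```python
from statistics import mean


def net_of_control(intervention_change_kg, control_change_kg):
    """Mean weight change in the intervention arm minus that in the usual care arm."""
    return intervention_change_kg - control_change_kg


def mean_net_of_control(trials):
    """Average net-of-control difference over (intervention, control) pairs."""
    return mean(net_of_control(i, c) for i, c in trials)


# Hypothetical mean weight changes in kg (negative values are losses).
high_intensity_trials = [(-8.0, -1.0), (-7.0, -2.0)]
low_intensity_trials = [(-3.0, -1.0), (-2.0, -1.0)]

print(mean_net_of_control(high_intensity_trials))  # -6.0
print(mean_net_of_control(low_intensity_trials))   # -1.5
```

A gap of this size between the two groups would suggest, but not establish, that one approach is superior, because the comparison involves no statistical test.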

RCTs that examined interventions to improve the maintenance of lost weight were of particular interest given the widely acknowledged problem of weight regain following the end of lifestyle interventions. Later sections of this document (section viii, Trials of Weight Loss Induction Versus Maintenance of Lost Weight) describe the different study designs that are used to examine the induction versus the maintenance of weight loss as well as progress over the past decade in improving the maintenance of lost weight.

The panel did not attempt to isolate the effects in terms of weight loss induction of one intervention component (i.e., diet, physical activity, or behavior therapy) relative to others given the review of this issue by prior expert panels with the resulting consensus that all three components should be prescribed. Additional information about the effects on weight loss of diet composition and form are covered in CQ3, whereas findings concerning the contribution of different types of physical activity to weight reduction recently have been reviewed by another expert panel [200]. Behavior therapy is used to facilitate participants' adherence to diet and physical activity recommendations; it is not used in isolation (by itself) for weight loss.

i A Dictionary of Lifestyle Intervention Terms

This section defines select terms used in CQ4.

ii Comprehensive Lifestyle Intervention

Comprehensive lifestyle interventions for overweight/obese adults include three principal components: [1] Prescription of a moderately reduced calorie diet, [2] prescription of increased physical activity, and [3] a program of behavior change to facilitate adherence to diet and activity recommendations. (All three components, described later in greater detail, should be included.) Adherence to diet and activity recommendations also is facilitated by ongoing guidance and feedback from a trained interventionist.

iii Intervention Delivery

  • Onsite: The intervention is delivered to participants by a trained interventionist in face-to-face meetings held at a clinic, community center, worksite, or other settings.
  • Electronic: The intervention is delivered to participants by e-mail, Internet, mobile phone, text message, or similar electronic means. Interventionists may communicate personally with participants by electronic means (e.g., e-mail) but not by telephone; i.e., speaking with each other.
  • Telephone: The intervention is delivered to participants by telephone; i.e., live person-to-person contact.
  • Commercial: The intervention is delivered to participants who pay a fee to a proprietary weight loss program. Interventionists, trained by the company, deliver the intervention.
  • Primary care: The intervention is delivered to participants in a primary care practice by health professionals and staff who work in the practice.

iv Intervention Intensity

The panel defined the intensity of lifestyle interventions by the number of treatment contacts provided in the first 6 months.

  • High: 14 or more contacts. (Weekly contact for the first 3 to 6 months is common.)
  • Moderate: 6 to 13 contacts; i.e., monthly to every-other-week contact.
  • Low: 1 to 5 contacts; i.e., less than monthly.
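The intensity categories above amount to a threshold rule on the number of contacts in the first 6 months. The following sketch encodes that rule; the function name is hypothetical, and rejecting counts below 1 is an assumption, since the panel's definition does not cover that case.

```python
def classify_intensity(contacts_first_6_months):
    """Map the number of treatment contacts in the first 6 months to an intensity label."""
    if contacts_first_6_months < 1:
        # Assumption: fewer than 1 contact falls outside the panel's categories.
        raise ValueError("intensity is defined only for 1 or more contacts")
    if contacts_first_6_months >= 14:
        return "high"
    if contacts_first_6_months >= 6:
        return "moderate"
    return "low"


print(classify_intensity(24))  # high (for example, weekly contact for 6 months)
print(classify_intensity(6))   # moderate (monthly contact)
print(classify_intensity(3))   # low
```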

v Intervention Duration

The panel defined the duration of lifestyle intervention as well as the time point at which body weight was last assessed after intervention (i.e., nonintervention followup) as follows:

  • Short term: ≤6 months
  • Intermediate term: >6 to ≤12 months
  • Long term: >12 months
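The duration categories can be expressed the same way. In this sketch the function name is hypothetical, and rejecting nonpositive durations is an assumption.

```python
def classify_duration(months):
    """Map an intervention or followup duration in months to the panel's term label."""
    if months <= 0:
        # Assumption: a duration must be positive to be classified.
        raise ValueError("duration must be greater than 0 months")
    if months <= 6:
        return "short term"
    if months <= 12:
        return "intermediate term"
    return "long term"


print(classify_duration(6))   # short term
print(classify_duration(12))  # intermediate term
print(classify_duration(18))  # long term
```

The boundaries are inclusive at the upper end, so a trial with a final assessment at exactly 12 months is intermediate term rather than long term.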

vi Individual Versus Group Intervention

A trained interventionist may deliver a lifestyle intervention to a single participant (i.e., individual contact) or to a group of individuals (typically 10 to 20 participants).

vii Trained Interventionist

In the studies reviewed, trained interventionists included mostly health professionals (e.g., registered dietitians, psychologists, exercise specialists, health counselors, or professionals in training) who adhered to formal protocols in weight management. In a few cases, lay persons were used as trained interventionists; they received instruction in weight management protocols (designed by health professionals) in programs that have been validated in high-quality trials published in peer-reviewed journals.

viii Trials of Weight Loss Induction Versus Maintenance of Lost Weight

  • Weight Loss Induction: RCTs of weight loss induction assign participants to different interventions and examine changes in body weight (from baseline) at different intervals, which may include at the end of treatment delivery and then at 3 or more months after treatment has concluded (i.e., 3-month, nonintervention followup). Such trials often provide information about the effects of an intervention on both short- and long-term weight changes. Long-term changes in weight are sometimes referred to as the “weight loss maintenance” phase although “long-term weight change” is a more appropriate term.
  • Maintenance of Lost Weight: This term often is used interchangeably with “maintenance of weight loss” or “preventing weight regain” although there are differences between the terms. A major difference is that “maintenance of weight loss” suggests that the intervention is designed to facilitate participants' continued loss of weight following the initial period of weight loss. “Maintenance of lost weight,” by contrast, suggests that the goal is to keep off (or maintain) the weight loss that was achieved in the initial weight loss phase. “Prevention of weight regain” suggests trying to limit the amount of weight that is regained from the prior weight loss.

RCTs designed to address the maintenance of lost weight use a different experimental design than those that examine weight loss induction. In the former case, all participants must first lose a certain amount of weight (e.g., 5 percent of initial weight) to qualify for randomization in the weight loss maintenance trial. The initial weight loss often is described as occurring during a “diet run-in” period. Success in the maintenance trial typically is measured by the percentage of the prior (run-in) weight loss that is maintained or by the absolute change in body weight from the randomization weight (achieved after the diet run-in). This latter assessment often translates into a measure of weight regain from randomization.
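The two success measures for maintenance trials described above can be written out as follows. This is a minimal sketch: the function names and the example weights are hypothetical, and the 5 percent run-in threshold is the example given in the text, not a fixed requirement.

```python
def qualifies_for_randomization(baseline_kg, run_in_end_kg, required_percent=5.0):
    """Check whether the run-in loss meets the required percentage of initial weight."""
    return 100.0 * (baseline_kg - run_in_end_kg) / baseline_kg >= required_percent


def percent_loss_maintained(baseline_kg, randomization_kg, followup_kg):
    """Percentage of the run-in weight loss that is still present at followup."""
    run_in_loss = baseline_kg - randomization_kg
    if run_in_loss <= 0:
        raise ValueError("no run-in weight loss to maintain")
    return 100.0 * (baseline_kg - followup_kg) / run_in_loss


def regain_from_randomization(randomization_kg, followup_kg):
    """Absolute weight change from the randomization weight (positive values are regain)."""
    return followup_kg - randomization_kg


# Hypothetical participant: 100 kg at baseline, 92 kg at randomization, 94 kg at followup.
print(qualifies_for_randomization(100.0, 92.0))    # True
print(percent_loss_maintained(100.0, 92.0, 94.0))  # 75.0
print(regain_from_randomization(92.0, 94.0))       # 2.0
```

Both measures describe the same participant: a regain of 2 kg from randomization corresponds to keeping 75 percent of the 8 kg lost during the run-in.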

D Methods for Critical Question 4

The Obesity Expert Panel formed work groups for each of its five CQs. For CQ4 the work group included one internal medicine physician and two clinical psychologists representing academic institutions across the United States. Chairmanship rotated among the members.

The wording of CQ4 evolved over time, from a comprehensive intervention initially including two or more components (i.e., dietary prescription, physical activity, or behavioral therapy) to one including all three components. Additional exclusion criteria were later put in place to remove trials that included comprehensive lifestyle interventions but were designed principally to compare different dietary interventions. The panel felt that these trials were more appropriately addressed under CQ3. One seminal RCT, the Diabetes Prevention Program, did not meet inclusion criteria because of the lower BMI inclusion criteria for the trial (24 kg/m² or 22 kg/m² for Asians). However, because of the importance of this trial, an exception was made to include the Diabetes Prevention Program in the evidence base.

As noted in section 2, Process and Methods Overview, a standardized approach to systematically reviewing the literature was conducted for all CQs. The panel members participated in developing CQ4 and its inclusion/exclusion (I/E) criteria and in reviewing the included/excluded papers and their quality ratings. Contractor staff worked closely with panel members to ensure the accuracy of data abstracted into evidence tables and summary tables and the accuracy of the application of systematic, evidence-based methodology.

The literature search for CQ4 included an electronic search of the Central Repository for RCTs or controlled clinical trials published in the literature from January 1998 to December 2009. The Central Repository contains citations pulled from seven literature databases (PubMed, CINAHL, EMBASE, PsycINFO, EBM, Biological Abstracts, and Wilson Social Sciences Abstracts). The search produced 2,145 citations, with 15 additional citations identified from nonsearch sources; i.e., by the panel members or through a hand search of systematic reviews/meta-analyses obtained through the electronic search. The systematic reviews/meta-analyses were only used for manual searches and were not part of the final evidence base. This manual cross-check was done to ensure that major studies were not missing from the evidence base. Of the 15 citations identified from nonsearch sources, 11 were published after December 31, 2009. Per NHLBI policy, certain lifestyle and obesity intervention studies published after the closing date could be allowed as exceptions. These studies were required to be RCTs in which each study arm contained at least 100 participants and to have been identified by experts knowledgeable about the literature. Ten of the 11 citations published after December 2009 met these criteria and were eligible for inclusion in the CQ4 evidence base [23, 201-209]. In contrast, 1 of the 11 citations did not meet the criteria and was excluded from the CQ4 evidence base [210]. The remaining four citations, identified through nonsearch sources, were published before 2009. Of these four, one citation had no abstract, two citations had no indication in the abstract or Medical Subject Heading terms that they were related to overweight or obese populations, and one citation had no indication in the abstract or Medical Subject Heading terms that the publication was related to comprehensive lifestyle interventions. Of the 15 citations identified through nonsearch sources, 14 were screened and found eligible for inclusion; 2 of these studies were subsequently rated as poor-quality studies. Figure 6 outlines the flow of information from the literature search through the various steps used in the systematic review process for CQ4.

Figure 6. PRISMA Diagram Showing Selection of Articles for Critical Question 4

Two reviewers (i.e., independent contractors) independently screened the titles and abstracts of 2,160 publications against the I/E criteria, which resulted in 1,776 publications being excluded and 384 publications being retrieved for full-text review to further assess eligibility. Next, 2 reviewers independently screened the 384 full-text publications, assessing eligibility by applying the I/E criteria; 215 of these publications were excluded based on one or more of the I/E criteria (see specified rationale as noted in Figure 6).
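The title and abstract screening counts reported in the text can be cross-checked with simple arithmetic. The sketch below uses only the figures stated in this section; the variable names are illustrative.

```python
# Counts as stated in the text for CQ4.
search_citations = 2145          # electronic search of the Central Repository
nonsearch_citations = 15         # panel members and hand searches
excluded_title_abstract = 1776   # excluded at title/abstract screening

screened = search_citations + nonsearch_citations
full_text_review = screened - excluded_title_abstract

print(screened)          # 2160
print(full_text_review)  # 384
```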

Of the 384 full-text publications, 146 met the criteria and were included. The quality (internal validity) of these 146 publications was assessed using the quality assessment tool developed to assess RCTs (see Appendix table A-1). Of these, 74 publications were excluded because they were rated as poor quality [107, 115, 119, 148, 154, 173, 201, 211-277]; of these, 43 studies were rated poor because they did not use an intent-to-treat (ITT) analysis and had high attrition rates. Rationales for all of the poor-quality ratings are included in the Appendix table B-16 [207, 209, 276-283]. The remaining 51 trials (72 articles) [23, 49, 189-192, 202-209, 278-335] were rated good or fair quality and included in the evidence base that was used to formulate the evidence statements. Panel members reviewed the final studies on the included list, along with their quality ratings, and had the opportunity to raise questions. Some trials previously deemed to be of fair or good quality were downgraded to poor quality upon closer review of evidence tables. These trials used completers analyses rather than ITT analysis and had overall attrition rates exceeding 10 percent. If a study reported only an analysis of completers and had an attrition rate ≤10 percent, it was allowed in the evidence base. Methodologists worked with the systematic review team to reevaluate these trials and make a final decision. Evidence tables and summary tables consist only of data from the original publications of eligible RCTs; these tables formed the basis for panel deliberations.

Creation of the evidence tables followed the methodology described in Appendix A. For each RCT (all included articles combined into one entry) that met the inclusion criteria for this CQ and was rated good or fair quality, the following data are presented in an evidence table:

  • Study Characteristics: Author, year, study name, country and setting, funding, study design, research objective, year study began, overall study number, quality rating
  • Study Design Details: Treatment groups, descriptions of interventions, duration of treatment, duration of followup, number of contacts, format of intervention, provider, assessments or collection of outcome data
  • Criteria and End points: I/E criteria, primary outcome, secondary outcome, outcome ascertainment
  • Baseline Population Characteristics: Age, sex, race/ethnicity, BMI, weight, history of MI, CHD, CVD, congestive heart failure (CHF), hypertension, diabetes, comments on demographics
  • Results: Outcomes of interest (weight change in kg, percent reduction in initial weight, weight change within specific percent change groups such as 5 or 10 percent) by time periods, adverse events, attrition at end of study, adherence. Because waist circumference, waist-hip ratio, and percent body fat were not consistently reported in many of the included studies, the panel elected to focus on direct measures of body weight only.

Summary tables for CQ4 followed the same general format, as described below; however, the organization of trials or sections within each table varied by the panel's preference of how to present the evidence:

  • Study Characteristics: Study name, author, year, study design, type of intent-to-treat (ITT) analysis, country/setting, primary outcome, quality rating
  • Intervention Groups and Component Details: Interventions concisely describing key elements of the three required components
  • Study Duration, Contacts, Health Care Practitioner: Duration of treatment and followup, description of contacts, practitioner
  • Sample Characteristics, Group Size, Baseline Characteristics: Brief sample description, intervention (not specified), weight, BMI
  • Outcomes: These are presented in three separate columns: ≤6-month mean weight loss change (kg/percent change); >6-month to ≤1-year mean weight loss change (kg/percent change); and >1-year mean weight loss change.
  • Attrition, Adherence: Withdrawals by group at study end, attendance at sessions

While preparing summary tables, it came to the panel's attention that many included trials reported only completers' analysis data with greater than 10 percent attrition. These trials were downgraded to poor quality and removed from the analysis. (Trials with ≤10 percent attrition were retained in the analysis.)
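The downgrade rule described above can be expressed as a simple check. This is a minimal sketch of that single rule only; the function and constant names are hypothetical, and the sketch does not represent the full quality assessment tool, which considers other criteria as well.

```python
ATTRITION_THRESHOLD_PERCENT = 10.0  # threshold stated in the text


def downgrade_to_poor(uses_itt_analysis, attrition_percent):
    """Return True if a trial would be downgraded under the completers rule.

    A trial is downgraded only when it reports a completers-only analysis
    (no ITT analysis) AND its overall attrition exceeds 10 percent.
    """
    return (not uses_itt_analysis) and attrition_percent > ATTRITION_THRESHOLD_PERCENT


print(downgrade_to_poor(False, 15.0))  # True: completers only, high attrition
print(downgrade_to_poor(False, 10.0))  # False: attrition does not exceed 10 percent
print(downgrade_to_poor(True, 25.0))   # False: this rule does not apply to ITT analyses
```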

Panel members developed very preliminary evidence statements prior to the development of evidence tables. This served to organize the studies into categories of questions addressed and to ensure that the appropriate data elements would be presented in the evidence tables and incorporated into the summary tables for the evidence synthesis.

E Evidence Statements and Summaries

A total of 51 trials (72 articles) met the final I/E criteria and were rated fair or good quality. The panel members decided to consider only RCTs for CQ4 (in order to work from the original data and to reach their own conclusions). Of the 51 trials, 27 (45 articles) compared a comprehensive intervention arm to usual care, minimal intervention, or no intervention. Thirteen trials (13 articles) had 2 or more comprehensive intervention arms compared to usual care, minimal intervention, or no intervention. The remaining 11 trials (13 articles) compared a comprehensive intervention to another comprehensive intervention (the latter included a different physical activity or behavior therapy component). Details regarding specific trials included for each summary table are presented below.

E.1 Introduction of Evidence Statements

The panel reviewed and summarized evidence in three key areas on the efficacy/effectiveness of a comprehensive lifestyle intervention for facilitating weight loss and maintaining lost weight. The first key area covers data on weight loss induction and programs providing extended interventions to facilitate adherence to the initial program. Comprehensive, high-intensity, onsite interventions, the most widely studied models, were the focus. These models, considered state of the art for behavioral interventions, were primarily conducted in academic research settings. Using this evidence base, panel members reviewed different types of intervention programs and then drew conclusions about the effectiveness of the various approaches; i.e., commercial programs, very low-calorie diets, primary care-based programs, electronic interventions, and telephone-based counseling. The second area of evidence examines programs designed to help patients maintain lost weight. Included in this evidence base are RCTs that assigned participants to intervention strategies after the initial weight-reduction period was completed. During this review, weight loss maintenance strategies were closely examined. Finally, the third area of evidence examines the delivery characteristics of interventions, including structural components of the interventions or modes of delivery associated with differences in outcomes. This area covers evidence on the intensity (i.e., frequency) of intervention contact (moderate and low intensity); individual versus group counseling; and onsite versus remote, electronically delivered counseling.

F Diet, Physical Activity, and Behavior Therapy Components in High-Intensity, Onsite Lifestyle Interventions—Summary Tables 4.1a-c

ES1. The principal components of an effective high-intensity, onsite comprehensive lifestyle intervention include (1) prescription of a moderately reduced-calorie diet, (2) prescription of increased physical activity, and (3) the use of behavioral strategies to facilitate adherence to diet and activity recommendations. All three components should be included.

  • Reduced-calorie diet. In comprehensive lifestyle interventions, overweight/obese individuals typically are prescribed a diet designed to induce an energy deficit ≥500 kcal/day. This deficit often is sought by prescribing 1,200 to 1,500 kcal/day for women and 1,500 to 1,800 kcal/day for men. Alternatively, dietary energy deficits can be determined by one of the methods described in CQ3.
  • Increased physical activity. Comprehensive lifestyle intervention programs typically prescribe increased aerobic physical activity (such as brisk walking) for ≥150 minutes per week (≥30 minutes a day, most days of the week). Higher levels of physical activity, approximately 200 to 300 minutes per week, are recommended to maintain lost weight or minimize weight regain long term (>1 year).
  • Behavioral strategies. Comprehensive lifestyle interventions usually provide a structured program that includes guidance on behavioral strategies and approaches to accomplish prescribed dietary intake and physical activity goals. One common strategy is regular self-monitoring, including monitoring of food intake, physical activity, and weight. These same behaviors are recommended to maintain lost weight, with the addition of frequent (i.e., weekly or more often) monitoring of body weight.

Strength of evidence: High

Rationale: The treatment components of high-intensity, onsite (i.e., face to face) comprehensive lifestyle interventions were identified from 10 RCTs (summarized in Summary tables 4.1a-c) that compared a lifestyle intervention with a usual care control group (several trials have two or more publications, which represent followup evaluations) [285, 324, 326]. Six were rated good quality while four were rated fair quality. In 3 of the 10 trials, women were prescribed approximately 1,200 to 1,500 kcal/day and men 1,500 to 1,800 kcal/day (this practice assumes that men have higher energy requirements than women based on their generally greater body weight and greater amount of fat-free mass) [49, 206, 278, 307]. Alternatively, in three trials, energy intake was prescribed based on body weight, regardless of gender [49, 206, 278, 307]. If followed, both prescriptive methods are likely to help participants achieve an energy deficit of at least 500 kcal/day, independent of changes in physical activity. One trial recommended a 500-kcal/day deficit [315] and another a 300- to 400-kcal/day deficit [205]. Another trial recommended reducing calorie and fat intake but did not specify the targeted goals [208, 309]. Only one trial [325] did not recommend a calorie-restricted diet, instead proposing a reduced fat intake with an increased intake of fruits, vegetables, and fiber.

Comprehensive interventions typically prescribe brisk walking (or similar aerobic activity) to increase participants' physical activity and, thus, energy expenditure. Of the 10 RCTs, 8 recommended that participants gradually build to 90 to 225 minutes of walking per week, with the most common goal being 150 to 200 minutes per week [49, 205, 206, 278, 280, 307, 309, 315, 324, 326]. Participants were instructed to exercise on their own in 8 of 10 trials [49, 205, 206, 278, 280, 307, 309, 315, 324, 326], with onsite supervision provided in the other 2 [285, 325].

Behavior therapy provides overweight/obese participants with a set of skills to help them adopt recommended eating and activity behaviors. Self-monitoring is the most frequently recommended practice. In 9 of 10 trials, participants were instructed to monitor their food intake (usually including calories) [49, 205, 206, 278, 315, 324], and 6 trials encouraged participants to monitor their physical activity [49, 205, 206, 278, 315, 324]. Regular monitoring of body weight also is recommended, often once or twice a week during initial weight loss [23, 49, 206, 278, 285, 307]. Individual trials in Summary tables 4.1a-c show that many interventions provided a behavior change program that included additional techniques such as stimulus control, slowing the rate of eating, problemsolving, cognitive restructuring, and relapse prevention. Collectively, these techniques comprise a “behavioral package,” different components of which may be emphasized in different trials.

G Comprehensive Interventions Compared With Usual Care, Minimal Care, or No-Treatment Control—Summary Tables 4.2a-d

ES2a. (Short-Term Weight Loss). In overweight and obese individuals in whom weight loss is indicated and who wish to lose weight, comprehensive lifestyle interventions consisting of diet, physical activity, and behavior therapy (all three components) produce average weight losses of up to 8 kg in 6 months of frequent (i.e., initially weekly), onsite treatment provided by a trained interventionist in group or individual sessions. Such losses (which can approximate reductions of 5 to 10 percent of initial weight) are greater than those produced by usual care; i.e., characterized by the limited provision of advice or educational materials. Comparable 6-month weight losses have been observed in treatment comparison studies of comprehensive lifestyle interventions, which did not include a usual care group.

Strength of evidence: High

ES2b. (Intermediate-Term Weight Loss). Longer term comprehensive lifestyle interventions, which additionally provide weekly to monthly onsite treatment for another 6 months, produce average weight losses of up to 8 kg at 1 year, losses which are greater than those resulting from usual care. Comparable 1-year weight losses have been observed in treatment comparison studies of comprehensive lifestyle interventions, which did not include a usual care group.

Strength of evidence: Moderate

ES2c. (Long-Term Weight Loss). Comprehensive lifestyle interventions which, after the first year, continue to provide bimonthly or more frequent intervention contacts, are associated with gradual weight regain of 1 to 2 kg per year (on average) from the weight loss achieved at 6 to 12 months. Long-term (>1 year) weight losses, however, remain larger than those associated with usual care. Comparable findings have been observed in treatment comparison studies of comprehensive lifestyle interventions, which did not include a usual care group.

Strength of evidence: High

Rationale: The three preceding evidence statements are based on findings from 10 RCTs that compared a high-intensity, comprehensive lifestyle intervention, delivered onsite, with a usual care or minimal treatment control group; i.e., characterized by the limited provision of advice or educational materials (several trials have two or more publications, which represent followup evaluations) [23, 49, 202, 205-207, 278, 280, 285, 290, 307, 309, 315, 324-326, 335]. Six were rated good quality while four were rated fair quality. Comprehensive programs were delivered by trained interventionists, in group or individual sessions, and provided recommendations on the following: Consuming a low-calorie diet (e.g., 1,200 to 1,800 kcal/day based on body weight or gender); engaging in regular physical activity (e.g., 180 minutes per week of brisk walking); and using behavioral strategies to achieve these recommendations; e.g., self-monitoring, goal-setting, problemsolving. High-intensity interventions were defined as providing a minimum of 14 sessions during the first 6 months; i.e., the period of short-term weight loss. Most interventions provided weekly sessions for the first 3 months and, often, at least every-other-week sessions from months 4 to 6. From months 7 to 12 (i.e., the period of intermediate weight loss), several of the high-intensity interventions continued to provide onsite counseling, with the frequency of contact varying from 3 sessions per month to only 1 session every 2 months. After 1 year (i.e., the period defined as long-term weight loss), the frequency of onsite contact provided in some of these high-intensity trials ranged from monthly to every other month. Some of these trials provided additional contacts by telephone or e-mail.
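The definition of a high-intensity intervention given above (a minimum of 14 sessions in the first 6 months) can be sketched as a threshold check. The names below are hypothetical, and the example schedule assumes, for illustration only, that each 3-month block is approximately 13 weeks long.

```python
HIGH_INTENSITY_MIN_SESSIONS = 14  # minimum stated in the text for months 1 to 6


def is_high_intensity(sessions_in_first_6_months):
    """Return True if the session count meets the high-intensity definition."""
    return sessions_in_first_6_months >= HIGH_INTENSITY_MIN_SESSIONS


# Illustrative schedule (assumption: about 13 weeks per 3-month block):
# weekly sessions in months 1 to 3, every-other-week sessions in months 4 to 6.
weeks_per_block = 13
weekly_sessions = weeks_per_block          # 13 sessions
biweekly_sessions = weeks_per_block // 2   # 6 sessions
total_sessions = weekly_sessions + biweekly_sessions

print(total_sessions)                     # 19
print(is_high_intensity(total_sessions))  # True
```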

Short term. Six of the 10 studies [324, 325] reported weight losses at 6 months. Four of these six trials reported mean losses of approximately 6 to 8 kg. Two other studies reported mean losses of 4.4 kg and 3.0 kg, respectively [324, 325]. The last trial [325] differed from the other eight in not prescribing a specific calorie target or requiring participants to record their food intake (although a curriculum of lifestyle modification was provided). In these 6 studies, the difference in weight loss between the intervention and control groups (i.e., net of control) at 6 months ranged from 3.2 kg to 8.6 kg (in favor of the intervention in all studies).

Intermediate term. Three of the 10 RCTs reported weight loss at 12 months [49, 205, 278, 307, 335]. All three trials reported a mean loss of approximately 7 to 8.5 kg at this time (equal to an approximately 7- to 8.5-percent reduction in initial weight). Two studies found that 61 percent and 68 percent of intervention participants lost ≥5 percent of initial weight, compared with 16 percent and 14 percent, respectively, in the usual care groups [49, 205, 335]. The third study reported that 50 percent of intervention participants lost ≥7 percent of their initial body weight [278, 307]. Similar weight loss outcomes were achieved despite differences in the use of individual [278] versus group [49, 205] sessions and different frequencies of onsite contact during months 7 to 12, ranging from three times monthly [49] to once every other month [278]. The largest weight losses at 12 months were reported in a trial that included the use of meal replacement products during the first 4 months (e.g., liquid shakes and snack bars) to replace two meals and one snack daily [49]. (Participants continued to replace one meal or snack daily thereafter.) In these three studies, the difference in weight loss between intervention and control groups at 12 months ranged from 4.3 to 7.9 kg (in favor of the intervention in all studies). These data suggest that, at 1 year, overweight/obese individuals can maintain approximately the full amount of their initial weight loss (achieved in the first 6 months) when provided continuing onsite lifestyle counseling from months 7 to 12. On average, participants typically do not lose additional weight from 7 to 12 months.
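Several of the results above are reported as the proportion of participants who lost at least 5 percent of initial weight. A minimal sketch of how such a proportion can be computed follows; the function name and the participant weights are hypothetical and do not come from any of the cited trials.

```python
def percent_losing_at_least(initial_kg, final_kg, threshold_percent=5.0):
    """Return the percentage of participants whose weight loss, expressed as
    a percent of initial weight, meets or exceeds threshold_percent."""
    if not initial_kg or len(initial_kg) != len(final_kg):
        raise ValueError("need two non-empty lists of equal length")
    responders = sum(
        1
        for start, end in zip(initial_kg, final_kg)
        if 100.0 * (start - end) / start >= threshold_percent
    )
    return 100.0 * responders / len(initial_kg)


# Made-up weights for four participants (kg).
initial = [100.0, 90.0, 80.0, 110.0]
final = [94.0, 88.0, 75.0, 109.0]
print(percent_losing_at_least(initial, final))  # 50.0
```

Here two of the four hypothetical participants (losses of 6 percent and 6.25 percent) meet the 5-percent threshold.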

Long term. Eight of the 10 RCTs [23, 49, 205, 206, 278, 280, 307, 309, 315, 324, 325] reported weight losses beyond 1 year, ranging from 18 months [325] to 4 years [23]. Mean weight losses ranged from 0.2 to 5.6 kg. The difference between intervention and control groups ranged from a mean of 2.0 to 5.5 kg (in favor of intervention in all studies). In all trials that reported weight losses for two or more followup assessment periods, weight losses were consistently smaller at long-term followup than at the 6- or 12-month assessments. Thus, even in studies in which participants were provided continued onsite counseling after the first year, overweight/obese individuals were not able to maintain their maximum weight loss achieved in the first 6 to 12 months. Weight regain averaged approximately 1 to 2 kg per year. For example, in a 4-year study [23, 49], participants lost a mean of 8.6 kg at 12 months and maintained a loss of 4.7 kg at 48 months. These findings indicate that further research is needed on facilitating maintenance of lost weight as discussed in section Q (below), Gaps in Evidence and Future Research Needs. They also underscore that overweight/obese individuals need to continue to participate in a weight loss maintenance program following initial weight loss in the first 6 to 12 months.
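The regain rate in the 4-year example above can be worked through as follows. The function name is illustrative; the inputs are the mean losses reported in the text (8.6 kg at 12 months and 4.7 kg at 48 months).

```python
def mean_regain_per_year(loss_early_kg, loss_late_kg, month_early, month_late):
    """Return the average weight regain in kg per year between two assessments."""
    years = (month_late - month_early) / 12.0
    if years <= 0:
        raise ValueError("the later assessment must follow the earlier one")
    return (loss_early_kg - loss_late_kg) / years


rate = mean_regain_per_year(8.6, 4.7, 12, 48)
print(round(rate, 1))  # 1.3
```

A regain of 3.9 kg over 3 years corresponds to about 1.3 kg per year, which falls within the 1 to 2 kg per year range cited above.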

Comparative treatment studies. Five additional RCTs [279, 284, 302, 304, 305] were identified that compared a high-intensity, comprehensive lifestyle intervention, delivered onsite, with other high-intensity, onsite interventions that varied the physical activity or behavior therapy component (usually by intensifying it). One study was rated good quality; four were rated fair quality. These five studies did not include a usual care or minimal-treatment control group. However, they provide additional estimates of the efficacy of high-intensity, onsite interventions. Trials that compared different dietary interventions, prescribed as part of comprehensive lifestyle interventions, are discussed in CQ3.

Four of the five studies used a calorie-restricted diet and behavior therapy while varying the physical activity component, typically by increasing the duration and/or intensity of prescribed activity [302, 304, 305]. In three studies [302, 304, 305], in which one group of participants received a conventional activity prescription (such as expending 1,000 kcal/week), participants achieved average weight losses of 8.1 to 8.3 kg at 6 months. Two such studies reported 12-month losses, which ranged from 6.1 to 6.3 kg [302, 305]. Two trials reported 18-month mean losses, which ranged from 4.1 to 5.8 kg [304, 305]. These short- and long-term weight losses are comparable to those described above for high-intensity, onsite interventions compared with usual care. One [305] of two trials that prescribed a higher dose of physical activity (i.e., expending 2,500 kcal/week) observed significantly greater weight loss at 18 months compared with the conventional dose of activity (i.e., expending 1,000 kcal/week); the other trial [302, 303] did not observe a significant difference in short- or long-term weight loss for higher versus lower doses of physical activity. Post hoc analyses (which are not shown in Summary tables 4.3a-n) performed in three studies revealed that higher (participant-reported) long-term levels of physical activity were associated with larger long-term weight losses (independent of participants' original assignment to physical activity groups) [284, 302, 304].

A fifth study [279] examined the addition of motivational interviewing to a high-intensity lifestyle intervention. Motivational interviewing is designed to help participants resolve ambivalent feelings they may have about changing their behavior. All participants had type 2 diabetes. Those in the traditional high-intensity intervention lost 3.1, 2.7, and 1.7 kg at 6, 12, and 18 months, respectively. The addition of motivational interviewing increased weight loss significantly at each time point by 1.6 to 2.1 kg.

H Efficacy/Effectiveness of Electronically Delivered, Comprehensive Interventions in Achieving Weight Loss—Summary Tables 4.3a-n

ES3. Electronically delivered, comprehensive weight loss interventions developed in academic settings, which include frequent self-monitoring of weight, food intake, and physical activity, as well as personalized feedback from a trained interventionist, can produce weight loss of up to 5 kg at 6 to 12 months, a loss that is greater than that resulting from no or minimal intervention (i.e., primarily knowledge based) offered on the Internet or in print.

Strength of evidence: Moderate

Rationale: Summary tables 4.3a-n present 13 randomized trials in which weight loss interventions, or weight loss maintenance interventions, developed primarily in academic settings, were delivered by one or more of the following methods: Counseling via e-mail, text messaging, interactive Web sites, and automated phone calls. Seven of the trials were rated good quality; six were rated fair quality. During initial weight loss, nine of these trials used interactive Web sites, e-mail, or text messaging [203, 283, 286, 295, 297, 301, 318, 328, 329]. One of these studies [203] compared group sessions held onsite (face to face) to group sessions held virtually using a Web site (with a chat room). As in other parts of this report, consideration of initial weight loss interventions has been separated from consideration of weight loss maintenance interventions. For this reason, four studies listed in Summary tables 4.3a-n [281, 288, 298, 327] are not discussed here but are reviewed in the weight loss maintenance section of this report (section M below).

Three RCTs compared an Internet-delivered program, which included personalized feedback (from a trained interventionist) by e-mail, with information-only control groups [301, 318, 328]. Hunter et al. [301] reported a significantly greater 6-month weight loss in intervention versus control participants (−1.5 kg vs. +0.5 kg) and that significantly more participants in the former group lost ≥5 percent of initial weight (22.6 percent vs. 6.8 percent). Morgan et al. [318] reported mean 6-month losses of 5.3 and 3.5 kg for intervention and control groups, respectively, which did not differ significantly. Tate et al. [328] compared a static Internet program (which included a Web site with a tutorial on weight loss and a message board, plus weekly nonpersonalized e-mails with weight loss tips and reminders) with a more intensive intervention that included all of the features of the basic program, plus regular e-mail communications with a trained interventionist and entry of food diaries on the Web site. At 12 months, the intensive Internet program produced significantly greater weight loss than the static program (4.4 kg vs. 2.0 kg). A fourth trial [297] compared a no-treatment control condition to a weight loss intervention delivered by text message and a Web site. Weight loss at 12 months did not differ significantly between the control and intervention participants (0.7 and 3.1 kg, respectively).

Five additional randomized trials compared various combinations of electronically delivered weight loss interventions without also including a no-treatment or minimal-treatment control condition [203, 283, 286, 295, 329]. A primary finding of these trials is that interventions that are more intensive (that is, have more frequent contacts and provide personalized feedback to participants) tend to be more effective in achieving weight loss. One of these trials [295] compared a commercial, Internet-based intervention program to a more intensive Internet- and e-mail-based intervention developed for this project. The primary difference between these two interventions was that the more intensive program included weekly advice and personalized feedback from a professional health counselor over a 6-month initial intervention period. Mean weight loss at 12 months for participants in the commercial program was significantly less than the mean weight loss for the more intensive personalized program (2.6 kg vs. 5.1 kg). In another trial [283] comparing the same commercial Internet-based program to an intervention that included a detailed weight loss manual and occasional individual meetings with a weight loss counselor, participants assigned to the in-person program had significantly greater mean weight loss at 12 months than those assigned to the Internet program (4.0 kg vs. 1.1 kg).

Tate et al. [329] randomly assigned participants to intervention programs at three levels of intensity. The lowest intensity program included the following: A 1-hour group session; exercise advice; a personalized calorie target; instruction on using structured meals and meal replacements; a recommendation for using two meal replacements per day; a week's supply of meal replacements; coupons for discounts on future purchases of meal replacements; and encouragement to use an interactive weight loss Web site that included self-monitoring, e-mail prompts to report weight weekly, weekly e-mail tips for weight loss, and an e-mail social support system. The more intensive intervention programs included more e-mail and telephone prompts plus either automated feedback on progress (midlevel of intervention intensity) or weekly feedback from a human counselor (highest level of intensity). An ITT analysis at 6 months did not find a significant difference in mean weight loss between the groups but did find a significant difference in the proportion of participants in each treatment group that achieved a 5-percent or greater weight loss: 27 percent in the lowest intensity group, 34 percent in the midlevel intensity group, and 52 percent in the highest intensity group. One trial [203] compared the same weight loss program delivered in three ways: In person by a professional counselor; via the Internet; and a hybrid intervention that included both in-person and Internet contacts. In an ITT analysis, the in-person intervention resulted in significantly more weight loss at 6 months compared to the Internet-only or combined conditions (7.6 kg, 5.5 kg, and 5.7 kg, respectively).

One trial identified for this review as using electronic intervention methods [286] has not been discussed here because the intervention program was not described in enough detail to make comparisons to the other studies.

I Efficacy/Effectiveness of Comprehensive, Telephone-Delivered Lifestyle Interventions in Achieving Weight Loss—Summary Tables 4.4a-b

ES4. In comprehensive lifestyle interventions that are delivered by telephone or face-to-face counseling and that also include the use of either commercially prepared, prepackaged meals or an interactive Web-based program, the telephone and face-to-face-delivered interventions produced similar mean net weight losses of approximately 5 kg at 6 months and 24 months, compared with a usual care control group.

Strength of evidence: Low

Rationale: Three trials [204, 206, 320] have compared the efficacy of providing behavioral counseling onsite (in person) or by telephone. Two were rated good quality; one was rated fair. Rock et al. [204] assigned obese women to a control condition or one of two active interventions. Participants in both interventions received prepackaged meals as well as access to a Web site providing weight loss advice and a message board for communicating with interventionists and other participants. Participants in both interventions also were offered weekly counseling contacts with a trained interventionist throughout the 2-year study. In one group, contacts occurred in person (onsite), while in the other group counseling was delivered by telephone. At 12 months, the two interventions achieved mean losses of 10.1 and 8.5 kg, respectively, with no significant differences between groups; both interventions were superior to the control group, in which participants achieved a mean loss of 2.6 kg. A similar pattern of results was observed at 24 months.

Appel et al. [206] randomly assigned participants to either a control condition in which they received only physician advice to lose weight (usual care) or to one of two active intervention conditions: In person or remote. Participants in both intervention conditions received the following: Physician advice to lose weight; encouragement to use a project weight loss Web site that included learning modules, opportunities for self-monitoring weight/calorie intake/exercise, and feedback on progress in these key behaviors; and e-mail prompts to check into the Web site every week. In the “in-person” condition, participants were offered a series of onsite group and individual sessions conducted by a trained interventionist. In the “remote” condition, participants were offered individual counseling via telephone. The number of counselor contacts was the same in both conditions. At 24 months, participants in the two intervention groups lost an average of 4.6 and 5.1 kg, respectively; these losses did not differ significantly from one another but were superior to the control condition, where participants lost an average of 0.8 kg. A third study [320] examined the use of a telephone-based intervention to facilitate the maintenance of lost weight. This study is discussed in section M (below), Efficacy/Effectiveness of Comprehensive Lifestyle Interventions in Maintaining Lost Weight.
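The net weight losses cited in this section are simple differences between the mean loss in an intervention arm and the mean loss in the control arm. The following Python sketch is illustrative only and is not part of the panel's analysis. It uses the mean losses reported above for Rock et al. [204] at 12 months and Appel et al. [206] at 24 months, in the order the text gives them; the function names and the dictionary layout are assumptions introduced for this example.

```python
def net_of_control(intervention_loss_kg, control_loss_kg):
    """Mean loss in an intervention arm minus mean loss in the control arm (kg)."""
    # Rounded to one decimal place, matching the precision reported in the text.
    return round(intervention_loss_kg - control_loss_kg, 1)


# Mean weight losses (kg) as reported in the text above.
# The structure of this dictionary is hypothetical, chosen for illustration.
TRIALS = {
    "Rock et al. [204], 12 months": {
        "control": 2.6,
        "arms": {"in person": 10.1, "telephone": 8.5},
    },
    "Appel et al. [206], 24 months": {
        "control": 0.8,
        "arms": {"in person": 4.6, "remote": 5.1},
    },
}


def summarize(trials):
    """Return the net-of-control loss for every arm of every trial."""
    return {
        name: {
            arm: net_of_control(loss, data["control"])
            for arm, loss in data["arms"].items()
        }
        for name, data in trials.items()
    }


if __name__ == "__main__":
    for name, arms in summarize(TRIALS).items():
        for arm, net in arms.items():
            print(f"{name}, {arm}: {net} kg net of control")
```

These differences describe group means only; they say nothing about statistical significance, which the trials assessed separately.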

J Efficacy/Effectiveness of Comprehensive Weight Loss Programs in Patients Within a Primary Care Practice Setting Compared With Usual Care—Summary Tables 4.5a-f

ES5. In studies to date, low- to moderate-intensity lifestyle interventions for weight loss provided to overweight or obese adults by primary care practices alone have not been shown to be effective.

Strength of evidence: High

Rationale: Studies of comprehensive lifestyle interventions in the primary care setting were included if the intervention was delivered within the context of the primary care practice and used members of the primary care team to deliver the intervention. Four studies were identified that met these criteria [208, 287, 313, 331]. Two were rated good quality; two were rated fair quality. Members of the primary care team could include physicians, nurse practitioners, physician assistants, nurses, or medical assistants. In all studies included in Summary tables 4.5a-f, members of the primary care team responsible for intervention delivery were trained by the research team to provide the intervention. Participants included in these studies were broadly representative of obese patients within a general primary care practice although in some instances they specifically included those with CVD risk factors or diabetes [287, 331]. The interventions tested were generally of low or moderate intensity, with treatment contacts occurring on average less than twice per month. Typically, intervention encounters were conducted onsite (face to face), by telephone, or a combination of the two. Across several different practice settings and intervention strategies, researchers reported similar findings, with negligible to modest weight losses compared to usual care. At 12 months, weight losses in the intervention group were ≤1.1 kg greater than the control group; at 24 months, weight losses were ≤1.2 kg greater than the control group. An exception to this conclusion was observed in the 2011 study by Wadden et al. [208], which included a third treatment arm that added either meal replacements or a weight loss medication (orlistat or sibutramine) to lifestyle counseling. Participants who received this intervention lost 2.7 kg more than control participants. However, the removal of sibutramine from the market limits the clinical significance of this finding.

K Efficacy/Effectiveness of Commercial-Based, Comprehensive Lifestyle Interventions in Achieving Weight Loss—Summary Table 4.6a

ES6. Commercial-based, comprehensive weight loss interventions that are delivered in person have been shown to induce an average weight loss of 4.8 to 6.6 kg at 6 months in two trials when conventional foods are consumed and 6.6 to 10.1 kg at 12 months in two trials with provision of prepared food, losses that are greater than those produced by minimal-treatment control interventions.

Strength of evidence: Low

Rationale: Four RCTs were identified that compared commercial-based weight loss programs with a minimal-intervention control group [204, 299, 323, 332]. All four were rated fair quality. Two were electronically delivered interventions, one was onsite, and the other involved a mixture of delivery methods. Trials of commercial-based interventions were included if they assessed comprehensive weight loss programs delivered in a manner consistent with the usual business practice for that commercial program. Therefore, the studies reviewed replicated the usual program delivery provided to paying customers. However, the issue of payment raises the only notable difference between the studies included in the summary table and typical commercial practice. In all instances, study volunteers received services and food for free as a part of the study protocol rather than paying for the program out of pocket.

All four studies identified included an onsite (face to face) counseling intervention to deliver the educational and behavioral components of the program [204, 299, 323, 332]. One of the four studies [204] also included a telephone counseling intervention to provide the same content as the face-to-face program. Two [299, 332] of the four studies used commercial programs that employed trained peer counselors to deliver face-to-face counseling in group settings. These peer counselors are typically individuals who have demonstrated long-term, ongoing success in the program but may not have any specific professional degrees or certifications in behavior change, nutrition, physical fitness, or other health-related fields.

In all four studies, the intervention groups lost significantly more weight than the control groups. In the Rock et al. [323] study, maximal weight loss was achieved at 6 months (7.2 kg), and the weight change in the second 6 months of the study was a weight gain of 0.6 kg. In the longer study by Rock et al. [204], maximum weight loss was not reached until month 12 (10.1 and 8.5 kg for in-person and telephone interventions, respectively). In the following year, participants regained approximately 2.3 to 2.7 kg. In the two studies that employed peer counseling to deliver the group-based intervention, weight loss at 6 months ranged from 4.8 to 6.6 kg [299, 332]. Treatment and followup of the originally randomized groups was continued in the Heshka et al. [299] study for a total of 2 years. The 6-month weight loss represented the maximum weight loss achieved. At 1 and 2 years, the weight loss was 4.3 and 2.9 kg, respectively.

Two of the four studies [204, 323] provided prepackaged foods to participants and achieved weight losses that were greater at 6 months (7.2 to 9.2 kg) than the other two studies [299, 332] of similar intervention length that did not provide food, in which 6-month weight losses ranged from 4.8 to 6.6 kg. The differences persisted at 12 and 24 months. Only two studies [204, 300] reported weight losses at 24 months. The study that provided food [204] reported approximately twice the amount of weight loss at 24 months (5.4 kg, net of control) compared to the study that did not [300] (2.7 kg, net of control). The CQ4 work group noted that the four trials do not provide an adequate assessment of the effects of prescribing prepackaged foods versus conventional foods. The lifestyle interventions provided in the four studies differed in several important ways, in addition to the prescription of prepackaged versus conventional foods.

L Efficacy/Effectiveness of Very Low-Calorie Diets Used as Part of a Comprehensive Lifestyle Intervention in Achieving Weight Loss—Summary Tables 4.7a-c

ES7a. Comprehensive, high-intensity, onsite lifestyle interventions that include a medically supervised very low-calorie diet (VLCD) (often defined as <800 kcal/day), as provided by complete meal replacement products, produce total weight loss of approximately 14.2 to 21 kg over 11 to 14 weeks, which is larger than that produced by no intervention or a usual care control group; i.e., advice and education only.

Strength of evidence: High

ES7b. Following the cessation of a high-intensity lifestyle intervention with a medically supervised VLCD of 11 to 14 weeks, weight regain of 3.1 to 3.7 kg has been observed during the ensuing 21 to 38 weeks of nonintervention followup.

Strength of evidence: High

ES7c. The prescription of various types (resistance or aerobic training) and doses of moderate-intensity exercise training (e.g., brisk walking 135 to 250 minutes/week), delivered in conjunction with weight loss maintenance therapy, does not reduce the amount of weight regained after the cessation of the VLCD as compared with weight loss maintenance therapy alone.

Strength of evidence: Low

Rationale: The three preceding evidence statements are based on findings from four RCTs that used VLCD as part of the dietary component of the comprehensive interventions. Two were weight loss trials [189, 191], and two were weight loss maintenance trials in which participants received the VLCD before randomization [190, 192]. All four were rated fair quality.

All four studies [189-192] provided VLCD, typically in the range of 420 to 525 kcal/day, for a period of time that was generally ≤14 weeks. In each instance, the VLCD was provided under medical supervision, and participants were supplied with the diet at no cost. The VLCD was followed by a period of transitioning back to regular foods with a gradual increase in caloric intake to reach a point at which weight maintenance was achieved. Two trials [189, 191] were designed as comparisons of short-term weight loss, reporting results at the conclusion of the exclusive use of the VLCD, with followup reports of weight maintenance at 32 and 52 weeks. Weight loss ranged from 14.2 kg to 21.1 kg at the end of the VLCD (up to 14 weeks). During a period of nonintervention followup, approximately 21 to 38 weeks following completion of the VLCD, weight regain of 3.1 to 3.7 kg was observed.

Two of the four studies were designed to assess weight maintenance interventions after completing initial weight loss using a VLCD [190, 192]. After completing 6 to 8 weeks of VLCD followed by 2 to 4 weeks of LCD, participants were assigned to behavioral weight maintenance interventions with various exercise prescriptions to test the effects on weight maintenance. Both studies reported weight change at 6 to 9 months following randomization, and one study [192] (Fogelholm et al. 2000) reported weight change at 33 months post-randomization. In the short term, weight change ranged from a loss of 0.7 kg to a gain of 1.8 kg. Including specific exercise prescriptions in the weight maintenance intervention resulted in a range of weight change at 6 to 9 months of −2.7 kg to 0.3 kg net of the control intervention. At 33 months, participants regained weight, on average between 5.9 and 9.7 kg. The range of weight change for the exercise interventions was 3.5 kg to 0.2 kg less than the control weight maintenance intervention (differences not statistically significant).

M Efficacy/Effectiveness of Comprehensive Lifestyle Interventions in Maintaining Lost Weight—Summary Tables 4.8a-e

ES8a. After initial weight loss, some weight regain can be expected, on average, with greater regain observed over longer periods of time. Continued provision of a comprehensive weight loss maintenance program (onsite or by telephone), for periods of up to 2.5 years following initial weight loss, reduces weight regain as compared to the provision of minimal intervention; e.g., usual care. The optimal duration of weight loss maintenance programs has not been determined.

Strength of evidence: Moderate

Rationale: Eleven RCTs were identified that examined the maintenance of lost weight following initial weight loss during a run-in period [190, 192, 209, 281, 288, 289, 298, 310, 320, 321, 327]. Five were rated good quality; six were rated fair quality. In the first nine trials cited, participants were randomized to treatment conditions after achieving initial weight loss whereas in the last two they were randomized to conditions at the start of the run-in period, but the randomization was concealed until the maintenance phase began. Nine trials provided an onsite (i.e., face to face) intervention that often included regularly scheduled contacts and guidance on behavioral strategies for weight maintenance [190, 192, 209, 281, 289, 310, 320, 321, 327], and five trials included an electronic or telephone intervention arm [192, 281, 289, 298, 327]. Five trials provided weight change outcomes at 3 to 6 months [190, 281, 298, 310, 327], eight trials included 6- to 12-month outcomes [192, 209, 281, 288, 289, 310, 320, 327], and five trials provided weight change data at greater than 12 months [192, 281, 289, 298, 327]. Participants in all weight loss maintenance interventions were instructed to consume a reduced-calorie diet (needed to maintain their lower body weight) and to engage in high levels of physical activity (typically ≥225 minutes per week of brisk walking or similar activity).

The most compelling test of the efficacy of weight loss maintenance interventions is to randomly assign individuals who have achieved initial weight loss to either a minimal-treatment control condition or to an active maintenance intervention. Six trials with at least 1 year of post-randomization followup assessment have used this design [190, 281, 288, 320, 321, 327]. Four of these trials showed that the provision of continuing maintenance intervention, compared to a minimal treatment or no-treatment control, significantly reduced the amount of weight regain following initial weight loss [281, 320, 321, 327]. The nature of these continuing intervention contacts may be important. Two studies found that person-to-person counseling, delivered by a trained interventionist either face to face or by telephone, was associated with significantly less weight regain than automated interventions delivered via the Internet [281, 327]. Of the two of six trials with negative results, one compared no further treatment to an Internet-based maintenance intervention [288], and the other [190] compared no further treatment to two active maintenance interventions. In the Borg et al. study, one intervention focused on moderate intensity physical activity (walking), and the other focused on resistance training [190].

Four other trials compared two or more active interventions that were designed to promote weight loss maintenance [192, 209, 289, 310]. Leermakers et al. [310] randomly assigned participants who had completed an initial 6-month weight loss program to either an exercise-focused maintenance program or a weight-focused weight maintenance program. At 1 year post-randomization, the weight-focused program resulted in significantly less weight regain than the exercise-focused program. West et al. [209] compared two maintenance programs following an initial 6-month weight loss program and did not find a significant difference in weight loss maintenance between a motivation-focused intervention and a skill-focused program. Fogelholm et al. [192] compared two different levels of recommended physical activity (2 to 3 hours per week of walking versus 4 to 6 hours per week) and did not find significant differences in weight loss maintenance at 1 year post-randomization. Dale et al. [289] compared two maintenance programs among women who lost 5 percent of their initial body weight on their own in the previous 6 months. The two maintenance programs involved either a nurse support program (weekly contact as weigh-in or brief phone call) or an intensive support program (supervised exercise and professional individual counseling). Both interventions led to comparable results of an average weight loss of approximately 2 kg over 2 years of intervention. These weight loss outcomes are exceptions to the typical weight regain observed in the other included studies of weight loss maintenance interventions. It is notable that the Dale et al. study [289] provided many more participant contacts, even in the lower intensity intervention, than any of the other included weight loss maintenance interventions. One additional trial comparing weight loss maintenance interventions has not been discussed here because the outcome data were not reported as change from randomization to final followup data collection [298].

ES8b. Approximately 40 to 60 percent of overweight/obese adults who participate in a high-intensity, long-term comprehensive lifestyle intervention maintain a loss of 5 percent or more of initial body weight at 2 or more years followup (post-randomization).

Strength of evidence: Moderate

Rationale: Three large studies (presented previously in Summary tables 4.2a-d and 4.8a-e) have reported on the proportion of participants who achieved different categories of weight loss 2 years or more following the start of behavioral intervention [204, 206, 207]. These studies are presented together in the final section of Summary tables 4.8a-e. One was rated good quality; two were rated fair quality. Each of these studies used various weight loss categories and followup intervals, but in each trial a substantial proportion of participants achieved meaningful long-term weight loss; i.e., ≥5 percent of initial weight. Rock et al. [204] reported that among participants who received a combination of behavioral counseling and prepared meals, 62 percent of those who received counseling in person and 56 percent of those who received counseling by telephone achieved a 5-percent or greater weight loss at a 2-year followup. Appel et al. [206] reported that the proportion of participants maintaining at least a 5-percent reduction in body weight at a 2-year followup was 38 percent of those assigned to a telephone-based intervention and 41 percent of those assigned to an in-person-delivered intervention. In the Look AHEAD study, 46 percent of participants in the intensive lifestyle intervention achieved a loss of at least 5 percent of initial body weight at year 4.

N Characteristics of Lifestyle Intervention Delivery That May Affect Weight Loss: Intervention Intensity—Summary Tables 4.9a-f

ES9a. Moderate-intensity, onsite, comprehensive lifestyle interventions, which provide an average of one to two treatment sessions per month, typically produce mean weight losses of 2 to 4 kg in 6 to 12 months, losses that generally are greater than those produced by usual care; i.e., minimal-intervention control group.

Strength of evidence: High

Rationale: Eight RCTs were identified that compared a moderate-intensity, onsite, comprehensive lifestyle intervention with usual care or minimal treatment; i.e., characterized by the limited provision of advice or educational materials [208, 282, 292, 293, 296, 308, 312, 313, 333]. Six were rated good quality; two were rated fair quality. Moderate intensity was defined as providing 6 to 13 sessions during the first 6 months; i.e., approximately monthly to every-other-week contact. In all trials, treatment was delivered by trained interventionists in group or individual sessions, who provided recommendations for consuming a lower calorie diet, engaging in regular physical activity, and using behavioral strategies to achieve these recommendations (as described previously for high-intensity interventions).

Two of eight moderate-intensity trials reported weight loss at 6 months, at which time the intervention group was superior to usual care [208, 296]. Four trials reported mean 12-month weight losses, which ranged from 2.4 to 4.2 kg [208, 282, 308, 333]. The difference at 12 months between intervention and usual care groups ranged from 1.1 to 3.4 kg; differences were statistically significant in two studies [282, 308]. Five of eight RCTs [208, 292, 312, 313, 333] reported mean weight losses at 24 to 36 months. In two studies reporting results at 24 months, differences between intervention and control groups of approximately 0.2 and 1.2 kg were not statistically significant [208, 313]. A third study, the Finnish Diabetes Prevention Study [312, 333], reported weight loss outcomes at 24 and 36 months. At 24 months, the investigators observed a loss of 3.5 kg, with a significant net-of-control difference of 2.7 kg, and at 36 months the intervention group maintained the 3.5 kg loss, with a significant net-of-control difference of 2.6 kg. By contrast, two other studies reported 24-month weight losses of 14.0 and 15.0 kg, which were achieved through monthly intervention sessions the first year and every-other-month sessions in year 2 [292, 293]. The differences between intervention and control groups were 11.0 and 13.0 kg, respectively. The same investigative team conducted these two studies, using the same protocol. The details of the intervention for these two studies are only briefly described in the publications, which prevents an adequate understanding of the above-average results achieved. It is possible that participants in both studies had separate, monthly visits with a dietitian, exercise specialist, and psychologist, which would have increased the intensity of the intervention. The publications, however, do not provide sufficient detail to resolve this question. Similar questions arose about the intensity of the intervention used in the study by Tuomilehto et al. [333]. In summary, the preponderance of the evidence (i.e., 5 of 8 studies) indicates that moderate-intensity lifestyle interventions produce short- to long-term weight losses of up to 4 kg, although substantially larger losses have been reported in two studies [292, 293].

ES9b. Low-intensity, onsite comprehensive lifestyle interventions, which provide fewer than monthly treatment sessions, do not consistently produce weight loss when compared to usual care.

Strength of evidence: Moderate

Rationale: Two RCTs were identified that compared a low-intensity, onsite comprehensive lifestyle intervention (i.e., provided ≤5 sessions over the first year) with usual care or minimal treatment. One was rated good quality and the other fair quality. The two studies reported mean 12-month weight losses of approximately 0.1 percent to 1.9 percent of initial weight [287, 331]. The difference between intervention and usual care groups ranged from approximately 0.7 percent to 1 percent and was statistically significant in one study (ter Bogt [331]).

ES9c. When weight loss with each intervention intensity (i.e., low, moderate, high) is compared to usual care, high-intensity lifestyle interventions (≥14 sessions in 6 months) typically produce greater net-of-control weight losses than low-to-moderate-intensity interventions.

Strength of evidence: Moderate

Rationale: No RCTs of fair or good quality were identified that directly compared the effects of interventions of different intensity (i.e., low, moderate, or high intensity) on weight loss. Thus, in trying to assess the effects of intervention intensity on weight loss, the subpanel examined trials that provided a moderate-intensity intervention (i.e., 6 to 13 sessions in 6 months) compared with usual care (i.e., minimal intervention) as well as studies that evaluated low-intensity interventions (1 to 5 sessions in 6 months) compared with usual care. The mean net weight losses associated with these two treatment intensities were compared (indirectly) with the mean net weight loss produced by high-intensity, onsite interventions (14 or more sessions in 6 months), considered the gold standard in lifestyle management for obesity. As discussed in section G (above), Comprehensive Interventions Compared With Usual Care, Minimal Care, or No-Treatment Control (and shown in Summary tables 4.2a-d), net-of-control weight losses at 1 year in high-intensity interventions ranged from 4.3 to 7.9 kg, in favor of the intervention group (over usual care). As discussed in this section, net-of-control losses at 1 year in moderate-intensity interventions ranged from 1.1 to 3.0 kg, in favor of the intervention over usual care. Corresponding differences at 1 year in low-intensity interventions ranged from approximately 0.7 to 1.0 kg (values estimated from percentage change data). These comparisons suggest that high-intensity interventions produce greater net-of-control weight losses than do low- and moderate-intensity interventions.
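The session-count definitions used in this rationale can be expressed as a simple lookup. The Python sketch below is a reading aid only, not a tool used by the subpanel. It applies the thresholds stated above (1 to 5, 6 to 13, and 14 or more sessions in 6 months); the function name `classify_intensity` and the "none" label for zero sessions are assumptions introduced for this example.

```python
def classify_intensity(sessions_in_6_months):
    """Label an intervention by the number of sessions offered in 6 months.

    Thresholds follow the definitions in the text: low = 1 to 5,
    moderate = 6 to 13, high = 14 or more. The "none" label for zero
    sessions is an assumption made for this sketch.
    """
    if sessions_in_6_months < 0:
        raise ValueError("session count cannot be negative")
    if sessions_in_6_months >= 14:
        return "high"
    if sessions_in_6_months >= 6:
        return "moderate"
    if sessions_in_6_months >= 1:
        return "low"
    return "none"


if __name__ == "__main__":
    # Weekly contact for 24 weeks, monthly contact, and quarterly contact.
    for sessions in (24, 6, 2):
        print(sessions, "sessions in 6 months:", classify_intensity(sessions))
```

Twenty-four weekly sessions fall in the high-intensity category, monthly contact (6 sessions) in the moderate category, and quarterly contact (2 sessions) in the low category.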

O Characteristics of Lifestyle Intervention Delivery That May Affect Weight Loss or Weight Loss Maintenance: Individual Versus Group Treatment—Summary Tables 4.1a-c

ES10. There do not appear to be substantial differences in the size of the weight losses produced by individual- and group-based sessions in high-intensity, comprehensive lifestyle intervention delivered onsite by a trained interventionist.

Strength of evidence: Low

Rationale: No RCTs of fair or good quality were identified that compared the effectiveness of high-intensity, onsite comprehensive interventions that were delivered using individual versus group treatment sessions. Nine of the 10 high-intensity, onsite interventions reviewed in Summary tables 4.10a-j were delivered predominantly in group sessions, as reviewed elsewhere in this report (section G above) [49, 205, 206, 280, 285, 309, 315, 324-326]. Six were rated good quality; four were rated fair quality. In contrast to group-based interventions, one major trial, the Diabetes Prevention Program [307], provided weight loss induction interventions exclusively in an individual counseling format and achieved weight loss results that were similar to those achieved in group-based interventions. Although it would be helpful to conduct additional direct comparisons of group and individual counseling, it appears that they result in similar weight losses.

P Characteristics of Lifestyle Intervention Delivery That May Affect Weight Loss or Weight Loss Maintenance: Onsite Versus Electronically Delivered Interventions—Summary Tables 4.10a-j

ES11. Weight losses observed in comprehensive lifestyle interventions, which are delivered onsite by a trained interventionist in initially weekly and then biweekly group or individual sessions, are generally greater than weight losses observed in comprehensive interventions that are delivered by Internet or e-mail and include feedback from a trained interventionist.

Strength of evidence: Low

Rationale: Only one RCT, rated fair quality, has directly compared the efficacy of a high-intensity, comprehensive lifestyle intervention delivered onsite (face to face) with the same program delivered by Internet [203]. Twenty-four weekly group sessions delivered onsite produced a mean loss of 7.6 kg (at month 6) compared with a significantly smaller 5.5 kg mean loss for the same program delivered by Internet using a chat room facilitated by a trained interventionist and electronic food and activity diaries. A third treatment arm, which combined one onsite meeting per month with three electronic contacts per month, produced a mean loss of 5.7 kg.

Ten onsite, high-intensity RCTs [23, 49, 202, 205-207, 278, 280, 285, 290, 307, 309, 315, 324-326, 335] were compared with nine electronically delivered trials [203, 283, 286, 295, 297, 301, 318, 328, 329], all of which examined the induction (not maintenance) of weight loss. These studies, which are drawn from Summary tables 4.1a-c and 4.3a-n, respectively, are summarized in Summary tables 4.10a-j. Of the onsite trials, six were rated good quality; four were rated fair quality. Of the electronically delivered trials, four were rated good quality; five were rated fair quality. In all trials, an intervention group delivered by a trained interventionist was compared with a usual care, minimal-intervention group, thus allowing calculation of net-of-control differences in weight loss; i.e., intervention-usual care difference. Net-of-control differences were consistently greater in the onsite, high-intensity trials than in the electronically delivered interventions. Differences were consistent with those observed in the RCT by Harvey-Berino et al. [203], which directly compared the method of intervention delivery.

Q Recommendation 4a

Recommendation 4a. Advise overweight and obese individuals who would benefit from weight loss to participate for ≥6 months in a comprehensive lifestyle program that assists participants in adhering to a lower calorie diet and in increasing physical activity through the use of behavioral strategies.

Recommendation Grade: A (strong)

Rationale: The objective with overweight and obese patients is to produce weight loss that is clinically meaningful. This is generally considered to be approximately a 5 to 10 percent loss of initial body weight, which is associated with reductions in key cardiometabolic risk factors (e.g., blood pressure, blood glucose control, risk of diabetes). Participation for 6 months in a comprehensive, high-intensity, onsite lifestyle intervention produces a mean weight loss of approximately 5 to 8 kg (and up to 10 kg in some studies), equal to about a 5 to 8 percent reduction in initial weight [Summary table 4.2 [274, 281, 303, 322]]. In comprehensive lifestyle programs, more than 50 percent of participants can be expected to achieve a loss of 5 percent or more of initial weight [Summary table 4.2 [46, 202, 274, 303, 331]]. Most individuals achieve their maximum weight loss in the first 6 months of a comprehensive intervention. The average weight loss does not change substantially from months 6 to 12, even when participants are provided continued, although less frequent (e.g., every-other-week), treatment sessions [Summary table 4.2 [46, 202, 274, 303, 331]]. To achieve these weight losses, a comprehensive lifestyle program should include specific behavioral strategies for reducing calorie intake and increasing physical activity [Summary table 4.1]. Long-term participation in a comprehensive lifestyle intervention is recommended to prevent or slow weight regain, which is common following the cessation of weight loss interventions [20, 46].
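The 5-percent threshold described above is a proportion of initial body weight. As a minimal sketch (not part of the panel's methods), the Python example below converts a starting and current weight into a percentage loss and checks it against that threshold. The function names are assumptions, and the 100 kg starting weight is a hypothetical value chosen only because it makes a 5 to 8 kg loss equal a 5 to 8 percent loss.

```python
def percent_weight_loss(initial_kg, current_kg):
    """Weight lost as a percentage of initial body weight."""
    if initial_kg <= 0:
        raise ValueError("initial weight must be positive")
    return 100.0 * (initial_kg - current_kg) / initial_kg


def is_clinically_meaningful(initial_kg, current_kg, threshold_percent=5.0):
    """True if the loss meets the threshold (5 percent, per the text above)."""
    return percent_weight_loss(initial_kg, current_kg) >= threshold_percent


if __name__ == "__main__":
    initial = 100.0  # hypothetical starting weight in kg
    for current in (96.0, 95.0, 92.0):
        loss = percent_weight_loss(initial, current)
        print(f"{initial} kg to {current} kg: {loss:.1f} percent,",
              "meets threshold" if is_clinically_meaningful(initial, current)
              else "below threshold")
```

For a heavier or lighter patient the same loss in kilograms corresponds to a different percentage, which is why the rationale states the target as a percentage of initial weight.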

Recommendation 4b: Prescribe onsite, high-intensity (i.e., ≥14 sessions in 6 months) comprehensive weight loss interventions provided in individual or group sessions by a trained interventionist.

Recommendation Grade: A (strong)

Rationale: The studies reviewed show evidence of greater weight loss with higher frequency of contact. Intervention contact has typically been provided in group or individual, face-to-face sessions. Therefore, comprehensive lifestyle interventions that provide 14 or more face-to-face meetings over 6 months are preferred, because this intervention format consistently produces clinically meaningful weight loss [Summary table 4.2 [20, 46, 202, 203, 274, 281, 303, 311, 320-322]]. However, some initial evidence suggests that telephone-based counseling interventions can be similarly successful with high frequency of contact [Summary table 4.4 [201, 203]]. High-intensity interventions are usually provided in research or medical centers but also may be offered in community or worksite settings, as well as commercial programs.

Another important feature of the studies reviewed (that included high-intensity comprehensive weight loss programs) is the use of trained interventionists. Interventionists can have different professional backgrounds (e.g., registered dietitians, psychologists, exercise specialists, health counselors) and, in some cases, may be trained laypersons.

If a high-intensity, onsite comprehensive program is not available to the patient, a similar intervention of moderate intensity (i.e., providing 6 to 13 sessions in 6 months) can be recommended as an alternative approach. Evidence suggests that the weight losses achieved (averaging 1 to 3.5 kg in 6 to 12 months) will be lower than those produced by high-intensity programs [Summary table 4.9 [204, 278, 292, 304, 309]]. Low-intensity lifestyle interventions, as typified by quarterly counseling delivered by a primary care practitioner, are probably more appropriate for facilitating weight stability (i.e., preventing weight gain) than inducing clinically meaningful weight loss [Summary table 4.9 [283, 331]].
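
The session-count thresholds described in this rationale can be summarized in a minimal sketch. The function name is hypothetical, and labeling everything below 6 sessions as "low" is an assumption: the text characterizes low-intensity interventions only by example (quarterly counseling), not by an explicit cutoff.

```python
def classify_intensity(sessions_in_6_months: int) -> str:
    """Classify a lifestyle intervention by number of sessions in 6 months.

    high:     14 or more sessions (as defined in Recommendation 4b)
    moderate: 6 to 13 sessions
    low:      fewer than 6 sessions (assumed cutoff, see lead-in)
    """
    if sessions_in_6_months < 0:
        raise ValueError("session count cannot be negative")
    if sessions_in_6_months >= 14:
        return "high"
    if sessions_in_6_months >= 6:
        return "moderate"
    return "low"


# Quarterly counseling amounts to about 2 sessions in 6 months.
quarterly = classify_intensity(2)
```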

Recommendation 4c: Electronically delivered weight loss programs (including by telephone) that include personalized feedback from a trained interventionist can be prescribed for weight loss but may result in smaller weight loss than face-to-face interventions.

Recommendation Grade: B (moderate)

Rationale: Interventions delivered using electronic media, such as Web sites, text messaging (via telephone or the Internet), and similar methods, have great promise to reach large numbers of patients at potentially low cost. To date, relatively few controlled clinical trials of electronic media for weight loss have been published, but more can be expected soon. It remains to be seen which combination of intervention strategies and communication channels will be the most effective for helping patients lose weight and maintain their weight loss. Until additional research is available, it is advisable to anticipate that smaller weight losses will be achieved with electronic-based programs, as compared with traditional face-to-face programs [Summary table 4.10 [200, 279, 282, 291, 293, 297, 314, 324, 325]].

Recommendation 4d: Some commercial-based programs that provide a comprehensive lifestyle intervention can be prescribed as an option for weight loss, provided there is peer-reviewed published evidence of their safety and efficacy.

Recommendation Grade: B (moderate)

Rationale: A range of commercial programs may be effective in producing weight loss. The intention of this recommendation is not to endorse a specific commercial program but instead is to guide practitioners in helping patients identify which types of programs may be effective.

Characteristics of commercial programs that produce significant weight loss include providing all components of a comprehensive behavioral intervention program (including a calorie-restricted diet, a physical activity prescription, and behavioral counseling) with a high frequency of contact [Summary table 4.6 [201, 295, 319, 328]]. Because a wide range of commercial programs are available, published peer-reviewed research should be used to evaluate program components, as well as the safety and efficacy of the program.

Recommendation 4e: Use a very low calorie diet (defined as <800 kcal/day) only in limited circumstances and only when provided by trained practitioners in a medical care setting where medical monitoring and high-intensity lifestyle intervention can be provided. Medical supervision is required because of the rapid rate of weight loss and potential for health complications.

Recommendation Grade: A (strong)

Rationale: There was no direct evidence from RCTs to change the previous conclusion reached by the NIH/NHLBI Obesity Expert Panel in 1998, which stated “VLCDs produce greater initial weight loss than LCDs. However, the long-term (>1 year) weight loss is not different from that of the LCD.” (Category A) Therefore, the recommendation using VLCDs is limited to short-term use and induction of rapid weight loss. To warrant implementation of this treatment strategy, the benefits of rapid, short-term weight loss should be greater than the risks associated with VLCDs.

Beyond the short-term, the benefit of VLCDs over LCDs has not been demonstrated. The lack of a demonstrated long-term benefit, combined with higher medical risk and weight regain, suggests that there is no indication for long-term or widespread use of VLCDs [Summary table 4.7 [184-187]]. When this strategy is medically indicated for short-term weight loss induction (e.g., pre-bariatric surgical weight reduction protocol), oversight by trained medical professionals is required to manage the medical risk of the VLCD.

Recommendation 4f: Advise overweight and obese individuals who have lost weight to participate long-term (≥1 year) in a comprehensive weight loss maintenance program.

Recommendation Grade: A (strong)

Rationale: Maintaining a reduced body weight, following initial weight loss, is a long-term process.

Ongoing support and intervention are effective in slowing weight regain, which occurs in most patients following weight loss [Summary table 4.8 [185, 187, 206, 277, 284, 294, 306, 316, 317, 323]].

Rather than considering short-term weight loss in a comprehensive lifestyle intervention as the end goal, advise patients to seek a long-term weight loss maintenance intervention that includes behavioral counseling to sustain key behaviors associated with maintenance of a lower body weight. These include increased physical activity and regular self-monitoring of dietary intake and body weight.

Recommendation 4g: For weight loss maintenance, prescribe face-to-face or telephone-delivered weight loss maintenance programs that provide regular contact (monthly or more frequent) with a trained interventionist who helps participants engage in high levels of physical activity (i.e., 200-300 minutes/week), monitor body weight regularly (i.e., weekly or more frequent), and consume a reduced-calorie diet (needed to maintain lower body weight).

Recommendation Grade: A (strong)

Rationale: A weight maintenance intervention helps participants focus on maintaining several key behaviors following initial weight loss. A skilled interventionist is a key component of the weight loss maintenance intervention, providing ongoing behavioral support for 1 year or more.

A successful weight loss maintenance intervention would be expected to include a prescription for high levels of moderate-intensity physical activity, regular self-monitoring of body weight, and an appropriate calorie intake required to maintain the new lower body weight [Summary table 4.8 [185, 187, 206, 277, 284, 294, 306, 316, 317, 323]].

Weight maintenance interventions have been delivered using a variety of methods. Real-time communication with a trained interventionist, either face-to-face or via telephone, appears to lead to the best outcome (i.e., less weight regain) [Summary table 4.8 [185, 187, 206, 277, 306, 316, 317, 323]]. However, some studies have included Internet-based approaches, and this approach may be more accessible or acceptable to some patients. There is evidence for the effectiveness of this approach [Summary table 4.8 [277, 284, 294, 316, 323]].

Given the limited number of options currently available, it is most important to identify a weight loss maintenance intervention that has the requisite components and that the patient is able to access consistently to achieve the best outcome.
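
The requisite components named in Recommendations 4f and 4g can be expressed as a simple checklist. The sketch below is illustrative only: the class and field names are hypothetical, and treating "long-term" as 12 months or more follows the ≥1 year wording of Recommendation 4f.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MaintenanceProgram:
    """Hypothetical description of a weight loss maintenance program."""
    contacts_per_month: float      # contacts with a trained interventionist
    activity_min_per_week: float   # prescribed physical activity
    weigh_ins_per_week: float      # self-monitoring of body weight
    reduced_calorie_diet: bool     # diet needed to maintain lower weight
    duration_months: float         # planned length of participation


def unmet_components(p: MaintenanceProgram) -> List[str]:
    """List the recommended components that the program does not provide."""
    unmet = []
    if p.contacts_per_month < 1:
        unmet.append("contact monthly or more frequent")
    if p.activity_min_per_week < 200:
        unmet.append("physical activity of 200-300 minutes/week")
    if p.weigh_ins_per_week < 1:
        unmet.append("body weight monitored weekly or more often")
    if not p.reduced_calorie_diet:
        unmet.append("reduced-calorie diet")
    if p.duration_months < 12:
        unmet.append("participation for 1 year or more")
    return unmet


# Example: adequate contact and diet, but too little prescribed activity.
gaps = unmet_components(MaintenanceProgram(1, 150, 1, True, 12))
```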

R. Gaps in Evidence and Future Research Needs

  1. Further research is needed with onsite interventions (and those delivered by other methods) to determine the optimal frequency (and duration) of contact needed to induce clinically significant weight loss (≥5 percent of initial weight). The literature suggests that high-intensity interventions (≥14 contacts in 6 months) are more effective than moderate-intensity interventions (6 to 13 contacts in 6 months), but no RCTs have addressed this issue. Such RCTs are needed and could include efforts to match the intervention intensity to the needs of the specific patient. Also warranted is the further study of stepped-care interventions that provide overweight/obese individuals with some minimum number of contacts over 6 months, with more sessions provided only to persons who do not achieve clinically significant weight loss.
  2. More research is needed on how to effectively translate and disseminate comprehensive, high-intensity lifestyle interventions, shown to be effective in efficacy studies based in academic research centers, into programs that can be delivered in community, worksite, and other settings (including commercial programs). This includes determination of the personal characteristics, formal credentials, and training required for intervention counselors who deliver a comprehensive lifestyle intervention.
  3. RCTs are needed to identify the most effective methods of delivering lifestyle interventions remotely (e.g., Internet, mobile phone, text messaging, telephone, DVDs, or some combination of these) to achieve and maintain clinically significant weight loss (≥5 percent of initial weight). In addition, there is a need for head-to-head comparisons to evaluate relative effectiveness and associated costs of delivery of onsite, remote, and hybrid lifestyle interventions for achieving weight loss and health improvements.
  4. More research is needed to better understand how to promote additional weight loss beyond the first 6 months, at which time weight loss plateaus in most individuals. Is the cessation in weight loss at this time, when many individuals still remain overweight or obese, due to a decline in adherence to behavioral weight loss interventions, or is it related to physiologic changes that occur with prolonged energy deficit, or some combination of the two? Examination of these issues could identify methods to extend weight loss, with lifestyle intervention, beyond 6 months (and beyond the average loss of approximately 8 kg achieved at this time).
  5. Methods of improving the maintenance of lost weight also require additional study. This includes determining whether overweight/obese individuals require continuous, long-term treatment (i.e., as provided by indefinite participation in a weight loss maintenance program) or if they can be successful with periodic bouts of intervention in response to weight regain (or the desire for further weight loss). The use of new technologies (e.g., mobile phones) and therapies (e.g., motivational interviewing or acceptance and commitment therapy), following weight loss, also should be examined to determine whether they improve the maintenance of lost weight. Studies are needed to assess the efficacy of long-term (2 to 5 years) weight loss maintenance interventions.
  6. Further research is needed to identify the optimal role of PCPs in managing obesity using comprehensive lifestyle intervention. Options range from serving as trained interventionists, as supported by new regulations from the Centers for Medicare & Medicaid Services, to referring patients to appropriate lifestyle intervention programs or practitioners and checking weight management progress at regular intervals. Economic analyses are needed of different models involving PCPs in the management of obesity.
  7. Further study is needed on the effect of weight loss treatment on health care utilization and cost. Observational data suggest that weight loss for some groups of patients, such as older individuals with type 2 diabetes, would have a substantial effect on health and health care utilization.
  8. Further research is needed on the effects of weight loss for some key populations, including older adults and ethnic minority groups. The overall safety of weight loss interventions for patients aged 65 and older remains controversial. Although older participants tend to respond well to comprehensive behavioral weight loss treatments and they experience the same improvements in CVD risk factors as do middle-aged participants, the effect of weight loss treatment on risk for CVD, longevity, and osteoporosis has not been extensively studied. More studies on the health consequences of weight loss treatment with this age group are needed. Additionally, individuals from ethnic minority populations in the United States typically have less mean weight loss when provided the same intervention as non-Hispanic Whites. This difference has been observed over a number of different types of comprehensive lifestyle interventions (e.g., group, individual, electronic). Further research is needed to understand the most appropriate strategies and prescriptions for those who may systematically lose less weight in response to a standard, comprehensive behavioral intervention.

Section 7: Critical Question 5

A Statement of the Question

CQ5 has three parts:

Efficacy

What are the long-term effects of the following surgical procedures on weight loss, weight loss maintenance, cardiovascular (CV) risk factors, related comorbidities, and mortality?

  1. Laparoscopic adjustable gastric banding (LAGB)
  2. Laparoscopic Roux-en-Y gastric bypass (RYGB)
  3. Open RYGB
  4. Biliopancreatic diversion (BPD) with or without duodenal switch
  5. Sleeve gastrectomy (SG)

What are the long-term effects of the surgical procedures (listed above) in patients with different body mass indices (BMIs) and comorbidities?

  1. BMI <35
  2. BMI of 35 to <40 with no comorbidities
  3. BMI ≥35 with comorbidities
  4. BMI ≥40 with no comorbidities

A.1 Predictors

What are the predictors associated with long-term effects of the following surgical procedures on weight loss, weight loss maintenance, CV risk factors, related comorbidities, and mortality?

  1. LAGB
  2. Laparoscopic RYGB
  3. Open RYGB
  4. BPD with or without duodenal switch
  5. SG

What are the predictors associated with long-term effects of the surgical procedures (listed above) in patients with different BMIs and comorbidities?

  1. BMI <35
  2. BMI of 35 to <40 with no comorbidities
  3. BMI ≥35 with comorbidities
  4. BMI ≥40 with no comorbidities
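
The four BMI/comorbidity strata listed above partition all patients, which the following minimal sketch makes explicit. The function name and the boolean comorbidity flag are illustrative assumptions; the report does not define how comorbidity status is coded.

```python
def cq5_stratum(bmi: float, has_comorbidities: bool) -> str:
    """Assign a patient to one of the four CQ5 BMI/comorbidity strata."""
    if bmi <= 0:
        raise ValueError("bmi must be positive")
    if bmi < 35:
        # Stratum 1 does not depend on comorbidity status.
        return "BMI <35"
    if has_comorbidities:
        return "BMI >=35 with comorbidities"
    if bmi < 40:
        return "BMI of 35 to <40 with no comorbidities"
    return "BMI >=40 with no comorbidities"


# Hypothetical patients.
example_a = cq5_stratum(37.0, has_comorbidities=False)
example_b = cq5_stratum(42.0, has_comorbidities=True)
```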

A.2 Complications

What are the short-term (less than 30 days) and long-term (30 days or more) complications of the following bariatric surgical procedures? What are the predictors associated with complications?

  1. LAGB
  2. Laparoscopic RYGB
  3. Open RYGB
  4. BPD with or without duodenal switch
  5. SG

What are the complications of the surgical procedures (listed above) in patients with different BMIs and comorbidities?

  1. BMI <35
  2. BMI of 35 to <40 with no comorbidities
  3. BMI ≥35 with comorbidities
  4. BMI ≥40 with no comorbidities

A.2.1 a. Subgroup Analyses

By Population Subgroups

  • Age (especially >65 years of age)
  • Sex
  • Socioeconomic status (no evidence anticipated)
  • Race/ethnicity
  • Baseline BMI <35; BMI of 35 to <40 with no comorbidities; BMI ≥35 with comorbidities; BMI ≥40 with no comorbidities; by different comorbidities
  • Presence or absence of comorbid conditions

    • Diabetes
    • Metabolic syndrome
    • Chronic kidney disease
    • Nonalcoholic steatohepatitis and liver disease
    • Cancer
    • Sleep apnea
    • Skeletal disability
    • Genetic syndromes (e.g., Prader-Willi)
    • Psychiatric disorders (depression, psychosis, mental retardation, addiction, borderline personality disorder)
    • Quality-of-life issues
    • Multiple (≥2) risk factors (that do not constitute metabolic syndrome)
    • Diagnosed CHD/CVD
  • Presence or absence of CVD risk factors (diagnosed or treated)

    • Smoking status
    • Multiple (≥2) risk factors (that do not constitute metabolic syndrome)
    • Baseline (not necessarily pretreatment) low-density lipoprotein cholesterol (LDL-C) ≥100 mg/dL
    • Triglycerides (TG) ≥200 mg/dL
    • High-density lipoprotein cholesterol (HDL-C) ≤40 mg/dL
    • Hypertension
    • Elevated fasting insulin, fasting glucose, HbA1c
    • Previous CVD event
    • Elevated C-reactive protein (CRP)
    • Diagnosed CVD/CHD (acute coronary syndrome; CAD; CHF; history of MI; angina with objective evidence of atherosclerotic CHD; history of coronary revascularization (angioplasty or bypass); cerebrovascular disease; other forms of atherosclerotic CVD (e.g., peripheral artery disease))

By Amount of Weight Loss

  • Different cutpoints

By Weight Loss Maintenance

  • Different cutpoints

Note: Predictors will address patient factors, provider factors, and procedure (surgery) factors. Patient factors include BMI, age, comorbidities, and functional status and can include multiple risk factors assessed.

B Selection of the Inclusion/Exclusion Criteria

Panel members developed eligibility criteria, based on a population, intervention/exposure, comparison group, outcome, time, and setting (PICOTS) approach, to use for screening potential studies for inclusion in the evidence review. Table 6 presents the details of the PICOTS approach for CQ5. Studies considered included randomized controlled trials (RCTs), non-RCTs, prospective cohort studies, retrospective cohort studies, case cohort studies, case control studies, nested case control studies, case-crossover studies, interrupted time series studies, before-after studies, time series studies, and case series. For the predictors and complications components of CQ5, observational studies with 10 or more years of followup, and studies of BPD or SG procedures, were included if the sample size was ≥100. Other observational studies were included if the sample size was ≥500. Due to time and resource constraints, the panel was not able to conduct all the subgroup analyses originally planned (e.g., race/ethnicity as predictors; nonalcoholic fatty liver disease or sleep apnea as outcome measures).

Table 6. Criteria for Selection of Publications for CQ5

C Introduction and Rationale for Question and the Inclusion/Exclusion Criteria

Extreme obesity, also known as Class III obesity, is prevalent in the U.S. population, with 8.1 percent of women and 4.4 percent of men having a BMI of 40 or above [7]. Among some racial and ethnic minority populations, extreme obesity is even more common. For example, 17.8 percent of non-Hispanic African American women have a BMI ≥40 [7]. Patients with extreme obesity have a high prevalence not only of complications such as CVD and type 2 diabetes but also of nonalcoholic fatty liver disease, joint disease, sleep apnea, and thromboembolic disease [337]. Patients with extreme obesity also have a substantially elevated mortality risk [338].

Many, if not most, patients with extreme obesity have tried to lose weight numerous times. Some have lost substantial amounts of weight successfully, only to regain it. Although lifestyle intervention is the mainstay of all weight management treatment, there is increasing recognition of the need for adjunctive treatments for patients with obesity who are at high medical risk and who are unable to achieve or maintain sufficient weight loss to improve their health. Bariatric surgery is one treatment option that has been increasingly used in patients with extreme obesity or with lesser degrees of obesity but with obesity-related comorbid conditions. Since the 1998 overweight and obesity clinical guidelines were published [5], there have been new bariatric procedures, devices, and surgical approaches introduced as well as additional data on short-term and longer term benefits and risks. This section reviews evidence about the efficacy of bariatric surgery for weight loss and improvement in health and quality of life, which patient or procedural factors influence outcomes, and what short- and long-term complications can be expected. With these data, primary care providers (PCPs) can better advise their patients about the risks and benefits of bariatric surgery compared with other treatment approaches.

Bariatric surgery is, by definition, invasive and has inherent short-term risks as well as adverse effects that may only become apparent during long-term followup. Incurring these risks may be acceptable if health benefits are sustained over time. Therefore, the work group members believe that evaluation of efficacy end points for weight loss and change in CVD risk factors and other health outcomes requires studies with a minimum post-surgical followup of 2 years and inclusion of a nonsurgical comparator group. Studies evaluating predictors of weight change or medical outcomes, including patient factors (e.g., presence vs. absence of diabetes) or surgical factors (e.g., RYGB vs. BPD), require direct comparison of these factors plus a minimum 2-year followup. Studies evaluating complications of bariatric surgery require at least a 30-day post-surgical followup. For observational studies with 10 or more years of followup or for studies on BPD or SG procedures, the work group agreed to require a sample size ≥100 and for all other observational studies to require a sample size ≥500. This sample size requirement was instituted because the most important complications are infrequent (e.g., perioperative mortality rates are <1 percent) so that smaller studies could give inaccurate estimates of complication rates.
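
The followup and sample size thresholds described above can be sketched as a small eligibility check. This is a simplified illustration, not the panel's screening tool: the names are hypothetical, 2 years is approximated as 730 days, and the other PICOTS criteria (such as the nonsurgical comparator required for efficacy studies) are not modeled.

```python
# Minimum post-surgical followup, in days, for each CQ5 component.
# Assumption: "2 years" is approximated as 730 days.
MIN_FOLLOWUP_DAYS = {"efficacy": 730, "predictors": 730, "complications": 30}


def meets_followup(component: str, followup_days: float) -> bool:
    """Check the minimum followup for a CQ5 component."""
    return followup_days >= MIN_FOLLOWUP_DAYS[component]


def min_sample_size(followup_years: float, procedure: str) -> int:
    """Minimum sample size for an observational study."""
    if followup_years >= 10 or procedure in {"BPD", "SG"}:
        return 100
    return 500


def meets_sample_size(n: int, followup_years: float, procedure: str) -> bool:
    """Check an observational study against the sample size requirement."""
    return n >= min_sample_size(followup_years, procedure)


# Hypothetical observational study: 250 RYGB patients followed for 3 years.
eligible = meets_sample_size(250, 3, "RYGB")
```

Under these rules the hypothetical RYGB study would fall short of the 500-patient minimum, whereas the same study of SG patients, or one with 10 years of followup, would need only 100 patients.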

C.1 Bariatric Surgical Procedures

C.1.1 a. Classifications/Mechanisms of Action

In the past, bariatric surgical procedures have been classified as restrictive, in which a small gastric pouch is created, thereby limiting the amount of food that can be ingested; malabsorptive; or a combination of the two. It is now clear that, while gastric constriction (which limits food intake) and malabsorption may be components of bariatric surgical procedures, the mechanisms of action are considerably more complex. Neuroendocrine signaling to appetite and satiety centers in the central nervous system from the gastric pouch and possibly the distal esophagus, as well as behavioral variations, all contribute to the efficacy of gastric restriction. Procedures that alter the gastrointestinal anatomy, including bypass or resection of variable portions of the stomach, alter the delivery of ingested nutrients to more distal sites in the small intestine for digestion and absorption. They also produce numerous neuroendocrine signals that have complex interactions with central nervous system receptors [339]. The following is a brief review of the more commonly used procedures: past, present, and possibly future.

C.1.2 b. Procedures Used in the Past

The first bariatric surgical procedure to gain popularity was the jejunoileal bypass. As much as 90 percent of the small intestinal absorptive surface was bypassed such that ingested nutrients were delivered to the very distal ileum. This resulted in a definite degree of malabsorption, especially of fat. Diminished nutrient intake, however, was the predominant explanation for the major weight loss that occurred [340]. Multiple complications secondary to micronutrient malabsorption, liver and renal dysfunction, and others led to abandonment of these procedures, despite the successful weight loss that was regularly achieved.

C.1.3 c. Vertical Banded Gastroplasty

This procedure was developed in response to the unacceptable metabolic complications that led to abandonment of the jejunoileal bypass. As shown in Figure 7, a stapling device was used to partition a small upper gastric pouch. To prevent dilation of the stomach at the “stoma” point of nutrient entry into the body of the stomach, the stomach wall was reinforced with a prosthetic band. This procedure was the predominant bariatric surgical procedure in the 1980s but has been largely abandoned due to insufficient durable weight loss and complications secondary to progressive narrowing at the point of the fixed gastric banding [341].

Figure 7. Vertical Banded Gastroplasty

C.2 Currently Used Procedures

C.2.1 a. Roux-en-Y Gastric Bypass

The RYGB combines gastric restriction and neuroendocrine modulation of appetite and satiety signals. As shown in Figure 8, a gastric pouch is created by transecting the upper stomach. Intestinal continuity is reestablished by Roux-en-Y gastrojejunostomy. The size of the gastric pouch as well as the length of the jejunal limbs vary. Malabsorption of micronutrients (calcium, iron, vitamin B12) may occur, but malabsorption of macronutrients is minimal.

Figure 8. Roux-en-Y Gastric Bypass

Gastric bypass as well as the other procedures described below can all be done using a minimally invasive or laparoscopic approach [342, 343].

C.2.2 b. Laparoscopic Adjustable Gastric Banding

In this procedure, a gastric band or collar, shown in Figure 9, is placed around the upper stomach just below the gastroesophageal junction, creating a small gastric pouch as in vertical banded gastroplasty and gastric bypass. The inner aspect of the band consists of a balloon, which can be adjusted by injecting or withdrawing saline through a subcutaneous port positioned on the anterior abdominal wall. Thus, the tightness of the band can be adjusted for optimal effect.

Figure 9. Laparoscopic Adjustable Gastric Banding

C.2.3 c. Biliopancreatic Diversion With or Without Duodenal Switch

The BPD, shown in Figure 10, was devised in Italy in the late 1970s. It combines a subtotal gastric resection, Roux-en-Y gastrojejunostomy, and distal intestinal anastomosis such that the digestive enzymes contained in bile and pancreatic juice do not mix with ingested nutrients until the terminal ileum is reached, creating a degree of gastric restriction, malabsorption, and neuroendocrine signaling that combines to accomplish weight loss. A modification of this procedure known as the duodenal switch, also shown in Figure 10, consists of a substantial gastric resection leaving a tubular stomach along the lesser curvature of the stomach. A bypass of the small intestine is created by anastomosis of the small intestine to the transected duodenum distal to the pylorus and, as with BPD, a distal mixing of digestive enzymes with ingested nutrients.

Figure 10. Biliopancreatic Diversion With or Without Duodenal Switch

C.2.4 d. Sleeve Gastrectomy

In response to problematic perioperative complications and mortality, particularly among the most severely obese patients, the BPD with or without duodenal switch procedure was done in two stages. The first stage consisted of the SG, followed by weight loss, reduction of operative risk, and construction of the intestinal component of the procedure at a second operation. Subsequently, SG became an independent procedure. This procedure, shown in Figure 11, has considerable popularity at the present time.

Figure 11. Sleeve Gastrectomy

C.3 Investigational Procedures

In an effort to develop interventions for obesity and related metabolic diseases that would involve lesser degrees of invasiveness, risk, and/or cost but maintain efficacy, a number of procedures or approaches are being evaluated at differing stages of development, including gastric imbrication, neuromodulation, and gastrointestinal luminal (flexible) endoscopic interventions. These guidelines do not further discuss these investigational procedures.

D Methods for Critical Question 5

The Obesity Expert Panel formed work groups for each of its five CQs. For CQ5, the work group was chaired by a surgeon and included physicians and researchers representing universities and NIDDK.

The literature search for CQ5 included an electronic search of the Central Repository for RCTs, controlled clinical trials, and observational studies published in the literature from January 1998 to December 2009. The Central Repository contains citations pulled from seven literature databases (PubMed, CINAHL, EMBASE, PsycINFO, EBM, Biological Abstracts, and Wilson Social Sciences Abstracts). The search produced 2,317 citations, with 9 additional citations identified from nonsearch sources; i.e., by the panel members or hand search of systematic reviews/meta-analyses obtained through the electronic search. The systematic reviews/meta-analyses were only used for manual searches and were not part of the final evidence base. This manual cross-check was done to ensure that major studies were not missing from the evidence base. A similar manual cross-check of citations from the American Society for Metabolic and Bariatric Surgery position statement was performed in May 2012 [344]. Eight of the 9 citations identified from nonsearch sources were published after December 31, 2009. Per NHLBI policy, certain lifestyle and obesity intervention studies published after the closing date could be allowed as exceptions. These studies had to be RCTs in which each study arm contained at least 100 participants, and they had to be identified by experts knowledgeable about the literature. Three of the 8 citations published after December 2009 met the criteria and were eligible for inclusion in the CQ5 evidence base [345-347]. The other 5 did not meet the criteria and were excluded from the CQ5 evidence base [348-352]. The remaining citation identified through nonsearch sources was published before 2009 [353]. This citation met the criteria and was eligible for inclusion. Thus, of the nine citations identified through nonsearch sources, four were screened and found eligible for inclusion; all these studies were subsequently quality rated as good studies.

Figure 12 below outlines the flow of information from the literature search through the various steps used in the systematic review process for CQ5.

Figure 12. PRISMA Diagram Showing Selection of Articles for Critical Question 5

A natural language processing filter was used to identify studies with sample sizes less than 100, 100 to 299, and/or a followup time of less than 6 months. The natural language processing filter was executed against titles and abstracts. Of the 2,317 citations identified through the database search, 811 citations were automatically excluded using the natural language processing filter. Two reviewers independently screened the titles and abstracts of the 1,515 remaining citations against the inclusion/exclusion (I/E) criteria for each of the three components (efficacy, predictors, and complications). This resulted in exclusion of 1,062 publications (on 1 or more of the I/E criteria for each of the 3 components of this CQ) and retrieval of 453 publications for full-text review to further assess eligibility.

Sixty-four of the 453 full-text publications met the criteria and were included. The quality (internal validity) of these 64 publications was assessed using the six quality assessment tools that were developed (see Appendix tables A-1 through A-6). Of these, 29 publications were excluded because they were rated as poor quality [342, 343, 354-380]; 18 of these studies were rated poor due to the lack of intent-to-treat (ITT) analysis and/or high attrition rates. Rationales for all of the poor-quality ratings are included in Appendix table B-18. The remaining 22 studies (35 articles) that met the criteria for at least 1 of the 3 components were rated good or fair quality and were included in the evidence base [345-348, 353, 381-410]. Of these, eight articles did not provide additional or useful data beyond the data in the summary tables; seven articles are listed in the summary tables for the efficacy component [353, 389, 397, 398, 403-405], and one article is listed in the summary tables for the complications component [387]. The remaining articles were used to formulate the evidence statements, with the exception of Agaba [382] and Weiner [410] for the complications component (see Section 7.E.iii below). For the efficacy, predictors, and complications components, there were 5 studies (17 articles), 10 studies (12 articles), and 14 studies (15 articles) rated as good/fair, respectively. A total of eight articles were used across more than one component [346, 383, 384, 386, 390, 399, 406, 407].
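
The citation counts reported in the preceding paragraphs can be cross-checked with simple arithmetic. The sketch below is a consistency check under one stated assumption: the figure of 1,515 citations screened is reproduced only if the 9 nonsearch citations are counted together with the 2,317 database citations before the 811 filter exclusions are subtracted. The function and key names are illustrative.

```python
def cq5_screening_flow() -> dict:
    """Recompute the CQ5 article flow from the counts given in the text."""
    database_citations = 2317
    nonsearch_citations = 9
    # Assumption: both sources are pooled before the NLP filter is applied.
    total_identified = database_citations + nonsearch_citations

    nlp_filter_excluded = 811
    title_abstract_screened = total_identified - nlp_filter_excluded

    title_abstract_excluded = 1062
    full_text_reviewed = title_abstract_screened - title_abstract_excluded

    met_criteria = 64
    rated_poor = 29
    included_articles = met_criteria - rated_poor

    return {
        "total_identified": total_identified,
        "title_abstract_screened": title_abstract_screened,
        "full_text_reviewed": full_text_reviewed,
        "included_articles": included_articles,
    }


flow = cq5_screening_flow()
```

Under that assumption the recomputed values match the reported 1,515 citations screened, 453 publications retrieved for full-text review, and 35 articles in the final evidence base.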

CQ5 work group members reviewed the final studies on the included list, along with their quality ratings, and had the opportunity to raise questions. Some trials previously deemed to be of fair or good quality were downgraded to poor quality upon closer review of the evidence tables. These trials used completers analyses rather than ITT analysis and had overall attrition rates exceeding 10 percent. If the study reported only an analysis of completers and had attrition at <10 percent, it was allowed in the evidence base. The methodologists worked with the systematic review team to reevaluate these trials and make a final decision. Evidence tables and summary tables consisted only of data from the original publications of eligible RCTs and observational studies; these tables formed the basis for work group deliberations.
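The downgrade rule described above can be sketched as a small function. This is an illustrative sketch only: the function and argument names are hypothetical, and the handling of attrition of exactly 10 percent is an assumption, because the text specifies only "exceeding 10 percent" and "<10 percent."

```python
def downgraded_by_completers_rule(uses_itt: bool, attrition_pct: float) -> bool:
    """Return True if a trial would be downgraded to poor quality under this rule.

    Sketch of the rule in the text: a completers-only analysis (no ITT)
    combined with overall attrition above 10 percent leads to a downgrade.
    Assumption: attrition of exactly 10 percent is NOT downgraded; the
    source does not say how that boundary case was handled.
    """
    if uses_itt:
        # The rule only concerns trials that reported a completers analysis.
        return False
    return attrition_pct > 10.0


if __name__ == "__main__":
    print(downgraded_by_completers_rule(uses_itt=False, attrition_pct=15.0))
    print(downgraded_by_completers_rule(uses_itt=False, attrition_pct=8.0))
```

The function covers only this one rule; a trial could still be rated poor for other reasons captured by the quality assessment tools.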

E Evidence Statements and Summaries

In all, 22 studies (35 articles) satisfied the final inclusion criteria and were rated fair or good quality. Studies of complications included RCTs, cohort studies, before-after studies, and case series if they met methodologists' search and quality criteria.

E.1 Component 1: Efficacy—Summary Table 5.1

Five studies (17 articles) met search and quality criteria for determining the efficacy of bariatric surgery for weight loss and impact on obesity-related comorbid conditions [346, 353, 386, 389-392, 397-399, 401-407]. The number of studies meeting inclusion criteria was limited due to the requirement that surgical treatment be compared to a nonsurgical comparator group with a minimum post-surgical followup of 2 years [346, 386, 390, 391, 399]. Three of the studies were RCTs comparing surgical treatment against conventional medical treatment, lifestyle intervention, or medically supervised weight loss [346, 386, 399]. One was a 3-year prospective cohort study with nonsurgical comparators [390]. The largest study with the longest followup, the Swedish Obese Subjects study, was a nonrandomized prospective cohort study of patients who underwent vertical banded gastroplasty, gastric banding, or gastric bypass; these patients were compared with a matched cohort who received standard clinical care [353, 389, 391, 392, 397, 398, 401-407]. Patients in all but one study [399] had a mean BMI >35, but in the most recent included studies, patients with obesity-related comorbid conditions who had an initial BMI as low as 30 kg/m2 were enrolled [386, 399]. Comparator groups ranged from intensive lifestyle treatment that included very low-calorie diets, pharmacotherapy, and lifestyle counseling [399] to usual care by the PCP [391, 392, 401, 402, 406, 407]. From these studies, evidence statements may be made regarding the efficacy of bariatric surgery for weight loss; for reduction in CVD risk factors, including progression to or remission from type 2 diabetes; for impact on quality of life; and for impact on mortality.

Summary table 5.1 presents summary data from the five included studies on efficacy. Some studies appear in more than one summary table because they address more than one framework of analysis; e.g., predictors or complications.

For the purposes of this document, all the weight loss data are reported as percent of total weight lost or calculated as the percent of BMI lost. It is common among surgical studies to report weight loss as the “excess weight loss,” or EWL. This form of weight loss reporting is problematic, however, due to varying definitions of ideal body weight, which are frequently not provided in the manuscripts. In addition, the relationship between percent total weight loss and percent excess weight loss is not linear throughout a full range of BMI values [411].
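The difference between the two reporting conventions can be illustrated with a short calculation. The sketch below is not taken from the report: it assumes that ideal body weight corresponds to a BMI of 25 (one of several definitions in use, which is the problem noted above) and that height is unchanged, so that percent of BMI lost equals percent of total weight lost.

```python
IDEAL_BMI = 25.0  # assumption for illustration; studies define ideal weight differently


def pct_total_weight_loss(bmi_before: float, bmi_after: float) -> float:
    """Percent of total weight (equivalently, of BMI) lost, height unchanged."""
    return 100.0 * (bmi_before - bmi_after) / bmi_before


def pct_excess_weight_loss(bmi_before: float, bmi_after: float,
                           ideal_bmi: float = IDEAL_BMI) -> float:
    """Percent of excess weight lost, where excess is weight above ideal_bmi."""
    return 100.0 * (bmi_before - bmi_after) / (bmi_before - ideal_bmi)


if __name__ == "__main__":
    # The same 30 percent total weight loss at three hypothetical baseline BMIs.
    for bmi_before in (35.0, 40.0, 50.0):
        bmi_after = bmi_before * 0.70
        twl = pct_total_weight_loss(bmi_before, bmi_after)
        ewl = pct_excess_weight_loss(bmi_before, bmi_after)
        print(f"baseline BMI {bmi_before:.0f}: TWL {twl:.0f}%, EWL {ewl:.0f}%")
```

Under these assumptions, an identical 30 percent total weight loss corresponds to an EWL of about 105 percent at a baseline BMI of 35, 80 percent at 40, and 60 percent at 50, so EWL values are not comparable across populations with different baseline BMI.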

ES1. In obese adults, bariatric surgery produces greater weight loss and maintenance of lost weight than that produced by usual care, conventional medical treatment, lifestyle intervention, or medically supervised weight loss, and weight loss efficacy varies depending on the type of procedure and initial body weight.

  • Weight loss at 2 to 3 years following a variety of surgical procedures in adults with presurgical BMI ≥30 varies from a mean of 20 to 35 percent of initial weight and a mean difference from nonsurgical comparators of 14 to 37 percent depending on procedure.

Strength of evidence: High

  • Mean weight loss at 10 years following a variety of bariatric surgical procedures (predominantly vertical banded gastroplasty) is approximately 16 percent of initial weight, representing a mean weight regain of 7 percent.

Strength of evidence: Low

Rationale: Five studies meeting criteria for inclusion [346, 386, 390, 399, 402] assessed weight loss at 2 to 3 years after surgery. Surgical procedures included LAGB [386, 399], gastric bypass [390, 402], and other procedures such as vertical banded gastroplasty [402] and BPD [346]. Data are not presented on SG because no studies met inclusion criteria for efficacy outcomes. All included studies showed substantial weight loss following surgery, but weight loss varied with type of procedure (see Section 7.E.ii below) as well as presurgical BMI. Only one small study meeting inclusion criteria [399] restricted patient BMI to <35; all other included studies, even those that recruited patients with a BMI as low as 30 [386], had a mean BMI of >35. Thus, there are limited data on weight loss and maintenance outcomes 2 years or more post-surgery in patients with a BMI <35.

One included study (Swedish Obese Subjects (SOS) study) had 10-year followup data [406] and found regain of 7 percent between 2 and 10 years post-surgery. As previously noted, this study evaluated patients undergoing a variety of bariatric procedures, including vertical banded gastroplasty, nonadjustable or adjustable gastric banding, or gastric bypass. Only a minority underwent RYGB—the most common bariatric procedure currently performed, which is more efficacious for weight loss than procedures such as vertical banded gastroplasty.

ES2. In obese adults, bariatric surgery generally results in more favorable impact on obesity-related comorbid conditions than that produced by usual care, conventional medical treatment, lifestyle intervention, or medically supervised weight loss.

  • At 2 to 3 years following a variety of bariatric surgical procedures in adults with BMI ≥30 who achieve a mean weight loss of 20 to 35 percent, fasting glucose and insulin are reduced, incidence of type 2 diabetes is decreased, and there is a greater likelihood of diabetes remission* among those with type 2 diabetes at baseline.

Strength of evidence: High

  • At 10 years, incidence and prevalence of type 2 diabetes are lower in those who have undergone surgery. However, among those in whom type 2 diabetes remits after surgery, diabetes may recur over time.

Strength of evidence: Low

* Remission was defined variously depending on the study.

Rationale: Over the short term (2 to 3 years), several RCTs and prospective cohort studies comparing usual care, lifestyle treatment, or medical therapy to bariatric surgical procedures, including LAGB, RYGB, and BPD [346, 386, 390, 399, 402, 406] for type 2 diabetes have consistently found more improvement in fasting glucose and insulin levels in individuals who had bariatric surgery. This improvement was seen in both those without diabetes and those with an established diagnosis of type 2 diabetes. Mean percent decrease in plasma glucose at 2 years ranged from 56 percent in patients with type 2 diabetes who underwent BPD (vs. 14 percent for medical management) [346] to 7 percent in mostly nondiabetic patients who underwent LAGB (compared with <1 percent with nonsurgical weight loss treatment) [399]. Among those without type 2 diabetes at baseline, the SOS study reported a reduced incidence of diabetes after undergoing a variety of surgical procedures, with 1 percent of the surgical group versus 8 percent of the control group developing diabetes at 2 years. Data are not presented on SG because no studies met inclusion criteria for efficacy outcomes. Some studies have also reported rates of remission from type 2 diabetes. The American Diabetes Association (ADA) consensus statement defines remission as complete (normal glycemic measures of at least 1-year duration with no active pharmacologic therapy or ongoing procedures) or partial (hyperglycemia below diagnostic thresholds for diabetes of at least 1-year duration with no active pharmacologic therapy or ongoing procedures). Prolonged remission is further defined as complete remission of at least 5 years' duration [412]. However, the included studies have defined diabetes remission or recovery variably. 
Regardless of definition, surgical treatment groups as compared to nonsurgical controls have greater 2-year remission from type 2 diabetes defined variably as fasting plasma glucose (FPG) <100 mg/dL and HbA1c <6.5 percent without pharmacologic therapy [346], FPG <126 mg/dL and HbA1c <6.2 percent without use of oral hypoglycemic agents or insulin [386], or fasting blood glucose <6.7 mmol/L with no antidiabetic medications [402]. Among those with type 2 diabetes, remission of diabetes lasting at least 2 years is reported in 72 to 95 percent in the included studies compared with 0 to 21 percent in nonsurgical comparators.
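To show why the variable definitions matter, the sketch below applies the three study definitions quoted above to a single hypothetical patient. It is an illustration only: the function name is hypothetical, one medication flag stands in for each study's own medication criterion, and fasting blood glucose in [402] is treated as interchangeable with fasting plasma glucose, which is a simplifying assumption.

```python
# Conversion factor for glucose (molar mass about 180.16 g/mol).
GLUCOSE_MGDL_PER_MMOLL = 18.016


def remission_by_definition(fpg_mgdl: float, hba1c_pct: float,
                            on_diabetes_drugs: bool) -> dict:
    """Classify one patient under the three definitions quoted in the text."""
    off_drugs = not on_diabetes_drugs
    return {
        # [346]: FPG <100 mg/dL and HbA1c <6.5 percent, no pharmacologic therapy
        "346": fpg_mgdl < 100 and hba1c_pct < 6.5 and off_drugs,
        # [386]: FPG <126 mg/dL and HbA1c <6.2 percent, no oral agents or insulin
        "386": fpg_mgdl < 126 and hba1c_pct < 6.2 and off_drugs,
        # [402]: fasting glucose <6.7 mmol/L (about 120.7 mg/dL), no medication
        "402": fpg_mgdl < 6.7 * GLUCOSE_MGDL_PER_MMOLL and off_drugs,
    }


if __name__ == "__main__":
    # Hypothetical patient: FPG 110 mg/dL, HbA1c 6.3 percent, no medication.
    print(remission_by_definition(110.0, 6.3, False))
```

This hypothetical patient would count as in remission under the definition used in [402] but not under those used in [346] or [386], which is one reason remission rates are difficult to compare across studies.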

One of the studies [386] enrolled patients with a recent (within 2 years) onset of diabetes. However, another study [346] enrolled patients with “uncontrolled” diabetes (defined as HbA1c of 7 percent or more) and longer duration of diabetes.

Ten-year data are from the SOS study [406]. To be concordant with new ADA criteria [413], diabetes remission (recovery) was defined as fasting blood glucose ≤110 mg/dL (≤6.1 mmol/L), corresponding to an FPG <126 mg/dL (7.0 mmol/L) with no antidiabetic medications. Using these criteria, although 72 percent of patients with diabetes were in remission at 2 years post-surgery, only 36 percent were in remission at 10 years (compared with 13 percent in the nonsurgical comparator group). Thus, long-term diabetes remission may not be durable for all patients. There was still, however, a significantly lower incidence of new cases of diabetes and a higher rate of remission of diabetes in the surgical group compared with controls at 10 years. Only a minority, recruited later in the study, underwent RYGB, which leads to greater weight loss than other procedures such as vertical banded gastroplasty or gastric banding. Thus, long-term results from this study may show smaller effects than those attained with RYGB or other procedures, such as BPD, that may have metabolic effects on glycemia greater than that expected by weight loss alone. (See Section 7.E.ii below for impact of type of surgical procedure on glycemic outcomes.)

Only one small study meeting inclusion criteria [399] restricted patient BMI to <35; all other included studies, even those that recruited patients with a BMI as low as 30 [386] had a mean BMI >35. Thus, there are limited data on outcomes 2 years or more post-surgery related to glycemic control and remission of diabetes in patients with BMI <35.

In summary, bariatric surgery in adults with type 2 diabetes is more likely than usual care, medical management, or lifestyle treatment to result in improvement or diabetes remission over 2 years. There are limited data on long-term (5 years or more) durability of remission of diabetes after bariatric surgery.

ES2. (continued) In obese adults, bariatric surgery generally results in more favorable impact on obesity-related comorbid conditions than that produced by usual care, conventional medical treatment, lifestyle intervention, or medically supervised weight loss.

  • At 2 to 3 years following a variety of bariatric surgical procedures in adults with BMI ≥30 who achieve mean weight loss of 20 to 35 percent, blood pressure or use of blood pressure medication is reduced compared with nonsurgical management. Blood pressure tends to increase over time, and at 10 years post-surgery, there is no difference in mean systolic blood pressure or the incidence of new cases of hypertension in those who underwent bariatric surgery compared to those who did not undergo surgery.

Strength of evidence: Low

  • Among obese adults with baseline hypertension, a greater percentage are in remission* at 2 to 3 years and 10 years following bariatric surgery compared with nonsurgical management.

Strength of evidence: Low

* Remission was defined variously depending on the study.

Rationale: Some [399, 402, 406] but not all [346] studies showed a decrease in SBP and/or DBP at 2 to 3 years or a reduction in antihypertensive medication use [386] when compared with a nonsurgical group receiving standard care or lifestyle intervention. Blood pressure changes were calculated from percentiles or by subtraction from baseline when not presented in the paper as change values. Mean blood pressure reductions ranged from 6 to 26 mmHg systolic and 1 to 14 mmHg diastolic (vs. 0 to 21 mmHg systolic and 0 to 9 mmHg diastolic in nonsurgical comparators). For example, in the SOS study [402, 406], mean blood pressure fell from 144/90 at baseline to 137/84 at 2 years in the surgical group versus 139/86 to 139/85 in controls (P-value between groups <.001).
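As an illustration of the subtraction-from-baseline calculation, the sketch below uses the SOS values quoted above. The function name is hypothetical, and this is only a minimal sketch of the arithmetic, not the work group's actual procedure.

```python
def bp_change(baseline: tuple, followup: tuple) -> tuple:
    """Return (systolic, diastolic) change in mmHg; negative means a reduction."""
    return (followup[0] - baseline[0], followup[1] - baseline[1])


# SOS study values at baseline and at 2 years, as quoted in the text [402, 406].
surgical_change = bp_change((144, 90), (137, 84))
control_change = bp_change((139, 86), (139, 85))

if __name__ == "__main__":
    print("surgical:", surgical_change)  # (-7, -6)
    print("control:", control_change)    # (0, -1)
```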

Two studies [390, 402] reported higher likelihood of recovery from hypertension [402] and/or lower incidence or prevalence of hypertension in the surgical group versus comparator group [399, 402] at 2 to 3 years. In addition, the SOS study [406] reported a slightly lower DBP and greater rate of recovery from hypertension at 10 years although incidence of new cases of hypertension and change in SBP were not different between groups. Sjöström et al. [402] defined recovery as SBP <160 and DBP <95 and no antihypertensive medications at 2 years, revised to SBP <140 and DBP <90 and no antihypertensive medication for the 10-year data [406]. There are no standardized definitions for remission or recovery from hypertension although the Framingham Heart Study [414] defined remission as normotension (blood pressure below both 140 mmHg systolic and 90 mmHg diastolic) in a previously hypertensive individual without receiving antihypertensive medication while relapse was defined as return to blood pressure medication use and/or blood pressure of at least 140/90 mmHg or death due to CVD.
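The effect of the change in the SOS recovery definition between the 2-year and 10-year reports can be sketched as follows. The function and constant names are hypothetical, and the patient values are invented for illustration.

```python
# (SBP cutoff, DBP cutoff) in mmHg, as quoted in the text.
SOS_2_YEAR_CUTOFFS = (160, 95)   # definition used for the 2-year data [402]
SOS_10_YEAR_CUTOFFS = (140, 90)  # revised definition for the 10-year data [406]


def recovered_from_hypertension(sbp: float, dbp: float, on_bp_drugs: bool,
                                cutoffs: tuple) -> bool:
    """True if both pressures are below the cutoffs and no medication is used."""
    sbp_cutoff, dbp_cutoff = cutoffs
    return sbp < sbp_cutoff and dbp < dbp_cutoff and not on_bp_drugs


if __name__ == "__main__":
    # Hypothetical patient: 150/92 mmHg, no antihypertensive medication.
    print(recovered_from_hypertension(150, 92, False, SOS_2_YEAR_CUTOFFS))
    print(recovered_from_hypertension(150, 92, False, SOS_10_YEAR_CUTOFFS))
```

A patient at 150/92 mmHg without medication would count as recovered under the earlier definition but not under the revised one, so recovery rates from the two time points are not defined on the same basis.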

ES2. (continued) In obese adults, bariatric surgery generally results in more favorable impact on obesity-related comorbid conditions than that produced by usual care, conventional medical treatment, lifestyle intervention, or medically supervised weight loss.

  • At 2 to 3 years and 10 years following a variety of bariatric surgical procedures in adults with BMI ≥30 who achieve mean weight loss of 20 to 35 percent, serum TG levels are lower, HDL-C levels are higher, TC-to-HDL-C ratio is lower, and changes in TC or LDL-C levels are inconsistent compared with nonsurgical management.

Strength of evidence: Low

Rationale: Some [390, 402] but not all [386, 399] studies showed reductions in TC or LDL-C at 2 to 3 years after bariatric surgery compared with nonsurgical management. HDL-C [386, 399, 402] was higher and TG lower [386, 399, 402] after bariatric surgery, with the SOS study [402] finding a decreased incidence of low HDL-C (defined as HDL-C <0.9 mmol/L) and hypertriglyceridemia (defined as TG ≥2.8 mmol/L) but no difference in incidence of hypercholesterolemia (defined as total cholesterol ≥6.2 mmol/L). One study [346] showed improvement from baseline in LDL-C (−65 percent) and TG (−57 percent) compared with conventional medical therapy (−21 percent and −18 percent, respectively) only among those who had undergone BPD but not RYGB. However, there was a higher HDL-C (+30 percent) compared with medical therapy (+6 percent) only in those who underwent RYGB. Mingrone et al. [346] also found that significantly more surgical patients had “normalized” TC, TG, and HDL-C levels compared with those who received medical treatment. In those studies reporting the measure [386, 399], TC-to-HDL-C ratio was lower at 2 years in those who underwent bariatric surgery. Ten-year data from the SOS study [406] found that those who had undergone bariatric surgery had higher HDL-C and lower TG compared with matched controls receiving usual care. TC was slightly higher in the surgical group; there was no difference in incident new cases of or recovery from hypercholesterolemia (defined as TC ≥201 mg/dL, 5.2 mmol/L) between groups.
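Because the lipid thresholds above are quoted in a mix of mmol/L and mg/dL, a small conversion helper may make them easier to compare. The sketch below uses the conventional conversion factors of 38.67 for cholesterol and 88.57 for triglycerides; the function names are hypothetical.

```python
CHOL_MGDL_PER_MMOLL = 38.67  # total cholesterol, LDL-C, HDL-C
TG_MGDL_PER_MMOLL = 88.57    # triglycerides


def chol_mmol_to_mgdl(mmol_per_l: float) -> float:
    """Convert a cholesterol concentration from mmol/L to mg/dL."""
    return mmol_per_l * CHOL_MGDL_PER_MMOLL


def tg_mmol_to_mgdl(mmol_per_l: float) -> float:
    """Convert a triglyceride concentration from mmol/L to mg/dL."""
    return mmol_per_l * TG_MGDL_PER_MMOLL


if __name__ == "__main__":
    print(f"low HDL-C threshold, 0.9 mmol/L = {chol_mmol_to_mgdl(0.9):.0f} mg/dL")
    print(f"high TG threshold, 2.8 mmol/L = {tg_mmol_to_mgdl(2.8):.0f} mg/dL")
    print(f"2-year TC threshold, 6.2 mmol/L = {chol_mmol_to_mgdl(6.2):.0f} mg/dL")
    print(f"10-year TC threshold, 5.2 mmol/L = {chol_mmol_to_mgdl(5.2):.0f} mg/dL")
```

The 10-year threshold of 5.2 mmol/L converts to about 201 mg/dL, matching the value quoted in the text, while the 2-year threshold of 6.2 mmol/L is about 240 mg/dL; the two SOS reports therefore used different definitions of hypercholesterolemia.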

ES2. (continued) In obese adults, bariatric surgery generally results in more favorable impact on obesity-related comorbid conditions than that produced by usual care, conventional medical treatment, lifestyle intervention, or medically supervised weight loss.

  • Most measures of health-related quality of life (HRQOL) are improved at 2 and 10 years following bariatric surgery.

Strength of evidence: Moderate

Rationale: Three papers, representing two studies, found that most measures of HRQOL in those who underwent a variety of bariatric surgical procedures improved compared with nonsurgical management controls at 2 [391, 399] and 10 [392] years. O'Brien et al. [399] found greater improvements in patients who underwent bariatric surgery compared with a nonsurgical control group in 5 of 8 physical and mental health domains of the well-validated Medical Outcomes Study 36-item short-form general health survey, including physical function, physical role, general health, energy, and emotional role, but not in pain or mental health. The SOS study measured HRQOL in multiple domains: Subjective health measured by the current health scale of the General Health Rating Index; mental well-being using the Mood Adjective Checklist and the Hospital Anxiety and Depression Scale; social interaction measured by the Sickness Impact Profile, a study-specific module developed to assess obesity-related problems in everyday life; and self-assessment of eating behavior through the Three-Factor Eating Questionnaire, which also includes measures of general health and psychosocial function. At 2 years, there were greater improvements in all measures of HRQOL in the surgical group although there was no absolute difference between the surgical and control groups in anxiety. Amount of weight loss was correlated with improvement in HRQOL measures, and weight regain tended to be accompanied by decreased HRQOL [391]. However, at 10 years, there was still a significantly better outcome in the surgical group on most measures of HRQOL, with greater improvements in current health perceptions, social interaction, obesity-related problems, and depression but not overall mood or anxiety [392].

ES2. (continued) In obese adults, bariatric surgery generally results in more favorable impact on obesity-related comorbid conditions than that produced by usual care, conventional medical treatment, lifestyle intervention, or medically supervised weight loss.

  • Total mortality is decreased compared with nonsurgical management at mean followup of 11 years after undergoing a variety of bariatric surgical procedures (predominantly vertical banded gastroplasty) in patients with mean BMI >40 who achieve a mean long-term weight loss of 16 percent.

Strength of evidence: Low

Rationale: The SOS study found a reduced hazard ratio for death (0.76; confidence interval: 0.59 to 0.99) in patients who underwent bariatric surgery compared with the nonsurgical comparator group [407]. This was a prospective cohort study with comparators matched on a variety of biomedical, psychosocial, and demographic factors; however, lack of randomization is a limitation. As previously noted, this study evaluated patients undergoing a variety of bariatric procedures, including vertical banded gastroplasty, nonadjustable or adjustable gastric banding, or gastric bypass. Only a minority, recruited later in the study, underwent RYGB, the most common bariatric procedure currently performed and considered more efficacious for weight loss than procedures such as vertical banded gastroplasty. Mean weight loss at 10 years ranged from 14 percent with banding to 25 percent with gastric bypass [407]. In addition, medical and bariatric surgical advances, including increasing use of laparoscopic surgery, have reduced early morbidity and mortality from bariatric surgery [394]. Thus, the above-referenced study may represent a conservative estimate of the impact of bariatric surgery on mortality. As the SOS study was not a randomized trial, it is also possible that those who underwent surgery differed in unmeasured ways from those who did not or that current surgical approaches have different short- or long-term complications that may impact mortality. Thus, both the directionality and magnitude of the impact of bariatric surgery on all-cause and cause-specific mortality require additional study.

ES3. There are insufficient data on the efficacy of bariatric surgical procedures for weight loss and maintenance or CVD risk factors 2 or more years post-surgery in patients with a BMI <35.

Rationale: In the current evidence review, only one small study meeting inclusion criteria [399] restricted patient BMI to <35; all other included studies, even those that recruited patients with a BMI as low as 30 [386], had a mean BMI >35. Although the Food and Drug Administration has approved laparoscopic adjustable gastric banding for patients with a BMI of 30 to <35 with comorbid conditions [415], the primary end point for the pivotal approval study for this indication was 12-month weight loss [416].

Thus, there are limited data on outcomes at 2 years or more post-surgery related to weight loss and maintenance, adverse effects, glycemic control, dyslipidemia, or blood pressure control in patients with BMI <35.

Component 2: Predictors—Patient Characteristics—Summary Table 5.2 and Types of Surgery—Summary Table 5.3

The predictors component of CQ5 addresses aspects of bariatric surgery specific to different operative procedures as well as other potential predictors of outcome such as patient characteristics or provider aspects of bariatric surgery. The search criteria, as with the efficacy component, required a comparator group, but not necessarily a nonsurgical comparator, as well as outcomes regarding specific bariatric operative procedures. A total of 10 studies (12 articles) met these criteria and were rated as good or fair quality and are included in the summary table [345, 346, 383, 384, 386, 390, 396, 399, 400, 406, 407, 409]. The literature search included studies that address patient factors such as BMI, age, gender, or the presence of associated comorbid conditions. Several published studies have indicated patient factors may influence outcomes after bariatric surgery. One of the studies included in the predictors component addresses the outcomes following BPD among patients with or without preoperative type 2 diabetes [396]. No studies met search criteria addressing provider factors such as surgeon or center experience, center designation, or protocols. The following comparative studies are included as outcome predictors: LAGB versus no surgery [386, 399], modified RYGB versus no surgery [390], RYGB versus LAGB [383, 409], RYGB versus laparoscopic SG [345], laparoscopic versus open RYGB [400], RYGB with or without an added gastric band [384], and various procedures (the SOS study described above [406, 407]). Four of these studies (five articles) are also included in the evidence base for efficacy [386, 390, 399, 406, 407]. As for efficacy, weight loss is reported as percent loss of total baseline weight. Mean percent total weight loss was calculated when mean weight loss in pounds or kilograms or mean BMI change from baseline was reported. Comorbidity remission was as designated by the study authors.
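The calculation of mean percent total weight loss from reported absolute changes can be sketched as below. The report does not give the exact formulas that were used, so this is an assumed reconstruction: function names and example values are hypothetical, and the BMI-based version assumes height did not change between measurements.

```python
def pct_twl_from_weight(baseline_weight: float, weight_lost: float) -> float:
    """Percent total weight loss from a mean weight loss.

    Both arguments must use the same unit (pounds or kilograms); the unit
    cancels, so no conversion is needed.
    """
    return 100.0 * weight_lost / baseline_weight


def pct_twl_from_bmi(baseline_bmi: float, bmi_decrease: float) -> float:
    """Percent total weight loss from a mean BMI decrease, height unchanged."""
    return 100.0 * bmi_decrease / baseline_bmi


if __name__ == "__main__":
    # Hypothetical study means, not taken from any included study.
    print(pct_twl_from_weight(baseline_weight=120.0, weight_lost=36.0))
    print(pct_twl_from_bmi(baseline_bmi=45.0, bmi_decrease=13.5))
```

Both hypothetical examples correspond to a 30 percent total weight loss. Dividing a mean change by a mean baseline is an approximation of the mean of individual percent changes, which is a limitation of any calculation from published summary data.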

ES4. Weight loss following bariatric surgery, expressed as percentage of total body weight loss, varies by procedure.

In direct comparative studies at 2 to 3 years post-surgery:

  • Weight loss following gastric bypass exceeds that following LAGB.

Strength of evidence: Moderate

  • Weight losses following BPD, gastric bypass, and SG are similar.

Strength of evidence: Low

Included studies reporting the weight loss following LAGB, RYGB, BPD, and laparoscopic SG in direct comparative studies show short-term weight loss following gastric bypass exceeds LAGB. Weight loss is similar among RYGB, BPD, and SG in the limited number of studies that met inclusion criteria. Two- to three-year weight loss following LAGB is reported by four studies and is somewhat variable at 15 to 23 percent. Weight loss for RYGB is robust with data from studies that are somewhat more consistent: 2- to 3-year mean weight loss of 30 to 38 percent. Two studies include a direct comparison between LAGB and RYGB [383, 409]; both report superior weight loss following RYGB (30 percent vs. 18 percent; 37 percent vs. 32 percent, P < .05). The weight loss following gastric banding versus RYGB comparison from the SOS study is not reported in the evidence statements as the gastric banding procedures were done prior to the availability of the adjustable gastric band.

Two included studies report weight loss at 2 to 3 years following BPD of 34 to 38 percent [346, 396]. No study reporting on the duodenal switch modification of the BPD is included in the evidence base. Weight loss 3 years post-laparoscopic SG was 34 percent in the single included study [345]. Thus, weight loss at 2 to 3 years following RYGB, BPD, and laparoscopic SG is similar.

Weight loss ranges discussed above differ slightly from those in evidence statement 1 due to different studies being included in the evidence base for efficacy versus predictors.

ES4. (continued) Weight loss following bariatric surgery, expressed as percentage of total body weight loss, varies by procedure.

In direct comparative studies at 5 to 10 years post-surgery:

  • Weight loss following gastric bypass exceeds that following LAGB.

Strength of evidence: Low

Long-term weight loss at 5 or more years is reported in three included studies [383, 396, 406, 407]. Following rapid weight loss for 12 to 24 months after bariatric surgery, weight commonly stabilized or some weight regain occurred. At 5 years post-surgery, Angrisani et al. [383] reported 18 percent weight loss for LAGB and 30 percent for RYGB while Marinari [396] reported 37 percent for BPD. The SOS study [406, 407] reported 10-year weight loss of 13 percent for gastric banding (fixed and adjustable) and 25 percent for gastric bypass. No report of weight loss beyond 3 years is included in the present evidence base regarding laparoscopic SG.

Several published studies have indicated patient factors may influence outcomes after bariatric surgery. One of the studies included in the predictors component addresses the outcomes following BPD among patients with or without preoperative type 2 diabetes [396]. The presence of diabetes did not impact weight loss up to 5 years following BPD.

ES5. The remission of obesity-related comorbidities varies by procedure.

  • Type 2 diabetes remission or improved glycemic control occurs with increasing frequency according to procedure as follows: LAGB, gastric bypass, and then BPD.

Strength of evidence: Low

The induction of remission of type 2 diabetes is described and discussed in the efficacy narrative above (see Section 7.E.i). None of the included studies used the recently published criteria defining diabetes remission [412], so it is necessary to accept the authors' designation of diabetes remission. As for efficacy, diabetes remission is variably defined in these studies. Predictors of diabetes remission may be patient factors or specific bariatric surgical procedures. The included studies do not provide evidence regarding patient predictors of diabetes remission such as duration or severity of diabetes, BMI, or other factors. The number of included studies reporting diabetes remission by procedure is as follows: 2 (LAGB) [386, 409]; 4 (RYGB) [345, 346, 390, 409]; 1 (laparoscopic SG) [345]; and 2 (BPD) [346, 396]. The remission rates reported 2 or more years post-surgery are as follows: 57 to 73 percent (LAGB), 75 to 86 percent (RYGB), 80 percent (laparoscopic SG), and 95 to 100 percent (BPD). These data must be interpreted with caution due to the generally small numbers of diabetic subjects in each trial, variable patient populations including patients who had a diagnosis of diabetes for less than 2 years [386] and patients with poorly controlled diabetes [346], and the lack of a standard definition of diabetes remission.

ES5. (continued) The remission of obesity-related comorbidities varies by procedure.

  • Reduction in the prevalence of hypertension is more frequent following gastric bypass than LAGB.

Strength of evidence: Low

  • The prevalence of dyslipidemia is lower following gastric bypass compared to LAGB.

Strength of evidence: Low

The response of hypertension and dyslipidemia to bariatric surgery is discussed in the efficacy narrative (see Section 7.E.i). The included evidence regarding the effects of specific bariatric surgical procedures is limited; it supports the above qualitative statements but is insufficient to specify the magnitude of the effect. The interpretation of the reported effects on dyslipidemia is further limited by a lack of clear definition of dyslipidemia [409] or variable responses of the specific lipid components [346]. There was insufficient evidence to assess the response of hypertension or dyslipidemia following BPD or SG relative to other procedures.

E.2 Component 3: Complications—Summary Table 5.4

The benefits of weight loss among obese adults, especially those with obesity-related comorbid disease, are well described in these guidelines. Bariatric surgery produces greater weight loss and maintenance of weight loss than that produced by usual care or medically supervised weight loss. The potential benefit of weight loss for severely obese adults, however, must be considered in light of the risk for complications in the short or long term. The work group determined that examination of the evidence specific to complications of bariatric surgery required expansion of the search criteria beyond those used for the efficacy and predictors of bariatric surgery. Due to the relatively low incidence of complications such as perioperative mortality (less than 1 percent), substantial sample sizes are required to accurately establish the frequency of complications and analyze associated factors. The complication evidence base therefore included those studies from the efficacy and predictors searches that included complication data [346, 383] as well as those studies that met the expanded search criteria [347, 348, 381, 385, 387, 388, 393, 394, 408]. The study by Agaba et al. [382] also met the I/E criteria and thus is listed as an included study, but it was not used by the work group due to concerns about the accuracy of the data reported in this study. These expanded criteria added retrospective cohort studies, before and after studies, and case series studies, among others. A comparator group was not required. Additional criteria for observational studies were a sample size ≥100 for studies with 10 or more years of followup or studies on BPD or SG procedures; required sample size was ≥500 for all other observational studies. These variable search requirements were based on the limitations of the number of subjects typically reported for BPD or SG. 
In addition, the number of subjects reported in the studies identified in the efficacy and predictors searches was usually less than 100 with the exception of the SOS study, which did not report detailed complication data by procedure. One RCT [347] that was published after the search date cutoff of December 31, 2009, was also included because it reported complication data and otherwise met criteria for inclusion.
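The sample-size criteria for observational studies in the expanded complications search can be expressed as a short eligibility check. This is a minimal sketch of the stated thresholds only; the function name and procedure labels are hypothetical, and it does not model the other inclusion criteria or the separate handling of RCTs.

```python
# Procedures for which the lower sample-size threshold applied, per the text.
LOW_THRESHOLD_PROCEDURES = {"BPD", "SG"}


def meets_observational_sample_size(n_subjects: int, followup_years: float,
                                    procedure: str) -> bool:
    """Apply the sample-size thresholds stated for observational studies.

    At least 100 subjects for studies with 10 or more years of followup or
    studies on BPD or SG; at least 500 subjects for all other studies.
    """
    if followup_years >= 10 or procedure in LOW_THRESHOLD_PROCEDURES:
        return n_subjects >= 100
    return n_subjects >= 500


if __name__ == "__main__":
    # Hypothetical studies, for illustration only.
    print(meets_observational_sample_size(150, 2, "BPD"))    # lower threshold applies
    print(meets_observational_sample_size(150, 2, "LAGB"))   # needs 500 or more
    print(meets_observational_sample_size(150, 12, "RYGB"))  # long followup
```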

Conclusions regarding comparative aspects of complications following different procedures, populations, or studies require interpretation as the population of patients undergoing specific procedures or reported in specific studies may vary. In addition, there are no standardized criteria for classifying post-operative complications. Several studies have identified predictive factors for complications. These factors may be patient derived, provider variables, or procedure-specific [349, 394, 417]. No provider factors such as surgeon or hospital case volume were identified as associated with complications among the included studies. Summary table 5.4 is based on 14 studies. Complications following LAGB are reported in 6 of the 14 included studies, following gastric bypass in 5 of 14, following BPD in 3 of 14, and following SG in 2 of 14.

E.2.1 a. Laparoscopic adjustable gastric banding

ES6. Perioperative (≤30 days) and longer term (>30 days) complications following bariatric surgery vary by procedure and patient-derived risk factors. When performed by an experienced surgeon, perioperative complications following LAGB are infrequent and do not tend to be life-threatening: major adverse outcomes such as deep venous thrombosis (DVT) and reoperation occur in 1.0 percent of patients, and minor complications such as wound infection occur in 3 percent.

Strength of evidence: Moderate

The 30-day complication data following LAGB are derived primarily from the Longitudinal Assessment of Bariatric Surgery (LABS), an NIH-funded multicenter research consortium. This study and four others reported no LAGB perioperative mortality [383, 385, 388, 394, 408]. The incidence of serious complications reported by LABS was 1 percent, consisting of reoperation (0.8 percent) and DVT (0.3 percent) [394]. Steffen et al. [408] reported severe complications among 824 patients: gas embolism with secondary brain injury and esophageal perforation. Minor perioperative complications included atelectasis or pneumonia (1.5 percent) and minor wound problems (1.2 percent).

ES6. (continued)

  • Longer term complications continue to occur over time and may require operative correction: Misplacement of band (∼3 to 4 percent), erosion of gastric wall (∼1 percent), and port complication (5 to 11 percent).
  • Longer term LAGB failure leading to removal of the band with or without conversion to another bariatric procedure varies from 2 to 34 percent. Inadequate weight loss is the most often reported basis for removal of the band.

Strength of evidence: Moderate

Longer term complications following LAGB, the frequency of which varies considerably among the included studies, may be considered in three groups.

Complications requiring intra-abdominal surgery for correction include misplacement of the band on the stomach (gastric slip, approximately 3 to 4 percent) and erosion of the gastric wall by the band (approximately 1 percent) [388, 408]. Technical complications with the subcutaneous port used for the band adjustment that require operative correction are more frequent and variable (5 to 11 percent) but are relatively minor outpatient procedures done under local anesthesia. The wide range of reported band removal in different procedures is due primarily to variable institutional or surgeon determination that the band has failed to accomplish the desired weight loss such that further adjustments and diet and physical activity instruction are unlikely to produce further weight loss.

E.2.2 b. Roux-en-Y Gastric Bypass

ES6. (continued) Perioperative (≤30 days) and longer term (>30 days) complications following bariatric surgery vary by procedure and patient-derived risk factors. When performed by an experienced surgeon, perioperative complications following laparoscopic gastric bypass:

  • Consist of a major adverse outcome in approximately 4 to 5 percent, including mortality (0.2 percent), DVT and/or pulmonary embolism (PE) (0.4 percent), and a requirement for reoperation (3 to 5 percent). Rates of any complication, major or minor, range from 2 to 18 percent.

Strength of evidence: Moderate

Prior reports [418] described complication rates that were considerably higher than those currently included here.

In the present evidence base, smaller studies (N <100) reported no mortality following RYGB [346, 347, 383, 384]. Lopez-Jimenez et al. [395] also reported no mortality among 559 subjects. The LABS Consortium reported a 0.2-percent mortality rate among the 2,975 subjects. The incidence of reported complications varies with definitions. Following RYGB, the LABS Consortium reported a 4.8-percent incidence of complications of a composite consisting of mortality, DVT, reoperation, or continued hospitalization on day 30. Other investigators reported similar complication rates. Although micronutrient deficiencies have been reported with RYGB [419], the included studies, which primarily reported short-term complications, did not provide data on rates of micronutrient or other nutritional deficiencies.

ES6. (continued) Perioperative (≤30 days) and longer term (>30 days) complications following bariatric surgery vary by procedure and patient-derived risk factors. When performed by an experienced surgeon, perioperative complications following laparoscopic gastric bypass:

  • Are less frequent for the laparoscopic approach than for open incision.

Strength of evidence: Moderate

When performed by an experienced surgeon, perioperative complications following open gastric bypass: Consist of a major adverse outcome in approximately 8 percent, including mortality (2 percent), DVT/PE (1 percent), and reoperation (5 percent).

Strength of evidence: Low

Several explanations for the improved safety of RYGB over the past 10 years have been proposed. Multiple studies have reported the transition to the laparoscopic methodology from the traditional open incision to be an important contributor to improved outcome. In the present evidence base, only the LABS Consortium [394], an observational trial, reported comparative outcomes. In this study, the reported mortality for open RYGB (2.1 percent) was higher than for laparoscopic RYGB (0.2 percent). The composite end point indicating a serious complication occurred in 7.8 percent of the open and 4.8 percent of the laparoscopic RYGB subjects. These two populations are not entirely comparable; presently, open gastric bypass is limited to patients who have a contraindication for the laparoscopic methodology. The risk profile for the open patients was consistently greater than for laparoscopic ones.

ES6. (continued) Perioperative (≤30 days) and longer term (>30 days) complications following bariatric surgery vary by procedure and patient-derived risk factors.

When performed by an experienced surgeon, perioperative complications following gastric bypass (laparoscopic or open): Are associated with extremely high BMI, inability to walk 200 feet, history of DVT/PE, and history of obstructive sleep apnea.

Strength of evidence: Low

Several studies [420-423] have performed correlation analyses of potential risk factors with complication outcomes following RYGB (laparoscopic or open). The present evidence base is limited to the LABS Consortium analysis. This analysis found extremely high BMI, inability to walk 200 feet without an assistive device, a history of DVT/PE, and a history of obstructive sleep apnea to be associated with the composite measure of adverse short-term outcomes. There is insufficient evidence to support an evidence statement regarding the mid-term and long-term complications of RYGB.

E.2.3 c. Biliopancreatic Diversion

ES6. (continued) Perioperative (≤30 days) and longer term (>30 days) complications following bariatric surgery vary by procedure and patient-derived risk factors. The mortality rate for BPD was reported by two of the three included studies.

When performed by an experienced surgeon, perioperative complications following BPD: Occur in 2 to 8 percent of cases and include mortality (<1 percent) and DVT/PE (0.4 percent). The frequency of anastomotic leak, hemorrhage, and wound complication is variable.

Strength of evidence: Low

Mortality rates of 0.9 percent among 343 subjects [393] and 0 percent among 20 subjects [346] were reported. Variable complication rates, ranging from 2.2 to 7.6 percent, were reported by Adami et al. [381] (N = 734); Larrad-Jimenez et al. [393] reported a complication rate of 7.6 percent (N = 343). The lack of studies with direct comparisons and the variable definitions among studies, however, preclude drawing any conclusions regarding the relative perioperative safety of BPD as opposed to RYGB.

E.2.4 d. Biliopancreatic Diversion Longer Term Complications

ES6. (continued) Perioperative (≤30 days) and longer term (>30 days) complications following bariatric surgery vary by procedure and patient-derived risk factors. When performed by an experienced surgeon, perioperative complications following BPD include the following:

  • 1- to 3-year complications include anemia (13 to 20 percent); deficiency of protein (0.3 to 3.0 percent), iron (17 percent), and zinc (6 percent); and neuropathy (0.4 percent). Deficiency of vitamin D and elevated parathyroid hormone may exceed 40 percent.
  • When performed by open incision, ventral hernia may occur in as many as 72 percent of patients.

Strength of evidence: Low

Presumably as the result of the malabsorptive component of the BPD procedure, longer term complications following BPD have been reported as problematic. One study reported 1- to 3-year data [381] while two studies reported 2-year data [346, 393]. The incidence of anemia was reported to be 11 to 20 percent, protein deficiency 0.3 to 10 percent, and neuropathy 0.4 percent. A single study [393] reported deficiencies of iron (17 percent), zinc (6 percent), magnesium (0.3 percent), and vitamin D (43 percent). Although these deficiencies can be restored with replacement therapy, operative revision to diminish the extent of malabsorption has been required in some cases. The incidence of post-operative ventral hernia following open BPD is reported by a single study in the included evidence base [393]. A clinical ventral hernia occurred in 44 percent; an additional 28 percent were found to have a subclinical ventral hernia.

E.2.5 e. Laparoscopic Sleeve Gastrectomy

ES6. (continued) Perioperative (≤30 days) and longer term (>30 days) complications following bariatric surgery vary by procedure and patient-derived risk factors. When performed by an experienced surgeon, perioperative complications following laparoscopic SG:

  • There is insufficient evidence to establish the incidence of perioperative and longer term complications.

Despite the increasing popularity of this procedure, with multiple associated publications, the present evidence from studies that met inclusion criteria was judged insufficient to establish incidence of perioperative and longer term complications following laparoscopic SG.

E.3 Summary

Bariatric surgical procedures have established efficacy for up to 2 years in producing mean weight losses of 20 percent or more and ameliorating obesity-related medical conditions including type 2 diabetes, hypertension, and dyslipidemia. Long-term (5 years or more) data are more limited but suggest continued benefits for most risk factors despite some weight regain over time. The impact of bariatric surgery on health seems to be strongest for diabetes, with decreased incidence and increased likelihood of remission at both 2 and 10 years post-surgery. Data are less robust for hypertension or TC. Most measures of HRQOL, especially those related to physical functioning, improve with bariatric surgery, although some of these improvements wane with weight regain. Type of procedure has an impact on both degree of weight loss and reduction in comorbidities. In general, procedures such as gastric bypass and BPD produce greater weight loss and risk factor reduction as well as greater likelihood of remission from diabetes than less invasive procedures such as LAGB. However, these procedures also have a higher likelihood of short-term complications and adverse effects. Limited data suggest that bariatric surgery may be associated with reductions in total mortality, although further data are needed to determine both the strength of this association in larger samples and directionality by cause of death.

F Recommendations—Bariatric Surgeries

To provide clinicians and patients with practical guidance based on reviewed evidence, the following recommendations regarding bariatric surgery in adults ≥18 years are offered. The recommendations take into account both the demonstrated benefits of bariatric surgery as well as surgical complications and risks of various procedures.

Evidence-based recommendations for the efficacy of bariatric surgery were limited by the small number of bariatric surgical studies meeting inclusion criteria, including nonsurgical comparators plus follow-up of 2 years or more [341, 348, 381, 384-387, 392-394, 396-402]. In addition, in this rapidly changing field, newer and less invasive procedures are being introduced, often with limited clinical trials data. The patient populations in whom bariatric surgical procedures are performed are also being expanded, including patients with BMI in the mildly obese or even overweight range with associated comorbidities. Because of limited data, some recommendations for bariatric surgical patient or procedure selection are therefore based on expert opinion.

Recommendation 5a. Advise adults with a BMI ≥40 or BMI ≥35 with obesity-related comorbid conditions who are motivated to lose weight and who have not responded to behavioral treatment with or without pharmacotherapy with sufficient weight loss to achieve targeted health outcome goals that bariatric surgery may be an appropriate option to improve health and offer referral to an experienced bariatric surgeon for consultation and evaluation.

Recommendation Grade: A (strong)

Rationale: Well-controlled studies comparing various bariatric surgical procedures to usual care, conventional medical treatment, lifestyle intervention, or medically supervised weight loss in obese adults have consistently found superior weight loss for up to 10 years, with 2 to 3 year weight loss in the bariatric surgical group of 20 to 35 percent [341, 381, 385, 394, 397]. Although some regain is likely, mean weight loss at 10 years is still significantly greater than in nonsurgical controls [401]. Short-term weight loss varies with procedure, with the LAGB having the least weight loss, and more extensive procedures, such as RYGB or BPD, producing larger weight losses [350, 404]. Data are limited on patient or procedural factors impacting long-term weight loss.

Consistent data from controlled studies show that bariatric surgery has a favorable impact on glycemic control, including serum or plasma glucose, insulin, and HbA1c, as well as reductions in diabetes incidence and increases in remission [341, 381, 385, 394, 397, 401]. These data are most striking in patients with type 2 diabetes, in whom short-term remission may occur in up to 95 percent at 2 years, depending on the procedure. The extent to which longer duration of diabetes impacts initial remission due to bariatric surgery is not clear. With follow-up out to 10 years, recurrence of type 2 diabetes may occur in about half of patients [401], although more data are needed to ascertain recurrence rates with surgical procedures in use today. Patient factors (age, race/ethnicity, duration of diabetes) or procedural factors that impact long-term recurrence of diabetes remain to be elucidated. A continued benefit for bariatric surgery in prevention of development of diabetes for up to 15 years (hazard ratio 0.17) was recently reported from the Swedish study [419].

The evidence for impact of bariatric surgical procedures on blood pressure, including the development of or remission from hypertension, is less robust than for glycemic control [341, 381, 385, 394, 397, 401]. At 2 to 3 years, some but not all studies show reductions in blood pressure or use of blood pressure medication, remission from hypertension, or lower incidence or prevalence of hypertension in those undergoing bariatric surgery compared with nonsurgical controls. Blood pressure tends to increase over time, and although there are limited long-term data, one study showed greater remission from hypertension at 10 years but no difference in incidence of new cases of hypertension or in systolic blood pressure compared with nonsurgical controls [401].

There is evidence of a favorable impact of bariatric surgical procedures on some components of dyslipidemia at both 2 to 3 and 10 years, including higher HDL-C and lower TG [341, 381, 385, 394, 397, 401]. Data on TC and LDL-C are mixed, although in the few studies in which it was evaluated LDL-TC ratio improved with bariatric surgery.

In addition, this recommendation is supported by a more recent analysis from the SOS, which found benefit of bariatric surgery on incidence of CVD among patients with and without diabetes [420, 421].

Bariatric surgical procedures appear to have a favorable impact on most components of HRQOL for up to 10 years [381, 386, 387, 394]. The degree of improvement appears to correlate with amount of weight loss and is attenuated with regain.

One prospective cohort study, the SOS study [402], found a lower total mortality in those who underwent bariatric surgery compared with controls at 10 years. Most patients in this study underwent vertical banded gastroplasty, which is less efficacious for weight loss and reduction in medical comorbidities than procedures such as gastric bypass [336].

Advances in bariatric surgical approaches, including increasing use of laparoscopic surgery, and improvements in perioperative care have decreased early morbidity and mortality from bariatric surgery; thus, the SOS findings may represent a conservative estimate of the impact of bariatric surgery on mortality. However, more data are needed on both the magnitude and directionality of total and cause-specific mortality.

Because bariatric surgery leads to improvements in both weight-related outcomes and many obesity-related comorbid conditions, the benefit-to-risk ratio may be favorable in appropriately selected patients at high risk for obesity-related morbidity and mortality. In the absence of RCTs to identify the optimal duration and weight loss outcomes of nonsurgical treatment prior to recommending bariatric surgery, the decision to proceed to surgery should be based on multiple factors: patient motivation, treatment adherence, operative risk, and optimization of comorbid conditions, among others. Bariatric surgery should be considered an adjunct to lifestyle treatment: behavioral treatment, appropriate dietary modification, and physical activity.

Recommendation 5b. For individuals with a BMI <35, there is insufficient evidence to recommend for or against undergoing bariatric surgical procedures.

Recommendation Grade: N (no recommendation)

Included studies suggest that patients with a BMI of 30 to 35 achieve more weight loss and greater improvements in CVD risk factors and quality of life than controls undergoing nonsurgical management for up to 2 years [381, 394]. However, in the current evidence review, only one small study meeting inclusion criteria [394] restricted patient BMI to ≤35; all other included studies, even those that recruited patients with a BMI as low as 30 [381], had a mean BMI of >35. The limited data on the impact of bariatric surgical procedures in patients with a BMI <35 on weight loss and maintenance, adverse effects, glycemic control, dyslipidemia, or blood pressure control 2 or more years post-surgery preclude recommendations for or against bariatric surgery in this population. A more recent meta-analysis of bariatric surgery in adults with a BMI <35 and diabetes or impaired glucose tolerance concluded that evidence was insufficient to reach conclusions about appropriate use of bariatric surgery in this population pending additional data on long-term outcomes and complications [422].

Recommendation 5c. Advise patients that choice of a specific bariatric surgical procedure may be affected by patient factors, including age, severity of obesity/BMI, obesity-related comorbid conditions, other operative risk factors, risk of short- and long-term complications, behavioral and psychosocial factors, and patient tolerance for risk as well as provider factors (surgeon and facility).

Recommendation Grade: E (expert opinion)

The evidence review evaluated bariatric surgical procedures currently in common use, including the RYGB, LAGB, SG, and BPD. Bariatric surgery is an evolving field, and new procedures and surgical techniques will continue to be implemented over time. In addition, less invasive experimental procedures to reduce weight or improve metabolic abnormalities are also in development. As experience with newer procedures and techniques grows, early results may not be in line with longer term outcomes. Complication rates or weight loss outcomes may improve as preoperative, perioperative, and postoperative management is refined. Enhancements in patient selection may result in a better match between risk of the procedures and potential health benefits. Alternatively, a newer surgical technique or approach that appears promising initially may have unanticipated adverse effects or result in less optimal outcomes over the long term. Thus, looking at current outcomes data on predictive factors for bariatric surgical procedures provides a snapshot in time in which the only certainty is change.

Different bariatric surgical procedures are likely to have differential effects on metabolic abnormalities and CVD risk factors. For example, data suggestive of greater impact of RYGB compared with LAGB on prevention of diabetes as well as glycemic control in patients with type 2 diabetes may influence choice of procedures in this population, although more data are needed on long-term durability of diabetes remission. Recent data suggest that factors such as baseline hyperinsulinemia or dysglycemia may be more important than initial BMI in determining health benefits from bariatric surgery [423], although additional research is needed, particularly in populations with BMI <35. Behavioral predictors of short- and long-term bariatric surgical outcomes are also limited and insufficient to determine choice of procedure at present. Emerging data, such as a potential association of RYGB with postoperative problem alcohol use [424, 425] and a possible increased risk of suicide or accidental death [426], emphasize the rapidly evolving knowledge in this field and the need for flexibility as our knowledge base increases.

Short- and long-term adverse effects of bariatric surgery are also important considerations when deciding whether to undergo surgery and which procedure will offer the most favorable benefit-to-risk ratio. More extensive procedures also entail greater risk, including perioperative morbidity and mortality, albeit with the potential for increased weight loss and resolution of comorbidities.

There were insufficient data in the literature reviewed to determine the impact of factors such as surgeon or hospital volume on outcomes. However, most of the studies reviewed were conducted in high-volume academic medical centers with experienced bariatric surgeons. It is reasonable to consider factors such as surgeon and hospital bariatric surgical volume and experience, as well as experience with managing the surgical approach being considered, when choosing a surgeon or hospital.

In summary, determining which procedure will provide the greatest likelihood of a favorable outcome is an individual decision for each patient and provider. Patient factors, including underlying medical conditions; initial BMI; behavioral and psychosocial factors; social support; and tolerance for risk; the experience of both the surgeon and hospital; availability of pre- and postoperative care; and procedural differences in short- and long-term benefits and adverse outcomes are all reasonable to consider when choosing whether to undergo bariatric surgery, which procedure to undergo, and where and by whom the surgery should be performed.

G Gaps in Evidence and Future Research Needs

For patients with obesity who have obesity-related comorbid conditions or who are at high risk for their development, bariatric surgery offers the possibility of meaningful health benefits, albeit with significant risks. The potential for prevention or remission of diabetes, better control of CVD risk factors, improvement in quality of life, and possibly decreased mortality underscores the need for research that can better characterize those patients who are most likely to benefit from and least likely to suffer adverse consequences from bariatric surgical procedures. There is a need to understand which surgical procedures are best applied to different populations based on factors such as presence and duration of comorbid conditions, age, sex, race/ethnicity, degree and duration of obesity, underlying genetic etiologies, and psychosocial or behavioral characteristics. Obtaining these data will require large and well-designed experimental, quasi-experimental, and observational studies. The work group identified the following priority questions for research focus:

  1. What are the preoperative, perioperative, and post-operative patient and procedure characteristics that best predict successful prevention or remission of type 2 diabetes both short term and long term?
  2. What are the complications and adverse effects of various bariatric surgical procedures, both short and long term? Which patient or practitioner factors predict such complications?
  3. What is the long-term impact of bariatric surgical procedures on CVD, all-cause, and cause-specific mortality compared with nonsurgical treatment of obesity or its comorbidities? Does this vary by type of procedure or by underlying comorbid condition (e.g., type 2 diabetes, prior CVD)?
  4. Which health effects result from surgically induced metabolic alterations rather than, or in addition to, weight loss?
  5. What is the long-term impact of bariatric surgical procedures on health care utilization and costs?
  6. What is the impact of bariatric surgical procedures on non-CVD or diabetes outcomes, including but not limited to musculoskeletal disease, pulmonary disease, liver disease, cancers, reproductive outcomes (including pregnancy), sleep disorders, and psychosocial outcomes such as substance abuse or depression?
  7. What is the impact of preoperative patient factors, including but not limited to insulin resistance, genetic abnormalities, and psychosocial and behavioral variables such as binge eating, on predicting shorter and longer term outcomes? Do any of these factors moderate the relationship between weight loss and resolution of comorbidities?
  8. What is the long-term impact of bariatric surgical procedures on weight loss and maintenance, CVD risk factors and incidence, type 2 diabetes incidence or remission, other obesity-related morbidity, and mortality in patients with a BMI <35?

The work group members also recognize that the evidence that formed the basis for this report came primarily from studies conducted within academic medical centers. There is a need for studies evaluating the impact of bariatric surgery in nonuniversity hospital and clinical settings, which may be more reflective of real-world medical practices.

Appendix A: Detailed Methods Applying to All Critical Questions

i Description of How Expert Panel Members Were Selected

NHLBI initiated a public call for nominations for panel membership to ensure adequate representation of key specialties and stakeholders and appropriate expertise among expert panel and work group members. A nomination form was posted on the NHLBI Web site for several weeks and distributed to a guidelines leadership group that had given advice to NHLBI on its guideline efforts. Information from nomination forms, including contact information and areas of clinical and research expertise, was entered into a database.

After closing the call for nominations, NHLBI staff reviewed the database and selected a potential chair and co-chair for each expert panel and work group. The potential chairs and co-chairs provided to NHLBI conflict of interest disclosures and a copy of their curriculum vitae. The NHLBI Ethics Office reviewed the disclosures and cleared or rejected individuals being considered as chairs and co-chairs. The selected chairs were then formed into a Guidelines Executive Committee (GEC), which worked with NHLBI to select panel members from the list of nominees.

NHLBI received 440 nominations for potential panel members with appropriate expertise for the task. Panel selection focused on creating a diverse and balanced composition of members. Panel members were selected based on their expertise in the specific topic area (e.g., high blood pressure, high blood cholesterol, and obesity) as well as in such specific disciplines as primary care, nursing, pharmacology, nutrition, exercise, behavioral science, epidemiology, clinical trials, research methodology, evidence-based medicine, guideline development, guideline implementation, systems of care, and informatics. The panels also included, as voting ex officio members, senior scientific staff from NHLBI and other Institutes from the National Institutes of Health (NIH) who are recognized experts in the topics being considered.

ii Description of How Expert Panels Developed and Prioritized Critical Questions

After panels were convened, members were invited to submit topic areas or questions for systematic review. Members were asked to identify topics of the greatest relevance and impact for the target audience of the guideline, which is primary care providers.

Panel members submitted proposed questions and topic areas over a period of several months. The number of critical questions (CQs) was scoped, and questions were prioritized based on available resources. After group discussion, panel members ranked priority CQs through collaborative dialogue and voting. The rationale for each priority CQ is addressed in the main report.

With support from the methodologist and systematic review team, panel members formulated priority CQs. They also developed inclusion and exclusion criteria (I/E criteria) to ensure that criteria were clear and precise and could be applied consistently across literature identified in the search. I/E criteria were defined and formatted using the PICOTS framework. PICOTS is a framework for a structured research question. It includes the following components in the CQ statement or in the question's I/E criteria:

P: person, population
I: intervention, exposure
C: comparator
O: outcome
T: timing
S: setting
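
As an illustration of how the six PICOTS components might be recorded as a structured object, the following minimal sketch uses a Python dataclass. The class name, field names, and the example values are assumptions made for illustration; they are not the panel's actual I/E criteria for any CQ.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class PicotsCriteria:
    """Hypothetical container for the six PICOTS components of one CQ."""
    population: str
    intervention: str
    comparator: str
    outcome: str
    timing: str
    setting: str

    def acronym(self) -> str:
        # Build the acronym from the first letter of each field name,
        # in declaration order.
        return "".join(f.name[0].upper() for f in fields(self))


# Illustrative values only; not the criteria used in the evidence review.
example = PicotsCriteria(
    population="adults with obesity",
    intervention="bariatric surgical procedure",
    comparator="nonsurgical management",
    outcome="weight loss and complications",
    timing="followup of 2 years or more",
    setting="any clinical setting",
)
print(example.acronym())
```

Keeping each component in its own field makes it straightforward to apply the same criteria consistently across every citation returned by a search.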

I/E criteria define the parameters for selecting literature for a particular CQ. I/E criteria were developed with input from the methodologist and systematic review team to ensure that criteria were clear and precise and could be applied consistently across literature identified in the search.

The final CQs and criteria were submitted to the literature search team for search strategy development.

iii Literature Search Infrastructure, Search Strategy Development, and Validation

The literature search was performed using an integrated suite of search engines that explored a central repository of citations and full-text journal articles. The central repository, search engines, search results, and Web-based modules for literature screening and data abstraction were integrated within a technology platform called the Virtual Collaborative Workspace (VCW). The VCW was custom-developed for the NHLBI systematic evidence review initiative.

The central repository consisted of 1.9 million citations and 71,000 full-text articles related to cardiovascular disease (CVD) risk reduction. Citations were acquired from the following databases: PubMed, Embase, CINAHL®, Cochrane, PsycINFO, Wilson Science, and Biological Abstracts®. Literature searches were conducted using a collection of search engines including TeraText®, Content Analyst, Collexis, and Lucene. The first three engines were used for executing search strategies, and Lucene was used to correlate the search with literature screening results.

For every CQ, a literature search and screening were conducted according to the understanding of the question and the I/E criteria that provided specific characteristics of studies relevant to the question. Criteria were framed in the PICOTS format, and the question and PICOTS components were translated into a search strategy involving Boolean and conceptual queries.
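
A Boolean query of this kind, with inclusion and exclusion terms matched against titles, abstracts, and MeSH headings, can be sketched as follows. This is a simplified illustration in plain Python and does not reproduce the query syntax of TeraText, Content Analyst, or Collexis; the `Citation` type, the `matches` function, and all search terms are hypothetical.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Citation:
    """Hypothetical citation record with the fields a Boolean query inspects."""
    title: str
    abstract: str
    mesh_terms: set = field(default_factory=set)


def _words(citation: Citation) -> set:
    # Lowercased word set from the title and abstract.
    text = f"{citation.title} {citation.abstract}".lower()
    return set(re.findall(r"[a-z0-9]+", text))


def matches(citation, required, alternatives, excluded, mesh_alternatives=()):
    """Evaluate: ALL required AND (ANY alternative OR ANY MeSH term) AND NOT excluded.

    Word terms are single lowercase words; MeSH terms are compared
    case-insensitively as whole headings.
    """
    words = _words(citation)
    mesh = {term.lower() for term in citation.mesh_terms}
    has_required = all(term in words for term in required)
    has_alternative = (
        any(term in words for term in alternatives)
        or any(term.lower() in mesh for term in mesh_alternatives)
    )
    has_excluded = any(term in words for term in excluded)
    return has_required and has_alternative and not has_excluded


# Illustrative query terms (assumptions, not the review's actual strategy).
QUERY = dict(
    required=("bariatric",),
    alternatives=("complications", "mortality"),
    excluded=("adolescents",),
    mesh_alternatives=("Postoperative Complications",),
)

c1 = Citation("Perioperative mortality after bariatric surgery", "A cohort of adults.")
c2 = Citation("Bariatric surgery in adolescents", "Complications were rare.")
c3 = Citation("Bariatric surgery outcomes", "Weight change at 2 years.",
              {"Postoperative Complications"})
selected = [c.title for c in (c1, c2, c3) if matches(c, **QUERY)]
print(selected)
```

In this sketch the second citation is dropped by the exclusion term even though it satisfies the inclusion terms, and the third is retained only because of its MeSH heading.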

A Boolean query encodes both inclusion and exclusion rules. It grants access to the maximum quantity of citations, which are then analyzed by text analytics tools and ranked to produce a selection for literature screening. Two independent reviewers conducted this screening in the VCW's Web-based module. Boolean queries select citations by matching words in titles and abstracts, as well as medical subject headings (MeSH) and subheadings. The number of citations resulting from Boolean queries has ranged from a few hundred to several thousand, depending on the question. The text analytics tools suite included:

  • A natural language processing module for automated extraction of data elements to support the application of I/E criteria. Frequently extracted and utilized data elements were study size and intervention followup period.
  • Content Analyst for automatically expanding vocabulary of queries, conceptual retrieval, and conceptual clustering. The conceptual query engine employed in Content Analyst leverages word frequency features and co-occurrence in similar contexts to index, select, and rank results. The indexing uses the singular value decomposition (SVD) algebraic method.
  • TeraText for ranking search engine results and executing operations on literature collections.

Search strategy development was intertwined with the results of literature screening, which provided feedback on search quality and context. Screened literature was categorized into two subsets: relevant or not relevant to the question. Next, results were analyzed to determine the characteristics of relevant versus not relevant citations. Additional keywords and MeSH terms were used to expand or contract the scope of the query as driven by characteristics of relevant citations. If the revised search strategy produced more citations that did not undergo the screening process, then a new batch of citations was added for review. The search strategy refinement/literature review cycle was repeated until all citations covered by the most recent Boolean query had been screened.
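The refinement cycle described above can be pictured as a set difference: each revised query may match citations that have not yet been screened, and the cycle ends once no such citations remain. The sketch below is a minimal illustration under that reading; the citation identifiers and the function name are hypothetical.

```python
def next_batch(query_results, screened):
    """Return citations matched by the current query that were never screened."""
    return set(query_results) - set(screened)


# Hypothetical citation identifiers.
screened = {"c1", "c2", "c3"}
revised_results = {"c2", "c3", "c4", "c5"}

batch = next_batch(revised_results, screened)  # new citations to review
screened |= batch                              # reviewers screen the batch
remaining = next_batch(revised_results, screened)
print(sorted(batch), len(remaining))
```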

Each search strategy was developed and implemented in the VCW. The methodologist and panel members reviewed the search strategy, which was available for viewing and printing at any time by panel members and staff collaborating on the systematic review. The search strategy was available for execution and supplying literature updates until the literature search and screening cut-off date.

An independent methodology team validated the search strategies for a sample of questions. As part of this validation process, the methodology team developed and executed a separate search strategy and screened a random sample of citations against I/E criteria. Then, these results were compared with the search and screening results developed by the systematic review team. Based on the validation process, the searches were considered appropriate. In addition, studies identified in systematic reviews and meta-analyses were cross-checked against a CQ's list of studies included in the evidence base to ensure completeness of the search strategy.

iv Process for Literature Review and Application of I/E Criteria

Using results of the search strategy, criteria were applied to screen literature for inclusion or exclusion in the evidence base for the CQ. I/E criteria address the parameters in the PICOTS framework and determine what types of studies are eligible and appropriate to answer the CQ. When appropriate, the panel members added (with guidance from the methodology team) I/E criteria, such as sample size restrictions, to fit the context of the CQ. To enhance the quality of the abstracted literature, these criteria were applied uniformly (by the systematic review and methodology teams) within a given question.

a Pilot Literature Screening Mode

In the pilot literature screening mode, two reviewers independently screened the first 50 titles or abstracts in the search strategy results by applying I/E criteria. Reviewers voted to include or exclude the publication for full-text review. To ensure I/E criteria were applied consistently, reviewers compared their results. Discrepancies in votes were discussed, and clarification on criteria was sought from the panel when appropriate. For example, if criteria were not specific enough to be clearly applied to include or exclude a citation, then they sought guidance to word the criteria more explicitly.

During this phase, reviewers provided feedback to the literature search team about the relevance of search strategy results; the team used this feedback to further refine and optimize the search.

b Phase 1: Title and Abstract Screening Phase

After completing the pilot mode phase, two reviewers independently screened search results at the title and abstract level by applying I/E criteria. Reviewers voted to include or exclude the publication for full-text review.

When at least one reviewer voted to include a publication based on the title and abstract review, the publication advanced to Phase 2, full-text screening. When both reviewers voted to exclude a publication, then it was excluded and not reviewed further. These citations are maintained in the VCW and marked as “excluded at title/abstract phase.”

c Phase 2: Full-Text Screening Phase

In Phase 2, two reviewers independently applied I/E criteria to the full-text article and voted for “include,” “exclude,” or “undecided.” The reviewer specified the rationale for exclusion (e.g., population, intervention, etc.) in this phase.

Articles that both reviewers voted to include were moved to the “include” list. Similarly, articles that both reviewers voted to exclude were moved to the “exclude” list. These citations were maintained in the VCW and identified as “excluded at the full article phase,” and the rationale for exclusion was noted. Any articles with discrepant votes (i.e., one include and one undecided, one include and one exclude, or one exclude and one undecided) advanced to Phase 3.
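The voting rules for Phases 1 and 2 can be summarized in a short sketch. The function names and return strings are illustrative. The text does not say what happens when both reviewers vote “undecided” in Phase 2; the sketch assumes such articles also go to Phase 3.

```python
def phase1_decision(vote_a, vote_b):
    """Title/abstract screening: a single 'include' vote is enough to advance."""
    if "include" in (vote_a, vote_b):
        return "advance to Phase 2"
    return "excluded at title/abstract phase"


def phase2_decision(vote_a, vote_b):
    """Full-text screening: only unanimous votes settle an article."""
    valid = {"include", "exclude", "undecided"}
    if vote_a not in valid or vote_b not in valid:
        raise ValueError("votes must be include, exclude, or undecided")
    if vote_a == vote_b == "include":
        return "include"
    if vote_a == vote_b == "exclude":
        return "excluded at the full article phase"
    # Discrepant votes (and, by assumption, two 'undecided' votes).
    return "advance to Phase 3"


print(phase1_decision("include", "exclude"))
print(phase2_decision("include", "undecided"))
```

Note the asymmetry: Phase 1 favors sensitivity (one vote advances a citation), whereas Phase 2 requires agreement before an article is included or excluded.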

d Phase 3: Resolution and Consultation Phase

In this phase, reviewers discussed their discrepant votes for “include,” “exclude,” or “undecided” and cited the relevant criteria for their decision. The two reviewers attempted to achieve consensus through collaborative discussion. If the reviewers could not reach consensus, then they consulted the methodologist. If they were still unable to reach a consensus, then they consulted the panel; however, the methodologist had the final decision. The final disposition of the article (“include” or “exclude”) was recorded in the VCW along with comments from the adjudication process.

Similar to search strategies being posted and available for viewing on the VCW, all citations screened for a CQ were maintained in the VCW with their reviewer voting status and collected comments.

v Description of Methods for Quality Assessment of Individual Studies

Articles meeting the I/E criteria after the three-phase literature review process were then rated for quality. Each study design used a separate quality rating tool.

a Design of the Quality Assessment Tools

Six quality assessment tools, developed by NHLBI and the methodology team, were used to evaluate the quality of individual studies. The tools were based on quality assessment methods, concepts, and other tools developed by researchers in the Agency for Healthcare Research and Quality's (AHRQ) Evidence-Based Practice Centers (EPCs), the Cochrane Collaborative, the U.S. Preventive Services Task Force (USPSTF), the National Health Service Centre for Reviews and Dissemination, as well as consulting epidemiologists and others working in evidence-based medicine. The methodology team and NHLBI staff adapted these tools for this project.

The tools were designed to help reviewers focus on concepts that are key for evaluating the internal validity of a study. The tools were not designed to provide a list of factors comprising a numeric score; instead, they were specific to individual types of study designs. They are described in more detail below.

The tools included items for evaluating potential flaws in study methods or implementation, including sources of bias (e.g., patient selection, performance, attrition, and detection), confounding, study power, the strength of causality in the association between interventions and outcomes, and other factors. Quality reviewers could select “yes,” “no,” or “cannot determine/not reported/not applicable” in response to each item in the tool. For each item where “no” was selected, reviewers were instructed to consider the potential risk of bias that could be introduced by that flaw in the study design or implementation. “Cannot determine” and “not reported” were also noted as representing potential flaws.

Each of the six quality assessment tools has a detailed guidance document (except for the quality assessment tool for case series studies), which was also developed by the methodology team and NHLBI. The guidance documents are specific to each tool and provide detailed descriptions and examples about how to apply the items, as well as justifications for including each item. For some items, examples were provided to clarify the intent of the question and the appropriate rater response. The six quality assessment tools and five related guidance documents are included in Tables A-1 through A-6.

b Significance of the Quality Ratings of Good, Fair, or Poor

Using the quality assessment tools, reviewers rated each study as “good,” “fair,” or “poor” quality. Reviewers used the ratings on different items in the tool to assess the risk of bias in the study due to flaws in study design or implementation.

In general terms, a “good” study has the least risk of bias and results are considered to be valid. A “fair” study is susceptible to some bias deemed not sufficient to invalidate its results. The fair quality category is likely to be broad, so studies with this rating will vary in their strengths and weaknesses.

A “poor” rating indicates significant risk of bias. Studies rated poor were excluded from the body of evidence to be considered for each CQ. The only exception was when no other evidence was available; in that case, poor quality studies could be considered.
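A minimal sketch of this selection rule follows, assuming each study is represented as a hypothetical (study identifier, rating) pair. When no good or fair studies exist, the sketch returns the poor quality studies as candidates; the text says only that they could be considered, not that they must be used.

```python
def evidence_base(rated_studies):
    """Keep good and fair studies; fall back to poor ones only if nothing else exists."""
    kept = [(sid, rating) for sid, rating in rated_studies if rating in ("good", "fair")]
    if kept:
        return kept
    return list(rated_studies)  # only poor quality evidence is available


# Hypothetical ratings for illustration.
mixed = [("study A", "good"), ("study B", "poor"), ("study C", "fair")]
only_poor = [("study D", "poor")]
print(evidence_base(mixed))
print(evidence_base(only_poor))
```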

c Training for the Application of Quality Assessment Tools

The methodology team conducted a series of training sessions on using four of the quality assessment tools. Initial training consisted of 2-day, in-person training sessions. Reviewers trained in the quality rating were master's or doctorate-level staff with a background in public health or health sciences. Training sessions included instruction on identifying the correct study designs, the theory behind evidence-based research and quality assessment, explanations and rationales for the items in each tool, and methods for achieving overall judgments regarding quality ratings of “good,” “fair,” or “poor.” Participants practiced evaluating multiple articles, both with the instructors and during group work. They were also instructed to refer to related articles on study methods if such papers were cited in the articles being rated.

Following the in-person training sessions, the methodology team assigned several articles with pertinent study designs to test the abilities of each reviewer. The methodology team asked reviewers to individually identify the correct study design, complete the appropriate quality assessment tool, and submit it to the team for grading against a methodologist-developed key. Next, the reviewers participated in a second round of training sessions, conducted by telephone, to review results and resolve any remaining misinterpretations. Based on the results of these evaluations, a third round of exercises and training sessions was sometimes convened.

The quality assessment tools for the before-after and case series studies were used only for the Obesity Panel's CQ5, which addresses bariatric surgery interventions. This CQ included those types of study designs and related issues specific to this surgical intervention. As a result, a formal training program for using these quality assessment tools was not conducted; instead, reviewers for CQ5 received individual training.

d Quality Assessment Process

The systematic review team or methodology team rated each article that met a CQ's inclusion criteria. Two reviewers independently rated the quality of each article, using the appropriate tool. If the ratings differed, then the reviewers discussed the article in an effort to reach consensus. If they were unable to reach consensus, then a methodologist judged the quality of the article.

Two methodologists independently rated systematic reviews and meta-analyses. If ratings differed, then the reviewers discussed the article in an effort to reach consensus. If they were unable to reach consensus, then a third methodologist judged the quality.

After they received the initial quality rating, panel members could appeal the rating of a particular study or publication. However, to enhance the objectivity of the quality rating process, the final decision on quality ratings was made by the methodology team, not the panel members.

vi Quality Assessment Tool for Controlled Intervention Studies

Table A-1 shows the quality assessment tool for controlled intervention studies along with the guidance document for that tool. The methodology team and NHLBI developed this tool based in part on criteria from the AHRQ EPCs, the USPSTF, and the National Health Service Centre for Reviews and Dissemination.

This tool addresses 14 elements of quality assessment. They include randomization and allocation concealment, similarity of compared groups at baseline, use of intent-to-treat (ITT) analysis (i.e., analysis of all randomized patients even if some were lost to followup), adequacy of blinding, the overall percentage of subjects lost to followup, differential rates of loss to followup between the intervention and control groups, and other factors.

Table A-1. Quality Assessment Tool for Controlled Intervention Studies
[Table A-1 appears as an image in the original and is not reproduced here.]

vii Guidance for Assessing the Quality of Controlled Intervention Studies

The guidance document below is organized by question number from the tool for quality assessment of controlled intervention studies.

Question 1. Described as randomized

Was the study described as randomized? A study does not satisfy quality criteria as randomized simply because the authors call it randomized; however, it is a first step in determining if a study is randomized.

Questions 2 and 3. Treatment allocation—two interrelated pieces

  • Adequate randomization: Randomization is adequate if it occurred according to the play of chance (e.g., computer generated sequence in more recent studies, or random number table in older studies).
  • Inadequate randomization: Randomization is inadequate if there is a preset plan (e.g., alternation, where every other subject is assigned to the treatment arm, or another method of allocation such as time or day of hospital admission or clinic visit, ZIP Code, or phone number). In fact, this is not randomization at all; it is another method of assignment to groups. If assignment is not by the play of chance, then the answer to this question is no.

There may be some tricky scenarios that will need to be read carefully and considered for the role of chance in assignment. For example, randomization may occur at the site level, where all individuals at a particular site are assigned to receive treatment or no treatment. This scenario is used for group-randomized trials, which can be truly randomized, but often are “quasi-experimental” studies with comparison groups rather than true control groups. (Few, if any, group-randomized trials are anticipated for this evidence review.)

  • Allocation concealment: This means that one does not know in advance, or cannot guess accurately, to what group the next person eligible for randomization will be assigned. Methods include sequentially numbered opaque sealed envelopes, numbered or coded containers, central randomization by a coordinating center, computer-generated randomization that is not revealed ahead of time, etc.

Questions 4 and 5. Blinding

Blinding means that one does not know to which group (intervention or control) the participant is assigned. It is also sometimes called “masking.” The reviewer assessed whether each of the following was blinded to knowledge of treatment assignment: [1] the person assessing the primary outcome(s) for the study (e.g., taking the measurements such as blood pressure, examining health records for events such as myocardial infarction, reviewing and interpreting test results such as x-ray or cardiac catheterization findings); [2] the person receiving the intervention (e.g., the patient or other study participant); and [3] the person providing the intervention (e.g., the physician, nurse, pharmacist, dietitian, or behavioral interventionist).

Generally placebo-controlled medication studies are blinded to patient, provider, and outcome assessors; behavioral, lifestyle, and surgical studies are examples of studies that are frequently blinded only to the outcome assessors because blinding of the persons providing and receiving the interventions is difficult in these situations. Sometimes the individual providing the intervention is the same person performing the outcome assessment. This was noted when it occurred.

Question 6. Similarity of groups at baseline

This question relates to whether the intervention and control groups have similar baseline characteristics on average, especially those characteristics that may affect the intervention or outcomes. The point of randomized trials is to create groups that are as similar as possible except for the intervention(s) being studied in order to compare the effects of the interventions between groups. When reviewers abstracted baseline characteristics, they noted when there was a significant difference between groups. Baseline characteristics for intervention groups are usually presented in a table in the article (often Table 1).

Groups can differ at baseline without raising red flags if: [1] the differences would not be expected to have any bearing on the interventions and outcomes; or [2] the differences are not statistically significant. When concerned about baseline differences between groups, reviewers recorded them in the comments section and considered them in their overall determination of the study quality.

Questions 7 and 8. Dropout

“Dropouts” in a clinical trial are individuals for whom there are no end point measurements, often because they dropped out of the study and were lost to followup.

Generally, an acceptable overall dropout rate is considered 20 percent or less of participants who were randomized or allocated into each group. An acceptable differential dropout rate is an absolute difference between groups of 15 percentage points at most (calculated by subtracting the dropout rate of one group from the dropout rate of the other group). However, these are general rates. Lower overall dropout rates are expected in shorter studies, whereas higher overall dropout rates may be acceptable for studies of longer duration. For example, a 6-month study of weight loss interventions should be expected to have nearly 100 percent followup (almost no dropouts; nearly everybody gets their weight measured regardless of whether or not they actually received the intervention), whereas a 10-year study testing the effects of intensive blood pressure lowering on heart attacks may be acceptable if there is a 20-25 percent dropout rate, especially if the dropout rate between groups was similar. The panels for the NHLBI systematic reviews may set different levels of dropout caps.

By contrast, the cap on differential dropout is not flexible; it is fixed at 15 percentage points. A differential dropout rate of 15 percentage points or higher between arms carries a serious potential for bias. This constitutes a fatal flaw, resulting in a poor quality rating for the study.
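The arithmetic behind Questions 7 and 8 can be sketched as follows. The function names and the example counts are hypothetical. The guidance treats a differential of exactly 15 points inconsistently (acceptable in one sentence, a fatal flaw in the next); this sketch assumes the stricter reading, in which 15 points or more is flagged. It also assumes the overall rate is pooled across both arms.

```python
def dropout_rate(randomized, completed):
    """Percentage of randomized participants with no end point measurement."""
    return 100.0 * (randomized - completed) / randomized


def assess_dropout(arm_a, arm_b, overall_cap=20.0, differential_cap=15.0):
    """Each arm is a (randomized, completed) pair of counts."""
    rate_a = dropout_rate(*arm_a)
    rate_b = dropout_rate(*arm_b)
    overall = dropout_rate(arm_a[0] + arm_b[0], arm_a[1] + arm_b[1])
    differential = abs(rate_a - rate_b)  # absolute difference, in percentage points
    return {
        "overall": overall,
        "differential": differential,
        "overall_acceptable": overall <= overall_cap,
        # Assumption: 15 points or more counts as a fatal flaw.
        "differential_fatal": differential >= differential_cap,
    }


# Hypothetical trial: 200 randomized per arm; 170 and 130 completed.
result = assess_dropout((200, 170), (200, 130))
print(result)
```

The overall cap is exposed as a parameter because the guidance allows panels to set different caps depending on study duration.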

Question 9. Adherence

Did participants in each treatment group adhere to the protocols for assigned interventions? For example, if Group 1 was assigned to 10 mg/day of Drug A, did most of them take 10 mg/day of Drug A? Another example is a study evaluating the difference between a 30-pound weight loss and a 10-pound weight loss on specific clinical outcomes (e.g., heart attacks), but the 30-pound weight loss group did not achieve its intended weight loss target (e.g., the group only lost 14 pounds on average). A third example is whether a large percentage of participants assigned to one group “crossed over” and got the intervention provided to the other group. A final example is when one group that was assigned to receive a particular drug at a particular dose had a large percentage of participants who did not end up taking the drug or the dose as designed in the protocol.

Question 10. Avoid other interventions

Changes that occur in the study outcomes being assessed should be attributable to the interventions being compared in the study. If study participants receive interventions that are not part of the study protocol and could affect the outcomes being assessed, and they receive these interventions differentially, then there is cause for concern because these interventions could bias results. The following scenario illustrates how such bias can occur. In a study comparing two different dietary interventions on serum cholesterol, one group had a significantly higher percentage of participants taking statin drugs than the other group. In this situation, it would be impossible to know if a difference in outcome was due to the dietary intervention or the drugs.

Question 11. Outcome measures assessment

What tools or methods were used to measure the outcomes in the study? Were the tools and methods accurate and reliable—for example, have they been validated, or are they objective? This is important as it indicates the confidence you can have in the reported outcomes. Perhaps even more important is ascertaining that outcomes were assessed in the same manner within and between groups. One example of differing methods is self-report of dietary salt intake versus urine testing for sodium content (a more reliable and valid assessment method). Another example is using BP measurements taken by practitioners who use their usual methods versus using BP measurements done by individuals trained in a standard approach. Such an approach may include using the same instrument each time and taking an individual's BP multiple times. In each of these cases, the answer to this assessment question would be “no” for the former scenario and “yes” for the latter. In addition, a study in which an intervention group was seen more frequently than the control group, enabling more opportunities to report clinical events, would not be considered reliable and valid.

Question 12. Power calculation

Generally, a study's methods section will address the sample size needed to detect differences in primary outcomes. The current standard is at least 80 percent power to detect a clinically relevant difference in an outcome using a two-sided alpha of 0.05. Often, however, older studies will not report on power.
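To make the 80 percent power and two-sided alpha of 0.05 concrete, the sketch below applies the standard normal-approximation formula for comparing two means with equal group sizes: n per group = 2 × ((z for 1 − alpha/2) + (z for power))² × SD² / difference². The choice of this particular formula, the function name, and the example values are assumptions for illustration; real trials may use different designs and calculations.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate sample size per arm to detect a mean difference `delta`
    given a common standard deviation `sigma` (two-sided test)."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = std_normal.inv_cdf(power)           # about 0.84 for 80 percent power
    return ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)


# Hypothetical example: detect a 5-unit difference when the SD is 10 units.
n = n_per_group(delta=5, sigma=10)
print(n)
```

Halving the detectable difference roughly quadruples the required sample size, which is one reason underpowered studies are common when the clinically relevant difference is small.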

Question 13. Prespecified outcomes

Investigators should prespecify outcomes reported in a study for hypothesis testing—which is the reason for conducting an RCT. Without prespecified outcomes, the study may be reporting ad hoc analyses, simply looking for differences supporting desired findings. Investigators also should prespecify subgroups being examined. Most RCTs conduct numerous post hoc analyses as a way of exploring findings and generating additional hypotheses. The intent of this question is to give more weight to reports that are not simply exploratory in nature.

Question 14. Intention-to-treat analysis

Intention-to-treat (ITT) means everybody who was randomized is analyzed according to the original group to which they are assigned. This is an extremely important concept because conducting an ITT analysis preserves the whole reason for doing a randomized trial; that is, to compare groups that differ only in the intervention being tested. When the ITT philosophy is not followed, groups being compared may no longer be the same. In this situation, the study would likely be rated poor. However, if an investigator used another type of analysis that could be viewed as valid, this would be explained in the “other” box on the quality assessment form. Some researchers use a completers analysis (an analysis of only the participants who completed the intervention and the study), which introduces significant potential for bias. Characteristics of participants who do not complete the study are unlikely to be the same as those who do. The likely impact of participants withdrawing from a study treatment must be considered carefully. ITT analysis provides a more conservative (potentially less biased) estimate of effectiveness.
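The difference between an ITT analysis and a completers analysis comes down to which participants enter each comparison group. The sketch below uses invented toy records to show the two analysis sets. It does not address how outcomes are obtained for participants who did not complete the study, which an actual ITT analysis must handle.

```python
# Toy records, invented purely for illustration.
participants = [
    {"id": 1, "assigned": "intervention", "completed": True},
    {"id": 2, "assigned": "intervention", "completed": False},
    {"id": 3, "assigned": "intervention", "completed": True},
    {"id": 4, "assigned": "control", "completed": True},
    {"id": 5, "assigned": "control", "completed": True},
    {"id": 6, "assigned": "control", "completed": False},
]


def analysis_sets(records, intention_to_treat=True):
    """Group participant ids by assigned arm.

    ITT keeps everyone who was randomized; a completers analysis
    drops participants who did not finish the study.
    """
    sets = {}
    for person in records:
        if intention_to_treat or person["completed"]:
            sets.setdefault(person["assigned"], []).append(person["id"])
    return sets


itt = analysis_sets(participants)
completers = analysis_sets(participants, intention_to_treat=False)
print(itt)
print(completers)
```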

General Guidance for Determining the Overall Quality Rating of Controlled Intervention Studies

The questions on the assessment tool were designed to help reviewers focus on the key concepts for evaluating a study's internal validity. They are not intended to create a list that is simply tallied up to arrive at a summary judgment of quality.

Internal validity is the extent to which the results (effects) reported in a study can truly be attributed to the intervention being evaluated and not to flaws in the design or conduct of the study; in other words, it is the ability of the study to support causal conclusions about the effects of the intervention being tested. Such flaws can increase the risk of bias. Critical appraisal involves considering the potential for allocation bias, measurement bias, or confounding (the mixture of exposures that one cannot tease out from each other). Examples of confounding include co-interventions, differences at baseline in patient characteristics, and other issues addressed in the questions above. High risk of bias translates to a rating of poor quality. Low risk of bias translates to a rating of good quality.

Fatal flaws: If a study has a “fatal flaw,” then risk of bias is significant, and the study is of poor quality. Examples of fatal flaws in RCTs include high dropout rates, high differential dropout rates, and no ITT analysis or other unsuitable statistical analysis (e.g., completers-only analysis).

Generally, when evaluating a study, one will not see a “fatal flaw”; however, one will find some risk of bias. During training, reviewers were instructed to look for the potential for bias in studies by focusing on the concepts underlying the questions in the tool. For any box checked “no,” reviewers were told to ask: “What is the potential risk of bias that may be introduced by this flaw?” That is, does this factor cause one to doubt the results that were reported in the study?

NHLBI staff provided reviewers with background reading on critical appraisal, while emphasizing that the best approach to use is to think about the questions in the tool in determining the potential for bias in a study. The staff also emphasized that each study has specific nuances; therefore, reviewers should familiarize themselves with the key concepts.

viii Quality Assessment Tool for Systematic Reviews and Meta-Analyses

Table A-2 shows the quality assessment tool for systematic reviews and meta-analyses along with the guidance document for that tool. The methodology team and NHLBI developed this tool based in part on criteria from AHRQ's EPCs and the Cochrane Collaborative.

Table A-2. Quality Assessment Tool for Systematic Reviews and Meta-Analyses
[Table A-2 appears as an image in the original and is not reproduced here.]

This tool addresses eight elements of quality assessment. They include use of prespecified eligibility criteria, use of a comprehensive and systematic literature search process, dual review for abstracts and full-text articles, quality assessment of individual studies, assessment of publication bias, and other factors.

ix Guidance for Quality Assessment of Systematic Reviews and Meta-Analyses

A systematic review is a study that attempts to answer a question by synthesizing the results of primary studies while using strategies to limit bias and random error [424]. These strategies include a comprehensive search of all potentially relevant articles and the use of explicit, reproducible criteria in the selection of articles included in the review. Research designs and study characteristics are appraised, data are synthesized, and results are interpreted using a predefined systematic approach that adheres to evidence-based methodological principles.

Systematic reviews can be qualitative or quantitative. A qualitative systematic review summarizes the results of the primary studies but does not combine the results statistically. A quantitative systematic review, or meta-analysis, is a type of systematic review that employs statistical techniques to combine the results of the different studies into a single pooled estimate of effect, often given as an odds ratio.
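One common way to combine study results into a single pooled estimate is inverse-variance weighting under a fixed-effect model, applied here to log odds ratios. The sketch below is offered only as an illustration of that general technique; the text does not specify which pooling method any given meta-analysis uses, and the study values shown are hypothetical.

```python
from math import exp, sqrt


def pool_fixed_effect(studies):
    """Inverse-variance pooling of (log_odds_ratio, standard_error) pairs."""
    weights = [1.0 / se ** 2 for _, se in studies]  # more precise studies weigh more
    total = sum(weights)
    pooled_log_or = sum(w * y for w, (y, _) in zip(weights, studies)) / total
    pooled_se = sqrt(1.0 / total)
    return pooled_log_or, pooled_se


# Hypothetical studies: (log odds ratio, standard error).
studies = [(-0.30, 0.20), (-0.10, 0.15), (-0.25, 0.30)]
log_or, se = pool_fixed_effect(studies)

# Convert back to the odds ratio scale with an approximate 95 percent interval.
odds_ratio = exp(log_or)
ci_low, ci_high = exp(log_or - 1.96 * se), exp(log_or + 1.96 * se)
print(round(odds_ratio, 2), round(ci_low, 2), round(ci_high, 2))
```

Pooling is done on the log scale because log odds ratios are approximately normally distributed, whereas odds ratios themselves are skewed.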

The guidance document below is organized by question number from the tool for quality assessment of systematic reviews and meta-analyses.

Question 1. Focused question

The review should be based on a question that is clearly stated and well-formulated. An example would be a question that uses the PICO (population, intervention, comparator, outcome) format, with all components clearly described.

Question 2. Eligibility criteria

The eligibility criteria used to determine whether studies were included or excluded should be clearly specified and predefined. It should be clear to the reader why studies were included or excluded.

Question 3. Literature search

The search strategy should employ a comprehensive, systematic approach in order to capture all of the evidence possible that pertains to the question of interest. At a minimum, a comprehensive review has the following attributes:

  • Electronic searches were conducted using multiple scientific literature databases, such as MEDLINE, EMBASE, Cochrane Central Register of Controlled Trials, PsychLit, and others as appropriate for the subject matter.
  • Manual searches of references found in articles and textbooks should supplement the electronic searches.

Additional search strategies that may be used to improve the yield include the following:

  • Studies published in other countries
  • Studies published in languages other than English
  • Identification by experts in the field of studies and articles that may have been missed
  • Search of grey literature, including technical reports and other papers from government agencies or scientific groups or committees; presentations and posters from scientific meetings, conference proceedings, unpublished manuscripts; and others. Searching the grey literature is important (whenever feasible) because sometimes only positive studies with significant findings are published in the peer-reviewed literature, which can bias the results of a review.

The review should describe the literature search strategy clearly enough that others could reproduce it and obtain similar results.

Question 4. Dual review for determining which studies to include and exclude

Titles, abstracts, and full-text articles (when indicated) should be reviewed by two independent reviewers to determine which studies to include and exclude in the review. Reviewers resolved disagreements through discussion and consensus or with third parties. They clearly stated the review process, including methods for settling disagreements.

Question 5. Quality appraisal for internal validity

Each included study should be appraised for internal validity (study quality assessment) using a standardized approach for rating the quality of the individual studies. Ideally, at least two independent reviewers should appraise each study for internal validity. However, there is not one commonly accepted, standardized tool for rating the quality of studies. So, in the research papers, reviewers looked for an assessment of the quality of each study and a clear description of the process used.

Question 6. List and describe included studies

All included studies were listed in the review, along with descriptions of their key characteristics. This was presented either in narrative or table format.

Question 7. Publication bias

Publication bias is a term used when studies with positive results have a higher likelihood of being published, being published rapidly, being published in higher impact journals, being published in English, being published more than once, or being cited by others [425, 426]. Publication bias can be linked to favorable or unfavorable treatment of research findings due to investigators, editors, industry, commercial interests, or peer reviewers. To minimize the potential for publication bias, researchers can conduct a comprehensive literature search that includes the strategies discussed in Question 3.

A funnel plot—a scatter plot of component studies in a meta-analysis—is a commonly used graphical method for detecting publication bias. If there is no significant publication bias, the graph looks like a symmetrical inverted funnel.

Reviewers assessed and clearly described the likelihood of publication bias.

Question 8. Heterogeneity

Heterogeneity is used to describe important differences in studies included in a meta-analysis that may make it inappropriate to combine the studies [427]. Heterogeneity can be clinical (e.g., important differences between study participants, baseline disease severity, and interventions); methodological (e.g., important differences in the design and conduct of the study); or statistical (e.g., important differences in the quantitative results or reported effects).

Researchers usually assess clinical or methodological heterogeneity qualitatively by determining whether it makes sense to combine studies. For example:

  • Should a study evaluating the effects of an intervention on CVD risk that involves elderly male smokers with hypertension be combined with a study that involves healthy adults ages 18 to 40? (Clinical Heterogeneity)
  • Should a study that uses a randomized controlled trial (RCT) design be combined with a study that uses a case-control study design? (Methodological Heterogeneity)

Statistical heterogeneity describes the degree of variation in the effect estimates from a set of studies; it is assessed quantitatively. The two most common methods used to assess statistical heterogeneity are the Q test (also known as the χ² or chi-square test) and the I² test.

Reviewers examined studies to determine if an assessment for heterogeneity was conducted and clearly described. If the studies are found to be heterogeneous, the investigators should explore and explain the causes of the heterogeneity, and determine what influence, if any, the study differences had on overall study results.
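The two statistics named above can be illustrated with a minimal sketch. It assumes the usual inverse-variance formulation of Cochran's Q and the common definition of I² as the share of Q that exceeds its degrees of freedom. The function name and the input numbers are hypothetical and are not taken from any study in this report.

```python
def cochran_q_and_i2(effects, variances):
    """Return (Q, degrees of freedom, I2 in percent) for a set of studies.

    effects   -- per-study effect estimates (e.g., log odds ratios)
    variances -- per-study sampling variances of those estimates
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need matching lists with at least two studies")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")

    # Inverse-variance weights: more precise studies count more.
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)

    # Q is the weighted sum of squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1

    # I2 is the proportion of Q beyond what chance alone would produce,
    # truncated at zero.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, df, i2


# Hypothetical log odds ratios and variances from four studies.
q, df, i2 = cochran_q_and_i2([0.10, 0.35, -0.05, 0.60], [0.04, 0.02, 0.05, 0.03])
print(f"Q = {q:.2f} on {df} df, I2 = {i2:.0f}%")
```

Q is compared against a chi-square distribution with the stated degrees of freedom, while I² is read as a percentage; both describe only statistical heterogeneity and do not replace the qualitative judgment of clinical or methodological heterogeneity.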

x Quality Assessment Tool for Cohort and Cross-Sectional Studies

Table A-3 shows the quality assessment tool for cohort and cross-sectional studies along with the guidance document for that tool. The methodology team and NHLBI developed this tool based in part on criteria from AHRQ's EPCs, the USPSTF, consultation with epidemiologists, and other sources.

This tool addresses 13 elements of quality assessment. They include: clarity of the research question or research objective; definition, selection, composition, and participation of the study population; definition and assessment of exposure and outcome variables; measurement of exposures prior to outcome assessment; study timeframe and followup; study analysis and power; and other factors.

xi Guidance for Assessing the Quality of Cohort and Cross-Sectional Studies

The guidance document below is organized by question number from the tool for quality assessment of cohort and cross-sectional studies.

Question 1. Research question

To answer this question, reviewers asked: Did the authors describe their research goal? Is it easy to understand what they were looking to find? This issue is important for all types of scientific papers. Higher quality scientific research explicitly defines a research question.

Questions 2 and 3. Study population

Reviewers asked: Did the authors describe the group of individuals from which the study participants were selected or recruited, using demographics, location, and time period? If the authors conducted this study again, would they know whom to recruit, from where, and from what time period? Is the cohort population free of the outcome of interest at the time they were recruited?

An example would be men over 40 years old with type 2 diabetes who began seeking medical care at Phoenix Good Samaritan Hospital between January 1, 1990 and December 31, 1994. In this example the population is clearly described as [1] who (men over 40 years old with type 2 diabetes); [2] where (Phoenix Good Samaritan Hospital); and [3] when (between January 1, 1990 and December 31, 1994). Another example is women who were in the nursing profession ages 34 to 59 with no known coronary disease, stroke, cancer, hypercholesterolemia, or diabetes, recruited from the 11 most populous States, with contact information obtained from State nursing boards.

In cohort studies, it is crucial that the population at baseline is free of the outcome of interest. For example, the nurses' population above would be an appropriate group in which to study incident coronary disease. This information is usually found either in descriptions of population recruitment, definitions of variables, or inclusion/exclusion criteria.

When needed, reviewers examined prior papers on methods in order to assess this question. They usually found the papers in the reference list.

If fewer than 50 percent of eligible persons participated in the study, then there is concern that the study population does not adequately represent the target population. This increases the risk of bias.

Question 4. Groups recruited from the same population and uniform eligibility criteria

Were the inclusion and exclusion criteria developed prior to recruitment or selection of the study population? Were the same inclusion and exclusion criteria used for all of the subjects involved? This issue is related to the description of the study population addressed in the section above. Reviewers may find information for both of these questions in the same section of the paper.

Most cohort studies begin with selection of the cohort; participants in this cohort are then measured or evaluated for their exposure status. However, some cohort studies recruit or select exposed participants from a different time or place than that of unexposed participants, especially retrospective cohort studies. In these retrospective studies, data are obtained from the past (retrospectively), but the analysis examines exposures prior to outcomes. The following question addresses the similarity of populations: Are diabetic men with clinical depression at higher risk for cardiovascular disease than those without clinical depression? In this example, diabetic men with depression might be selected from a mental health clinic and diabetic men without depression might be selected from an internal medicine or endocrinology clinic. Because this study recruits groups from different clinic populations, the answer to Question 4 would be “no.” However, the selection of women nurses described under Questions 2 and 3 was based on the same inclusion/exclusion criteria, so in that case the answer to Question 4 would be “yes.”

Question 5. Sample size justification

Specifically, Question 5 asks: Did the authors present their reasons for selecting or recruiting the number of individuals included or analyzed? Did they note or discuss the statistical power of the study and provide a target value? This question addresses whether the study had enough participants to detect an association if one truly existed.

Reviewers examined methods sections of articles for an explanation of the sample size needed to detect a hypothesized difference in outcomes. Reviewers examined discussion sections of articles for information on statistical power (e.g., the study had an 85 percent power to detect a 20 percent increase in the rate of an outcome of interest, with a 2-sided alpha of 0.05). Instead of sample size calculations, sometimes an article gives estimates of variance and/or estimates of effect size. In all these cases, the answer to Question 5 would be “yes.”

However, observational cohort studies often do not report anything about power or sample sizes because the analyses are exploratory in nature. In this case, the answer to Question 5 would be “no.” A lack of a report on power or sample size is not a “fatal flaw.” Instead it may indicate the researcher did not focus on whether the study was sufficiently sized to answer a prespecified question; it may have been an exploratory, hypothesis-generating study.

This question does not refer to a description of the manner in which different groups were included or excluded per the inclusion/exclusion criteria (e.g., “Final study size was 2,978 participants after exclusion of 756 patients with a history of MI” is not considered a sample size justification for the purposes of this question).
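The power statement used as an example under this question (85 percent power to detect a 20 percent increase in an outcome rate, with a 2-sided alpha of 0.05) can be turned into an approximate sample size. The sketch below assumes a comparison of two independent proportions with equal group sizes and the standard normal-approximation formula. The baseline rate of 10 percent is a hypothetical value chosen only for illustration, and the function name is likewise an assumption.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(p_unexposed, relative_increase, power=0.85, alpha=0.05):
    """Approximate participants needed per group to compare two proportions.

    p_unexposed       -- expected outcome rate in the unexposed group
    relative_increase -- hypothesized relative increase (0.20 = 20 percent)
    power             -- desired probability of detecting the difference
    alpha             -- 2-sided significance level
    """
    p1 = p_unexposed
    p2 = p_unexposed * (1.0 + relative_increase)
    if not (0.0 < p1 < 1.0 and 0.0 < p2 < 1.0):
        raise ValueError("both outcome rates must lie strictly between 0 and 1")

    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # 2-sided critical value
    z_power = NormalDist().inv_cdf(power)

    variance_sum = p1 * (1.0 - p1) + p2 * (1.0 - p2)
    return ceil((z_alpha + z_power) ** 2 * variance_sum / (p2 - p1) ** 2)


# Hypothetical baseline outcome rate of 10 percent in the unexposed group.
print(n_per_group(0.10, 0.20))
```

A paper that reports a calculation of this kind, or the inputs needed to reproduce one, would count as providing a sample size justification; a statement of the final number analyzed would not.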

Table A-3. Quality Assessment Tool for Cohort and Cross-Sectional Studies

Question 6. Exposure assessed prior to outcome measurement

This question is important because in order to determine whether an exposure causes an outcome, the exposure must precede the outcome.

In some prospective cohort studies, investigators identify the cohort, then determine the exposure status of members of the cohort (large epidemiological studies like the Framingham Study use this approach). However, for other cohort studies, investigators select the cohort based on its exposure status, as in the example above of diabetic men with depression (the exposure being depression). Other examples include a cohort identified by its exposure to fluoridated drinking water and compared to a cohort living in an area without fluoridated water, or a cohort of military personnel exposed to combat in the Gulf War compared to a cohort of military personnel not deployed in a combat zone.

With either of these types of cohort studies, the investigator follows the cohort forward in time (i.e., prospectively) to assess the outcomes that occurred in the exposed compared to nonexposed members of the cohort. In other words, the investigator begins the study in the present by examining groups that were exposed or not exposed to some biological or behavioral factor, intervention, or other factor, then follows them forward in time to examine outcomes. If a cohort study is conducted properly, the answer to Question 6 should be “yes,” since the investigators determined the exposure status of members of the cohort at the beginning of the study, before the outcomes occurred.

For retrospective cohort studies, the same principle applies. The difference is that rather than identifying a cohort in the present and following it forward in time, investigators go back in time (i.e., retrospectively) and select a cohort based on its past exposure status. Then, they follow the cohort forward to assess the outcomes that occurred in the exposed and nonexposed cohort members. In retrospective cohort studies, the exposure and outcomes may have already occurred (it depends on how long the investigators follow the cohort); consequently, investigators need to ensure that the exposure preceded the outcome.

Sometimes in cross-sectional studies (or cross-sectional analyses of cohort study data) investigators measure exposures and outcomes during the same timeframe. As a result, cross-sectional analyses provide weaker evidence than regular cohort studies regarding a potential causal relationship between exposures and outcomes. For cross-sectional analyses, the answer to Question 6 would be “no.”

Question 7. Sufficient timeframe to see an effect

Did the study allow enough time for a sufficient number of outcomes to occur or be observed, or enough time for an exposure to have a biological effect on an outcome? That is the intent of Question 7. For example, if clinical depression has a biological effect on increasing risk for cardiovascular disease, such an effect may take years. Similarly, if higher dietary sodium increases BP, a short timeframe may be sufficient to assess its association with BP; however, a longer timeframe would be needed to examine its association with heart attacks.

Investigators must consider timeframe to conduct a meaningful analysis of the relationship between exposures and outcomes. Often, they must conduct a study for at least several years, especially when examining health outcomes. However, the timeframe depends on the research question and outcomes being examined. Cross-sectional analyses allow no time to see an effect, since the exposures and outcomes are assessed at the same time. So with this type of analysis, the answer to Question 7 would be “no.”

Question 8. Different levels of the exposure of interest

If the exposure can be defined as a range (e.g., range of drug dosages, amount of physical activity, or amount of sodium consumed), did the investigators assess multiple categories of that exposure? For example, for a particular drug: was the person not on medication, on a low dose of medication, or on a high dose of medication? For physical activity: did the person not exercise, exercise less than 30 minutes per day, or exercise more than 30 minutes per day? For dietary sodium: did the person consume less than 1,500 mg per day, between 1,500 mg and 3,000 mg per day, or greater than 3,000 mg per day? Sometimes exposures are measured as continuous variables (e.g., actual mg per day of dietary sodium consumed or actual minutes of exercise per day) rather than discrete categories (e.g., low sodium versus high sodium diet; normal blood pressure versus high blood pressure).

In any case, studying different levels of exposure, when possible, enables investigators to assess trends or dose-response relationships between exposures and outcomes (e.g., the higher the exposure, the greater the rate of the health outcome). Trends or dose-response relationships lend credibility to the hypothesis of causality between exposure and outcome.

However, for some exposures, Question 8 may not be applicable (e.g., when the exposure is a dichotomous variable like living in a rural setting versus an urban setting, or being vaccinated or not being vaccinated with a one-time vaccine). If there are only two possible exposures (yes/no), then reviewers would have answered this question “NA.” This answer should not negatively affect the quality rating.
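When an exposure is recorded in ordered categories, the trend or dose-response relationship described above can be examined with a test for trend in proportions. The sketch below uses the Cochran-Armitage trend test with equally spaced scores, which is one common choice and not a method prescribed by this guidance. The counts are hypothetical, and the three categories are assumed to stand for the low, middle, and high sodium intake groups used as an example under Question 8.

```python
from math import sqrt
from statistics import NormalDist


def cochran_armitage_trend(events, totals, scores=None):
    """Return (z, two-sided p-value) for a linear trend in outcome rates.

    events -- number of participants with the outcome in each exposure category
    totals -- number of participants in each exposure category
    scores -- numeric score per category (defaults to 0, 1, 2, ...)
    """
    k = len(events)
    if k < 2 or len(totals) != k:
        raise ValueError("need at least two categories with matching lengths")
    if scores is None:
        scores = list(range(k))

    n_all = sum(totals)
    p = sum(events) / n_all  # overall outcome rate under the null hypothesis

    # Score-weighted difference between observed and expected events.
    t = sum(s * (r - n * p) for s, r, n in zip(scores, events, totals))
    spread = (sum(n * s * s for s, n in zip(scores, totals))
              - sum(n * s for s, n in zip(scores, totals)) ** 2 / n_all)
    variance = p * (1.0 - p) * spread
    if variance <= 0:
        raise ValueError("trend is undefined when all rates are 0 or 1")

    z = t / sqrt(variance)
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: outcome events out of 100 participants per category.
z, p_value = cochran_armitage_trend([10, 20, 30], [100, 100, 100])
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A positive z indicates that the outcome rate rises with the exposure level. A study that reports only two exposure categories cannot support this kind of analysis, which is why Question 8 is answered “NA” for dichotomous exposures.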

Question 9. Exposure measures and assessment

Were the exposure measures defined in detail? Were the tools or methods used to measure exposure accurate and reliable (for example, have they been validated or are they objective)? How Question 9 is answered can influence confidence in reported exposures. When investigators measure exposures with less accuracy or validity, it is difficult to observe an association between exposure and outcome, even if one exists. Equally important is whether they assessed exposures in the same manner within and between groups; if not, bias may result.

The following two examples illustrate how differing exposure measures can affect confidence in associations between exposure and outcome. The first addresses measurement of dietary salt intake. A study that prospectively uses a standardized dietary log and tests participants' urine for sodium content is more valid and reliable than one that retrospectively reviews self-reports of dietary salt intake. In this example, the reviewer would answer “yes” to Question 9 with the first method and “no” for the second one. The second example addresses BP measurement. A study that uses BP measurements from a practice that follows certain standards is more reliable and valid than a study that uses measurements from a practice that does not have such standards in place. Such standards include, for example, trained BP assessors, standardized equipment (e.g., the same BP device which has been tested and calibrated), and a standardized protocol (e.g., patient is seated for 5 minutes with feet flat on the floor, BP is taken twice in each arm, and all four measurements are averaged). Again, the reviewer would answer “yes” to Question 9 with the first method and “no” for the second one.

This final example illustrates the importance of assessing exposures consistently across all groups. In a study comparing individuals with high BP (exposed cohort) with those with normal BP (nonexposed group), an investigator may note a higher incidence of CVD in those with high BP, concluding that high BP leads to more CVD events. Although this increase may be true, it also may be due to these individuals seeing their health care practitioners more frequently. With more frequent visits, there are increased opportunities for detecting and documenting changes in health outcomes, including CVD-related events. Thus, the increased number of visits can bias study results and lead to inaccurate conclusions.

Question 10. Repeated exposure assessment

Was the exposure for each person measured more than once during the course of the study period? Multiple measurements with the same result increase confidence that investigators correctly classified the exposure status. In addition, multiple measurements enable them to observe changes in exposure over time. The example of individuals who had a high dietary sodium intake illustrates changes that can occur over time. Some may have had a high dietary sodium intake throughout the followup period. Others may have had a high intake initially and then reduced their intake, while still others may have had a low intake throughout the study. Once again, this example may not be applicable in all cases. In many older studies, exposure was measured only at baseline. However, multiple exposure measurements do result in a stronger study design.

Cross-sectional study design does not allow for repeated exposure assessment because there is no followup period, so the answer to question 10 should be “no” for cross-sectional analyses.

Question 11. Outcome measures

Were the outcomes defined in detail? Were the tools or methods for measuring outcomes accurate and reliable (for example, have they been validated or are they objective)? Answers to this question can influence confidence in reported outcomes. These answers also can help determine whether the outcomes were assessed in the same manner within and between groups.

An example of an outcome measure that is objective, accurate, and reliable is death. But even with a measure as objective as death, differences can exist in the accuracy and reliability of how investigators assess death. For example, did they base outcomes on an autopsy report, death certificate, death registry, or report from a family member? A study on the relationship between dietary fat intake and blood cholesterol level in which fasting blood samples used to measure cholesterol were all sent to the same laboratory illustrates outcomes that would be considered objective, accurate, and reliable. This example would get a “yes.” However, outcomes in studies in which research participants self-reported they had a heart attack or self-reported how much they weighed would be considered questionable and would get a “no.”

Similar to the final example in Question 9, results may be biased if one group (e.g., individuals with high BP) is seen more frequently than another group (individuals with normal BP); more frequent encounters with the health care system increase the chances of outcomes being detected and documented.

Question 12. Blinding of outcome assessors

Blinding or masking means that outcome assessors did not know whether participants were exposed or unexposed. To answer this question, the reviewer examined the article for evidence that the person(s) assessing the study outcome(s) (outcome assessor) was masked to the exposure status of the research participants. An outcome assessor, for example, may examine medical records to determine outcomes that occurred in the exposed and comparison groups. Sometimes, the person measuring the exposure is the same person conducting the outcome assessment. In this case, the assessor would most likely not be blinded to exposure status. A reviewer would note such a finding in the comments section.

In assessing this criterion, the reviewers determined whether it was likely that the outcome assessors knew the exposure status of the study participants. If not, then blinding was adequate. The following example depicts how adequate blinding of the outcome assessors can be done. Investigators created a separate committee whose members were not involved in the care of the patient and had no information about the study participants' exposure status. Following a study protocol, committee members reviewed copies of participants' medical records, which had been stripped of any potential exposure information or personally identifiable information, for prespecified outcomes. If blinding was not possible, which is sometimes the case, the reviewers marked Question 12 “NA” and explained the potential for bias.

Question 13. Followup rate

Higher overall followup rates are always preferable to lower followup rates. Although higher rates are expected in studies of short duration, lower rates are often seen in studies of longer duration. Usually an acceptable overall followup rate is considered 80 percent or more of participants whose exposures were measured at baseline. However, this rate is just considered a general guideline. For example, a 6-month cohort study examining the relationship between dietary sodium intake and BP level may have over 90 percent followup, whereas a 20-year cohort study examining the effects of sodium intake on stroke may have only a 65 percent followup rate.

Cross-sectional study design does not incorporate a followup period, so the answer to question 13 should be “no” for cross-sectional analyses.

Question 14. Statistical analyses

Were key potential confounding variables measured and adjusted for, such as by statistical adjustment for baseline differences? Investigators often use logistic regression or other regression methods to account for the influence of variables not of interest.

This is a key issue in cohort studies: statistical analyses need to control for potential confounders, in contrast to RCTs in which the randomization process controls for potential confounders. In their analysis, investigators need to control for all key factors that may be associated with both the exposure of interest and the outcome and are not of interest to the research question. For example, a study of the relationship between cardiorespiratory fitness and CVD events (heart attacks and strokes) should control for age, BP, blood cholesterol, and body weight. All these factors are associated with both low fitness and CVD events. Well-done cohort studies control for multiple potential confounders.
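A small numerical sketch can show why adjustment for confounders matters. It uses stratification with a Mantel-Haenszel pooled odds ratio, a simpler technique than the regression methods mentioned above, swapped in here only because it fits in a few lines. The counts are hypothetical and were constructed so that age group is related to both the exposure (low fitness) and the outcome (a CVD event).

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for one 2x2 table.

    a -- exposed with outcome      b -- exposed without outcome
    c -- unexposed with outcome    d -- unexposed without outcome
    """
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Pooled odds ratio across strata of a potential confounder."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Hypothetical counts (a, b, c, d) by age group. Within each age group the
# outcome rate is the same for exposed and unexposed participants, but
# older participants are both more often exposed and more often affected.
strata = [
    (40, 60, 10, 15),   # older participants
    (5, 45, 20, 180),   # younger participants
]

# Collapsing the strata ignores age and yields the crude odds ratio.
crude = odds_ratio(*[sum(column) for column in zip(*strata)])
adjusted = mantel_haenszel_or(strata)
print(f"crude OR = {crude:.2f}, age-adjusted OR = {adjusted:.2f}")
```

In this constructed example the crude odds ratio suggests an association that disappears once age is taken into account. A study that measured but did not adjust for such a variable would carry a higher risk of bias under Question 14.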

General Guidance for Determining the Overall Quality Rating of Cohort and Cross-Sectional Studies

The questions in the assessment tool were designed to help reviewers focus on key concepts for evaluating a study's internal validity, instead of being used as a list from which to add up items to judge a study's quality. Internal validity for cohort studies is the extent to which the results reported in a study can truly be attributed to the exposure being evaluated, rather than to flaws in the design or conduct of a study—in other words, the ability of the study to draw associative conclusions about the effects of the exposures being studied on outcomes. Such flaws can increase the risk of bias.

Critical appraisal involves considering the potential for selection bias, information bias, measurement bias, or confounding (the mixture of exposures that one cannot tease out from each other). Examples of confounding include co-interventions, differences at baseline in patient characteristics, and other issues addressed in the questions above. High risk of bias translates to a poor quality rating, while low risk of bias translates to a good quality rating. Again, the greater the risk of bias, the lower the quality rating of the study.

The more a study design addresses issues affecting a causal relationship between the exposure and outcome, the higher the quality of the study. Issues include exposures occurring prior to outcomes, evaluation of a dose-response gradient, accuracy of measurement of exposure and outcome, sufficient timeframe to see an effect, and appropriate control for confounding.

Generally, in evaluating a study, one will not see a “fatal flaw,” but will find some risk of bias. To assess potential for bias, reviewers focused on concepts underlying the questions in the quality assessment tool. For any box checked “no,” reviewers asked: “What is the potential risk of bias that may be introduced by this flaw in study design or execution?” That is, did this factor cause them to doubt the study results or doubt the ability of the study to accurately assess an association between exposure and outcome?

In summary, NHLBI staff stressed that the best approach was to examine the questions in the tool and assess the

xii Quality Assessment Tool for Case-Control Studies

Table A-4 shows the quality assessment tool for case-control studies along with the guidance document for that tool. The methodology team and NHLBI developed this tool based in part on criteria from AHRQ's EPCs, consultation with epidemiologists, and other factors. This tool includes 12 items for assessment of study quality. They include: clarity of the research objective or research question; definition, selection, composition, and participation of the study population; definition and assessment of case or control status, exposure, and outcome variables; use of concurrent controls; confirmation that the exposure occurred prior to the outcome; statistical power; and other factors.

xiii Guidance for Assessing the Quality of Case-Control Studies

The guidance document below is organized by question number from the tool for quality assessment of case-control studies.

Question 1. Research question

Did the authors describe their goal in conducting this research? Is it easy to understand what they were looking to find? This issue is important for scientific papers of any type. High quality scientific research explicitly defines a research question.

Question 2. Study population

Did the authors describe the group of individuals from which the cases and controls were selected or recruited, using demographics, location, and time period? If the investigators conducted this study again, would they know exactly whom to recruit, from where, and from what time period?

Investigators identify case-control study populations by location, time period, and inclusion criteria for cases (individuals with the disease, condition, or problem) and controls (individuals without the disease, condition, or problem). For example, the population for a study of lung cancer and chemical exposure would be all incident cases of lung cancer diagnosed in patients ages 35 to 79, from January 1, 2003 to December 31, 2008, living in Texas during that entire time period, as well as controls without lung cancer recruited from the same population during the same time period. The population is clearly described as: [1] who (men and women ages 35 to 79 with (cases) and without (controls) incident lung cancer); [2] where (living in Texas); and [3] when (between January 1, 2003 and December 31, 2008).

Other studies may use disease registries or data from cohort studies to identify cases. In these cases, the populations are individuals who live in the area covered by the disease registry or included in a cohort study (i.e., nested case-control or case-cohort). For example, a study of the relationship between vitamin D intake and myocardial infarction might use patients identified via the GRACE registry, a database of heart attack patients.

NHLBI staff encouraged reviewers to examine prior papers on methods (listed in the reference list) to make this assessment, if necessary.

Question 3. Target population and case representation

In order for a study to truly address the research question, the target population—the population from which the study population is drawn and to which study results are believed to apply—should be carefully defined. Some authors may compare characteristics of the study cases to characteristics of cases in the target population, either in text or in a table. When study cases are shown to be representative of cases in the appropriate target population, it increases the likelihood that the study was well-designed per the research question.

Table A-4. Quality Assessment Tool for Case-Control Studies

However, because these statistics are frequently difficult or impossible to measure, publications should not be penalized if case representation is not shown. For most papers, the response to question 3 will be “NR.” Those subquestions are combined because the answer to the second subquestion—case representation—determines the response to this item. However, it cannot be determined without considering the response to the first subquestion. For example, if the answer to the first subquestion is “yes,” and the second, “CD,” then the response for item 3 is “CD.”

Question 4. Sample size justification

Did the authors discuss their reasons for selecting or recruiting the number of individuals included? Did they discuss the statistical power of the study and provide a sample size calculation to ensure that the study is adequately powered to detect an association (if one exists)? This question does not refer to a description of the manner in which different groups were included or excluded using the inclusion/exclusion criteria (e.g., “Final study size was 1,378 participants after exclusion of 461 patients with missing data” is not considered a sample size justification for the purposes of this question.)

An article's methods section usually contains information on sample size and the size needed to detect differences in exposures and on statistical power.

Question 5. Groups recruited from the same population

To determine whether cases and controls were recruited from the same population, one can ask hypothetically, “If a control were to develop the outcome of interest (the condition that was used to select cases), would that person have been eligible to become a case?” Case-control studies begin with the selection of the cases (those with the outcome of interest, e.g., lung cancer) and controls (those in whom the outcome is absent). Cases and controls are then evaluated and categorized by their exposure status. For the lung cancer example, cases and controls were recruited from hospitals in a given region. One may reasonably assume that controls in the catchment area for the hospitals, or those already in the hospitals for a different reason, would attend those hospitals if they became a case; therefore, the controls are drawn from the same population as the cases. If the controls were recruited or selected from a different region (e.g., a State other than Texas) or time period (e.g., 1991-2000), then the cases and controls were recruited from different populations, and the answer to this question would be “no.”

The following example further explores selection of controls. In a study, eligible cases were men and women, ages 18 to 39, who were diagnosed with atherosclerosis at hospitals in Perth, Australia, between July 1, 2000 and December 31, 2007. Appropriate controls for these cases might be sampled using voter registration information for men and women ages 18 to 39, living in Perth (population-based controls); they also could be sampled from patients without atherosclerosis at the same hospitals (hospital-based controls). As long as the controls are individuals who would have been eligible to be included in the study as cases (if they had been diagnosed with atherosclerosis), then the controls were selected appropriately from the same source population as cases.

In a prospective case-control study, investigators may enroll individuals as cases at the time they are found to have the outcome of interest; the number of cases usually increases as time progresses. At this same time, they may recruit or select controls from the population without the outcome of interest. One way to identify or recruit cases is through a surveillance system. In turn, investigators can select controls from the population covered by that system. This is an example of population-based controls. Investigators also may identify and select cases from a cohort study population and identify controls from outcome-free individuals in the same cohort study. This is known as a nested case-control study.

Question 6. Inclusion and exclusion criteria prespecified and applied uniformly

Were the inclusion and exclusion criteria developed prior to recruitment or selection of the study population? Were the same underlying criteria used for all of the groups involved? To answer this question, reviewers determined if the investigators developed I/E criteria prior to recruitment or selection of the study population and if they used the same underlying criteria for all groups. The investigators should have used the same selection criteria for all groups, except for the presence of the disease or condition, which by definition differs between cases and controls. Therefore, investigators should use the same age (or age range), gender, race, and other characteristics to select cases and controls. Information on this topic is usually found in a paper's section on the description of the study population.

Question 7. Case and control definitions

For this question, reviewers looked for descriptions of the validity of case and control definitions and processes or tools used to identify study participants as such. Was a specific description of “case” and “control” provided? Is there a discussion of the validity of the case and control definitions and the processes or tools used to identify study participants as such? They determined if the tools or methods were accurate, reliable, and objective. For example, cases might be identified as “adult patients admitted to a VA hospital from January 1, 2000 to December 31, 2009, with an ICD-9 discharge diagnosis code of acute myocardial infarction and at least one of the two confirmatory findings in their medical records: at least 2 mm of ST elevation changes in two or more ECG leads and an elevated troponin level.” Investigators might also use ICD-9 or CPT codes to identify patients. All cases should be identified using the same methods. Unless the distinction between cases and controls is accurate and reliable, investigators cannot use study results to draw valid conclusions.

Question 8. Random selection of study participants

If a case-control study did not use 100 percent of eligible cases and/or controls (e.g., not all disease-free participants were included as controls), did the authors indicate that random sampling was used to select controls? When it is possible to identify the source population fairly explicitly (e.g., in a nested case-control study, or in a registry-based study), then random sampling of controls is preferred. When investigators used consecutive sampling, which is frequently done for cases in prospective studies, then study participants are not considered randomly selected. In this case, the reviewers would answer “no” to Question 8. However, this would not be considered a fatal flaw.

If investigators included all eligible cases and controls as study participants, then reviewers marked “NA” in the tool. If 100 percent of cases were included (i.e., NA for cases) but only 50 percent of eligible controls, then the response would be “yes” if the controls were randomly selected, and “no” if they were not. If this cannot be determined, the appropriate response is “CD.”
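
The response rules for Question 8 can be sketched as a small decision function. This is only an illustration of the logic described above; the function name and its parameters are hypothetical and are not part of the NHLBI assessment tool.

```python
from typing import Optional


def question8_response(all_cases_included: bool,
                       all_controls_included: bool,
                       sampled_randomly: Optional[bool]) -> str:
    """Return the Question 8 response: "NA", "yes", "no", or "CD".

    sampled_randomly describes the group that was sampled rather than fully
    included; None means the article does not allow a determination.
    """
    if all_cases_included and all_controls_included:
        return "NA"  # nothing was sampled, so the question does not apply
    if sampled_randomly is None:
        return "CD"  # cannot determine from the article
    return "yes" if sampled_randomly else "no"


# All cases included, half of the eligible controls sampled at random.
print(question8_response(True, False, True))
```

A "no" returned here corresponds to the situation described under Question 8, which the guidance does not treat as a fatal flaw.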

Question 9. Concurrent controls

A concurrent control is a control selected at the time another person became a case, usually on the same day. This means that one or more controls are recruited or selected from the population without the outcome of interest at the time a case is diagnosed. Investigators can use this method in both prospective case-control studies and retrospective case-control studies. For example, in a retrospective study of adenocarcinoma of the colon using data from hospital records, if hospital records indicate that Person A was diagnosed with adenocarcinoma of the colon on June 22, 2002, then investigators would select one or more controls from the population of patients without adenocarcinoma of the colon on that same day. The investigators could have also conducted this study using patient records from a cohort study, in which case it would be a nested case-control study.

Investigators can use concurrent controls in the presence or absence of matching and vice versa. A study that uses matching does not necessarily mean that concurrent controls were used.
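
As a rough illustration of concurrent control selection, the sketch below lists the individuals who were still free of the outcome on the date a case was diagnosed. The record layout, field names, and dates other than June 22, 2002 are hypothetical.

```python
from datetime import date


def concurrent_controls(case_diagnosis_date, population):
    """Return ids of people who were outcome-free on the case's diagnosis date."""
    eligible = []
    for person in population:
        diagnosed = person["diagnosed_on"]  # None means never diagnosed
        if diagnosed is None or diagnosed > case_diagnosis_date:
            eligible.append(person["id"])
    return eligible


# Hypothetical hospital records; Person A is the case from the example above.
population = [
    {"id": "A", "diagnosed_on": date(2002, 6, 22)},
    {"id": "B", "diagnosed_on": None},
    {"id": "C", "diagnosed_on": date(2001, 3, 1)},
    {"id": "D", "diagnosed_on": date(2004, 1, 10)},
]
candidates = concurrent_controls(date(2002, 6, 22), population)
print(candidates)
```

Investigators would then sample one or more controls from the returned candidates, with or without matching.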

Question 10. Exposure assessed prior to outcome measurement

Investigators first determine case or control status (based on presence or absence of outcome of interest), and then assess exposure history of the case or control; therefore, reviewers ascertained that the exposure preceded the outcome. For example, if the investigators used tissue samples to determine exposure, did they collect them from patients prior to their diagnosis? If hospital records were used, did investigators verify that the date a patient was exposed (e.g., received medication for atherosclerosis) occurred prior to the date they became a case (e.g., was diagnosed with type 2 diabetes)? For an association between an exposure and an outcome to be considered causal, the exposure must have occurred prior to the outcome.
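
A reviewer or investigator working from dated records could verify temporality with a check such as the following. This is a minimal sketch; the record structure and the dates are hypothetical.

```python
from datetime import date


def exposure_precedes_outcome(exposure_date: date, outcome_date: date) -> bool:
    """True only if the exposure occurred strictly before the outcome."""
    return exposure_date < outcome_date


# Hypothetical records: date first exposed and date diagnosed as a case.
records = [
    {"id": "P1", "exposed_on": date(2003, 5, 1), "diagnosed_on": date(2006, 2, 14)},
    {"id": "P2", "exposed_on": date(2007, 9, 30), "diagnosed_on": date(2006, 11, 3)},
]

# Participants whose recorded exposure does not precede the outcome.
violations = [r["id"] for r in records
              if not exposure_precedes_outcome(r["exposed_on"], r["diagnosed_on"])]
print(violations)
```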

Question 11. Exposure measures and assessment

Were the exposure measures defined in detail? Were the tools or methods used to measure exposure accurate and reliable—for example, have they been validated or are they objective? This is important, as it influences confidence in the reported exposures. Equally important is whether the exposures were assessed in the same manner within groups and between groups. This question pertains to bias resulting from exposure misclassification (i.e., exposure ascertainment).

For example, a retrospective self-report of dietary salt intake is not as valid and reliable as prospectively using a standardized dietary log plus testing participants' urine for sodium content because participants' retrospective recall of dietary salt intake may be inaccurate and result in misclassification of exposure status. Similarly, BP results from practices that use an established protocol for measuring BP would be considered more valid and reliable than results from practices that did not use standard protocols. A protocol may include using trained BP assessors, standardized equipment (e.g., the same BP device which has been tested and calibrated), and a standardized procedure (e.g., patient is seated for 5 minutes with feet flat on the floor, BP is taken twice in each arm, and all four measurements are averaged).
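
The averaging step in the example protocol (two readings in each arm, all four averaged) could be implemented as shown below. The function name and the readings are illustrative assumptions.

```python
from statistics import mean


def average_bp(readings):
    """Average four (systolic, diastolic) readings, in mm Hg."""
    if len(readings) != 4:
        raise ValueError("this sketch expects two readings per arm (four total)")
    systolic = mean(s for s, _ in readings)
    diastolic = mean(d for _, d in readings)
    return systolic, diastolic


# Hypothetical readings: left arm twice, then right arm twice.
readings = [(128, 82), (132, 84), (130, 80), (126, 82)]
print(average_bp(readings))
```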

Question 12. Blinding of exposure assessors

Blinding or masking means that outcome assessors did not know whether participants were exposed or unexposed. To answer this question, reviewers examined articles for evidence that the outcome assessor(s) was masked to the exposure status of the research participants. An outcome assessor, for example, may examine medical records to determine the outcomes that occurred in the exposed and comparison groups. Sometimes the person measuring the exposure is the same person conducting the outcome assessment. In this case, the outcome assessor would most likely not be blinded to exposure status. A reviewer would note such a finding in the comments section of the assessment tool.

One way to ensure good blinding of exposure assessment is to have a separate committee, whose members have no information about the study participants' status as cases or controls, review research participants' records. To help answer the question above, reviewers determined if it was likely that the outcome assessor knew whether the study participant was a case or control. If it was unlikely, then the reviewers marked “no” to Question 12. Outcome assessors who used medical records to assess exposure should not have been directly involved in the study participants' care, since they probably would have known about their patients' conditions. If the medical records contained information on the patient's condition that identified him/her as a case (which is likely), that information would have had to be removed before the exposure assessors reviewed the records.

If blinding was not possible, which sometimes happens, the reviewers marked “NA” in the assessment tool and explained the potential for bias.

Question 13. Statistical analysis

Were key potential confounding variables measured and adjusted for, such as by statistical adjustment for baseline differences? Investigators often use logistic regression or other regression methods to account for the influence of variables not of interest.

This is a key issue in case-control studies; statistical analyses need to control for potential confounders, in contrast to RCTs in which the randomization process controls for potential confounders. In the analysis, investigators need to control for all key factors that may be associated with both the exposure of interest and the outcome and are not of interest to the research question.

A study of the relationship between smoking and CVD events illustrates this point. Such a study needs to control for age, gender, and body weight; all are associated with smoking and CVD events. Well-done case-control studies control for multiple potential confounders.

Matching is a technique used to improve study efficiency and control for known confounders. For example, in the study of smoking and CVD events, an investigator might identify cases that have had a heart attack or stroke and then select controls of similar age, gender, and body weight to the cases. For case-control studies, it is important that if matching was performed during the selection or recruitment process, the variables used as matching criteria (e.g., age, gender, race) should be controlled for in the analysis.
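
To illustrate why adjustment matters, the sketch below compares a crude odds ratio with a Mantel-Haenszel odds ratio stratified by a single confounder. Stratification is used here in place of the regression methods mentioned above only because it needs no external libraries. The counts are invented for illustration and do not come from any study.

```python
def crude_odds_ratio(strata):
    """Odds ratio after pooling all strata into one 2x2 table."""
    a = sum(s[0] for s in strata)  # exposed cases
    b = sum(s[1] for s in strata)  # exposed controls
    c = sum(s[2] for s in strata)  # unexposed cases
    d = sum(s[3] for s in strata)  # unexposed controls
    return (a * d) / (b * c)


def mantel_haenszel_odds_ratio(strata):
    """Mantel-Haenszel summary odds ratio across strata of a confounder."""
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator


# Hypothetical counts per age stratum:
# (exposed cases, exposed controls, unexposed cases, unexposed controls)
strata = [
    (10, 40, 20, 80),  # younger participants
    (60, 30, 20, 10),  # older participants
]
print(crude_odds_ratio(strata))
print(mantel_haenszel_odds_ratio(strata))
```

In these invented data the exposure shows no association within either age stratum, yet the pooled table suggests one. That pattern is what confounding by age looks like in an unadjusted analysis.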

General Guidance for Determining the Overall Quality Rating of Case-Controlled Studies

NHLBI designed the questions in the assessment tool to help reviewers focus on the key concepts for evaluating a study's internal validity, not to use as a list from which to add up items to judge a study's quality.

Internal validity for case-control studies is the extent to which the associations between disease and exposure reported in the study can truly be attributed to the exposure being evaluated rather than to flaws in the design or conduct of the study. In other words, what is the ability of the study to draw associative conclusions about the effects of the exposures on outcomes? Any such flaws can increase the risk of bias.

In critically appraising a study, the following factors need to be considered: risk of potential for selection bias, information bias, measurement bias, or confounding (the mixture of exposures that one cannot tease out from each other). Examples of confounding include co-interventions, differences at baseline in patient characteristics, and other issues addressed in the questions above. High risk of bias translates to a poor quality rating; low risk of bias translates to a good quality rating. Again, the greater the risk of bias, the lower the quality rating of the study.

In addition, the more attention in the study design to issues that can help determine whether there is a causal relationship between the outcome and the exposure, the higher the quality of the study. These include exposures occurring prior to outcomes, evaluation of a dose-response gradient, accuracy of measurement of both exposure and outcome, sufficient timeframe to see an effect, and appropriate control for confounding—all concepts reflected in the tool.

If a study has a “fatal flaw,” then risk of bias is significant; therefore, the study is deemed to be of poor quality. An example of a fatal flaw in case-control studies is a lack of a consistent standard process used to identify cases and controls.

Generally, when reviewers evaluated a study, they did not see a “fatal flaw,” but instead found some risk of bias. By focusing on the concepts underlying the questions in the quality assessment tool, reviewers examined the potential for bias in the study. For any box checked “no,” reviewers asked, “What is the potential risk of bias resulting from this flaw in study design or execution?” That is, did this factor lead to doubt about the results reported in the study or the ability of the study to accurately assess an association between exposure and outcome?

By examining questions in the assessment tool, reviewers were best able to assess the potential for bias in a study. Specific rules were not useful, as each study had specific nuances. In addition, being familiar with the key concepts helped reviewers assess the studies. Examples of studies rated good, fair, and poor were useful, yet each study had to be assessed on its own.

xiv Quality Assessment Tool for Before-After Studies

Table A-5 shows the quality assessment tool for before-after (pre-post) studies along with the guidance document for that tool. The methodology team and NHLBI developed this tool based in part on criteria from AHRQ's EPCs, other papers addressing quality assessment of similar studies, and other factors.

This tool includes 12 items for assessment of study quality. They include: clarity of the research objective or research question; definition, selection, composition, and participation of the study population; definition and assessment of intervention and outcome variables; adequacy of blinding; statistical methods; and other factors.

xv Guidance for Assessing the Quality of Before-After (Pre-Post) Studies With No Control Group

The guidance document below is organized by question number from the tool for quality assessment of before-after (pre-post) studies with no control group.

Question 1. Study question

Did the authors describe their goal in conducting this research? Is it easy to understand what they were looking to find? This issue is important for any scientific paper of any type. Higher quality scientific research explicitly defines a research question.

Question 2. Eligibility criteria and study population

Did the authors describe the eligibility criteria applied to the individuals from whom the study participants were selected or recruited? In other words, if the investigators were to conduct this study again, would they know whom to recruit, from where, and from what time period?

Here is a sample description of a study population: men over age 40 with type 2 diabetes, who began seeking medical care at Phoenix Good Samaritan Hospital, between January 1, 2005 and December 31, 2007. The population is clearly described as: [1] who (men over age 40 with type 2 diabetes); [2] where (Phoenix Good Samaritan Hospital); and [3] when (between January 1, 2005 and December 31, 2007). Another sample description is women who were in the nursing profession, who were ages 34 to 59 in 1995, had no known CHD, stroke, cancer, hypercholesterolemia, or diabetes, and were recruited from the 11 most populous States, with contact information obtained from State nursing boards.

To assess this question, reviewers examined prior papers on study methods (listed in reference list) when necessary.

Question 3. Study participants representative of clinical populations of interest

The participants in the study should be generally representative of the population in which the intervention will be broadly applied. Studies on small demographic subgroups may raise concerns about how the intervention will affect broader populations of interest. For example, interventions that focus on very young or very old individuals may affect middle-aged adults differently. Similarly, researchers may not be able to extrapolate study results from patients with severe chronic diseases to healthy populations.

Table A-5. Quality Assessment Tool for Before-After (Pre-Post) Studies With No Control Group

Question 4. All eligible participants enrolled

To further explore this question, reviewers may need to ask: Did the investigators develop the I/E criteria prior to recruiting or selecting study participants? Were the same underlying I/E criteria used for all research participants? Were all subjects who met the I/E criteria enrolled in the study?

Question 5. Sample size

Did the authors present their reasons for selecting or recruiting the number of individuals included or analyzed? Did they note or discuss the statistical power of the study? This question addresses whether there was a sufficient sample size to detect an association, if one did exist.

An article's methods section may provide information on the sample size needed to detect a hypothesized difference in outcomes and a discussion on statistical power (such as, the study had 85 percent power to detect a 20 percent increase in the rate of an outcome of interest, with a 2-sided alpha of 0.05). Sometimes estimates of variance and/or estimates of effect size are given, instead of sample size calculations. In any case, if the reviewers determined that the power was sufficient to detect the effects of interest, then they would answer “yes” to Question 5.
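
A power statement of the kind quoted above can be checked approximately with the usual normal-approximation formula for comparing two independent proportions. The sketch below assumes a hypothetical baseline rate of 10 percent and a 20 percent relative increase; the design an actual study uses (for example, paired before-after measurements) may call for a different formula.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(p1: float, p2: float, alpha: float = 0.05, power: float = 0.85) -> int:
    """Approximate sample size per group to detect p1 versus p2 (2-sided alpha)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)


# Hypothetical baseline rate of 10 percent rising to 12 percent.
print(n_per_group(0.10, 0.12))
```

Small absolute differences in rates require large samples, which is one reason reviewers looked for an explicit justification of sample size.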

Question 6. Intervention clearly described

Another pertinent question regarding interventions is: Was the intervention clearly defined in detail in the study? Did the authors indicate that the intervention was consistently applied to the subjects? Did the research participants have a high level of adherence to the requirements of the intervention? For example, if the investigators assigned a group to 10 mg/day of Drug A, did most participants in this group take the specific dosage of Drug A? Or did a large percentage of participants end up not taking the specific dose of Drug A indicated in the study protocol?

Reviewers ascertained that changes in study outcomes could be attributed to study interventions. If participants received interventions that were not part of the study protocol and could affect the outcomes being assessed, the results could be biased.

Question 7. Outcome measures clearly described, valid, and reliable

Were the outcomes defined in detail? Were the tools or methods for measuring outcomes accurate and reliable—for example, have they been validated or are they objective? This question is important because the answer influences confidence in the validity of study results.

An example of an outcome measure that is objective, accurate, and reliable is death—the outcome measured with more accuracy than any other. But even with a measure as objective as death, differences can exist in the accuracy and reliability of how investigators assessed death. For example, did they base it on an autopsy report, death certificate, death registry, or report from a family member? Another example of a valid study is one whose objective is to determine if dietary fat intake affects blood cholesterol level (cholesterol level being the outcome) and in which the cholesterol level is measured from fasting blood samples that are all sent to the same laboratory. These examples would get a “yes.”

An example of a “no” would be self-report by subjects that they had a heart attack, or self-report of how much they weigh (if body weight is the outcome of interest).

Question 8. Blinding of outcome assessors

Blinding or masking means that the outcome assessors did not know whether the participants received the intervention or were exposed to the factor under study. To answer the question above, the reviewers examined articles for evidence that the person(s) assessing the outcome(s) was masked to the participants' intervention or exposure status. An outcome assessor, for example, may examine medical records to determine the outcomes that occurred in the exposed and comparison groups. Sometimes the person applying the intervention or measuring the exposure is the same person conducting the outcome assessment. In this case, the outcome assessor would not likely be blinded to the intervention or exposure status. A reviewer would note such a finding in the comments section of the assessment tool.

In assessing this criterion, the reviewers determined whether it was likely that the person(s) conducting the outcome assessment knew the exposure status of the study participants. If not, then blinding was adequate. An example of adequate blinding of the outcome assessors is to create a separate committee whose members were not involved in the care of the patient and had no information about the study participants' exposure status. Using a study protocol, committee members would review copies of participants' medical records, which would be stripped of any potential exposure information or personally identifiable information, for prespecified outcomes.

Question 9. Followup rate

Higher overall followup rates are always preferable to lower followup rates, although higher rates are expected in shorter studies, and lower overall followup rates are often seen in longer studies. Usually an acceptable overall followup rate is considered 80 percent or more of participants whose interventions or exposures were measured at baseline. However, this is a general guideline.

In accounting for those lost to followup, in the analysis, investigators may have imputed values of the outcome for those lost to followup or used other methods. For example, they may carry forward the baseline value or the last observed value of the outcome measure and use these as imputed values for the final outcome measure for research participants lost to followup.
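
The followup rate and the carry-forward imputation described above can be sketched as follows. The participant counts and weights are hypothetical, and the 80 percent threshold is the general guideline stated above rather than a fixed rule.

```python
def followup_rate(n_baseline: int, n_completed: int) -> float:
    """Proportion of participants measured at baseline who completed followup."""
    return n_completed / n_baseline


def locf(series):
    """Carry the last observed (non-None) value forward over missing visits.

    If only the baseline value was observed, the baseline value is carried forward.
    """
    last = None
    filled = []
    for value in series:
        if value is not None:
            last = value
        filled.append(last)
    return filled


# Hypothetical participant: weight (kg) at baseline and three later visits,
# lost to followup after the second measurement.
weights = [102.0, 98.5, None, None]
print(locf(weights))
print(followup_rate(200, 168) >= 0.80)
```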

Question 10. Statistical analysis

Were formal statistical tests used to assess the significance of the changes in the outcome measures between the before and after time periods? The reported study results should present values for statistical tests, such as p values, to document the statistical significance (or lack thereof) for the changes in the outcome measures found in the study.
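
One formal test for a before-after change is an exact paired sign-flip permutation test. It is shown here as an illustration because it needs only the standard library; it is not the test any particular study used. The measurements are hypothetical.

```python
from itertools import product


def paired_permutation_p(before, after):
    """Two-sided exact p value for the mean paired change (sign-flip test).

    Enumerates every sign assignment, so it is practical only for small samples.
    """
    diffs = [a - b for b, a in zip(before, after)]
    observed = abs(sum(diffs))
    count = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        if abs(sum(s * d for s, d in zip(signs, diffs))) >= observed:
            count += 1
    return count / total


# Hypothetical body weights (kg) for eight participants.
before = [100, 96, 110, 105, 98, 102, 99, 107]
after = [96, 93, 104, 101, 97, 97, 95, 101]
p_value = paired_permutation_p(before, after)
print(p_value)
```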

Question 11. Multiple outcome measures

Were the outcome measures for each person measured more than once during the course of the before and after study periods? Multiple measurements with the same result increase confidence that the outcomes were accurately measured.

Question 12. Group-level interventions and individual-level outcome efforts

Group-level interventions are usually not relevant for clinical interventions such as bariatric surgery, in which the interventions are applied at the individual patient level. In those cases, the questions were coded as “NA” in the assessment tool.

General Guidance for Determining the Overall Quality Rating of Before-After Studies

The questions in the quality assessment tool were designed to help reviewers focus on the key concepts for evaluating the internal validity of a study. They are not intended to create a list from which to add up items to judge a study's quality.

Internal validity is the extent to which the outcome results reported in the study can truly be attributed to the intervention or exposure being evaluated, and not to biases, measurement errors, or other confounding factors that may result from flaws in the design or conduct of the study. In other words, what is the ability of the study to draw associative conclusions about the effects of the interventions or exposures on outcomes?

Critical appraisal of a study involves considering the risk of potential for selection bias, information bias, measurement bias, or confounding (the mixture of exposures that one cannot tease out from each other). Examples of confounding include co-interventions, differences at baseline in patient characteristics, and other issues throughout the questions above. High risk of bias translates to a rating of poor quality; low risk of bias translates to a rating of good quality. Again, the greater the risk of bias, the lower the quality rating of the study.

In addition, the more attention in the study design to issues that can help determine if there is a causal relationship between the exposure and outcome, the higher the quality of the study. These issues include exposures occurring prior to outcomes, evaluation of a dose-response gradient, accuracy of measurement of both exposure and outcome, and sufficient timeframe to see an effect.

Generally, when reviewers evaluate a study, they will not see a “fatal flaw,” but instead will find some risk of bias. By focusing on the concepts underlying the questions in the quality assessment tool, reviewers should ask themselves about the potential for bias in the study they are critically appraising. For any box checked “no,” reviewers should ask, “What is the potential risk of bias resulting from this flaw in study design or execution?” That is, does this factor lead to doubt about the results reported in the study or doubt about the ability of the study to accurately assess an association between the intervention or exposure and the outcome?

The best approach is to think about the questions in the assessment tool and how each one reveals something about the potential for bias in a study. Specific rules are not useful, as each study has specific nuances. In addition, being familiar with the key concepts will help reviewers be more comfortable with critical appraisal. Examples of studies rated good, fair, and poor are useful, but each study must be assessed on its own.

xvi Quality Assessment Tool for Case Series Studies

Table A-6 shows the quality assessment tool for case series studies. The methodology team and NHLBI developed this tool based in part on criteria from AHRQ's EPCs, other papers addressing quality assessment of similar studies, and other factors.

This tool includes nine items for assessment of study quality. They include: clarity of the research objective or research question; definition, selection, composition, and participation of the study population; definition and assessment of intervention and outcome variables; statistical methods; and other factors.

Data Abstraction and Review Process

Articles rated good or fair during the quality rating process were abstracted into the VCW using a Web-based data entry form. Requirements for abstraction were specified in an evidence table template that the methodologist developed for each CQ. The evidence table template included data elements relevant to the CQ such as study characteristics, interventions, population demographics, and outcomes.

The abstractor carefully read the article and entered the required information into the Web-based tool. Once abstraction was complete, an independent quality control review was conducted. During this review, data were checked for accuracy, completeness, and the use of standard formatting.

xvii Development of Evidence Tables and Summary Tables

a Evidence Tables

For each CQ, methodologists worked with the expert panel or work group members to identify the key data elements needed to answer the question. Using the PICOTS criteria as the foundation, expert panel or work group members determined what information was needed from each study to be able to understand the design, sample, and baseline characteristics in order to interpret the outcomes of interest. A template for a standard evidence table was created and then populated with data from several example studies for the expert panel or work group to review. This was done to ensure that all appropriate study characteristics were being considered. Once a final template was agreed upon, evidence tables were generated by pulling the appropriate data elements from the master abstraction database for those studies that met the inclusion criteria for the CQ.

Only studies rated “good” and “fair” were included in the evidence tables.

Templates varied by each individual CQ but generally provided the following information:

  • Study characteristics: Author, year, study name, country and setting, funding, study design, research objective, year study began, overall study N, quality rating
  • Criteria and end points: I/E criteria, primary outcome, secondary outcome, composite outcome definitions
  • Study design details: Treatment groups, descriptions of interventions, duration of treatment, duration of followup, run-in, wash-out, sample size
  • Baseline population characteristics: Demographics, biomarkers, other measures relevant to the outcomes
  • Results: Outcomes of interest for the CQ with between group p values or confidence intervals for risk ratios, adverse events, attrition, and adherence
Table A-6. Quality Assessment Tool for Case Series Studies

Studies are presented in alphabetical order by study name (if none, the first author's last name was used). Some expert panels combined all the articles for a study and presented it as a single entry, but for those that did not, the articles were presented in chronological order within the group for the same study.

b Summary Tables

To enable a more targeted focus on the specific aspects of a CQ, methodologists developed summary tables, or abbreviated evidence tables, in concert with the panels or work groups. A summary table might be designed to address a general population or a specific subpopulation, such as individuals with diabetes, women, or the elderly, but it only presents concise data elements. All available data in the evidence tables were reviewed for a consistent format to present the specific outcome of interest. For example, some lifestyle interventions have lengthy descriptions in the evidence tables, but only key features were concisely stated in the summary tables. Within an outcome, the time periods were clearly identified and the order of the different measures was consistently applied. For example, weight loss is always listed in order of percentage change in body weight, followed by kilogram change, and lastly by proportion of subjects losing a certain percent of their body weight. Templates varied by each aspect of the CQ being addressed but generally provide the following information:

  • Study characteristics: Study name, author/year, design, overall study N, quality rating
  • Sample characteristics: Relevant inclusion criteria
  • Study design details: Intervention doses and duration
  • Results: Change in outcomes by time periods, attrition, and adherence

Each panel or work group determined its own ordering of studies to present the evidence within each summary table. For some, trials were listed in chronological order; for others, they were ordered by the type or characteristics of the intervention.

xviii Process for the Development of Evidence Statements and Expert Panel Voting

Using the summary tables (and evidence tables as needed), panel members collaboratively wrote the evidence statements with input from methodology staff and oversight of the process by NHLBI staff. Evidence statements aimed to summarize key messages from the evidence that could be provided to primary care providers and other stakeholders. In some cases, the evidence was too limited or inconclusive, so no evidence statement was developed, or a statement of insufficient evidence was made.

Methodology staff provided the expert panels with overarching guidance on how to grade the level of evidence (high, moderate, low), and the panels used this guidance to grade each evidence statement. This guidance is documented in the following section.

Beginning in September 2011, the GEC set up its own approach to manage relationships with industry and other potential conflicts of interest (see http://www.nhlbi.nih.gov/guidelines/cvd_adult/coi-rwi_policy.htm).

Panel members having relationships with industry (RWI) or other possible conflicts of interest (COI) were allowed to participate in discussions leading up to voting as long as they declared their relationships, but they recused themselves from voting on any issue relating to their RWI or potential COI. Voting occurred by a panel chair asking each member to signify his or her vote. NHLBI project staff, methodologists, and contractors did not vote.

Voting could be open so that differing viewpoints could be identified easily, facilitating further discussion and revisions to address areas of disagreement (e.g., by crafting language or dividing an evidence statement into more than one statement). Voting also could be by confidential ballot if the group so chose.

A record of the vote count (for, against, or recusal) was made without attribution. The ideal was 100 percent consensus, but a two-thirds majority was considered acceptable. In cases where a two-thirds majority was not reached in the initial vote, further discussion and clarification were used to create a consensus majority.
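The voting rule described above can be sketched as a small check. This is a minimal Python illustration, not the project's actual tooling. The function name and the outcome labels are hypothetical, and excluding recusals from the denominator is an assumption, because the report does not say how recusals were counted.

```python
from fractions import Fraction


def vote_outcome(votes_for: int, votes_against: int, recusals: int = 0) -> str:
    """Classify a panel vote under the two-thirds rule described in the text.

    Assumption: recusals are recorded in the vote count but are excluded
    from the denominator, so `recusals` does not affect the result here.
    """
    cast = votes_for + votes_against
    if cast == 0:
        raise ValueError("no votes were cast")
    share = Fraction(votes_for, cast)  # exact arithmetic avoids float rounding at 2/3
    if share == 1:
        return "consensus"
    if share >= Fraction(2, 3):
        return "acceptable majority"
    return "further discussion"
```

For example, under these assumptions `vote_outcome(8, 4)` meets the two-thirds threshold exactly, while `vote_outcome(7, 5)` would send the statement back for further discussion.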

xix Description of Methods for Grading the Body of Evidence

NHLBI's Adult Cardiovascular Disease Systematic Evidence Review Project applied related but distinct processes for grading the bodies of evidence for CQs and for bodies of evidence for different outcomes included within CQs. Each of these processes is described in turn below.

a Grading the Body of Evidence

In developing the system for grading the body of evidence, NHLBI reviewed the following systems: Grading of Recommendations, Assessment, Development, and Evaluation (GRADE); USPSTF; American College of Cardiology/American Heart Association (ACC/AHA); American Academy of Pediatrics; Strength of Recommendation Taxonomy; Canadian Task Force on Preventive Health Care; Scottish Intercollegiate Guidelines Network; and Center for Evidence-Based Medicine in Oxford. In particular, GRADE, USPSTF, and ACC/AHA were considered at length. However, none of those systems fully met the needs of the NHLBI project. NHLBI, therefore, developed its own hybrid version that incorporated features of those systems. The expert panel and work group members strongly supported the resulting system and, with the methodology team, used it to decide about evidence ratings.

Two approaches were used for summarizing the body of evidence for each CQ. The first was to conduct a de novo literature search and literature review for all of the individual studies that met a CQ's I/E criteria. This approach was used for most of the CQs. The second, developed in response to resource limitations for the overall project, was to focus the literature search on existing systematic reviews and meta-analyses that themselves summarized a broad range of the scientific literature. This was used for several CQs across expert panels and work groups. Additional information on the use of systematic reviews and meta-analyses is provided in the following section.

Once the panel and work group members reached consensus on the wording of the evidence statement, the next step was to assign a grade to the strength of the body of evidence to provide guidance to primary care providers and other stakeholders about the degree of support the evidence provides for the evidence statement. Three options were identified for grades for the strength of evidence: high, moderate, or low.

Table A-7 describes the types of evidence that were used to grade the strength of evidence as high, moderate, or low by the expert panel and work group members, with assistance from methodologists.

The strength of the body of evidence represents the degree of certainty, based on the overall body of evidence, that an effect or association is correct. It is important to assess the strength of the evidence as objectively as possible. For rating the overall strength of evidence, the entire body of evidence for a particular summary table and its associated evidence statement was used.

Methodologists provided guidance to the panels and work group for assessing the body of evidence for each outcome or summary table of interest using four domains: [1] risk of bias; [2] consistency; [3] directness; and [4] precision. Each domain was assessed and discussed, and the aggregate assessment was used to increase or decrease the strength of the evidence, as determined by the NHLBI Evidence Quality Grading System shown above. The four domains are explained in more detail below:

b Risk of bias

Risk of bias refers to the likelihood that the body of included studies for a given question or outcome is biased due to flaws in the design or conduct of the studies. Risk of bias and internal validity are similar concepts that are inversely correlated. A study with a low risk of bias has high internal validity and is more likely to provide correct results than one with high risk of bias and low internal validity. At the individual study level, risk of bias is determined by rating the quality of each individual study using standard rating instruments, such as the NHLBI study quality rating tools presented and discussed in the previous section of this report. Overall, risk of bias for the body of evidence regarding a particular question, summary table, or outcome is then assessed by the aggregate quality of studies available for that particular question or outcome. Panel and work group members reviewed the individual study quality ratings with methodologists to determine the aggregate quality of the studies available for a particular question, summary table, or outcome. If the risk of bias was low, then it increased the strength of evidence rating for the overall body of evidence. If the risk of bias was high, then it decreased the strength of evidence rating.

Table A-7. Evidence Quality Grading System

c Consistency

Consistency is the degree to which reported effect sizes are similar across the included studies for a particular question or outcome. Consistency enhances the overall strength of evidence and is assessed through effect sizes being in the same direction (i.e., multiple studies demonstrate an improvement in a particular outcome), and the range of effect sizes across studies being narrow. Inconsistent evidence is reflected in [1] effect sizes that are in different directions, [2] a broad range of effect sizes, [3] nonoverlapping confidence intervals, or [4] unexplained clinical or statistical heterogeneity. Studies included for a particular question or outcome can have effect sizes that are consistent, inconsistent, or unknown (or not applicable). The latter occurs in situations when there is only a single study. For the NHLBI project, consistent with the approach of AHRQ's EPCs, evidence from a single study generally should be considered insufficient for a high strength of evidence rating because a single trial, no matter how large or well designed, may not provide definitive evidence of a particular effect until confirmed by another trial. However, a very large, multicentered, well-designed, well-executed RCT that performs well in the other domains could in some circumstances be considered high quality evidence after thoughtful consideration.
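Two of the inconsistency signals listed above (effect sizes in different directions and nonoverlapping confidence intervals) lend themselves to a mechanical check. The following Python sketch is illustrative only. The function name, the tuple layout, and the default null value of 0 (appropriate for difference measures; 1 would apply to ratios) are assumptions. A broad range of effect sizes and unexplained heterogeneity call for judgment and are deliberately not coded.

```python
from typing import Dict, List, Tuple

Effect = Tuple[float, float, float]  # (point estimate, CI lower bound, CI upper bound)


def consistency_flags(effects: List[Effect], null_value: float = 0.0) -> Dict[str, object]:
    """Flag opposite effect directions and nonoverlapping confidence intervals."""
    if len(effects) < 2:
        # With a single study, consistency is unknown (or not applicable).
        return {"status": "unknown"}

    # Sign of each estimate relative to the null value: -1, 0, or +1.
    directions = {(est > null_value) - (est < null_value) for est, _, _ in effects}
    directions.discard(0)  # estimates exactly at the null carry no direction
    opposite = len(directions) > 1

    # Some pair of intervals fails to overlap exactly when the largest
    # lower bound lies above the smallest upper bound.
    nonoverlap = max(lo for _, lo, _ in effects) > min(hi for _, _, hi in effects)

    status = "inconsistent" if (opposite or nonoverlap) else "no flag raised"
    return {
        "status": status,
        "opposite_directions": opposite,
        "nonoverlapping_cis": nonoverlap,
    }
```

A result of "no flag raised" means only that these two signals are absent; it does not by itself establish consistency.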

d Directness

Directness has two aspects: the direct line of causality and the degree to which findings can be extended from a specific population to a more general population. The first defines directness as whether the evidence being assessed reflects a single direct link between the intervention (or service, approach, exposure, etc.) of interest and the ultimate health outcome under consideration. Indirect evidence relies on intermediate or surrogate outcomes that serve as links along a causal pathway. Evidence that an intervention results in changes in important health outcomes (e.g., mortality, morbidity) increases the strength of the evidence. Evidence that an intervention results in changes limited to intermediate or surrogate outcomes (e.g., a blood measurement) decreases the strength of the evidence. However, the importance of each link in the chain should be considered, including existing evidence that a change in an intermediate outcome affects important health outcomes.

Another example of directness involves whether the bodies of evidence used to compare interventions are the same. For example, if Drug A is compared to placebo in one study and Drug B is compared to placebo in another study, using those two studies to compare Drug A with Drug B yields indirect evidence and provides a lower strength of the evidence than direct head-to-head studies of Drug A with Drug B.
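One way to see why the indirect comparison is weaker is the adjusted indirect comparison through a common comparator (often attributed to Bucher and colleagues). This technique is not described in this report and is shown here only as an illustration. In the sketch below, the function name is hypothetical and the inputs are assumed to be effects on a common additive scale (for example, mean differences or log odds ratios) from independent studies.

```python
import math
from typing import Tuple


def indirect_comparison(
    effect_a_vs_placebo: float,
    se_a: float,
    effect_b_vs_placebo: float,
    se_b: float,
) -> Tuple[float, float]:
    """Estimate Drug A versus Drug B through the shared placebo comparator.

    The point estimate is the difference of the two placebo-controlled
    effects. Because the two studies are assumed independent, their
    variances add, so the indirect estimate is always less precise than
    either of its inputs.
    """
    estimate = effect_a_vs_placebo - effect_b_vs_placebo
    standard_error = math.sqrt(se_a ** 2 + se_b ** 2)
    return estimate, standard_error
```

Because the standard error of the indirect estimate exceeds each input standard error, a head-to-head trial of similar size would generally yield a narrower confidence interval.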

The second aspect of directness refers to the degree to which participants or interventions in the study are different from those to whom the study results are being applied. This concept is referred to as “applicability.” If the population or interventions are similar, then the evidence is direct and strengthened. If they are different, then the evidence is indirect and weakened.

e Precision

Precision is the degree of certainty about an estimate of effect for a specific outcome of interest. Indicators of precision are statistical significance and confidence intervals. Precise estimates enable firm conclusions to be drawn about an intervention's effect relative to another intervention or control. An imprecise estimate is one whose confidence interval is so wide that the superiority or inferiority of an intervention cannot be determined. Precision is related to the statistical power of the study. An outcome that was not the primary outcome or not prespecified will generally be less precise than the primary outcome of a study. In a meta-analysis, precision is reflected by the confidence interval around the summary effect size. For systematic reviews, which include multiple studies but no quantitative summary estimate, the quantitative information from each study should be considered in determining the overall precision of the body of included studies because some studies may be more precise than others. Determining precision across many studies without conducting a formal meta-analysis is challenging and requires judgment. A more precise body of evidence increases the strength of evidence and less precision reduces the strength of a body of evidence.

Following discussion of the four criteria for the strength of evidence grading options, in some cases, the expert panels and work groups also considered other factors. For example, the objectivity of an outcome measure needs to be assessed. Total mortality (usually recorded accurately) is a more objective measure than angina. Similarly, urinary sodium excretion is a more objective measure than dietary sodium intake reported by study subjects through recall. Likewise, measured height and weight, used to calculate a study subject's BMI, are more objective measures than self-reported weight and height.

After the panel and work group members reviewed and discussed this range of factors, they voted on the final grade for the strength of evidence for each evidence statement. Methodologists provided analysis and recommendations regarding strength of evidence grading but did not participate in the voting process. A simple majority vote was sufficient to identify the strength of evidence grade. However, in most cases, the panels and work groups discussed the results if there were dissenting opinions until they achieved consensus or large majorities for the votes on the strength of evidence.

xx Policy and Procedures for the Use of Existing Systematic Reviews and Meta-Analyses

Systematic reviews and meta-analyses are routinely used in evidence reviews, and well-conducted systematic reviews and meta-analyses of RCTs are generally considered to be among the highest forms of evidence. As a result, systematic reviews and meta-analyses could be used to inform guideline development in the NHLBI CVD adult systematic evidence review project if certain criteria were met. AHRQ has published guidance on using existing systematic reviews, which has helped to inform the development of the NHLBI criteria [428].

To use existing systematic reviews or meta-analyses to inform the NHLBI evidence report, the project needed to identify: [1] those studies relevant to the topic of interest, [2] those where the risk of bias was low, and [3] those that were recent. The first item was addressed by examining the research question and component studies in the systematic reviews and meta-analyses as they related to the NHLBI CQs. The second item was addressed by using a quality assessment tool and the third was addressed by examining publication dates.

In general, the project followed the process below in using systematic reviews and meta-analyses:

  • Eligibility of systematic reviews and meta-analyses was determined by the methodologists, in consultation with expert panels or work groups as needed.
  • Data were not formally abstracted from systematic reviews or meta-analyses using the database system to create individual evidence tables. Data from the systematic reviews and meta-analyses used for CQ1 and CQ2 were pulled from the studies and included in summary tables, but not in individual evidence tables. The citations were included in the reference list.
  • Systematic reviews or meta-analyses were rated using the quality assessment tool for this project. Systematic reviews or meta-analyses were used to develop evidence statements if they were rated “good” or “fair” or were comprehensive reviews commissioned by the Federal Government. Systematic reviews or meta-analyses rated as “poor” were only used when there were no eligible good or fair publications; this occurred for Obesity CQ2.
  • If an existing systematic review or meta-analysis was used to develop evidence statements:

    • Multiple eligible systematic reviews and meta-analyses addressing the same topic were identified through a systematic search to minimize bias. The systematic reviews or meta-analyses used were summarized in text, tables, or appendixes.

    • Rating the body of evidence followed the same system used for the de novo systematic reviews conducted for this project and resulted in a high (systematic reviews or meta-analyses rated “good” only), moderate, or low rating based on number, type, and quality of the studies in the systematic review or meta-analysis. In most cases, the number of systematic reviews or meta-analyses was also considered when rating the body of evidence.
    • Recommendation strength took into account whatever evidence was available in the systematic reviews or meta-analyses used to make the recommendation, including issues like strength of the evidence, applicability of the evidence, consistency of the evidence, and others. Any level of recommendation could be made, as long as it was supported by the evidence being used to make the recommendation: Grade A (Strong) (a strong recommendation can be given only if the systematic reviews or meta-analyses used to make the recommendation are rated as Good), B (Moderate), C (Weak), D (Against), E (Expert Opinion), N (No Recommendation).
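The selection and rating rules in the list above can be sketched as follows. This is a hedged Python illustration rather than the project's procedure: the `Review` class, its field names, and both function names are hypothetical. Treating a federally commissioned comprehensive review as usable regardless of its rating, and requiring every review to be rated "good" before a high rating is allowed, are one reading of the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Review:
    citation: str
    rating: str                          # "good", "fair", or "poor"
    federally_commissioned: bool = False


def usable_reviews(reviews: List[Review]) -> List[Review]:
    """Prefer good/fair or federally commissioned reviews; fall back to poor ones."""
    preferred = [
        r for r in reviews
        if r.rating in ("good", "fair") or r.federally_commissioned
    ]
    if preferred:
        return preferred
    # Poor-rated reviews are used only when nothing better is eligible.
    return [r for r in reviews if r.rating == "poor"]


def high_rating_allowed(reviews: List[Review]) -> bool:
    """A high evidence rating is reserved for reviews rated 'good' only."""
    return bool(reviews) and all(r.rating == "good" for r in reviews)
```

The fallback branch corresponds to the case the report notes for Obesity CQ2, where no eligible good or fair publications were available.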

Three criteria were used to determine when systematic reviews or meta-analyses could be used.

Situation #1—When a systematic review or meta-analysis addresses a topic relevant to the NHLBI CVD systematic evidence reviews that was not covered by an existing CQ (e.g., effects of physical activity on CVD risk):

  1. For a systematic review or meta-analysis to be examined for relevance to the topic of interest, the topic needed to be prespecified in the form of a CQ using the PICO structure (population, intervention/exposure, comparator, and outcome). If only portion(s) of a systematic review were relevant, those relevant portions that were reported separately could be used. For example, in the Department of Health and Human Services' (HHS) 2008 systematic review on physical activity, the effects of physical activity on CVD were relevant and used to make evidence statements because they were reported in a separate chapter. However, the effects of physical activity on mental health would not be relevant and, therefore, were not used in crafting NHLBI evidence statements.
  2. Systematic reviews or meta-analyses could be used if they were recent (i.e., published within 3 years of the end date of the NHLBI systematic review publication window of December 31, 2009), or identified by the panel or work group if published after the end date of the project literature search and before the panel began to deliberate on evidence statements. If the end date of the systematic review or meta-analysis literature search was before December 31, 2009, panel or work group members could conduct a bridging literature search through December 31, 2009 in the following situations: [1] if they believed it was necessary to review relevant studies, published after the end date, and [2] if the bridging literature search covered the period up to 1 year before the literature search cut-off date of the systematic review or meta-analysis and extended no later than December 31, 2009.
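The bridging search window described in item 2 can be sketched as a date calculation. This Python sketch rests on an interpretation: it reads "up to 1 year before the literature search cut-off date" as the earliest permitted start of the bridging search, which the text does not state explicitly. The function name is hypothetical.

```python
from datetime import date
from typing import Optional, Tuple

PROJECT_END = date(2009, 12, 31)  # end of the NHLBI publication window, from the text


def bridging_window(review_cutoff: date) -> Optional[Tuple[date, date]]:
    """Return the (start, end) dates of a permissible bridging search, if any.

    Assumption: the window starts 1 year before the review's own search
    cut-off and ends at the project end date.
    """
    if review_cutoff >= PROJECT_END:
        return None  # the review's search already reaches the project end date
    try:
        start = review_cutoff.replace(year=review_cutoff.year - 1)
    except ValueError:
        # The cut-off is February 29 and the previous year has no such day.
        start = review_cutoff.replace(year=review_cutoff.year - 1, day=28)
    return start, PROJECT_END
```

Whether a bridging search was actually run remained a judgment of the panel or work group; the sketch only bounds the dates it could cover.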

Situation #2—If the NHLBI literature review identified an existing systematic review or meta-analysis that could possibly replace NHLBI's review of a CQ or subquestion:

  1. The systematic review or meta-analysis was examined for consistency between the studies in the systematic review or meta-analysis and the CQ I/E criteria. Component studies had to meet the I/E criteria; however, smaller sample sizes were allowed, as were studies published before the beginning of the NHLBI project's search date window, as long as a truly systematic approach was used. If the end date of the systematic review or meta-analysis literature search was before December 31, 2009, panel or work group members could conduct a bridging literature search through December 31, 2009 in these situations: [1] if they believed it was necessary to review relevant studies, published after the end date, and [2] if the bridging literature search covered the period up to 1 year before the literature search cut-off date of the systematic review and meta-analysis and extended no later than December 31, 2009.

Situation #3—If NHLBI's literature review identified an existing systematic review or meta-analysis that addressed the same or a similar CQ or subquestion as one undergoing NHLBI review:

  1. Systematic review or meta-analysis component articles that met all the I/E criteria for the CQ, but were not identified in NHLBI's literature search, could be added to the included studies in NHLBI's review and treated the same way (i.e., abstracted, quality rated, and added to evidence and summary tables).

xxi Peer Review Process

A formal peer-review process was undertaken that included inviting several scientific experts and representatives from multiple Federal agencies to review and comment on the draft documents. NHLBI selected scientific experts with diverse perspectives to review the reports. Potential reviewers were asked to sign a confidentiality agreement, but NHLBI did not collect COI information from the reviewers. DARD staff collected reviewers' comments and forwarded them to the respective panels and work groups for consideration. Each comment received was addressed by a narrative response, a change to the draft document, or both. A compilation of the comments received and the panels' and work groups' responses was submitted to the NHLBI Advisory Council working group; individual reviewers did not receive responses.

Appendix B: Question-Specific Methods

i Search Strategy Overview and Syntax of Queries

This section describes how search strategies for the NHLBI systematic evidence review initiative were constructed and explains how to interpret the search strategies that are documented in the following section.

A search strategy is an expression of conditions connected by the logical operators AND, OR, and NOT. Parentheses are used to group conditions. Each condition is described by attributes, operators, and values. Table B-1 shows examples of queries and descriptions of results. A complete list of attributes used in search strategies with their explanations is listed in Table B-2. Commonly used macro queries are defined in Table B-3.

To increase the readability of search strategies, conditions are grouped in meaningful components. There are three major types of components: [1] study type query, [2] Boolean search, and [3] Boolean filter. These three components are connected with the AND operator; thus, a citation must satisfy all three component queries to be retrieved. The I/E criteria for each question, which were defined using the PICOTS structure, are implemented in search strategies using the study type query, Boolean search, and Boolean filter.

  • Study type query: Consists of expressions that retrieve the study designs that are eligible for inclusion in the body of evidence as defined in the criteria (i.e., RCTs, systematic reviews, prospective cohort studies, etc.)
  • Boolean search: Implements expressions for (PICOTS)
  • Boolean filter: Implements an extension of search or comparator criterion

Each of the components may use NOT queries to implement exceptions.
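The way the three components combine can be sketched with ordinary predicates. The Python below is a toy model, not the syntax of the search engine used in the project. The citation field names (`study_type`, `publication_year`, `subject`), the helper functions, and the heavily simplified conditions are all hypothetical.

```python
from typing import Any, Callable, Dict, List

Citation = Dict[str, Any]
Condition = Callable[[Citation], bool]


def all_of(*conds: Condition) -> Condition:
    """AND: every condition must hold."""
    return lambda c: all(cond(c) for cond in conds)


def any_of(*conds: Condition) -> Condition:
    """OR: at least one condition must hold."""
    return lambda c: any(cond(c) for cond in conds)


def negate(cond: Condition) -> Condition:
    """NOT: used to implement exceptions."""
    return lambda c: not cond(c)


def subject_in(*terms: str) -> Condition:
    """True when the citation carries any of the given subject headings."""
    wanted = {t.lower() for t in terms}
    return lambda c: bool(wanted & {s.lower() for s in c.get("subject", [])})


# Hypothetical, simplified components for illustration only.
study_type_query: Condition = lambda c: c.get("study_type") == "systematic review"
boolean_search = all_of(
    lambda c: c.get("publication_year", 0) > 1997,
    any_of(subject_in("Overweight", "Obesity"), subject_in("Weight Loss")),
    negate(subject_in("Animals")),  # a NOT query implementing an exception
)
boolean_filter = subject_in("Body Weight", "Body Mass Index")

# The three components are connected with AND.
strategy = all_of(study_type_query, boolean_search, boolean_filter)


def retrieve(citations: List[Citation]) -> List[Citation]:
    """Keep only the citations that satisfy all three components."""
    return [c for c in citations if strategy(c)]
```

Ranking of the retrieved citations by keyword relevance is a separate step and is not modeled here.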

In addition to the strict Boolean strategy, results are ranked using keywords specified for integrated ranking of the TeraText Rank Engine and Content Analyst Conceptual Engine. Ranking helps to identify the most relevant citations first, as the titles and abstracts are analyzed for the presence and frequency of the keywords.

Table B-1. Examples of Simple Queries

Table B-2. Attributes, Their Values, and Explanation

Table B-3. Common Macro Queries Used in Search Strategies

ii Critical Question 1: Search Strategy

Among overweight and obese adults, does achievement of reduction in body weight with lifestyle and pharmacological interventions affect CVD risk factors, CVD events, morbidity, and mortality?

  1. Does this effect vary across population subgroups defined by the following demographic and clinical characteristics:
    • Age
    • Sex
    • Race/ethnicity
    • BMI
    • Baseline waist circumference (WC)
    • Presence or absence of comorbid conditions
    • Presence or absence of CVD risk factors
  2. What amount (shown as percent lost, pounds lost, etc.) of weight loss is necessary to achieve benefit with respect to CVD risk factors, morbidity, and mortality?
    • Are there benefits on CVD risk factors, CVD events, morbidity, and mortality from weight loss?
    • What are the benefits of more significant weight loss?
  3. What is the effect of sustained weight loss for 2 or more years in individuals who are overweight or obese on CVD risk factors, CVD events, and health and psychological outcomes?
    • What percent of weight loss needs to be maintained at 2 or more years to be associated with health benefits?
a Study Type Query

Study types eligible for CQ1: Systematic reviews or meta-analyses.

  • {Systematic Review}
b Boolean Search

(

  • (publicationYear >1997)
  • AND (subject, title, abstract=(“Overweight” or “Obesity” or “Obesity Morbid” or “Body Mass” or “Waist Circumference”) or obese or majorSubject=(“Weight Loss” or “Diet, Reducing”))
  • AND (subject, title, abstract=(“Weight Loss” or “Diet, Reducing”) or (weight %5 reduc?))
  • AND ((subject, qualifier, abstract, title=(mortality or morbidity or prevalence or incidence or physiopathology or epidemiology or “Treatment outcome” or therapy or “therapeutic use” or Risk factor? or “Fatal Outcome” or “Survival Rate” or Myocardial Infarction? or “Myocardial Stunning” or “No-Reflow Phenomenon” or “Shock, Cardiogenic” or Heart Failure? or “Dyspnea, Paroxysmal” or “Edema, Cardiac” or Stroke? or “Kidney Failure, Chronic”) or death? or died or fatal or ((CVD or CV or cardiovascular or CHF or heart failure) %2 (event? or hospitalization)) or Chronic Kidney Failure or CKD or Chronic Kidney Disease or End Stage Renal or ESRD)
    • or (((subject=(“Fatty Liver”)) with (qualifier=(blood or diagnosis))) not subject=Alcohol?) or nonalcoholic steatohepatitis or NASH
    • or ((subject=(Depression)) with (qualifier=(blood or diagnosis)))
    • or ((subject=(Hypertension or Cholesterol or Diabetes or Metabolic Syndrome X)) with (qualifier=(blood or diagnosis)))
    • or subject, title, abstract=(“Blood pressure” and (systol? or diastol?)) or BP or SBP or DPB or hypertensive or non-hypertensive or blood pressure goal?
    • or ((subject=(Triglycerides or “Cholesterol” or “Apolipoproteins B” or Apolipoprotein B? or “Apolipoprotein A-I” or “Apolipoproteins A” or Apolipoproteins or “Lipoprotein(a)” or “Apoprotein(a)”)) with (qualifier=(blood or metabolism))) or Triglyceride? or HDL Cholesterol or HDL-C or Apolipoprotein B? or apoB or Apolipoprotein A? or apoA-1 or Lp(a) or “Lipoprotein (a)” or “Apoprotein(a)” or total cholesterol or TC or LDL particle number or LDL-P or (LDL and subject, abstract, title=“Particle Size”) or lipid goal?
    • or subject=“Glucose Tolerance Test” or ((subject=(Blood Glucose or Insulin or “Hemoglobin A, Glycosylated”)) with (qualifier=(blood or diagnostic))) or (fasting %2 glucose) or (fasting %2 insulin) or A1c or HOMA or IVGTT or OGTT or glycemic control goal?
    • or ((subject=“C-Reactive Protein”) with (qualifier=(metabolism or analysis))) or hs-CRP or CRP or hsCRP or “C-reactive protein”
    • )
  • AND ((((Subject=(Obesity or Overweight)) with (qualifier=(“drug therapy” or epidemiology)))) or placebo
    • or subject, title, abstract=(“Anti-Obesity Agent?” or “Appetite Depressant?”)
    • or subject, title, abstract=(Diethylpropion or Phenmetrazine or Phentermine or Phenylpropanolamine)
    • or substance, abstract, title=(amylin or benfluorex or butenolide or “FG 7142” or lipid mobilizing substance or norpseudoephedrine or oleoyl-estrone or orlistat or perflubron or pyroglutamyl-histidyl-glycine or satietin or sibutramine or topiramate)
    • or qualifier=(“therapeutic use” or “drug effects”)
    • or qualifier=“diet therapy”
    • or subject=(“Life style” or “Life Change Events” or Lifestyle or “Risk Reduction Behavior” or “Behavior Therapy” or Exercise or “Physical Fitness”) or “lifestyle intervention” or “energy intake” or cardiorespiratory fitness
    • or majorSubject=(“Weight Loss” or Obesity or Overweight or “Body Mass Index” or Diet or “Psychotherapy, Group”)
    • or subject, title, abstract=“Combined Modality Therapy”
    • or ((((Subject=(Obesity or Overweight)) with (qualifier=“diet therapy”))) or (Subject=(Obesity or Overweight) and Subject=Diet))
    • or (diet %2 exercise)
    • or ((pharmacological or non-pharmacological) %2 intervention?)
    • )
  • AND (subject, title, abstract=“Body Weight” or subject=“Body Weight Changes”
    • or subject, title, abstract=“Body Mass Index” or BMI
    • or subject, title, abstract=(“Weight Loss” or weight)
    • or subject, title, abstract=(“Waist-Hip Ratio” or “Waist Circumference”)
    • or subject, title, abstract=(“Body Fat Distribution” or Adiposity))
    • )
  • NOT {Non-Westernized Countries}
  • NOT majorSubject=(“Dietary Supplements”)
  • NOT majorSubject=(Accreditation)
  • NOT majorSubject=(“Digestive System Surgical Procedures” or “Bariatric Surgery” or “Gastric Bypass” or “Gastric Balloon” or Laparoscopy or Gastroplasty or Coronary Artery Bypass or Gastrectomy or “Biliopancreatic Diversion”)
  • NOT (((subject=(“Digestive System Surgical Procedures” or “Bariatric Surgery” or “Gastric Bypass” or “Gastric Balloon” or Laparoscopy or Gastroplasty or Coronary Artery Bypass or Gastrectomy or Biliopancreatic Diversion)) with (qualifier=(instrumentation or methods or adverse effects or economics or standards or statistics))))
  • NOT subject=(“Postoperative Complications” or Reoperation or “Postoperative Period” or “Length of Stay” or “Reconstructive Surgical Procedures” or “Equipment and Supplies” or “Preoperative Care” or “Postoperative Care” or “Prenatal Care” or “Weight Gain and Pregnancy” or “Pregnancy Complications”)
  • NOT subject=(“Equipment Design” or “Advertising as Topic”)
  • NOT (subject=(“Pilot Projects”) or pilot study)
  • NOT subject=((child or adolescent) not (adult or aged))
  • NOT subject=(“Child Nutrition” or “Child Behavior” or “Child, Preschool” or “Child Development” or “Infant Food”)
  • NOT subject=(Heel or Foot diseases or Cosmetic techniques or Hair Removal or Hirsutism)
  • NOT majorSubject=(“Research Design” or Questionnaires)
  • NOT (((self-report?) %3 weight) not (qualifier, abstract, title, subject=mortality or subject, title, abstract=(Myocardial Infarction or Heart Failure or Stroke or CVD event? or CHD event?)))
  • NOT ((week or days) not (week? or month? or year?))
  • NOT (subject=(Animals or Venoms))
  • NOT (title=(binge eating or schizophrenia))
  • NOT (genre=randomized)
  • NOT (recordStatus=delete)
c Critical Question 1: Search Strategy Results and PRISMA Diagram

CQ1 was initially intended to be a de novo systematic review of original studies plus systematic reviews and meta-analyses. In 2011, the question was de-scoped and restricted to systematic reviews and meta-analyses only. The initial and subsequent exclusive supplemental systematic reviews and meta-analyses search included the bibliographic databases listed below. The search strategy presented above is the final strategy, which queries for systematic reviews and meta-analyses.

  • PubMed from January 2000 to October 2011
  • CINAHL from January 2000 to July 2008
  • EMBASE from January 2000 to July 2008
  • PsycINFO from January 2000 to July 2008
  • Evidence-based Medicine Cochrane Libraries from January 2000 to July 2008
  • Biological Abstracts from January 2004 to July 2008
  • Wilson Social Sciences Abstracts from January 2000 to July 2008

The literature search for CQ1 included an electronic search of the Central Repository for systematic reviews and meta-analyses published in the literature from January 2000 to October 2011. The Central Repository contains citations pulled from seven literature databases: PubMed, CINAHL, EMBASE, PsycINFO, EBM, Biological Abstracts, and Wilson Social Sciences Abstracts. The search produced 1,630 citations, with 3 additional citations identified from non-search sources (i.e., by the panel members) [22-24].

The PRISMA diagram in figure B-1 outlines the flow of information from the literature search through the various steps used in the systematic review process.

Figure 13.

PRISMA Diagram Showing Selection of Articles for CQ1

Key: Details for each exclusion rationale are determined by the I/E criteria for the question, reproduced below. The I/E criteria are also available in Section 5a.

Two reviewers independently screened the titles and abstracts of 1,633 publications against the I/E criteria, resulting in 936 publications being excluded and 697 publications being retrieved for full-text review to further assess eligibility. Then, two reviewers independently screened and assessed the 697 full-text publications for eligibility by applying the I/E criteria; 669 of these publications were excluded based on one or more of the I/E criteria (see specified rationale as noted in the PRISMA diagram).
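The screening counts reported above can be checked with simple arithmetic. The following Python sketch uses only the numbers stated in the text; the function name and dictionary keys are hypothetical.

```python
from typing import Dict


def screening_flow(
    identified_by_search: int,
    identified_elsewhere: int,
    excluded_at_title_abstract: int,
    excluded_at_full_text: int,
) -> Dict[str, int]:
    """Carry citation counts through the two screening stages."""
    screened = identified_by_search + identified_elsewhere
    full_text_reviewed = screened - excluded_at_title_abstract
    remaining = full_text_reviewed - excluded_at_full_text
    if full_text_reviewed < 0 or remaining < 0:
        raise ValueError("exclusions exceed the number of publications available")
    return {
        "screened": screened,
        "full_text_reviewed": full_text_reviewed,
        "remaining_after_full_text": remaining,
    }


# Counts taken from the text: 1,630 search results plus 3 from other sources.
flow = screening_flow(1630, 3, 936, 669)
# flow == {"screened": 1633, "full_text_reviewed": 697, "remaining_after_full_text": 28}
```

The totals of 1,633 screened and 697 retrieved for full-text review match the figures given in the paragraph above.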

Table B-4. Criteria for Selection of Publications for CQ1

Forty-two of the 697 full-text publications met the criteria and were included. The quality (internal validity) of these 42 publications was assessed using the quality assessment tool developed to assess systematic reviews or meta-analyses or RCTs (see Appendix A). Of these, 14 publications were rated poor quality; rationales for the poor-quality ratings are included in Appendix B. The remaining 28 publications were rated good or fair quality and included in the evidence base that was used to formulate the evidence statements.

NHLBI approved using relevant data from an RCT (i.e., Look AHEAD) based on the following rationale. Look AHEAD is a prospective, multicenter RCT that examined the effects of intensive lifestyle intervention (ILI) versus usual diabetes care, referred to as diabetes support and education, on CV morbidity and mortality in 5,145 overweight or obese participants with type 2 diabetes. This single trial provides data on more patients than the two meta-analyses by Norris [53] and Norris [50] (N=4,659), and almost as many as the Norris [51] (N=5,956) and Orozco [65] (N=5,956) meta-analyses. The Look AHEAD investigators have provided 4-year comparison outcome data [23] and, more importantly, 1-year dose-response data that relate the amount of weight loss to predefined CVD risk factors [22].

After approval was received to include relevant data from the Look AHEAD study, the de novo citations included during the early screening stages were searched again for RCTs of similar size to the Look AHEAD trial (≥5,000 participants); no additional relevant studies were found.

A total of 42 publications were included in the CQ1 evidence base; 39 were systematic reviews or meta-analyses and 3 were RCTs. The panel members reviewed the final articles on the “include” list along with their quality ratings and had the opportunity to raise questions. For CQ1, panel members created spreadsheets containing key information from the systematic reviews and meta-analyses and the Look AHEAD studies; these spreadsheets, cross-checked by the methodology and systematic review teams, formed the basis for panel deliberations.

iii Critical Question 2: Search Strategy
  1. Are the current cutpoint values for overweight (BMI 25.0 to 29.9 kg/m2) and obesity (BMI ≥30 kg/m2) compared with BMI 18.5 to 24.9 kg/m2 associated with elevated CVD-related risk (defined below)? Are the WC cutpoints of >102 cm (M) and >88 cm (F) associated with elevated CVD-related risk (defined below)? How do these cutpoints compare with other cutpoints in terms of elevated CVD risk?
    • Fatal and nonfatal CHD, stroke, and CVD
    • Overall mortality
    • Incident type 2 diabetes mellitus
    • Incident dyslipidemia
    • Incident hypertension
  2. Are differences across population subgroups in the relationships of BMI and WC cutpoints with CVD sufficiently large to warrant different cutpoints? If so, what should they be?
    • Fatal and nonfatal CHD, stroke, and CVD
    • Overall mortality
    • Incident type 2 diabetes mellitus
    • Incident dyslipidemia
    • Incident hypertension
    Groups being considered include:
    • Age
    • Sex (both male and female)
    • Race/ethnicity (African American, Hispanic, Native American, Asian, Caucasian)
  3. What are the associations between maintaining weight and weight gain with elevated CVD-related risk in normal weight, overweight, and obese adults?
    1. Study Type Query
       Study types eligible for CQ2: Systematic reviews, meta-analyses, or pooled analyses focusing only on CHD, CVD, and mortality as outcomes.
      • ({Systematic Review} or
        • ((subject=(Longitudinal Studies) or pooling or pooled or collaborative anal? or genre, title, abstract=Multicenter or (stratif? %5 study center) or Mantel? or Peto or DerSimonian or Laird or Woolf or subject, title, abstract=(Bayesian or (Sensitivity and Specificity)) or random effects or Meta-regression or (integrat? anal?) or between-study variance or ((variance or heterogeneity) %2 stud?)) and
        • majorSubject, title=(“Body Mass” or “Waist Circumference” or BMI or Anthropometry or “Body Weights and Measures”)))
      • AND
      • ({Cardiovascular Diseases} or
        • subject, qualifier, title, abstract=mortality or death? or died or subject=(“Cause of Death” or “Fatal Outcome” or “Survival Rate”) or subject, title, abstract=(Diabetes or “Glucose Metabolism Disorders” or “Metabolic Syndrome X” or Dyslipid? or Hyperlipid? or Hypercholesterol? or Hyperlipoprotein? or Hypertriglycerid? or “Tangier Disease” or “Smith Lemli Opitz Syndrome” or “Hyperglycemia” or “Glucose Intolerance” or “Prediabetic State” or “Insulin Resistance”)
    2. Boolean Search

      • (subject, title, abstract=(“Body Mass” or “Waist Circumference” or BMI or Anthropometry or “Body Weights and Measures”) AND ((publicationYear>1999 and publicationYear<2012)))
      • NOT {Non-Westernized Countries}
      • NOT (majorSubject=(Angioplasty or Laparoscopy))
      • NOT (subject=“Postoperative Complications”)
      • NOT (subject, title, abstract=malnutrition)
      • NOT (subject=(Vaccines))
      • NOT ((subject=(“Bariatric Surgery” or Gastroplasty or Gastric Bypass)) with (qualifier=“adverse effects”))
      • NOT ((subject=(Obesity)) with (qualifier=surgery))
      • NOT (title=chemotherapy)
      • NOT (subject=(“Single-blind method” or “Double-blind method”) or genre=Randomized)
      • NOT subject=(“Postoperative Complications” or Reoperation or “Postoperative Period” or “Length of Stay” or “Reconstructive Surgical Procedures” or “Equipment and Supplies” or “Preoperative Care” or “Postoperative Care” or “Prenatal Care” or “Weight Gain and Pregnancy” or “Pregnancy Complications”)
      • NOT subject=(“Equipment Design” or “Advertising as Topic”)
      • NOT (subject=(“Pilot Projects”) or pilot study)
      • NOT subject=((child or adolescent) not (adult or aged))
      • NOT subject=(“Child Nutrition” or “Child Behavior” or “Child Development” or “Infant Food”)
      • NOT subject=(Heel or Foot diseases or Cosmetic techniques or Hair Removal or Hirsutism)
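
The structure of the Boolean search above (anthropometric terms AND a publication-year window, minus a list of NOT clauses) can be illustrated with a minimal Python sketch. The `Citation` record, the helper names, and the three-item exclusion set are hypothetical simplifications introduced for illustration; they are not the Central Repository schema or the panel's actual tooling, and only a small subset of the NOT clauses is represented.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Citation:
    """Hypothetical, simplified citation record (not the Central Repository schema)."""
    title: str
    publication_year: int
    subjects: set = field(default_factory=set)


# Terms taken from the first line of the CQ2 Boolean search.
ANTHROPOMETRIC_TERMS = [
    "Body Mass", "Waist Circumference", "BMI",
    "Anthropometry", "Body Weights and Measures",
]

# Illustrative subset of the NOT clauses; the full strategy lists many more.
EXCLUDED_SUBJECTS = {"postoperative complications", "vaccines", "pilot projects"}


def in_publication_window(year: int) -> bool:
    # publicationYear>1999 and publicationYear<2012 are strict inequalities,
    # so the window covers 2000 through 2011 inclusive.
    return 1999 < year < 2012


def mentions_term(text: str) -> bool:
    # Whole-word, case-insensitive match so that "BMI" does not match "submit".
    return any(
        re.search(r"\b" + re.escape(term) + r"\b", text, re.IGNORECASE)
        for term in ANTHROPOMETRIC_TERMS
    )


def matches_cq2_sketch(citation: Citation) -> bool:
    # Search the title and subject headings together (abstracts omitted for brevity).
    searchable = " ; ".join([citation.title, *sorted(citation.subjects)])
    excluded = any(s.lower() in EXCLUDED_SUBJECTS for s in citation.subjects)
    return (
        mentions_term(searchable)
        and in_publication_window(citation.publication_year)
        and not excluded
    )
```

The strict inequalities explain why the year bounds in the query (1999 and 2012) differ from the January 2000 to October 2011 search period reported for CQ2.
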
    3. Critical Question 2: Search Strategy Results and PRISMA Diagram

CQ2 was initially intended to be a de novo systematic review of original studies plus systematic reviews and meta-analyses. In 2011, CQ2 was de-scoped and restricted to systematic reviews and meta-analyses only. Both the initial search and the subsequent supplemental search, which was limited to systematic reviews and meta-analyses, covered the bibliographic databases listed below. The search strategy presented above is the final strategy, which queries for systematic reviews or meta-analyses.

  • PubMed from January 2000 to October 2011
  • CINAHL from January 2000 to July 2008
  • EMBASE from January 2000 to July 2008
  • PsycINFO from January 2000 to July 2008
  • Evidence-Based Medicine Cochrane Libraries from January 2000 to July 2008
  • Biological Abstracts from January 2004 to July 2008
  • Wilson Social Sciences Abstracts from January 2000 to July 2008

The literature search for CQ2 included an electronic search of the Central Repository for systematic reviews and meta-analyses published in the literature from January 2000 to October 2011. The Central Repository contains citations pulled from seven literature databases: PubMed, CINAHL, EMBASE, PsycINFO, EBM, Biological Abstracts, and Wilson Social Sciences Abstracts. The search produced 1,566 citations, with 5 additional citations identified from non-search sources (i.e., by the panel members). Three of the five citations met the criteria and were eligible for inclusion in the CQ2 evidence base. [67-69] In contrast, the other two citations did not meet the criteria and were excluded from the CQ2 evidence base. [70, 71]

The PRISMA diagram in Figure B-2 outlines the flow of information from the literature search through the various steps used in the systematic review process.

Figure 14.

PRISMA Diagram Showing Selection of Articles for CQ2

Key: Details for each exclusion rationale are determined by the I/E criteria for the question, reproduced below. The I/E criteria are also available in Section 6a.

Two reviewers independently screened the titles and abstracts of 1,571 publications against the I/E criteria, resulting in 1,089 publications being excluded and 482 publications being retrieved for full-text review to further assess eligibility. Next, two reviewers independently screened and assessed 482 full-text publications for eligibility by applying the I/E criteria; 467 of these publications were excluded based on one or more of the I/E criteria (see specified rationale as noted in the PRISMA diagram).

Fifteen of the 482 full-text publications met the criteria and were included. The quality (internal validity) of these 15 publications was assessed using the quality assessment tool developed to assess systematic reviews and meta-analyses (see Appendix A). Of these, 12 publications were rated as poor quality; however, they were used as part of the evidence base since NHLBI policy indicated that poor studies could be used as part of the evidence base if the majority of included studies were not rated good or fair. Rationales for the poor-quality ratings are included in Appendix B. The remaining three systematic reviews and meta-analyses were rated good or fair quality and included in the evidence base that was used to formulate the evidence statements. Panel members reviewed the final articles on the “include” list, along with their quality ratings, and had the opportunity to raise questions. Some systematic reviews and meta-analyses previously deemed to be of poor quality were upgraded to fair quality upon closer review by the methodology team, who made the final decision. [81, 82] For this question, panel members created spreadsheets containing key information from the systematic reviews and meta-analyses; these spreadsheets, cross-checked by the methodology and systematic review teams, formed the basis for panel deliberations.
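
The counts reported for CQ2 can be cross-checked with a short tally. The sketch below is illustrative only: the `PrismaCounts` class and its field names are hypothetical, and the four input numbers are the ones stated in the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrismaCounts:
    """Hypothetical container for the four counts reported at each screening stage."""
    from_search: int
    from_other_sources: int
    excluded_at_title_abstract: int
    excluded_at_full_text: int

    @property
    def screened(self) -> int:
        # Citations entering title and abstract screening.
        return self.from_search + self.from_other_sources

    @property
    def full_text_reviewed(self) -> int:
        # Publications retrieved for full-text review.
        return self.screened - self.excluded_at_title_abstract

    @property
    def included(self) -> int:
        # Publications that met the I/E criteria after full-text review.
        return self.full_text_reviewed - self.excluded_at_full_text


cq2 = PrismaCounts(
    from_search=1566,
    from_other_sources=5,
    excluded_at_title_abstract=1089,
    excluded_at_full_text=467,
)

print(cq2.screened)            # 1571
print(cq2.full_text_reviewed)  # 482
print(cq2.included)            # 15
```

The 15 included publications correspond to the 12 rated poor quality plus the 3 rated good or fair.
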

iv Critical Question 3: Search Strategy

CQ3 has two parts:

  1. In overweight or obese adults, what is the comparative efficacy/effectiveness of diets of differing forms and structures (macronutrient content, carbohydrate and fat quality, nutrient density, amount of energy deficit, dietary pattern) or other dietary weight loss strategies (e.g., meal timing, portion controlled meal replacements) in achieving or maintaining weight loss?
  2. During weight loss or weight maintenance after weight loss, what are the comparative health benefits or harms of the above diets and other dietary weight loss strategies?

a. Study Type Query

Study types eligible for CQ3: RCTs, systematic reviews of RCTs, or controlled clinical trials. No restrictions on sample size.

Exclusions: Case series, case reports, before-after studies, unpublished literature, unpublished industry-sponsored trials, other unpublished data, FDA Medical and Statistical reviews, theses, studies published only as abstracts, letters, commentaries and opinion pieces, and nonsystematic reviews. Also excluded were studies in which results were not compared according to randomized treatment assignments and studies with a dropout rate >40 percent after 6 months.

  • {RCT} OR {Systematic Review} OR
  • NOT genre, title, subject=(case reports or case study or case series or before after)
  • NOT (title=(case report or commentary) OR genre=(letter or abstract or newspaper article or comment?))

b. Boolean Search

  • (
  • (publicationYear>1997 and publicationYear<2010 and language=eng?)
  • AND (overweight? or obesity or obese or subject=(obesity or overweight) or ((“body mass index” or BMI) %3 (2? or 3? or 4?)) or majorSubject=(“Weight Loss” or “Diet, Reducing”))
  • AND (diet? or meal? or low-glycemic index or glycemic load or therapeutic lifestyle change? or TLC or energy density or portion control or volumetrics or subject=(diet or dietary or Energy Intake or Caloric Restriction))
  • AND (weight %3 los? or weight reduc? or weight maintenance or subject=“weight loss” or subject=“body weight” or subject=“weight reduction” or majorSubject=“Diet, Reducing”))
  • )
  • NOT majorSubject=(Accreditation)
  • NOT (((subject=(“Digestive System Surgical Procedures” or “Bariatric Surgery” or “Gastric Bypass” or “Gastric Balloon” or Laparoscopy or Gastroplasty or Coronary Artery Bypass or Gastrectomy or Biliopancreatic Diversion)) with (qualifier=(instrumentation or methods or adverse effects or economics or standards or statistics))))
  • NOT subject=(“Postoperative Complications” or Reoperation or “Postoperative Period” or “Length of Stay” or “Reconstructive Surgical Procedures” or “Equipment and Supplies” or “Preoperative Care” or “Postoperative Care” or “Prenatal Care” or “Weight Gain and Pregnancy” or “Pregnancy Complications”)
  • NOT subject=(“Equipment Design” or “Advertising as Topic”)
  • NOT (subject=(“Pilot Projects”) or pilot study)
  • NOT subject=((child or adolescent) not (adult or aged))
  • NOT subject=(“Child Nutrition” or “Child Behavior” or “Child, Preschool” or “Child Development” or “Infant Food”)
  • NOT subject=(Heel or Foot diseases or Cosmetic techniques or Hair Removal or Hirsutism)
  • NOT subject=(“Africa” OR “Africa Northern” OR “Algeria” OR “Egypt” OR “Libya” OR “Morocco” OR “Tunisia” OR “Africa South of the Sahara” OR “Africa Central” OR “Cameroon” OR “Central African Republic” OR “Chad” OR “Congo” OR “Gabon” OR “Democratic Republic of the Congo” OR “Equatorial Guinea” OR “Africa Eastern” OR “Burundi” OR “Ethiopia” OR “Kenya” OR “Rwanda” OR “Somalia” OR “Sudan” OR “Tanzania” OR “Uganda” OR “Djibouti” OR “Eritrea” OR “Africa Southern” OR “Angola” OR “Botswana” OR “Lesotho” OR “Malawi” OR “Mozambique” OR “Namibia” OR “South Africa” OR “Swaziland” OR “Zambia” OR “Zimbabwe” OR “Africa Western” OR “Benin” OR “Burkina Faso” OR “Gambia” OR “Ghana” OR “Guinea Bissau” OR “Cote d Ivoire” OR “Liberia” OR “Mali” OR “Mauritania” OR “Niger” OR “Nigeria” OR “Senegal” OR “Sierra Leone” OR “Togo” OR “Guinea” OR “Cape Verde” OR “Americas” OR “Central America” OR “Belize” OR “Costa Rica” OR “El Salvador” OR “Guatemala” OR “Honduras” OR “Nicaragua” OR “Panama” OR “Panama Canal Zone” OR “Latin America” OR “South America” OR “Argentina” OR “Bolivia” OR “Brazil” OR “Chile” OR “Colombia” OR “Ecuador” OR “French Guiana” OR “Guyana” OR “Paraguay” OR “Peru” OR “Suriname” OR “Uruguay” OR “Venezuela” OR “Caribbean Region” OR “West Indies” OR “Barbuda and Antigua” OR “Bahamas” OR “Barbados” OR “Cuba” OR “Dominican Republic” OR “Haiti” OR “Jamaica” OR “Martinique” OR “Netherlands Antilles” OR “Puerto Rico” OR “Trinidad and Tobago” OR “Virgin Islands of the United States” OR “Dominica” OR “Grenada” OR “Guadeloupe” OR “Saint Lucia” OR “Saint Vincent and the Grenadines” OR “Saint Kitts and Nevis” OR “Antarctic Regions” OR “Arctic Regions” OR “Asia” OR “Asia Central” OR “Kazakhstan” OR “Kyrgyzstan” OR “Tajikistan” OR “Turkmenistan” OR “Uzbekistan” OR “Asia Southeastern” OR “Borneo” OR “Brunei” OR “Myanmar” OR “Cambodia” OR “Indonesia” OR “Laos” OR “Malaysia” OR “Mekong Valley” OR “Philippines” OR “Singapore” OR “Thailand” OR “Vietnam” OR “East Timor” OR “Asia Western” OR 
“Bangladesh” OR “Bhutan” OR “India” OR “Sikkim” OR “Middle East” OR “Afghanistan” OR “Bahrain” OR “Iran” OR “Iraq” OR “Jordan” OR “Kuwait” OR “Lebanon” OR “Oman” OR “Qatar” OR “Saudi Arabia” OR “Syria” OR “Turkey” OR “United Arab Emirates” OR “Yemen” OR “Nepal” OR “Pakistan” OR “Sri Lanka” OR “Far East” OR “China” OR “Hong Kong” OR “Tibet” OR “Japan” OR “Tokyo” OR “Korea” OR “Macau” OR “Mongolia” OR “Taiwan” OR “Atlantic Islands” OR “Azores” OR “Bermuda” OR “Falkland Islands”)
  • NOT majorSubject=(“Research Design” or Questionnaires)
  • NOT (subject=(Animals or Venoms))
  • NOT (recordStatus=delete)

Table B-5. Criteria for Selection of Publications for CQ2

c. Boolean Filter

The Boolean filter in the CQ3 search strategy implements the intervention criterion to reflect dietary weight loss intervention.

(abstract, title, qualifier=“diet therapy” and abstract, title, subject=“weight loss”)

  • OR ((qualifier=“diet therapy” or (diet? %3 therap?) or majorSubject=(Diet or “caloric restriction” or “glycemic index”)) and (weight %3 los? or weight reduc? or weight maintenance or subject=“weight loss”))
  • OR (title, abstract, subject=diet? and majorSubject=“weight loss”)
  • OR ((subject=“weight loss”) with (qualifier=physiology))
  • OR ((overweight? or obes?) and diet? and (weight loss or weight %2 reduc?))
  • OR (majorSubject=“Diet, Reducing”)
  • OR genre=(Comparative Study or Meta-Analysis)

d. Critical Question 3: Search Strategy Results and PRISMA Diagram

The following databases were searched for RCTs and systematic reviews and meta-analyses of RCTs or controlled clinical trials to answer CQ3:

  • PubMed from January 1998 to December 2009
  • CINAHL from January 1998 to July 2008
  • EMBASE from January 1998 to July 2008
  • PsycINFO from January 1998 to July 2008
  • Evidence-based Medicine Cochrane Libraries from January 1998 to July 2008
  • Biological Abstracts from January 2004 to July 2008
  • Wilson Social Sciences Abstracts from January 1998 to July 2008

The literature search for CQ3 included an electronic search of the Central Repository for RCTs or controlled clinical trials published in the literature from January 1998 to December 2009. The Central Repository contains citations pulled from seven literature databases (PubMed, CINAHL, EMBASE, PsycINFO, EBM, Biological Abstracts, and Wilson Social Sciences Abstracts). The search produced 1,416 citations, with 6 additional citations identified from non-search sources (i.e., by panel members or by hand search of systematic reviews and meta-analyses obtained through the electronic search). Two of the six citations were published after December 31, 2009. Per NHLBI policy, certain lifestyle and obesity intervention studies published after the closing date could be allowed as exceptions. These studies must be RCTs in which each study arm contained at least 100 participants and must have been identified by experts knowledgeable of the literature. One of the two citations published after December 2009 met the criteria and was eligible for inclusion in the CQ3 evidence base. [95] In contrast, the other citation did not meet the criteria and was excluded from the CQ3 evidence base. [96] The remaining four citations were identified through non-search sources (i.e., hand search) by cross-checking the references listed in 28 systematic reviews or meta-analyses. The systematic reviews and meta-analyses were only used for manual searches and were not part of the final evidence base. This manual cross-check was done to ensure that major studies were not missing from the evidence base. As a result of this cross-check, two of six studies were screened and found eligible for inclusion. [97, 98] Subsequently, the quality of these studies was rated as poor.
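
The exception for studies published after the closing date can be expressed as a simple check. This is a minimal sketch under stated assumptions: the function and parameter names are hypothetical, the closing date is the December 31, 2009 date given above, and the check covers only the design and size conditions of the exception, not the full I/E criteria that a citation must also meet.

```python
from datetime import date
from typing import Sequence

SEARCH_CLOSING_DATE = date(2009, 12, 31)  # closing date stated in the text
MIN_PARTICIPANTS_PER_ARM = 100            # minimum size of each study arm


def eligible_as_late_exception(
    published: date,
    is_rct: bool,
    arm_sizes: Sequence[int],
    identified_by_experts: bool,
) -> bool:
    """Return True if a post-closing-date study satisfies the exception conditions."""
    if published <= SEARCH_CLOSING_DATE:
        # Not a late publication; the ordinary search rules apply instead.
        return False
    return (
        is_rct
        and identified_by_experts
        and len(arm_sizes) > 0
        # Every arm, not just the total sample, must reach the minimum size.
        and all(n >= MIN_PARTICIPANTS_PER_ARM for n in arm_sizes)
    )
```

For example, a trial published in 2010 with arms of 150 and 99 participants would not qualify under this sketch, because one arm falls below 100.
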

The PRISMA diagram for CQ3 shown in Figure B-3 outlines the flow of information from the literature search through the various steps used in the systematic review process.

Figure 15.

PRISMA Diagram Showing Selection of Articles for CQ3

Key: Details for each exclusion rationale are determined by the inclusion and exclusion criteria for the question, reproduced below. The inclusion and exclusion criteria are also available in Section 7.2.

Two reviewers independently screened the titles and abstracts of 1,422 publications against the I/E criteria, resulting in 984 publications being excluded and 438 publications being retrieved for full-text review to further assess eligibility. Next, two reviewers independently screened 438 full-text publications and assessed eligibility by applying the I/E criteria; 361 of these publications were excluded based on one or more of the I/E criteria (see specified rationale as noted in the PRISMA diagram). Furthermore, the CQ3 work group noted that because the focus of the CQ is solely on the effect of different dietary approaches to weight loss, other possible interventions could not differ between treatment arms. Therefore, studies were excluded if treatment arms differed in their behavioral approach (i.e., the amount of participant contact and amount or method of prescribed physical activity).

Seventy-seven of the 438 full-text publications met the criteria and were included. The quality (internal validity) of these 77 publications was assessed using the quality assessment tool developed to assess RCTs (see Appendix A). Of these, 54 publications were excluded because they were rated as poor quality; 52 of these studies were rated poor because of shortcomings related to ITT analysis and attrition rates. Rationales for all poor-quality ratings are included in Appendix B. The remaining 17 RCTs (23 articles) were rated good or fair quality and included in the evidence base that was used to formulate the evidence statements. Panel members reviewed the final studies on the “include” list along with their quality ratings and had the opportunity to raise questions. Some trials previously deemed to be of fair or good quality were downgraded to poor quality upon closer review of evidence tables. These trials used completers analyses rather than ITT analysis and had overall attrition rates exceeding 10 percent. If a study reported only an analysis of completers and had attrition of <10 percent, it was allowed in the evidence base. Methodologists worked with the systematic review team to reevaluate these trials and make a final decision. Evidence tables and summary tables consisted only of data from the original publications of eligible RCTs; these tables formed the basis for panel deliberations.
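
The completers-analysis rule described above can be sketched as a small predicate. The function name and parameters are hypothetical. Two points are assumptions rather than statements from the report: trials that used ITT analysis are treated as unaffected by this particular rule, and an attrition rate of exactly 10 percent is treated as not downgraded, because the text specifies only "exceeding 10 percent" and "<10 percent".

```python
def downgraded_for_attrition(used_itt_analysis: bool, attrition_percent: float) -> bool:
    """Return True if the completers/attrition rule would downgrade a trial to poor."""
    if used_itt_analysis:
        # Assumption: the rule targets completers-only analyses; ITT trials are
        # left to the other items of the quality assessment tool.
        return False
    # Assumption: exactly 10 percent is not downgraded ("exceeding 10 percent").
    return attrition_percent > 10.0


# A completers-only trial with 25 percent attrition is downgraded,
# while one with 8 percent attrition stays in the evidence base.
print(downgraded_for_attrition(False, 25.0))  # True
print(downgraded_for_attrition(False, 8.0))   # False
```
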

v Critical Question 4: Search Strategy

CQ4 has two parts:

  1. Among overweight and obese adults, what is the efficacy/effectiveness of a comprehensive lifestyle intervention program (i.e., comprised of diet, physical activity, and behavior therapy) in facilitating weight loss or maintenance of lost weight?
  2. What characteristics of delivering comprehensive lifestyle interventions (e.g., frequency and duration of treatment, individual vs. group sessions, onsite vs. phone/e-mail contact) are associated with greater weight loss and weight loss maintenance?

a. Study Type Query

Study types eligible for CQ4:

  1. For efficacy/effectiveness: RCTs, systematic reviews. Sufficient information must have been presented about the intervention to replicate the study.
  2. For adverse effects: RCTs, controlled clinical trials, systematic reviews, cohort studies with a contemporaneous comparison group, case-control studies, large observational studies.
  3. Post-hoc analyses of large RCTs if analyses of randomized comparisons are included.
  4. Exclusions: Case series, case reports, before-after studies, unpublished literature, unpublished industry-sponsored trials, other unpublished data, FDA Medical and Statistical reviews, theses, studies published only as abstracts, letters, commentaries and opinion pieces, nonsystematic reviews.
  • {RCT} OR {Systematic Review} OR
  • (subject=(“Epidemiologic Studies” or “Cross Sectional Studies” or “Cohort Studies” or “Longitudinal Studies” or “Follow Up Studies” or “Prospective Studies” or “Case Control Studies” or “Cross-Over Studies” or “Retrospective Studies” or “Seroepidemiologic Studies” or “HIV Seroprevalence”) OR
  • (subject=(“Controlled Clinical Trials as Topic” or “Randomized Controlled Trials as Topic”) and abstract=?) OR
  • genre=(“Controlled Clinical Trial” OR “Validation Studies” OR “Multicenter Study” OR “Evaluation Studies”) OR
  • observational stud? or epidemiologic stud? or cross sectional stud? or cohort stud? or longitudinal stud? or follow up stud? or prospective stud? or case control stud? or cross-over stud? or retrospective stud? or title, subject=random? OR
  • (((subject=(Obesity or Overweight)) with (qualifier=(epidemiology or etiology or mortality or ethnology))) not genre=review)) AND language=eng?)
  • NOT genre, title, subject=(case reports or case study or case series or before-after)
  • NOT (title=(case report or commentary) OR genre=(letter or abstract or newspaper article or comment?))

b. Boolean Search

  • (
  • (publicationYear>1997 AND publicationYear<2010 AND language=eng? and abstract=?)
  • AND (subject, title, abstract=(“weight loss” or “weight reduction” or “weight maintenance”) or (weight %5 reduc?))
  • AND (subject, title, abstract=(Overweight or Obesity or Obesity Morbid or Prader Willi Syndrome) or (“weight loss” %2 maintenance) or obese or ((“body mass index” or BMI or BMIs)!13 (2? or 3? or 4?)))
  • AND (subject, title, abstract=(“Body Weight Changes” or “Weight Gain” or “Weight Loss” or “Emaciation” or “Cachexia”) or (weight %2 change?) or “baseline weight” or subject, title, abstract=(“Body Mass Index” or “Waist Circumference” or “Waist-Hip Ratio” or “Body Fat Distribution” or “Adiposity”) or “percent body fat” or “Percent reduction of excess weight” or BMI or BMIs or WC or WCs or kg)
  • AND (subject, title, abstract=(“Life Style” OR “Self care” or “Life Change Events” OR “Risk Reduction Behavior” OR “Behavior Therapy” OR “Aversive Therapy” OR “Biofeedback Psychology” OR “Desensitization Psychologic” OR “Implosive Therapy” OR “Relaxation Therapy” OR “Meditation” OR “Cognitive Therapy” OR “Sleep Phase Chronotherapy” OR “Diet” OR “Fasting” OR “Energy Intake” OR “Caloric Restriction” OR meal replacement? or “Diet Therapy” or “Exercise” OR “Motor Activity” or “physical activity” OR “Freezing Reaction Cataleptic” OR “Immobility Response Tonic” OR “Running” OR “Jogging” OR “Swimming” OR “Walking” OR Resistance Training OR “self-monitoring” OR “self-regulation” OR “Diet Records” OR “activity records” OR lifestyle) or ((subject=(Obesity or Overweight)) with (qualifier=therapy))))
  • NOT (Subject, title=(“Complementary Therapies” or Acupressure or Electroacupuncture or Meridians or Moxibustion or Anthroposophy or Auriculotherapy or Holistic Health or Homeopathy or “Medicine, Traditional” or “Mind-Body Therapies” or Aromatherapy or Biofeedback or “Breathing Exercises” or Hypnosis or “Imagery (Psychotherapy)” or “Laughter Therapy” or Meditation or “Mental Healing” or “Mind-Body Relations (Metaphysics)” or Psychophysiology or “Relaxation Therapy” or “Tai Ji” or “Therapeutic Touch” or Yoga or “Musculoskeletal Manipulations” or Massage or “Myofunctional Therapy” or Naturopathy or Organotherapy or “Tissue Therapy” or Phytotherapy or Aromatherapy or “Eclecticism, Historical” or Reflexotherapy or Rejuvenation or “Sensory Art Therapies” or “Acoustic Stimulation” or “Art Therapy” or “Color Therapy” or “Dance Therapy” or “Music Therapy” or “Play Therapy” or Psychodrama or Speleotherapy or “Spiritual Therapies” or “Faith Healing” or Magic or “Medicine, African Traditional” or Meditation or “Mental Healing” or Occultism or Radiesthesia or Shamanism or Witchcraft or Yoga))
  • NOT ((subject=Obesity) with (qualifier=Surgery))
  • NOT (subject=Drug Therapy or ((subject=“Weight Loss”) with (qualifier=“drug therapy”)))
  • NOT (majorSubject=Agents)
  • NOT ((subject=(Agents) or qualifier=(surgery or drug therapy or therapeutic use or administration or pharmaco?)) not (subject=(Diet or Behavior or Exercise or Physical or Life Style or Counseling or Cognitive or Combined Modality Therapy) or qualifier=“diet therapy”))
  • NOT (majorSubject=(Alcohol Drinking or Practice Guidelines or Bone))
  • NOT majorSubject=(“Dietary Supplements”)
  • NOT majorSubject=(“Digestive System Surgical Procedures” or “Bariatric Surgery” or “Gastric Bypass” or “Gastric Balloon” or Laparoscopy or Gastroplasty or Coronary Artery Bypass or Gastrectomy or “Biliopancreatic Diversion”)
  • NOT (((subject=(“Digestive System Surgical Procedures” or “Bariatric Surgery” or “Gastric Bypass” or “Gastric Balloon” or Laparoscopy or Gastroplasty or Coronary Artery Bypass or Gastrectomy or Biliopancreatic Diversion)) with (qualifier=(instrumentation or methods or adverse effects or economics or standards or statistics))))
  • NOT subject=(“Postoperative Complications” or Reoperation or “Postoperative Period” or “Length of Stay” or “Reconstructive Surgical Procedures” or “Equipment and Supplies” or “Preoperative Care” or “Postoperative Care” or “Prenatal Care” or “Weight Gain and Pregnancy” or “Pregnancy Complications”)
  • NOT subject=(“Equipment Design” or “Advertising as Topic”)
  • NOT subject=(Heel or Foot diseases or Cosmetic techniques or Hair Removal or Hirsutism)
  • NOT subject=(“Africa” OR “Africa Northern” OR “Algeria” OR “Egypt” OR “Libya” OR “Morocco” OR “Tunisia” OR “Africa South of the Sahara” OR “Africa Central” OR “Cameroon” OR “Central African Republic” OR “Chad” OR “Congo” OR “Gabon” OR “Democratic Republic of the Congo” OR “Equatorial Guinea” OR “Africa Eastern” OR “Burundi” OR “Ethiopia” OR “Kenya” OR “Rwanda” OR “Somalia” OR “Sudan” OR “Tanzania” OR “Uganda” OR “Djibouti” OR “Eritrea” OR “Africa Southern” OR “Angola” OR “Botswana” OR “Lesotho” OR “Malawi” OR “Mozambique” OR “Namibia” OR “South Africa” OR “Swaziland” OR “Zambia” OR “Zimbabwe” OR “Africa Western” OR “Benin” OR “Burkina Faso” OR “Gambia” OR “Ghana” OR “Guinea Bissau” OR “Cote d Ivoire” OR “Liberia” OR “Mali” OR “Mauritania” OR “Niger” OR “Nigeria” OR “Senegal” OR “Sierra Leone” OR “Togo” OR “Guinea” OR “Cape Verde” OR “Americas” OR “Central America” OR “Belize” OR “Costa Rica” OR “El Salvador” OR “Guatemala” OR “Honduras” OR “Nicaragua” OR “Panama” OR “Panama Canal Zone” OR “Latin America” OR “South America” OR “Argentina” OR “Bolivia” OR “Brazil” OR “Chile” OR “Colombia” OR “Ecuador” OR “French Guiana” OR “Guyana” OR “Paraguay” OR “Peru” OR “Suriname” OR “Uruguay” OR “Venezuela” OR “Caribbean Region” OR “West Indies” OR “Barbuda and Antigua” OR “Bahamas” OR “Barbados” OR “Cuba” OR “Dominican Republic” OR “Haiti” OR “Jamaica” OR “Martinique” OR “Netherlands Antilles” OR “Puerto Rico” OR “Trinidad and Tobago” OR “Virgin Islands of the United States” OR “Dominica” OR “Grenada” OR “Guadeloupe” OR “Saint Lucia” OR “Saint Vincent and the Grenadines” OR “Saint Kitts and Nevis” OR “Antarctic Regions” OR “Arctic Regions” OR “Asia” OR “Asia Central” OR “Kazakhstan” OR “Kyrgyzstan” OR “Tajikistan” OR “Turkmenistan” OR “Uzbekistan” OR “Asia Southeastern” OR “Borneo” OR “Brunei” OR “Myanmar” OR “Cambodia” OR “Indonesia” OR “Laos” OR “Malaysia” OR “Mekong Valley” OR “Philippines” OR “Singapore” OR “Thailand” OR “Vietnam” OR “East Timor” OR “Asia Western” OR 
“Bangladesh” OR “Bhutan” OR “India” OR “Sikkim” OR “Middle East” OR “Afghanistan” OR “Bahrain” OR “Iran” OR “Iraq” OR “Jordan” OR “Kuwait” OR “Lebanon” OR “Oman” OR “Qatar” OR “Saudi Arabia” OR “Syria” OR “Turkey” OR “United Arab Emirates” OR “Yemen” OR “Nepal” OR “Pakistan” OR “Sri Lanka” OR “Far East” OR “China” OR “Hong Kong” OR “Tibet” OR “Japan” OR “Tokyo” OR “Korea” OR “Macau” OR “Mongolia” OR “Taiwan” OR “Atlantic Islands” OR “Azores” OR “Bermuda” OR “Falkland Islands”)
  • NOT (subject=(Animals or Venoms))
  • NOT subject, title=(Anorexia Nervosa or Bulimia or Binge-Eating Disorder or Coprophagia or Female Athlete Triad Syndrome or Pica or Somatoform Disorders or Body Dysmorphic Disorders or Conversion Disorder or Hypochondriasis or Neurasthenia or Antipsychotic Agents or Genetic Predisposition to Disease or Epilepsy or HIV or Child or pediatric or Thinness or Acupuncture or Enteral Nutrition or Enteral tube feeding)
  • NOT subject, title=(Weight Lifting or Accidental Falls or Weight-Bearing or Femur Neck or Lumbar Vertebrae or Pelvic Bones)
  • NOT subject, title=(“Genetic Predisposition to Disease” or Breast feeding or Electric Impedance or Contraception or Contraceptives or “Transportation of Patients” or Sick Leave or Absenteeism)

c. Boolean Filter

Table B-6. Criteria for Selection of Publications for CQ3

The Boolean filter in the CQ4 search strategy implements the intervention criterion to reflect comprehensive lifestyle intervention (two or more of the following components: diet, physical activity, or behavior therapy).

  • lifestyle intervention? or (long-term %2 (maintenance or weight or effects)) or extended therapy program? or weight reducing program? or weight management or (comprehensive %3 (program? or lifestyle))
  • OR subject, title, abstract, qualifier=((diet or Energy Intake or Caloric Restriction or dietary or Fasting)
    • AND (behavio? or cognitive or psychotherapy or problem solving or relapse prevention or psychology or life style or counseling or Aversive Therapy or Biofeedback Psychology or Desensitization Psychologic or Implosive Therapy or Relaxation Therapy or Meditation or Cognitive Therapy or Sleep Phase Chronotherapy))
  • OR subject, title, abstract, qualifier=((diet or Energy Intake or Caloric Restriction or dietary or Fasting)
    • AND (physical activity or exercise or fitness or rehabilitation or life style or weight loss education or Motor Activity or Running or Jogging or Swimming or Walking or Resistance Training))
  • OR subject, title, abstract, qualifier=((behavio? or cognitive or psychotherapy or problem solving or relapse prevention or psychology or life style or counseling or Aversive Therapy or Biofeedback Psychology or Desensitization Psychologic or Implosive Therapy or Relaxation Therapy or Meditation or Cognitive Therapy or Sleep Phase Chronotherapy)
    • AND (physical activity or exercise or fitness or rehabilitation or life style or weight loss education or Motor Activity or Running or Jogging or Swimming or Walking or Resistance Training))
  • OR subject, title, abstract=(Confidence Interval? or Area Under Curve) or AUC
  • OR subject=(Combined Modality Therapy)
  • OR genre=(Comparative Study or Meta-Analysis)
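The two-or-more-components criterion behind this filter can be illustrated with a minimal Python sketch. The keyword sets are abbreviated from the term lists above, and the function names and plain substring matching are assumptions made for illustration; the actual search ran in the repository's own query language, with truncation and proximity operators and the additional OR clauses shown above.

```python
# Abbreviated keyword sets drawn from the filter terms above (illustrative only).
COMPONENT_TERMS = {
    "diet": ("diet", "energy intake", "caloric restriction", "fasting"),
    "physical_activity": ("physical activity", "exercise", "fitness", "walking"),
    "behavior_therapy": ("behavio", "cognitive", "counseling", "relapse prevention"),
}


def components_mentioned(text):
    """Return the set of intervention components whose terms appear in the text."""
    lowered = text.lower()
    return {
        name
        for name, terms in COMPONENT_TERMS.items()
        if any(term in lowered for term in terms)
    }


def is_comprehensive(text):
    """True when at least two of the three components are mentioned."""
    return len(components_mentioned(text)) >= 2
```

A citation that mentions only one component (for example, exercise alone) would not pass this part of the filter in the sketch.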
    • d. Critical Question 4: Search Strategy Results and PRISMA Diagram

The following databases were searched for RCTs and systematic reviews and meta-analyses of RCTs or controlled clinical trials to answer CQ4:

  • PubMed from January 1998 to December 2009
  • CINAHL from January 1998 to July 2008
  • EMBASE from January 1998 to July 2008
  • PsycInfo from January 1998 to July 2008
  • EBM Cochrane Libraries from January 1998 to July 2008
  • Biological Abstracts from January 2004 to July 2008
  • Wilson Social Sciences Abstracts from January 1998 to July 2008

The literature search for CQ4 included an electronic search of the Central Repository for RCTs or controlled clinical trials published in the literature from January 1998 to December 2009. The Central Repository contains citations pulled from seven literature databases: PubMed, CINAHL, EMBASE, PsycINFO, EBM, Biological Abstracts, and Wilson Social Sciences Abstracts. The search produced 2,145 citations, with 15 additional citations identified from non-search sources (i.e., by the panel members or by hand search of systematic reviews and meta-analyses obtained through the electronic search). The systematic reviews and meta-analyses were only used for manual searches and were not part of the final evidence base. This manual cross-check was done to ensure that major studies were not missing from the evidence base. Eleven of the 15 citations identified from non-search sources were published after December 31, 2009. Per NHLBI policy, certain lifestyle and obesity intervention studies published after the closing date could be allowed as exceptions. Such studies had to be RCTs in which each study arm contained at least 100 participants and had to be identified by experts knowledgeable of the literature. Ten of the 11 citations published after December 2009 met the criteria and were eligible for inclusion in the CQ4 evidence base. [23, 201-209] In contrast, 1 of the 11 citations did not meet the criteria and was excluded from the CQ4 evidence base. [210] The remaining four citations, identified through non-search sources, were published before 2009. Of these four, one citation had no abstract, two citations had no indication in the abstract or MeSH terms that they were related to overweight or obese populations, and one citation had no indication in the abstract or MeSH terms that the publication was related to comprehensive lifestyle interventions.
Of the 15 citations identified through non-search sources, 14 were screened and found eligible for inclusion; subsequently, two of these studies were rated as poor quality studies.
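The exception policy for studies published after the closing date can be expressed as a small check. The following Python sketch is illustrative only: the `Citation` record, its field names, and the function name are hypothetical, and only the closing date and the 100-participants-per-arm threshold come from the text.

```python
from dataclasses import dataclass
from datetime import date

SEARCH_CLOSING_DATE = date(2009, 12, 31)  # closing date stated in the text
MIN_PARTICIPANTS_PER_ARM = 100            # threshold stated in the text


@dataclass
class Citation:
    # Hypothetical record; field names are assumptions for illustration.
    published: date
    is_rct: bool
    arm_sizes: list          # number of participants in each study arm
    expert_identified: bool  # identified by experts knowledgeable of the literature


def qualifies_for_exception(c):
    """Apply the exception rule to a citation published after the closing date."""
    if c.published <= SEARCH_CLOSING_DATE:
        return False  # not a late publication, so the exception does not apply
    return (
        c.is_rct
        and c.expert_identified
        and len(c.arm_sizes) > 0
        and all(n >= MIN_PARTICIPANTS_PER_ARM for n in c.arm_sizes)
    )
```

Under this sketch, a late RCT with one arm of 90 participants would not qualify, because every arm must reach the threshold.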

The PRISMA diagram for CQ4 shown in Figure B-4 outlines the flow of information from the literature search through the various steps used in the systematic review process.

Figure 16.

PRISMA Diagram Showing Selection of Articles for CQ4

Key: Details for each exclusion rationale are determined by the I/E criteria for the question, reproduced below. The I/E criteria are also available in Section 8a.

Two reviewers independently screened the titles and abstracts of 2,160 publications against the I/E criteria, resulting in 1,776 publications being excluded and 384 publications being retrieved for full-text review to further assess eligibility. Next, two reviewers independently screened the 384 full-text publications, assessing eligibility by applying the I/E criteria; 215 of these publications were excluded based on one or more of the I/E criteria (see specified rationale as noted in the PRISMA diagram).

One hundred and forty-six of the 384 full-text publications met the criteria and were included. The quality (internal validity) of these 146 publications was assessed using the quality assessment tool developed to assess RCTs (see Appendix A). Of these, 74 publications were excluded because they were rated as poor quality; of them, 43 studies were rated poor due to the ITT and attrition rates. Rationales for the poor quality studies are included in Appendix B. The remaining 51 trials (72 articles) were rated good or fair quality and included in the evidence base that was used to formulate the evidence statements. Panel members reviewed the final studies on the include list, along with their quality ratings, and had the opportunity to raise questions. Some trials previously deemed to be of fair or good quality were downgraded to poor quality upon closer review of evidence tables. These trials used completers analyses rather than ITT analysis and had overall attrition rates exceeding 10 percent. If the study reported only an analysis of completers and had attrition at <10 percent, it was allowed in the evidence base. Methodologists worked with the systematic review team to reevaluate these trials and make a final decision. Evidence tables and summary tables consisted only of data from the original publications of eligible RCTs; these tables formed the basis for panel deliberations.
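The downgrade rule for completers-only analyses can be sketched as a single predicate. This Python fragment is a minimal illustration, not the panel's quality assessment tool: the function name is hypothetical, it covers only the ITT/attrition rule described above, and the handling of attrition at exactly 10 percent (treated here as a downgrade) is an assumption, since the text specifies only "exceeding 10 percent" and "<10 percent".

```python
ATTRITION_THRESHOLD = 0.10  # 10 percent, as stated in the text


def passes_itt_attrition_check(uses_itt, attrition_rate):
    """Return False when a completers-only study would be downgraded to poor.

    Covers only the ITT/attrition rule; other quality criteria are out of scope.
    """
    if uses_itt:
        return True  # the rule concerns completers-only analyses
    # Completers-only analysis: allowed only when attrition is below 10 percent.
    return attrition_rate < ATTRITION_THRESHOLD
```

For example, a completers-only trial with 8 percent attrition stays in the evidence base under this sketch, while one with 15 percent attrition does not.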

vi Critical Question 5: Search Strategy

CQ5 has three parts:

  • a. Efficacy: What are the long-term effects of the following surgical procedures on weight loss, weight loss maintenance, CVD risk factors, related comorbidities, and mortality?

    • LAGB
    • Laparoscopic RYGB
    • Open RYGB
    • Biliopancreatic bypass with or without duodenal switch
    • Sleeve gastrectomy (SG)
    What are the long-term effects of the surgical procedures (listed above) in patients with different BMIs and comorbidities?

    • BMI <35
    • BMI of 35 to 40 with no comorbidities
    • BMI ≥35 with comorbidities, and
    • BMI ≥40 with no comorbidities
  • b. Predictors: What are the predictors associated with long-term effects of the following surgical procedures on weight loss, weight loss maintenance, CVD risk factors, related comorbidities, and mortality?

    • LAGB
    • Laparoscopic RYGB
    • Open RYGB
    • Biliopancreatic diversion (BPD) with or without duodenal switch
    • SG

    What are the predictors associated with long-term effects of the surgical procedures (listed above) in patients with different BMIs and comorbidities?

    • BMI <35
    • BMI of 35 to 40 with no comorbidities
    • BMI ≥35 with comorbidities, and
    • BMI ≥40 with no comorbidities.
  • c. Complications: What are the short-term (less than 30 days) and long-term (30 days or more) complications of the following bariatric surgical procedures? What are the predictors associated with complications?
    • LAGB
    • Laparoscopic RYGB
    • Open RYGB
    • BPD with or without duodenal switch
    • SG
    What are the complications of the surgical procedures (listed above) in patients with different BMIs and comorbidities?
    • BMI <35
    • BMI of 35 to 40 with no comorbidities
    • BMI ≥35 with comorbidities, and
    • BMI ≥40 with no comorbidities.
  • a. Study Type Query
    • {RCT} OR {Systematic Review} OR
    • (subject=(“Case Control Studies” or “Retrospective Studies” or “Cohort Studies” or “Longitudinal Studies” or “Follow Up Studies” or “Prospective Studies”) or (genre, subject=“Controlled Clinical Trial?” and qualifier=“adverse effects”) or case control or longitudinal or prospective? or retrospective? or cohort? or (before %10 after))
    • NOT (title=case report OR genre=letter OR genre=newspaper article OR genre=comment OR genre=“case reports” OR genre=“case study”)
  • b. Boolean Search

(

publicationYear>1997 AND publicationYear<2010 and language=eng?

AND (((subject=(“Overweight” or “Obesity” or “Obesity Morbid”)) with (qualifier=surgery))

  • OR ((bariatric %3 (surger? or procedure? or operation?)) or subject=(Gastroplasty or Laparoscopy) or (subject=“Anastomosis Roux-en-Y” or subject, abstract, title=“Gastric Bypass” or Gastroileal Bypass or Gastrojejunostom? or subject, abstract, title=((Biliopancreatic or Bilio-Pancreatic) %2 (Diversion? or Bypass?)) or “laparoscopic adjustable gastric band?” or “gastric band” or “gastric banding” or ((subject=Duodenum) with (qualifier=surgery)) or “duodenal switch” or “gastric sleeve” or “sleeve gastrectomy” or “Laparoscopic Roux-en-Y gastric bypass” or “Open Roux-en-Y gastric bypass” or “Biliopancreatic bypass” or Roux-en-Y)))
  • AND (subject, title, abstract, qualifier=(mortality or death?) or subject=“Hospital Mortality” or subject, title, abstract=(“Body Weight” or subject, title, abstract=“Body Mass Index” or “Waist Circumference” or “Weight Gain” or “Weight Loss” or “Waist-Hip Ratio” or “Body Fat Distribution” or “Skinfold Thickness” or Adiposity) or BMI or abstract, title, qualifier=“adverse effects” or subject=(“Postoperative Complications” or “Postgastrectomy Syndromes” or “Dumping Syndrome” or “Postoperative Hemorrhage” or “Postoperative Nausea and Vomiting” or “Surgical Wound”))
  • )
  • NOT (recordStatus=delete)
  • NOT subject=(“Africa” OR “Africa Northern” OR “Algeria” > 157 more terms.)
  • NOT subject=(“Advertising as Topic”)
  • NOT (subject=(Animals or Venoms))
  • NOT subject=((child or adolescent) not (adult or aged))
  • NOT subject=(“Child Nutrition” or “Child Behavior” or “Child, Preschool” or “Child Development” or “Infant Food”)
  • NOT subject=(Heel or Foot diseases or Cosmetic techniques or Hair Removal or Hirsutism)

    • c. Boolean Filter

Table B-7. Criteria for Selection of Publications for CQ4
image
image
Table B-8. Criteria for Selection of Publications for CQ5
image
image
image

The Boolean filter in the CQ5 search strategy implements the intervention criterion to reflect exactly the five requested procedures (i.e., LAGB, Laparoscopic RYGB, Open RYGB, Biliopancreatic bypass/duodenal switch, and SG).

(

  • “Laparoscopic adjustable gastric banding” or “lap-band” or subject, title, abstract=(Laparoscop? and (Gastroplast? or gastric) and band?)
  • or “Laparoscopic Roux-en-Y gastric bypass” or (subject, title, abstract=“Gastric Bypass” and subject, title, abstract=Laparoscop?)
  • or “Open Roux-en-Y gastric bypass” or (subject, title, abstract=“Gastric Bypass” and subject, title, abstract=“Roux-en-Y”) or Gastroileal Bypass or Gastrojejunostom?
  • or ((Biliopancreatic or Bilio-Pancreatic) %2 (Diversion? or Bypass?)) or “duodenal switch” or subject, title, abstract=“Biliopancreatic Diversion” or ((subject=Duodenum) with (qualifier=surgery))
  • or “Gastric sleeve” or “sleeve gastrectomy”
  • or subject, title, abstract=((“Bariatric Surgery” or “Gastric Bypass” or “gastric banding” or “gastric surgery” or Gastrectomy) and ((Weight or BMI) %3 (loss or gain or reduc?)))
  • or genre, title=Meta-analysis
  • )

    • d. Critical Question 5: Search Strategy Results and PRISMA Diagram

The following databases were searched for RCTs, observational studies, and systematic reviews and meta-analyses of RCTs, controlled clinical trials, or observational studies to answer CQ5:

  • PubMed from January 1998 to December 2009
  • CINAHL from January 1998 to July 2008
  • EMBASE from January 1998 to July 2008
  • PsycInfo from January 1998 to July 2008
  • Evidence-based Medicine Cochrane Libraries from January 1998 to July 2008
  • Biological Abstracts from January 2004 to July 2008
  • Wilson Social Sciences Abstracts from January 1998 to July 2008
Table B-9. CQ1 Studies Rated Fair or Good
image
image

The literature search for CQ5 included an electronic search of the Central Repository for RCTs, controlled clinical trials, and observational studies published in the literature from January 1998 to December 2009. The Central Repository contains citations pulled from seven literature databases: PubMed, CINAHL, EMBASE, PsycINFO, EBM, Biological Abstracts, and Wilson Social Sciences Abstracts. The search produced 2,317 citations, with 9 additional citations identified from non-search sources (i.e., by the panel members or by hand search of systematic reviews and meta-analyses obtained through the electronic search). The systematic reviews and meta-analyses were only used for manual searches and were not part of the final evidence base. This manual cross-check was done to ensure that major studies were not missing from the evidence base. A similar manual cross-check of citations from the American Society for Metabolic & Bariatric Surgery (ASMBS) position statement on SG was performed in May 2012. Eight of the nine citations identified from non-search sources were published after December 31, 2009. Per NHLBI policy, certain lifestyle and obesity intervention studies published after the closing date could be allowed as exceptions. Such studies had to be RCTs in which each study arm contained at least 100 participants and had to be identified by experts knowledgeable of the literature. Three of the eight citations published after December 2009 met the criteria and were eligible for inclusion in the CQ5 evidence base. [345-347] In contrast, five of the eight citations did not meet the criteria and were excluded from the CQ5 evidence base. [348-352] The remaining citation, identified through non-search sources, was published before 2009. [353] This citation met the criteria and was eligible for inclusion. Thus, of the nine citations identified through non-search sources, four were screened and found eligible for inclusion; subsequently, all of these studies were rated as good quality.

The PRISMA diagram for CQ5 shown in Figure B-5 outlines the flow of information from the literature search through the various steps used in the systematic review process.

Figure 17.

PRISMA Diagram Showing Selection of Articles for CQ5

Key: Details for each exclusion rationale are determined by the I/E criteria for the question, reproduced below. The I/E criteria are also available in Section 9a.

Table B-10. CQ1 Studies Rated as Poor With Rationale
image
image
Table B-11. CQ2 Studies Rated Fair or Good
image

A natural language processing filter was used to identify studies with sample sizes less than 100, 100 to 299, and/or a followup time of less than 6 months. The natural language processing filter was executed against titles and abstracts. Of the 2,317 citations identified through the database search, 811 citations were automatically excluded using the natural language processing filter. Two reviewers independently screened the titles and abstracts of the 1,515 remaining citations against the I/E criteria for each of the three components (Efficacy, Predictors, and Complications). This resulted in 1,062 publications being excluded (on one or more of the I/E criteria for each of the three components of CQ5) and 453 publications being retrieved for full-text review to further assess eligibility.
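The screening counts in this paragraph can be reconciled with a short arithmetic check. The Python sketch below uses only numbers reported in the text; the variable names are illustrative, and the assumption that the 9 non-search citations are counted among the 1,515 screened records is inferred from the arithmetic rather than stated explicitly.

```python
database_citations = 2317    # citations from the electronic search
non_search_citations = 9     # added by panel members or hand search (assumed to be screened too)
nlp_excluded = 811           # removed automatically by the natural language processing filter
screening_excluded = 1062    # excluded at title/abstract screening

# Records that went to title/abstract screening.
screened = database_citations + non_search_citations - nlp_excluded

# Records retrieved for full-text review.
full_text = screened - screening_excluded

print(screened)   # 1515
print(full_text)  # 453
```

Both results match the counts reported for the title/abstract and full-text stages.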

Sixty-four of the 453 full-text publications met the criteria and were included. The quality (internal validity) of these 64 publications was assessed using the six quality assessment tools that were developed (see Appendix A). Of these, 29 publications were excluded because they were rated as poor quality; of these, 18 studies were rated poor due to the ITT and/or attrition rates. Rationales for the poor quality studies are included in Appendix B. The remaining 22 trials (35 articles) that met the criteria for at least one of the three components were rated good or fair quality and included in the evidence base. These articles were used to formulate the evidence statements. For the Efficacy, Predictors, and Complications components, there were 17, 12, and 15 citations rated as good or fair, respectively. A total of eight citations were used across more than one component. [346, 383, 384, 386, 390, 399, 406, 407] Of the 16 citations included for the Efficacy component, 4 were RCTs and 12 were observational studies. Of the 12 citations included for the Predictors component, 6 were RCTs and 6 were observational studies. Of the 15 citations included for the Complications component, 4 were RCTs and 11 were observational studies.

Panel members reviewed the final studies on the “include” list, along with their quality ratings and had the opportunity to raise questions. Some trials previously deemed to be of fair or good quality were downgraded to poor quality upon closer review of evidence tables. These trials used completers analyses rather than ITT analysis and had overall attrition rates exceeding 10 percent. If the study reported only an analysis of completers and had attrition at <10 percent, it was allowed in the evidence base. Methodologists worked with the systematic review team to reevaluate these trials and make a final decision. Evidence tables and summary tables consisted only of data from the original publications of eligible RCTs and observational studies; these tables formed the basis for panel deliberations.

vii Critical Questions and Quality Ratings of Studies

For each CQ, this section includes a table that lists studies rated as fair or good and a table listing studies rated as poor.

a Critical Question 1

Among overweight and obese adults, does achievement of reduction in body weight with lifestyle and pharmacological interventions affect CVD risk factors, CVD events, morbidity, and mortality?

  1. Does this effect vary across population subgroups defined by the following demographic and clinical characteristics:
    • Age
    • Sex
    • Race/ethnicity
    • Baseline BMI
    • Baseline WC
    • Presence or absence of comorbid conditions
    • Presence or absence of CVD risk factors
  2. What amount (shown as percent lost, pounds lost, etc.) of weight loss is necessary to achieve benefit with respect to CVD risk factors, morbidity, and mortality?
    • Are there benefits on CVD risk factors, CVD events, morbidity, and mortality from weight loss?
    • What are the benefits of more significant weight loss?
  3. What is the effect of sustained weight loss for 2 or more years in individuals who are overweight or obese, on CVD risk factors, CVD events, and health and psychological outcomes?
    • What percent of weight loss needs to be maintained at 2 or more years to be associated with health benefits?

Tables B-9 and B-10 show studies rated fair or good and studies rated poor, respectively. The studies include systematic reviews and meta-analyses and the Look AHEAD study.

b Critical Question 2
  1. Are the current cutpoint values for overweight (BMI 25.0 to 29.9 kg/m²) and obesity (BMI ≥30 kg/m²) compared with BMI 18.5 to 24.9 kg/m² associated with elevated CVD risk (defined below)? Are the WC cutpoints of >102 cm (male) and >88 cm (female) associated with elevated CVD risk (defined below)? How do these cutpoints compare with other cutpoints in terms of elevated CVD risk?
    • Fatal and nonfatal CHD, stroke, and CVD
    • Overall mortality
    • Incident type 2 diabetes mellitus
    • Incident dyslipidemia
    • Incident hypertension
  2. Are differences across population subgroups in the relationships of BMI and WC cutpoints with CVD sufficiently large to warrant different cutpoints? If so, what should they be?
    • Fatal and nonfatal CHD, stroke, and CVD
    • Overall mortality
    • Incident type 2 diabetes mellitus
    • Incident dyslipidemia
    • Incident hypertension
    Groups being considered include:
    • Age
    • Sex (both male and female)
    • Race/ethnicity (African American, Hispanic, Native American, Asian, Caucasian)
  3. What are the associations between maintaining weight and weight gain with elevated CVD risk in normal weight, overweight, and obese adults?
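The BMI and WC cutpoints examined in CQ2 can be written out as a small classifier. This Python sketch is for illustration only: the function names and category labels are hypothetical, and treating the published ranges as contiguous (for example, assigning a BMI of 24.95 to the reference range) is an assumption, since the text gives the ranges to one decimal place.

```python
def bmi_category(bmi):
    """Classify BMI (kg/m^2) using the cutpoints named in CQ2."""
    if bmi < 18.5:
        return "below reference range"
    if bmi < 25.0:
        return "reference (18.5 to 24.9)"
    if bmi < 30.0:
        return "overweight (25.0 to 29.9)"
    return "obese (30 or greater)"


# WC cutpoints in centimeters, as stated in CQ2.
WC_CUTPOINTS_CM = {"male": 102.0, "female": 88.0}


def wc_above_cutpoint(wc_cm, sex):
    """True when waist circumference exceeds the sex-specific cutpoint."""
    return wc_cm > WC_CUTPOINTS_CM[sex]
```

For instance, `bmi_category(27.3)` falls in the overweight range, and a waist circumference of exactly 88 cm in a female is not above the cutpoint because the comparison is strict.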
Table B-12. CQ2 Studies Rated as Poor With Rationale
image
image

Tables B-11 and B-12 show systematic reviews and meta-analyses rated fair or good and those rated poor, respectively.

CQ2 initially involved studies and systematic reviews and meta-analyses. Due to resource constraints, the final evidence review involved systematic reviews and meta-analyses only.

c Critical Question 3
  1. In overweight or obese adults, what is the comparative efficacy/effectiveness of diets of differing forms and structures (macronutrient content, carbohydrate and fat quality, nutrient density, amount of energy deficit, dietary pattern) or other dietary weight loss strategies (e.g., meal timing, portion controlled meal replacements) in achieving or maintaining weight loss?
  2. During weight loss or weight maintenance after weight loss, what are the comparative health benefits or harms of the above diets and other dietary weight loss strategies?
Table B-13. CQ3 Studies Rated Fair or Good
image
image
Table B-14. CQ3 Studies Rated Poor With Rationale
image
image
image
image
image
image
image

Tables B-13 and B-14 show studies rated fair or good and studies rated poor, respectively.

d Critical Question 4
  1. Among overweight and obese adults, what is the efficacy/effectiveness of a comprehensive lifestyle intervention program (i.e., comprised of diet, physical activity, and behavior therapy) in facilitating weight loss or maintenance of lost weight?
  2. What characteristics of delivering comprehensive lifestyle interventions (e.g., frequency and duration of treatment, individual vs. group sessions, onsite vs. phone/e-mail contact) are associated with greater weight loss and weight loss maintenance?
Table B-15. CQ4 Studies Rated Fair or Good
image
image
image
image
image
Table B-16. CQ4 Studies Rated Poor With Rationale
image
image
image
image
image
image
image
image
Table B-17. CQ5 Studies Rated Fair or Good
image
image
Table B-18. CQ5 Studies Rated Poor With Rationale
image
image
image

Tables B-15 and B-16 show studies rated fair or good and studies rated poor, respectively.

e Critical Question 5
  1. Efficacy: What are the long-term effects of the following surgical procedures on weight loss, weight loss maintenance, CV risk factors, related comorbidities, and mortality?
    • LAGB
    • Laparoscopic RYGB
    • Open RYGB
    • Biliopancreatic bypass with or without duodenal switch
    • SG
    What are the long-term effects of the surgical procedures (listed above) in patients with different BMIs and comorbidities?
    • BMI <35
    • BMI of 35 to 40 with no comorbidities
    • BMI ≥35 with comorbidities, and
    • BMI ≥40 with no comorbidities
  2. Predictors: What are the predictors associated with long-term effects of the following surgical procedures on weight loss, weight loss maintenance, CV risk factors, related comorbidities, and mortality?
    • LAGB
    • Laparoscopic RYGB
    • Open RYGB
    • BPD with or without duodenal switch
    • SG
    What are the predictors associated with long-term effects of the surgical procedures (listed above) in patients with different BMIs and comorbidities?
    • BMI <35
    • BMI of 35 to 40 with no comorbidities
    • BMI ≥35 with comorbidities, and
    • BMI ≥40 with no comorbidities.
  3. Complications: What are the short-term (less than 30 days) and long-term (30 days or more) complications of the following bariatric surgical procedures? What are the predictors associated with complications?
    • LAGB
    • Laparoscopic RYGB
    • Open RYGB
    • BPD with or without duodenal switch
    • SG
    What are the complications of the surgical procedures (listed above) in patients with different BMIs and comorbidities?
    • BMI <35
    • BMI of 35 to 40 with no comorbidities
    • BMI ≥35 with comorbidities, and
    • BMI ≥40 with no comorbidities.

Tables B-17 and B-18 show studies rated fair or good and studies rated poor, respectively.

Appendix C: Spreadsheets and Summary Tables

Critical Question 1

Table Spreadsheet 1.1. Effect of Weight Loss From Lifestyle Interventions in Patients With Diabetes on Blood Glucose and HbA1c
image
Table Spreadsheet 1.2. Effect of Weight Loss From Lifestyle Interventions in Persons at Risk for Developing Diabetes (Prediabetes) on Risk for Converting to Type 2 Diabetes
image
Table Spreadsheet 1.3. Effect of Intentional Weight Loss From Lifestyle Interventions in Patients With or Without Diabetes on Mortality
image
Table Spreadsheet 1.4a. Effect of Weight Loss From Orlistat Interventions in Patients With Diabetes on Blood Glucose and HbA1c
image
Table Spreadsheet 1.4b. Effect of Weight Loss From Orlistat Interventions in Patients at Risk for Diabetes on Blood Glucose and HbA1c
image
Table Spreadsheet 1.5a. Effect of Weight Loss on Serum Lipids: Lifestyle Interventions
image
image
image
image
Table Spreadsheet 1.5b. Effect of Weight Loss on Serum Lipids: Orlistat Interventions
image
image
Table Spreadsheet 1.5c. Effect of Weight Loss on Serum Lipids: Diabetes Subjects
image
image
Table Spreadsheet 1.6a. Weight Loss and Hypertension Risk: Orlistat Trials, All Subjects
image
image
Table Spreadsheet 1.6b. Weight Loss and Hypertension Risk: Orlistat Trials, Diabetes Subjects
image
Table Spreadsheet 1.6c. Weight Loss and Hypertension Risk: Hypertension Subjects
image
Table Spreadsheet 1.6d. Weight Loss and Hypertension Risk: Lifestyles Trials, All Subjects
image
Table Spreadsheet 1.6e. Weight Loss and Hypertension Risk: Lifestyles Trials, Diabetes Subjects
image
image
Table Spreadsheet 1.6f. Weight Loss and Hypertension Risk: Lifestyles Trials, Hypertension Subjects
image

Critical Question 2

Table Spreadsheet 2.1. Study Descriptives
image
image
image
image

Critical Question 3

Table Summary Table 3.1. Overall Dietary Intervention and Composition
image
image
image
image
image
image
image
image
image
image
image
image
image
image
Table Summary Table 3.2. Low-Fat Approaches
image
image
image
image
image
image
image
image
Table Summary Table 3.3. Higher (25–30% of Energy) Protein Approaches
image
image
image
image
image
image
image
image
image
image
Table Summary Table 3.4. Low Carbohydrate Approaches (<30 g/day for at least a period)
image
image
image
Table Summary Table 3.5. Complex Versus Simple Carbohydrates
image
Table Summary Table 3.6. Glycemic Load Dietary Approaches
image
image
image
Table Summary Table 3.7. CQ3—Dietary Patterns (Mediterranean Style and Vegetarian and Other Dietary Pattern Approaches)
image
image
image
image
image
image
image
Table Summary Table 3.8. Meal Replacements and Adding Foods to Liquid Diets
image
image
image
Table Summary Table 3.9. Very Low-Calorie-Diet (VLCD) Approaches
image
image
image

Critical Question 4

Table Summary Table 4.1. Diet, Physical Activity, and Behavior Therapy Components in High-Intensity,* Onsite Lifestyle Interventions
*A high-intensity intervention is defined as providing 14 or more intervention sessions in the first 6 months.
 
Table Summary Table 4.1a. Weight Loss Trials Compared With Usual Care, Minimal Care, or No Care Control Interventions: Outcome Data at 6 Months or Less as First Time Period Reporting
image
image
image
image
Table Summary Table 4.1b. Weight Loss Trials Compared With Usual Care, Minimal Care, or No Care Control Interventions: Outcome Data at Greater Than 6 Months as First Time Period Reporting
image
image
image
Table Summary Table 4.1c. Weight Loss Trials Compared With Usual Care, Minimal Care, or No Care Control Interventions: Outcome Data at Greater Than 12 Months as First Time Period Reporting
image
image
Table Summary Table 4.2. Evidence for the Comprehensive Interventions Compared With Usual Care, Minimal Care, or No-Treatment Control
*A high-intensity intervention is defined as providing 14 or more intervention sessions in the first 6 months.
 
Table Summary Table 4.2a. Weight Loss Trials Compared With Usual Care, Minimal Care, or No Care Control Interventions: Outcome Data at 6 Months or Less as First Time Period Reporting
image
image
image
image
Table Summary Table 4.2b. Weight Loss Trials Compared With Usual Care, Minimal Care, or No Care Control Interventions: Outcome Data at Greater Than 6 Months as First Time Period Reporting
image
image
image
Table Summary Table 4.2c. Weight Loss Trials Compared With Usual Care, Minimal Care, or No Care Control Interventions: Outcome Data at Greater Than 12 Months as First Time Period Reporting
image
image
Table Summary Table 4.2d. Comprehensive Interventions Compared to Other Comprehensive Intervention That Varied the Physical Activity or Behavior Therapy Component: Outcome Data at 6 Months or Less as the First Time Period Reported
image
image
image
Table Summary Table 4.3. Efficacy/Effectiveness of Electronically Delivered, Comprehensive Interventions in Achieving Weight Loss
 
Table Summary Table 4.3a. Compared With Usual Care, Minimal Control, or No Intervention (Includes Self-Directed): Electronic Text Messaging—Outcome Data at 6 Months or Less as First Time Period Reporting
image
Table Summary Table 4.3b. Compared With Usual Care, Minimal Control, or No Intervention (Includes Self-Directed): Electronic Internet—Outcome Data at 6 Months or Less as First Time Period Reporting
image
image
Table Summary Table 4.3c. Compared With Usual Care, Minimal Control, or No Intervention (Includes Self-Directed): Electronic Internet—Outcome Data at Greater Than 6 Months as First Time Period Reporting
image
Table Summary Table 4.3d. Compared With Usual Care, Minimal Control, or No Intervention (Includes Self-Directed): Electronic Internet—Outcome Data at Greater Than 12 Months as First Time Period Reporting
image
Table Summary Table 4.3e. Compared with Other Comprehensive Electronic Interventions: Electronic Internet—Outcome Data at 6 Months or Less as First Time Period Reporting
image
image
Table Summary Table 4.3f. Compared with Other Comprehensive Electronic Interventions: Electronic Internet—Outcome Data at Greater Than 6 Months as First Time Period Reporting
image
Table Summary Table 4.3g. Compared with Other Comprehensive Electronic Interventions: Electronic Internet—Outcome Data at Greater Than 12 Months as First Time Period Reporting
image
Table Summary Table 4.3h. Compared With Other Comprehensive Interventions (Includes Onsite and/or Electronic): Outcome Data at 6 Months or Less as First Time Period Reporting
image
image
Table Summary Table 4.3i. Compared With Other Comprehensive Interventions (Includes Onsite and/or Electronic): Outcome Data at Greater Than 6 Months as First Time Period Reporting
image
Table Summary Table 4.3j. Compared With Other Comprehensive Interventions (Includes Onsite and/or Electronic): Outcome Data at Greater Than 12 Months as First Time Period Reporting
image
Table Summary Table 4.3k. Evidence of Weight Loss in Comprehensive Electronic (Interactive Equipment) Interventions: Compared With Other Comprehensive Intervention (Internet or Onsite)—Weight Loss Outcome Data at 4 Months or Greater
image
Table Summary Table 4.3l. Evidence of Weight Maintenance in Comprehensive Electronic (Interactive Technology with Phone Feedback) Interventions: Comprehensive Electronic Intervention Compared With Personal Contact or Self-Directed Control—Weight Loss Outcome Data at 6 Months or Greater
image
image
Table Summary Table 4.3m. Evidence of Weight Maintenance in Comprehensive Electronic (Internet) Interventions: Comprehensive Electronic Intervention Compared With Personal Contact or Self-Directed Control—Weight Loss Outcome Data at 6 Months or Greater
image
Table Summary Table 4.3n. Evidence of Weight Maintenance in Comprehensive Electronic (Internet) Interventions: Comprehensive Electronic Intervention Compared With Personal Contact or Self-Directed Control—Weight Loss Outcome Data at 12 Months or Greater
image
Table Summary Table 4.4. Efficacy/Effectiveness of Comprehensive, Telephone-Delivered Lifestyle Interventions for Achieving Weight Loss
*A high-intensity intervention is defined as providing 14 or more intervention sessions in the first 6 months.
 
Table Summary Table 4.4a. Trials That Compared Onsite vs. Telephone Delivered Programs for Inducing Weight Loss: Outcome Data at 6 Months Or Less as First Time Period Reporting
image
image
Table Summary Table 4.4b. Trials That Compared Onsite vs. Telephone Delivered Programs for Inducing Weight Loss: Outcome Data at 12 Months or Greater as First Time Period Reporting
image
image
Table Summary Table 4.5. Efficacy/Effectiveness of Comprehensive Weight Loss Programs in Patients Within a Primary Care Practice Compared With Usual Care
Trials are organized by weight loss vs. weight maintenance, then by first outcome time period reported, then by greatest weight loss (with completers analysis data, or data not presented as kg or %, being listed last).
 
Table Summary Table 4.5a. Weight Loss Trials: Outcome Data at 6 Months or Less as First Time Period Reporting
image
image
Table Summary Table 4.5b. Weight Loss Trials: Outcome Data at Greater Than 6 Months as First Time Period Reporting
image
image
Table Summary Table 4.5c. Weight Loss Trials: Outcome Data at Greater Than 12 Months as First Time Period Reporting
image
Table Summary Table 4.5d. Weight Maintenance Trials: Outcome Data at 6 Months or Less as First Time Period Reporting
image
Table Summary Table 4.5e. Weight Maintenance Trials: Outcome Data at Greater Than 6 Months as First Time Period Reporting
image
Table Summary Table 4.5f. Weight Maintenance Trials: Outcome Data at Greater Than 12 Months as First Time Period Reporting
image
Table Summary Table 4.6. Efficacy/Effectiveness of Commercial-Based, Comprehensive Lifestyle Interventions in Achieving Weight Loss
 
Table Summary Table 4.6a. Compared With Usual Care, Minimal Control, or No Intervention
  1. NOTE: Deibert 2004 was removed because it is a commercially available supplement, not a commercial program per se; Willaing 2004 was not included because commercial information material was provided, not a commercial program per se; Womble 2004 and Gold 2007 are presented in the Electronic Summary Table.

image
image
image
Table Summary Table 4.7. Efficacy/Effectiveness of Very Low-Calorie Diets, Used as Part of a Comprehensive Lifestyle Intervention, in Achieving Weight Loss
 
Table Summary Table 4.7a. Evidence From Weight Loss Trials: Weight Loss Outcome Data at 6 Months or Greater
image
image
Table Summary Table 4.7b. Evidence From Weight Maintenance Trials: Weight Maintenance Outcome Data at 6 Months or Greater
image
image
Table Summary Table 4.7c. Evidence From Weight Maintenance Trials: Weight Maintenance Outcome Data at 9 Months or Greater
image
Table Summary Table 4.8. Efficacy/Effectiveness of Comprehensive Lifestyle Interventions in Maintaining Lost Weight
Trials are organized by onsite vs. electronic programs as the primary intervention, then by first outcome time period reported, then by greatest weight loss (with completers analysis data, or data not presented as kg or %, being listed last).
 
Table Summary Table 4.8a. Onsite Interventions: Outcome Data at 6 Months or Less as the First Time Period Reported
image
image
image
Table Summary Table 4.8b. Onsite Interventions: Outcome Data at 6 Months or Greater as First Time Period Reported
image
image
Table Summary Table 4.8c. Onsite Interventions: Outcome Data at Greater Than 12 Months as the First Time Period Reported
image
Table Summary Table 4.8d. Electronic Interventions: Outcome Data at 6 Months or Less as First Time Period Reporting
image
Table Summary Table 4.8e. Electronic Interventions: Outcome Data at Greater Than 12 Months as First Time Period Reporting
image
image
Table Summary Table 4.8f. Comprehensive Weight Loss or Weight Loss Maintenance Trials That Reported Percentage of Participants Who Achieved a Loss ≥5% of Initial Weight at ≥2 Years Post-Randomization
image
image
image
Table Summary Table 4.9. Characteristics of Lifestyle Intervention Delivery That May Affect Weight Loss: Intervention Intensity
*Moderate intensity is defined as providing 6-13 intervention sessions in the first 6 months; low intensity is defined as providing 1-5 intervention sessions in 6 months.
 
Table Summary Table 4.9a. Moderate Intensity Interventions: Outcome Data at 6 Months or Less as First Time Period Reporting
image
image
image
Table Summary Table 4.9b. Moderate Intensity Interventions: Outcome Data at Greater Than 6 Months as First Time Period Reporting
image
image
Table Summary Table 4.9c. Moderate Intensity Interventions: Outcome Data at Greater Than 12 Months as First Time Period Reporting
image
image
image
Table Summary Table 4.9d. Low-Intensity Interventions: Outcome Data at 6 Months or Less as First Time Period Reporting
image
Table Summary Table 4.9e. Low-Intensity Interventions: Outcome Data at Greater Than 6 Months as First Time Period Reporting
image
image
Table Summary Table 4.9f. Low-Intensity Interventions: Outcome Data at Greater Than 12 Months as First Time Period Reporting
image
Table Summary Table 4.10. Characteristics of Lifestyle Intervention Delivery That May Affect Weight Loss or Weight Maintenance: Onsite vs. Electronically Delivered Interventions
 
Table Summary Table 4.10a. Randomized Comparison of High-Intensity Onsite vs. Electronically Delivered Interventions
image
image
Table Summary Table 4.10b. High-Intensity, Onsite Comprehensive Interventions: Weight Loss Outcome Data at 6 Months or Less
image
image
image
image
image
Table Summary Table 4.10c. High-Intensity, Onsite Comprehensive Interventions: Weight Loss Outcome Data at Greater Than 6 Months to 12 Months
image
image
image
Table Summary Table 4.10d. High-Intensity, Onsite Comprehensive Interventions: Weight Loss Outcome Data at Greater Than 12 Months
image
image
Table Summary Table 4.10e. Comprehensive Electronically Delivered Interventions: Evidence of Weight Loss in Comprehensive Electronic (Telephone) Interventions—Compared With Usual Care, Minimal Control, or No Intervention (Includes Self-Directed): Weight Loss Outcome Data at Greater Than 12 Months
image
Table Summary Table 4.10f. Comprehensive Electronically Delivered Interventions: Evidence of Weight Loss in Comprehensive Electronic (Internet) Interventions—Compared With Usual Care, Minimal Control, or No Intervention (Includes Self-Directed): Weight Loss Outcome Data at 6 Months or Greater
image
Table Summary Table 4.10g. Comprehensive Electronically Delivered Interventions: Evidence of Weight Loss in Comprehensive Electronic (Internet) Interventions—Compared With Comprehensive Electronic Interventions: Weight Loss Outcome Data at 6 Months or Greater
image
image
Table Summary Table 4.10h. Comprehensive Electronically Delivered Interventions: Evidence of Weight Loss in Comprehensive Electronic (Internet) Interventions—Compared With Comprehensive Electronic Interventions: Weight Loss Outcome Data at 12 Months or Greater
image
Table Summary Table 4.10i. Comprehensive Electronically Delivered Interventions: Evidence of Weight Loss in Comprehensive Electronic (Internet) Interventions—Compared With Other Comprehensive Interventions (Includes Onsite and/or Electronic): Weight Loss Outcome Data at 4 Months or Greater
image
image
Table Summary Table 4.10j. Comprehensive Electronically Delivered Interventions: Evidence of Weight Loss in Comprehensive Electronic (Interactive Equipment) Interventions—Compared With Other Comprehensive Intervention (Internet or Onsite): Weight Loss Outcome Data at 4 Months or Greater
image

Critical Question 5

Table Summary Table 5.1. Component 1: Efficacy of Weight Loss Surgery
image
image
image
image
image
image
image
image
image
image
image
image
image
Table Summary Table 5.2. Component 2: Predictors Associated With Long-Term Effects—Patient Characteristics
image
image
Table Summary Table 5.3. Component 2: Predictors Associated With Long-Term Effects—Types of Surgery
image
image
image
image
image
image
image
image
image
image
image
image
Table Summary Table 5.4. Component 3: Complications Associated With Weight Loss Surgery
image
image
image
image