Clinical research in implant dentistry: study design, reporting and outcome measurements: consensus report of Working Group 2 of the VIII European Workshop on Periodontology


  • Conflict of interest and source of funding statement

  • The authors declare that they have no conflicts of interest. This workshop was financially supported by the European Federation of Periodontology and by unrestricted grants from Astra, Nobel Biocare and Straumann.


M. Tonetti

European Research Group on Periodontology




The objective of this working group was to assess and make specific recommendations to improve the quality of reporting of clinical research in implant dentistry and discuss ways to reach a consensus on choice of outcomes.

Material and Methods

Discussions were informed by three systematic reviews on quality of reporting of observational studies (case series, case-control and cohort) and experimental research (randomized clinical trials). An additional systematic review provided information on choice of outcomes and analytical methods. In addition, an open survey among all workshop participants was utilized to capture a consensus view on the limits of currently used survival and success-based outcomes as well as to identify domains that need to be captured by future outcome systems.


The Workshop attempted to clarify the characteristics and the value in dental implant research of different study designs. In most areas, measurable quality improvements over time were identified. The Workshop recognized important aspects that require continued attention by clinical researchers, funding agencies and peer reviewers to decrease potential bias. With regard to choice of outcomes, the limitations of currently used systems were recognized. Three broad outcome domains that need to be captured by future research were identified: (i) patient reported outcome measures, (ii) peri-implant tissue health and (iii) performance of implant supported restorations. Peri-implant tissue health can be measured by marginal bone level changes and soft tissue inflammation and can be incorporated in time to event analyses.


The Workshop recommended that collaboration between clinicians and epidemiologists/clinical trials specialists should be encouraged. Aspects of design aimed at limitation of potential bias should receive attention by clinical researchers, funding agencies and journal editors. Adherence to appropriate reporting guidelines such as STROBE and CONSORT is a necessary standard. Research on outcome measure domains is an area of top priority and should urgently inform a proper process leading to a consensus on outcome measures in dental implant research.

  • Working Group 2:
  • Maurizio Tonetti,
  • Richard Palmer,
  • Zvi Artzi,
  • Francesco Cairo,
  • Mauro Donati,
  • William Giannobile,
  • Eli Machtei,
  • Phoebus Madianos,
  • Henny Meijer,
  • Ian Needleman,
  • Friedrich W. Neukam,
  • David Nisand,
  • Marc Quirynen,
  • Isabella Rocchietta,
  • Ignacio Sanz,
  • Leonardo Trombelli,
  • Yu-Kang Tu,
  • Industry representation:
  • Frederick Ceder,
  • Michael Hotze,
  • Alexandra S. Rieben.

The remit of this working group was to critically discuss quality of reporting, study design and choice of outcomes in clinical research in implant dentistry and identify areas of priorities for improvement in design and reporting of future research. It is recognized that data discussed in the workshop are based on quality of reporting (and not on quality of research) and risk of bias of published research. Such data, however, may be used to assist in improving both design and reporting of future studies.

Three systematic reviews evaluated the quality of reporting of case series (Meijer & Raghoebar 2012), case-control and cohort studies (Rocchietta & Nisand 2012) and randomized clinical trials (Cairo et al. 2012) in implant dentistry. A fourth review systematically assessed the choice of outcomes, the analytical methods and reference groups in clinical implant dentistry research (Needleman et al. 2012).

Role of Experimental and Observational Research Designs in Current Implant Dentistry

Considerable confusion exists in implant dentistry with regard to study design and the research questions that can be best answered by each design.

Identification of the reported research design in publications is sometimes a challenge. Grimes & Schulz (2002) have recognized this in medicine and have provided a useful classification of the most frequently used study designs (Fig. 1). They reported that “Clinical research falls into two general categories: experimental and observational, based on whether the investigator assigns the exposures or not. Experimental trials can also be subdivided into two: randomised and non-randomised. Observational studies can be either analytical or descriptive. Analytical studies feature a comparison (control) group, whereas descriptive studies do not. Within analytical studies, cohort studies track people forward in time from exposure to outcome. By contrast, case-control studies work in reverse, tracing back from outcome to exposure. Cross-sectional studies are like a snapshot, which measures both exposure and outcome at one time point. Descriptive studies, such as case-series reports, do not have a comparison group. Thus, in this type of study, investigators cannot examine associations, a fact often forgotten or ignored.”

Figure 1.

Algorithm for classification of various types of clinical research (from Grimes & Schulz 2002).
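
The classification quoted above can be written as a short decision function. The following is a minimal sketch that follows only the branching described in the quotation; the function name, parameter names and returned labels are illustrative assumptions and are not taken from Grimes & Schulz (2002).

```python
def classify_study_design(investigator_assigns_exposure,
                          random_allocation=False,
                          has_comparison_group=False,
                          direction=None):
    """Return a study-design label following the branching in the quoted text.

    direction is only used for analytical observational studies and must be
    one of: "exposure_to_outcome", "outcome_to_exposure", "same_time".
    All names here are hypothetical and chosen for illustration.
    """
    # Experimental research: the investigator assigns the exposure.
    if investigator_assigns_exposure:
        return "randomised trial" if random_allocation else "non-randomised trial"

    # Observational research without a comparison group is descriptive.
    if not has_comparison_group:
        return "descriptive study (e.g. case series)"

    # Analytical observational research, split by direction of inquiry.
    analytical_designs = {
        "exposure_to_outcome": "cohort study",
        "outcome_to_exposure": "case-control study",
        "same_time": "cross-sectional study",
    }
    if direction not in analytical_designs:
        raise ValueError("direction must be one of: " + ", ".join(analytical_designs))
    return analytical_designs[direction]


# A single treated group followed over time, with no comparison group,
# is classified as descriptive, however it is labelled in its title.
single_group_label = classify_study_design(
    investigator_assigns_exposure=False, has_comparison_group=False
)
```

Under this sketch, a single-group follow-up study is returned as a descriptive study regardless of whether it is prospective or retrospective, because the direction of inquiry is only consulted once a comparison group exists.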

Early ground-breaking clinical studies in implant dentistry were large descriptive studies with variable length of follow-up. They established the positive clinical outcomes of modern osseointegrated implants.

The need for comparative research was soon established to answer some relevant questions. In recent years, there has been an increase in the publications reporting results from RCTs and this has provided much stronger evidence for these treatments. Although a high quality RCT is rightly considered the gold standard, it may not be able to answer all the important questions and therefore there is much to be gained from observational studies, i.e. where there is no experimental assignment of the intervention or exposure. Observational studies are particularly useful in dealing with rare or late events and may relate well to clinical practice and allow generalizability to large patient groups. In addition, large and properly designed analytical observational studies have the ability to robustly identify and measure risk factor associations on implant related outcomes. This is something that RCTs are unlikely to achieve due to practical reasons. However, it should also be recognized that observational studies are always at greater risk of bias and the effects of confounding than well designed RCTs.

The descriptive study (case series) is still the most frequently reported design in this field; its key characteristic is that it consists of a single group of subjects. Case series may be longitudinal (prospective or retrospective) or cross-sectional in design.

Frequently, different study designs are employed sequentially in the generation of clinically relevant evidence for a specific treatment modality or risk factor. Case series are often the starting point: they document the performance of a new procedure, device or biological and provide key data for the proper planning of the necessary experimental research. The lack of a comparison group severely limits the conclusions beyond the proof of principle and hypothesis generation stage. The need to answer comparative questions in terms of efficacy requires the use of experimental research, and randomized controlled clinical trials in particular. These are also frequently performed in sequence (so called phase I–IV studies or pre- and post-market trials). In parallel to experimental research – and frequently based on hypotheses that emerge during the conduct of previous trials (RCT) or clinical observations (case series) – properly designed observational research provides important dimensions to the clinical application of the treatment under investigation. These include knowledge of local (e.g. bone or soft tissue quantity, quality and position, plaque control) and patient based (e.g. tobacco exposure, medications, presence and control of co-morbidities like periodontitis or diabetes) characteristics that may modify prognosis and treatment decisions.

In implant dentistry, the need to focus on RCTs has been discussed in previous workshops (2005 and 2008) but, in general, less attention has been paid to non-experimental designs.

Clinical Performance/Descriptive Research in Implant Dentistry (Meijer & Raghoebar 2012)

Critical aspects in design and potential sources of bias in case series research

There is confusion of terminology in observational studies; e.g. case series are often called cohort studies. The majority of studies in implant dentistry deal with the description of a collection of patients in which a certain treatment has been carried out. The patients are followed for a designated amount of time and a number of items are collected at certain evaluation points. These studies, either prospective or retrospective, are sometimes referred to as “cohort studies”, but they should be more appropriately called case series. These are descriptive studies.

However, it is important to know that results of case series have served and can serve as a basis for hypothesis-generation and future research. In addition, case series can still be informative for objectives, such as the initial evaluation of a new technique or device.

Although the risk of selection bias is higher in case series, efforts should be made to address and limit potential sources of bias. For example, authors should define and describe the study setting, location, period and method of recruitment, follow-up time, inclusion and exclusion criteria, and source of participant selection. Issues related to funding should also be addressed.

Assessment of research reporting of descriptive studies

To assess quality of reporting of descriptive studies, the STROBE Statement (2008) should be used. In implant dentistry it should be noted that:

  • During the last 20 years the number of descriptive studies has increased and the quality of reporting has improved. In 2010, 70% of STROBE items were properly addressed, as opposed to 46% in 1990
  • Improper reporting primarily includes items in title & abstract, methods and source of funding

Priority areas for improvement of study design and reporting

Items that need to be improved related to study design of case series include:

  • Proper definition of exposures and potential confounders
  • Proper definition of outcomes
  • Patient selection criteria and characteristics

Items that need to be improved related to study reporting include:

  • A description of any efforts to address potential sources of bias
  • An explanation of how the study size was determined
  • The indication of the study's design with a commonly used term in the title or abstract
  • The reporting of the length of follow-up
  • Reporting numbers of subjects and the reasons for loss to follow-up at each stage of the study
  • Reporting patient characteristics (demographic, clinical & social)
  • Reporting on exposures and potential confounders
  • Reporting of source of funding
  • Reporting on ethical committee approval

It is recommended that authors and journals embrace the STROBE Statement to improve quality of reporting of descriptive studies.

Prognostic & Risk Factor Research in Implant Dentistry (Rocchietta & Nisand 2012)

Critical aspects of design and potential sources of bias

Although data derived from descriptive studies (case series) are useful in hypothesis generation, this research design is not suitable to provide evidence for identifying exposures associated with dental implant outcomes. Data derived from specific analytical designs are necessary to provide robust measures of association and to ascertain causality. Although it is recognized that the evidence generation process does not rely on one type of study, two observational study designs are particularly useful to expand knowledge on risk factors in implant dentistry: the case-control study and the cohort study. Each has its advantages and limitations, but for their results to be valid, they require specific steps to be taken to control potential bias.

Case-control studies

Case-control studies are useful for scenarios where disease incidence is rare or the time period between the exposure and manifestation of disease is long.

The critical issues in study design for case control studies are:

  • Selection of the control group. Every effort should be made to match the cases with a suitable control by means of individual/paired matching or group matching (using the appropriate statistics). Although it is recognized that the cases and the controls will never be perfectly identical, some aspects of matching seem to be critical for the validity in implant dentistry. As an example, cases with implant loss should be matched with controls with a similar length of exposure – time of loading. Selection criteria for both cases and controls should be clearly reported. Other confounding variables that cannot be matched should be adjusted for (as far as possible), using appropriate statistical methods.
  • Minimize bias in the ascertainment of exposure level (also known as recall bias, i.e. inaccurate reporting of a past exposure) and of clinical outcomes
  • Minimize the participation bias (i.e. people who agree to participate in the study may be different from those who refuse to in terms of their exposure status)
  • Sample size calculation should be undertaken a priori to ensure that the study has adequate statistical power to detect a clinically relevant risk and to correctly interpret negative results
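
The matching of cases and controls on length of exposure can be illustrated with a small sketch. It assumes greedy 1:1 nearest-neighbour matching on months of loading within a fixed caliper; the greedy strategy, the 6-month caliper, the identifiers and the data are all hypothetical choices made for illustration and are not a method prescribed by the Workshop.

```python
def match_controls(cases, controls, caliper_months=6.0):
    """Greedy 1:1 matching of cases to controls on months of loading.

    cases and controls map a subject id to months in function.
    A control is eligible only if its loading time is within
    caliper_months of the case; each control is used at most once.
    Returns a dict mapping case id -> matched control id.
    """
    available = dict(controls)  # copy so the caller's dict is not modified
    pairs = {}
    # Process cases in order of loading time to make the result deterministic.
    for case_id, case_months in sorted(cases.items(), key=lambda item: item[1]):
        best = None  # (control id, gap in months)
        for control_id, control_months in available.items():
            gap = abs(control_months - case_months)
            if gap <= caliper_months and (best is None or gap < best[1]):
                best = (control_id, gap)
        if best is not None:
            pairs[case_id] = best[0]
            del available[best[0]]
    return pairs


# Hypothetical data: cases are patients with implant loss,
# controls are patients whose implants are still in function.
cases = {"case_1": 24, "case_2": 60}
controls = {"ctrl_1": 26, "ctrl_2": 58, "ctrl_3": 120}
matched_pairs = match_controls(cases, controls)
```

Cases left without an eligible control are simply absent from the result, which makes the number of unmatched cases easy to report. Confounders that cannot be matched in this way would still need to be adjusted for in the analysis.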

Cohort studies

The main feature of a cohort study is observation of large numbers of subjects over a long period (commonly years), with comparison of incidence rates in groups that differ in exposure levels. The alternative terms for a cohort study are follow-up, longitudinal and prospective studies. A cohort is a representative sample of a defined population (e.g. partially edentulous patients receiving dental implants).

The critical issues in study design for cohort studies are:

  • To minimize the bias in the selection of the cohort population (participation/selection bias, i.e. people who agree to participate are different from those who refuse to in terms of the propensity for developing the disease). Description of the recruitment criteria should be clearly reported
  • Appropriate statistical methods should be used to address the potential imbalance in confounding variables
  • Sample size calculation should be undertaken to ensure that the study has enough statistical power to detect a clinically relevant increase in risk and to correctly interpret negative results
  • Accuracy of measurement of the exposure and incidence of the disease outcome will determine the required sample size and the period of observation time
  • Stability of measurement of the outcome or exposure over time (which can be compromised by, e.g., changes in examiners or in the instruments used throughout the study)
  • The issue of subjects lost to follow-up must be addressed as follows: (i) the percentage should be reported; (ii) the cause of each dropout should be highlighted; (iii) the treatment of missing values for dropout patients should be reported and, if missing values are imputed, the method of imputation should be stated; (iv) the subgroup that dropped out should be assessed (baseline and longitudinal data should be compared with those of the non-dropouts to explore biases)
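
An a priori sample size calculation of the kind called for above can be sketched for the simplest cohort comparison: two groups (exposed and unexposed) of equal size and a binary outcome such as implant loss. The sketch uses the standard normal-approximation formula for comparing two proportions; the function name, the default two-sided alpha of 0.05, the power of 0.80 and the example incidences are assumptions for illustration, and the result ignores clustering of implants within patients and loss to follow-up.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_two_proportions(p_unexposed, p_exposed, alpha=0.05, power=0.80):
    """Patients needed per group to detect p_exposed vs p_unexposed.

    Normal-approximation formula for two independent proportions,
    two-sided test, equal group sizes.
    """
    if p_unexposed == p_exposed:
        raise ValueError("the two proportions must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value of the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    p_bar = (p_unexposed + p_exposed) / 2          # pooled proportion under H0
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p_unexposed * (1 - p_unexposed) + p_exposed * (1 - p_exposed))
    ) ** 2
    return ceil(numerator / (p_exposed - p_unexposed) ** 2)


# Hypothetical incidences of implant loss: 5% in unexposed, 10% in exposed patients.
patients_per_group = sample_size_two_proportions(0.05, 0.10)
```

Because the required number grows with the inverse square of the difference between the two incidences, halving the difference to be detected roughly quadruples the number of patients needed, which is one reason rare outcomes demand large cohorts or long observation periods.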

Assessment of current research and research reporting

Of 104 identified papers dealing with diabetes/smoking/periodontitis as risk indicators for implant loss, three cohort studies and no case-control studies were identified (Rocchietta & Nisand 2012). Most of the papers found were case series, either retrospective or prospective, even in cases where the terminology “case control” or “cohort” was mentioned in the title and abstract. The vast majority of the papers explored the associations between outcomes and exposures/attributes as secondary analyses.

Two major problems were associated with the assessment of exposure: (i) inconsistency in the way of reporting and (ii) lack of reliable measurement of the exposure (i.e. use of patient reported data instead of objective biochemical markers).

Priority areas for improvement of study design and reporting

There is an urgent need for well designed, performed, analysed, interpreted and reported case-control and cohort studies. These have the potential to provide more robust new knowledge that has profound patient care implications. The associations/effects of exposures or attributes such as cigarette smoking, diabetes, previous periodontitis, bone quality and bone quantity are examples of questions that may be hard to study using only randomized controlled clinical trials.

Among other issues clinical researchers in implant dentistry may want to consider:

  • Collaboration with experts to design, analyse and interpret studies is highly recommended
  • An improved matching of the controls to minimize the differences in the distribution of confounding factors between cases and controls
  • Improving statistical analyses to account for confounding variables
  • Study design should include prior sample size calculation
  • Dropouts should be recorded in the results and the statistical analyses should account for them
  • A well-defined measurement of exposure and assessment of outcome should be adopted
  • Intra- and inter-examiner calibration and reproducibility

Comparative Research, Randomized Clinical Trials in Implant Dentistry (Cairo et al. 2012)

Critical aspects of design and potential sources of bias in implant dentistry

The majority of RCTs were conducted in single centres (83%), whereas the remaining (17%) were multicentre. The parallel design, where each participant is randomly assigned to a group and all participants in that group receive the same type of intervention, was the most frequent RCT type (65%), followed by split-mouth design (24%), crossover design (4%) and factorial (1%).

In the position paper (Cairo et al. 2012), bivariate analyses showed that parallel or split-mouth studies were associated with a lower probability of identifying a statistically significant difference between groups for the primary outcome variable (p < 0.0002) compared with the other types of study design. This finding suggests that randomized parallel and split-mouth designs were the most rigorous designs in implant dentistry. It is reasonable to hypothesize that it is easier and less time consuming to select patients with a single defect (i.e., missing a single tooth) than patients with bilateral homologous defects to plan a split-mouth study. Furthermore, although the increased efficiency of a split-mouth design reduces the required sample size (number of patients), the smaller sample size reduces the external validity and generalizability of the results.

Most RCTs used a superiority design (88%), aimed at demonstrating that one treatment is more effective than another. Very few studies (7%) used a non-inferiority design, aimed at showing that the effect of the treatment is at least as good as that of the control (Blackwelder 2004). The remaining 5% used an equivalence design, a trial constructed to demonstrate that an experimental treatment is similar to a control treatment (Jones et al. 1996).

A low number of RCTs declared operator (21%) and institution (2%) experience in experimental procedures. This information is important, especially for trials dealing with surgical procedures where a learning curve is generally necessary to obtain good clinical outcomes. Initial training cases (i.e. using case series studies) could be considered for testing new technologies/new surgical approaches. Furthermore, expertise-based RCTs could be considered for specific areas of implant dentistry (Devereaux et al. 2005). There were multiple sources of potential bias that appeared to impact studies in implant dentistry.

Correct allocation concealment was described in only 22% of studies, only 37% showed a random sequence generation at low risk of bias, 12% performed a correct sample size calculation, the examiner was masked in 42% of studies when blinding was feasible, and only 9% adhered to CONSORT statements. Bivariate analyses showed that several sources of bias were significantly associated with rejection of the null hypothesis. These included: inadequate allocation concealment (p = 0.0125), lack of information on dropouts (p = 0.0318) and failure to adhere to CONSORT statement (p = 0.0333).

Assessment of current research and research reporting

In the field of implant dentistry, it has been very encouraging to see increasing numbers and quality of reporting of RCTs. Notably, 59% of all RCTs in the implant dentistry field were published during the years 2006–2011.

Over the years, ethical board approval and the obtaining of informed consent have significantly improved (p < 0.0001 for each). In terms of study quality, metrics associated with bias, reporting, sample size calculation and statistical analysis reporting improved over the years (Cairo et al. 2012).

Priority areas for improvement of study design and quality of reporting

Key priority areas to improve the quality of reporting of RCTs include:

  • Adequate sample size calculation performed a priori, with the resulting number of patients allocated to experimental procedures
  • Incorporation of appropriate patient-reported outcomes
  • Proper randomization and allocation concealment (e.g. central allocation, computer-generated randomization)
  • Blinding of examiners, whenever possible
  • Appropriate statistical analysis with the patient as the statistical unit, using one implant for one patient, mean value for each patient, random effect models or multilevel models. For multicentre studies the centre effect needs to be considered.
  • Information on dropouts and possible reasons related to missing patients at the last follow-up
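
One of the options listed above for keeping the patient as the statistical unit is to use a mean value for each patient. The sketch below shows that aggregation step under stated assumptions: the record layout, the identifiers and the bone loss values are hypothetical, and random effect or multilevel models would be the alternative when implant-level information needs to be retained.

```python
from collections import defaultdict
from statistics import mean


def patient_level_means(implant_records):
    """Collapse implant-level measurements to one mean value per patient.

    implant_records is an iterable of (patient_id, value) pairs, for example
    marginal bone loss in mm measured at each implant.
    """
    values_by_patient = defaultdict(list)
    for patient_id, value in implant_records:
        values_by_patient[patient_id].append(value)
    return {pid: mean(values) for pid, values in values_by_patient.items()}


# Hypothetical data: patient P1 has two implants, patient P2 has one.
records = [("P1", 0.4), ("P1", 0.6), ("P2", 1.0)]

per_patient = patient_level_means(records)

# Treating implants as independent gives P1 twice the weight of P2.
implant_level_mean = mean(value for _, value in records)
patient_level_mean = mean(per_patient.values())
```

In this small example the two summaries differ because the patient with more implants dominates the implant-level mean. Analysing `per_patient` instead gives each patient equal weight and avoids treating correlated implants as independent observations.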

Priorities to reduce bias in experimental studies

A number of recommendations should be implemented to reduce bias in RCTs dealing with implant dentistry. These include:

  • Consideration of non-inferiority/equivalence designs in addition to superiority study design for some types of comparisons. For example, the non-inferiority design is valuable when the primary outcome may be similar between the treatment and control groups, but the treatment may cost less or be easier to use
  • Clinical trial registration at study initiation
  • Inclusion of study monitoring to improve overall study quality is encouraged
  • Using the intention to treat analysis as appropriate to manage drop outs if/when they occur
  • Adherence to CONSORT statements (with emphasis on the surgical extension of the CONSORT statement)
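
The logic of a non-inferiority comparison can be sketched with a binary outcome such as implant failure. This is a minimal illustration, assuming a Wald (normal-approximation) confidence interval for the difference in failure proportions and a one-sided alpha of 0.025; the function name, the margin of 5 percentage points and the counts are hypothetical, and the Workshop does not prescribe a specific test.

```python
from math import sqrt
from statistics import NormalDist


def non_inferiority_risk_difference(failures_test, n_test,
                                    failures_control, n_control,
                                    margin, alpha=0.025):
    """Assess non-inferiority of the test treatment on a failure proportion.

    Returns (difference, upper_bound, is_non_inferior), where difference is
    the test minus control failure proportion and upper_bound is the upper
    limit of its one-sided (1 - alpha) Wald confidence interval.
    Non-inferiority is concluded when upper_bound lies below the margin.
    """
    p_test = failures_test / n_test
    p_control = failures_control / n_control
    difference = p_test - p_control
    standard_error = sqrt(
        p_test * (1 - p_test) / n_test + p_control * (1 - p_control) / n_control
    )
    upper_bound = difference + NormalDist().inv_cdf(1 - alpha) * standard_error
    return difference, upper_bound, upper_bound < margin


# Hypothetical trial: 6/200 failures with the new protocol, 5/200 with the control,
# and a pre-specified margin of 5 percentage points.
difference, upper_bound, is_non_inferior = non_inferiority_risk_difference(
    6, 200, 5, 200, margin=0.05
)
```

The margin has to be fixed before the trial starts and justified clinically. A non-significant superiority test is not evidence of non-inferiority, which is why the conclusion here rests on the confidence bound rather than on a p-value for a difference.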

Outcome Assessment and Analysis in Implant Dentistry (Needleman et al. 2012)

Rationale for currently used outcome measurements focussed on implant survival and success

The rationale for any medical intervention, including implant-supported prosthetic rehabilitation, is to maintain or improve the patient's well-being (e.g. quality of life) that has been challenged by a disease, condition or disability. Thus, outcome measurements related to implant-supported rehabilitation cannot be limited to implant survival or success rates, but when appropriate should also include the functional performance and aesthetic aspects of the entire rehabilitation as well as the health status of peri-implant tissues. Assessment should also include patient-reported outcomes.

Implant survival rate has been used as a primary outcome measurement throughout the years because it provides long-term data on the predictability/validity of osseointegration, with respect to different patient characteristics, clinical conditions and medical devices. In this respect, it should be noted that whilst implant survival has remained the most commonly used implant outcome throughout the modern era of dental implant research, its prevalence has reduced over time from ≥70% in 1980–1994 to 53.6% in 2010–2011.

Implant success rate was less commonly employed, but was still present in just over half of studies (either as primary or secondary outcome). Implant success constituted a wide variety of composite or single outcomes with little or no consistency between studies. In general, implant success rates are based on clinical and/or radiological parameters related to the remodelling/loss of the implant-supporting bone as well as the disease status of the peri-implant soft tissues. Measuring the time-dependent bone level change around implants is used to provide information on the effect of occlusal load or to diagnose mucositis/peri-implantitis conditions. Probing measures are used to define the healthy/diseased conditions of the peri-implant tissues.

Limitations of current measures of performance

Implant survival – Limitations:

  • The presence or absence of the implant in the patient's mouth per se may not be associated with the maintenance or re-establishment of the patient's well-being (psycho-social characteristics and quality of life, absence of disease, function, aesthetics)
  • Lack of consensus on how to define implant survival (e.g. how to handle an implant that has completely lost osseointegration, but is still retained in place by the restoration or is non-restorable)
  • The reported high survival rates require large sample sizes and/or long-term follow-up for clinical research

Implant success – Limitations:

  • Lack of consensus in the definition of success for implant therapy
  • Lack of consensus in the choice of outcomes to measure implant success
  • Validity and relevance of the outcome measures (e.g. difficulty in measuring probing depth)
  • Lack of clarity of characteristics of study or reference groups
  • Variation in the timing of outcome assessment (e.g. timing of baseline observation)
  • Lack of details in reporting

Best practice in analytical methods

  • Methods should account for clustering of measurements (e.g. sites around implants) and implants (e.g. multiple implants within the restoration and the patient). This will allow a more appropriate estimation of both the clinical performance of implants and the relationships between risk factors at different levels (e.g. implant versus patient level)
  • Contemporary statistical methods should be used to account for censoring (where implant loss will not have occurred in all participants within the period of study, and/or patients will have been lost to follow-up). Recommended methods for such time-to-event or survival analyses are well documented
  • Studies need to report the statistical analysis properly to provide readers with sufficient information to evaluate the clinical significance of the results. For example, confidence intervals should be reported in connection with regression coefficients and shown on survival plots; assumptions underlying the statistical models, such as proportionality in Cox models for survival analysis, should be evaluated
  • Clarity of characteristics of study or reference groups
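
How censoring enters a time-to-event analysis can be shown with a small Kaplan-Meier (product-limit) estimator. This is a sketch under stated assumptions: one implant per patient (so no clustering), follow-up times in months, hypothetical data, and no confidence intervals, which a full analysis would need to report.

```python
def kaplan_meier(observations):
    """Kaplan-Meier survival estimate from (time, event) pairs.

    event is True when implant loss was observed at that time and False when
    the observation was censored (end of study or loss to follow-up).
    Returns a list of (time, survival_probability) at each time with a loss.
    """
    # Count losses and censored observations at each distinct time.
    counts = {}
    for time, event in observations:
        losses, censored = counts.get(time, (0, 0))
        counts[time] = (losses + 1, censored) if event else (losses, censored + 1)

    at_risk = len(observations)
    survival = 1.0
    curve = []
    for time in sorted(counts):
        losses, censored = counts[time]
        if losses:
            # Subjects censored at this time still count as at risk for it.
            survival *= 1 - losses / at_risk
            curve.append((time, survival))
        at_risk -= losses + censored
    return curve


# Hypothetical follow-up in months: two implant losses, three censored observations.
follow_up = [(12, True), (24, False), (36, True), (60, False), (60, False)]
survival_curve = kaplan_meier(follow_up)
```

The censored patient at 24 months contributes to the risk set for the first loss but not the second, so the estimate at 36 months is lower than the crude proportion of implants still present at the end of the study. That difference is the information a simple survival percentage discards.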

Reasons for lack of consensus

The issue of outcome choice and measurement in dental implant research has been long debated and relatively little progress has been made. This may be due to:

  • Differences in implant design, prosthetic supra-structure and treatment protocol (variations in pre-surgical anatomical conditions, surgical technique, timing of occlusal loading, etc.) provide an explanation for differences in the methodology for outcome assessment
  • Different opinions on the aetiology of peri-implant disease conditions and reasons for implant loss
  • The relationship between surgical and prosthodontic provision of treatment placing different emphasis on outcome measures
  • Lack of validity and relevance of the outcome measures
  • Lack of universally accepted guidelines (what, where, how and when to measure, and who measures)
  • Guideline development and knowledge transfer and implementation have not followed recommended best practice

Consequences of lack of consensus

  • Difficult to design, conduct and interpret studies
  • Difficult to compare and synthesize the results within and between studies
  • Lack of clear standards for quality assessment of new devices or protocols
  • Delayed application of knowledge and innovation to clinical practice
  • Difficult to identify best care

Characteristics of ideal outcomes

  • Relevant to treatment goals, especially including patient reported outcomes
  • Applicable to great majority of implant designs, restorations, patients and conditions
  • Clear, well defined, easy to use and validated
  • High sensitivity/specificity for disease diagnosis and disease progression
  • Amenable to commonly used statistical methods
  • Applicable and ethical in clinical practice
  • Universally accepted

Which dimensions/disease definitions should ideal outcome(s) capture?

The major goal of implant-supported rehabilitation is patient well-being. This includes quality of life, oral health, proper function and acceptable aesthetics. Thus, ideal outcomes should capture aspects directly related to this treatment goal.

Practical considerations for eventual widespread adoption of a minimum set of reported outcomes

Stakeholder involvement in development, ownership, dissemination and implementation is necessary. These stakeholder groups include patients, professionals, providers of education, health-policy makers, scientific communities, industry and funding agencies. There are established pathways to manage these aspects. Adoption of ideal outcomes should be encouraged in undergraduate and post-graduate education programs as well as in Continuing Education.

Proposal of a set of outcome domains for clinical research in implant dentistry

During the VIII EFP Workshop, a questionnaire survey on priorities of implant treatment outcomes was conducted among the workshop participants. The responses indicated that outcome measures should capture patient reported outcomes (McGrath et al. 2012), peri-implant health (Derks and Tomasi 2012, Graziani et al. 2012) and restoration outcomes (Pjetursson & Lang 2012, Benic et al. 2012). As a result of the responses and the deliberation of WG2, the following outcome domains were identified:

  • For patient-reported outcome measures
    • Health-related quality of life
    • General satisfaction
  • For peri-implant health
    • Marginal bone level
    • Tissue inflammation (BOP)
    • Probing depth (probing from reference point)
  • For implant-supported restoration
    • Longevity of the restoration
    • Function/occlusion related outcomes
    • Technical complications

For the assessment of peri-implant health, significant validation of single outcomes and their measurement systems has been performed and reviewed in previous workshops (Lindhe et al. 2008, Lang et al. 2011). These, together with the peri-implantitis case definition provided by group IV of this workshop (Sanz et al. 2012), allow the introduction of a set of success criteria related to this domain. Harmonization of measurement techniques and outcome reporting will improve comparability between studies and research synthesis. More work is needed for the other domains (Lang & Zitzmann 2012).


In spite of important progress in recent years, some clinical research in implant dentistry remains at high risk of bias. Conclusions continue to be made based on inappropriate research designs – this is particularly true in observational type research.

Properly designed, executed, analysed, interpreted and reported observational research is needed to expand the body of evidence in areas – such as risk factor research – that are not likely to benefit from randomized clinical trials. STROBE guidelines for reporting of observational research must be adhered to. In risk factor research, objective measures of exposure need to be employed. Establishment of an implant failure registry could provide a key resource.

Current emphasis on evidence generation with randomized clinical trials should continue. Notwithstanding recent success with efforts to reduce bias and improve reporting, efforts on quality improvement must continue. CONSORT guidelines for reporting of randomized clinical trials must be adhered to. Non-inferiority designs may prove beneficial in introducing clinically meaningful innovation.

To overcome the present state of confusion, clinical research in implant dentistry should consider three outcome domains: patient reported outcome measures, peri-implant tissue health and outcomes related to implant supported reconstructions. Peri-implant tissue health can be measured by marginal bone level changes (radiographic measurements) and tissue inflammation (bleeding on probing and pocket depth) and incorporated in the time to event analyses. More research is needed for the other domains.

Research on outcome measure domains, how to measure them and perhaps how to integrate them into a useful composite indicator is an area of top priority and should urgently inform a proper process leading to a consensus on outcome measures.