Evaluation of failure modes and effect analysis for routine risk assessment of lung radiotherapy at a UK center

Abstract
Purpose: Explore the feasibility of adopting failure modes and effects analysis (FMEA) for risk assessment of a high volume clinical service at a UK radiotherapy center. Compare hypothetical failure modes to locally reported incidents.
Method: An FMEA for a lung radiotherapy service was conducted at a hospital that treats ~350 lung cancer patients annually with radical radiotherapy. A multidisciplinary team of seven people was identified including a nominated facilitator. A process map was agreed and failure modes identified and scored independently; final failure modes and scores were then agreed at a face-to-face meeting. Risk stratification methods were explored and staff effort recorded. Radiation incidents related to lung radiotherapy reported locally in a 2-year period were analyzed to determine their relation to the identified failure modes. The final FMEA was therefore a combination of prospective evaluation and retrospective analysis from an incident learning system.
Results: Thirty-six failure modes were identified for the pre-existing clinical service. The top failure modes varied according to the ranking method chosen. The process required 30 h of combined staff time. Over the 2-year period chosen, 38 voluntarily reported incidents were identified as relating to lung radiotherapy. Of these, 13 were not predicted by the identified failure modes, with six relating to delays in the process, three issues with appointment times, one communication error, two instances of a failure to image, and one technical fault deemed unpredictable by the manufacturer. Four additional failure modes were added to the FMEA following the incident analysis.
Conclusion: FMEA can be effectively applied to an established high volume service as a risk assessment method. Facilitation by an individual familiar with the FMEA process can reduce resource requirement. Prospective evaluation of risks should be combined with an incident reporting and learning system to produce a more comprehensive analysis of risk.


1 | INTRODUCTION
Modern radiotherapy is recognized as a highly complex, multistep process delivered by a multidisciplinary team requiring numerous handovers. 1 Although radiotherapy is widely considered a safe and effective treatment option for cancer patients, radiotherapy accidents can have severe consequences resulting in significant patient harm. 2 In the last decade, there has been extensive work carried out to improve radiotherapy safety and risk assessment in the United Kingdom, 3 Europe, 4 and the United States. 5 To ensure the safety of treatments is established and maintained, regular risk assessment forms a key aspect of the commissioning and review of treatment techniques 3 and remains a legal requirement under UK law. 6 More generally, the IAEA Basic Safety standard, 7 from which most national safety standards are derived, also emphasizes the need to reduce radiological accidents and evaluate risks.

1.A | Failure mode and effect analysis
There are a variety of tools available to facilitate risk assessment, with failure mode and effect analysis (FMEA) generating a significant level of interest as an appropriate tool for use in radiotherapy, most notably in the AAPM Task Group 100 report. 8 In the literature, FMEA has been applied successfully to several complex radiotherapy modalities such as radiosurgery and stereotactic body radiotherapy. 9,10 The methodology advocated by Huq et al. 8 starts with a process map of the steps associated with the application requiring analysis. An FMEA is then performed to assess the likelihood of failures during each step in the process and the potential impact of such a failure. For each potential failure mode, the associated risk is classified using three parameters, severity, occurrence (or frequency), and detectability, according to the scoring system proposed by TG-100. 8 For a center considering an FMEA-based approach to risk assessment, a number of challenges identified within the literature require some local adaptation and interpretation. These are discussed below.

1.B | Identification of failure modes
Identifying potential errors and risks within a complex multistep process can be an exhaustive process and limited conclusions can be drawn from the available literature. Considering examples published on radiosurgery, the number of failure modes identified ranged from 86 11 to 409. 12 For more general radiotherapy, papers have been presented by several groups 13-15 with failure modes identified ranging from 52 to 127 with a variety of treatment planning and delivery platforms, and variable discussion of the scope of FMEA (i.e., including acceptance/commissioning as well as routine use).

1.C | Risk priority number
The risk priority number (RPN) is used to stratify failure modes by multiplying severity, occurrence, and detectability into a single number. The use of RPN is a recognized issue with the FMEA format, 16,17 not least as the doubling of a number, for example severity, does not result in a doubling of RPN. 18 With use of an indirect indicative score such as RPN, the method for stratifying risks and identifying those on which to focus intervention is critical. Methodologies presented vary from the top 5% 19 to all modes, 20 most likely reflecting the relative time and resources available between groups. Direct reliance on RPN for stratification is not recommended by Huq et al., 8 with some manual interpretation of severity suggested. In addition, the scoring of severity, occurrence, and detectability, and subsequent RPN can produce considerable variation between individuals even with the use of a standard matrix. 21 With multiple participants, a careful choice must be made between averaging of individual scores or group consensus scoring, with Ashley and Armitage (2010) recommending a consensus approach that allows for review and discussion of variation. 22
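As a minimal sketch of the RPN calculation and of why the choice between averaging and consensus matters, the Python snippet below multiplies severity, occurrence, and detectability on the 1-10 scale and summarizes how far individual RPNs can spread for a single failure mode. The function names and the three sets of scores are illustrative assumptions, not data from this study.

```python
from statistics import mean


def rpn(severity: int, occurrence: int, detectability: int) -> int:
    """Risk priority number: the product of the three 1-10 scores."""
    for score in (severity, occurrence, detectability):
        if not 1 <= score <= 10:
            raise ValueError("scores must lie on the 1-10 scale")
    return severity * occurrence * detectability


def summarize(individual_scores):
    """Return (mean, lowest, highest) of the RPNs from individual scorers."""
    rpns = [rpn(s, o, d) for (s, o, d) in individual_scores]
    return mean(rpns), min(rpns), max(rpns)


# Hypothetical (S, O, D) scores from three participants for one failure mode.
scores = [(8, 2, 3), (6, 4, 3), (9, 2, 5)]
average, lowest, highest = summarize(scores)
print(f"mean RPN {average:.1f}, range {lowest}-{highest}")
```

With these made-up scores the individual RPNs nearly double from lowest to highest, which an averaged value alone would hide; a consensus discussion makes that spread visible.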

1.D | Resource implications
Process mapping, identifying, and ranking failure modes and further intervention can require considerable resources. 17 There is limited consensus or discussion of the resource implications for an individual center considering adopting FMEA risk assessment. For surface guided radiotherapy, identification of failure modes and validation of occurrence, severity, and detectability was estimated at 30 h. 23 For a general external beam process with support from a trained facilitator, total FMEA including analysis was estimated at 75 h. 15 Three centers exploring FMEA for radiosurgery identified 104-135 failure modes but completed the process over a period of 2-6 months. 10 The most extensive example of an FMEA might be that offered by Schuller et al., 12 who identified a total of 409 failure modes, applied analysis to all modes, and required an estimated total of 258 h, equivalent to 34 and a half working days. Variation and uncertainty in resource requirements could potentially act as a barrier to a center considering FMEA-based risk assessment for new or existing services and prevent more widespread practical implementation.

1.E | FMEA for lung VMAT
The aim of the work presented here is to evaluate the application of FMEA as a tool for prospective risk assessment within a UK hospital.
The FMEA approach has been adopted locally as a methodology for documented radiation risk assessments within our center, as required under UK legislation. 6 A specific indication, lung cancer, has been chosen as the focus due to its high throughput, universal application, and relative complexity due to motion management issues.
Lung cancer accounts for 13% of cancer diagnoses in the United Kingdom with 46,403 cases diagnosed in 2014. 24 Noticeably, deaths attributed to lung cancer represent 22% of all cancer deaths, with 1-year survival rates of 34% and 39% for men and women, respectively. 25 The work presented here outlines the application of FMEA to lung radiotherapy within our center, and includes discussion of current stratification methods and rankings, as well as comparison to incident reports as a measure of efficacy of the process. The emphasis of the paper is on the process of risk assessment and stratification rather than on resolution of weaknesses identified by failure modes.

2 | MATERIALS AND METHODS
The FMEA exercise described herein was undertaken during a 3-month period at a UK NHS hospital. A multidisciplinary team was recruited for the exercise, consisting of two oncologists, two physicists, and three radiographers. To expedite the process, the lead author (a physicist) acted as facilitator and produced a process map and initial failure mode list based on historical risk assessments and the author's own understanding of the process. As the FMEA was for a pre-existing service, failure modes related to the commissioning process of the technique and equipment were omitted. Members of the group then individually scored the provided modes using the TG-100 scoring matrix and were asked to identify any additional failure modes. A face-to-face multidisciplinary team (MDT) meeting was then held to discuss the additional modes and review failure modes with a high variation (>5) in either severity, occurrence, or detectability score between MDT members. After consensus was achieved, to determine the validity of the pre-existing US-originating taxonomy to UK practice, local failure modes were assigned to the generic steps identified by Ford et al. 28 with the local expected causes linked to the associated coded causality. 28 To illustrate the high-risk process steps, the final FMEA modes were transcribed onto the process map. To determine the resource requirements of the FMEA process, the facilitator and all participants were asked to record time spent on the FMEA, and the face-to-face meeting was timed.
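The review criterion described above can be sketched in a few lines of Python. This is an illustration only: it assumes that "variation (>5)" means the difference between the highest and lowest individual score in a category, and the function name, data layout, and seven sets of scores are hypothetical.

```python
def categories_needing_discussion(scores_by_member, threshold=5):
    """Return the categories (S, O, D) whose spread between members exceeds threshold.

    scores_by_member: list of dicts with keys "S", "O", "D" on the 1-10 scale.
    Spread is taken as max - min, which is an assumed reading of "variation".
    """
    flagged = []
    for category in ("S", "O", "D"):
        values = [member[category] for member in scores_by_member]
        if max(values) - min(values) > threshold:
            flagged.append(category)
    return flagged


# Hypothetical independent scores from seven team members for one failure mode.
example_scores = [
    {"S": 8, "O": 2, "D": 3},
    {"S": 7, "O": 3, "D": 2},
    {"S": 9, "O": 2, "D": 9},
    {"S": 8, "O": 2, "D": 4},
    {"S": 8, "O": 4, "D": 3},
    {"S": 7, "O": 3, "D": 2},
    {"S": 9, "O": 2, "D": 3},
]

print(categories_needing_discussion(example_scores))
```

In this made-up case only detectability is flagged, because one member scored it 9 while others scored it 2, so the meeting time would be spent on that disagreement rather than on every score.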
To evaluate the usefulness of scoring systems, the final local FMEA scores were stratified according to RPN and a novel three-digit code system that uses severity, occurrence, and detectability to produce a three-digit number, S, O, D, that provides direct information on each category. To facilitate a three-digit code, the 1-10 system 8 was adjusted to 0-9 by subtracting 1 from all individual severity, occurrence, and detectability scores. The three-digit code system provides greater flexibility to consider risks in terms of severity, occurrence, and detectability independently, and mitigates the limitation of RPN that different combinations of S, O, and D can produce exactly the same value of RPN despite potentially having very different risk implications. 29 Finally, to determine the efficacy of the FMEA process, the center's incident reporting system, Datix, was interrogated to find all reported incidents involving patients undergoing VMAT lung radiotherapy since its implementation in April 2017. These results were then vetted for relevance to the radiotherapy planning process, compared to the identified failure modes, and added to the process map.

3 | RESULTS
Table 1 and Table 2, respectively.

3.A | Incident reports
• Two instances of a failure to image when required on a weekly basis

3.B | Lung VMAT radiotherapy process map
A high level process map was created to reflect the local Care Path with 20 steps from the creation of an electronic action sheet (EAS) to the end of treatment. Failure modes identified and attributable reported incidents were mapped to the relevant process steps (Fig. 1).

3.C | Resource requirements
The total staff time invested in this project was 29.

4 | DISCUSSION
The work presented here demonstrates the practical application of the FMEA approach to a high volume radiotherapy service at a large UK hospital. By using a facilitated approach similar to that of Ford et al., 15

4.A | Translation of FMEA to UK practice
There are several discrepancies between typical US and UK practice, most notably in the difference between UK radiographers and US dosimetrists and the limited availability of clinicians to attend treatment in the United Kingdom. Comparison of local practice for VMAT lung to the generic process map provided by Ford et al. 28 was, however, straightforward, with the majority of failure modes fitting within the process steps and causality coding provided. Some differences arise when considering the limited physicist involvement in the local VMAT lung service, where radiographers carry out plan checks. The use of causality coding was found to be particularly useful as it provided a better resolution of user-related errors than simply a general classification of "operator error."

4.B | Stratification and RPN
The RPN is the product of the three S, O, and D indices, designed to facilitate prioritization of failure modes. In its report, TG-100 8 highlights two key issues with RPN, first that the severity of errors
Table 1 and Table 2
For an example failure mode, where the wrong patient is called for treatment, the MDT group ranked severity as 10, as this is an unintended irradiation of a patient. Occurrence was considered low but not impossible, with a score of 2, and the failure was considered detectable, with a detectability score of 2, given the local requirements for two-stage identification and the use of IGRT. This produces a relatively low RPN of 40, ranked 31st of 36.
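The wrong-patient example can be expressed under both scoring schemes described in the Methods. The sketch below is illustrative Python, not the authors' tooling: the function names are assumptions, and the second set of scores (2, 4, 5) is a made-up failure mode chosen only because it gives the same RPN.

```python
def rpn(severity, occurrence, detectability):
    """Risk priority number on the 1-10 scale."""
    return severity * occurrence * detectability


def three_digit_code(severity, occurrence, detectability):
    """Encode 1-10 scores as an S, O, D code by subtracting 1 from each score."""
    digits = []
    for score in (severity, occurrence, detectability):
        if not 1 <= score <= 10:
            raise ValueError("scores must lie on the 1-10 scale")
        digits.append(str(score - 1))
    return "".join(digits)


def decode(code):
    """Recover the original 1-10 (S, O, D) scores from a three-digit code."""
    return tuple(int(character) + 1 for character in code)


wrong_patient = (10, 2, 2)   # scores quoted in the text
hypothetical_mode = (2, 4, 5)  # illustrative only, same RPN

print(rpn(*wrong_patient), three_digit_code(*wrong_patient))
print(rpn(*hypothetical_mode), three_digit_code(*hypothetical_mode))
```

Both modes share an RPN of 40, but the codes 911 and 134 keep the maximum severity of the wrong-patient failure visible in the leading digit.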
For a high volume service, even a low likelihood can produce a significant number of events. Under RPN ranking alone, there is a risk that such catastrophic errors are under-emphasized and not revisited as part of future updates to the risk assessment. Conversely, the risk-grading matrix proposed by Huq et al. 8 can overemphasize low severity, high frequency failure modes. In the example of a suboptimal IGRT match, where the radiographers position the patient <5 mm away from the intended isocenter, this will produce a minor dosimetric error (rank 4 severity) but is likely to occur relatively often, between 2 and 5% of the time (occurrence rank 9). A 10% risk of the failure going undetected (detectability rank 7) produces an overall RPN of 105, the fourth highest by RPN ranking. An IGRT match error of a greater magnitude (e.g., >5 mm) was considered locally as a separate failure mode, with a severity of rank 5, likelihood of rank 3, and detectability of rank 5, and was ranked lower by RPN (ninth) despite a higher severity. The failure mode was divided in this way as the associated controls at our center differ with magnitude, with errors larger than 5 mm requiring an independent expert practitioner to review the IGRT match. Splitting failure modes in this way can ensure different outcomes are considered, but will also increase the total number of modes requiring analysis.
When considering actions, ranking failure modes according to each index can aid stratification, but for the implementation of FMEA for regular risk assessment of services, each mode should be reviewed in case of any opportunity for improvement.
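To illustrate how the ranking method changes which failure mode comes first, the sketch below sorts two of the failure modes scored in this section by RPN and then by severity first (equivalent to ordering by the three-digit S, O, D code). It is a hedged Python illustration: the class and variable names are assumptions, and only the scores themselves come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    name: str
    severity: int
    occurrence: int
    detectability: int

    @property
    def rpn(self) -> int:
        return self.severity * self.occurrence * self.detectability


modes = [
    FailureMode("Wrong patient called for treatment", 10, 2, 2),
    FailureMode("IGRT match error > 5 mm", 5, 3, 5),
]

# Highest RPN first.
by_rpn = sorted(modes, key=lambda m: m.rpn, reverse=True)

# Highest severity first, then occurrence, then detectability.
by_code = sorted(
    modes,
    key=lambda m: (m.severity, m.occurrence, m.detectability),
    reverse=True,
)

print([m.name for m in by_rpn])
print([m.name for m in by_code])
```

The IGRT error leads under RPN (75 against 40), while the wrong-patient failure leads when severity is considered first, which is the under-emphasis of catastrophic errors discussed above.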

4.C | Incident reporting and FMEA
Of the 25 incident reports attributable to identified failure modes, 18 (72%) resulted from the five failure modes listed in Table 3.
Of the reported incidents, 2/38 (5%) were not detected by the expected control measure. In both cases, this was a failure of the radiographer checking the plan to spot errors in the immobilization information that was discovered on first fraction due to the use of IGRT as a safety barrier. No incidents were reported as resulting in any harm or potential harm to the patient. From this relatively low number of incidents, it can be argued that the frequencies stated within the FMEA exercise would appear to be higher than reality.
There are, however, caveats to incident reporting systems that must be considered. The incidents reported cannot include all potential near misses and errors that are undetected and thus unreported.
The system relies on voluntary reporting, and incidents that occur and are subsequently rectified may not be included. Physicists and radiographers reported all 38 incidents, with no incidents directly reported by a clinician. There is therefore a risk of bias in the type of incidents reported due to a nonuniform culture of incident reporting between professions.
An additional incident, a technical fault within the record and verify system, was investigated by the manufacturer and classified as rare and unpredictable. Although this error did not produce a physical effect, as the plan was undeliverable, the incident highlights the impossibility of a fully exhaustive proactive risk assessment. In reality, there is an unpredictable element that relies on nonspecific control measures and the competency and skill of staff involved that may not be explicitly included within the FMEA. This highlights the importance of combining FMEA risk assessment with a local incident reporting and learning system.
Considering the incidents reported locally, four additional failure modes were added to the risk assessment, listed in Table 4.
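The incident counts quoted in this section and in the Abstract can be cross-checked with simple arithmetic. The short Python sketch below uses only figures stated in the text; the variable names are illustrative.

```python
# Counts quoted in the text.
total_incidents = 38
unpredicted = {
    "delays in the process": 6,
    "appointment times": 3,
    "communication error": 1,
    "failure to image": 2,
    "technical fault": 1,
}

not_predicted = sum(unpredicted.values())
attributable = total_incidents - not_predicted

top_five_share = 18 / attributable        # incidents from the five modes in Table 3
undetected_share = 2 / total_incidents    # incidents missed by the expected control
final_failure_modes = 36 + 4              # original modes plus those added after review

print(f"not predicted: {not_predicted}, attributable: {attributable}")
print(f"top five failure modes: {top_five_share:.0%} of attributable incidents")
print(f"missed by expected control: {undetected_share:.0%} of all incidents")
print(f"failure modes after incident review: {final_failure_modes}")
```

The totals are consistent with the 13 unpredicted incidents, 25 attributable incidents, 72% and 5% proportions, and 40 final failure modes reported in the paper.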

4.D | High-risk process steps
Within the process map in Fig. 1, the number of associated failure modes can be used to highlight particularly complex process steps. Steps with greater than three failure modes are CT scanning, outlining, planning, and first treatment. From the incident reporting analysis, the majority of actual incidents occurred at: outlining (6), planning (8), the decision to treat (7), and first treatment (5). Using the number of associated failure modes to highlight key process steps would not have included the decision to treat as a high-risk step for further detailed investigation.
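The comparison between the two ways of highlighting process steps can be sketched as a set difference. In the Python below, the incident counts are those quoted above; the failure-mode counts are placeholders, since the text states only which steps exceed three and that the decision to treat has two. The incident threshold of five is likewise an assumption made for illustration.

```python
# Placeholder failure-mode counts: only ">3" or "2" is known from the text.
failure_modes_per_step = {
    "CT scanning": 4,
    "outlining": 4,
    "planning": 4,
    "first treatment": 4,
    "decision to treat": 2,
}

# Incident counts as quoted in the text.
incidents_per_step = {
    "outlining": 6,
    "planning": 8,
    "decision to treat": 7,
    "first treatment": 5,
}

flagged_by_fmea = {step for step, n in failure_modes_per_step.items() if n > 3}
# Threshold of five incidents is an illustrative assumption.
flagged_by_incidents = {step for step, n in incidents_per_step.items() if n >= 5}

missed_by_fmea = flagged_by_incidents - flagged_by_fmea
print(sorted(missed_by_fmea))
```

Under these assumptions the only step highlighted by incidents but not by failure-mode count is the decision to treat.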
The decision to treat process step represents a stage in the local patient pathway whereby an electronic action sheet (EAS) is produced by the clinician to specify the intended dose and fractionation, state the target laterality, and highlight any previous radiotherapy. A number of incidents occurred at this stage but in the local FMEA only two failure modes were identified for the EAS process step.
The first, a patient incorrectly prescribed radiotherapy and the second a patient prescribed a suboptimal combination of dose/fraction-  36 and these measures on these occasions prevented near-misses from becoming reportable incidents.
These incidents also align with a key theme in the local FMEA MDT discussion, namely the limited safety barriers for clinician outlining and prescribing. Deciding on treatment approaches and delineating CTVs are particularly challenging tasks. 36
From this experience, future FMEAs undertaken locally will include a further meeting to confirm the final analysis. When considering the number of failure modes determined, the total of 40 is lower than previously published values, albeit for different specialisms and equipment. As discussed previously, omission of risks associated with commissioning and quality assurance may account for some of this discrepancy, as well as the aforementioned limited granularity of individual failure modes. Nevertheless, the work presented here shows FMEA to be a feasible option for risk assessment in a busy clinic.

5 | CONCLUSION
Failure modes and effect analysis can be effectively applied for routine risk assessment of clinical services as required under UK law. 37 Facilitation can be used to reduce the time burden of the FMEA process to a level manageable for busy departments enabling wider implementation beyond specialist and new services. Comparison with local incident reporting highlights that although an MDT approach can produce a comprehensive list of failure modes, regular comparison with robust local reporting procedures can ensure an inclusive consideration of risk.

ACKNOWLEDGMENTS
The authors thank Dr. Antony Pope, Dr. Anoop Harridass, Rhydian Caines, Craig McGettrick, Katie Williams and Victoria Chapman for their assistance with this work. The authors report no conflict of interest in conducting the research.

AUTHOR CONTRIBUTION
Martyn Gilmore was the author of the manuscript, providing all written text. Martyn was also the facilitator alluded to in the text and as such provided all preassessment materials, including failure modes and the process map, and carried out all analysis discussed.
Carl Rowbottom was the supervisor of Martyn Gilmore during the work described in the manuscript and carried out reviews of multiple drafts prior to submission. Carl worked with Martyn to develop the structure of work described within.

CONFLICT OF INTEREST
No conflict of interest.

DATA AVAILABILITY STATEMENT
The data that support the findings of this study are available from the corresponding author upon reasonable request.