Clinical relevance of metal artefact reduction in computed tomography (iMAR) in the pelvic and head and neck region: Multi‐institutional contouring study of gross tumour volumes and organs at risk on clinical cases

Artefacts caused by dental implants and hip replacements may impede target volume definition and dose calculation accuracy. The iterative metal artefact reduction (iMAR) algorithm can provide a solution for this problem. The present study compares delineation of gross tumour volumes (GTVs) and organs at risk (OARs) in the pelvic and the head and neck (H & N) regions using computed tomography (CT) with and without iMAR, and thus the practical applicability of iMAR for routine clinical use.


Introduction
Radiation therapy is one of the major pillars in cancer treatment worldwide. Computed tomography (CT) is the clinically established gold standard for planning three-dimensional conformal external radiation therapy. Metal implants of high density and high atomic number cause dark and bright streaking artefacts and thus disrupt the quality and resolution of CT imaging. Such artefacts are the result of beam hardening, photon starvation and scatter, 1 leading to a loss of image information. Consequently, anatomical identifiability is impaired and the CT numbers can be modified. 2 Electron densities are derived from the CT numbers and represent a crucial measure for further treatment planning. Inaccuracies in CT numbers result in incorrect electron densities, and such image reconstruction errors may cause errors in the process of target volume definition, with negative impacts such as tumour recurrence (in case of underestimation of target volumes) or increased radiation toxicity (in case of volume overestimation) with unnecessary inclusion of healthy tissues. 3 Implants in the hip and head and neck areas can produce massive artefacts, which are often in the tumour area or in its immediate vicinity. As a consequence, contouring of target volumes and organs at risk (OARs) becomes more difficult and more elaborate. In addition, the prevalence of orthopaedic prostheses and dental fillings is increasing in elderly patients. 4 According to current projections, the demand for hip arthroplasty in the Organisation for Economic Co-operation and Development (OECD) countries will increase from 1.8 million per year in 2015 to 2.8 million in 2050. 5
Additionally, according to statistical estimates, 450,000 new cases of prostate cancer were predicted for 2018 in Europe. 6 Because radiation therapy is one of the main treatment methods for prostate cancer, an increasing frequency of artefact-affected CT data due to hip arthroplasty can be expected in the coming years. To delineate such artefact-affected tissues, radiation oncologists have to rely on experience. 7 In the literature, numerous different approaches to metal artefact reduction have been described in recent years. [8][9][10][11][12][13][14] However, some of these approaches are only theoretical models. Although some have been proven to reduce artefacts, not all have found their way into clinical routine. The effectiveness of the iterative metal artefact reduction (iMAR) algorithm was first demonstrated by means of technical phantoms. 2,15,16 Further studies, using anthropomorphic phantoms, also demonstrated the efficacy of iMAR. 17,18 Patient-based studies with iMAR are only rarely described in the literature, 4,19 and only a few retrospective studies with iMAR on real patients have been published so far, [20][21][22] but no contouring studies for the pelvis or head and neck (H & N) region have been performed. We used the iMAR algorithm for our study; it combines two previously introduced MAR algorithms, the normalized metal artefact reduction (NMAR) 9 and the frequency split metal artefact reduction (FSMAR), 12 in an iterative update scheme. 19,23 The aim of this study was to analyse the clinical significance of iMAR through a multi-institutional contouring study on representative clinical cases in the pelvic and H & N regions.

Methods
The study involved native planning CT and CT-iMAR data of two typical clinical cases in our radiotherapy department. The planning CT was performed in the supine position using a Somatom AS 20 (Siemens, Erlangen, Germany) with a 2-mm slice thickness. For each patient, corresponding CT-iMAR data were generated by software-based postprocessing. 9,12 All procedures were performed in accordance with the institutional ethics requirements.

Patient characteristics
1. A 76-year-old patient with elevated prostate-specific antigen (PSA) recurrence of a low-risk prostate gland adenocarcinoma and a bilateral hip replacement (Fig. 1).
2. [...] node metastases and extensive dental implants (Fig. 2).

Study design
Reference volume (V ref) parameters were generated by two highly experienced radiation oncologists from the Radiological Alliance Hamburg (FW for the pelvic region and CG for the H & N region) in Eclipse Treatment Planning System version 13.6, with one doctor for each of the two cases. Their clinical experience in oncological CT imaging amounts to more than 15 years, and each of them was a specialist in the respective region. To generate the reference structures (gross tumour volumes (GTVs) and organs at risk (OARs)), both image data sets (native CT and CT-iMAR) of each case were simultaneously accessible during the contouring process. The final version of the V ref was then contoured only on the CT-iMAR data set and used as a structure set for further analysis. Structure sets for the V ref generated on the CT-iMAR were transferred to the native CT data set, which resulted in uniform V ref on CT-iMAR and native CT for both cases. Here, 'GTV Prostate' (without seminal vesicles) and 'GTV Tongue' were defined as the primary tumour mass; both organs were contoured in their entire extension. The right parotid gland for the H & N region and the urinary bladder and rectum for the pelvic region were contoured as OARs. All these contours were determined as the V ref. Based on specific instructions, radiation oncologists and radiation oncologists in training from different participating institutions were requested to delineate target volumes and OARs using their segmenting tools and clinical experience. A minimum number of years in clinical practice was not required; however, experience in target definition for such cases was mandatory. The contouring process of the observers was performed within the Eclipse Treatment Planning System versions 10 and 13.6 (Varian Medical Systems, Palo Alto, CA, USA), depending on availability.
Observers were blinded to all other contouring plans to guarantee independence of results. A guideline of contouring tasks was designed to minimise memory bias; the contouring process was initiated on the native CT data set that presented loss of information and where GTVs were partly located in areas with total signal extinction or very high Hounsfield unit (HU) values. Subsequently, the same contouring was performed on the CT-iMAR data set that supposedly presents more image information. Finally, for both cases, contours were obtained for all regions of interest (ROIs) on both data sets (native CT and CT-iMAR) for each participant.

Data analysis
All structures were analysed using Eclipse Treatment Planning System version 13.6. Cranio-caudal cutting of all ROIs and V ref to a main artefact area allowed us to evaluate just the effect of iMAR. Structures were already cut in the presented figures. For a quantitative detection of the main artefact area, we used the 'Segment High Density Artefacts' tool in Eclipse to identify artefacts with high density (threshold 2.83 g/cm³). The first and last slice on the CT scan that showed such segments were defined as the borders of the main artefact area. Thereafter, we evaluated the Dice similarity coefficient (DSC) between the reference structures and each observer volume on the native CT scan and CT-iMAR data set to study the iMAR effect and to quantify the results (V ref − observer n).
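The two steps described above, locating the main artefact area and comparing cropped structures by DSC, can be illustrated with a minimal Python sketch. It is not the Eclipse implementation: the names `main_artefact_range`, `crop_to_range` and `dice`, the representation of a structure as a set of voxel indices, the strict `>` comparison against the threshold and the toy data are all assumptions made for illustration. The DSC is computed in its usual form, twice the overlap divided by the sum of the two volumes.

```python
from typing import Optional, Sequence, Set, Tuple

Voxel = Tuple[int, int, int]  # (slice index z, row y, column x); assumed layout


def main_artefact_range(density: Sequence[Sequence[Sequence[float]]],
                        threshold: float = 2.83) -> Optional[Tuple[int, int]]:
    """Return (first, last) slice index that contains a voxel above threshold.

    `density` is a list of slices, each slice a list of rows (g/cm^3).
    Returns None when no slice exceeds the threshold.
    """
    hits = [z for z, sl in enumerate(density)
            if any(value > threshold for row in sl for value in row)]
    return (hits[0], hits[-1]) if hits else None


def crop_to_range(structure: Set[Voxel], z_range: Tuple[int, int]) -> Set[Voxel]:
    """Keep only voxels whose slice index lies inside the artefact range."""
    first, last = z_range
    return {voxel for voxel in structure if first <= voxel[0] <= last}


def dice(a: Set[Voxel], b: Set[Voxel]) -> float:
    """Dice similarity coefficient 2|A n B| / (|A| + |B|) of two voxel sets."""
    if not a and not b:
        return 1.0  # convention chosen for this sketch: two empty sets match
    return 2 * len(a & b) / (len(a) + len(b))


# Toy volume (invented values): 4 slices of 2 x 2 voxels, metal in slices 1 and 2.
density = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[1.0, 4.5], [1.0, 1.0]],
    [[7.9, 1.0], [1.0, 1.0]],
    [[1.0, 1.0], [1.0, 1.0]],
]
reference = {(0, 0, 0), (1, 0, 0), (1, 0, 1), (2, 0, 0)}
observer = {(1, 0, 0), (2, 0, 0), (2, 0, 1), (3, 0, 0)}

z_range = main_artefact_range(density)
dsc = dice(crop_to_range(reference, z_range), crop_to_range(observer, z_range))
print(f"main artefact area: slices {z_range}, DSC = {dsc:.2f}")
```

Cropping both structures to the same slice range before computing the DSC mirrors the idea that only the artefact-affected part of each contour should enter the comparison.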
In addition, we evaluated the effect of iMAR within each observer (observer n − observer n) [intraobserver approach]. The DSC is defined as follows:
$$\mathrm{DSC}(A, B) = \frac{2\,|A \cap B|}{|A| + |B|}$$
It provides values between 0 and 1, where 0 represents no intersection and 1 reflects complete overlap of structures A and B. In the literature, a DSC > 0.7 is commonly reported to indicate an excellent match. 24 The DSC considers not only the absolute volumes but also the spatial positions of the two compared volumes (A and B), whereas the absolute volume of a single structure is independent of its location. Volumetric and DSC data were collected using the DICOM-statistic tool in Eclipse, which is embedded in Eclipse Contouring; volume was measured in cubic centimetres. Descriptive statistics were used; minimum, maximum and mean values, as well as standard deviation (SD), were calculated for each structure's data (on native CT and CT-iMAR) and for the DSC analysis. Average DSC and average volumetric values for the targets (GTVs and OARs) were analysed with paired t-tests after checking for normal distribution (Kolmogorov-Smirnov test). All analyses and boxplots were performed in Excel (Microsoft, Redmond, USA); P-values < 0.05 were considered significant. In addition, we asked the observers to evaluate the assistance and benefit of iMAR in categories (1 = excellent; 2 = good; 3 = adequate; 4 = sufficient; 5 = poor; 6 = unsatisfactory). Work experience in radiotherapy (in years) was also recorded.

Results
Tables 1 and 2 show the volumetric parameters of the GTVs and OARs in the pelvic region and in the H & N region. Distributions of absolute volumes and DSC parameters are illustrated in Figures 5 and 6 and represented as boxplots. Overall, delineated prostate volumes ranged from 28.3 cm³ to 195.6 cm³ using the native CT data set and from 37.4 cm³ to 117.5 cm³ with the CT-iMAR data set. Average GTV prostate was 87 ± 44 cm³ on the native CT data set and 75 ± 22 cm³ on the CT-iMAR data set. Thus, the volume decreased by 12 cm³, and the SD halved. However, iMAR imaging did not lead to significantly different volumetric results for contouring the prostate as target volume.
No significant differences were found between native CT and CT-iMAR imaging for contouring the rectum as an OAR (P = 0.714). On both imaging data sets (native CT and CT-iMAR), mean volume and SD were nearly identical. A significant difference (P = 0.013) was found for delineation of the bladder between the two imaging data sets. Mean volumes of the bladder on native CT and CT-iMAR were 213 ± 17 cm³ and 227 ± 10 cm³, respectively, ranging from 170.9 cm³ to 241.9 cm³ (on native CT) and from 215.6 cm³ to 237.9 cm³ (on CT-iMAR). Data for the H & N region were not significantly different. The reference structure GTV Tongue (34 cm³) was slightly underestimated both on the native CT data set (28 ± 6 cm³) and on the CT-iMAR data set (30 ± 7 cm³). The statistical evaluation of the intraobserver DSC calculations and the DSCs between observer and the reference for each ROI is presented in Table 3. There was no significant improvement in DSC values (0.83 ± 0.06 (native CT) compared to 0.86 ± 0.06 (CT-iMAR)) for the GTV Tongue. Contoured volumes of the parotid gland corresponded to the reference (18 cm³). In the intraobserver analysis, the mean DSC values for each organ were all higher than the mentioned limit of 0.7. Compared with the reference, the mean DSC values for the prostate increased significantly from 0.68 ± 0.15 to 0.78 ± 0.07 (P = 0.01). No significant change in mean DSC values for the rectum could be identified (P = 0.696). In both imaging data sets, the mean DSC for the rectum was > 0.7. For the bladder, we already found high mean DSC values on native CT (0.91 ± 0.04). However, the small improvement of mean DSC values to 0.94 ± 0.01 due to iMAR was significant (P = 0.008).
The average professional work experience of the participating observers was 11.6 ± 9.2 years for the pelvic region and 13.5 ± 10.0 years for the H & N region. Median values for assistance due to iMAR were 3.0 (pelvic region) and 2.8 (H & N region).
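The paired comparison used for the volumes and DSC values in this section can be sketched in plain Python. This is an illustration only, not the Excel workflow used in the study: the function names, the numerical integration used for the P-value and the sample numbers are assumptions, the numbers are invented and are not study data, and the preceding Kolmogorov-Smirnov normality check is omitted.

```python
import math
from statistics import mean, stdev


def paired_t(x, y):
    """Paired t statistic and degrees of freedom for y - x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length >= 2")
    diffs = [b - a for a, b in zip(x, y)]
    sd = stdev(diffs)  # sample SD of the paired differences
    if sd == 0:
        raise ValueError("all paired differences are identical")
    t = mean(diffs) / (sd / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


def t_pdf(t, df):
    """Probability density of Student's t distribution."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """Two-sided P-value via Simpson's rule on the density over [0, |t|]."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3  # probability mass between 0 and |t|
    return max(0.0, 1 - 2 * central)


# Invented DSC values for four hypothetical observers (not study data).
native = [0.60, 0.70, 0.65, 0.80]
imar = [0.75, 0.78, 0.74, 0.82]

t_value, dof = paired_t(native, imar)
p_value = two_sided_p(t_value, dof)
print(f"native: {mean(native):.2f} +/- {stdev(native):.2f}")
print(f"iMAR:   {mean(imar):.2f} +/- {stdev(imar):.2f}")
print(f"t = {t_value:.2f}, df = {dof}, P = {p_value:.3f}")
```

The test is paired because each observer contoured the same case on both data sets, so the differences per observer, rather than the two group means, carry the information.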

Discussion
The goal of this study was to evaluate the importance of the iMAR algorithm in routine clinical use, especially for tissue demarcation in the contouring process of extremely metal artefact-affected CT data images. Several studies have previously shown an influence of iMAR on HU reestablishment and dose calculation accuracy by removing metal artefact. However, so far, no contouring study can be found that evaluates the practical applicability of iMAR in routine clinical use for prostate or H & N cases. In our study, we found for the pelvic region that physicians were able to delineate the prostate (P = 0.010) and the bladder (P = 0.008) significantly more precisely using iMAR image data. Moreover, the distribution of the values was lower with iMAR. The visibility of the cranial part of the prostate, its apex and the dorso-basal area of the bladder at the transition to the prostate gland was especially improved by iMAR, which is also reflected in the contoured volume of both organs. Due to iMAR, the average volume of the bladder was larger, while that of the prostate was smaller compared to the reference. The iMAR had no influence on the distinctness of the rectum in our study. The participants were probably able to compensate for artefact-related misinformation in the area of the rectum. Overall, without using iMAR, visibility in the area between rectum and bladder was very limited by bright and dark streak artefacts; thus, important adjoining anatomical structures and geometrical borders were not discernible. With iMAR, this area was filled with soft tissue. Though borders between tissues seem slightly blurred on the iMAR reconstructed data, physicians were able to delineate GTVs and OARs successfully because image quality was sufficiently improved.
In the H & N region, about one-third of the volume of the parotid gland was erased. Contrary to the assumption that this was a difficult case, it appeared that clinical experience and non-artefact-affected residual anatomical structures on the CT data set were apparently sufficient for precise volume definition. Accordingly, there was no significant improvement in the target volume definition in the CT-iMAR data set compared to the reference structure for the GTV Tongue (P = 0.238) and OARs (P = 0.508).
For our study, radiation oncologists and radiation oncologists in training from university hospitals and from an experienced practice were chosen as participants to rule out a bias in the predetermined contouring progress. This also correlates with the moderate iMAR score of 3.0 and 2.8 for the pelvic and H & N region. The group of participants with different training statuses was deliberately chosen to test a wide range of clinically active physicians in a radiotherapy department to gain additional information, such as the importance of iMAR in individual training status. The wide range of findings in contoured volumes did not correlate with individual training status. We found that even some junior doctors showed remarkably high and reliable values. This strengthened the argument that iMAR is an important additional tool regardless of work experience. Average benefit scores of 2-3 were consistent with the findings that iMAR reconstruction of tissues was well (but not perfectly) improved, as if there were no implants. Though iMAR reliably reconstructs soft tissues, the boundaries between soft tissues were slightly indistinct but adequate for delineation of GTVs and OARs.
The iMAR technique reduces dose errors significantly. Bär et al. 2 investigated the influence of artefact from dental implants on the dose distribution. Comparing intensity-modulated radiation therapy (IMRT) plans from corrected and uncorrected images, dose differences in the range of ±5% were discovered in target volume and OARs, depending on the treatment site. Also, März et al. 16 showed a significantly improved dose calculation accuracy for IMRT and volumetric modulated arc therapy (VMAT) plans with correction of artefact by iMAR. iMAR-corrected CTs are also beneficial in the image-guided radiation therapy (IGRT) process, delivering improved reference data for cone beam computed tomography (CBCT) and two-dimensional (2D) images. To achieve a maximum of tumour control with a minimum of toxicity, the highest possible precision of radiotherapy is required to further improve treatment results. Recent research has shown that there is a trend towards hypofractionated radiotherapy (e.g. in prostate carcinomas [25][26][27][28] and in some H & N tumours 29,30). The phase III prostate advances in comparative evidence (PACE) trial even investigates extreme hypofractionation in prostate cases, using doses of > 7 Gy per fraction. 31 Therefore, high image quality will be of increasing importance as safety margins can be expected to be tight, especially in extreme hypofractionation schedules.
The DSC values for the GTV prostate did not show perfect results, and GTV site anomalies also affected planning target volumes (PTVs). In turn, dose distribution will change, making tumour control more difficult, especially in areas of common recurrences. Consequently, we can assume that iMAR can have a positive effect on accurate dose distributions and, therefore, on tumour control. The seminal vesicles are located in the transitional zone and are often the site of recurrences. 32 Likewise, artefact reduction occurred in the apex, and tissue contrast improved. By increasing contouring precision, the probability of recurrences might be reduced. Axente et al. 19 were able to show that using a CT-iMAR data set for contouring can increase the confidence of the participants. They also demonstrated that iMAR is an indispensable method for the reconstruction of the local anatomy in prostate cases with bilateral hip implants; our findings are in line with their results. Conversely, in [...]
To improve contouring accuracy, co-registration of CT with magnetic resonance imaging (MRI) is often used in the clinical routine. Successful definition of target volumes in cases with bilateral hip replacements in pelvic radiotherapy could be demonstrated using co-registration of CT and MRI. 33,34 Further studies using iMAR and MRI might demonstrate an increase in the precision of treatment planning. Based on our findings, it can be surmised that iMAR might show similar effects for some gynaecological tumours.
For our study, two cases with extreme metal artefact were selected. However, these two cases were not representative of all patients' cases having metal implants or inserts, as delineation and dose calculation will vary depending on each patient's anatomy and implants, especially in H & N cases. Nonetheless, using iMAR at our institution is the standard of care in cases of detected metal artefact during creation of planning CT scans.
In conclusion, for prostate cancer patients with bilateral hip implants, iMAR is helpful regarding target volume definition because of quantitatively improved image reconstruction, especially in the transition of the prostate and bladder and the apex of the prostate, thus reducing uncertainties in target definition and further processes. For the H & N region, experience and anatomical residual structures are sufficient for precise contouring of target volumes and OARs. Due to its ease of use, the authors recommend using iMAR, if available, for delineation in cases of disruptive metal artefacts regardless of anatomical regions but particularly for prostate or head and neck cases.