Head and neck cancer patient positioning using synthetic CT data in MRI‐only radiation therapy

Abstract Purpose The accuracy and precision of patient positioning is crucial in radiotherapy; however, there are no publications available using synthetic computed tomography (sCT) that evaluate rotations in head and neck (H&N) patients positioning or the effect of translation and rotation combined. The aim of this work was to evaluate the differences between using sCT with the CT for 2D‐ and 3D‐patient positioning in a magnetic resonance imaging (MRI)‐only workflow. Methods This study included 14 H&N cancer patients, with generated sCT data (MRI Planner v2.2) and the CT deformably registered to the MRI. Patient positioning was evaluated by comparing sCT against CT data: 3D cone beam CT (CBCT) was registered to the deformed CT (dCT) and sCT in six degrees of freedom (DoF) with a rigid auto‐registration algorithm and bone threshold, and 2D deformed digital reconstructed radiographs (dDRR) and synthetic DRRs (sDRR) were manually registered to orthogonal projections in five DoF by six blinded observers. The difference in displacement in all DoF were calculated for dCT and sCT, as well as for dDRR and sDRR. The interobserver variation was evaluated by separate application of the paired dDRR and sDRR registration matrices to the original coordinates of the planning target volume (PTV) structures and calculation of the Euclidean distance between the corresponding points. The Dice similarity coefficient (DSC) was calculated between dDRR/sDRR‐registered PTVs. Results The mean difference in patient positioning using CBCT was <0.7 mm and <0.3° and using orthogonal projections <0.4 mm and <0.2° in all directions. The maximum Euclidean distance was 5.1 mm, the corresponding mean (1SD) Euclidean distance and mean DSC were 3.5 ± 0.7 mm and 0.93, respectively. Conclusions This study shows that the sCT‐based patient positioning gives a comparable result with that based on CT images, allowing sCT to replace CT as reference for patient treatment positioning.


INTRODUCTION
In traditional external radiotherapy, computed tomography (CT) images are the primary source of information for treatment planning. Magnetic resonance imaging (MRI) can be incorporated as a secondary source of information, used for delineation of target and organs at risk (OAR) but requires a coregistration between MRI and CT. This coregistration process is associated with systematic uncertainties, 1,2 caused by imperfect registration algorithms, interscan variation in patient setup, and/or anatomical changes. The head and neck (H&N) region consists of numerous anatomical structures and has a complex freedom of motion, which causes a higher risk of imperfect coregistrations compared to other regions, such as brain.
In the past decade, interest has been directed toward MRI-only radiotherapy workflows, that is, radiotherapy workflows that use MRI as the only imaging modality. An MRI-only workflow excludes the CT imaging session, and with this also the anatomical uncertainties derived from the MRI to CT coregistration. 3 A single imaging session further leads to reduced inconvenience for the patients compared to a multimodality workflow.
Since MRI neither provides the electron densities necessary for absorbed dose calculation nor can be directly reconstructed to digital reconstructed radiograph (DRR) with the same contrast as CT data, additional data processing steps resolving these issues must be incorporated in an MRI-only workflow. These supplementary steps generally include a conversion of the MRI data to synthetic CT data (sCT). The most common conversion techniques are bulk density assignment, atlas-or voxelbased conversion, and the state-of -the-art approach of today; machine learning. [4][5][6] In an MRI-only workflow, sCT and synthetic DRR (sDRR) originating from the sCT data replace the original CT and DRR data.
The accuracy of sCT-based absorbed dose calculations has been previously evaluated for various sCT generation methods and treatment sites, [7][8][9][10][11] including the prereleased version for sCT generation in the H&N region used in this study. 12 Although the dosimetric properties of sCT data generated with various methods have been widely evaluated, not many studies have focused on the accuracy and precision of patient positioning using sCT data. 3 There are some validations of bone-based patient positioning using sDRR and/or sCT for brain, 13,14 prostate, 15-17 rectum, 18 and pelvis. 19 However, very few investigations of sCT-based patient positioning have been performed for the H&N region 20 and to our knowledge, there are no publications available that evaluate rotations for positioning of H&N cancer patients. There are also no publications available for any anatomical region evaluating the effect of translation and rotation combined. The aim of this work was to evaluate the differences between using sCT/sDRR with the original CT/DRR for patient positioning in an MRI-only H&N radiotherapy workflow.

Patient selection
Patient positioning was evaluated for 14 H&N patient data sets,comprising both sCT data and CT data.Ethical approval was obtained from the Swedish Ethical Review Board (Reg nr. 645-17). The patients received oral and written information prior to study inclusion and signed a written consent if participating. The inclusion criteria were sufficient field of view (FOV) for both the MRI and CT data to enclose the patient body contour of the H&N and shoulders, no major post-surgical implants or dental restorations causing disruption of the body contour in the MRI, and cone beam CT (CBCT) as well as orthogonal 2D projections (0 • and 90 • ) for treatment positioning verification.

Images
CT data were acquired using an Aquilion LB CT scanner (Toshiba Medical Systems, Tokyo, Japan). All MRIs were acquired with a 1.5 T Siemens Aera wide bore MR-system (Siemens Healthcare, Erlangen, Germany), equipped with a flat tabletop (CIVCO Radiotherapy, Iowa, USA). A T1-weighted Dixon Vibe (3D spoiled GRE) acquisition was utilized for sCT generation using the software MRI Planner version 2.2 (Spectronic Medical AB, Helsingborg, Sweden). 21 The MRI scan parameters were transversal slices, 448 × 381 scan matrix, 448 × 448 reconstruction matrix, reconstructed in-plane resolution 1.12 × 1.12 mm 2 , 198 slices, 2 mm slice thickness, no slice gap, 795 Hz/pixel readout bandwidth, 8 ms repetition time, 2.39 ms and 4.77 ms echo time, 3 averages, 3D geometry correction, and a total scan time of 9 min 57 s. Patient setup for MRI and CT acquisition, and the CT scan parameters and the commercially available CNN-based transfer function estimation algorithm used for sCT generation have been previously described. 12 CBCT data and orthogonal projections were acquired using the kV On-Board Imager system mounted on the treatment device (Varian Medical Systems, Palo Alto, CA). The CBCT data had a 512 × 512 reconstruction matrix, 0.91 mm 2 reconstructed in-plane resolution, 2-mm slice thickness and 88 slices. The peak kilo voltage output and exposure were 125 kV and 268 mAs, respectively. The orthogonal projections were obtained at 0 • and 90 • , and both images had a 1024 × 768 matrix, 0.26 mm 2 in-plane resolution, and a peak kilo voltage output and exposure of 85 kV and 5 mAs, respectively.
The CT and MRI were acquired consecutively (within hours), and the CBCT data and orthogonal projections used in this study were acquired at the third treatment fraction, approximately 10 days (8-16 days) after the CT and MRI acquisition. All data were acquired in treatment position, with the patient´s head and shoulders immobilized in a custom-made five-point thermoplastic mask (Orfit Industries, Wijnegem, Belgium) using a head support (CIVCO Radiotherapy, Iowa, USA). The CBCT data were acquired for the purpose of this study only and not used for treatment positioning verification. For all treatment instances, including the third fraction where CBCT data were acquired, patient positioning was carried out using bone anatomy in the orthogonal projections.

Initial image registration
The pretreatment CT and MRI images are expected to differ from each other due to slightly inconsistent positioning of the patient or anatomical displacement between the two scanning sessions. For this reason, the CT data were deformably registered to the in-phase MRI data before further analysis. Prior to the registration, the immobilization mask and background noise were handled by applying −1000 Hounsfield Units (HU) and 0 signal level to the voxels outside the body contours in the CT and MRI, respectively. For registration, a rigid Euler transformation followed by a nonrigid B-spline transformation with a local rigidity constrain of the bones (MICE Toolkit, v.1.1.0.0, Nonpi Medical, Umeå, Sweden) was utilized. The registrations were optimized with an adaptive stochastic gradient descent procedure with mutual information as similarity metric, as well as rigidity penalty term for the nonrigid deformation. The deformed CT (dCT) was resampled to obtain the same frame of reference and spatial resolution as the sCT. To validate the physical behavior of the deformation, the determinant of the Jacobian (JD) of the deformation field was calculated within the body contour of the dCT data. 22 To confirm the anatomical correspondence between the dCT and MRI, the difference in Euclidean distance between anatomical bone landmarks was calculated. The landmarks 23 are listed in Table 1. The reproducibility of the placement of the anatomical landmarks was evaluated by repeatedly placing the first right landmark 10 times.

Patient positioning
Patient positioning using 3D CBCT data was conducted in Eclipse Image Registration workspace (Var-TA B L E 1 List of the bone landmarks used to evaluate the computed tomography (CT) to magnetic resonance imaging (MRI) image preregistration, and the results of the preregistration evaluation The calculated distances between anatomical bone landmarks Mean (± 1SD) (mm) 1. Horizontal distance between the medial edge of bilateral mandibular condyles.
Horizontal distance between the tip of bilateral mastoid processes.
−0.5 ± 2.5 3. Vertical distance between the mentum and the midpoint of the anterior surface of the vertebral body. ian Medical Systems, Palo Alto, CA) using the rigid autoregistration algorithm. The auto-registration was optimized with a downhill simplex procedure and the mutual information as similarity metric. For each patient, the CBCT data were registered to the dCT and sCT in six degrees of freedom. The volume of interest was selected to cover the vertebral column, mandible, and skull base, and with a bone-threshold of 200 -1700 HU corresponding to the clinical practice for 2D patient positioning at our department that relies on the bony anatomy. The registrations were visually inspected and considered to meet the requirements for clinical use, that is, the bony anatomy of CBCT and reference image overlapped. Patient positioning using 2D orthogonal projections was conducted in Eclipse Offline Review workspace (Varian Medical Systems, Palo Alto, CA). For each patient, the deformed DRR (dDRR) and the sDRR were retrospectively and manually registered to the third fraction´s orthogonal projections in the five available degrees of freedom (2D data cannot be rotated around the inferior-superior axis in the Eclipse Offline Review workspace) by six blinded observers (three radiotherapy technologists [RTTs] and three medical physicists), resulting in a total of 168 registrations. The observers were not informed of the type of DRRs that were registered to the orthogonal projections and were asked to comment on any abnormalities in the reference data.
F I G U R E 1 Illustration of the evaluation of interobserver variation. (a) The blue and red areas are the planning target volume (PTV) structure with the deformed digital reconstructed radiographs (dDRR) and synthetic DRR (sDRR) registration matrix applied, respectively. (b) The yellow vector in the cropped image corresponds to the Euclidean distance between a single coordinate in the dDRR-registered PTV structure and the corresponding coordinate in the sDRR-registered PTV structure.

Evaluations
To assess the 2D-and 3D-positioning accuracy, the difference in displacement in all directions (right-left [R-L], posterior-anterior [P-A] and inferior-superior [I-S], and rotations around these axes) were calculated for dCT and sCT, as well as for dDRR and sDRR. Paired equivalence tests of two one-sided t-test 24 was conducted (package Statsmodels v0.10.0, Python v3.7) to verify the equivalence of the mean 2D-and 3D-differences in translation, respectively. The confidence interval was 95%, and the equivalence interval was set to (−1 mm, 1 mm). The interobserver variation was evaluated by separate application of the paired dDRR and sDRR registration matrices to the original coordinates of the planning target volume (PTV) structures, and subsequent calculation of the difference between the corresponding points ( Figure 1). This Euclidean distance was calculated for every point in the DICOM-RT Struct file (i.e., border of PTV). The Dice similarity coefficient (DSC) was calculated to estimate the overlap between dDRR/sDRR-registered PTVs for every observer.

Initial image registration
For all data sets except one, the JDs of the deformation fields were >0 within the body contours. The evaluation of anatomical correspondence of bones gave a maxmimum mean difference of −1.3 ± 1.6 mm for all evaluated anatomical landmarks ( Table 1). The 10 repeated distributions of the first right landmark gave a mean and maximum value of 1.5 and 3.4 mm, respectively.

Patient positioning
The sDRRs originating from the sCT data were comparable to the original DRR; however, there were some observable differences in the sDRR such as blurred edges and decreased contrast for part of bone structures (highlighted with ellipses in Figure 2).

2D and 3D positioning accuracy
The mean differences between sCT-and CT-based patient positioning using CBCT were <0.7 mm and < 0.3 • in all directions, as seen in Table 2. For all cases, no absolute translation or rotation difference was larger than 2.1 mm and 1.3 • , respectively. The mean differences between sCT-and CTbased patient positioning using orthogonal projections were <0.4 mm and <0.2 • in all directions, as seen in Table 2. For all cases, no absolute translation or rotation difference was larger than 2.5 mm and 1.8 • , respectively.
All translation and rotation differences in each direction between patient positioning using dCT and sCT for both 2D-and 3D-patient positioning are presented in Figure 3.
The paired statistical test of equivalence rejected both t-tests' null hypothesizes (p-value <0.001). The translation data fall within the equivalence bonds (−1 mm, 1 mm) for both 2D-and 3D-positioning and are therefore considered equivalent compared to CT.

Interobserver evaluation
The mean (± 1SD) Euclidean distance for all observers within each case, the maximum and corresponding mean (± 1SD) Euclidean distance for each case of the interobserver evaluation, is presented in Table 3, together with the mean (± 1SD) DSC for all observers within each case. Histograms showing the distribution of distances between every dDRR and corresponding sDRR-based registered PTV coordinates for the two smallest and largest mean Euclidean distance differences are shown in Figure 4.

DISCUSSION
In this study, we have evaluated the use of sCT and sDRR for patient positioning in external radiation therapy by comparing the difference in registration result between sCT/sDRR and the original CT/DRR for the H&N region. The results show that using sCT and sDRR data as reference yields similar registration results as when using CT and DRR data. Even though the sDRR may have some visual dissimilarities in contrast compared to CT-based DRR, this study shows that they still can be reliably used for patient positioning purposes. In order to be able to transit from traditional MRI-CTbased workflow to an MRI-only workflow, there are high demands on the image quality of the MRI data used for both the traditional treatment preparation steps (i.e., delineation) and reliable sCT generation. In addition, the combined challenges of the H&N region are complex anatomy, the large MRI FOV required for sCT generation, and the occational presence of surgical implants or dental restorations that cause disruption of the body contour in MRI. MRI can suffer from a large number of artifacts, including metal and motion artefacts, aliasing, chemical shift, etc., which could influence the sCT generation. But apart from motion and dental-related metal artefacts, no MRI artifacts were observed in this study.
All treatment planning procedures contain systematic treatment preparation errors such as setup errors at the CT and MRI scanner and the treatment unit, delineation errors of the target and OARs as well as the fact that the pretreatment CT and MRI images are instant captures of continuously moving organs. 25 The strategy used in this study to evaluate differences in patient positioning eliminates some of these errors, such as the initial laser-based treatment unit setup errors, and delineation error since both images use the same delineated target and OARs. The anatomical differences between the CT and MRI acquisition were mitigated by deformable F I G U R E 3 Differences in translation (upper row) and rotation around the specified axis (lower row) between deformed computed tomography (dCT) and synthetic CT (sCT). The red boxplots (left column) show differences in translation and rotation between cone beam CT registered to dCT, and CBCT registered to sCT for 14 cases in the right-left (R-L)/pitch, posterior-anterior (P-A)/yaw and inferior-superior (I-S)/roll directions. The blue boxplots (right column) show differences in translation and rotation between deformed DRR registered to orthogonal and synthetic DRR registered to orthogonal projection for 14 cases and six observers. The bar of the boxplots is the inter-quartile range (IQR), the whiskers 1.5⋅IQR and the plus outliers > or < 1.5⋅IQR. registration of the CT to the MRI. This is a critical step in this study since a geometrical difference between these images could influence the patient positioning evaluation results.
The physical behavior of the deformation was validated using JD, where a JD greater than 1 indicates expansion of a voxel, and a JD smaller than 1 indicates a contraction. A negative value of the JD would indicate a physically unrealistic deformation for most organs as they only can expand or be compressed, and not undergo noninvertible deformations such as folding. 22 The JD of the deformation field showed that 13 of 14 dCT data sets had physically realistic deformations (JD > 0). The data set with unrealistic deformation was caused by a physical folding of loose skin at the back of the patient during the MRI acquisition, which was not the case when acquiring the CT images, and hence, cannot be attributed to errors in registration of the MRI and CT images. The landmark evaluation resulted in a mean difference between landmarks in the CT and MRI of approximately 0.3 mm. However, it is essential to recognize that any remaining anatomical difference would systematically affect the results of the patient position-ing evaluation. Nevertheless, in an MRI-only workflow such preprocessing registration will be redundant.
When inspecting the data for case 11, it is clear that the initial registration CT-MRI is deficient for the vertebra column. The MRI data suffered from motion artifacts over vertebra C3-C5, causing difficulties in visually distinguishing part of the vertebras. The algorithm generating sCT data appears to be able to generate the body of the vertebras well, but to have some trouble with generating the spinous processes. As a result of the suboptimal deformable registration, differences in 2D and 3D positioning results using sDRR/sCT or dDRR/dCT data were noticeable for this specific case (2D: up to 2.4 mm, 1.7 • difference, 3D: up to 2.1 mm, 0.2 • difference).
Previous studies have evaluated translational differences for bone-based patient positioning using synthetic 2D data only, 15,16,20 and both synthetic 2D and 3D data. 13,19 There are also studies evaluating both translation and rotation differences using synthetic 3D data. 14,16,18 To our knowledge, there are no studies evaluating rotations in the H&N region.
Comparing our results with previous studies, the translational differences are similar for patient F I G U R E 4 Distributions of the Euclidean distances between every transformed planning target volume (PTV)-point using deformed digital reconstructed radiographs (DRR) and synthetic DRR-based registration matrices for the two smallest (case 1 and 5) and largest (case 7 and 11) mean Euclidean distance differences and all observers. positioning using orthogonal projections. A maximum mean difference of −0.2 mm in the P-A direction has been shown for H&N cancer, 20 compared with our maximum mean difference of −0.3 mm in the I-S direction. The translational and rotational differences are also similar for patient positioning using CBCT, where a maximum mean difference of 1.5 mm (P-A) and 1.2 • (around R-L) was shown for prostate. 17 Both the 2Dand 3D-evaluation had the largest translational differences in displacement in the I-S directions. This might be the result of a slice thickness of 2.0 mm compared to the reconstructed in-plane resolution of 1.1 × 1.1 mm.
Until now, adequate setup margins for routine setup verification in H&N radiotherapy have been determined using bone landmarks and 2D kV images 26 as well as 3D CBCT images. 27 Previously, it has been concluded that average translational and rotational setup errors of up to 2.2 mm and 0.7 • for H&N IMRT treatment plans with a 5-mm PTV margin, over the entire treatment course, introduce minor dosimetric differences to the CTV (0.39 Gy mean difference of D 98 ),PTV (1.21 Gy mean difference of D 95 ), and spinal cord (1.20 Gy mean difference of the dose to 1 cm 3 spinal cord). 28 Our results were well within these margins and smaller than the setup errors, indicating that utilization of sCT data as reference for patient positioning will have minor impact on the delivered dose distribution.
In our experience, there are no previous studies evaluating both translation and rotation using both 2D-and 3D-data. Evaluating differences in translation and rotation for a reference point is the most common method to evaluate patient positioning. Although, when allowing rotation, one can argue that the impact of a difference in this metric is hard to interpret. We propose employment of Euclidean distances between transformed PTV coordinates using dDRR and sDRR-based registration matrices in order to interpret this impact. For example, in case 11 -observer 2, the common method resulted in 2.4-and 1.9-mm difference in P-A and I-S direction respectively, and 1.7 and 0.9 • difference in rotation around R-L and P-A. The Euclidean distance method, however, resulted in a mean distance of 3.5 mm and a maximum distance of 5.1 mm between dDRR and sDRR PTV coordinates. The Euclidean distance method also has the benefit of being easily visualized as histograms (Figure 4). Identical rotation between patient positioning using dDRR and sDRR will make the distribution appear as a spike, and an offset of the spike will correspondingly represent a translation offset.
Deviations in patient positioning with orthogonal projections can arise from different observer strategies regarding the use of translation and rotation during image registration, as different registration strategies could provide relatively large differences in translation or rotation between observers and still obtain comparable position of the PTV. The mean DSC found for the different cases shows that the different observers pro-TA B L E 3 The mean (± 1 standard deviation (SD)) Euclidean distance for all observers within each case, the maximum and corresponding mean Euclidean distance of planning target volume (PTV) coordinates found for each case after applying the paired deformed digital reconstructed radiographs (DRR) and synthetic DRR registration matrices to the original coordinates, together with the mean Dice similarity coefficient for all observers within each case duced registrations that resulted in highly overlapping PTV structures. Visual analysis of the differences between sDRR and dDRR-based patient positioning for case 7 pictured different positioning strategies. For the dDRR, all observers aligned the vertebral column in exchange of a slight mismatch for the head. The opposite was found for the sDRR, where the head was aligned, and the vertebral column slightly mismatched. The largest difference between the two modalities was found in the I-S direction and rotation around R-L. This was not seen for the CBCT evaluation, where the sCT and CT positioning differences was within the standard deviation for all translation and rotation directions. Visual analyzis of case 14 showed no obvious reason for the deviation, and no trend was seen in any translation or rotation for all observers except one. That single observer had mistaken the GTV guideline for vertebrae guideline, leading to a slightly larger mismatch. Performing a rigid registration in an anatomical region where bone structures are able to move with respect to each other (i.e., flexibility in the neck despite immobilization) could give unclear results since not all structures will be aligned after a successful registration. Therefore, it is difficult to mitigate different registration strategies as an unsuccessful registration. Nevertheless, the mean DSC of 0.96 for both cases shows highly overlapping PTV structures using sDRR or dDRR data.
Using observers for 2D patient positioning evaluation mimics the typical clinical scenario but will also reduce the reproducibility. An alternative to that strategy would be to use an automated patient positioning algorithm. However, these algorithms are not yet widely available, and the positioning still completely relies on the work of RTTs. Other limitations in this study are the low number of landmarks and patient positioning observers, and the number of patients included.
In the future, we intend to investigate the remaining part of the MRI-only workflow (i.e., delineation) and develop a patient-specific quality assurance that the generation of sCT brings, with the ultimate goal to treat H&N cancer patient in an MRI-only workflow.
In conclusion, this study evaluated the use of sCT data as reference for patient positioning using CBCT and orthogonal projections for the H&N anatomy allowing six and five degrees of freedom, respectively. The differences between CT and sCT-based positioning were comparable, allowing sCT to replace CT as reference for patient positioning at treatment.