Evaluation of four surface surrogates for modeling lung tumor positions over several fractions in radiotherapy

Abstract Patient breathing during lung cancer radiotherapy reduces the ability to keep a sharp dose gradient between tumor and normal tissues. To mitigate detrimental effects, accurate information about the tumor position is required. In this work, we evaluate the errors in modeled tumor positions over several fractions of a simple tumor motion model driven by a surface surrogate measure and its time derivative. The model is tested with respect to four different surface surrogates and a varying number of surrogate and image acquisitions used for model training. Fourteen patients were imaged 100 times with cine CT, at three sessions mimicking a planning session followed by two treatment fractions. Patient body contours were concurrently detected by a body surface laser scanning system BSLS from which four surface surrogates were extracted; thoracic point TP, abdominal point AP, the radial distance mean RDM, and a surface derived volume SDV. The motion model was trained on session 1 and evaluated on sessions 2 and 3 by comparing modeled tumor positions with measured positions from the cine images. The number of concurrent surrogate and image acquisitions used in the training set was varied, and its impact on the final result was evaluated. The use of AP as a surface surrogate yielded the smallest error in modeled tumor positions. The mean deviation between modeled and measured tumor positions was 1.9 mm. The corresponding deviations for using the other surrogates were 2.0 mm (RDM), 2.8 mm (SDV), and 3.0 mm (TP). To produce a motion model that accurately models the tumor position over several fractions requires at least 10 simultaneous surrogate and image acquisitions over 1–2 minutes.


| INTRODUCTION
Respiratory motion is a challenge in lung cancer radiotherapy. The motion is patient-specific and can be highly irregular. 1,2 Several methods can be used to mitigate the effect of the motion, as reviewed by Lu et al. 3 One alternative is to adapt the treatment delivery to motion, for example, by gating or tracking. Although promising and innovative solutions are tested, such as extra xray sources for imaging at the treatment unit, 4 imaging during treatment often suffers from limited image contrast. 5 Other alternatives are to reduce the motion by patient guidance, breath-hold or abdominal compression, 6 but reduced patient comfort and issues in patient compliance might inhibit those to be practical solutions.
Gating/tracking or breathing interventions measure might not be accessible or feasible, implying a need for static treatments with free breathing. That requires geometrical margins to encompass the tumor motion so that the prescribed dose can be delivered to the tumor. Breathing motion surrogates, such as the vertical position of a box on the patient's thorax 7 or data from surface scanning 8 and spirometry, 9 have been used for many years to detect lung motion during treatment and fourdimensional computed tomography (4DCT). However, in order to be useful, the surrogate data must correlate with the tumor position to model the tumor position accurately. A motion model enables the tumor position to be modeled by any values within certain ranges of the surrogate signals. To acquire training data for such a model, images to capture the tumor motion and surrogate data need to be acquired simultaneously so that the model parameters can be determined. Several motion models have been presented, and a request for clinical validation research was prompted in a comprehensive review by McClelland et al. 10 Motion models can also be used during a 4DCT to avoid introducing image artifacts. 11,12 One type of motion model is statistical motion models for which internal motion is determined from a deformable image registration between the training set and a reference image and a principal component analysis is used to reduce the dimensionalities of the complex motion patterns. A surrogate is used to correlate with the internal motion indirectly via the eigenvalues. 13 However, for the small volume treated in non-small cell lung cancer (NSCLC) radiotherapy, rigid registration might be a sufficient method to detect the tumor motion and its vicinity in a fast and problem-oriented way. This also eliminates the risk of having folding effects that might occur for deformable image registration, that is, when the anatomy is deformed beyond its physical limitation and image information is lost or duplicated.
It has been shown that a spirometer reading of the volume of air ventilated during normal breathing (i.e., tidal volume) has a stronger correlation to the tumor position than the vertical position of a marker box place on the thorax of the patient. 9 Using the tidal volume and its time derivative (i.e., airflow) seems promising for intra-fractional position modeling. 11,12 The model introduced by Low et al. 11 is a simple method that easily adapts to the current clinical workflow and enables both intra-and inter-cycle variations by including both amplitude and hysteresis variations. A body surface scanning system has been successfully used to scan the skin contour of the patient together with tumor 2D images. 14 Also, a surface-derived volume (SDV) has been shown to have a higher correlation to the spirometer readings for healthy volunteers, compared to a gating point on the thorax for which the vertical position of the thorax was measured. 15 However, there are still some issues with the methods that have been tested. Spirometers have a drift in the signal and might also be uncomfortable to have in the mouth for a long time. 9 The time for evaluation is often limited, 16 or the models are trained on 4DCT data even though the 4DCT data have limited information about inter-cycle variations and is by definition already a result of an applied simple motion model where the phase or amplitude values of a surrogate has been used in the reconstruction process. [17][18][19][20][21] Reasonably, to train a model that enables modeling of inter-cycle variations, the model must also be trained on image data for several breathing cycles, for example, cine CT acquisition 22 or a model-based 4DCT technique where the 3D volume is acquired rapidly without the need for stitching slices based on phase or amplitude. 23 To the best of our knowledge, none has trained the Low's motion model on high temporal cine CT data and verified the performance on lung cancer patients over several minutes and fractions.
Hence, in this paper, we aim to evaluate the performance of Low's model for four different surface surrogate measures by evaluating the tumor position residuals for several fractions. We also aim to determine the minimum number of data points, that is, the number of surrogate and image acquisitions needed in the model's training set to yield a motion model that predicts the tumor motion as precisely as possible.

| Tumor position data
After ethical approval and informed consent, 14 nonsmall cell lung cancer patients were recruited. They were CT-scanned (Philips Brilliance ® CT Big Bore 16 slice) at three sessions. The first session was just after the treatment planning CT acquisition and the other two were in conjunction with treatment sessions. For each imaging session, 100 cine CT images were acquired quasirandomly for 8 minutes with a frequency of about one cine CT every 5 th second, that is, 0.2 Hz. The cine acquisitions were made by a radiotherapy technologist (RTT) by pressing an exposure button at randomized times. To keep attention, the RTTs were guided by a simple Pong-like script which moved a ball from side to side with randomized speed yielding a beep sound each time of exposure when the ball hit the wall. Each cine image was 16 × 1.5 mm thick with 0.5 × 0.5 mm 2 pixels with a recording time of 0.3 seconds. Due to a change in the study protocol, to omit a coached part of the acquisition tested for the first patients, only 50 cine images acquired per session with limited surrogate extraction were used for the first three patients. The tumor cranial or caudal part was centered in the collimator opening of 24 mm, and the table position was kept stationary at all cine CT acquisitions. The tumor position was determined by rigidly registering the tumor region in the cine CT images to a reference breath-hold CT. The median of all registrations per session was subtracted to yield intrafractional tumor position data, independent of potential patient setup errors. Hence, the intra-fractional movement could be compared over several sessions. Details of the acquisition procedure are given in Wikström et al. 1

| Surrogate methods and measures
Simultaneously to the CT capture, the patient body contour was monitored by a body surface laser scanning (BSLS) system (Sentinel, C-rad AB). A research version of the BSLS software was used to triangulate the points captured from the reflection of laser lines projected at 11 + 3 cranial-caudal positions on the patient, c.f. Figure 1.
The system recorded the 3D position of 500-1 000 data points at 3 Hz to form polygonal chains of the laser lines. The group of 11 lines was evenly distributed to cover 22.5 cm. Due to the oblique angle of incidence, minor sliding of captured data occurred along the craniocaudal direction during breathing. To compensate for this, the scanned point cloud was linearly interpolated to capture heights at fixed lateral and craniocaudal coordinates ( Figure 1). Four surrogate measures were extracted from the scanned body contour as listed in Equation (1).
• The abdominal point (AP) was defined close to the umbilicus, nominally 15 cm caudally from the xiphoid process, and monitored the vertical height. The distance of 15 cm was, however, changed sometimes to compensate for length variations between patients. • The radial distance mean (RDM) was defined as the mean of all distances from the origin to each captured point s p of the laser line reflections constituting polygonal chains. • The surface derived volume SDV was defined as the volume between a horizontal plane at origin and a region of interest (ROI) of size 15 × 15 cm 2 of the central part of the surface. Since the SDV is used as a relative measure in this work, the level of the volume's bottom plane is not important, given that it never intersects the surface ROI. The reason for using the ROI of the central part of the surface was to avoid the intermittent detection of lateral points due to temporal shadow effects and poor reflection at the sides. A rectilinear grid of 2 cm grid spacing was used to integrate the volume below the ROI numerically. The balance between accuracy and calculation speed was considered in the choice of grid spacing. • The thoracic point (TP) was defined at the xiphoid process and measured the vertical height.
The BSLS system delivered the scanned TP and AP data in real-time, time-stamped, and stored to a file together with the recorded room coordinates of all points constituting the polygonal chains. Also, the CT beam on and off signals were detected, time-stamped, and stored to the same file using a cable connecting the surface scanning computer and the pulmonary port at the CT unit. The SDV and RDM were extracted retrospectively in MATLAB (The MathWorks Inc).
In summary, we have the following four surrogate measures (c.f. Figure 1) The multi-colored surface represents the patient's thorax-abdomen region seen from the left. The craniocaudal direction is along the y axis with the lateral plane given by the x and y axes. The 11 laser lines scanned to capture the patient outer contour are shown in dark red. The three green lines show the lines used to determine the height of the AP. The RDM is derived from the distances from the systems' origin to the discrete data points along the red lines. The distance separating the laser lines was chosen to balance resolution, coverage area, and acquisition frame rate. The SDV was calculated as the volume under the black dots at positions given in the lateral plane in a 15 × 15 cm 2 region, illustrated as a box with grey sidewalls. The three nearest dark red lines to the TP were used to determine its height. All four surrogates except RDM had fixed lateral plane coordinates with the height interpolated from closest the laser lines. All surrogates were baseline adjusted over time with a symmetrical median window spanning full exhalation amplitudes of seven breathing cycles where Δ AP (t) , Δ RDM (t) , Δ SDV (t), and Δ TP (t) represent baseline adjustments calculated by the interpolation of a running median filter using a symmetrical window of full exhalation amplitudes from seven breathing cycles. The vector s p reaches from the origin to the point p along the polygonal chains.

| Model training and evaluation
In this work, we used a previously published motion model by Low et al., 11

| Data analysis
The analysis was divided into two parts as summarized in Table 1. In the first part, the aim was to find the best surrogate, the training set contained all surrogate and image acquisitions from the first session, and the validation set contained all surface and image acquisitions from sessions 2 and 3. The surrogate that produced the smallest combined and 90 was used for the second part of the analysis. In the second part, the number of surrogate and image acquisitions used for training was varied from 4 to 100 in varying steps to investigate the achieved level of precision in modeled tumor positions during sessions two and three.

| R ESULTS
We found that the AP surrogate yielded the smallest residual error in modeled tumor position, see Table 2. The irregularities in the breathing trace were quite pronounced for some of the patients. For example, the four surrogate measures for patient 5's first session are shown as a function of time in Figure 2. Occasionally, TP demonstrates a small time lag compared to the other measures, and TP was also more prone to drift. In Figure 3, we illustrate the function of the model by applying it to four cycles of the AP surrogate from the validation data set for patient 9.
In the example, the modeled continuous tumor trace is shown together with the measured and modeled tumor positions at the times of the five cine image acquisitions during these cycles, A-E. The residual error between the modeled and the measured tumor position was, in this case, very small for all points except point E where an out-of-plane motion of about 2 mm occurred. The entire tumor trace from both the training and validation sets is shown in Figure 4, together with the discrete modeled and measured tumor positions. In a range of 10-20 concurrent surrogate and image acquisitions, the 90% no longer reduces with more training data, as shown in Figure 5. This is equivalent to an acquisition time of 1-2 minutes with the current methods. The correlations between the modeled and measured tumor positions per surrogate measure are merged for all sessions and patients and are shown in Figure 6.

| DISCUSSION
The residuals in modeled tumor positions were analyzed over several minutes and sessions by comparing  24,25 or internal air content, 9 that is, two measures closely linked to lung tumor motion. 26,27 The AP is a simple measure already implemented in the clinical version of the body surface scanning system. Instead of cine CT, several fast helical CT scans could cover a larger volume while keeping the imaging dose to a low level. 28 Thomas et al. 29 reported that five images could be sufficient to build the motion model with sufficient accuracy; however, in contrast to this study, the authors validated the model by a "leave-one-out" method of the 25 scans during the same setup instead of, as in this study, creating the model at one session and testing it on 100 cine CT images at two other sessions.
The SDV has previously shown promising results to have a strong correlation to tidal volume, measured by a spirometer 15 and 2D diaphragm position. 17 Spirometer data have also been shown to have a strong correlation to the inner anatomy. 9 However, in this work, the SDV produced less accurate modeled tumor positions than both AP and RDM. Furthermore, a benefit of using RDM instead of SDV is that it does not need sliding surface interpolation or a volume calculation; hence it is more calculation efficient and facilitates use in real-time. Since hundreds of points are collected along the laser lines, the RDM is assumed to be relatively robust for intermittent detection failures on the patients' sides. The surface declination at the sides causes the captured points to be relatively sparse, producing large volume changes if a point at the edge disappears. However, the change in RDM for such events is assumed to be negligible.
Fayad et al. 18 investigated the correlation between the patient contour and anatomical landmarks by extracting the body contour from 4DCT scans for 10 patients. In line with our findings, they found the strongest correlation for the abdominal region. Kauweloa et al. 14 also showed promising correlations between an abdominal ROI of the body surface and the 2D lung tumor position for phantom measurements and four patients. However, they found a drift in the signal leading to a recommendation of not using the system for amplitude sorted 4DCT. No systematic drift was detected in our work. In a phantom study by Jönsson et al., 30 they concluded that the BSLS produced an accurate signal for 4DCT reconstruction, although the shape of the breathing trace differed slightly between the available commercial systems they tested. This led to the conclusion that the same system used for treatment planning should be used during treatment.  The TP was occasionally lagging the other surface surrogates slightly. The reason for this was that the abdominal part elevated before the thorax during inspiration. This wave-formed inhalation was also seen in time-lapses of Figure 1 and has also been detected by other authurs. 31

F I G U R E 2
Breathing trace for the first session of patient five showing data of the TP, AP, RDM, and SDV. Before use as model input, all signals were subtracted by the red baseline detected by a moving median filter over the full expiration phases. Grey vertical lines show the times for cine CT acquisitions. The acquisition time per cine CT was 0.3 seconds. The short pause in the middle (after about 240-250 seconds in the example above) is the loading time since the cine CTs were grouped 50 by 50 in two acquisition protocols in the CT software. A time lag compared to the other surrogates was occasionally seen for TP, for example, for third cine acquisition F I G U R E 3 Four breathing cycles in the validation set for patient nine are shown as an example of model behavior. A, The AP surrogate amplitude and the times for the cine image acquisitions A-E. B, The modeled tumor trace for these four breathing cycles is shown as a continuous grey line. Red thin arrows illustrate the residuals between the modeled (green dots) and measured (blue dots) tumor positions. The pink and dark red arrows show the amplitude and time derivative vectors x A and x Aʹ , respectively. The data are shown projected to the plane of the two major principal axes. The inserted stick man shows the orientation of the patient The choice of using the Low model was based on previously published data showing the potential of accurately model points in the lung regardless of irregular breathing. 11 More sophisticated models 10 or machine learning approaches might improve the accuracy or produce population-based correlations between surface and tumor motion. However, this was out of the scope of this work.
In this study, the evaluated surrogates were determined at fixed lateral and longitudinal coordinates. A possible improvement might be applying a deformable mesh to compensate for sliding motion during breathing to monitor the same region of the patient's skin.
Schaerer et al 32 investigated the accuracy of a body surface scanning system to detect fixed surface landmarks at the abdomen of the patient by applying a deformable mesh of the body contour for five healthy volunteers. They concluded that the accuracy between star-shaped feature points attached to the patient and points determined from deformed patient body contours were decreased from 3.6 mm for the initial condition to 1.1 mm after deformable registration. However, this sliding adjustment will primarily affect surface surrogates that slide over gradients in the cranial-caudal direction. Hence, given that the surface is rather flat in This study only considered tumor motion and excluded the breathing motion of other parts of the body. All motions that affect beams penetrating the patient should be considered during treatment planning to achieve high accuracy dose calculations, particularly for particle therapy. For treatment planning, margins for setup errors, segmentation uncertainties, etc., must be added to the results.
The study protocol change after patient three is assumed to have a minor impact on the results since the data are heavily oversampled. Nevertheless, since the acquisition time is 4 minutes instead of 8 minutes for the first three patients, long-term drifts might have been undetected; hence the deviation between modeled and measured tumor positions might be slightly underestimated for the first three patients compared to the other patients. However, no indication of this was seen.
Training data must contain simultaneously acquired data pairs of surrogate and tumor positions. Cine CT images are therefore more suitable to train a motion model rather than 4DCT images. Besides the limitation of 4DCT to measure inter-cycle variation, it has not welldefined values for irregular breathing. In phase sorting, each slice in a phase bin is acquired with equal phase but can differ in amplitude and its derivative, hence not giving well-defined values of the model's independent variables. In amplitude sorting, the amplitude is welldefined, but its derivative is still not well-defined since it may stem from different phases and breathing cycles. It is not feasible to handle each 4DCT slice separately to circumvent these problems since it commonly only represents a few millimeters field-of-view in the craniocaudal direction and during a minimal time.
By means of a motion model, the tumor trajectory can potentially be extensively simulated for a more precise delineation of the internal target volume (ITV) to facilitate better dose coverage. Further work will evaluate the benefit of using motion model-created ITVs compared to 4DCT-based ITVs.

| CONCLUSION
In conclusion, the accuracy in modeled tumor positions was determined for four surface surrogate measures. The surrogate AP produced the most accurate modeled tumor positions, and at least 10 concurrent surrogate and image acquisitions are required to achieve a motion model that accurately models the tumor positions for several fractions.

A C K N O W L E D G M E N T S
The authors thank C-rad for technical and partial financial support and the Medical Radiation Physics department for financial support. The following founds are also acknowledged for financial support: Regional agreement on medical training and clinical research (ALF) between Uppsala County council and Uppsala University, Research foundation Stiftelsen Onkologiska Klinikens i Uppsala Forskningsfond. We also acknowledge the RTTs (Christina, Charlotta, Tora, Unni, Farnaz) who kindly acquired all the cine CT images and Lars Lindhagen at Uppsala Clinical Research Center for consultancy regarding statistical questions.

C O N F L I C T S O F I N T E R E S T
The research was partly sponsored by C-rad AB, Uppsala, Sweden. Ethics and consent: The local ethics committee approved on 13 of May 2008 (D-nr: 2008-193).

A U T H O R C O N T R I B U T I O N
Ulf M Isacsson: Substantial contributions to the conception and design of the work and analysis of the data. Revised the manuscript critically for important intellectual content. Approved final version and agreed to be accountable for all aspects of the work.
Kristina M Nilsson: Substantial contributions to the conception of the work. Segmented all the used structures. Revised the manuscript critically for important intellectual content. Approved final version and agreed to be accountable for all aspects of the work.