Pre‐treatment analysis of non‐rigid variations can assist robust intensity‐modulated proton therapy plan selection for head and neck patients

Abstract Purpose To incorporate small non‐rigid variations of head and neck patients into the robust evaluation of intensity‐modulated proton therapy (IMPT) for the selection of robust treatment plans. Methods A cohort of 20 nasopharynx cancer patients with weekly kilovoltage CT (kVCT) and 15 oropharynx cancer patients with weekly cone‐beam CT (CBCT) were retrospectively included. Anatomical variations between week 0/week 1 of treatment were acquired using deformable image registration (DIR) for all 35 patients and then applied to the planning CT of four patients who have kVCT scanned each week to simulate potential small non‐rigid variations (sNRVs). The robust evaluations were conducted on IMPT plans with: (1) different number of beam fields from 3‐field to 5‐field; (2) different beam angles. The robust evaluation before treatment, including the sNRVs and setup uncertainty, referred to as sNRV+R evaluation was compared with the conventional evaluation (without sNRVs) in terms of robustness consistency with the gold standard evaluation based on weekly CT. Results Among four patients (490 scenarios), we observed a maximum difference in the sNRV+R evaluation to the nominal dose of: 9.37% dose degradation on D 95 of clinical target volumes (CTVs), increase in mean dose (D mean) of parotid 11.87 Gy, increase in max dose (D max) of brainstem 20.82 Gy. In contrast, in conventional evaluation, we observed a maximum difference to the nominal dose of: 7.58% dose degradation on D 95 of the CTVs, increase in parotid Dmean by 4.88 Gy, increase in brainstem D max by 13.5 Gy. In the measurement of the robustness ranking consistency with the gold standard evaluation, the sNRV+R evaluation was better or equal to the conventional evaluation in 77% of cases, particularly, better on spinal cord, parotid glands, and low‐risk CTV. Conclusion This study demonstrated the additional dose discrepancy that sNRVs can make. The inclusion of sNRVs can be beneficial to robust evaluation, providing information on clinical uncertainties additional to the conventional rigid isocenter shift.


INTRODUCTION
Intensity-modulated proton therapy (IMPT) offers the potential to limit dose to normal tissues for head and neck (H&N) cancer patients. However, anatomical variations in the radiation area increase dosimetric uncertainty during treatment delivery. 7,8 Progressive changes due to weight loss, tumour shrinkage, and parotid glands shrinkage were reported as 3.9%-25.5% (weight loss), 20%-60% (tumor shrinkage), and 21.3%-42% (parotid glands shrinkage). [9][10][11] Neck folds, neck tilts, spine flexions, and jaw and shoulder position changes also commonly occur during the H&N cancer treatment. 12,13 These small non-rigid variations (sNRVs) cannot be simplified as rigid translations and, unlike progressive changes that are patient-specific, sNRVs occur randomly. Current research in H&N proton therapy focuses on the development of adaptive strategies to mitigate the influence of progressive anatomical changes. In clinical practice, offline adaptive planning strategies are applied when a threshold of dose to a critical structure is reached. 14,15 This method is effective, but delays in implementing adaptive re-plans exist due to time required for imaging, re-planning, plan approval, and plan verification. This reactive approach to adaptive therapy poses workflow challenges for the busy clinical practice. To mitigate time delay during the offline adaptive process, the use of anatomical modeling was suggested. 16,17 Anatomical models can accurately predict the patients' progressive changes and can therefore be used to create adaptive plans in advance, which can be applied as soon as the adaption threshold is reached. Online adaption is intended for same-day application. However, due to computational limitations, online adaption either compromises on accuracy or constrains the optimizer. Matter et al. 18 used an analytical pencil beam algorithm to generate plans in 10 s. However, analytical calculations overestimate the target by 10% and underestimate some organs at risk (OARs) by up to 10 Gy. 19 Bobić et al. 20 constrained the optimizer to adjust the beamlet positions, energies, and beamlet weights to produce adapted plans. They reported a median adjustment time of 12 min excluding the time taken for deformable image registration (DIR). Lalonde et al. 21 only adjusted the weights of the beamlets to produce adapted plans, their median adjustment time was also 12 min but included the time for DIR. When plans are adapted either online or offline, the patient position may be different from the position in the image. sNRVs not captured during imaging will still be present.
In addition to adaptive planning strategies that mitigate the dosimetric impact of anatomical variability, evaluation of plan robustness is also used. 22,23 Set up and range uncertainty are considered in conventional robust evaluation. Treatment plan evaluation including inter-fractional anatomical variations often uses images acquired during the treatment, [24][25][26] and as such, they can only inform the planning process for a portion of the treatment delivery. A more complete robust evaluation including the possible sNRVs before treatment is crucial to design a plan that is robust toward these anatomical changes. Because sNRVs are not patient-specific, they can be included into robust evaluation to provide additional information before treatment. To our knowledge, studies have yet to reveal the dosimetric impact of sNRVs on proton therapy plans.
Range uncertainty in robust analysis evaluates the dosimetric impact of the systematic uncertainty in calculated range based on CT calibration and conversion to relative stopping power (RSP), while setup uncertainty reflects random errors throughout a course of therapy. This study focused on random errors. We aim to (1) establish the additional impact of sNRV, as a component of random setup error, over and above the rigid translation by building a distribution of possible sNRVs based on population data; (2) provide a robust evaluation method based on the probability distribution. The benefit of this new evaluation method was compared to the conventional robust evaluation, with gold-standard evaluation (after-treatment evaluation that used weekly repeated CTs) as the reference for quantification.

Patient data
Twenty nasopharynx cancer patients with weekly repeat CT and 15 oropharynx cancer patients with weekly cone-beam CT (CBCT) who received photon therapy were recruited retrospectively. We obtained deformations between week 0 (planning CT) and week 1 of treatment (the time between planning CT and treatment week 1 is 14 days, which is the standard time for treatment planning) for all 35 patients, creating a distribution of possible sNRVs based on the method described in Section 2.2. Examples of sNRVs are shown in Appendix A. Four nasopharynx patients who have weekly repeat CTs were randomly selected as test dataset, where we applied the 35 sNRVs to their planning CT. We evaluated the robustness of IMPT plans toward the uncertainty (see Section 2.3) applied to the test patients. We evaluated the following scenarios: (1) different number of fields from 3-field to 5-field plan; (2) different beam angles. The different beam arrangements used in this paper are listed in the upper part of Table 1 and illustrated in Appendix B. The targets (both tumor and nodal area) were split for different fields in these IMPT plans. All plans were robustly optimized using ±3 mm setup and ±3.5% range uncertainty in

Extracting small non-rigid variations from CT images
Anatomical variations during the first week of treatment are predominately due to sNRVs, whereas progressive changes (weight-loss, tumor shrinkage) are less significant. [27][28][29] Thus, the anatomical changes in the first week from a cohort of patients can be seen representative of a distribution of possible sNRVs.
The sNRVs of a cohort of patients (see Section 2.1) were captured using DIR.DIR finds the optimal deformation vector field (DVF) to achieve the greatest similarity between two images. We used stationary velocity fields (SVFs) v v v of diffeomorphic image registration to identify anatomical changes in this project. SVFs can easily be calculated from the inverse DVFs using: 30 To apply the deformations between groups of subjects, we need to project the SVFs into the atlas space, in which all the SVFs have the same position and resolution. The atlas was obtained from a groupwise registration which spatially normalized a cohort of patients [1]1 . 16,31 In the procedure of the projection, the planning CT (pCT) of each patient was the reference geometry, and the CT acquired during the first treatment week (CT t ) was registered to the pCT to produce v v v p→t , where p stands for pCT and t stands for the week (in this case t = 1) when the weekly CT acquired. Then, each patient's pCT was registered to the atlas to pro- P includes all the patients' data used in this study.
Then v v v a,p→t was transformed into the space of an individual patientp using The deformation v v vp →t was used for warping pCT to simulate an sNRV. Finally, in order to warp the planning image Ip, the transformation must be directed from the predicted anatomy to the pCT. This can be simply achieved by reversing the SVFs using The warped image CT sNRV was acquired from: with t = 1 for all the equations above. This method produced 35 CT sNRV s for each patient to represent the possible sNRVs. The diffeomorphic image registration is implemented in NiftyReg [1] . NiftyReg is an open-source DIR tool available as part of the NifTK project.

Robustness evaluation
We included the 35 sNRV scenarios of each test patient into the robustness evaluation using CT sNRV s. For the four test patients, the dose distributions of IMPT plans were calculated under each robustness scenario. We compared (1) the robust evaluation based on the sNRV scenarios and rigid translation with (2) the conventional setup setting that only includes rigid translation. Probability analysis was used in these two before-treatment evaluations to rank the robustness of IMPT plans for each robustly optimized dose metric listed in the lower part of Table 1.

Robustness evaluation scenarios
For our proposed evaluation method using the sNRV scenarios (1), we simulated the isocenter shift for each cardinal direction (x n ,y n ,z n ) following the Gaussian distribution with mean = 0 mm and standard deviation = 1.5 mm 25 on the 35 CT sNRV s. This was done to calculate the perturbed dose distributions caused by the sNRVs and rigid setup uncertainty, since the CT sNRV s have the same isocenter as the planning CT. The 35 dose distributions for each IMPT plan were included in this sNRV+R evaluation. The conventional evaluation (2) only include the rigid setup uncertainty by applying the same isocenter shifts used in the sNRV+R evaluation to the planning CT. This way, we achieved 35 perturbed dose distributions per IMPT plan, which were included to evaluate the plan robustness.

Probability analysis for robust evaluation
The workflow for the sNRV+R evaluation (1) and the conventional evaluation (2) is illustrated in Appendix C. Each considered dose metric D x (e.g., D 95 ) would have corresponding perturbed dose metrics under the different uncertainty scenarios. The nominal dose metric is subtracted from the perturbed dose metrics to form a distribution of dose metric discrepancies ΔD x experienced across the uncertainty scenarios.
The upper and lower boundaries of dose metrics in the evaluation can be demonstrated by the shaded areas in the nominal dose-volume histogram (DVH), as an indicator of worst-case scenarios. It was also suggested in the literature to include a probability approach in robust analysis. 32 For this, the distance between the probability distribution of ΔD x under uncertainty and its ideal probability distribution (Dirac delta function, the dose metrics do not change even under uncertainty) was calculated using the Wasserstein distance (WD) where U and I are the probability distribution functions of ΔD x under uncertainty and its ideal distribution, respectively. The WD measures the effort required to convert one distribution into the other. The smaller the WD, the more robust is a plan for this dose metric.

Performance analysis of robust evaluations
To investigate the effectiveness of sNRVs in indicating the plan robustness to inter-fractional anatomical changes before treatment, the dose discrepancy between accumulated dose using weekly CTs and the nominal dose was taken as the gold standard. In the gold standard evaluation, the dose distributions of the IMPT plans with different beam arrangements were calculated on six weekly CTs of each test patient. Because the accumulated dose is generally used in treatment evaluation and related to prognostics, the weekly dose was accumulated in the reference frame of the planning CT using the DIR algorithm of Niftyreg, referred to as Accu Nom . In the weekly dose calculation, although the isocenter was determined using the information from the rigid registration, the setup error (both rigid setup and sNRV) still existed. Thus, the difference between Accu Nom and the nominal plan, referred to as ΔD st , represents the influences from total random errors, including actual progression uncertainty and setup uncertainty (both rigid setup and sNRV).
Because different beam arrangements are used in this study, the robustness of beam arrangements can be ranked, referred to as robustness ranking. In the sNRV+R evaluation and the conventional evaluation, the WD is used in robustness ranking for each dose metric. In the gold standard evaluation, ΔD st is used in robustness ranking. To quantitatively validate that the performance of the sNRV+R evaluation is better than the conventional evaluation, the consistency C of the robustness ranking for a dose metric is calculated for each beam arrangement of each patient, using , and RP G (B i ) represent the robustness ranking position of a beam arrangement B i in sNRV+R evaluation, conventional evaluation and the gold standard evaluation, respectively. If C ≤ 0 for a dose metric, then this dose metric of beam arrangement B i supports that sNRV+R evaluation is better for robust evaluation, compared to the conventional rigid setup evaluation.

Dosimetric influences caused by small non-rigid variations
This section demonstrates the additional dosimetric influence caused by non-rigid setup uncertainty.
An example of the dose distribution difference caused by an sNRV is shown in Figure 1. The red arrows indicate areas where the dose has fallen under 95% of the prescription dose. With these simulated images of sNRVs and corresponding dose distributions, we can help clinicians to avoid non-rigid postures that can lead to unacceptable dosimetry.
The comparison between the sNRV+R evaluation and the conventional evaluation on an exemplary patient (patient 1) is shown in Figure 2. The upper and lower boundaries of dose metrics in the sNRV+R evaluation (2a) and the conventional evaluation (2b) are indicated by the shaded areas in Figure 2(a,b) separately. We observed that the additional sNRVs widen the bandwidth compared to the conventional robust evaluation. The detailed numbers of the sNRV+R evaluation and the conventional evaluation for four test patients are listed in Appendix D. Among four patients (490 scenarios), we observed a maximum difference in the sNRV+R evaluation to the nominal dose of: 9.37% dose degradation on the D 95 of CTVs, increase in parotid D mean by 11.87 Gy, increase in larynx D mean by 15.04 Gy, increase in brainstem D max by 20.82 Gy, increase in spinal cord D max by 20.96 Gy. For CTVs, 4 patients all had scenarios where the CTV D 95 fell below 95%, 47 out of 490 scenarios in total. In contrast, in conventional evaluation, we observed a maximum difference to the nominal dose of: 7.58% dose degradation on D 95 of the CTVs, increase in parotid D mean by 4.88 Gy, increase in larynx D mean by 6.13 Gy, increase in brainstem D max by 13.5 Gy, and increase in spinal cord D max by 12.9 Gy.
Please note that the worst-case CTV coverage (D 95 ) under setup uncertainty can drop below 95% in some cases. To generate 35 scenarios in conventional robust evaluation, we let the isocenter shifts follow a Gaussian distribution with mean = 0 mm and standard deviation = 1.5 mm. This results in multiple scenarios that can be used for statistical analysis, rather than only using the 12 scenarios usually encountered during robust optimization with 3 mm orthogonal shifts and ±3.5% range error. While we still used the usual 3 mm option to optimize the plan, the additional shifts created with the Gaussian distribution were used for the evaluation. Using this Gaussian distribution may result in scenarios where the shift exceeds 3 mm. However, only 4/490 scenarios were below 95%. Those scenarios only happened to patient 3 whose target volume was located close to the skin, making this particular patient more sensitive to setup uncertainties.
The comparisons of the dose metrics for this patient based on box plots are shown in Figure 2(c)-(h). Dose metrics for the different plans with different beam arrangements are shown in the same figures as box plots. By comparing boxplot of (c)-(e) (sNRV+R evaluation) to (f)-(h) (conventional evaluation) in Figure 2, the mean values of the CTVs' D 95 in the sNRV+R evaluation are lower than the values in the conventional evaluation ranging from −1.57% to −0.95% (range shows the differences between different beam arrangements). The mean values of parotid D mean , oral cavity D mean , and larynx D mean are higher than the values in conventional evaluation, ranging from 1.02 to 1.82 Gy, 0.52 to 0.70 Gy, and 0.84 to 3.18 Gy, respectively. The mean values of D max of spinal cord, optical nerve, and chiasm between the two evaluations only have slight differences, less than 0.6 Gy. Figure 2 only partially demonstrates the Dx under uncertainty. In Figure 3, we plot the probability distribution of ΔDx in the conventional evaluation and in the sNRV+R evaluation on high-risk CTV D 95 and parotid D mean , respectively, for patient 1. In Figure 3, we can see the influence caused by the sNRVs on the probability distribution of ΔDx from different beam arrangements. The robustness of a beam arrangement is presented by the closeness of the probability curve of beam arrangements to the Dirac delta function. For the high-risk CTV, the 3B 60 plan is the most robust (the ΔD mean curve of the 3B 60 is the closest F I G U R E 1 An example of dose distribution variations caused by a random small non-rigid variation (sNRV). (a) The dose distribution on the planning CT, and (b) the dose distribution of the same slice on a deformed planning CT. The green contour in images is the low-risk CTV. The color bar was chosen to mask out doses lower than the 95% prescription dose of low-risk CTV (82.4% is corresponding to 95% prescription dose of low-risk CTV). The red arrows indicate areas of underdosage caused by the sNRVs. (c) presents the difference in dose-volume histogram (DVH) caused by the sNRV to the Dirac delta function, indicated as the dashed vertical line) in the sNRV+R evaluation, as opposed to the conventional evaluation, where we find this beam arrangement to be the less robust one. For the parotid glands,in conventional evaluation (Figure 3c)),the 4B 120 is the most robust beam arrangement, while in sNRV+R evaluation (Figure 3d)), the 3B 60 is the most robust. The ΔD st from the gold standard evaluation validated that 3B 60 indeed is the most robust beam arrangement for the parotid D mean (please refer to the table in Appendix D).

Robust evaluation analysis
This section applies the ranking consistency to demonstrate the benefit of sNRVs for robust evaluation. Regarding the 10 robustly optimized dose metrics listed in the lower part of Table 1 for each beam arrangement, we calculated the percentage of dose metrics that supports the inclusion of sNRV in robust evaluation based on the data of four test patients. Referring to the table in Appendix D, we summarized the percentage of dose metrics satisfying C ≤ 0 (P C≤0 ) for each beam arrangement in Table 2. P C≤0 s are all above 70%, showing that overall including the sNRVs is beneficial to the robust evaluation of all beam arrangements.
We summarized the percentage of C ≤ 0 across all patients and beam arrangements for each dose metric in Table 3. Here, we can conclude that overall including the sNRVs is beneficial to robust evaluation for each dose metric, compared to only include rigid setup uncertainty.

DISCUSSION
Dose distributions in proton therapy are more sensitive to geometric changes than photon therapy. However, in previously published methods of robust evaluation, the impact of anatomical changes before treatment was not considered. In this paper, we demonstrated that including sNRVs into robust evaluation is beneficial.

The use of small non-rigid variations for robust beam selection
In the validation of sNRVs' role in robust evaluation, the dose discrepancy that represents the influence from inter-fractional anatomical changes and isocenter shifts was used as the gold standard. The consistency of robustness ranking showed that P C≤0 is higher especially on the spinal cord, parotid gland and low-risk CTV, which are closely related to outline changes and neck motions, and also on small structures really sensitive to the sNRVs such as cochlea and chiasm, supporting that sNRVs play a positive role in robust evaluation in terms of indicating robustness to inter-fractional anatomical changes.
The method proposed in this study can assist in selecting robust beam arrangements for proton plans without 4D optimization. In Appendix D, the p-values between the distributions of ΔD x in the sNRV+R evaluation and conventional evaluation showed that the sNRVs mainly influenced the probability distribution of CTVs ΔD 95 and parotid ΔD mean . The highest priority of the robust optimization for the four test patients in this study was to ensure target coverage. Similar performance of D 95% based on ΔD st was found on different beam arrangements, with differences smaller than 2%. To best demonstrate the advantage of the sNRV+R evaluation over the conventional evaluation, the beam arrangement was selected based on the impact of the sNRVs on the dose of parotid glands as an illustration. Also, the dose on parotid glands is closely related to toxicity such as xerostomia and dysphagia that can have a long-term impact on patients' quality of life. Here, for example, for patient 1, a similar parotid D mean was achieved using 4B 110 and 4B 120 . If 4B 120 was selected based on WD, the accumulated parotid D mean reduces by 0.7 Gy, which is corresponding to 1 fraction of D mean delivered to the parotid glands. Other organs can be used for beam selection as well, for example, for patient 1, the rank of the chiasm D max in the sNRV+R evaluation indicated the most robust beam arrangement as the gold standard evaluation.
There were two interesting scenarios worth noticing. In different beam arrangements for patient 1, even though the nominal parotid D mean of 3B 60 was the highest, the accumulated dose was lower than 4B 110 and 5B because 3B 60 was the most robust beam arrangement (the lowest WD) under sNRV+R uncertainty. The ΔD st of 3B 60 showed that 3B 60 controls ΔD st of the parotid D mean within 3 Gy,which is corresponding to 10% NTCP difference 33 and used to trigger replan to protect the parotid glands. This case indicates that beam selection based on robust evaluation can potentially reduce the replan rate, something that needs further investigation in the future. For patient 3, even though the nominal parotid D mean of 4B 120 was higher than in the 5B beam arrangements, the accumulated dose was the lowest because 4B 120 was the more robust beam arrangement. A message clearly emerged here is that the best nominal plan may not be the best plan during the treatment.
The impact of different beam angles on the robustness of a plan can be analyzed on the patientspecific geometry using our method. The results can be used to create a robustness plan database to assist to find a more robust planning approach as presented by McGowan et al. 22 and Malyapa et al. 23

4.2
The potential use of small non-rigid variations in clinic The distribution of sNRVs has the potential to be used in other clinical applications.
First, we found that the sNRV that leads to the most dose discrepancy varies from patient to patient and from beam arrangement to beam arrangement. This approach can help clinicians avoid the set-ups with the sNRVs that can lead to unacceptable dose distributions for a specific patient using a specific beam arrangement.
Second, the acquired sNRVs can potentially assist in better estimating the truly delivered accumulated dose using weekly CTs. The sNRVs can be randomly allocated to each weekly CT, with 5 sNRVs per weekly CT. These deformed weekly CTs can be used to estimate the daily dose distribution under the influence of sNRV. Repeating this procedure can reveal the range of potential accumulated doses for the whole treatment.
Third, this study presented the possibility of including sNRVs from a patient population to robust analysis, which also indicated the potential to be used in robust optimization. Mesías et al. 15 included the first two weekly CTs of patients into robust optimization to account for the sNRVs, suggesting that sNRVs can reduce the need for adaption. They indicated that the first two weekly CTs can be replaced by a series of CT images scanned before treatment. Li et al. 34 considered weekly CTs in the robust evaluation, and Yang et al. 35 added the adaptive planning CTs into robust optimization. However, their methods relied on the acquisition of CT images during the treatment, which limits the creation of a robust plan at the early planning stage. In contrast to their patientspecific approach, I suggest an atlas-based technique. While this approach is not patient-specific but based on the assumption that sNRVs are mainly random, there are some advantages. First, this method does not require the acquisition of a series of CT images of the same patient pre-treatment, therefore saving imaging dose and reducing workload. Second, assuming that sNRVs can be reasonably represented using this method, deformed images with the sNRVs can be prepared in advance and fully exploit the benefits of robust optimization with multiple CTs. This will be investigated in future studies.
It should be mentioned that the inclusion of a large patient cohort (many sNRV scenarios) would require recalculating the treatment plan many times. For efficiency, I suggest limiting the number of included sNRVs to the most common/frequent ones. The most common sNRVs can be found, for example, by using anatomical models which use principal component analysis applied to anatomical deformations of a patient cohort to estimate the likelihood of a certain anatomical deformation to happen. By only including the most likely principal components of the deformation into the robust evalu-ation, the number of recalculated plans can be reduced while still representing well the sNRVs. This trade-off will be explored in future work.
The concept presented here can be adapted to different scenarios. Here, I did not factor in immobilization equipment and patient characteristics such as age, size, disease staging, and physical condition. All those factors are likely to influence the possibility and the amplitude of a specific anatomical change to arise during the treatment. While this is not yet considered in this paper, the presented approach has the potential to do so. If sufficient patient data are used to build the atlas, the patient data can be stratified into groups of different immobilization devices and patient characteristics before performing the robust evaluation.

Limitations
For the purpose of showing the feasibility, different plans with different beam arrangements were only created for four test patients. Further validation of the methods will be conducted on a large number of patients. Another limitation of this work is that the impact of DIR accuracy was ignored on robust evaluation. We assumed that the DIR uncertainty would equally affect the robust evaluation for different beam arrangements. When validating the role of sNRVs in robust evaluation, we decided to not take the range uncertainty from Hounsfield units (HU) into account because range uncertainty is an isolated source considered in the robust evaluation and is solely based on the CT calibration. Therefore, it should only have small influence on the results of the comparison, which established that sNRV should be considered in the robust evaluation as a component of the random set up errors, not just rigid set up. However, to fully evaluate the plan, the range uncertainty should be used with the translation rigid setup and sNRVs. Please see Appendix E for an example of the further evaluation.

CONCLUSION
This study demonstrated the additional dose discrepancy arising from sNRVs and the robust plan evaluation method based on the probability distribution. The novelty of this study exists in three aspects: (1) compared with the conventional evaluation, we demonstrated that the inclusion of sNRVs can be beneficial to robust evaluation in terms of indicating plan robustness to inter-fractional anatomical changes.(2) This atlas-based method can help clinicians to choose plans that are robust against those sNRVs. (3) Our method can potentially provide multiple images for potential future 4D optimization.

C O N F L I C T O F I N T E R E S T
The authors declare that there is no conflict of interest that could be perceived as prejudicing the impartiality of the research reported.

DATA AVA I L A B I L I T Y S TAT E M E N T
Restricted access. The authors cannot make these data publicly available due to data use agreement.