Clinical MR imaging in Parkinson’s disease: How useful is the swallow tail sign?

Abstract
Background: With conventional MRI, no Parkinson's disease (PD)-specific abnormalities can be detected. However, there is a critical need for accompanying neuroimaging markers to guide the diagnosis. With high-resolution susceptibility-weighted MRI (SWI) sequences, the imaging of nigrosome-1 (N1) is possible. The so-called swallow tail sign (STS) has been proposed as a suitable neuroimaging marker for the diagnosis of PD.
Objectives: To investigate whether the absence of the STS can be used to distinguish PD patients from healthy controls (HCs).
Methods: SWI images of 44 PD patients and 50 age- and gender-matched HCs were investigated using a 3T MRI scanner. Two trained neuroradiologists blind-rated the images and evaluated whether the STS was absent (1) on one side or (2) on both sides of the participant's midbrain.
Results: Our results confirmed good interrater reliability comparable to previously published studies. However, we did not identify any group differences between PD patients and HCs. Measures of diagnostic value revealed overall poor diagnostic performance.
Conclusions: In contrast to previous reports, our study does not confirm the potential use of the STS as a supportive neuroimaging marker for PD in a clinical setting. There is a critical need for improvements in N1-targeted MRI sequences and the development of advanced segmentation algorithms.

the thorough exclusion of absolute exclusion criteria (e.g., supranuclear gaze palsy), the absence of red flags (e.g., the rapid progression of gait impairment requiring regular use of a wheelchair within 5 years of disease onset), and the presence of at least two supportive diagnostic criteria (e.g., a clear beneficial response to dopaminergic therapy, which was present in all investigated individuals of the disease group).

| MRI acquisition and analysis
Structural MR imaging was performed at the CBBM Core Facility Magnetic Resonance Imaging of the University of Lübeck using a 3-T Siemens Magnetom Skyra scanner equipped with a 64-channel head coil. For SWI, a 2D gradient echo (GRE) sequence was acquired with the following parameters: TR = 27 ms; TE = 20 ms; MTC off; SWI on; flip angle 15°; 1 × 1 × 2 mm³ resolution; 120 × 220 × 220 mm³ field of view; acquisition time 4.54 min; transversal orientation; phase-encoding direction P >> A. The repeated measurements were coregistered before averaging. The chosen MRI protocol parameters adhere to a standard and widely available SWI protocol as often used in clinical practice (e.g., for evaluating microbleeds). We employed this approach to ensure the generalizability of our findings in a clinical setting. T1 imaging: Additionally, structural images of the whole brain were acquired using a 3D T1-weighted MP-RAGE sequence (TR = 1900 ms; TE = 2.44 ms; TI = 900 ms; flip angle 9°; 1 × 1 × 1 mm³ resolution; 92 × 256 × 256 mm³ field of view; acquisition time 4.33 min; sagittal orientation; phase-encoding direction A >> P) and evaluated by a trained neuroradiologist to rule out the presence of conflicting structural lesions or relevant comorbidities (e.g., normal pressure hydrocephalus or vascular parkinsonism). Two trained neuroradiologists (A.N., P.S.) blind-rated the SWI images and evaluated whether the STS was absent on one (1) or both sides (2) of the participant's midbrain. Both raters had access to all axial slices for the assessment of the STS. The accuracy of the classification was assessed against the clinical diagnosis of PD as the gold standard.

| Statistics
Interrater reliability (IRR) was assessed with Cohen's kappa statistic; in cases of disagreement, a consensus was subsequently reached through personal discussion between both raters. Chi-square tests (χ²) were employed to evaluate group differences regarding the one- or two-sided absence of the STS in PD patients compared to HCs. Contingency tables were calculated for measures of diagnostic value (e.g., sensitivity; see Table 1). In addition, the area under the receiver-operator characteristics curve (ROC-AUC) was calculated for both conditions to assess the overall diagnostic performance. All
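As an illustration of how the measures of diagnostic value follow from a 2 × 2 contingency table, the following is a minimal Python sketch. The class and function names are hypothetical, and the counts are invented for illustration only; they are not the study data. For a single binary rating the ROC curve has only one operating point, so the sketch assumes the trapezoidal ROC-AUC, which then reduces to the mean of sensitivity and specificity.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """2 x 2 contingency table; a 'positive' rating means the STS was rated absent."""
    tp: int  # patients rated STS-absent
    fn: int  # patients rated STS-present
    fp: int  # controls rated STS-absent
    tn: int  # controls rated STS-present


def diagnostic_metrics(c: ConfusionCounts) -> dict:
    """Return sensitivity, specificity, PPV, NPV and single-point ROC-AUC."""
    sensitivity = c.tp / (c.tp + c.fn)
    specificity = c.tn / (c.tn + c.fp)
    ppv = c.tp / (c.tp + c.fp)
    npv = c.tn / (c.tn + c.fn)
    # One operating point only: the trapezoidal AUC equals the mean of
    # sensitivity and specificity (0.5 corresponds to pure chance).
    auc = (sensitivity + specificity) / 2
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "auc": auc,
    }


# Hypothetical counts, NOT taken from the study.
example = ConfusionCounts(tp=30, fn=10, fp=20, tn=40)
metrics = diagnostic_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

An AUC close to 0.5 in this setting means that sensitivity and specificity roughly sum to 1, that is, the rating separates the groups no better than chance.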

| Our study showed reasonable IRR for the assessment of the STS
Kappa statistics showed substantial interrater agreement for the unilateral absence of the STS (κ = 0.67, p < .001) and moderate agreement for the bilateral absence of the STS (κ = 0.59, p < .001), according to the classification of Landis and Koch (Landis & Koch, 1977).
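For readers unfamiliar with the statistic, the following minimal Python sketch shows how Cohen's kappa can be computed from two raters' labels and mapped to the Landis and Koch categories. The function names and the example ratings are hypothetical (1 = STS rated absent, 0 = STS rated present) and do not reproduce the study ratings.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equally long sequences of categorical ratings."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of cases on which both raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal frequencies per category.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    categories = set(freq_a) | set(freq_b)
    expected = sum(freq_a[k] * freq_b[k] for k in categories) / n**2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


def landis_koch_label(kappa):
    """Verbal agreement category after Landis and Koch (1977)."""
    if kappa < 0:
        return "poor"
    bands = [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
             (0.80, "substantial"), (1.00, "almost perfect")]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical ratings of ten scans, NOT the study data.
ratings_a = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
ratings_b = [1, 1, 0, 0, 1, 0, 0, 0, 1, 1]
kappa = cohens_kappa(ratings_a, ratings_b)
print(f"kappa = {kappa:.2f}")
print(landis_koch_label(0.67), landis_koch_label(0.59))
```

Applied to the values reported above, the mapping yields "substantial" for κ = 0.67 and "moderate" for κ = 0.59.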

| No significant group differences could be observed
No significant group differences were present between PD patients and HCs, either for the unilateral absence of the STS (χ²(1) = 3.097, p = .078, n = 94) or for the bilateral absence of the STS (χ²(1) = 0.324, p = .569, n = 94). Measures of diagnostic value (e.g., sensitivity) are presented in Table 1. In summary, none of the diagnostic metrics provides any benefit for clinical decision support, which is additionally highlighted by the poor ROC-AUCs (see Table 1).
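The link between a χ² statistic with one degree of freedom and its p value can be sketched in a few lines of Python. The function names are hypothetical, and the 2 × 2 formula below assumes a Pearson chi-square test without continuity correction; whether a correction was applied in the study is not stated. The example table is invented, whereas the two χ² values passed to the p-value function are those reported above.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for the table
    [[a, b], [c, d]], e.g. rows = group, columns = STS absent / present."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


def p_value_df1(chi2):
    """Upper-tail probability of the chi-square distribution with 1 df.
    With Z standard normal, P(Z^2 > x) = erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(chi2 / 2))


# Hypothetical 2 x 2 table, NOT the study data.
stat = chi_square_2x2(10, 20, 30, 40)
print(f"chi2 = {stat:.3f}, p = {p_value_df1(stat):.3f}")

# The chi-square values reported in the text.
p_unilateral = p_value_df1(3.097)
p_bilateral = p_value_df1(0.324)
print(f"unilateral: p = {p_unilateral:.3f}; bilateral: p = {p_bilateral:.3f}")
```

Under these assumptions, the reported statistics correspond to p values of about .078 and .569, in line with the figures given in the text.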

| DISCUSSION
In summary, our study is methodologically well in line with previously published reports. Our IRR illustrates that the overall detection of the STS was robust and provides general comparability to other already published studies (Rizzo et al., 2019). The unilateral absence of the STS may be more suitable for further studies, as highlighted by the overall better IRR and diagnostic performance metrics, with a trend toward a difference between PD patients and HCs. However, we were not able to demonstrate an unequivocal diagnostic value for identifying PD patients, which stands in clear contrast to previous reports (Mahlknecht et al., 2017). Our study included a comparable number of cases, indicating that the lack of relevant group differences is not driven by a lack of statistical power (Bae et al., 2016; Sung et al., 2016). However, the number of subjects per group was unbalanced in some previous studies, which may skew the diagnostic performance toward falsely better sensitivity and positive predictive values (Bae et al., 2016). Our SWI sequence is well in line with former studies performed on 3T MRI scanners and is routinely applied in clinical practice. The post-mortem study of Kau et al. (2019) identified microvessels within the SNpc as a potential confounder for SWI measurements: In eight out of nine HCs, one or more microvessels were detected medial to the STS or at least unilaterally in the medial part of the STS formation. Intrinsic vessels of the midbrain dopaminergic system may occasionally be responsible for false-positive identification of the masked STS in the assessment of the dorsolateral SNpc (Figure 1). Therefore, both iron deposits and microvessels might contribute to the hypointense signal surrounding N1 in the SWI of normally aged midbrains without being specific to PD (Postuma et al., 2015).
However, anatomical variation is unlikely to be present only in our study and, therefore, cannot be the sole cause of these conflicting results. In another study, Oustwani et al. found that the STS was absent in 21% of HCs, implicating a considerable number of false-positive ratings in HCs (Oustwani et al., 2017), which is even more pronounced in our

TABLE 1 Summary of the diagnostic performance metrics in both conditions following previous rater consensus agreement
Note: In general, the absence of the STS tends to be more specific than sensitive. However, the diagnostic value is negligible given the high false-positive and false-negative rates. The low diagnostic performance is summarized in the ROC-AUC values, which did not significantly outperform pure chance.
Abbreviations: NPV, negative predictive value; PPV, positive predictive value; ROC-AUC, receiver-operator characteristics area under the curve; STS, swallow tail sign.
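The remark in the Discussion that unbalanced group sizes can inflate positive predictive values can be made concrete with a short calculation. The following Python sketch applies Bayes' rule to fixed test characteristics while varying the share of patients in the sample; the function name and all numbers are hypothetical and chosen only for illustration.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """PPV via Bayes' rule; 'prevalence' is the share of patients in the sample."""
    true_positive_share = sensitivity * prevalence
    false_positive_share = (1 - specificity) * (1 - prevalence)
    return true_positive_share / (true_positive_share + false_positive_share)


# Hypothetical test characteristics, NOT estimates from this or any other study.
SENSITIVITY = 0.8
SPECIFICITY = 0.8

for share_of_patients in (0.2, 0.5, 0.8):
    ppv = positive_predictive_value(SENSITIVITY, SPECIFICITY, share_of_patients)
    print(f"patients = {share_of_patients:.0%} of sample -> PPV = {ppv:.2f}")
```

With identical sensitivity and specificity, the PPV in this example rises from 0.50 to 0.94 as the share of patients in the sample grows from 20% to 80%, which is why PPVs from samples with differing group balance are not directly comparable.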