Feasibility and Reliability of Automatic Quantitative Analyses of Mitral Annular Plane Systolic Excursion by Handheld Ultrasound Devices

Handheld ultrasound devices (HUDs) have previously been limited to grayscale imaging without options for left ventricle (LV) quantification. We aimed to study the feasibility and reliability of automatic measurements of mitral annular plane systolic excursion (MAPSE) by HUDs.

decades have examined handheld ultrasound devices (HUDs) as reliable and valuable complements to the physical examination. 2 Because of their easy accessibility, small size, and low cost, HUDs are increasingly becoming a part of the everyday clinical work, even among inexperienced users. 2 HUDs may facilitate the diagnostic process, but there is a risk of image misinterpretation when used by nonexperts. Limitations of image appraisal are often due to a foreshortened apex, endocardial dropout, and blindness to shape distortion. 3 The HUDs have so far been limited to 2-dimensional imaging without the possibility of a quantitative analysis of left ventricle (LV) function such as the LV ejection fraction or mitral annular motion indices. Software-based automatic measurements can aid the operator in the evaluation and quantification of heart disease. 4,5 Mitral annular plane systolic excursion (MAPSE) was first described in 1967 and acknowledged as a useful diagnostic measure in the late 1980s. [6][7][8] By measuring MAPSE, the longitudinal shortening of the LV is assessed. There is a close association between MAPSE and the LV ejection fraction. 9,10 When high-end equipment is used, MAPSE can be measured with M-mode or color tissue Doppler (cTD) imaging in the apical 4-chamber view. For a good assessment, the atrioventricular plane needs to be adequately visualized. 11 Our research group has developed an algorithm that automatically measures MAPSE from live grayscale recordings by HUDs. 4 The aim of our study was to evaluate whether automatic measurement of MAPSE was feasible, reliable, and accurate when used by expert cardiologists and sonographers. Our hypothesis was that the feasibility and reliability of automatic MAPSE measurements on HUD recordings would be congruent with manual measurements on high-end echocardiographic recordings.

Materials and Methods
This study was conducted at the echocardiographic laboratory at St Olav's University Hospital during February and March of 2018. Two echocardiographic experts (a specialist in cardiology and a sonographer) performed a standard echocardiography and subsequently a focused cardiac ultrasound examination by a HUD on randomly recruited participants. The study was approved by the Regional Committee for Medical Research Ethics (REK 2017/2054) and conducted according to the second Helsinki declaration. All participants gave their written informed consent.

Study Population
Patients referred for an echocardiographic examination at the cardiology clinic were included. The inclusion criteria were referral for echocardiography at the clinic's echocardiographic laboratory, age older than 18 years, and the ability to give informed consent. Medical histories and characteristics were obtained from the patients' medical records.

Echocardiography
The reference echocardiograms were recorded with high-end equipment (Vivid E9 with an M5S-D phased array transducer; GE Healthcare, Horten, Norway). The patients were examined in the left lateral supine position. The echocardiographic examinations included the following views: parasternal short- and long-axis, apical 4-chamber, 2-chamber, apical long-axis, and substernal. Left ventricular quantification was performed according to recommendations from the American Society of Echocardiography and the European Association of Cardiovascular Imaging. 3 MAPSE was measured as the longitudinal movement of the septal and lateral mitral annular points in the apical 4-chamber view by M-mode and cTD imaging. In cTD imaging, a 2-dimensional frame rate of 25 frames per second was interleaved with Doppler recordings of 100 frames per second (Figure 1). The measurements represented the average of 3 consecutive cardiac cycles.
For the purpose of the study, 4 separate grayscale M-mode recordings and 2 cTD recordings were included. All analyses were performed offline by the cardiologist using EchoPAC SWO, version 201 (GE Ultrasound). MAPSE was then reanalyzed by the sonographer in a similar way, blinded to the cardiologist's measurements.
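With cTD imaging, MAPSE is obtained by integrating the annular tissue velocity over systole and reading the cumulative displacement at end systole (Figure 1B). A minimal sketch of that calculation follows, assuming a velocity trace in cm/s sampled at the 100 frames per second Doppler rate stated above. The function names, the trapezoidal rule, and the synthetic half-sine velocity trace are illustrative assumptions and do not represent the EchoPAC implementation.

```python
import numpy as np

def displacement_mm(velocity_cm_s, frame_rate_hz=100.0):
    """Cumulative displacement (mm) from a velocity trace (cm/s), trapezoidal rule."""
    v = np.asarray(velocity_cm_s, dtype=float)
    dt = 1.0 / frame_rate_hz                  # seconds between Doppler samples
    steps_cm = 0.5 * (v[1:] + v[:-1]) * dt    # area of each trapezoid, in cm
    return 10.0 * np.concatenate(([0.0], np.cumsum(steps_cm)))  # cm -> mm

def mapse_ctd_mm(velocity_cm_s, end_systole_index, frame_rate_hz=100.0):
    """Displacement accumulated from the first sample (end diastole) to end systole."""
    return float(displacement_mm(velocity_cm_s, frame_rate_hz)[end_systole_index])

# Synthetic systole (assumption): half-sine velocity, 6 cm/s peak, 0.30 s long.
t = np.arange(31) / 100.0
velocity = 6.0 * np.sin(np.pi * t / 0.30)
print(round(mapse_ctd_mm(velocity, end_systole_index=30), 1))  # about 11.4 mm
```

In practice the septal and lateral traces would each be integrated this way and the results averaged over 3 cardiac cycles, as described for the reference measurements.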

Focused Cardiac Ultrasound Examination by the HUD
Immediately after the standard echocardiography, the cardiologist or sonographer performed a focused cardiac ultrasound examination by the HUD (Vscan Extend; GE Ultrasound) while the patient remained in the left supine position. The device has a sector transducer with a bandwidth of 1.7 to 3.8 MHz and weighs 406 g. The HUD offers automatic storage of cineloops of a single cardiac cycle without the need for an electrocardiogram, using an image-processing technique called the sum of absolute differences. 12 An apical 4-chamber view was recorded 4 separate times for each patient. All recordings were analyzed by a fully automatic method on a handheld research device.
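The text names the sum of absolute differences (SAD) as the technique behind electrocardiogram-free cineloop storage but does not describe how the device applies it. As a rough illustration only, the sketch below assumes that the cycle length is taken as the frame offset at which the image is most similar (lowest SAD) to the first frame. The function names, the `min_period` parameter, and the synthetic frames are hypothetical and are not taken from the device.

```python
import numpy as np

def sad(frame_a, frame_b):
    """Sum of absolute pixel differences between two equally sized frames."""
    a = np.asarray(frame_a, dtype=float)
    b = np.asarray(frame_b, dtype=float)
    return float(np.abs(a - b).sum())

def estimate_cycle_length(frames, min_period):
    """Frame offset (>= min_period) whose image best matches the first frame."""
    reference = frames[0]
    scores = [sad(reference, frame) for frame in frames[min_period:]]
    return min_period + int(np.argmin(scores))

# Synthetic cineloop (assumption): brightness repeats every 20 frames.
frames = [np.full((8, 8), np.cos(2 * np.pi * i / 20)) for i in range(35)]
print(estimate_cycle_length(frames, min_period=10))  # 20
```

The `min_period` guard excludes the frames immediately after the reference, which are trivially similar to it.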

Method for Automatic Measurement of MAPSE by the HUD
The development of the automatic algorithm is comprehensively described in previous work by our group. 4,13,14 The algorithm uses a Kalman filter to fit a deformable model of the LV to the image data. To process and track the images and the LV movement, a real-time contour-tracking library was used (GE Vingmed, Horten, Norway). The septal and lateral points of the model were tracked to estimate MAPSE. 4 The real-time contour-tracking library provides real-time image segmentation of the LV using a nonuniform rational B-spline model. The model is composed of 12 control points. Their location is updated by finding the LV border in 75 equally spaced edge profiles perpendicular to the B-spline curve. The points are distributed along the edge of the model, which is programmatically generated and related to a model used in previous studies. 13 The model is first initiated by looping through the frame(s) to allow the deformable model to find the endocardial border before being switched into the tracking mode. When the tracking mode is enabled, the septal and lateral points of the model are returned from the real-time contour-tracking library. The array of points is evaluated to locate the maximum mitral annular plane displacement.

Figure 1. Motion mode and cTD imaging. A, Mitral annular plane systolic excursion measured from the reconstructed motion mode of the lateral mitral annulus in an apical 4-chamber recording. Mitral annular plane systolic excursion is measured as the total mitral annular excursion from end diastole to end systole (green crosses). B, Mitral annular plane systolic excursion measured by temporal integration of the septal and lateral mitral annular velocities in cTD 4-chamber recordings. Regions of interest were placed in the basal part of the septal and lateral walls, and the corresponding displacement curves are shown to the right. The cumulative displacement during systole (MAPSE) is measured at end systole (yellow crosses).

MAPSE is calculated at the septal and lateral mitral annular points and presented by the algorithm together with the averaged value. The automatic algorithm was run on all HUD recordings (4 recordings in each participant). Measurements were annotated 1 to 4 depending on the chronologic order of the recordings (Figure 2).
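The final step described above, turning the tracked septal and lateral points into MAPSE values, can be sketched as follows. This is not the authors' algorithm: it assumes that each point is available as a per-frame (x, y) position in millimeters, that the cineloop starts at end diastole, and that excursion is the largest Euclidean displacement from that first position. All function names are hypothetical.

```python
import numpy as np

def point_excursion_mm(track_mm):
    """Largest displacement of one annular point from its first-frame position."""
    track = np.asarray(track_mm, dtype=float)       # shape (n_frames, 2)
    displacement = np.linalg.norm(track - track[0], axis=1)
    return float(displacement.max())

def automatic_mapse_mm(septal_track_mm, lateral_track_mm):
    """Septal, lateral, and averaged MAPSE from two tracked point trajectories."""
    septal = point_excursion_mm(septal_track_mm)
    lateral = point_excursion_mm(lateral_track_mm)
    return {"septal": septal, "lateral": lateral, "average": (septal + lateral) / 2.0}

# Synthetic tracks (assumption): points move toward the apex and back.
septal_track = [(0.0, 0.0), (0.0, 5.0), (0.0, 10.0), (0.0, 4.0), (0.0, 0.0)]
lateral_track = [(40.0, 0.0), (40.0, 8.0), (40.0, 14.0), (40.0, 6.0), (40.0, 0.0)]
print(automatic_mapse_mm(septal_track, lateral_track))
```

A projection onto the LV long axis could replace the Euclidean norm if only longitudinal motion should count; the text does not specify which definition the algorithm uses.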

Image Quality
The image quality of randomly arranged HUD recordings was scored from 1 to 6 by 2 cardiologists, who were blinded to all measurements. The image quality score consisted of 5 parameters scored between 1 and 6, where 6 represented the maximum score and 1 the lowest possible score. The parameters included were as follows: (1) view (score of 6 for the 4-chamber view; 3 for the 5-chamber view or inclusion of the coronary sinus; and 1 for the 2-chamber, long-axis, and other views); (2) alignment of the LV (score of 6 for <15° of misalignment; 4 for 15°-29° of misalignment; 2 for 30°-44° of misalignment; and 0 for ≥45° of misalignment); (3) malposition of the apex (score of 6 for a correct position; 4 for <15-mm malposition; and 2 for ≥15-mm malposition); (4) assessment of the mitral annulus (score of 6 for excellent visualization; 5 for near excellent; 4 for good; 3 for fair; 2 for poor; and 1 if the mitral annulus was not judgable); and (5) the number of LV segments with a visible endocardium (score of 6 for 6 segments, etc). A mean score was calculated on the basis of the above-specified parameters. The cardiologists also scored how well the application tracked the mitral annulus (scores of 0-3, where 3 represented good; 2, moderate to good; 1, less than moderate; and 0, poor tracking) and whether the automatic measurement should have been discarded.

Statistics
As data were normally distributed, continuous variables are expressed as mean ± standard deviation (SD), whereas categorical variables are expressed as frequencies and percentages. Comparison of methods was done by a paired t test. The coefficient of variation (CV) was defined as the SD of the difference divided by the mean of the two methods analyzed. The average of the repeated measurements for each method was used for comparison. Bland-Altman statistics were used to illustrate the agreement between the methods. Limits of agreement (LoA) were calculated as mean ± 1.96 SD. The reliability of the measurements was evaluated by intraclass correlation coefficients (ICCs), where values of less than 0.5 were considered poor; 0.5 to 0.75, moderate; 0.75 to 0.9, good; and greater than 0.9, excellent. 15 The intrarater reliability was calculated by a 2-way mixed model defined by absolute agreement in the data set of single measurements analyzed by the cardiologist. The inter-rater reliability was calculated by a 2-way mixed model defined by absolute agreement in the data set of average measurements analyzed by both the cardiologist and the sonographer. The same model was used for automatic and reference measurements. The influence of image quality on the performance of the automatic algorithm was evaluated by a regression analysis. A univariate analysis of variance for each image quality parameter was performed to test its influence on the difference between the methods. P < .05 was considered statistically significant. Power estimates were based on the criteria that a relative difference of 15% between automatic and reference measurements would be of clinical significance. With estimated mean MAPSE of 10 ± 2.5 mm and correlation between the methods of 0.85, power to detect a greater than 10% relative difference was

Data are presented as mean ± SD (range) and number (percent) where applicable.

Study Population
Twenty patients (9 women and 11 men) were included in the study. Table 1 shows the baseline characteristics of the study population. The mean age was 64.2 years (range, 22-85 years); 6 (30%) of the patients had known heart failure; and 15 (75%) had a sinus rhythm when examined.

Comparison of Automatic Measurements to Manual Reference Measurements
The automatic method failed in 9 (11%) of the total 80 HUD recordings. The reason for the failure was the algorithm's inability to identify the atrioventricular plane (Figure 2). These recordings were discarded from the study. Table 2 shows the mean MAPSE for each method as averaged and by the LV (septal and lateral) wall, the difference between methods, and the CV. Compared to M-mode imaging, the fully automatic method underestimated mean MAPSE by 11% (mean difference, 1.2 ± 1.4 mm; P < .005). There was a larger discrepancy between automatic MAPSE and M-mode measurements in the lateral wall (13%) compared to the septal wall (8%), with differences of 1.7 ± 2.0 and 0.8 ± 1.3 mm, respectively. When comparing automatic measurements of MAPSE to cTD measurement, the underestimation was small and nonsignificant (0.8 ± 1.8 mm; P = .073). The difference in MAPSE between M-mode and cTD measurements was nonsignificant (0.4 ± 1.9 mm; P = .363). The CV between the methods was 13% or less in all cases. Upper and lower 95% LoA were 4.0 and −1.5 mm with a bias of 1.2 mm between automatic MAPSE and M-mode. In comparison, the 95% LoA between automatic and cTD measurements were 4.3 and −2.7 mm with a bias of 0.8 mm. Bland-Altman plots are shown in Figure 3. The difference between the measurements was not influenced by the level of LV function.
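The agreement statistics reported above follow the definitions in the Statistics section: bias is the mean of the paired differences, LoA are bias ± 1.96 SD of the differences, and CV is the SD of the differences divided by the mean of the two methods. A minimal sketch of these calculations is given below. The function names are illustrative, and the use of the sample SD (`ddof=1`) is an assumption because the text does not state which estimator was used.

```python
import numpy as np

def limits_of_agreement(bias, sd_diff):
    """95% limits of agreement as bias -/+ 1.96 SD of the differences."""
    return bias - 1.96 * sd_diff, bias + 1.96 * sd_diff

def agreement_stats(method_a, method_b):
    """Bias, SD of differences, LoA, and CV (%) for paired measurements."""
    a = np.asarray(method_a, dtype=float)
    b = np.asarray(method_b, dtype=float)
    diff = a - b
    bias = float(diff.mean())
    sd_diff = float(diff.std(ddof=1))               # sample SD (assumption)
    cv_percent = 100.0 * sd_diff / float(np.mean((a + b) / 2.0))
    return bias, sd_diff, limits_of_agreement(bias, sd_diff), cv_percent

# Check against the reported summary values (bias and SD in mm).
print(limits_of_agreement(1.2, 1.4))  # M-mode versus automatic
print(limits_of_agreement(0.8, 1.8))  # cTD versus automatic
```

With the rounded inputs, the first call gives roughly −1.5 and 3.9 mm and the second roughly −2.7 and 4.3 mm, which agrees with the reported LoA to within rounding of the bias and SD.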
The reliability of the measurements is shown in Table 3. The ICC for the absolute agreement for intra-rater reliability by the cardiologist was good to excellent for all methods (≥0.86). The inter-rater reliability for the M-mode measurements between the cardiologist and sonographer was excellent, with an ICC of 0.98. The ICCs for the absolute agreement of MAPSE by automatic and M-mode measurements and automatic and cTD measurements were 0.85 and 0.81, respectively. The correlations were considered good in both cases.
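For readers who want to reproduce this type of reliability analysis, the sketch below computes absolute-agreement ICCs from a 2-way layout (subjects in rows, raters or repeated measurements in columns) for both single and average measures, and applies the interpretation thresholds given in the Statistics section. It is an illustration using the standard mean-square formulas, not the statistical software used in the study. The function names are hypothetical, and the example ratings are the Shrout and Fleiss textbook data, not study measurements.

```python
import numpy as np

def icc_absolute_agreement(ratings):
    """Return (single-measure, average-measure) absolute-agreement ICCs."""
    x = np.asarray(ratings, dtype=float)            # shape (n subjects, k raters)
    n, k = x.shape
    grand = x.mean()
    ss_rows = k * ((x.mean(axis=1) - grand) ** 2).sum()   # between subjects
    ss_cols = n * ((x.mean(axis=0) - grand) ** 2).sum()   # between raters
    ss_error = ((x - grand) ** 2).sum() - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_error / ((n - 1) * (k - 1))
    single = (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
    average = (msr - mse) / (msr + (msc - mse) / n)
    return float(single), float(average)

def interpret_icc(icc):
    """Thresholds from the Statistics section."""
    if icc < 0.5:
        return "poor"
    if icc <= 0.75:
        return "moderate"
    if icc <= 0.9:
        return "good"
    return "excellent"

# Textbook example data (6 subjects, 4 raters); not from this study.
example = [[9, 2, 5, 8], [6, 1, 3, 2], [8, 4, 6, 8],
           [7, 1, 2, 6], [10, 5, 6, 9], [6, 2, 4, 7]]
single, average = icc_absolute_agreement(example)
print(round(single, 2), round(average, 2))
print(interpret_icc(0.85), interpret_icc(0.98))
```

Applied to the reported values, an ICC of 0.85 or 0.81 falls in the "good" band and 0.98 in the "excellent" band, matching the wording above.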

Influence of Image Quality
Overall, the mean score of the automatic images was 4.53 ± 0.53. There was no significant impact of image quality on the difference between automatic and M-mode measurements (P = .57). Quality did not influence whether the measurements were discarded (P = .40). The mean score of the discarded images was 4.4 ± 0.42. Regression analyses showed that none of the quality parameters had a significant effect on the outcome (all P ≥ .36).

Discussion
In this study, fully automatic measurements of MAPSE by a HUD were compared with MAPSE by M-mode and cTD measurements on high-end equipment. Despite an underestimation by the automatic measurements, the results showed high feasibility and reliability for all methods. When HUDs are operated by experts, automatic measurements of MAPSE are both feasible and reliable. The study population was a general population and, compared to other studies, showed a similar distribution of cardiac diseases. 2,5,16,17 Of the 80 recordings, 9 were discarded. The discarded recordings were not suitable for analyses, since the algorithm failed to detect the atrioventricular plane (Figure 2). As previously shown by our group, the automatic measurements underestimated MAPSE in comparison to M-mode. Snare et al 4 suggested that high gain in the base of the LV could influence the performance of the automatic tracking and might be an explanation for the underestimation. MAPSE was highest when measured by M-mode imaging, lower by cTD, and lowest by the automatic measurements. However, the difference between cTD and the automatic measurements was nonsignificant. The finding of higher absolute values for MAPSE by M-mode compared to cTD imaging is well known. 18 M-mode imaging has previously been shown to have larger lateral MAPSE values in comparison to other modalities. 18 We found a larger difference between automatic and M-mode measurements in the lateral mitral annular point (1.7 mm) compared to septal measurements (0.8 mm). However, compared to cTD imaging, the automatic method provided similar lateral mitral annular motion measurements. The difference between M-mode and cTD for lateral indices has previously been shown. 18 The lateral wall is usually less aligned with the ultrasound beam compared to the septum. This may cause some underestimation by Doppler imaging and overestimation by M-mode, as seen both by us and others. 18 However, as the HUD used does not have cTD available, tracking must be performed by grayscale speckles only, and this may cause some underestimation related to suboptimal tracking. Considering the high intra- and inter-rater reliability, the high feasibility, and the agreement with reference methods in our study, the automatic method performed well. The CVs provided in Table 2 show the quite modest variation between the methods. Different echocardiographic indices for quantification of LV function often show CVs between 5 and 20%. [19][20][21] We obtained CVs of 10% to 14% for comparisons of different recordings and methods, all within the range of sufficient reproducibility. Thus, we argue that when used by experts, automatic measurement of MAPSE by a HUD may be a robust method for assessing LV function.
Previous studies have shown that MAPSE of greater than 10 to 12 mm averaged from the septal and lateral mitral annular points correlated with a normal LV ejection fraction. 22,23 Cutoff values for detection of reduced MAPSE are in the same range. 6,11,24 The cutoffs to differentiate between diseased and normal values should be specific for the method used, and this relates to M-mode, cTD, and automatic measurements.
In general, experts such as cardiologists have excellent reproducibility and reliability when measuring MAPSE. 25 As expected, the intra-rater reliability for the repeated measurements of MAPSE by the cardiologist was good to excellent for automatic and standard methods. This agreed with previous studies in which the reliability of LV indices was evaluated. 21,26,27 Further studies are needed to evaluate the automatic method by a HUD in the hands of inexperienced users, but when used by experts, it is highly reliable. The correlations for automatic MAPSE and M-mode- or cTD-based measurements were good (both ICCs ≥0.81), even though the ICCs were not as high as suggested for high-precision metrics. 19,28 In this study, no parameter of image quality showed any influence on the difference between the methods. This may have been related to experienced operators and the fact that the difference in image quality scores related more to patient factors than the recordings. The high echogenicity of the atrioventricular plane makes it visible even in case of poor image quality, 11,27 and this may be an advantage for MAPSE measurements compared to other indices of LV function.

Inter-rater correlation of reference echocardiograms in which both the cardiologist and the sonographer measured MAPSE by M-mode in the same recordings.

In general, it is expected that expert users have high reliability when evaluating LV function using both high-end equipment and HUDs. Some will argue that experienced users rarely have the need for automatic measurements for quantification of LV function, but automation of repetitive routine measurements may be favorable for experts as well. In our study, automatic measurements of MAPSE were only tested on HUD recordings obtained by experts; thus, the results should not be generalized to inexperienced users. However, validation by experienced users is important to test the feasibility of the method by itself and may form a basis for future studies evaluating the accuracy of the application in the hands of nonexperts.
Based on the results of this pilot study, we conclude that in the hands of experienced users, automatic measurements of MAPSE by a HUD showed good feasibility and excellent reliability. Compared to reference measurements, the fully automatic method underestimated MAPSE by 1 mm. This may allow for automatic quantitative measurements of LV function to be supportive tools in HUDs.