CACCT: An Automated Tool of Detecting Complicated Cardiac Malformations in Mouse Models.

Congenital heart disease (CHD) is the major cause of morbidity and mortality in infancy and childhood. Using mouse models to uncover the mechanisms of CHD is essential to understanding its pathogenesis. However, because of the complicated structure of the heart cavities, conventional 2D phenotyping methods cannot comprehensively display or accurately distinguish the various 3D cardiac malformations. Here, a new automated tool based on microcomputed tomography (micro-CT) image data sets, known as computer-assisted cardiac cavity tracking (CACCT), is presented; it automatically detects the connections between cardiac cavities and identifies complicated cardiac malformations in mouse hearts. With CACCT, researchers, even those without expert training or diagnostic experience of CHD, can conveniently and precisely identify complicated cardiac malformations in mice, including transposition of the great arteries, double-outlet right ventricle and atypical ventricular septal defect, with an accuracy equivalent to that of senior fetal cardiologists. CACCT provides an effective approach to accurately identifying heterogeneous cardiac malformations, which will facilitate mechanistic studies of CHD and heart development.


Manually marking ventricular cavities and the great vessels
The ventricular cavity and the related great arterial cavity were used to represent the corresponding outflow tract (OFT). First, a new surface module was created, and the transverse section was selected as the annotation image. After all images of the heart had been inspected, an image showing only the two ventricles was selected as the initial image of the ventricular cavities (Supplementary Figure 5a). Then, the area of the two ventricular cavities on this image was determined. One ventricular cavity was randomly selected on the initial image, and the inner and outer contours of all areas of that ventricular cavity were marked with a "magic wand" with 10.0% reductivity and 10.0% fault tolerance (Supplementary Figure 5a). Then, starting from the initial image, the inner and outer edges of all areas of the ventricular cavity were marked with the "magic wand", image by image toward the bottom of the heart. As the images moved toward the bottom of the heart, an image was found on which the ventricle and the atria were connected (Supplementary Figure 5b). On such images, the "magic wand" recognized not only the contour of the ventricular cavity but also the contour of the atrial cavity connected with it. The marked contour of the atrial cavity was deleted on these images to avoid interference of the atria with the observation of the OFT (Supplementary [...]). Marking was stopped on the image where the great artery bifurcates (Supplementary Figure 5e). Then, starting again from the initial image of the ventricular cavities, the inner and outer edges of all areas of the ventricular cavity were marked with the "magic wand", image by image toward the apex, until the area of the ventricular cavity disappeared from the image (Supplementary Figure 5e).

Detection of cardiac cavity regions
For each computed tomography (CT) image, the black border area around the image was first removed by threshold segmentation (pixel gray value = 255), and the remaining area, denoted $A_n$, was then binarized with the Otsu method (black: the cardiac cavities and the region outside the heart; white: the cardiac tissue) (Supplementary Figure 3a). Before the ventricular cavity appeared (viewed from the apex to the bottom of the heart), the differences between regional pixel values on the images were small, and so was the variance, denoted $\sigma$.
Because the Otsu method produced segmentation errors on such images, a variance threshold $\sigma_T$ was used.
If $\sigma > \sigma_T$, $A_n$ was threshold segmented; otherwise, $A_n$ was set to black (Supplementary Figure 3a). The black pixels in $A_n$ were selected as seeds. Using the region-growing algorithm, all connected areas of the black regions, denoted $A_c$, were obtained. The largest connected area was removed from $A_c$, and the remaining connected areas represented the cardiac cavity region, denoted $A_{lu}$ (Supplementary Figure 3b).
Cardiac cavity images were obtained by applying the above steps to every image of the cardiac CT data set.
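The per-image procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 4-connectivity, the convention that pixels at or below the Otsu level count as black, and the omission of the border-removal step are all assumptions made for illustration.

```python
from collections import deque


def otsu_threshold(pixels):
    """Gray level (0-255) that maximizes the between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(g * h for g, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0
    sum0 = 0
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def variance(pixels):
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) ** 2 for p in pixels) / len(pixels)


def connected_black_regions(binary):
    """Region growing over black (0) pixels, 4-connectivity (assumed)."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    regions = []
    for y in range(h):
        for x in range(w):
            if binary[y][x] != 0 or seen[y][x]:
                continue
            seen[y][x] = True
            queue, region = deque([(y, x)]), []
            while queue:
                cy, cx = queue.popleft()
                region.append((cy, cx))
                for ny, nx in ((cy + 1, cx), (cy - 1, cx), (cy, cx + 1), (cy, cx - 1)):
                    if 0 <= ny < h and 0 <= nx < w and binary[ny][nx] == 0 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(region)
    return regions


def detect_cavities(image, sigma_t):
    """Return the candidate cavity regions of one image (list of pixel lists)."""
    pixels = [p for row in image for p in row]
    if variance(pixels) <= sigma_t:
        # Low-variance image: treated as all black, so no cavity remains
        # once the largest connected area is removed.
        return []
    t = otsu_threshold(pixels)
    binary = [[0 if p <= t else 255 for p in row] for row in image]
    regions = connected_black_regions(binary)
    if not regions:
        return []
    largest = max(regions, key=len)
    return [r for r in regions if r is not largest]
```

On a toy image in which a dark ring surrounds bright tissue enclosing a small dark patch, the ring is discarded as the largest black area and only the enclosed patch is returned.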

Construction of a three-dimensional (3D) graphical data trove
A connected area was defined as a node, and each node contained all types of information of the connected area, including the frame number, pixel vectors (internal pixel vector, edge pixel vector), area size, and center coordinates.
It was assumed that the connected areas of the cardiac cavity region $\{A^i_j \mid j = 1, 2, \ldots, N^i\}$ were obtained on frame i, where $A^i_j$ is the connected area of node j on frame i and $N^i$ is the number of these connected areas. These connected areas were used as seeds to obtain the nodes on frame i+1. Then, the nodes obtained on frame i+1 were used as seeds of frame i+2, and all nodes on frame i+2 were obtained in the same way. The 3D graphical data trove, $\Omega_G$, was constructed by iterating over all frames of the cardiac CT images of one heart. In addition, all the original connected areas were obtained with the cardiac cavity region as seeds and therefore had no parent nodes.
To detect the targeted part of the intracardiac regions, it was necessary to determine which part of the heart each node belonged to (the attribution of the node). Therefore, 10 attributing markers (M) were defined for each node (Supplementary Table 1).
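A minimal sketch of such a frame-by-frame node graph follows. The class and function names are hypothetical, and linking parent and son nodes by pixel overlap between adjacent frames is an assumption standing in for the seed-based region growing described above; edge pixel vectors are omitted for brevity.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Node:
    frame: int                      # frame number
    pixels: frozenset               # (row, col) coordinates of the connected area
    marker: int = 0                 # attributing marker M (0 = not processed)
    parents: list = field(default_factory=list, repr=False)
    sons: list = field(default_factory=list, repr=False)

    @property
    def area(self):
        return len(self.pixels)

    @property
    def center(self):
        n = len(self.pixels)
        return (sum(r for r, _ in self.pixels) / n,
                sum(c for _, c in self.pixels) / n)


def build_graph(frames):
    """frames[i] is a list of connected areas (pixel collections) on frame i.

    Returns a list of node lists, one per frame, with parent/son links
    between overlapping areas on adjacent frames (assumed linking rule).
    """
    graph = [[Node(i, frozenset(region)) for region in regions]
             for i, regions in enumerate(frames)]
    for prev, curr in zip(graph, graph[1:]):
        for parent in prev:
            for son in curr:
                if parent.pixels & son.pixels:
                    parent.sons.append(son)
                    son.parents.append(parent)
    return graph
```

Nodes on the first frame, and any node whose area does not overlap an area on the previous frame, end up without parent nodes.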

Detection of the atrial cavity region
The left and right atrium areas on frame k were assumed to be $A^{kl}_{u}$ and $A^{kr}_{u}$, respectively; together they were called atrium areas and denoted as V. The remaining regions of the cardiac cavity region on the images were taken as non-atrium areas and denoted as B. The initial areas of $A^{kl}_{u}$ and $A^{kr}_{u}$ were obtained by mapping the manual annotations onto the corresponding image and binary image. Before the atrial cavity was detected, nodes with small areas were filtered out to avoid their interference, and these filtered nodes were optimized after the atrial cavity, ventricular cavity and great arterial cavity had been marked (Supplementary methods, Optimization of filtered nodes).
During atrial cavity region detection, the similarity of each node on the latter frame to the atrium regions and non-atrium regions on the previous frame was assessed to determine the attribution of the node on the latter frame. A similarity formula was defined in which $r_v$ is the similarity between the node and the atrium regions of the previous frame (frame k) and $r_b$ is the similarity between the node and the non-atrium regions of the previous frame (frame k). Connections between the atrial and ventricular cavities or other cardiac cavity regions meant that a connected area of the atrial cavity could consist of both atrial and non-atrial regions. When the atrium regions were updated on the latter frame, the attribution of all nodes on the latter frame was determined. When the connected area $A^{k+1}_{it,t}$ represented by node t on frame k+1 was assumed to contain both atrial and non-atrial regions, so that $M^{k+1}_t = 7$, it was necessary to distinguish and segment the atrial and non-atrial areas within the connected area and to update the atrial region $V^{k+1}$, the non-atrial region $B^{k+1}$ and the set of nodes $\{G^{k+1}_j \mid j = 1, 2, \ldots, N^{k+1}_g\}$.
$A^{k+1}_{new,t}$ represents the remaining part after the atrial connected areas on frame k were removed, and $A^{k+1}_{new,t}$ is initially empty. $B^k$ represents the background area of the atrial cavity region on frame k.
The previous content of the node $C^{k+1}_t$ was deleted, the region-related information of $A^{k+1}_{new,t}$ was recorded, and the connection relationship between parent and son nodes was then updated according to whether the parent and son nodes intersected. The atrial and non-atrial regions on frame k+1 were obtained by judging the atrial or non-atrial attribution of all nodes on frame k+1.
$V^{k+1}$ represents the atrial connected areas on frame k+1. $A^{k+1}_{new,t}$ represents the remaining part after the atrial connected areas on frame k were removed.

Detection of normal ventricular cavity
To detect the ventricular cavity region, the left ventricular (LV) cavity and right ventricular (RV) cavity regions on frame k were assumed to be $A^k_{lv}$ and $A^k_{rv}$, and the corresponding sets of nodes were $\Omega^k_{g,lv}$ and $\Omega^k_{g,rv}$.
A similarity formula was defined to determine the attribution of all son nodes on the latter frame or all parent nodes on the previous frame. In this formula, $A_G$ is the connected area of node G; $A_{it,lv}$ and $A_{it,rv}$ represent the intersections of the connected area of node G with the LV cavity $A^k_{lv}$ and the RV cavity $A^k_{rv}$, respectively, on the parent (or son) frame of node G; $d_1$ and $d_2$ represent the similarity of node G to the LV cavity region and the RV cavity region, respectively; and $N(\cdot)$ represents the area of the corresponding connected area.
In the 3D graphical data trove of a heart without congenital heart defects, the attribution of all son or parent nodes was directly transmitted: the son or parent nodes obtained from the LV still belonged to the LV region, and the son or parent nodes obtained from the RV still belonged to the RV region.
Therefore, the attribution rule for nodes was defined, where $M_G$ indicates the attribution of node G; $M_G = 3$ indicates that node G belongs to the LV cavity region; and $M_G = 4$ indicates that node G belongs to the RV cavity region.
First, the initial ventricular cavity regions were obtained by mapping the manually marked initial ventricular cavity image on the corresponding image and the binary image.
Then, the next ventricular cavity region on the son (or parent) frame was detected based on the initial ventricular cavity region. The iteration process was repeated to detect the whole ventricular cavity region and the 3D graphical data trove of one ventricular cavity was updated.
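The direct transmission of attribution in a heart without defects can be sketched as a breadth-first propagation from the manually marked seed nodes. This is an illustration only: the dictionary-based graph, the node identifiers and the function name are hypothetical, while the marker values 3 (LV) and 4 (RV) follow the text.

```python
from collections import deque

LV, RV = 3, 4  # attributing markers for LV and RV cavity regions


def propagate(neighbors, markers, seeds):
    """Spread LV/RV markers from seed nodes to unprocessed linked nodes.

    neighbors: dict mapping a node id to its parent and son node ids
    markers:   dict mapping a node id to its marker M (missing or 0 = unprocessed);
               updated in place
    seeds:     dict mapping manually marked node ids to LV or RV
    """
    queue = deque()
    for node, m in seeds.items():
        markers[node] = m
        queue.append(node)
    while queue:
        node = queue.popleft()
        for nxt in neighbors.get(node, ()):
            # Only unprocessed nodes inherit; filtered or already
            # attributed nodes are left untouched.
            if markers.get(nxt, 0) == 0:
                markers[nxt] = markers[node]
                queue.append(nxt)
    return markers
```

Because only nodes with M = 0 inherit a marker, nodes that were filtered out earlier keep their own marker and do not extend the search.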

Ventricular cavity detection algorithm:
Input: the manually marked initial LV cavity image $A^{k0}_{lv}$, the manually marked initial [...]
According to the 3D graphical data trove $\Omega_g$, obtain the parent nodes $\Omega^{k-1}_{g,parent}$ of $\Omega^k_{g,lv}$ and $\Omega^k_{g,rv}$ whose attributing marker M = 0. The number of nodes in $\Omega^{k-1}_{g,parent}$ is $N^{k-1}_{g,parent}$.
If $N^{k-1}_{g,parent} = 0$
    End the loop.
End
For j = 0 to $N^{k-1}_{g,parent}$
    $G = G_j$, $G_j \in \Omega^{k-1}_{g,parent}$; the connected area of node G is $A_G$.
    According to formulae (7) and (8), judge the attribution of the node.
End
3) Search backward from frame k0.
According to the 3D graphical data trove $\Omega_g$, obtain the son nodes $\Omega^{k+1}_{g,son}$ of $\Omega^k_{g,lv}$ and $\Omega^k_{g,rv}$ whose attributing marker M = 0. The number of nodes in $\Omega^{k+1}_{g,son}$ is $N^{k+1}_{g,son}$.
If $N^{k+1}_{g,son} = 0$
    End the loop.
End
For j = 0 to $N^{k+1}_{g,son}$
    $G = G_j$, $G_j \in \Omega^{k+1}_{g,son}$; the connected area of node G is $A_G$.
    According to formulae (7) and (8), judge the attribution of the node.
End

$M_G$ represents the attribution of node G. $M_G = 8$ indicates that the son node or parent node G derives not only from a node in the LV but also from a node in the RV on the current image, that is, the LV and the RV are directly connected on frame k-1 or k+1. $M_G = 3$ indicates that node G belongs to the LV region. $M_G = 4$ indicates that node G belongs to the RV region. vR0 and vR1 represent the similarity thresholds between node G and the LV and RV, respectively.
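As an illustration of how the markers 3, 4 and 8 could be assigned, the sketch below assumes that the similarities $d_1$ and $d_2$ are the fractions of the node's area that overlap the LV and RV regions, and that each is compared with its own threshold (vR0, vR1). Formulae (7) and (8) are not reproduced in this text, so both the overlap ratio and the decision order are assumptions, and the function name is hypothetical.

```python
def attribute_node(area_g, lv_area, rv_area, v_r0, v_r1):
    """Return the attributing marker for a node.

    area_g, lv_area, rv_area: sets of (row, col) pixels
    v_r0, v_r1: similarity thresholds for the LV and RV (assumed usage)
    Returns 3 (LV), 4 (RV), 8 (connected to both) or 0 (undecided).
    """
    n_g = len(area_g)
    d1 = len(area_g & lv_area) / n_g  # assumed form of the LV similarity
    d2 = len(area_g & rv_area) / n_g  # assumed form of the RV similarity
    in_lv, in_rv = d1 > v_r0, d2 > v_r1
    if in_lv and in_rv:
        return 8
    if in_lv:
        return 3
    if in_rv:
        return 4
    return 0
```

A node that overlaps both ventricular regions above threshold is marked 8, which is the case that later requires the LV and RV to be separated.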

When $M_G = 8$, the LV and the RV were directly connected on frame k-1 or k+1. In this case, it was necessary to distinguish and segment the LV and RV. First, the connected area $A_G$ corresponding to node G was acquired. According to $A^k_{lv}$ and $A^k_{rv}$, the additional area $C_G$ in $A_G$ relative to the LV and RV was obtained; [...] according to the similarity between the connected area and the LV (or RV) (see formula (7)).

Ventricular cavity region detecting algorithm (with paralleled VSD):
Input: [...]
According to the data structure $\Omega_g$, obtain the parent nodes $\Omega^{k-1}_{g,parent}$ of $\Omega^k_{g,lv}$ and $\Omega^k_{g,rv}$ whose attributing marker M = 0. The number of nodes in $\Omega^{k-1}_{g,parent}$ is $N^{k-1}_{g,parent}$.
If $N^{k-1}_{g,parent} = 0$
    End the loop.
End
For j = 0 to $N^{k-1}_{g,parent}$
    $G = G_j$, $G_j \in \Omega^{k-1}_{g,parent}$; the connected area of node G is $A_G$.
    According to formulae (7) and (9), judge the attribution of the node.
End
According to the 3D graphical data trove $\Omega_g$, obtain the son nodes $\Omega^{k+1}_{g,son}$ of $\Omega^k_{g,lv}$ and $\Omega^k_{g,rv}$ whose attributing marker M = 0. The number of nodes in $\Omega^{k+1}_{g,son}$ is $N^{k+1}_{g,son}$.
If $N^{k+1}_{g,son} = 0$
    End the loop.
End
For j = 0 to $N^{k+1}_{g,son}$
    $G = G_j$, $G_j \in \Omega^{k+1}_{g,son}$; the connected area of node G is $A_G$.
    According to formulae (7) and (9), judge the attribution of the node.
End
For i = 0 : N
    According to the data structure $\Omega_g$ on frame i, obtain the nodes G whose M = 8.
    For G in $\Omega^i_g$: [...]

Detection of ventricular cavity on images with oblique VSD
To recognize the targeted ventricular cavity on images with oblique VSD, the transition nodes from LV to RV (or RV to LV), $G_{key}$, were found through the node marker M obtained using formulae (7) and (9). The part of a transition node belonging to both the LV and RV was defined as the intersection area. The left side was the process of the intersection area changing from LV to RV, and the right side was the process of the intersection area changing from RV to LV.

Detection of the great vessel cavity region
The great arterial cavity was detected after the left and right ventricular cavities were detected.
The connected areas of the aortic cavity and pulmonary arterial cavity on frame k were assumed to be $A^k_{ao}$ and $A^k_{pa}$, respectively, and the corresponding node sets were assumed to be $\Omega^k_{g,ao}$ and $\Omega^k_{g,pa}$, respectively. Concepts similar to those of the ventricular cavity recognition algorithm were used to obtain the node set $\Omega_{all,ao}$ and connected-area set $A_{all,ao}$ of the aortic cavity and the node set $\Omega_{all,pa}$ and connected-area set $A_{all,pa}$ of the pulmonary arterial cavity (see Supplementary Method 5.1). If one great artery was spatially connected to one ventricle, the artery was directly attributed to that ventricle.
When the great artery was connected to the ventricle, a large number of repeated searches for ventricular cavity nodes might occur during detection of the great arterial cavity. Fewer son nodes were found in the great arterial cavity region than in the ventricular cavity regions. To avoid repeated and meaningless searches, a constraint condition $\varepsilon$ was set for the forward search of the artery, where st = ao represents the aorta and st = pa represents the pulmonary artery; $N^{k0}_{g,st}$ represents the number of connected nodes on the input frame k0; $N^k_{g,st}$ represents the number of connected nodes on frame k; and the constant aN represents the maximum possible number of subareas of a great artery.
When $\varepsilon = 0$, the forward search continued; when $\varepsilon = 1$, the search ended. After the 3D graphical data trove had been updated with the information of the ventricular cavity and the great arterial cavity, the two cavities might still appear unconnected, because the semilunar valves or blood clots in the great arterial cavity might clog the cavity completely. To determine the relationship between the ventricular cavity and the great arterial cavity, the images (n = cR) after the LV and RV node sets were searched to find all the last connected areas of the LV and RV cavities that had no son nodes, $A_{zero,lv}$ and $A_{zero,rv}$, respectively, and the frames (n = aR) ahead of the great arterial cavity node set were searched to obtain all the first connected areas of the great arterial cavity, $A_{zero,st}$. Observation of the heart showed that a great arterial cavity is connected to the nearer ventricular cavity. The shortest distances between $A_{zero,st}$ and $A_{zero,lv}$ and between $A_{zero,st}$ and $A_{zero,rv}$ were therefore calculated to determine the connecting relationship between the marked great arterial cavities and ventricular cavities.
To calculate the distance, two connected areas $A_1$ and $A_2$ with pixels $p_{1,i} \in A_1$ and $p_{2,j} \in A_2$ were assumed, and the distance between pixels of the two connected areas was defined as:
$$D(p_{1,i}, p_{2,j}) = \lVert p_{1,i} - p_{2,j} \rVert_2 \qquad (13)$$
Then, the shortest distance between the two connected areas was calculated:
$$D_{\min}(A_1, A_2) = \min_{i,j} D(p_{1,i}, p_{2,j}) \qquad (14)$$
According to formulae (13) and (14), the attribution of the great arterial cavities to the ventricular cavities was determined. Using the great arterial cavity extraction algorithm and the great artery inference algorithm, the LV and its related great arterial cavity node set $\Omega_{all,lv}$ and connected-area set $A_{all,lv}$, as well as the RV and its related great arterial cavity node set $\Omega_{all,rv}$ and connected-area set $A_{all,rv}$, were updated and obtained.
According to the 3D graphical data trove $\Omega_g$, obtain the parent nodes $\Omega^{k-1}_{g,parent}$ of $\Omega^k_{g,ao}$ and $\Omega^k_{g,pa}$. The number of nodes in $\Omega^{k-1}_{g,parent}$ is $N^{k-1}_{g,parent}$.
If $N^{k-1}_{g,parent} = 0$
    End the loop.
End
For j = 0 to $N^{k-1}_{g,parent}$
    $G = G_j$, $G_j \in \Omega^{k-1}_{g,parent}$; the connected area of node G is $A_G$.
    According to the similarity between node G and $A^k_{ao}$ and the similarity between node G and $A^k_{pa}$, judge the attribution of node G.
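The nearest-ventricle inference based on formulae (13) and (14) can be sketched as follows. Pixels are represented as coordinate tuples (2D or 3D); the function names and the tie-breaking in favor of the LV are assumptions made for illustration, and the brute-force pairwise search is chosen for clarity rather than speed.

```python
import math


def shortest_distance(area1, area2):
    """Smallest Euclidean distance between any pixel of area1 and any pixel of area2."""
    return min(math.dist(p, q) for p in area1 for q in area2)


def nearer_ventricle(artery_area, lv_area, rv_area):
    """Attribute a great arterial area to the ventricle whose last area is nearer.

    Ties are resolved in favor of the LV (an arbitrary choice for this sketch).
    """
    d_lv = shortest_distance(artery_area, lv_area)
    d_rv = shortest_distance(artery_area, rv_area)
    return "lv" if d_lv <= d_rv else "rv"
```

For example, `nearer_ventricle([(0, 0, 5)], [(0, 3, 4)], [(0, 1, 4)])` compares a distance of about 3.16 to the LV area with about 1.41 to the RV area and therefore attributes the artery to the RV.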

Optimization of filtered nodes (detailed cavities)
Before the atrial region was detected and the inference process from the ventricular cavities to the great arterial cavities was acquired, all connected areas smaller than the area-size threshold $T_a$ were filtered out to avoid the impact of small areas. Here, $S^k_i$ represents the size of the connected area of node i on frame k; $M^k_i$ represents the attributing marker of node i on frame k; $M^k_i = 0$ indicates that node i was not processed; and $M^k_i = 1$ indicates that node i was filtered out. In addition, in the forward or backward search for ventricular and great arterial cavities, some nodes without a parent node or son node were neglected. When searching from frame G1 in the V1 direction (Supplementary Figure 3c), the red node did not have a parent node and was neglected in the search for ventricular and great arterial cavities. When searching from frame G3 in the V2 direction (Supplementary Figure 3c), the orange node did not have a son node and was likewise neglected. Therefore, after the whole inference process from the ventricular cavities to the great arterial cavities had been obtained, these neglected or filtered connected areas were reassessed. Regions that had not been reached in the forward or backward search for ventricular and great arterial cavities were classified into the corresponding inference process.
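The small-area filtering step can be sketched as below, assuming a strict "smaller than $T_a$" comparison; the function name and the dictionary keyed by (frame, node) pairs are hypothetical.

```python
def filter_small_nodes(sizes, t_a):
    """Assign the attributing marker for the filtering step.

    sizes: dict mapping (frame, node) to the size of the connected area
    t_a:   area-size threshold
    Returns a dict with M = 1 (filtered out) or M = 0 (not processed).
    """
    return {key: (1 if size < t_a else 0) for key, size in sizes.items()}
```

Nodes marked 1 here are the ones that are reassessed once the inference from the ventricular cavities to the great arterial cavities is complete.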