Iterative registration for multi-modality retinal fundus photographs using directional vessel skeleton

This paper proposes an automated registration method for multi-modality retinal fundus photographs based on the directional vessel skeleton. The main purpose is to register two retinal fundus photographs of different modalities covering the same scanning region, which can provide multi-modality information for clinicians to diagnose retinal diseases or to make a treatment decision. The directional vessel skeleton of each fundus image is first detected by bias field correction and Gabor filtering. The final registered fundus photographs are then obtained by iterative affine registration between the detected directional vessel skeletons of the two photographs. In this work, four kinds of fundus photographs of the macular region from patients with retinal diseases, consisting of 20 optical coherence tomography fundus images, 20 colour fundus photographs, 20 fluorescein fundus angiography images and 20 indocyanine green angiography images, are utilised to quantitatively evaluate the proposed method. The root-mean-square errors show an advantageous performance in both registration success rate and accuracy.


INTRODUCTION
With the development of imaging technology, various types of retinal imaging technologies are applied to collect multi-modality retinal images, such as optical coherence tomography (OCT) [1], colour fundus photography (CFP), fluorescein fundus angiography (FFA) and indocyanine green angiography (ICGA). Characteristics from different modality images can be helpful for clinicians to diagnose retinal diseases or to make a treatment decision. For instance, OCT provides high-resolution cross-sectional information that is complementary to the en face views of the other modalities. Clinicians need to comprehensively consider the OCT fundus image (OFI) and the traditional images to accurately diagnose the disease or predict the progression of the retinal lesion [46]. To achieve the quantitative analysis of diseases based on multi-modality retinal images, the different imaging modalities need to be registered to the same coordinate system. However, manual registration is time consuming. Therefore, an automatic and accurate registration method is becoming important in the joint analysis of retinal diseases.
The pixel intensity in the images may vary considerably due to different imaging mechanisms, especially between multimodal images. Under this circumstance, intensity-based registration algorithms struggle to handle the registration between multimodal images [10][11][12]. In contrast, feature-based registration algorithms are widely used for registration between different modalities of retinal fundus photographs. The feature-based registration algorithms for retinal fundus photographs are mostly based on feature points [13][14][15][16][17][18][19][20][21] or vasculature skeletons [22][23][24][25][26][27][28][29][30]. Cattin et al. proposed a retina mosaicing method based on speeded up robust features (SURF) [13,14]. SURF is a local feature extraction algorithm based on the Haar wavelet and has stronger robustness than the traditional scale-invariant feature transform (SIFT). Unfortunately, this SURF-based method is only applicable to monomodal retinal image registration. Tsai et al. proposed a general dual bootstrap iterative closest point algorithm, which uses "corner" and "face" points as the feature points [15]. However, the visibility of blood vessels in OFI is too low to extract enough accurate "corner" or "face" points for the registration. Chen et al. proposed a registration method for multimodal retinal images based on a hybrid area-feature descriptor named the partial intensity invariant feature descriptor (PIIFD) [16,17]. This method used both local regions and feature points to avoid the influence of poor-quality retinal vasculature. However, the vessel density in the macular region is low for most retinal fundus photographs. As a result, the intensity differences and the low vessel density in the macular region influence both the accuracy and robustness of finding enough pairs of feature points between retinal fundus photographs with different modalities.
In comparison, the vasculature skeleton-based registration methods have no demand for the same local areas and vessel density between different modalities of retinal fundus photographs. In most cases, the registration methods based on vasculature skeletons may be more stable than the feature point-based methods [22][23][24][25][26][27][28][29][30]. Li et al. proposed a registration method between OFI and CFP based on blood vessel ridges [25]. The blood vessel ridges were defined as vessel centrelines using the ridge-branch-based (RBB) algorithm [31,32]. Golabbakhsh et al. also proposed a registration method for OFI and CFP using a quadratic registration model [26,27]. The vasculature skeletons of both OFI and CFP were extracted based on a curvelet transform. Niu et al. proposed a local patch matching-based algorithm for the registration of OFI and CFP [28]. Vasculature extraction and feature point detection were both used in this method. Although state-of-the-art registration algorithms have been proven successful in many cases, automatic and accurate registration of multi-modality retinal images is still a challenging task.

FIGURE 1 An example to explain how the low-quality OCT fundus images (OFIs) can influence the detection of the vasculature
The reasons mentioned above mainly lie in the following aspects: (1) The reflectivity of retinal tissues differs among the various types of imaging instruments owing to their different scanning principles. In general, intensity-based registration algorithms fail to register different modality retinal images. (2) The characteristics of the blood vessels show significant differences across the various retinal images. For example, the retinal macular region contains fewer blood vessels than the optic nerve head, resulting in registration difficulty in the former. In particular, the vasculature in the OFI of the retinal macular region is hard to detect due to artifacts (Figure 1). Additionally, retinal diseases can make the blood vessels difficult to visualise clearly.
(3) Because of the intensity difference between retinal fundus photographs with different modalities, the existing registration methods are all based only on binary skeletons. However, binary skeletons provide no secondary features to distinguish the real vessels from the fake vessels (mostly noise, artifacts and lesions). In some cases, especially in images of the macular region, which have more noise and fewer vessels than the optic disc region, the real vessels in one image may be misregistered to the fake vessels in another image.
In order to overcome the above challenging problems, directional vasculature skeletons are proposed in this paper. The registrations between multi-modality retinal fundus photographs are based on both the vessel skeleton and the direction information of each blood vessel. Although the detection of the directional vasculature skeletons is also influenced by the low-quality OFIs, the additional directional information can serve as a secondary feature to separate the real vessels from the falsely detected ones. Then, an objective function is proposed to find the optimal transformation parameters by using the gradient descent method. The registration accuracy and applicability are both improved. Our main contributions can be summarised as follows: (1) Directional information is proposed and added to the vessel skeletons. The directional information is an important difference between the real vessels and the pseudo vessels in the skeletons. (2) A unified framework is proposed to register different retinal image modalities (four modalities are used in this research).

FIGURE 2 The flowchart of the automated multi-modality registration between retinal fundus photographs with different sizes

(3) Compared with the state-of-the-art registration methods, the proposed framework can adaptively tackle the intensity difference and the multi-resolution problem.
The structure of this paper is organised as follows. The materials and methods are introduced in Section 2. In Section 3, the results of the experiment will be explained. Some important issues relevant to this work are discussed in Section 4. Finally, we provide a conclusion in Section 5.

MATERIALS AND METHODS
Our multi-modality registration method mainly includes four steps. First, each retinal fundus photograph is improved using a bias field correction and Gabor filter [33,34]. Second, the directional vessel skeleton of each retinal fundus photograph is obtained based on a multi-scale vessel enhancement algorithm [35]. The two images are then relocated to align the fovea region based on a simplified template match algorithm. The final registered image is obtained after the registration between two relocated images based on the detected directional vessel skeletons. The flowchart of the registration between the retinal fundus photographs of different sizes is shown in Figure 2.

Data acquisition
Our dataset, acquired from 20 eyes of 13 patients diagnosed with age-related macular degeneration (AMD), comprises 20 CFP images, 20 FFA images, 20 ICGA images and 20 OFI images (80 images in total) and is used to evaluate the performance of our proposed method in this study. The OCT scans were acquired using an SD-OCT system (Zeiss Research Browser, version 6.0.2.81; Carl Zeiss Meditec, Inc). The raw data produced by the SD-OCT instrument were then imported into the vendor's proprietary software for analysis and reconstruction. The OFIs were generated by averaging all the voxel values in the 3-D data along the axial A-scan lines. The OFIs represented a 6.0 × 6.0 mm² (512 × 512 pixels) rectangular area centred on the macular fovea. The three other retinal fundus photographs (CFP, FFA and ICGA) were respectively acquired using corresponding systems from Heidelberg Engineering. In addition, the test was performed on a workstation with an Intel(R) Core(TM) i7-5820K @ 3.30 GHz and 32.0 GB RAM using MATLAB 2014b (Mathworks, Natick, MA).

Bias field correction
Each retinal fundus photograph is first resampled to the same sampling frequency. The intensity value in acquired fundus photographs is relatively low in regions where the retina is out of focus (Figure 3(a)). This uneven distribution of intensity values may cause false detection of blood vessels. So, an illumination bias field is created by applying an X × Y Gaussian filter with a standard deviation of 100 pixels to each fundus photograph of the retina (Figure 3) [33]. The intensity inhomogeneity is then corrected using Equation (1):

F1(x, y) = F0(x, y) × Mean(G(F0)) / G(F0(x, y))  (1)

where X and Y are the sizes of the corresponding fundus photograph, Mean(G(F0)) is the mean value of the bias field, F0(x, y) is the corresponding fundus photograph, G(F0(x, y)) is the bias field obtained by filtering this fundus photograph, and F1(x, y) is the fundus photograph after the bias field correction.
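The bias field correction step can be sketched as follows; the helper name `correct_bias_field`, the epsilon guard and the use of SciPy's Gaussian filter are illustrative assumptions, not the authors' code.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def correct_bias_field(f0, sigma=100.0):
    """Divide out a smooth illumination field estimated by Gaussian blurring.

    A minimal sketch: the 100-pixel standard deviation follows the text;
    the float cast and epsilon guard are implementation assumptions.
    """
    f0 = f0.astype(np.float64)
    bias = gaussian_filter(f0, sigma=sigma)  # G(F0): low-frequency illumination
    eps = 1e-8                               # guard against division by zero
    # F1 = F0 * Mean(G(F0)) / G(F0): flatten illumination, keep mean brightness
    return f0 * bias.mean() / (bias + eps)
```

A uniformly lit image passes through essentially unchanged, while smooth vignetting is divided out.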

Gabor filter
After this intensity correction, a Gabor filter [30,34] is applied to each fundus photograph to enhance the vasculature based on the kernel function C, defined as Equation (2):

C(nx, ny) = exp( −[ (nx cos θ + ny sin θ)² / (2σx²) + (−nx sin θ + ny cos θ)² / (2σy²) ] ) · cos( 2π(nx cos θ + ny sin θ) / w )  (2)

where the size of the kernel function C is n × n pixels and the coordinates of each point in C are [nx, ny]. θ is the orientation angle of the kernel function, σx = 0.5 and σy = 1 are the standard deviations of the kernel function in the x and y dimensions, and w = 8 is the spatial frequency of the kernel function C.
The maximum response over all orientation angles is calculated for each fundus photograph and chosen to enhance the vasculature, defined as Equation (3):

F(x, y) = max over θ of ( F1 ∗ Cθ )(x, y)  (3)

where F(x, y) is the fundus photograph after the Gabor filter, F1 ∗ Cθ denotes the convolution of F1 with the kernel at orientation θ, and eight equally spaced orientation angles are used in this work.
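A minimal sketch of the Gabor enhancement of Equations (2) and (3); the kernel size, the orientation sampling over [0, π) and the function names `gabor_kernel`/`gabor_enhance` are assumptions where the extracted text is incomplete.

```python
import numpy as np
from scipy.signal import fftconvolve

def gabor_kernel(n=15, theta=0.0, sigma_x=0.5, sigma_y=1.0, w=8.0):
    """Gaussian envelope modulated by a cosine along orientation theta
    (the reconstructed form of Equation (2))."""
    half = n // 2
    ny, nx = np.mgrid[-half:half + 1, -half:half + 1].astype(np.float64)
    xr = nx * np.cos(theta) + ny * np.sin(theta)    # rotated coordinates
    yr = -nx * np.sin(theta) + ny * np.cos(theta)
    env = np.exp(-0.5 * (xr**2 / sigma_x**2 + yr**2 / sigma_y**2))
    return env * np.cos(2.0 * np.pi * xr / w)

def gabor_enhance(img, n_angles=8):
    """Equation (3): keep the maximum filter response over the orientations."""
    responses = [fftconvolve(img, gabor_kernel(theta=k * np.pi / n_angles),
                             mode="same")
                 for k in range(n_angles)]
    return np.max(responses, axis=0)
```

Taking the maximum over orientations makes the enhancement insensitive to the local direction of each vessel segment.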

Directional vessel skeleton detection
The multi-scale vessel enhancement filter is used to calculate the directional vessel skeleton of each retinal fundus photograph after the pre-processing [35]. For each photograph, the convolutions between the corresponding fundus photograph F(x, y) and the second-order derivatives of Gaussians are first calculated as Equations (4) and (5):

G(x, y, s) = (1 / (2πs²)) exp( −(x² + y²) / (2s²) )  (4)

Fab(x, y, s) = s^γ ( F ∗ ∂²G/∂a∂b )(x, y, s),  a, b ∈ {x, y}  (5)

where G(x, y, s) is the Gaussian function at scale s, (x, y) is the coordinate of points in the fundus photograph and the normalisation parameter γ is introduced by Lindeberg [36]. After calculating the three partial derivatives Fxx, Fxy and Fyy, the two eigenvalues of the Hessian matrix are calculated as Equation (6):

λ1,2 = ( Fxx + Fyy ± √( (Fxx − Fyy)² + 4Fxy² ) ) / 2,  ordered so that |λ1| ≤ |λ2|  (6)

where λ1 is the eigenvalue with the smallest magnitude, which represents the pixels belonging to a vessel region, and λ2 is an indicator of brightness/darkness. According to the eigenvalues λ1 and λ2, the vesselness of each fundus photograph at scale s is set to Vo(s) and detected by Equation (7):

Vo(s) = exp( −RB² / (2β²) ) · ( 1 − exp( −(λ1² + λ2²) / (2c²) ) ),  with Vo(s) = 0 where the sign of λ2 does not match the vessel polarity  (7)

where RB = λ1/λ2, and β = 0.5 and c = 15 are the control parameters that balance the sensitivity of the line filter to the measures RB and √(λ1² + λ2²). The final estimation of vesselness Vf (Figure 4(b)) is obtained by integrating the vesselness measures provided by the filter response at different scales, as shown in Equation (8):

Vf(x, y) = max over s of Vo(x, y, s)  (8)

However, registration based only on the binary mask of vessels is not accurate enough. So the directional information of the vasculature is calculated as Equation (9):

Do(s) = θs(x, y)  (9)

where Do(s) is the original directional information of each fundus photograph at scale s, taken as the orientation θs of the Hessian eigenvector corresponding to λ1, that is, along the vessel. The final directional information (Figure 4(c)) is calculated by Equation (10):

Df(x, y) = Do(x, y, s*),  s* = argmax over s of Vo(x, y, s)  (10)

After both the vesselness estimate Vf and the directional information Df are obtained, the non-directional vessel skeleton SV of each fundus photograph is first calculated using Equation (11) (Figure 4(b) and (d)):

SV(x, y) = Skeleton( Binarise( Vf(x, y) ) )  (11)

The final directional vessel skeleton DV is then obtained based on Equation (12) (Figure 4(c) and (e)):

DV(x, y) = Df(x, y) · SV(x, y)  (12)
The directional vessel skeletons of each fundus photograph are shown in Figure 5.
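The multi-scale vesselness and direction computation can be sketched in the Frangi style of the cited filter [35]; the scale set, the dark-vessel polarity handling and the simple thresholding used as a stand-in for skeletonisation are assumptions, and `vesselness_and_direction` is a hypothetical helper.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def vesselness_and_direction(img, scales=(1.0, 2.0, 4.0), beta=0.5, c=15.0):
    """Frangi-style multi-scale vesselness plus a per-pixel vessel direction.

    A sketch of Equations (4)-(10): the scale set and polarity choice are
    assumptions not fixed by the text.
    """
    img = img.astype(np.float64)
    best_v = np.zeros_like(img)
    best_d = np.zeros_like(img)
    for s in scales:
        # scale-normalised second derivatives (Lindeberg normalisation)
        fxx = s**2 * gaussian_filter(img, s, order=(0, 2))
        fyy = s**2 * gaussian_filter(img, s, order=(2, 0))
        fxy = s**2 * gaussian_filter(img, s, order=(1, 1))
        # Hessian eigenvalues, ordered so |l1| <= |l2| (Equation (6))
        tmp = np.sqrt((fxx - fyy)**2 + 4.0 * fxy**2)
        mu1 = 0.5 * (fxx + fyy + tmp)
        mu2 = 0.5 * (fxx + fyy - tmp)
        swap = np.abs(mu1) > np.abs(mu2)
        l1 = np.where(swap, mu2, mu1)
        l2 = np.where(swap, mu1, mu2)
        rb2 = (l1 / (l2 + 1e-12))**2            # blob-vs-line measure R_B^2
        s2 = l1**2 + l2**2                      # second-order structureness
        v = np.exp(-rb2 / (2 * beta**2)) * (1 - np.exp(-s2 / (2 * c**2)))
        v[l2 < 0] = 0.0                         # assume dark vessels on bright background
        # along-vessel orientation: perpendicular to the principal-curvature axis
        d = 0.5 * np.arctan2(2.0 * fxy, fxx - fyy) + np.pi / 2
        better = v > best_v                     # Equations (8) and (10)
        best_v = np.where(better, v, best_v)
        best_d = np.where(better, d, best_d)
    mask = best_v > 0.5 * best_v.max()          # stand-in for skeletonisation
    return best_v, np.where(mask, best_d, 0.0)  # V_f and directional skeleton
```

On a dark vertical line the filter responds on the line and reports a direction near π/2, i.e. along the vessel.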

Directional vessel skeletons based optimal transformation model
A relocating process is first performed to roughly align the fovea between two photographs of different sizes. The relocating is based on the combination of a simplified template match and affine registration. In contrast to traditional template matching, the directional vessel skeleton of the larger retinal fundus photograph (reference image) is equally divided into 25 regions based on the sizes of the two photographs. For each region, a simple affine registration with a maximum of only 300 iterations is performed against the directional vessel skeleton of the smaller photograph. The overlapping area between the transformed smaller photograph and each corresponding region is then calculated. The regions with overlapping areas less than a quarter of the smaller photograph are first discarded. Among the regions with overlapping areas above a quarter of the smaller photograph, the region with the smallest absolute error is finally chosen as the reference image to register with the smaller photograph, which is set as the moving photograph.
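The region-selection step can be sketched as below. As a simplification, the short per-region affine registration is replaced by scoring each candidate window at the identity pose, so `best_region` is an illustrative approximation of the paper's procedure, not a faithful implementation.

```python
import numpy as np

def best_region(ref_skel, mov_skel, grid=5):
    """Pick the reference sub-region (one of grid*grid = 25) that best
    matches the smaller photograph's skeleton.

    Sketch only: regions overlapping less than a quarter of the moving
    skeleton are discarded; the smallest absolute error wins.
    """
    H, W = ref_skel.shape
    h, w = mov_skel.shape
    mov_pixels = np.count_nonzero(mov_skel)
    best, best_err = None, np.inf
    for i in range(grid):
        for j in range(grid):
            y0, x0 = i * H // grid, j * W // grid
            region = ref_skel[y0:y0 + h, x0:x0 + w]
            if region.shape != (h, w):
                continue                      # window runs off the reference
            overlap = np.count_nonzero((region != 0) & (mov_skel != 0))
            if overlap < 0.25 * mov_pixels:
                continue                      # discard low-overlap regions
            err = np.abs(region - mov_skel).sum()
            if err < best_err:                # smallest absolute error wins
                best, best_err = (y0, x0), err
    return best                               # top-left corner of chosen region
```

Restricting the subsequent affine registration to the chosen window keeps the search near the fovea instead of over the whole larger photograph.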
Before the final registration between the reference image and the corresponding moving image, the two images are first zero-padded to the same size (Figure 6(b)). Affine registration, including translation, rotation, scale and shear, is performed between the zero-padded reference and moving images. The goal of this affine registration is to find a transformation of the moving image that maximises the overlapping vessel regions with connections above 100 pixels and squared direction difference below 0.1 between the zero-padded reference skeleton DVr and the transformed moving skeleton DVm(t), as Equations (13)-(15):

Sdi(t) = (1 / Connectioni(t)) Σ over (x, y) in Ωi of ( DVr(x, y) − DVm(x, y; t) )²  (13)

N = number of regions i with Connectioni(t) > 100 and Sdi(t) < 0.1  (14)

C(t) = Σ from i = 1 to N of Connectioni(t)  (15)

where (x, y) is the pixel coordinate, t is a vector of transformation parameters, Ωi and Sdi(t) are the i-th connected region of the overlap and its squared direction difference, N is the number of connected regions with more than 100 pixels and Sdi(t) below 0.1, Connectioni(t) is the number of pixels of the i-th connected region and C(t) is the final cost function. This maximisation can be solved iteratively by Equation (16):

t(k+1) = t(k) + α d(k)  (16)

where α = 0.01 is the iterative step size and d(k) is the gradient-based search direction at iteration k. An example of this registration between CFP and OFI is shown in Figure 6. For the registration between multi-modality retinal fundus photographs of the same size, the relocating process is not needed. Two photographs are registered just based on the final affine registration using the directional vessel skeletons. A registration between OFIs of the same patient with different scanning times is shown in Figure 7.

FIGURE 6 The direction-based registration between colour fundus photograph (CFP) and OCT fundus image (OFI). (a) The extracted reference through the template match (marked by the red box). (b) The directional vessel skeleton of OFI centred on the same location as (a). (c) The merged skeleton after the affine registration between the directional vessel skeletons of the extracted CFP (red) and OFI (green)
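A sketch of the cost described here (connected overlap regions above 100 pixels whose squared direction difference is below 0.1 contribute their pixel counts); the function name `registration_cost` and the use of SciPy's connected-component labelling are assumptions.

```python
import numpy as np
from scipy.ndimage import label

def registration_cost(dv_ref, dv_mov, min_pixels=100, max_sq_diff=0.1):
    """Sum the sizes of connected overlap regions that are both large
    (> 100 pixels) and directionally consistent (mean squared direction
    difference < 0.1), as reconstructed from the text."""
    overlap = (dv_ref != 0) & (dv_mov != 0)       # overlapping skeleton pixels
    labels, n = label(overlap)                    # connected regions of overlap
    total = 0
    for i in range(1, n + 1):
        region = labels == i
        size = int(region.sum())                  # Connection_i
        if size <= min_pixels:
            continue
        sq_diff = float(np.mean((dv_ref[region] - dv_mov[region]) ** 2))
        if sq_diff < max_sq_diff:                 # Sd_i below the threshold
            total += size                         # contributes to C(t)
    return total
```

A transformation search (e.g. the gradient-based iteration in the text) would evaluate this cost on the moving skeleton warped by each candidate parameter vector t.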

EXPERIMENT AND RESULTS
The automated multi-modality registration between retinal fundus photographs of different sizes took on average 80 s. The registration between two retinal fundus photographs with the same size took on average 60 s.

Qualitative analysis
The registration examples of both right and left eyes between CFP and corresponding OFI are respectively shown in Figures 8 and 9. The registration results between FFA and the corresponding OFIs are shown in Figure 10. Although there is a motion artifact in the OFI (Figure 10(b)), the FFA and the corresponding OFI are still accurately registered based on the directional vessel skeletons.

Quantitative analysis
The accuracy of the proposed registration method is quantitatively assessed for every retinal image pair. Specifically, pairs of corresponding points are selected manually in both images by an experienced ophthalmologist (Figure 11). These selected control points are equally distributed vessel intersection points, which improves the reliability of this quantification. The error in point placement is assessed as the root-mean-square error (RMSE) between the affine-transformed points and the points from the reference image. The RMSE of these selected control points is calculated as Equation (17):

RMSE = √( (1/n) Σ from i = 1 to n of [ (Pxi^r − Pxi^m)² + (Pyi^r − Pyi^m)² ] )  (17)

where n is the number of selected control points, Pxi^r and Pyi^r are the coordinates of the i-th control point in the reference image, and Pxi^m and Pyi^m are the coordinates of the i-th control point in the registered moving image. The RMSE results are shown in Table 1 (registrations with RMSE larger than 100 μm are defined as failures [17]). From the quantitative results, we can observe that the proposed method is relatively robust for multi-modality retinal image registration. For the registration between CFP and the corresponding OFI, we compare the performance of the proposed method with local patch matching [28]. The algorithm in [28] was also proposed for the registration between CFP and OFI. The comparison results are shown in Table 2. Although the RMSE of the successful registrations based on local patch matching is slightly lower than that of our method, the success rate of local patch matching was only 55%, which is much lower than that of our method. Therefore, the performance of the proposed method is better than the one presented in [28].
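Equation (17) reduces to a few lines; `rmse` is a hypothetical helper name taking the control points as N × 2 arrays of (x, y) coordinates.

```python
import numpy as np

def rmse(ref_pts, mov_pts):
    """Root-mean-square Euclidean distance between corresponding
    control points (Equation (17))."""
    ref_pts = np.asarray(ref_pts, dtype=float)
    mov_pts = np.asarray(mov_pts, dtype=float)
    d2 = ((ref_pts - mov_pts) ** 2).sum(axis=1)  # squared distance per pair
    return float(np.sqrt(d2.mean()))
```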
For the registration between OFIs from the same patient at different visiting times, a comparison between PIIFD [19] and our method is performed. The comparison results are shown in Table 3. The proposed method achieves a 100% success rate on 20 image pairs. More importantly, the RMSE of our method is much lower than that of PIIFD. This performance is very useful for predicting the progression of diseases.
In order to further validate the performance of our method, the RMSE of our method with and without the directional information is compared, as shown in Table 4. From this table, we can see that the use of the directional information brings a large improvement in the registration success rate. For instance, the registration success rate between ICGA and OFI increases from 35% to 80%. The improvement in average registration accuracy (decrease of RMSE) between different modality retinal fundus photographs is around 42%. In fact, many pseudo blood vessels, such as noise, artifacts and lesions, are identified in the vessel skeletons. Without the directional information, these pseudo vessels may be registered to the real vessels. So the use of directional information is important for the registration between low-quality images of different modalities. In addition, paired t-tests are performed in this comparison based on the successfully registered photograph pairs to further confirm the statistical significance of the improvement.

DISCUSSION
We have developed a method for the automated registration of multi-modality retinal fundus photographs. This registration is very helpful for aided diagnosis based on the joint consideration of retinal fundus photographs with different modalities. In this study, four fundus photograph types, OFI, CFP, FFA and ICGA, are used to evaluate the performance of the proposed method. The conventional topographic images CFP, FFA and ICGA are respectively registered to the corresponding OFI. In addition, the OFIs from the same patient at different visiting times can also be registered using our method. Unfortunately, the registration accuracy for some photographs is still unsatisfactory. These unsatisfactory registrations are mostly caused by low-quality OFIs, which often have large motion artifacts and low-contrast vasculature (Figure 12(b)). However, this issue could be alleviated by using an advanced OCT system to obtain high-quality OFIs [37][38][39][40][41][42]. In addition, the quality of the OFIs could also be improved by adding a layer segmentation process before the projection (whole-volume projection is used in this work) [43][44][45].
In order to improve the registration accuracy and stability, the directional vessel skeleton of each fundus photograph is obtained before the registration. This directional vessel skeleton reflects not only the vasculature but also the directional information, which is as important as the vasculature itself for accurate registration. The noise, artifacts and lesions that are mistaken for blood vessels in the vessel detection step are indistinguishable from the real vessels in the binary vessel skeletons. A registration involving many deformations, especially rotation, may therefore produce wrong matches between pseudo vessels and real vessels. The addition of directional information drastically reduces the probability of such wrong matches, because the directional information of the pseudo vessels and the real vessels differs greatly. This improvement is well reflected in Table 4.

CONCLUSION
We propose an automated registration method for multi-modality retinal fundus photographs using the directional vessel skeleton. This method is mainly used to register two retinal fundus photographs with different modalities from the same patient. In this study, four kinds of retinal fundus photographs, namely OFI, CFP, FFA and ICGA, are used to evaluate the registration accuracy of the proposed method. The proposed registration allows the physician to correlate the cross-sectional scattering properties of the retina with the familiar information provided by other modalities. In the future, we will further evaluate the robustness of the proposed method using a larger dataset.