A hybrid BPSO‐SVM for feature selection and classification of ocular health

Correspondence: B. Keerthiveena, Department of Instrumentation and Control Engineering, PSG College of Technology, Tamil Nadu, India-641004. Email: 1707ru01@psgtech.ac.in

Abstract: Glaucoma and diabetic retinopathy are among the most common eye diseases and the leading causes of blindness around the world. The prime objective of this study is to devise and develop an experimental computer-aided diagnosis system that provides an efficient way of assisting the ophthalmologist in the early detection of ocular diseases such as glaucoma and diabetic retinopathy. The proposed technique follows three stages: pre-processing, feature selection and classification. Initially, the fundus image is pre-processed to extract the green channel image, and the obtained green channel image is further enhanced using the contrast limited adaptive histogram equalisation technique. Three different kinds of features, clinical features, transform domain features and structural features, are utilised to extract the relevant information from the enhanced fundus images. To avoid redundant information, an improved feature selection mechanism is used to select the optimum set of features from the extracted features. Subsequently, the selected features are used to train the support vector machine classifier for the classification of the retinal diseases with 10-fold cross-validation. The performance of the proposed method is assessed using eight different quantitative evaluation measures. The experimental results demonstrate the effectiveness of the proposed work over prior works for the early detection of ocular diseases.


INTRODUCTION
Computer-aided diagnosis systems have increased the potential of using image processing and pattern recognition techniques in the field of ophthalmology. Fundus images provide requisite information about the sensory part of the vision system. The prevalence of retinal diseases is increasing day by day, and diseases like glaucoma and diabetic retinopathy (DR) are the leading causes of blindness worldwide. Figure 1 shows the anatomy of the human eye with different retinal diseases. At present, fundus photography is widely used for mass screening of these diseases and is graded manually, which is time-consuming and error-prone. In this scenario, a computer-aided diagnosis system is required for the automation of mass screening and diagnosis of retinal diseases to cater to future needs. The initial step in classifying a retinal image as normal or abnormal is the segmentation of the blood vessels, optic disc and optic cup. Several indicators support the detection of glaucoma, such as the size of the optic disc, the progressive change of the optic cup and the narrowness of the optic disc. Liu et al. [1] localised the optic disc by taking its centre and radius as a reference and extracting a region of interest (ROI) with the highest-intensity pixels; considering the ROI as an initial contour, a variational level set approach is used for segmentation. The optic disc boundary is not accurate due to the presence of vessels; hence, it is corrected using ellipse fitting to smoothen the boundary. Panda et al. [2] proposed an automatic technique for the detection of the retinal nerve fibre layer and angular width using a cumulative zero-count local binary pattern (LBP) and directional differential energy. Zhao et al. [3] discussed a region growing segmentation method. In the said approach, a 2D Gabor wavelet is used to enhance the vessels. Then, an anisotropic diffusion filter is used to smooth the image while preserving the vessel boundaries.
Finally, the region growing method and a region-based active contour model with a level set implementation are applied to extract the retinal vessels, and the results provide sufficiently detailed information about the thin retinal vessels. Lotankar et al. [4] proposed an optic disc segmentation approach based on the geodesic active contour model. Vessels in the retinal images are eliminated by a bottom-hat filter followed by Otsu's thresholding. In this regard, it is observed that clustering-based approaches [5] also yield better accuracy for the segmentation of the optic disc and optic cup. Franklin et al. [6] discussed the computerised screening of DR using an artificial neural network (ANN). Using the ANN, features such as haemorrhages and exudates can be detected, and the said features are used to examine whether retinopathy is present or not.
In our previous work [7], first-order and second-order features were extracted and classified using an SVM classifier for the early detection of various retinal pathologies. Huang et al. [8] initially normalised the input image, and features are extracted from the vessel centreline covering large, medium and tiny vessels. Feature selection is carried out using a genetic search algorithm and, finally, a linear discriminant analysis classifier is trained. They achieved an accuracy of 92.0%, sensitivity of 89.6% and specificity of 91.3% on the INSPIRE dataset, and an accuracy of 72.0%, sensitivity of 70.9% and specificity of 73.8% on the digital retinal images for vessel extraction (DRIVE) dataset. Bock et al. [9] developed an algorithm to compute a glaucoma risk index from the colour fundus image, in which 90 features are combined using principal component analysis (PCA); a sensitivity of 73% and specificity of 85% in the detection of glaucoma are obtained with 575 images. Welikala et al. [10] proposed genetic-algorithm-based feature selection and dual classification for the automatic detection of DR; a sensitivity of about 91.38% and specificity of 96.00% are obtained using a dataset of 60 images.
Rahebi and Hardalaç [11] proposed a DR detection scheme using a grey level co-occurrence matrix. The feature set is given as input to a multilayer perceptron neural network for the classification of blood vessels. Singh et al. [12] proposed a technique based on predominant features such as the cup-to-disc ratio (CDR), neuroretinal rim (NRR) area and blood vessels; the said approach is found to achieve higher accuracy. Issac et al. [13] developed an automatic image analysis system for the early detection of glaucoma using wavelet features; to improve the classification accuracy, the wavelet features are extracted from the segmented optic disc. Gour and Khanna [14] proposed an automated system for the early detection of glaucoma in which features extracted from GIST and the pyramid histogram of oriented gradients (HOG) are selected through PCA.
From the above analysis, it can be concluded that the state-of-the-art techniques in ocular disease classification have two major disadvantages: low efficiency in the early detection of ocular pathologies, and algorithm complexity that increases with the dimensionality of the data. To overcome these two disadvantages, the fitness function considered in the proposed scheme takes care of both the classification performance and the feature selection process. By optimising the weights, regularisation parameters and kernels of the SVM, a minimum number of features is selected and improved diagnostic accuracy is obtained using the particle swarm optimisation (PSO) scheme.
It is to be noted that, to the best of the authors' knowledge, none of the state-of-the-art techniques has reported the use of a combination of clinical, transform and structural features with PSO-based feature selection and an SVM for the classification of retinal diseases. The foremost objective of this work is to glean information from the fundus images that is suitable for the classification of retinal diseases. Fundus images suffer from poor illumination, so pre-processing before feature extraction is a basic step towards better performance. In the subsequent stage of the proposed technique, three different kinds of features, clinical, transform domain and structural, are extracted from the contrast-enhanced green channel fundus images. The extracted features are systematically selected according to their significance using the PSO technique. Finally, the selected features are used to train the SVM for the classification of retinal diseases. To validate the proposed scheme, fundus images affected by DR and glaucoma from various open-access datasets are classified. The performance of the proposed algorithm is compared against four existing state-of-the-art techniques.
This paper is organised as follows. Section 2 presents the block diagram and a brief discussion of each block of the proposed technique. The experimental results with quantitative evaluations and discussions are provided in Section 3. Section 4 summarises the conclusions and future work.

PROPOSED METHOD
The schematic block diagram of the proposed technique is shown in Figure 2. The proposed scheme follows three steps: pre-processing, feature selection, and classification of normal, glaucoma and DR images. In the initial step of the algorithm, the input fundus images are pre-processed to extract the green channel, and the green channel images are further enhanced using the contrast limited adaptive histogram equalisation (CLAHE) algorithm to bring out their subtle details. In the subsequent step, four different types of features from three categories are extracted: a clinical feature, a transform domain feature and structural features (HOG and LBP). To extract the clinical features, a morphological operation based on dilation is exploited, and the fuzzy c-means (FCM) clustering technique is used to segment the optic disc and optic cup. The wavelet packet transform (WPT) can build a time-frequency representation of an image; a third-level WPT is used and statistical features are extracted from the selected sub-bands. HOG and LBP features are extracted to analyse the structural characteristics of the fundus images. Once all the features are extracted, the PSO algorithm is applied to select the best set of features for classification. The combination of these features is intended to give a promising increase in classifier performance for three classes, namely class 0 (normal eye), class 1 (glaucoma) and class 2 (DR). Each block is described in detail as follows.

Extraction of green channel and contrast limited enhancement
Medical images usually contain a noisy background due to interference and illumination artefacts, which affect the measurement of parameters in image acquisition systems. Manual control of these artefacts while capturing the images is inefficient and time-consuming. Pre-processing is an essential step that eradicates the noise present in the fundus image and reduces image variation by normalising the original image with respect to a reference image. In the proposed scheme, we follow two steps for pre-processing: (i) green channel extraction and (ii) CLAHE. The green channel [15] of the RGB colour retinal image presents the highest contrast between the background and the vessels. The red channel is the brightest of the RGB channels but has low contrast, and the blue channel has a poor dynamic range. Hence, in the proposed algorithm, we adhere to the green channel of the retinal image for further processing. CLAHE [16] is a modified version of adaptive histogram equalisation (AHE) [17]. In AHE, the enhancement process is applied over a specific region, whereas in CLAHE, the entire image is divided into small regions called tiles and the contrast of each tile is enhanced. The enhancement of a region is controlled by parameters such as the number of tiles, the number of bins and the clip limit. The number of tiles (rectangular contextual regions) is chosen to be [16], which is a two-element vector of positive integers. To prevent the image from oversaturation, a contrast factor of 0.05 is used as the clip limit.
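The two pre-processing steps can be sketched in a few lines of code. The following is a minimal NumPy illustration, not the exact implementation: it applies clipped histogram equalisation tile by tile and omits the bilinear blending between neighbouring tiles that full CLAHE performs; the function names and the 8 × 8 tile default are illustrative.

```python
import numpy as np

def green_channel(rgb):
    """Extract the green channel from an H x W x 3 RGB fundus image."""
    return rgb[:, :, 1]

def clahe_tile(tile, clip_limit=0.05, n_bins=256):
    """Clipped histogram equalisation for one tile (values in [0, 255])."""
    hist, _ = np.histogram(tile, bins=n_bins, range=(0, 256))
    hist = hist.astype(float) / tile.size               # normalised histogram
    excess = np.clip(hist - clip_limit, 0, None)        # mass above the clip limit
    hist = np.minimum(hist, clip_limit) + excess.sum() / n_bins  # redistribute
    cdf = np.cumsum(hist)
    lut = np.clip(np.round(255 * cdf), 0, 255).astype(np.uint8)  # tile mapping
    return lut[tile]

def clahe(img, tiles=(8, 8), clip_limit=0.05):
    """Tile-wise clipped equalisation (no interpolation between tiles)."""
    out = np.empty_like(img)
    h_step, w_step = img.shape[0] // tiles[0], img.shape[1] // tiles[1]
    for r in range(tiles[0]):
        for c in range(tiles[1]):
            rs, cs = r * h_step, c * w_step
            re = img.shape[0] if r == tiles[0] - 1 else rs + h_step
            ce = img.shape[1] if c == tiles[1] - 1 else cs + w_step
            out[rs:re, cs:ce] = clahe_tile(img[rs:re, cs:ce], clip_limit)
    return out
```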

Feature extraction and selection
In the proposed scheme, we extract three different kinds of features from the fundus images: clinical domain features, transform domain features and structural domain features. A detailed description of each kind of feature is given in the subsequent sections.

Extraction of the clinical domain features
Clinical domain features are the most important features for any medical diagnosis application. They represent the quantification of the medical data usually referred to by medical practitioners for disease identification and medical diagnosis. The optic disc is one of the essential parts in ocular disease diagnosis, and identifying its position and shape in the fundus image is an essential step for examining the severity of glaucoma and DR. The optic disc can be divided into two distinct zones, namely the optic cup and the NRR. Loss in the optic nerve head leads to enlargement of the optic cup region and thinning of the NRR. The optic disc is surrounded by major blood vessels. In the green channel image, it may be observed that the contrast and visibility of the optic disc are superior and its intensity high, while the NRR and retinal blood vessels are of low intensity.
During the pre-processing stage, all the images are scaled to the same size. The image obtained after pre-processing is subjected to a morphological operation; for this process, a structuring element with a diameter equal to the primary blood vessel width is chosen to isolate the optic disc from the blood vessels. FCM [18] is a clustering algorithm used to group a set of unlabelled data points based on the similarity between the data elements. The FCM algorithm [19] attempts to partition a finite collection of $m \times n$ data points into a set of $C$ fuzzy clusters with respect to a given criterion. The objective function $J_{FCM}$ used in the FCM algorithm can be written as

$$J_{FCM} = \sum_{i=1}^{mn} \sum_{j=1}^{C} u_{ij}^{\,r}\, b_{ij}^{2},$$

where $u_{ij}$ is the membership of the $i$th pattern to the $j$th cluster, $f_i$ represents the $i$th data point and $b_{ij} = \lVert f_i - v_j \rVert$ is the distance of a pixel $f_i$ from the cluster centre $v_j$. The degree of fuzzification is denoted by $r$, and $v_j$ is the $j$th cluster centre in the considered feature space. The cluster centres are initialised, and $u_{ij}$ for each pixel and the cluster centres are updated using

$$u_{ij} = \frac{1}{\sum_{k=1}^{C} \left( b_{ij}/b_{ik} \right)^{2/(r-1)}}, \qquad v_j = \frac{\sum_{i=1}^{mn} u_{ij}^{\,r}\, f_i}{\sum_{i=1}^{mn} u_{ij}^{\,r}}.$$

The FCM exploits a deterministic strategy to optimise the objective function $J_{FCM}$: it starts by initialising the cluster centre values and then, for each component, updates the solution that minimises the objective function. This process is repeated until the cluster centres converge to within a tolerance $\varepsilon$:

$$\max_j \lVert v_j^{\text{new}} - v_j^{\text{old}} \rVert < \varepsilon.$$

In this work, the number of clusters $C$ is set to two, and hence the complete process is a binary segmentation. Finally, the optic disc is obtained, and the number of white pixels in the final segmented image is calculated and used for the calculation of the CDR. Based on the segmented blood vessels, optic disc and optic cup, four clinical features, namely the disc damage likelihood scale (DDLS), CDR, NRR area and blood vessel ratio (BVR), are extracted.
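A minimal sketch of the FCM segmentation described above, assuming scalar (intensity-only) features; the function names and the quantile-based centre initialisation are our choices for illustration, not the paper's:

```python
import numpy as np

def fcm(data, C=2, r=2.0, tol=1e-5, max_iter=100):
    """Fuzzy c-means on scalar pixel intensities.

    Minimises J_FCM = sum_i sum_j u_ij^r * b_ij^2 with b_ij = |f_i - v_j|
    by alternating membership and cluster-centre updates until the centres
    move by less than `tol`.
    """
    f = np.asarray(data, dtype=float).ravel()
    v = np.quantile(f, np.linspace(0.1, 0.9, C))       # spread-out initial centres
    for _ in range(max_iter):
        b = np.abs(f[:, None] - v[None, :]) + 1e-12    # distances b_ij
        # u_ij = 1 / sum_k (b_ij / b_ik)^(2 / (r - 1))
        u = 1.0 / np.sum((b[:, :, None] / b[:, None, :]) ** (2.0 / (r - 1.0)), axis=2)
        v_new = (u ** r).T @ f / np.sum(u ** r, axis=0)  # centre update
        if np.max(np.abs(v_new - v)) < tol:
            v = v_new
            break
        v = v_new
    return v, u

def segment_disc(img, **kw):
    """Binary segmentation: keep the pixels of the brighter cluster."""
    v, u = fcm(img, C=2, **kw)
    labels = np.argmax(u, axis=1).reshape(np.shape(img))
    return (labels == int(np.argmax(v))).astype(np.uint8)
```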
Different clinical features used in the proposed scheme are as follows. DDLS ($f_1$): early diagnosis of glaucoma prevents damage to the optic nerve that results in deteriorated vision [20]. The severity of the disease is quantified using the DDLS, which is based on the ratio of the minimum rim width to the disc diameter:

$$\text{DDLS} = \frac{\text{minimum rim width}}{\text{disc diameter}},$$

where the minimum rim width is obtained by subtracting the cup length from the disc length. Since the shape of the optic disc is oval, it features two diameters: one along the semi-minor axis and one along the semi-major axis. The shortest and longest diameters are calculated and added to obtain the disc diameter of the optic disc.

CDR (f 2 ):
The optic CDR is one of the important features for the evaluation of glaucoma and can be computed in more than one way, for example from the ratio of the cup-to-disc horizontal length, vertical length or area. In this paper, the area-based CDR is used:

$$\text{CDR} = \frac{A_{cup}}{A_{disc}},$$

where $A_{cup}$ and $A_{disc}$ are the areas of the optic cup and optic disc, respectively. The areas are obtained from the morphological segmentation by counting the number of white pixels present in the corresponding segmented image. If the CDR [21] value is less than 0.3, the eye is considered normal; if the CDR value is more than 0.3, the eye is considered abnormal.
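The area-based CDR and the 0.3 screening threshold can be computed directly from the binary masks; a small sketch (function names are illustrative):

```python
import numpy as np

def cup_to_disc_ratio(cup_mask, disc_mask):
    """Area CDR from binary segmentation masks (white pixels = region)."""
    a_cup = float(np.count_nonzero(cup_mask))
    a_disc = float(np.count_nonzero(disc_mask))
    return a_cup / a_disc

def is_suspect(cdr, threshold=0.3):
    """Flag the eye as abnormal when the CDR exceeds the clinical threshold."""
    return cdr > threshold
```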

Neuro-retinal rim area (f 3 ):
The optic disc is vertically oval and the optic cup is horizontally oval, resulting in the characteristic disc shape of the NRR. The appearance of a healthy disc follows the ISNT rule, which orders the rim areas of the inferior (I), superior (S), nasal (N) and temporal (T) regions; a violation of the rule, with NRR thinning and vessel baring inferiorly, is evidence of glaucoma. For a healthy disc, the thickest portion of the rim is I, followed by S, then N, and finally T, that is, I > S > N > T [22]. The white pixels lying in each quadrant are used to calculate the NRR area, and the NRR feature is computed as

$$\text{NRR} = \frac{\text{rim area in I} + \text{rim area in S}}{\text{rim area in N} + \text{rim area in T}}.$$

For a healthy disc, the optic cup spreads out in the nasal and temporal regions, but as glaucoma progresses, the optic cup starts to expand into the inferior and superior regions of the optic disc. The binary image of the segmented disc and cup region is obtained and cropped using a suitable mask. The image is divided into four sectors (I, S, N, T) by rotating the mask at an angle of 90°, and the NRR area in each sector is computed by counting the number of white pixels in each of the I, S, N and T regions.
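The quadrant-wise NRR computation can be sketched as follows. The sector geometry (superior at the top, a left-eye nasal/temporal assignment) is an assumption for illustration:

```python
import numpy as np

def quadrant_masks(shape):
    """Four 90-degree sector masks (I, S, N, T) around the image centre.

    Orientation is an assumption: superior = top, inferior = bottom; the
    nasal/temporal sides depend on which eye is imaged (left eye here).
    """
    h, w = shape
    y, x = np.mgrid[0:h, 0:w]
    ang = np.degrees(np.arctan2(-(y - h / 2.0), x - w / 2.0)) % 360
    return {
        "S": (ang >= 45) & (ang < 135),    # top sector
        "T": (ang >= 135) & (ang < 225),   # left sector
        "I": (ang >= 225) & (ang < 315),   # bottom sector
        "N": (ang < 45) | (ang >= 315),    # right sector
    }

def nrr_ratio(rim_mask):
    """NRR feature: (rim area in I + S) / (rim area in N + T)."""
    q = quadrant_masks(rim_mask.shape)
    area = {k: np.count_nonzero(rim_mask.astype(bool) & q[k]) for k in q}
    return (area["I"] + area["S"]) / max(area["N"] + area["T"], 1)

def follows_isnt(rim_mask):
    """Check the ISNT rule I > S > N > T on quadrant rim areas."""
    q = quadrant_masks(rim_mask.shape)
    a = {k: np.count_nonzero(rim_mask.astype(bool) & q[k]) for k in q}
    return a["I"] > a["S"] > a["N"] > a["T"]
```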

BVR (f 4 ):
The fundus scanning begins with the temporal part of the optic disc, then moves towards the superior, nasal and inferior parts and back to the temporal part. In a healthy eye, blood vessels are concentrated mainly in the inferior and superior quadrants of the optic disc [23]. If the person is affected by glaucoma, the blood vessels shift towards the nasal side of the optic disc; hence, the BVR should be lower for a glaucomatous eye than for a normal eye. It is calculated as

$$\text{BVR} = \frac{\text{blood vessel area in I} + \text{blood vessel area in S}}{\text{blood vessel area in N} + \text{blood vessel area in T}}.$$
The mask used for the extraction of the NRR area is reused for extracting the blood vessel area. After extracting the blood vessels in each quadrant, the blood vessel area ratio is calculated; this ratio is lower for a glaucomatous eye than for a healthy eye.
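A sketch of the quadrant-wise vessel areas and the BVR, reusing a single 90-degree sector mask that is rotated three times, as the text describes; the quadrant ordering per rotation is an assumption:

```python
import numpy as np

def quadrant_areas(vessel_mask, sector_mask):
    """Vessel pixel counts in four quadrants.

    `sector_mask` selects one sector; rotating it by 90 degrees three times
    sweeps out the remaining quadrants, mirroring the paper's mask-rotation
    procedure. The quadrant order per rotation depends on the starting mask.
    """
    areas = []
    m = sector_mask.astype(bool)
    for _ in range(4):
        areas.append(int(np.count_nonzero(vessel_mask.astype(bool) & m)))
        m = np.rot90(m)   # next quadrant (counter-clockwise)
    return areas

def blood_vessel_ratio(vessel_mask, sector_mask):
    """BVR = (vessel area in I + S) / (vessel area in N + T), assuming the
    rotation order yields [S, T, I, N] for a top-sector starting mask."""
    s, t, i, n = quadrant_areas(vessel_mask, sector_mask)
    return (i + s) / float(max(n + t, 1))
```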

Extraction of the transform domain features
The retinal image features defined in the spatial domain alone are not sufficient for the classification of diseases. In the WPT, both time and frequency domain information can be used. The WPT was coined by Coifman et al. [24] as a generalisation of the wavelet transform that offers a richer range of possibilities for signal analysis. The classical two-band wavelet transform has a logarithmic frequency resolution: the low frequencies have narrow bandwidth with finer resolution, and the high frequencies have wide bandwidth with poor resolution. The WPT decomposes both the low and high passbands, which allows a finer, adjustable resolution at high frequencies. A space $V_j$ of approximations at resolution $2^{-j}$ is decomposed into a low-frequency space $V_{j+1}$ and a detail space $W_{j+1}$, where $j$ is the arbitrary starting scale and $p$ and $q$ denote the scale and translation indices, respectively. Let $h$ and $g$ be a pair of conjugate mirror filters with finite impulse responses; the discrete WPT coefficients of a signal $x_n$ ($0 \le n < N$) of length $N = 2^{J}$ can be computed as

$$\Psi_{j+1}^{2p}[q] = \sum_{n} h[n - 2q]\, \Psi_{j}^{p}[n], \qquad \Psi_{j+1}^{2p+1}[q] = \sum_{n} g[n - 2q]\, \Psi_{j}^{p}[n],$$

where $\Psi_{j}^{p,q}$ is the transform coefficient corresponding to the wavelet packet at scale $2^{-j}$. The transform is invertible if appropriate dual filters $\bar{h}[n]$ and $\bar{g}[n]$ are used on the synthesis side. For images, the WPT results in $2^{2n}$ bases, where $n$ is the level of decomposition. To select the best basis for the wavelet packet, a cost function should be optimised for each sub-band; the cost function can be formulated using the logarithm of energy, Shannon entropy [25], rate-distortion optimisation [26] or the singular value decomposition [27] method. In this work, energy-based best basis selection is used and the 21 best bases are selected, as shown in Figure 3.
Features such as the mean, standard deviation, skewness, kurtosis, entropy, energy, contrast, correlation, dissimilarity, autocorrelation, inverse difference, cluster prominence, cluster shade, difference variance, difference entropy, sum of squares, sum variance, sum entropy, homogeneity and maximal correlation coefficient are extracted from the selected WPT sub-bands.
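A compact illustration of a full third-level wavelet packet decomposition with energy-based best-basis selection, using a Haar filter pair for simplicity (the paper does not state which wavelet is used) and only a subset of the listed statistics; all names are illustrative:

```python
import numpy as np

def haar_split(x):
    """One 2-D Haar analysis step: return the LL, LH, HL, HH sub-bands."""
    a = (x[0::2, :] + x[1::2, :]) / 2.0   # low-pass along rows
    d = (x[0::2, :] - x[1::2, :]) / 2.0   # high-pass along rows
    ll = (a[:, 0::2] + a[:, 1::2]) / 2.0
    lh = (a[:, 0::2] - a[:, 1::2]) / 2.0
    hl = (d[:, 0::2] + d[:, 1::2]) / 2.0
    hh = (d[:, 0::2] - d[:, 1::2]) / 2.0
    return [ll, lh, hl, hh]

def wpt(img, level=3):
    """Full wavelet-packet tree: every sub-band is split at every level,
    giving 4**level leaves (64 for a third-level decomposition). Image
    sides must be divisible by 2**level."""
    bands = [np.asarray(img, dtype=float)]
    for _ in range(level):
        bands = [sb for b in bands for sb in haar_split(b)]
    return bands

def band_energy(band):
    return float(np.sum(band ** 2))

def subband_stats(band):
    """A few first-order statistical features for one sub-band."""
    b = band.ravel()
    mu, sd = b.mean(), b.std()
    return {"mean": mu, "std": sd, "energy": band_energy(band),
            "skewness": float(((b - mu) ** 3).mean() / (sd ** 3 + 1e-12)),
            "kurtosis": float(((b - mu) ** 4).mean() / (sd ** 4 + 1e-12))}

def best_bases(img, level=3, k=21):
    """Energy-based selection of the k strongest leaf sub-bands."""
    bands = wpt(img, level)
    order = np.argsort([-band_energy(b) for b in bands])
    return [bands[i] for i in order[:k]]
```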

Extraction of the structural domain features
In the proposed scheme, we consider two important structure-based features, that is, HOG and LBP, to represent the structural content of the fundus images.
HOG: a feature descriptor represents the useful information extracted from an image or image patch [28]. In the HOG feature descriptor, the features are described by the distribution of gradient directions. The input image is divided into 16 overlapping blocks, and each block comprises 4 cells. The gradient image is obtained by filtering the image with a 'Sobel' operator of kernel size 3 × 3, and the histogram of gradients is calculated for the 4 × 4 cells. The gradient of a 4 × 4 image patch contains the magnitude and direction of the pixels. Each gradient direction is then quantised into 9 major orientations, with bin $k$ centred at $20(k-1)$ degrees, $k = 1, 2, \ldots, 9$. A bin is selected based on the gradient direction, and the value that enters the bin is determined by the gradient magnitude. The feature vector $F_b$ of each block is the concatenation of the orientation histograms of its 4 cells, and the normalised feature vector $F_b'$ is obtained by L2 normalisation:

$$F_b' = \frac{F_b}{\sqrt{\lVert F_b \rVert_2^2 + \epsilon^2}}.$$

The HOG feature vector is obtained by concatenating all the normalised block feature vectors. There are several horizontal and vertical block positions, 15 in total, and the size of each normalised block vector is 1 × 36; the size of the HOG feature vector after concatenation is therefore 1 × 540. LBP: LBP [29] is an effective greyscale texture operator used in many computer vision applications. To calculate the LBP value of a pixel, the 3 × 3 neighbouring pixels are thresholded with respect to the centre value and the result is considered as a binary number. The LBP label of every pixel is obtained by

$$\text{LBP}_{T,R} = \sum_{t=0}^{T-1} s(g_t - g_c)\, 2^{t}, \qquad s(x) = \begin{cases} 1, & x \ge 0, \\ 0, & x < 0, \end{cases}$$

where $T$ is the total number of neighbours involved, $R$ is the radius of the neighbourhood, $g_c$ is the grey value of the central pixel and $g_t$ are the values of its neighbours.
The value of the labels depends on the size of the neighbourhood, and $2^{T}$ different binary patterns can be generated in each neighbourhood. When LBP is used for texture description, it is common to include a contrast measure by defining a local variance:

$$\text{VAR}_{T,R} = \frac{1}{T} \sum_{t=0}^{T-1} (g_t - \mu)^2, \qquad \mu = \frac{1}{T} \sum_{t=0}^{T-1} g_t.$$

The LBP and the local variance are combined to enhance the performance of the LBP operator [30]. In the proposed algorithm, the LBP value of each pixel (total number of neighbours T = 8, radius of the neighbourhood R) is computed, and the corresponding eight-bit binary numbers are generated as shown in Figure 4. The histogram of the LBP values is then calculated; the size of the histogram is 1 × 256.
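The basic 8-neighbour LBP and its 256-bin histogram can be sketched as below (a simplification: the circular neighbourhood of radius R is approximated by the 3 × 3 square, and the bit ordering is our choice):

```python
import numpy as np

def lbp_image(img):
    """Basic 8-neighbour LBP: threshold each 3x3 neighbourhood against its
    centre pixel and pack the 8 comparison bits into one 8-bit code."""
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    center = img[1:h - 1, 1:w - 1]
    # clockwise neighbour offsets starting at the top-left pixel
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros(center.shape, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        nb = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        code |= (nb >= center).astype(np.uint8) << bit   # s(g_t - g_c) * 2^t
    return code

def lbp_histogram(img):
    """1 x 256 normalised histogram of the LBP codes (the texture feature)."""
    codes = lbp_image(img)
    hist, _ = np.histogram(codes, bins=256, range=(0, 256))
    return hist / hist.sum()
```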

Support vector machine for image classification
The support vector machine (SVM) was developed by Vapnik and co-workers [31] in the early 1990s for classification and regression tasks. Machine learning is the subset of artificial intelligence wherein a system is trained to perform a task without being explicitly programmed. Compared with other machine learning techniques, the SVM requires relatively few training samples, and its computational efficiency is high. There are two stages in supervised learning: a training phase and a testing phase. It is often observed that increasing the amount of training data decreases the error. During the testing phase, the classification criterion, or decision function, separates the unseen data. For two-class data, this criterion forms a hyperplane with maximum distance from the data of the two classes [32]. However, if the data are not linearly separable, the input data are mapped into a higher-dimensional dot-product feature space.
Let the dataset $X$ consist of $N$ observations and $F$ features. The $i$th training sample is denoted by $x_i = (x_{i1}, \ldots, x_{iF})^{T}$ with class label $y_i \in \{-1, +1\}$ for $i = 1, 2, \ldots, N$. For linearly non-separable classes, no hyperplane can separate all the positive and negative samples correctly. To allow violations of the constraints $y_i((w_c \cdot x_i) + b) \ge 1$, slack variables $\xi_i \ge 0$ are introduced:

$$y_i((w_c \cdot x_i) + b) \ge 1 - \xi_i, \qquad i = 1, \ldots, N.$$

The objective function is defined as

$$\min_{w_c,\, b,\, \xi} \; \frac{1}{2} \lVert w_c \rVert^2 + P_C \sum_{i=1}^{N} \xi_i,$$

where $\xi = (\xi_1, \ldots, \xi_N)^T$ and $P_C > 0$ is a penalty parameter. The two terms in the objective function indicate that we minimise not only $\lVert w_c \rVert^2$ (maximising the margin) but also $\sum_i \xi_i$; the parameter $P_C$ is the tuning parameter between the two.
For multi-class classification, there are two major SVM strategies: one-against-all (OAA) and one-against-one. The OAA approach, proposed by Bottou et al. [33], converts a classification problem with $k$ categories into $k$ binary problems. When training the $i$th classifier, the data in the $i$th category are labelled $+1$ and the data of the remaining categories are labelled $-1$, so that $k$ binary SVMs are trained [34]. During testing, each instance is evaluated by all $k$ trained binary SVMs, and the classification result is determined by comparing their outputs through the decision function

$$f(x) = \arg\max_{i = 1, 2, \ldots, k} \left( (w_c^{i})^{T} \phi(x) + b_i \right),$$

that is, the category with the maximum decision value is assigned. To train the multi-class SVM, let $f_1, f_2, f_3, \ldots, f_m$ be the feature vectors with labels $l_1, l_2, l_3, \ldots, l_w$ generated for the training set, where $m$ is the size of the training data. In the proposed scheme, the size of the feature vector is 821, and the labels are denoted as 0 for a normal eye, 1 for glaucoma and 2 for DR.
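The OAA strategy can be illustrated with a minimal linear SVM trained by sub-gradient descent on the hinge-loss objective above. This is a sketch, not the paper's implementation (which uses kernel SVMs with 10-fold cross-validation); all names and hyper-parameters are illustrative:

```python
import numpy as np

class LinearSVM:
    """Binary linear SVM trained by sub-gradient descent on
    (1/2)||w||^2 + P_C * sum_i max(0, 1 - y_i (w.x_i + b))."""
    def __init__(self, P_C=1.0, lr=0.01, epochs=200):
        self.P_C, self.lr, self.epochs = P_C, lr, epochs

    def fit(self, X, y):                       # y in {-1, +1}
        n, f = X.shape
        self.w, self.b = np.zeros(f), 0.0
        for _ in range(self.epochs):
            margins = y * (X @ self.w + self.b)
            viol = margins < 1                 # points inside the margin
            grad_w = self.w - self.P_C * (y[viol, None] * X[viol]).sum(axis=0)
            grad_b = -self.P_C * y[viol].sum()
            self.w -= self.lr * grad_w
            self.b -= self.lr * grad_b
        return self

    def decision(self, X):
        return X @ self.w + self.b

def oaa_fit(X, labels, classes=(0, 1, 2)):
    """Train one binary SVM per class (one-against-all)."""
    return {c: LinearSVM().fit(X, np.where(labels == c, 1.0, -1.0))
            for c in classes}

def oaa_predict(models, X):
    """Assign each sample to the class with the largest decision value."""
    scores = np.column_stack([models[c].decision(X) for c in models])
    return np.array(list(models))[np.argmax(scores, axis=1)]
```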

BPSO-SVM for feature selection and classification
PSO is a population-based, bio-inspired optimisation tool developed by Eberhart and Kennedy [35], inspired by the social behaviour of flocking birds in search of food. Compared with evolutionary computational methods, PSO proves to be an efficient technique for feature selection. In PSO, each particle gains knowledge to find the best solution. All candidates have fitness values, which are evaluated by the fitness function to be optimised. During movement, each particle flies through the search space with a change in velocity according to its own experience and the experience gained from the neighbouring particles [36].
To apply the PSO algorithm to feature selection, the aspects to be considered are the representation of position and velocity, the position update strategy, the velocity limitation and the fitness function. PSO is initialised with a group of random particles and then searches for the optimal location by updating generations. Particles are evaluated according to the fitness criterion after each time step. In every iteration, each particle is updated using the $P_{Best}$ and $G_{Best}$ values, and trapping in a local optimum is avoided by fine-tuning the inertia weight. $P_{Best}$ is the particle's personal best fitness value, whereas $G_{Best}$ is the global best. After finding these two best values, the particle's velocity ($v$) and position ($x$) are updated according to

$$v_i^{\text{new}} = w_p\, v_i^{\text{old}} + c_1\, \text{rand}_1\, (P_{Best} - x_i^{\text{old}}) + c_2\, \text{rand}_2\, (G_{Best} - x_i^{\text{old}}),$$

$$x_i^{\text{new}} = x_i^{\text{old}} + v_i^{\text{new}},$$

where $w_p$ is the inertia weight, $c_1$ and $c_2$ are the acceleration (learning) factors, and $\text{rand}_1$ and $\text{rand}_2$ are random numbers. The velocities $v_i^{\text{old}}$ and $v_i^{\text{new}}$ represent the current and updated particle velocities, and the velocity magnitude is limited to a maximum value $v_{max}$ at any moment of time. After updating the velocity of each particle, the particle position is updated with the new velocity $v_i^{\text{new}}$. The velocity limit serves as a constraint to control the global exploration of the particle swarm: if $v_{max}$ is too low, particles have difficulty escaping locally optimal regions, while if $v_{max}$ is too high, particles might fly past good solutions.
The PSO converges rapidly during the initial stages, and the $G_{Best}$ value has to be evaluated before each particle's solution is updated. The updated velocity is mapped to a feature selection decision through the transfer function $H(v_i^{\text{new}}) = 1/(1 + e^{-v_i^{\text{new}}})$. The selected features are represented as a binary bit string of length $n$, where $n$ is the total number of features. Each bit represents a feature: the value 0 represents an unselected feature and the value 1 a selected feature. If $H(v_i^{\text{new}})$ is less than or equal to 0.5, the corresponding feature is not selected; conversely, if it is greater than 0.5, the feature is selected. The main steps of the BPSO-SVM process are summarised below:

Step 1: Initialise the PSO algorithm with its parameters, the population of random particles and their velocities.

Step 2: To balance the number of selected features and the quality of each reduct, evaluate the fitness function

$$\text{Fitness} = w_f \cdot \frac{|C_f| - |R_f|}{|C_f|} + (1 - w_f) \cdot \gamma,$$

where $R_f$ is the cardinality of the selected subset, $C_f$ is the total number of features in the dataset, $w_f$ is the weight given to the number of selected features and $\gamma$ denotes the classification quality obtained with the selected subset.
Step 3: Update the global best and personal best positions of the particles based on the fitness values.

Step 4: Update the velocity and position of each particle using Equations (24) and (25); the new position of the particle is used for further processing.

Step 5: Repeat Steps 2-4 until the stopping condition is satisfied.
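Steps 1-5 can be sketched as a binary PSO loop. The sigmoid transfer function and the exact form of the fitness (a weighted sum of subset compactness and a user-supplied accuracy term) follow the description above, but are our assumptions where the paper leaves details implicit:

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def fitness(mask, accuracy_fn, w_f=0.2):
    """Assumed trade-off: w_f * (C_f - R_f)/C_f + (1 - w_f) * accuracy."""
    C_f, R_f = mask.size, mask.sum()
    return w_f * (C_f - R_f) / C_f + (1 - w_f) * accuracy_fn(mask)

def bpso(n_features, accuracy_fn, n_particles=10, iters=30,
         w_p=0.7, c1=1.5, c2=1.5, v_max=4.0, seed=0):
    """Binary PSO: velocities are real-valued, positions are bit strings
    obtained by thresholding the sigmoid of the velocity at 0.5."""
    rng = np.random.default_rng(seed)
    x = rng.integers(0, 2, size=(n_particles, n_features))
    v = rng.uniform(-1, 1, size=(n_particles, n_features))
    pbest = x.copy()
    pbest_fit = np.array([fitness(p, accuracy_fn) for p in x])
    g = pbest[np.argmax(pbest_fit)].copy()
    for _ in range(iters):
        r1 = rng.random((n_particles, n_features))
        r2 = rng.random((n_particles, n_features))
        v = np.clip(w_p * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x),
                    -v_max, v_max)            # velocity update with limit
        x = (sigmoid(v) > 0.5).astype(int)    # H(v): select bit if > 0.5
        fit = np.array([fitness(p, accuracy_fn) for p in x])
        improved = fit > pbest_fit
        pbest[improved], pbest_fit[improved] = x[improved], fit[improved]
        g = pbest[np.argmax(pbest_fit)].copy()
    return g
```

In the full system, `accuracy_fn` would train and cross-validate the SVM on the candidate subset; here it stands in as any callable that scores a bit mask.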

EXPERIMENTAL RESULTS AND ANALYSIS
This section demonstrates the effectiveness of the proposed algorithm. The algorithm is implemented in MATLAB R2017a and run on a Pentium D 2.8 GHz PC with 8 GB RAM and a Windows operating system. Experiments are carried out on several datasets; for illustration, we provide the results on four benchmark datasets: PSGIMSR, DRIVE, STructured Analysis of the Retina (STARE) and high-resolution fundus (HRF). The effectiveness of the proposed scheme is verified by comparing its results with those of four state-of-the-art techniques: Rahebi et al. [11], Singh et al. [12], Issac et al. [13] and Gour et al. [14]. In the field of ocular disease diagnosis, optic disc segmentation is mainly validated using ground-truth images prepared and validated by an ophthalmologist; the optic disc regions determined by the ophthalmologist are considered the gold standard. To validate the performance of the segmentation algorithm, the FCM output is compared with three segmentation algorithms: Otsu's thresholding [4], maximum entropy thresholding (MET) [37] and k-means clustering [38]. For further evaluation, we corroborate our findings by evaluating different feature combinations. This section is divided into four subsections: the dataset used for evaluation, visual analysis of results, quantitative analysis of results, and discussion.

Dataset used for evaluation
To evaluate the performance of the proposed algorithm for automated screening of retinal disease, four benchmark datasets are used. A total of 389 images, comprising 128 normal and 261 abnormal (glaucoma and DR) images, are drawn from the different datasets to validate the proposed system. Of the 261 abnormal images, 85 show glaucoma and 176 show DR. For this observational study, the ethical clearance certificate was obtained from the Institutional

Visual analysis of results
The results of the proposed scheme are analysed on different fundus images, and case studies for a few images are presented in Figures 5-9. The optic disc and optic cup are segmented using the FCM algorithm. Figure 5(a) depicts an input fundus image with glaucoma. Figure 5(b) shows the enhanced image used for segmentation, and Figure 5(c) the segmented blood vessels. Figure 5(e,f) show the optic disc and optic cup segmented by the FCM algorithm. Figure 6(a) depicts the cropped ROI image used for segmentation, and Figure 6(b-e) visualises the segmentation results of the different algorithms: Figure 6(b,c) show the optic disc segmented by Otsu's thresholding and the MET technique, and Figure 6(d,e) the corresponding results of k-means clustering and FCM. Figure 7 shows the mask, which is common to the extraction of both the NRR and the blood vessels in each quadrant: the topmost quadrant is the superior part and the bottom the inferior part, while the left and right quadrants are the nasal and temporal regions of the retinal image. Figure 8(a) shows the NRR, and Figure 8(b) the neuroretinal area in the superior quadrant, obtained by multiplying the mask image with the area of the NRR. A similar procedure yields the remaining quadrants shown in Figure 8(c-e). Figure 9 illustrates the extraction of blood vessels from the optic disc: Figure 9(a) shows the blood vessel area, and Figure 9(b-e) the blood vessel area in the superior, temporal, inferior and nasal quadrants, respectively. Table 1 shows the DDLS chart, which is characterised by three stages, at-risk, glaucoma damage and glaucoma disability, on a scale of 0 to 10.
The clinical parameters for 10 randomly selected images are computed and reported in Table 2, including the ISNT rule, disc area, cup area, CDR, minimum rim-to-disc ratio and DDLS. The result for image 3 in Table 2 indicates a normal eye, yet the ISNT rule (I = 0.24, S = 0.254, N = 0.210, T = 0.217) is not satisfied, so the image would be diagnosed as glaucoma on that basis alone. Hence the parameters CDR and DDLS are employed along with the ISNT rule for accurate diagnosis of the diseases.
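As a small illustration (not taken from the paper's code), the ISNT check applied to the Table 2 values can be written as:

```python
def isnt_satisfied(inferior, superior, nasal, temporal):
    """ISNT rule: in a healthy eye the neuroretinal rim width normally
    decreases in the order Inferior >= Superior >= Nasal >= Temporal."""
    return inferior >= superior >= nasal >= temporal

# Image 3 from Table 2: I < S (and N < T), so the rule is violated
# even though the remaining clinical parameters indicate a normal eye.
print(isnt_satisfied(0.24, 0.254, 0.210, 0.217))  # prints False
```

This is exactly why the ISNT rule is combined with CDR and DDLS rather than used in isolation.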
The performance of the segmentation algorithm using Otsu's thresholding, MET, k-means clustering and the FCM technique is summarised in Table 3, validated in terms of accuracy, sensitivity, specificity, dice similarity coefficient (DSC) and Jaccard index (JAC). The FCM algorithm outperforms all the listed methods, achieving 94.6% accuracy, 90.8% sensitivity, 96.5% specificity, 0.918 DSC and 0.85 JAC.
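For reference, the overlap measures in Table 3 can be computed from binary segmentation masks as follows; this is a generic sketch, not the authors' code, operating on flat 0/1 lists for brevity.

```python
def overlap_scores(pred, truth):
    """Dice similarity coefficient and Jaccard index for binary masks
    (given as flat lists of 0/1 pixel labels of equal length)."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    dsc = 2 * tp / (2 * tp + fp + fn)   # Dice similarity coefficient
    jac = tp / (tp + fp + fn)           # Jaccard index
    return dsc, jac
```

The two measures are monotonically related (JAC = DSC / (2 − DSC)), which is why FCM leads on both in Table 3.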

Quantitative analysis of results
A computer-aided diagnosis system is proposed to increase the accuracy of the mass-screening process and to assist doctors in interpreting the acquired fundus images. To validate the performance of the proposed algorithm, the well-known parameters accuracy (acc), sensitivity (sen), precision (prec), specificity (spec), F1 score, Matthews correlation coefficient (MCC), informedness (info) and classification error (err) are used. The performance of the classifier is defined by the following equations.
where true positive (TP), true negative (TN), false positive (FP) and false negative (FN) denote the entries of the confusion matrix. The total number of features extracted is summarised in Table 4: after preprocessing, clinical features, transform domain features and structural features are extracted, giving a total of 821 features. The proposed feature subset is compared with the LBP and HOG features for the selected numbers of features. As shown in Table 5, the combined features yield better performance than any single feature type. The accuracy of the proposed algorithm increases gradually, by 16%, as the number of features grows from 10 to 100, and the proposed feature subset shows an average improvement in accuracy of 3% over HOG and 5% over LBP. When the number of features reaches 100, the classifier achieves 98.20% accuracy, 100% sensitivity and 97.56% specificity. Similarly, the classification error of the proposed feature subset is low compared with the HOG and LBP features.
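The eight evaluation measures follow directly from the confusion-matrix counts. A generic sketch (standard definitions, not code from the paper):

```python
import math

def classifier_metrics(tp, tn, fp, fn):
    """Standard binary-classification measures from confusion-matrix counts."""
    acc = (tp + tn) / (tp + tn + fp + fn)
    sen = tp / (tp + fn)                     # sensitivity (recall)
    spec = tn / (tn + fp)                    # specificity
    prec = tp / (tp + fp)                    # precision
    f1 = 2 * prec * sen / (prec + sen)       # F1 score
    mcc = ((tp * tn - fp * fn)
           / math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))
    info = sen + spec - 1                    # informedness (Youden's J)
    err = 1 - acc                            # classification error
    return dict(acc=acc, sen=sen, spec=spec, prec=prec,
                f1=f1, mcc=mcc, info=info, err=err)
```

For multi-class problems such as normal/glaucoma/DR, these are computed per class in one-vs-rest fashion and averaged.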
With a linear SVM classifier, an accuracy of 93.93%, sensitivity of 100% and specificity of 90.24% are obtained. Similar measurements are made for the other SVM kernels, cubic, quadratic and Gaussian, with the selected feature vector; as shown in Table 6, the Gaussian SVM performs best. To examine the predictive accuracy of the fitted models, 10-fold cross-validation is chosen before training any model. To reduce the chance of overfitting towards a class on the small databases, the SVM classifier is trained with 10-fold cross-validation. Table 8 gives a performance overview of the proposed algorithm against existing state-of-the-art techniques in terms of accuracy, specificity and sensitivity. The proposed algorithm is compared with the following published methods: Rahebi et al. [11], Singh et al. [12], Issac et al. [13] and Gour et al. [14]. From the table, it is evident that the proposed algorithm outperforms all the existing techniques.
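The 10-fold protocol partitions the data so that every image is held out exactly once. A minimal index-level sketch of the splitting step (the study itself is implemented in MATLAB, so this Python version is purely illustrative):

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k roughly equal folds;
    each fold serves once as the held-out test set."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test
```

A classifier is then trained on each `train` index set and scored on the corresponding `test` set, and the k scores are averaged.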

Discussions
From the results, it is found that the proposed algorithm produces better results than the existing techniques, offering hybrid feature selection and classification of various ocular pathologies with low computational time. Figure 10 shows the features used to discriminate the classes normal, glaucoma and DR, with characteristics such as contrast, correlation, energy, homogeneity, variance, mean, standard deviation, kurtosis and skewness. A total of 389 retinal images from four different datasets are used in this work. The classifiers are modelled at the initial stage and their performance is verified using 10-fold cross-validation. Figure 11 illustrates the confusion matrix, in which class 0 (normal eye) is not always detected correctly, whereas class 1 (glaucoma) and class 2 (DR) are correctly detected. On the PSGIMSR dataset, about 46 normal images and 173 abnormal images are classified correctly and 4 cases are misclassified.
In addition to these performance measures, the receiver operating characteristic (ROC) curve is derived to measure the performance of the classifier. The value of the area under the ROC curve (AUC) will be 1.0 for the perfect system. The values of AUC are 1.00, 0.95 and 0.98 for class 0, class 1 and class 2, respectively. To evaluate the importance of feature selection, the performance of the system is measured at different feature vectors and the result is shown in Figure 12.
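The AUC values reported above can equivalently be read as the probability that the classifier scores a randomly chosen positive above a randomly chosen negative. A small sketch of that rank-based computation, using hypothetical scores for illustration:

```python
def auc_score(scores_pos, scores_neg):
    """AUC via the Mann-Whitney U statistic: the fraction of
    positive/negative pairs ranked correctly (ties count as half)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Perfectly separated hypothetical scores give AUC = 1.0,
# matching the value reported here for class 0.
print(auc_score([0.9, 0.8], [0.1, 0.2]))  # prints 1.0
```

An AUC of 0.5 corresponds to chance-level ranking, which is why values of 0.95-1.00 indicate a strong classifier.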
Based on the experimental study, the following points highlight the advantages of the proposed algorithm:
• A higher accuracy rate can be achieved using the BPSO-SVM approach. By optimising the weights (w_c), regularisation parameter (P_c) and kernels of the SVM using the PSO technique, features are selected along with improved diagnostic accuracy.
• Compared with the other SVM kernel techniques, the Gaussian SVM performs best; its average accuracy is 8% higher than that of the cubic SVM.
• To validate the diagnostic accuracy of the proposed technique, the AUC performance measure is employed; on this measure the BPSO-SVM based diagnostic model outperforms the other existing techniques and is more efficient and effective.

CONCLUSIONS AND FUTURE WORKS
In this work, we put forward a computer-aided diagnosis system that classifies normal images from abnormal images with improved accuracy. In the proposed technique, a total of 821 features are extracted from the transform, structural and clinical domains. The clinical features are extracted from the optic cup, optic disc and blood vessels. WPT with third-level decomposition is performed on the image to extract the transform domain features, while HOG and LBP, which have proven to be effective descriptors, are used to extract the structural features. Among these features, the optimal ones are selected using the PSO technique, and an SVM model classifies normal from abnormal fundus images with 10-fold cross-validation. The proposed scheme is tested on different test images and yields better results than the existing techniques: an accuracy of 98.20%, sensitivity of 100%, specificity of 96.56%, precision of 96.15%, F1 score of 98.03, MCC of 0.9680, informedness of 0.970 and classification error of 0.0157 on the PSGIMSR dataset. The results are compared with recent image processing algorithms for the detection of glaucoma and DR.
In the future, the system could be extended to the classification of other diseases such as age-related macular degeneration, papilledema, macular edema and central retinal vein occlusion. The proposed algorithm has contributed improved classification accuracy through an optimised feature selection approach; we will further investigate metaheuristic selection mechanisms to maximise classification accuracy while minimising the number of selected features. In addition, further discriminative features could be extracted to improve classifier accuracy. Segmentation and classification of retinal images through deep learning is another interesting area that can be explored in future research. We would also like to study the dependencies between features and how they affect the accuracy of the detection of ocular diseases such as glaucoma and DR.