Two‐dimensional DFT with sliding and hopping windows for edge map generation of road images

Correspondence: P. Sumathi, Department of Electrical Engineering, Indian Institute of Technology Roorkee, Roorkee, Uttarakhand 247 667, India. Email: psumifee@iitr.ac.in

Abstract
An edge map generation technique based on the two-dimensional discrete Fourier transform (DFT) with sliding and hopping windows is proposed for road images to detect lanes and road markings. A 2 × 2 sliding/hopping window DFT is proposed, with bin indices (k1, k2) = (0, 1) for horizontal edge detection and (k1, k2) = (1, 0) for vertical edge detection. The 2-D SDFT/HDFT-based edge detector is shown to be more efficient for lane and road marking images than conventional edge detectors and cellular neural network based edge detectors. In the presence of noise at various signal-to-noise ratios, the horizontal and vertical edges are efficiently recovered with a good Pratt figure of merit (PFOM) without applying any pre-processing, noise removal, or post-processing techniques. The PFOM was found to be quite stable over a wide threshold range for various noise levels. The consistent performance of the proposed edge detector is confirmed by the MSE and PSNR of the detected edge images. Moreover, the proposed 2-D SDFT/HDFT-based edge detector performs well in developing edge maps for real-time road videos. A system-on-chip implementation of the 2-D SDFT/HDFT edge detector on a Cyclone IV FPGA chip is also carried out for detecting lanes and road markings.


INTRODUCTION
Vision-based measurement (VBM) is being proposed and used in various automated applications such as advanced driver assistance systems (ADAS) and intelligent transportation systems [1,4]. Camera-based vehicle instrumentation is becoming popular for analyzing the intentions and state of a driver and for detecting potential driver errors, with the aim of significantly reducing car accidents [2]. The ADAS integrates a number of different functions such as lane change assist, forward and rear collision warning, park assist, and blind spot detection [3]. Edge detection is an important process for lane position detection, tracking systems, ADAS, and intelligent transportation systems [4]. Lane detection is a two-step process: the lanes are first detected and then fitted to a parametric curve. The challenges involved in vision-based lane detection include the lack of clarity of lane markings, light reflections, illumination, shadows, poor visibility due to bad weather conditions, and high noise levels [5]. This research work focuses on the first step, detecting the lanes by developing edge maps. The performance of the lane detection process depends on successful edge characterization of the road images. An edge in an image marks the boundary between two different regions. Edges provide sufficient information with reduced data size, which suits image analysis in most applications. The image intensity changes abruptly at edges, and the magnitude of the derivative of the image intensity function is high at edge locations [6]. A number of edge detection methods involving different approaches can be found in the literature: (i) statistical methods, (ii) difference methods, and (iii) curve fitting [6]. Conventional edge detectors such as Sobel [7], Canny [8], and Prewitt [9] are popular and are employed for feature extraction from lane and road marking images.
Classical methods such as the Sobel and Prewitt detectors use the first directional derivative to determine the edges [6]. However, classical edge detectors cannot handle images with strong noise. In [10], a fuzzy reasoning-based Fuzzy Sobel algorithm is introduced, which automatically obtains four threshold values and applies fuzzy reasoning for edge enhancement. The edges extracted by this method are very clear and provide a better representation of image edges and object contours. An improvement to the Sobel edge detector with a soft-threshold wavelet technique is proposed in [11] for the removal of noise. The problems encountered in the FPGA implementation of the Sobel algorithm have been solved in [12] through an improved gradient calculation template with increased processing speed.
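The first-derivative approach of the classical detectors can be illustrated with a minimal sketch. It assumes the standard 3 × 3 Sobel kernels and a fixed magnitude threshold; the function names, the toy image, and the threshold value are illustrative choices and are not taken from the cited works.

```python
import math

# Standard 3x3 Sobel kernels (first directional derivatives).
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def correlate3x3(img, kernel, r, c):
    """Correlate a 3x3 kernel with the neighbourhood centred at (r, c)."""
    return sum(kernel[i][j] * img[r - 1 + i][c - 1 + j]
               for i in range(3) for j in range(3))


def sobel_edges(img, threshold):
    """Return a binary edge map; border pixels are left as 0."""
    rows, cols = len(img), len(img[0])
    edges = [[0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = correlate3x3(img, SOBEL_X, r, c)
            gy = correlate3x3(img, SOBEL_Y, r, c)
            # Gradient magnitude compared with a fixed threshold.
            if math.hypot(gx, gy) > threshold:
                edges[r][c] = 1
    return edges


# Toy image: dark left half, bright right half (one vertical step edge).
image = [[0, 0, 0, 1, 1, 1] for _ in range(5)]
edge_map = sobel_edges(image, threshold=2.0)
```

On this toy image the step between the third and fourth columns is marked in the interior rows. No smoothing or thinning is applied, which is why such detectors respond strongly to noise.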
Another category, the zero-crossing edge detectors, uses the second derivative along with the Laplacian operator. A Laplacian of Gaussian (LoG) filter is used as an edge detector for texture analysis in glass production [13]. The LoG method computes the second derivative of the image and then finds the zero-crossing points. Since the second derivative of the image is used for edge detection, this filter is very sensitive to noise. To overcome this problem, the noise is first reduced by applying a Gaussian smoothing filter, and the Laplacian filter is then applied to the image. The Canny detector, a Gaussian edge detector, is one of the most popular edge detectors in the literature and has been widely used in many applications. Although the Gaussian detectors exhibit relatively better performance, they are computationally much more complex than classical derivative-based edge detectors [6]. An improved Canny edge detection algorithm is proposed in [14] to reduce the sensitivity to noise and the loss of weak edge information. The concept of gravitational field intensity was introduced to replace the image gradient, and two adaptive threshold selection methods based on the mean and standard deviation of the image gradient were used to enhance noise suppression. A robust edge detection algorithm using a multiple-threshold approach (B-Edge) is proposed in [15] to address the limitations of edge connectivity and thickness. The widely used Canny edge operator uses two thresholds and still leaves a few gaps in its results. To handle these shortcomings of the Canny edge operator, the algorithm uses simulated triple thresholds that target the prime issues of edge detection, such as image contrast, effective edge pixel selection, error handling, and similarity to the ground truth. Apart from the conventional edge detectors, the Hough transform is a widely applied algorithm in lane detection and tracking systems.
The computationally efficient hierarchical additive Hough transform is proposed for detecting only the straight lanes [16]. An efficient method for reliably detecting road lanes based on spatiotemporal images is proposed using Hough transform [17].
The cellular neural network (CNN) is a class of information processing system that was proposed in [18]. Its two-dimensional inputs and outputs make it suitable for image processing [19]. CNN-based edge detection is proposed in [6], with a differential evolution algorithm used to obtain the edges of digital images. A template learning set-up has been proposed and the resultant cloning template detects the edges; however, noisy images are not considered for edge detection. Another CNN-based edge detection method is proposed in [20] for noisy images. The edge detection is carried out in two steps: in the first step, noise is removed with a set of CNN cloning templates, and in the second step, the resultant noise-removed images are given as inputs to the CNN with another set of cloning templates. Improved edge detection results are shown for salt and pepper noise of different levels, but Gaussian noise is not considered in the evaluation. A CNN-based noise removal technique for color images has been proposed with a hybrid linear matrix inequality and swarm optimization procedure [21]. The cloning templates were designed for salt and pepper noise, speckle noise, and Gaussian noise, and noise removal has been evaluated for color images. These templates can be used to remove the noise efficiently. However, for edge detection, another set of templates needs to be designed. A bias-feed CNN model has been proposed for glass defect detection in [22]. The bias input, which is a single number in the traditional CNN algorithm, is converted into a bias template and used to balance the brightness level of the image. Through the contribution of the bias input, background reflections and negative effects arising from transparency are decreased. These cloning templates are obtained through particle swarm optimization. Generally, three major template design methods are used for CNN models.
They are: (i) intuitive way, (ii) template learning, and (iii) direct template derivation. The first requires intuitive thinking of the designer. It leads to quick results in simple cases; however, it does not guarantee to obtain the desired template. Moreover, designers need to conduct lots of experiments in both image processing and array dynamics. The second design method, the template learning, is extensively studied and a popular field of the CNN research. Almost all classical neural network training methods have been applied to the CNN structure. The third method can be applied when the desired function is exactly determined. The design methods depend on the particular template class [23].
In general, edge detection is performed as a linear filtering process that convolves the image pixels with a kernel. In the 1D sense, the discrete Fourier transform (DFT) is a linear filter that performs the correlation between signal samples and a basis function to extract the fundamental signal and harmonics. The 1D sliding DFT is computationally efficient as a tuned filter to extract the particular fundamental or harmonic of interest from the input signal [24,25]. Moreover, the instantaneous values of the fundamental and harmonics can be extracted from the filter operation [26]. In 2D, similar to classical edge detectors, the 2D sliding DFT performs correlation between the image pixels and a kernel, where correlation is the convolution of the image pixels with the 180° rotated kernel [27]. The 2D SDFT can be tuned to extract only edge frequencies from the image and reject all other frequencies, including high noise levels. Pre-processing and post-processing can be completely eliminated with the 2D sliding DFT/hopping DFT edge detector, since the sliding DFT is a tuned 2D filter that can be sharply tuned to extract the edges.
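The tuned-filter behaviour of the 1D sliding DFT can be sketched as follows. The sketch assumes a window that ends at the current sample and zero samples before the start of the signal; the function names and the test signal are illustrative and are not taken from [24,25].

```python
import cmath


def sliding_dft(signal, N, k):
    """Track bin k of an N-point DFT over a window ending at each sample.

    Recursion: X_n(k) = e^{+j*2*pi*k/N} * (X_{n-1}(k) + x[n] - x[n-N]),
    with samples before the start of the signal taken as zero.
    """
    twiddle = cmath.exp(2j * cmath.pi * k / N)
    X = 0j
    out = []
    for n, x_new in enumerate(signal):
        x_old = signal[n - N] if n >= N else 0.0
        X = twiddle * (X + x_new - x_old)
        out.append(X)
    return out


def direct_dft_bin(window, k):
    """Reference: bin k of the DFT of one window, computed directly."""
    N = len(window)
    return sum(x * cmath.exp(-2j * cmath.pi * k * m / N)
               for m, x in enumerate(window))


samples = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0, 0.5, 0.25]
N, k = 4, 1
tracked = sliding_dft(samples, N, k)
reference = direct_dft_bin(samples[-N:], k)  # DFT of the last full window
```

Each new output costs one complex multiplication and two additions regardless of N, which is the source of the efficiency mentioned above; `tracked[-1]` agrees with `reference` up to rounding error.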
The existing edge detectors require pre-processing, such as Gaussian filtering in Canny, and post-processing, such as thinning in the Canny, Sobel, and Prewitt edge detectors. The CNN requires noise removal as a separate process before edge detection. Moreover, it involves the difficult process of designing the cloning templates for a particular application. In the presence of noise, the performance of the existing edge detectors therefore depends on, or is limited by, (i) pre-processing, (ii) filtering, (iii) noise removal, (iv) post-processing, (v) high computational complexity, and (vi) higher resource utilization.
To this end, an edge detector based on the 2D SDFT/HDFT is proposed and applied to road marking images in order to create an edge map. These edge maps can be employed in lane detection, tracking systems, and driver assistance systems. The proposed edge map generation method performs well for the road images. The performance is also verified for known ground truth images corrupted with noise. The proposed method yields consistent performance for noisy images with good PFOM, MSE, and PSNR. The SDFT-based edge detector yields medium entropy for the detected edges compared with the other edge detectors studied. This paper is organized as follows: The proposed edge detection technique based on the 2D SDFT/HDFT is described in Section 2. The performance evaluation and comparison of the proposed method with existing well-known edge detectors and CNN-based edge detectors are discussed in Section 3. The FPGA implementation of the proposed edge detector is also presented in Section 3. Some important remarks are given in Section 4.

PROPOSED EDGE DETECTION TECHNIQUE
The block diagram representation of the proposed edge detection methodology is shown in Figure 1. As a first step in edge detection, the 2D SDFT is performed on the input image with a 2 × 2 window, as shown in Figure 2.

2D sliding DFT
The two-dimensional DFT is computed on a fixed-size window of N × N pixels, which is regularly updated: new pixels enter the window and the oldest ones are discarded. Consider the N × N window whose 2D DFT at pixel position (x, y) is $F_{x,y}(k_1, k_2)$ [28], which is represented by Equation (1). For the pixel (x + 1, y), the 2D DFT is represented by Equation (2). Letting m + 1 = p and shifting the limits of summation in (2) gives Equation (3). Equation (3) can be rewritten in terms of the current pixel (x, y) and the previous pixel (x − 1, y) as Equation (4). With the substitution of Equation (5), Equation (4) can be written as Equation (6). Replacing $D_{x,y}(k_2)$ in (6) leads to Equation (8), which confirms that the 2D sliding DFT at position (x, y) can be computed directly from that at position (x − 1, y) using the 1D sliding DFT. For a 2 × 2 window, Equation (8) reduces to Equation (9), with Equation (10) used for detecting horizontal edges and Equation (11) for detecting vertical edges.

The 1D DFT is the correlation of the samples with basis functions whose frequencies are k cycles within a width of N samples [24]. Correlation is either the convolution of the flipped samples with the basis function or the convolution of the samples with the flipped basis function. In the proposed 2D SDFT technique, the image pixels and the sliding window are correlated. The horizontal edges (refer to (10)) and vertical edges (refer to (11)) are detected by kernels, obtained from the basis function, that move over the image, as shown by the shaded pixels in Figure 2. The 2D SDFT performs linear filtering, which is equivalent to the convolution of the 180° rotated kernel with the image pixels in the horizontal and vertical directions. After the 2D SDFT has been computed at each pixel of the image, each pixel is compared with a threshold to determine the edge map of the image. The threshold is chosen as 4 × the mean of $F_{\text{non-dir}}$.
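The recursive update described above can be sketched in code. This is one consistent reading of the derivation, under stated assumptions: the window ends at (x, y), pixels outside the image are zero, $d_{x,y}$ is the pixel difference f(x, y) − f(x − N, y), and the non-directional strength $F_{\text{non-dir}}$ is taken as the sum of the magnitudes of the (0, 1) and (1, 0) bins. The last choice is a guess, and all function names are hypothetical.

```python
import cmath


def sdft2d_bin(f, N, k1, k2):
    """Bin (k1, k2) of the N x N DFT of the window ending at each (x, y).

    Assumed recursions, with W = exp(-2j*pi/N) and zero padding:
      d[x][y] = f[x][y] - f[x-N][y]
      D[x][y] = W^-k2 * (D[x][y-1] + d[x][y] - d[x][y-N])   (1D sliding DFT)
      F[x][y] = W^-k1 * (F[x-1][y] + D[x][y])
    """
    X, Y = len(f), len(f[0])
    w1 = cmath.exp(2j * cmath.pi * k1 / N)  # W^-k1
    w2 = cmath.exp(2j * cmath.pi * k2 / N)  # W^-k2

    def pix(x, y):
        return f[x][y] if 0 <= x < X and 0 <= y < Y else 0.0

    def d(x, y):
        return pix(x, y) - pix(x - N, y)

    F = [[0j] * Y for _ in range(X)]
    for x in range(X):
        D = 0j
        for y in range(Y):
            D = w2 * (D + d(x, y) - d(x, y - N))
            F_prev = F[x - 1][y] if x > 0 else 0j
            F[x][y] = w1 * (F_prev + D)
    return F


def direct_bin(f, N, k1, k2, x, y):
    """Reference: the same bin computed directly from the window pixels."""
    X, Y = len(f), len(f[0])
    total = 0j
    for m in range(N):
        for n in range(N):
            xx, yy = x - N + 1 + m, y - N + 1 + n
            if 0 <= xx < X and 0 <= yy < Y:
                total += f[xx][yy] * cmath.exp(
                    -2j * cmath.pi * (k1 * m + k2 * n) / N)
    return total


def edge_map(f, N=2, scale=4.0):
    """Threshold the combined response at scale x its mean (assumed form)."""
    Fh = sdft2d_bin(f, N, 0, 1)
    Fv = sdft2d_bin(f, N, 1, 0)
    strength = [[abs(a) + abs(b) for a, b in zip(ra, rb)]
                for ra, rb in zip(Fh, Fv)]
    mean = sum(map(sum, strength)) / (len(f) * len(f[0]))
    return [[1 if s > scale * mean else 0 for s in row] for row in strength]


# Toy image: a bright 3 x 3 block on a dark 8 x 8 background.
image = [[1.0 if 3 <= x <= 5 and 3 <= y <= 5 else 0.0 for y in range(8)]
         for x in range(8)]
edges = edge_map(image)
```

For the 2 × 2 window the twiddle factors reduce to ±1, so the update needs only additions and subtractions. On the toy image the detector marks the outline of the block and leaves its flat interior empty.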

2D hopping DFT
The 1D hopping DFT [29] computes the N-point DFT at time index n using the precalculated N-point DFT at time index n − L, where L is the time hop. The HDFT can adjust the time hop between successive DFT outputs. For the 2D HDFT, Equations (5) and (6) are used to obtain Equations (12) and (13). Equation (13) can be written as Equation (14), in which the second term is a 1D DFT. With the definition in Equation (15), this term can be written in the form of a 1D SDFT as Equation (16). Let $e_{x,y} = d_{x,y} - d_{x,y-N}$; Equation (16) can then be written as Equation (17). The 1D SDFT of Equation (17) can be expressed in terms of the 1D HDFT by repeating the SDFT L times, which gives Equation (18). Equation (18) can be rewritten as Equation (19), and the 2D HDFT can be expressed as Equation (20). For L = 2 and N = 4, Equation (20) takes the form of Equation (21), with Equation (22) for horizontal edges and Equation (23) for vertical edges.
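The idea of obtaining the HDFT by repeating the SDFT L times can be illustrated in 1D. The sketch below unrolls the sliding recursion over one hop; it assumes a window ending at the current sample and zero samples before the start of the signal, and it is not claimed to match the exact form of the equations in [29]. The function names are hypothetical.

```python
import cmath


def hopping_dft(signal, N, k, L):
    """Bin k of the N-point DFT of the window ending at n = L-1, 2L-1, ...

    Unrolling the sliding recursion L times gives
      X_n = W^(-k*L) * X_{n-L} + sum_{i=0}^{L-1} W^(-k*(i+1)) * e[n-i],
    with e[n] = x[n] - x[n-N] and W = exp(-2j*pi/N).
    """
    def e(n):
        old = signal[n - N] if n >= N else 0.0
        return signal[n] - old

    w_inv = cmath.exp(2j * cmath.pi * k / N)  # W^-k
    X = 0j
    outputs = {}
    for n in range(L - 1, len(signal), L):
        update = sum(w_inv ** (i + 1) * e(n - i) for i in range(L))
        X = w_inv ** L * X + update
        outputs[n] = X
    return outputs


def direct_bin_1d(window, k):
    """Reference: bin k of the DFT of one window, computed directly."""
    N = len(window)
    return sum(x * cmath.exp(-2j * cmath.pi * k * m / N)
               for m, x in enumerate(window))


samples = [0.5, 1.0, -0.25, -1.0, 0.0, 2.0, 0.75, -1.5, 0.5, 0.25]
hopped = hopping_dft(samples, N=4, k=1, L=2)  # L = 2, N = 4 as in the text
```

With L = 2 the output is produced at every second sample only, so the number of DFT updates is halved compared with the sliding version while each output still equals the DFT of the current window.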

IMPLEMENTATION
The performance of the proposed edge detector is evaluated for different datasets of road images and compared with well-known edge detectors such as Canny, Sobel, Prewitt, and CNN. The performance for noisy images is determined at various noise levels. The quantitative analysis has been carried out by computing the PFOM, entropy, MSE, and PSNR.

2D SDFT based edge detector
The performance of the proposed 2D SDFT-based edge detector is evaluated with road marking images from the ROad MArkings (ROMA) image database [30]. The ROMA images considered for the evaluation have the following details. Camera: Basler A101CP; image size: 1280 × 1024; pixel size: 6.7 μm × 6.7 μm. All are color images and they are converted to gray scale for evaluation. The input images used for edge detection, the edge maps developed through the proposed method (refer to Figure 1), and the original images overlaid with the detected edges in red are shown in Figure 5. The detected lanes, lane markings, trees, and objects can be seen clearly in the edge map. The performance of this edge detector is compared with conventional edge detection algorithms such as the Canny, Sobel, and Prewitt edge detectors. The outputs of each edge detector for ROMA images are compared in Figures 6 and 7. It can be observed that the proposed edge detector performs better on the road marking images. The proposed method is also evaluated with the RoadMarking dataset used in [31]. The edge maps of each method are compared in Figures 8 and 9 for input images with different lighting conditions. The information extracted in the edge map can be used for lane detection and lane position detection in tracking and in the decision-making process of driverless cars. Furthermore, the proposed edge detector has been tested on a video sequence, and the algorithm successfully created the edge map for the video frames.

Cellular neural network (CNN)
CNN is a locally interconnected analog processor array arranged on a regular 2D grid. The model of a 2D CNN is composed of basic processing units called cells. Each cell is connected to its neighboring ones; therefore, only the adjacent cells can interact directly with each other [6]. For a CNN array with M rows and N columns on a 2D grid, the dynamics of each cell can be described by the state equations [20],

$$\dot{x}_{ij}(t) = -x_{ij}(t) + \sum_{C(k,l) \in N_{i,j}(r)} A(i,j;k,l)\, y_{kl}(t) + \sum_{C(k,l) \in N_{i,j}(r)} B(i,j;k,l)\, u_{kl} + T_{ij}, \qquad y_{ij}(t) = \frac{1}{2}\left( |x_{ij}(t) + 1| - |x_{ij}(t) - 1| \right) \quad (24)$$

where $x_{ij}(t)$, $u_{ij}$, and $y_{ij}(t)$ are the state, the input, and the output of the (i, j)-th cell in the grid. The initial condition is $x_{ij}(0) = 0$ and the static input satisfies $|u_{ij}| \le 1$. $A(i,j;k,l)$ and $B(i,j;k,l)$ denote the connection templates from cell C(k, l) to cell C(i, j); $T_{ij}$ represents the bias of the (i, j)-th cell in the grid. Equation (24) shows that the state and the output of each cell are affected by the inputs and outputs of its neighboring cells. Moreover, for each cell C(i, j), the following set $N_{i,j}(r)$, named the r-neighborhood, can be defined as

$$N_{i,j}(r) = \{\, C(k,l) : \max(|k - i|,\, |l - j|) \le r,\ 1 \le k \le M,\ 1 \le l \le N \,\}$$

where r denotes the neighborhood radius of each cell, which is a positive integer, and the pairs (i, j) and (k, l) are the indices which express the position of cells, that is, the rows and the columns of the generic cell and its neighboring ones in the grid, respectively. The cells which belong to the r-neighborhood of C(i, j) are arranged in a grid of at most (2r + 1) × (2r + 1) cells whose central element coincides with C(i, j).

Figure 10 shows the performance of the CNN edge detector using the above-mentioned templates for the ROMA database images used in Figures 6 and 7. Similarly, the performance of the CNN is compared with the proposed 2D edge detector for the input images of Figures 8 and 9 from the RoadMarking dataset. These edge-detected road images are shown in Figure 11. The 2D SDFT performs better than the CNN, and it was observed that some noise was introduced in the edge-detected road images produced by the CNN.
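The cell dynamics can be illustrated with a small forward-Euler simulation. The sketch assumes the normalized state equation with the piecewise-linear output y = 0.5(|x + 1| − |x − 1|), a neighborhood radius r = 1, and zero values outside the grid. The template values below are a commonly quoted edge-extraction template for binary images and are used only for illustration; they are not the templates used in this work, and the step size and step count are arbitrary choices.

```python
def cnn_run(u, A, B, T, dt=0.1, steps=200):
    """Integrate x' = -x + sum(A*y) + sum(B*u) + T with forward Euler.

    u is the static input (|u| <= 1); the state starts at x(0) = 0.
    Cells outside the grid contribute zero (an assumed boundary rule).
    """
    rows, cols = len(u), len(u[0])

    def out(v):
        # Piecewise-linear output, saturating at -1 and +1.
        return 0.5 * (abs(v + 1) - abs(v - 1))

    def neigh_sum(tmpl, grid, i, j):
        s = 0.0
        for di in (-1, 0, 1):
            for dj in (-1, 0, 1):
                k, l = i + di, j + dj
                if 0 <= k < rows and 0 <= l < cols:
                    s += tmpl[di + 1][dj + 1] * grid[k][l]
        return s

    # The input contribution and bias do not change over time.
    drive = [[neigh_sum(B, u, i, j) + T for j in range(cols)]
             for i in range(rows)]
    x = [[0.0] * cols for _ in range(rows)]
    for _ in range(steps):
        y = [[out(v) for v in row] for row in x]
        x = [[x[i][j] + dt * (-x[i][j] + neigh_sum(A, y, i, j) + drive[i][j])
              for j in range(cols)] for i in range(rows)]
    return [[out(v) for v in row] for row in x]


# Illustrative templates (assumption, not the paper's values).
A = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
B = [[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]]
T = -1.0

# Binary input: +1 for object pixels, -1 for background.
u = [[1.0 if 1 <= i <= 3 and 1 <= j <= 3 else -1.0 for j in range(5)]
     for i in range(5)]
y_final = cnn_run(u, A, B, T)
```

With these values the outline of the 3 × 3 object settles at an output of +1, while its center and the background settle at −1. Changing the bias T shifts this decision, which is consistent with the sensitivity to the bias reported in the comparisons.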

Pratt's figure of merit
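Pratt's figure of merit (PFOM) was proposed by Pratt for accuracy assessment of extracted edges [32]. It represents the deviation of the actual edge points ($I_A$) from the ideal edge ($I_I$) and it is defined as

$$\mathrm{PFOM} = \frac{1}{\max(I_A, I_I)} \sum_{i=1}^{I_A} \frac{1}{1 + \alpha\, e(i)^2}$$

where $I_A$ and $I_I$ are the numbers of actual and ideal edge points, $e(i)$ is the distance of the i-th actual edge point from the ideal edge, and $\alpha$ is a scaling constant (usually 1/9).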
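A minimal sketch of the PFOM computation is given below, assuming the usual form of the measure: the sum of 1/(1 + αe²) over detected edge pixels, normalized by the larger of the ideal and detected edge counts, with e taken as the Euclidean distance to the nearest ideal edge pixel. The brute-force nearest-pixel search and the function name are illustrative choices.

```python
def pfom(ideal, detected, alpha=1.0 / 9.0):
    """Pratt's figure of merit for two binary edge maps of equal size."""
    ideal_pts = [(r, c) for r, row in enumerate(ideal)
                 for c, v in enumerate(row) if v]
    det_pts = [(r, c) for r, row in enumerate(detected)
               for c, v in enumerate(row) if v]
    if not ideal_pts or not det_pts:
        return 0.0
    total = 0.0
    for r, c in det_pts:
        # Squared distance to the nearest ideal edge pixel.
        d2 = min((r - ri) ** 2 + (c - ci) ** 2 for ri, ci in ideal_pts)
        total += 1.0 / (1.0 + alpha * d2)
    return total / max(len(ideal_pts), len(det_pts))


# Ideal edge in column 2; detected edge displaced by one pixel to column 3.
ideal_edges = [[1 if c == 2 else 0 for c in range(5)] for _ in range(5)]
shifted_edges = [[1 if c == 3 else 0 for c in range(5)] for _ in range(5)]

perfect_score = pfom(ideal_edges, ideal_edges)
shifted_score = pfom(ideal_edges, shifted_edges)
```

A perfect detection scores 1, and a uniform one-pixel displacement with α = 1/9 scores 1/(1 + 1/9) = 0.9, which shows how the measure penalizes both misplaced and spurious edge pixels.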

Performance of the proposed edge detector with and without noise
The PFOM is first determined, without adding any noise, for images whose ideal edges are known a priori, and compared with that of the other edge detectors (Figure 12(a)). The proposed method performs better than the other techniques, with a PFOM of 0.9545. Similarly, the performance comparison for SNRs from 40 to 10 dB is shown in Figure 12(b). These input images have also been tested for edge detection with the CNN; the results are shown in Figure 13. The bias is chosen as −5 for these noisy images, whereas the A and B templates are retained as such. The proposed 2D SDFT edge detector performs well in the presence of noise. The threshold is chosen to be 0.4 for the other conventional edge detectors. In the case of the Canny edge detector, the width of the Gaussian filter (σ) is chosen for maximum PFOM. The Canny edge detector involves multiple stages in the edge detection process: (i) noise reduction by a Gaussian filter, (ii) a Sobel kernel to find the edge gradient, (iii) non-maxima suppression to remove any unwanted pixels, and (iv) hysteresis thresholding. The Sobel and Prewitt edge detectors involve a thinning process in the edge map generation. The CNN performs well for the image without noise and with 40 dB noise using only the edge detection templates; however, the PFOM decreases for the 30 to 10 dB noise cases. The edges detected in the 10 dB noisy image are noticeably noisy and the PFOM is poor. The edge detection of noisy images may be handled differently with the CNN, in two steps: (i) first, noise removal templates need to be designed and applied to the CNN to remove the noise; (ii) second, the edge detection templates need to be applied to the noise-removed image. Hence, designing the templates may be a great challenge in the case of the CNN. Moreover, the conventional edge detectors do not demand a separate step to remove the noise, whereas the CNN may require that procedure; otherwise, it fails to detect the edges.
In comparison with these edge detectors, the proposed 2D SDFT/HDFT-based edge detector performs better without involving any filtering, thinning, or noise removal process. Furthermore, the threshold range has been found for each edge detector, except the CNN, for the cases without and with noise, using the input images shown in Figure 12(a),(b). The simulation has been performed on the input image to find the range of threshold for which the PFOM is retained at its maximum value. The simulation results are shown in Figure 14(a)-(f) for the case without noise and for noise varying from 40 to 10 dB, respectively. It can be observed from Figure 14(a) that the Canny detector demands the variation of σ with the threshold. Both σ and the threshold are varied to obtain the maximum PFOM, as shown in Figure 15. The threshold range of the Sobel and Prewitt edge detectors is the same with and without the thinning process. In these edge detectors, the PFOM is found to be good when the thinning process is applied for edge detection; without the thinning process, the PFOM was found to be poor. The proposed edge detector based on the 2D SDFT/HDFT performs equally well as the Canny edge detector for the noisy input image. It can be noticed from Figure 14(a)-(f) that 15 dB noise is tolerable with the proposed edge detector; below this SNR, the performance degrades and a threshold range cannot be obtained.
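Noisy test inputs at a prescribed SNR, such as the 40 to 10 dB cases discussed above, can be generated along the following lines. This is a sketch under assumptions that the text does not specify: the noise is taken to be additive white Gaussian, and the signal power is taken as the mean squared pixel value of the clean image. The function names and the test pattern are hypothetical.

```python
import math
import random


def add_noise_at_snr(img, snr_db, rng):
    """Add white Gaussian noise so that the expected SNR equals snr_db."""
    pixels = [v for row in img for v in row]
    signal_power = sum(v * v for v in pixels) / len(pixels)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [[v + rng.gauss(0.0, sigma) for v in row] for row in img]


def measured_snr_db(clean, noisy):
    """SNR in dB of a noisy image relative to its clean version."""
    sig = sum(v * v for row in clean for v in row)
    err = sum((a - b) ** 2
              for rc, rn in zip(clean, noisy) for a, b in zip(rc, rn))
    return 10 * math.log10(sig / err)


rng = random.Random(0)  # fixed seed so the example is repeatable
clean = [[1.0 if 16 <= c < 48 else 0.0 for c in range(64)]
         for _ in range(64)]
noisy = add_noise_at_snr(clean, 10.0, rng)
achieved = measured_snr_db(clean, noisy)
```

The achieved SNR fluctuates slightly around the target because the noise is random; sweeping `snr_db` over 40, 30, 20, 15, and 10 gives a set of inputs comparable to the noise cases studied here.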

Entropy
The information content of an image is measured through the entropy function [33],

$$H(I) = -\sum_{i} p_i \log_2 p_i$$

where I represents an image and $p_i$ is the rate of recurrence of pixels with intensity i. Figure 16(b) shows that the CNN-based edge detection has high entropy and hence exhibits double edge detection for the selected road images. The performance comparison of the proposed method with the conventional and CNN edge detectors is given in Table 1.
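The entropy measure can be sketched as follows, assuming the Shannon form with base-2 logarithm, where $p_i$ is the fraction of pixels with intensity i. The function name and the toy images are illustrative.

```python
import math
from collections import Counter


def image_entropy(img):
    """Shannon entropy (in bits) of the intensity histogram of an image."""
    counts = Counter(v for row in img for v in row)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total)
                for c in counts.values())


flat = [[0, 0], [0, 0]]          # single intensity: no information
half = [[0, 1], [1, 0]]          # two equally likely intensities
h_flat = image_entropy(flat)
h_half = image_entropy(half)
```

For a binary edge map the entropy peaks at 1 bit when edge and non-edge pixels are equally frequent, so an edge map with many spurious or doubled edges tends to have higher entropy than a clean one.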

MSE and PSNR
The mean-square error (MSE) and peak signal-to-noise ratio (PSNR) are calculated for the proposed 2D SDFT edge detector. A comparison is made with other conventional edge detectors and the CNN. The MSE and PSNR [20] are given by

$$\mathrm{MSE} = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} \left[ I(i,j) - K(i,j) \right]^2, \qquad \mathrm{PSNR} = 10 \log_{10}\!\left( \frac{R^2}{\mathrm{MSE}} \right)$$

where M × N is the size of the image, I(i, j) is the ideal edge, K(i, j) is the detected edge, and R is the maximum possible pixel value of the detected edge. The calculated MSE and PSNR are listed in Table 2. The MSE and PSNR can be determined only for those images for which the ground truth is known. It can be noticed that the proposed 2D SDFT performs consistently well in the presence of noise, whereas the performance of the other methods varies with noise. Though the CNN performance is good in certain cases, it depends entirely on the bias and degrades with noise. For the bias T = −5 case, the PFOM is not good, but the MSE and PSNR are good up to a noise level of 15 dB. For the bias T = −0.5, which is used for testing the road images, the performance is good up to a noise level of 40 dB and then degrades.
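The two measures can be computed as in the sketch below, assuming the standard definitions: the MSE is the mean squared difference between the ideal and detected edge maps, and PSNR = 10 log10(R²/MSE). The function names are illustrative, and a perfect match is reported as infinite PSNR.

```python
import math


def mse(ideal, detected):
    """Mean-square error between two images of equal size."""
    rows, cols = len(ideal), len(ideal[0])
    total = sum((ideal[i][j] - detected[i][j]) ** 2
                for i in range(rows) for j in range(cols))
    return total / (rows * cols)


def psnr(ideal, detected, peak):
    """Peak signal-to-noise ratio in dB; peak is the maximum pixel value R."""
    err = mse(ideal, detected)
    if err == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / err)


ideal_map = [[0, 1], [1, 0]]
detected_map = [[0, 1], [1, 1]]   # one wrong pixel out of four
error = mse(ideal_map, detected_map)
quality = psnr(ideal_map, detected_map, peak=1)
```

Here one wrong pixel out of four gives an MSE of 0.25 and a PSNR of 10 log10(4), about 6.02 dB. Both measures require a ground-truth edge map, which is why they are reported only for images with known ideal edges.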

FPGA Implementation
The proposed edge detection algorithm is implemented on an FPGA. The Altera DE2-115 FPGA board, with a clock frequency of 50 MHz and a Cyclone IV 4CE115F29 FPGA device, is used for implementing the algorithm in real time. The proposed edge detector blocks are developed in the Simulink environment; the equivalent Verilog code is generated through HDL Coder 3.1 and downloaded into the FPGA. The communication between the PC and the FPGA board is established through the USB-Blaster. An SDRAM buffer of 307,200 bytes (640 × 480 pixels at 8 bits each) is used to store one video frame from the camera, and the edge-detected video is brought out through the DAC-VGA connector and finally displayed on the TFT monitor. The proposed edge detector algorithm is made available as a system-on-chip device by programming the FPGA through active serial programming, so that the algorithm is available permanently on the chip. Figure 17 shows the experimental set-up used for capturing the indoor environment, with the detected edge map displayed on a monitor in red lines, and the Altera FPGA board with a camera and its interface.

Utilization summary of the 2D SDFT method with a 2 × 2 window on the Cyclone IV FPGA.

The resource utilization summary of the Cyclone IV GX EP4CGX150DF31C7 FPGA chip is as follows.
SDFT, 4 × 4 window: total logic elements 48,784/149,760 (33%); total registers 47,268; total pins 360/508 (71%); total memory bits 51,152/6,635,520 (<1%); embedded multiplier 9-bit elements 8/720 (1%).
HDFT, 4 × 4 window: total logic elements 100,228/149,760 (67%); total registers 98,535; total pins 360/508 (71%); total memory bits 51,152/6,635,520 (<1%); embedded multiplier 9-bit elements 8/720 (1%).

Real-time road videos
The proposed edge detector is tested for edge detection in the following environments: (i) indoors, (ii) in a corridor, and (iii) outdoors on the road. In the first case, the video is recorded indoors by slowly moving the camera in our laboratory space, and all the edges of the room could be detected with both the 2-D SDFT and HDFT detectors. In the second case, the video is recorded in the corridor of the department under varying lighting conditions. The light in the corridor switches on and off automatically by sensing the presence of a person. A person walking towards the camera could be detected, along with all the edges in the corridor, under both poor and better lighting conditions. In the third case, the video is recorded outdoors in real-time road conditions; the lane markings and vehicles are detected by the proposed method.

CONCLUSION
The 2D SDFT/HDFT is used for developing edge maps of road images in real time. The suitability of this edge detector is explored for lane, road marking, and object detection in road images and videos. The proposed edge map generation technique performs better than the conventional Canny, Sobel, and Prewitt edge detectors in the presence of noise, without demanding any filtering or thinning processes.