Improved Method for Positioning Crane Grab Boom Corner Points using Hough Transform and K-means Clustering

To ensure that the crane can smoothly calibrate and align its lifting rod with the lifting holes of the beam body, image processing technology must be used to locate and detect the corner coordinates of the crane's lifting rod. Traditional corner detection methods are not suited to this scene, so this article proposes a new approach that locates corner coordinates through the intersections of straight lines. First, a grayscale difference map is constructed from the R and G channels of the RGB color space, which benefits Otsu's threshold segmentation. Second, an optimal adaptive threshold determination method is proposed to filter the vote counts in the clustering results and eliminate interfering straight lines, and the cluster centroid calculation is improved with a weighting formula based on each line's voting proportion, replacing the original cluster centroid as the basis for line fitting. Finally, the corner coordinates of the crane's grab boom are calculated from the line fitting results, and the recognition accuracy is compared under different lighting conditions. The method is significantly superior to traditional corner detection methods and provides a methodological basis for addressing the accuracy and robustness of algorithms for port cranes under multiple environmental variables.


INTRODUCTION
Stereo binocular vision plays a crucial role in the field of computer vision, particularly in tasks such as 3D reconstruction, object localization, and scene understanding. Accurate and precise localization of feature points is a critical issue in binocular vision, as it directly influences depth estimation and 3D spatial reconstruction. This study investigates the problem of feature point localization in binocular vision, aiming to improve the accuracy and reliability of stereo vision systems. In bridge construction operations, precise positioning of the grasping boom is essential. Leveraging the dual cameras installed on the bridge-building machine's hoisting apparatus, the principle of binocular disparity mapping 1,2 is used to calculate the three-dimensional coordinates of the center of the grasping boom's bottom surface. This enables precise control of the grasping boom's angle and direction, facilitating accurate alignment between the bridge-building machine's grasping boom and the hanging holes of segmental girders. Significant deviations in the position or angle of the grasping boom can lead to inefficiency, inaccurate operations, or even serious safety accidents. Therefore, detecting the corner coordinates of the grasping boom through image processing and calculating the three-dimensional coordinates of the bottom surface center via feature point matching in the left and right images allows real-time monitoring of its position and orientation. This makes it possible to determine whether the grasping boom is at the correct angle and position, ensuring efficient and accurate alignment between the grasping boom and the hanging holes of segmental girders. Additionally, detecting the corner coordinates of the grasping boom lays the foundation for the autonomous control and intelligent application 3,4 of the bridge-building machine. Through the detection and analysis of corner coordinates, the bridge-building machine can achieve automated control and intelligent decision-making, further enhancing operational efficiency and safety.
Three-dimensional spatial reconstruction techniques play a crucial role in object localization, and researchers have proposed various methods for calculating three-dimensional coordinates based on different principles. Yu et al. 5 addressed the challenges of insufficient features and difficult depth acquisition in monocular visual measurement and proposed a pose estimation model based on the triangulation depth calculation method. They employed a PnP solution based on three-dimensional point coordinates, which improved the accuracy and stability of visual measurement and laid the foundation for road 3D reconstruction. Zhang et al. 6 proposed and implemented a monocular-vision-based laser scanner position measurement method, combining visual localization technology with laser scanning for three-dimensional reconstruction of point clouds. Zhu et al. 7 proposed a binocular-vision-based indoor positioning and monitoring method to address potential daily risks for elderly individuals living alone. They established an imaging model using image data from calibrated binocular cameras, performed feature matching using BRIEF descriptors on the left and right images, and finally estimated the status of the elderly individual to analyze the risk of falling.
Currently, corner detection algorithms aim to identify the planar coordinates in an image that could potentially be corners. These algorithms include Harris corner detection, 8 Shi-Tomasi corner detection 9 (based on Harris corner detection), FAST corner detection, 10 and the SIFT feature detection algorithm, among others. However, these corner detection algorithms primarily focus on detecting all corners in the image and do not directly provide the coordinates of specific corner locations. Wang et al. 11 utilized binocular vision for weld seam recognition and localization. The blurred surface of the weld joint posed challenges in the binocular vision matching process; they proposed a multi-BRIEF descriptor stereo matching algorithm combined with a polarity constraint, which improved the matching performance and reduced the matching time. Sheng et al. 12 addressed the issue of misalignment in feature points when using binocular vision for robotic arm object grasping. They proposed an improved SURF algorithm that employed a reverse feature vector matching strategy to initialize the set of matching points; the RANSAC algorithm was then used to select the initial set of matching points based on the polar coordinate geometric model, resulting in a well-matched set. Zhao et al. 13 utilized binocular vision to identify the contour and distance information of fruit trees. They proposed a left-right image feature point matching method based on cosine distance and vector modulus. The results demonstrated that the improved SIFT stereo matching algorithm had almost the same recognition time but improved stability in recognition. Despite these advancements, the mentioned algorithms do not directly provide the coordinates of specific corner locations. Additionally, when dealing with large-scale scenes, these algorithms may encounter difficulties due to the large number of extracted feature points, leading to increased matching time and potentially numerous incorrect matches. Moreover, the SIFT, SURF, and ORB algorithms 14 are sensitive to noise and occlusion in the image, which may result in erroneous feature point extraction and matching.
This article presents a new approach for precise calibration and alignment of the crane's grab arm with the lifting holes of the segmental beams used in bridge construction. The proposed method employs a novel corner detection technique that calculates the intersection coordinates of fitted lines. This enables determination of the three corner coordinates on both sides of the crane's grab arm, ensuring a one-to-one correspondence between the corners in the left and right images and addressing the problem of feature point mismatches. Finally, these corner coordinates are used to fit the plane of the crane's lifting rod, thereby determining its position and orientation in space. To achieve corner detection through the calculation of line intersection coordinates, line detection and fitting must be performed on the edge point information in the left and right images. In image processing, two main categories of methods are commonly used for detecting pixels with linear features. The first category predicts the presence of straight lines by analyzing the distribution of pixels; the most representative techniques are the least squares method and the RANSAC line fitting algorithm. Li et al. 15 proposed an adaptive road detection method that combines lane lines and obstacle boundaries. This method utilizes an adaptive sliding window for feature selection and performs road line fitting using segmented least squares regression. To address inaccurate detection of rice crop rows, He et al. 16 employed a robust regression least squares method to fit the rice rows, effectively eliminating the interference caused by outliers. For complex backgrounds in which the horizon line is often submerged, Song et al. 17 presented a RANSAC-based method for horizon line detection. This approach involves image segmentation using the k-means algorithm, followed by line fitting using the RANSAC algorithm on the segmented horizon image. To address severe seedling compression in agricultural machinery spraying operations, Li et al. 18 proposed a vision-based navigation line detection method; the detected feature points are processed with the RANSAC algorithm to remove outliers, and the crop rows are then fitted using the least squares method. The least squares method, as mentioned earlier, is suitable for a specific set of pixels with linear features. However, it is sensitive to outliers, and severe deviations directly affect the accuracy of linear regression; when used for line detection, it may yield a line that deviates significantly. The RANSAC line fitting algorithm, on the other hand, is robust to outliers, but like the least squares method it applies only to a specific set of pixels with linear features. The second category of methods detects lines from a global perspective, allowing detection of any pixels with linear features in an image. Techniques such as the Hough transform, the probabilistic Hough transform, 19 and the LSD (line segment detector) 20 can detect global linear features. Josth et al. 21 addressed the slow real-time performance of the standard Hough transform and proposed an improved Hough transform accumulation scheme, making it suitable for computer systems with small and efficient memory operations. Chen et al. 22 introduced a navigation path extraction method for greenhouse cucumber picking robots based on the predicted-point Hough transform. They utilized a novel grayscale factor for image segmentation and performed Hough transform fitting on the predicted points to extract the navigation path. Ziwen et al.
23 presented a crop row detection method based on the Hough transform. They employed k-means clustering to identify different crop rows but did not eliminate interference lines in the Hough transform, so the accuracy of row fitting still needs improvement. Cha et al. 24 proposed a bolt loosening detection method based on the Hough transform and support vector machines (SVM). This approach utilized the Hough transform for bolt localization and employed SVM for discrimination and classification of bolt looseness; however, training the linear SVM was time-consuming. In this article, to enhance the segmentation between the crane's grab arm and the surrounding environment, a novel method for constructing differential grayscale images is proposed. Furthermore, for line detection and fitting of the crane's grab arm, an optimal adaptive threshold determination method is proposed to filter the voting results in the clustering process and eliminate interference lines. Subsequently, an improved calculation method for cluster centroids is introduced, using a weighted formula based on the proportion of each line's vote count; this method replaces the original cluster centroids as the basis for line fitting. Finally, the three sets of corner coordinates of the crane's grab arm in the left and right images are obtained by solving the equations of the fitted lines. This algorithm accurately computes the feature points required for stereo matching in binocular vision.
The structure of the article is as follows: Section 1 introduces the intelligent application of binocular stereo vision in bridge construction, different methods for calculating three-dimensional spatial coordinates, the advantages and disadvantages of feature point matching algorithms in binocular vision, and the current state of research. Section 2 describes the corner detection process for the crane's grab arm, including target image segmentation, the method for determining the optimal adaptive threshold, the vote-weighted calculation of cluster centroids, and the extraction of corner coordinates. Section 3 presents the components of the experimental platform and provides the corresponding experimental discussion and analysis. Section 4 summarizes the conclusions.

Improved image segmentation methods
The construction of difference grayscale images is a crucial step in color image segmentation: different differencing methods directly affect how well the crane grab arm is separated from the background objects, and thereby the accuracy of corner detection. The lateral color characteristics of the crane grab arm are blue and green. Traditional grayscale differencing methods include 2R − G − B, as shown in Equation (1), and 2G − R, as shown in Equation (2):

f(x, y) = 2R − G − B (1)

f(x, y) = 2G − R (2)

where R, G, and B are, respectively, the red, green, and blue components of the RGB image, and f(x, y) represents the grayscale value of each pixel.
Using methods such as 2R − G − B, 2G − B − R, and 2B − G − R for grayscale differencing can result in under-segmentation, as each targets only a single color channel (R, G, or B) and thereby misses complete segmentation of the image. The 2G − R method, on the other hand, may lead to over-segmentation. These grayscale differencing methods directly affect the effectiveness of image segmentation and consequently the accuracy of corner detection. To address these challenges, this study analyzes the color characteristics of the side of the crane grab arm. Typically, the green component G exhibits higher values than the red component R for the grab arm, so the grab arm appears brighter in the G channel than in the R channel. Taking advantage of this observation, the study proposes an alternative grayscale differencing method, as depicted in Equation (3):

f(x, y) = G − R, if G > R; f(x, y) = 0, otherwise. (3)

This method effectively extracts objects tending towards blue or green, improving the applicability of color extraction and avoiding under-segmentation. At the same time, it prevents excessive segmentation of background clutter by avoiding an overly large G component coefficient. The proposed grayscale differencing method produces a bimodal grayscale histogram separating the target object and the background, which is favorable for Otsu thresholding. Additionally, morphological opening and closing operations are employed for noise reduction, eliminating artifacts in the black background and within the white target region. Finally, the Canny edge operator is applied to obtain the edge map of the crane grab arm.
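As a concrete illustration, the piecewise G − R differencing of Equation (3) can be sketched in a few lines. This is a minimal pure-Python sketch; the function name diff_gray and the nested-list image layout are illustrative assumptions, and a real implementation would operate on NumPy/OpenCV arrays:

```python
def diff_gray(img):
    # Sketch of Equation (3): keep G - R where green dominates, else 0.
    # img is a list of rows, each row a list of (r, g, b) tuples (assumed layout).
    return [[max(0, min(255, g - r)) for (r, g, b) in row] for row in img]
```

On a greenish pixel such as (10, 200, 50) this yields 190, while reddish background pixels map to 0, which is what drives the bimodal histogram exploited by Otsu thresholding.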

Hough transform-based k-means clustering
According to the principles of the Hough transform, the more linearly the pixels in the edge detection image are arranged, the more likely they are to be detected. Each point mapped into Hough space corresponds to a straight line in the image coordinate system. The line is expressed in the image coordinate system and in the parameter space by Equations (4) and (5), respectively:

y = kx + b (4)

ρ = x cos θ + y sin θ (5)

In these equations, k and b represent the slope and intercept of the line in the image coordinate system, respectively, while ρ and θ represent the polar radius and polar angle of the line in the parameter space.
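The mapping of Equation (5) can be illustrated with a toy Hough accumulator. This is a pedagogical sketch, not the paper's implementation; the 1° angular steps, 1-pixel ρ bins, and the function name hough_votes are all assumptions:

```python
import math

def hough_votes(edge_points, theta_steps=180, rho_res=1.0):
    # Each edge point (x, y) votes once per discretized angle theta for the
    # bin of rho = x*cos(theta) + y*sin(theta)  (Equation (5)).
    acc = {}
    for x, y in edge_points:
        for t in range(theta_steps):
            theta = math.pi * t / theta_steps
            rho = x * math.cos(theta) + y * math.sin(theta)
            key = (round(rho / rho_res), t)  # (rho bin, theta index)
            acc[key] = acc.get(key, 0) + 1
    return acc
```

Three collinear points on the line y = 5 all vote for the bin (ρ = 5, θ = 90°), so that bin accumulates the maximum count of 3; the bin with the most votes therefore identifies the strongest linear arrangement.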
The crane grab arm has four edges that need to be detected. The Hough transform often detects a large number of lines, including many interfering ones. Therefore, a clustering approach is employed to cluster the parameter points (ρ, θ) obtained from the Hough transform. The centroid C_k = (ρ_centroid, θ_centroid) of each cluster is used as the fitted result, which facilitates obtaining the coordinates of the four lines necessary for determining the corner points. When other clustering algorithms are applied to grayscale differential images with significant interference, the parameter points of the detected lines may be clustered into an uncertain number of clusters. In contrast, the k-means algorithm requires a predetermined number of clusters K, ensuring that the lines detected by the Hough transform are clustered into four distinct classes, which intuitively corresponds to fitting the four lines. Therefore, this article uses the k-means algorithm to cluster the parameter points of the lines.
Specifically, for a given number of clusters K, let C_i denote the centroid coordinates of cluster i and x_ij denote the jth data point in cluster i. The sum of squared errors (SSE) for cluster i, denoted SSE_i, is calculated as shown in Equation (6), while the total SSE over all clusters is calculated as shown in Equation (7):

SSE_i = Σ_j ||x_ij − C_i||² (6)

SSE = Σ_{i=1}^{K} SSE_i (7)

The objective of the k-means algorithm is to find a partitioning scheme that minimizes the total SSE by iteratively updating the cluster centroids and reassigning data points to clusters. Here SSE_i represents the sum of squared errors for cluster i, SSE represents the total sum of squared errors, C_i represents the centroid coordinates of cluster i, and x_ij represents the coordinates of the jth data point within cluster i.
When the total sum of squared errors is minimized, the centroid coordinates within each cluster are calculated as shown in Equation (8):

C_k = (1 / n_k) Σ_{x ∈ Q_k} x (8)

Here, C_k denotes the centroid coordinates of the kth cluster, and n_k represents the number of coordinate points in the set Q_k.
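Equations (6)-(8) can be condensed into a short sketch that computes the centroid of one cluster of (ρ, θ) parameter points and its SSE. The function name sse_and_centroid is an illustrative assumption:

```python
def sse_and_centroid(cluster):
    # cluster: list of (rho, theta) parameter points belonging to one cluster.
    n = len(cluster)
    centroid = (sum(p[0] for p in cluster) / n,   # Equation (8): coordinate-wise mean
                sum(p[1] for p in cluster) / n)
    # Equation (6): squared Euclidean distance of each point to the centroid.
    sse = sum((p[0] - centroid[0]) ** 2 + (p[1] - centroid[1]) ** 2
              for p in cluster)
    return centroid, sse
```

k-means then alternates reassigning points to their nearest centroid and recomputing centroids, so that the total SSE of Equation (7), the sum of these per-cluster values, decreases at every iteration.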

Adaptive thresholding based on the mean and variance of the Hough transform voting counts
To ensure that the Hough transform does not miss the desired lines, lines must be detected with a low threshold. However, when the Hough transform threshold T1 is set low, the detection is often affected by noisy pixels, resulting in more interference lines in the results. In Section 2.2, the centroid coordinates C_k of the Hough-transform-based k-means clustering can be biased by the increase in interference points, leading to deviations in the line fitting results and affecting the accuracy of corner detection. To address these issues, this article analyzes the distribution and dispersion of vote counts in the parameter space for each cluster in the clustering results. It is found that low-vote lines are densely distributed while high-vote lines are sparsely distributed within each cluster. Therefore, this article proposes an adaptive thresholding method based on the mean and variance of the Hough transform vote counts. The method calculates an adaptive vote-count threshold within each cluster, retaining lines with higher vote counts and removing lines with lower vote counts, thereby obtaining a new set of clustering results consisting of four clusters.
Specifically, for the i-th cluster in the clustering results, the mean vote_x̄_i and standard deviation vote_S_i of the line vote counts in Hough space are calculated, and the adaptive threshold T_adapt_i is set according to Equation (9):

T_adapt_i = vote_x̄_i + m · vote_S_i (9)

The value of m is typically determined from experimental conditions and generally lies in the range [0, 2]. Using the piecewise function f(x_ij, T_adapt_i), the elements x_ij of the i-th cluster X_i whose vote counts satisfy the threshold T_adapt_i are placed into a new set, as shown in Equation (10):

f(x_ij, T_adapt_i) = x_ij, if vote_ij ≥ T_adapt_i; discarded, otherwise. (10)

After calculating the adaptive threshold for each cluster, four new clusters are obtained in which the vote counts are relatively high, effectively filtering out the interference lines detected at the low threshold.
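Assuming Equation (9) takes the mean-plus-m-standard-deviations form described in the text (this exact form, and the function name below, are assumptions rather than the authors' code), the vote filtering of Equation (10) can be sketched as:

```python
import statistics

def filter_cluster_by_votes(votes, m=1.0):
    # Equation (9) (assumed form): T_adapt = mean + m * std of the cluster's votes.
    t_adapt = statistics.mean(votes) + m * statistics.pstdev(votes)
    # Equation (10): keep only the lines whose vote count reaches the threshold.
    return [v for v in votes if v >= t_adapt]
```

For a cluster with votes [30, 31, 32, 90], only the strongly supported line with 90 votes survives, while a cluster of uniformly high vote counts is kept intact, matching the observation that dense low-vote lines are interference.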

Clustering centroid calculation and corner coordinate calculation method based on voting number weighting
Due to varying lighting conditions and changing backgrounds, image segmentation may suffer partial edge loss as well as over-segmentation and under-segmentation. The local arrangement of edge points in the edge detection image may shift, but the overall linear relationship remains relatively unchanged. It is therefore not sufficient to consider only the line with the highest vote count in each cluster: the lines with relatively lower vote counts in the new clustering results also provide valuable information for line fitting.
This article proposes a cluster centroid calculation method based on vote-count weighting to replace the original k-means cluster centroids as the basis for line fitting. The weight of each line's vote count is incorporated into the calculation: the higher a line's vote count in the parameter space, the stronger the linear arrangement of its pixels in the image coordinate system and the greater its weight. At the same time, lines whose local edge point arrangement has shifted still contribute to the calculation with correspondingly lower weights. The formula for cluster centroid calculation based on vote-count weighting is shown in Equation (11):

ρ̄_i = (Σ_j vote_ij · ρ_ij) / (Σ_j vote_ij),  θ̄_i = (Σ_j vote_ij · θ_ij) / (Σ_j vote_ij) (11)

In this equation, (ρ̄_i, θ̄_i) represents the weighted cluster centroid coordinates of the i-th cluster, ρ_ij and θ_ij represent the polar radius and polar angle of the j-th line in the i-th cluster, and vote_ij represents the vote count of the j-th line in the i-th cluster.
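The vote-weighted centroid of Equation (11) can be sketched as follows; the tuple layout (ρ, θ, vote) and the function name weighted_centroid are assumptions:

```python
def weighted_centroid(lines):
    # lines: list of (rho, theta, vote) for one cluster after threshold filtering.
    total = sum(v for _, _, v in lines)
    rho_c = sum(r * v for r, _, v in lines) / total    # vote-weighted mean of rho
    theta_c = sum(t * v for _, t, v in lines) / total  # vote-weighted mean of theta
    return rho_c, theta_c
```

A line with three times the votes pulls the centroid three times as strongly, which is how heavily supported edges dominate the fit while shifted, lower-vote lines still contribute.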
The method of determining the adaptive threshold from the mean and variance of line vote counts removes low-vote lines, improving the reliability of the lines to be fitted in each cluster. The improved cluster centroid calculation accounts for the weight proportion of each line's vote count, so the calculated results theoretically conform more closely to the true linear relationship of the edge points and achieve a better fit to the actual edges. The flowchart of the calculation process is shown in Figure 1.
By using the vote-count-based adaptive threshold to remove interfering lines and then applying the improved cluster centroid calculation, the coordinates of the four lines needed for corner calculation are obtained. Obtaining the corner coordinates involves calculating the intersection points of the four lines and determining the three required corner coordinates, which are output sequentially.

[Figure 1: flowchart of the calculation process — map the edge points (x, y) from the edge detection image onto curves in Hough space; initialize a two-dimensional accumulator array of vote counts; apply the Hough transform threshold to obtain the detected line parameters (ρ, θ, vote); perform outlier detection on the line parameters and normalize them.]

Given the polar coordinate representation of these lines as (ρ_i, θ_i), the equations of two known lines in the Cartesian coordinate system can be derived from their polar coordinates. The calculation formula is shown in Equation (12).
x cos θ1 + y sin θ1 = ρ1, x cos θ2 + y sin θ2 = ρ2 (12)

where (ρ1, θ1) represents the polar coordinate representation of line 1 and (ρ2, θ2) that of line 2. Solving the system formed by these two equations yields the intersection point (x, y) of the two lines, as shown in Equation (13):

x = (ρ1 sin θ2 − ρ2 sin θ1) / sin(θ2 − θ1), y = (ρ2 cos θ1 − ρ1 cos θ2) / sin(θ2 − θ1) (13)
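Equations (12) and (13) amount to solving a 2×2 linear system by Cramer's rule; a minimal sketch follows (the function name intersect_polar is assumed):

```python
import math

def intersect_polar(rho1, theta1, rho2, theta2):
    # Solve  x*cos(theta1) + y*sin(theta1) = rho1
    #        x*cos(theta2) + y*sin(theta2) = rho2   (Equation (12)).
    det = math.sin(theta2 - theta1)  # determinant of the coefficient matrix
    if abs(det) < 1e-9:
        return None  # (near-)parallel lines have no unique intersection
    x = (rho1 * math.sin(theta2) - rho2 * math.sin(theta1)) / det
    y = (rho2 * math.cos(theta1) - rho1 * math.cos(theta2)) / det
    return x, y
```

For example, the vertical line (ρ = 3, θ = 0) and the horizontal line (ρ = 4, θ = π/2) intersect at (3, 4).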
Equation (13) is used to solve for the intersection coordinates of the four straight lines, and the three corner points to be solved are then output based on the distribution characteristics of the corner points of the suspension rod grasped by the bridge erecting machine. The specific steps for finding the corner points are as follows:
1. Calculate the coordinates of the six intersections formed by the four straight lines.
2. Check whether the calculated intersection coordinates contain outliers; if so, recalculate the coordinates.
3. Take the absolute values of the coordinates of each intersection point.
4. Compare the ordinate values of the six intersection coordinates and remove the three sets of coordinates with the largest values.
5. Use prior knowledge of the suspension rod's corner layout to determine the correct horizontal ordering of the three corner points, arranging the coordinates in ascending order.
6. Output the coordinates of the corner points.

Image acquisition scheme for crane grab boom
The segment beam automatic grabbing device is connected to the crane through the crane's trolley, and the device is equipped with automatic grabbing rods on both sides of the bottom of the lifting frame. These rods are inserted into the lifting holes of the segment beam to complete the grabbing task. Once securely locked, the control system adjusts the position of the crane and the orientation of the lifting device to place the segment beam in the specified area. The industrial camera is positioned as shown in the schematic diagram of Figure 2, installed at the bottom of the lifting frame at a 30-degree angle from the vertical direction. The industrial camera follows the movement of the lifting device while maintaining a constant relative position with respect to the automatic grabbing rods, ensuring that the bottom of the rods remains at the center of the camera's field of view.
During operation, the crane pulls the lifting device to a certain distance directly above the segment beam to be grabbed, adjusting its orientation to align with the segment beam. The industrial camera automatically takes photos and transmits the captured images to a local computer through the image transmission module. The images are then processed using the method described in this article. Additionally, the captured photos are sent to the crane operator's cabin for reference. The local computer calculates the coordinates of the line intersection points and provides image pixel coordinates for photogrammetric measurement and three-dimensional spatial calculation to determine the three-dimensional position of the lifting device. The calculated parameters are output to the control system, which drives the adjustment mechanism of the lifting rods to achieve precise alignment between the rods and the lifting holes. Finally, the lifting device is lowered to automatically grab the segment beam (Figure 2).

Evaluation indicators for corner detection results
This article uses the average distance between the intersection coordinates of the detected lines and the corresponding corner coordinates of the lifting rods to evaluate the magnitude of the line detection error and to determine whether the results are within the error range (Figure 3). The manually labeled points A, B, and C represent the corner points of the lifting rods. The intersection of lines l1 and l2 is denoted as point "a", the intersection of lines l2 and l3 as point "b", and the intersection of lines l3 and l4 as point "c". Aa, Bb, and Cc represent the error distances between the corresponding coordinate points, and the average of these three distances is used to measure the detection results. The evaluation criteria are illustrated in Figure 3.
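The evaluation metric, the mean of the error distances Aa, Bb, and Cc, can be sketched as follows (the function name mean_corner_error is assumed):

```python
import math

def mean_corner_error(labeled, detected):
    # labeled:  manually labeled corners [A, B, C] as (x, y) pixel coordinates.
    # detected: the matching line intersections [a, b, c].
    dists = [math.dist(p, q) for p, q in zip(labeled, detected)]
    return sum(dists) / len(dists)  # average of Aa, Bb, Cc
```

A smaller value indicates that the fitted lines intersect closer to the true corners of the lifting rods.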

Image graying experiment
To ensure the reliability of the experiment, this article randomly selects an image of the grabbing arm of the bridge erecting machine in the experimental scene to validate the proposed method. Figure 4A,B shows images captured by the left and right cameras under night-time supplementary lighting conditions. Taking Figure 4A as an example, the different differential grayscale processing methods are analyzed. In Figure 4, the two sides of the lifting arm grasped by the bridge erecting machine are colored blue and green. Figure 5A constructs a differential grayscale image using the 2G − R method. Doubling the grayscale value of the G channel before subtracting the R channel produces a large grayscale difference in the target area, making the grayscale difference between the two sides of the lifting arm too pronounced. It also amplifies the grayscale difference between the background clutter and the surrounding environment, which is not conducive to subsequent Otsu binarization. Figure 5B constructs a differential grayscale image using the 2G − R − B method. Although this method highlights the green area more prominently, it fails to emphasize the blue area on the left side of the lifting arm. Moreover, this approach leads to excessive segmentation, highlighting the holes in the segment beam, which may cause segmentation failure between the target lifting arm and the surrounding environment after Otsu binarization. Figure 5C constructs a differential grayscale image using the G − R method. This method can extract objects in the image that tend towards blue or green, improving the applicability of color extraction and avoiding under-segmentation; it also prevents excessive segmentation of background clutter caused by a large G component coefficient. The grayscale differencing construction method employed in this article ensures that more than 90% of the pixels in the entire image have a grayscale value of 0. The resulting histogram exhibits distinct peaks and valleys in a bimodal distribution, which facilitates Otsu binarization. Figure 6A shows the result of Otsu binarization, where the lifting arm and the surrounding environment are completely separated. Figure 6B applies morphological processing to fill the holes in the binarized result, further eliminating unnecessary noise and interference. Finally, Figure 6C shows the final result of Canny edge detection, demonstrating that the detected edges align with the contour shape of the lifting arm.

Experiment on line clustering for corner detection
Before conducting the clustering experiment, the Hough transform must be performed on the edge detection image. Because the industrial camera moves continuously with the bridge crane as it grabs the lifting rod, the construction scene background changes constantly and lighting conditions vary, which inevitably introduces small interfering pixels into the differential grayscale image. Here, edge detection images obtained under adverse working conditions were selected for demonstration. Figure 7 shows the lines detected by the Hough transform with voting thresholds of 30, 40, and 60.
From Figure 7A, it can be seen that when the voting threshold is set to 30, the Hough transform detects many lines in the image: a smaller voting threshold allows shorter line segments to be detected, which clearly includes interfering pixel lines from the edge image. In Figure 7B, when the voting threshold is set to 40, the Hough transform detects significantly fewer interfering lines, so increasing the voting threshold improves the accuracy of line detection. However, when the voting threshold is set to 60, the Hough transform detects only 2 of the 4 edge lines of the crane's lifting rod, missing the other 2. This indicates that simply increasing the voting threshold is not sufficient for detecting the edge lines of the lifting rod. In the edge detection image shown in Figure 7, the arrangement of pixel points on the four edge lines of the lifting rod changes abruptly in linearity; that is, there are slight differences among the lines (ρ, θ) detected on the same edge line. These subtle differences directly affect the accuracy of line fitting, making it difficult to decide which detected line to select as the edge line of the lifting rod. Moreover, since the purpose of line detection in this article is to calculate the coordinates of the three corner points of the lifting rod from line intersections, the detected lines need to be fitted, which reduces the problem to four lines.
Figure 8 shows a partial view of the line detection. When the edge image is affected by uncontrollable factors such as lighting, a small portion of the correct edge pixels may be lost, but the overall linear relationship does not change significantly. In Figure 8A, among detected lines 1, 2, and 3, line 2 is detected from an edge with lost pixels; nevertheless, its linear relationship is close to that of the other detected lines, with only minor error. The proposed improved clustering-centroid calculation accounts for the weight proportion of each line, so its result should, in theory, better match the linear relationship of the real edge points. By determining an adaptive threshold from the voting statistics, lines with low vote counts are eliminated, improving the reliability of the lines to be fitted in each cluster; the improved centroid calculation then brings the line-fitting result closer to the real edge. The detection results are shown in Figure 8C.
The comparison between the proposed method and the original k-means clustering results is shown in Figure 9. Figure 9A illustrates the clustering results of the original k-means algorithm in Hough space for thresholds T1 of 25, 30, and 35. The k-means clustering centroids are marked by circles and diagonal crosses, and the data points of different clusters are shown in different colors. In Figure 9, when T1 is set to 25, numerous interfering line-parameter points remain in the Hough space, significantly disturbing the clustering: some interference points are falsely grouped together. When T1 is set to 30, the larger threshold removes some interfering line-parameter points, but enough interference remains that the clustering results are still inaccurate. When T1 is set to 35, further interfering points are removed and the line-parameter points within each cluster represent the correct lines precisely, giving better clustering results.
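The clustering step in Figure 9A can be sketched as plain k-means over the (ρ, θ) parameter points. The toy data below uses four synthetic groups, one per boom edge, and a naive deterministic seeding in place of the usual k-means++ initialization; all values are invented for the example:

```python
import numpy as np

def kmeans(points, k, iters=50):
    """Plain k-means in (rho, theta) space: assign each line-parameter
    point to its nearest centroid, then move each centroid to the plain
    mean of its cluster (the 'original' centroid rule compared above)."""
    # Naive seeding: one evenly spaced point per cluster (k-means++ is
    # the usual choice in practice).
    centroids = points[:: max(1, len(points) // k)][:k].astype(float).copy()
    for _ in range(iters):
        d = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = points[labels == j].mean(axis=0)
    return centroids, labels

# Four tight groups of (rho, theta) points, one per boom edge, plus jitter.
rng = np.random.default_rng(1)
edges = np.array([[20.0, 0.0], [120.0, 0.0], [40.0, 1.57], [160.0, 1.57]])
pts = np.vstack([e + rng.normal(0, 0.5, (10, 2)) for e in edges])
cents, labels = kmeans(pts, k=4)
```

With no interference points the plain mean works well; the failure mode discussed above appears when low-vote interference points are mixed into a cluster and drag the mean away from the true edge parameters.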

F I G U R E 9 Comparison between the proposed method and the original k-means clustering results in this article
On the other hand, Figure 9B shows the results of the proposed adaptive threshold combined with the improved clustering-centroid calculation. The method determines the optimal adaptive threshold from the voting statistics: it sets the threshold T_adapt(i) adaptively according to the distribution of vote counts, eliminating interfering line-parameter points, and then recalculates the clustering centroids using weights proportional to the vote counts. The proposed method performs the clustering analysis successfully for Hough transform thresholds of 25, 30, and 35. Overall, it demonstrates better robustness and accuracy in clustering than the original k-means approach.
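The filtering-and-reweighting step can be sketched as follows. The exact statistical rule for T_adapt(i) is not restated here, so the cluster's mean vote count is used as an illustrative stand-in; the weighting by voting proportion follows the description above, and all numbers are invented:

```python
import numpy as np

def refit_centroid(params, votes):
    """Vote-filtered, vote-weighted centroid for one cluster.

    `params` is an (n, 2) array of (rho, theta) points; `votes` are their
    Hough vote counts. The adaptive threshold used here (the cluster's
    mean vote count) is an illustrative choice, not the paper's formula.
    Surviving points are combined with weights proportional to their
    share of the remaining votes.
    """
    params = np.asarray(params, dtype=float)
    votes = np.asarray(votes, dtype=float)
    t_adapt = votes.mean()                  # adaptive threshold T_adapt(i)
    keep = votes > t_adapt                  # drop low-vote interference
    if not np.any(keep):                    # degenerate cluster: keep all
        keep = np.ones_like(votes, dtype=bool)
    w = votes[keep] / votes[keep].sum()     # voting-proportion weights
    return (w[:, None] * params[keep]).sum(axis=0)

# Three strong detections near the true edge plus one low-vote outlier.
params = [(100.0, 1.50), (101.0, 1.52), (99.0, 1.51), (140.0, 0.30)]
votes = [80, 75, 70, 12]
rho, theta = refit_centroid(params, votes)
```

The outlier is discarded by the threshold, and the weighted centroid stays near (100, 1.51); a plain mean over all four points would be pulled far toward the interference line.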

Error analysis of the corner detection results
The following is an error analysis of the corner detection results of the proposed algorithm and of the original k-means line-fitting method, with the Hough transform threshold set to 25, 30, 35, and 40. Four noisy edge detection images were selected for validation, as shown in Figure 10. Figure 10A shows that, for the same edge detection image, the errors of both algorithms decrease as the Hough transform threshold T1 increases. When T1 ≤ 35, line fitting with the original k-means algorithm yields significantly larger corner detection errors than the proposed algorithm; when T1 > 35, the errors of the two algorithms are comparable. Therefore, at low Hough transform thresholds the proposed algorithm has smaller errors and a clear advantage, achieving minimal error even on noisy edge detection images. This pattern holds across all four images.
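The per-image corner error can be quantified, for example, as the mean Euclidean pixel distance between detected and labelled corners; the exact metric behind Figure 10 is not restated here, so this is one plausible choice, and the coordinates below are invented:

```python
import numpy as np

def mean_corner_error(detected, ground_truth):
    """Mean Euclidean pixel error between detected corner coordinates
    and manually labelled ground truth (order-matched pairs)."""
    d = np.asarray(detected, dtype=float)
    g = np.asarray(ground_truth, dtype=float)
    return float(np.linalg.norm(d - g, axis=1).mean())

# Three detected boom corners vs. their labelled positions.
err = mean_corner_error([(120, 80), (240, 82), (181, 160)],
                        [(121, 80), (240, 85), (180, 160)])
```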

Analysis of corner detection results under different lighting conditions
To verify the effectiveness of the proposed method under different lighting conditions, a selection of line detection results is presented in Figure 11. Figure 11A shows the results under outdoor natural lighting: shadows have minimal impact, and despite slight deviations the crane boom contours are still detected. Figure 11B shows the results under indoor lighting, where the method detects the lines accurately. Figure 11C shows the results under low supplementary lighting: the corresponding lines are detected, although detection may occasionally fail because of the low light intensity. In Figure 11D, under high supplementary lighting, the ample light enables accurate line detection and excellent performance. Finally, Figure 11E shows the results under strong supplementary lighting, where the intense light highlights the target object and yields more accurate edge detection, giving the best overall performance. The recognition accuracy of the proposed algorithm for the crane boom under different lighting conditions is presented in Table 1, with 70 test images per lighting condition. An average error in [0, 10] pixels is counted as successful recognition, while an average error in (10, ∞) counts as failure. Under low lighting, detections with average error in [0, 2] pixels account for 90.0%, and the recognition success rate is 92.9%. The highest success rate is obtained under strong supplementary lighting, where detections with average error in [0, 2] pixels account for 97.1% and the recognition accuracy is 98.6%. The recognition accuracy of the Hough-Kmeans original centroid calculation method for corner detection is presented in Table 2; compared with the proposed method, its recognition accuracy is consistently lower. In particular, under low supplementary lighting, the poor illumination causes loss of edge points and introduces interfering impurities into the edge detection results, so its recognition success rate is only 64.3%.
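The success-rate criterion described above (an average error in [0, 10] pixels counts as success) can be sketched directly; the error values below are synthetic, chosen to reproduce a 92.9% rate over 70 images:

```python
def success_rate(mean_errors, max_error=10.0):
    """Fraction of test images whose average corner error falls in
    [0, max_error] pixels, matching the success criterion above."""
    ok = sum(1 for e in mean_errors if 0.0 <= e <= max_error)
    return ok / len(mean_errors)

# 70 test images: 65 within tolerance, 5 failures.
errors = [1.2] * 63 + [6.0] * 2 + [15.0] * 5
rate = success_rate(errors)   # 65 / 70
```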
Meanwhile, the proposed method is compared with the probabilistic Hough line detection method, the LSD method, and the Hough-Kmeans original centroid corner calculation method. As shown in Figure 12A, the probabilistic Hough line detection method exhibits high false-detection and omission rates, with irrelevant background

F I G U R E 1 Calculation flowchart of the proposed algorithm: apply the k-means clustering algorithm to the straight-line parameters to obtain cluster1-cluster4; compute the adaptive threshold from the vote counts within each cluster; retain the points with vote(i, j) > T_adapt(i) to obtain the updated clusters A-D; recalculate the centroid coordinates (ρ, θ) within clusters A-D based on the proportion of vote weights; and use the (ρ, θ) parameters to plot the fitted straight lines on the image.

F I G U R E 4
Images of the crane boom captured by the left and right cameras under night-time lighting conditions

F I G U R E 5 Different methods for grayscale conversion

F I G U R E 6 Binarization, morphological processing, and edge detection result image

F I G U R E 7 Hough transform detection results at different thresholds

F I G U R E 8
Local schematic diagram of line detection

F I G U R E 10 Error analysis of the proposed algorithm compared to the original k-means line fitting method

F I G U R E 11 Corner detection results under different lighting conditions

TA B L E 1 The accuracy of corner detection using the proposed algorithm under different lighting conditions. Average error range (pixel)

TA B L E 2 The accuracy of corner detection using the original clustering centroid fitting method under different lighting conditions. Average error range (pixel)

F I G U R E 12 Effectiveness of different line detection methods for corner detection