
Extracting grey relational systems from incomplete road traffic accidents data: the case of Gauteng Province in South Africa




Road traffic accidents are among the leading causes of death and injury in South Africa. Given the large volume of data generated from road traffic accidents, accident prediction has become a central challenge in transportation data analysis. Such prediction aims to detect the patterns involved in dangerous crashes and thus support decision making and planning before casualties and losses occur. Researchers have recently presented a wide range of prediction techniques. Most of these methods are based on statistical studies and usually fail to give insight into the prediction results. This has led to the development and application of supervised learning algorithms (classifiers) in an attempt to provide more accurate prediction of injury severity (fatal, serious, slight, or property damage with no injury). Even then, the task of learning an accurate classifier from instances raises a number of new issues, some of which have not been properly addressed in transportation research. An effective prediction method is therefore required to improve predictive accuracy.


The essence of this paper is the proposal that accident prediction under poor data quality (specifically, incomplete data) can be improved by using a classifier based on grey relational analysis, a similarity-based method. We evaluate the grey relational classifier against other state-of-the-art classifiers, including artificial neural networks, classification and regression trees, k-nearest neighbour, linear discriminant analysis, the naïve Bayes classifier, algorithm quasi-optimal and support vector machines. A real-world road traffic accident dataset is used for this task. Experimental results are provided to illustrate the efficiency and robustness of the grey relational classifier in terms of road traffic accident predictive accuracy.
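
To make the idea of a similarity-based grey relational classifier concrete, the following is a minimal sketch and not the paper's implementation. It assumes the commonly used grey relational coefficient with a distinguishing coefficient of 0.5, features already scaled to [0, 1], missing values encoded as NaN and simply skipped, and a prediction rule that takes the label of the training instance with the highest grey relational grade. The function names and the toy data are hypothetical.

```python
import numpy as np


def grey_relational_grades(query, train, zeta=0.5):
    """Grey relational grade of each training row with respect to the query.

    Assumption: attributes missing (NaN) in either the query or a training
    row are ignored for that row, rather than imputed.
    """
    query = np.asarray(query, dtype=float)
    train = np.asarray(train, dtype=float)

    # Absolute differences; NaN wherever either value is missing.
    delta = np.abs(train - query)
    observed = ~np.isnan(delta)
    if not observed.any():
        return np.zeros(len(train))

    d_min = np.nanmin(delta)
    d_max = np.nanmax(delta)

    if d_max == 0.0:
        # Every observed attribute matches the query exactly.
        coeff = np.where(observed, 1.0, np.nan)
    else:
        # Grey relational coefficient per attribute.
        coeff = (d_min + zeta * d_max) / (delta + zeta * d_max)

    # Grade = mean coefficient over the attributes observed for that row.
    counts = observed.sum(axis=1)
    sums = np.where(observed, coeff, 0.0).sum(axis=1)
    return np.divide(sums, counts, out=np.zeros(len(train)), where=counts > 0)


def predict(query, train, labels, zeta=0.5):
    """Return the label of the training instance most related to the query."""
    grades = grey_relational_grades(query, train, zeta)
    return labels[int(np.argmax(grades))]


# Hypothetical toy data: three accidents, three scaled attributes, with gaps.
train = [
    [0.10, 0.20, np.nan],
    [0.90, 0.80, 0.70],
    [0.15, np.nan, 0.30],
]
labels = ["slight", "fatal", "slight"]
query = [0.85, np.nan, 0.75]

grades = grey_relational_grades(query, train)
prediction = predict(query, train, labels)
print(grades, prediction)
```

Skipping missing attributes is only one way to handle incomplete records. Whether the paper skips, imputes, or treats missingness differently is not stated in the abstract.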