Extraction of optimal fatigue-driving steering indicators considering individual differences

Extracting effective steering indicators is critical for fatigue-identification based on steering behaviour. However, because of individual differences, the best calculation parameters of steering indicators differ from driver to driver, and an indicator's performance decreases when unified parameters are used instead of the individual driver's best parameters. The authors propose a model that extracts an individual driver's optimal fatigue steering indicators, calculated with that driver's best parameters. The model analyses each individual driver's naturalistic driving data. First, indicators in the sober and fatigue states are examined by the Wilcoxon test, and |Z| represents the indicator's fatigue-identification performance. A function reflecting the correspondence between the calculation parameters and the indicator's fatigue-identification performance is then constructed. Next, the function is optimized using particle swarm optimization to obtain the individual driver's best calculation parameters, which maximize |Z|. Finally, effective indicators calculated with the individual driver's best parameters constitute that driver's optimal fatigue indicator set. Indicators calculated with the individual driver's best parameters and with unified parameters are each used to establish individual fatigue-identification models. The average identification accuracies of the models using the individual driver's best parameters and unified parameters are 87.4% and 79.1%, respectively, which indicates that using the individual driver's optimal fatigue indicators can improve identification accuracy. This study can provide a reference for individualized fatigue-driving identification.

Applications of advanced sensor technology, computer technology, and big data mining technology in ITSs have promoted indicator extraction and modelling for fatigue-driving [5,6]. Many scholars have carried out fatigue-driving identification studies based on physiological-psychological signals [7], driving behaviour [5], vehicle movement [8], driving duration [7], etc., and have made great progress. Among them, fatigue-driving identification models using indicators derived from the steering wheel angle are common [9]. Some studies point out that a driver's fatigue is well reflected by steering behaviour [2,5,8], and that fatigue-driving can be identified validly by building a model based on the standard deviation of steering wheel angle (SDSWA) [10] or the static percentage of steering wheel movement (SPSWM) [11]. These fatigue-identification methods based on the steering wheel angle have many advantages, such as no interference with the driver, little influence from environmental factors, and good real-time performance. Moreover, steering angle sensors are cheap and simple to install. The fatigue-identification method based on the steering wheel angle has therefore been applied to develop anti-fatigue assisted driving systems in real vehicles [12,13]. For these reasons, we conduct research on fatigue-driving based on the steering wheel angle.
TABLE 1 [14]. SRR: time window 300 s, reversal threshold 6°; field environment. Gastaldi et al. [11]. MASE: time window 240 s; simulated environment. Thiffault et al. [15]. SDSWM: time window 300 s; simulated environment. Zhang et al. [16]. SDSWA: first-layer and second-layer time windows 60 s and 15 s, respectively; field environment. SPSWM: first-layer and second-layer time windows 60 s and 15 s, respectively; threshold of static angular velocity ±0.1°/s. Niu [17].
Research on fatigue-driving identification using steering indicators is summarized in Table 1. Researchers reached different conclusions when using various unified calculation parameters (such as time windows) to calculate indicators for all drivers. For example, in the simulated environment, Wang et al. used a 60-s time window to calculate SDSWM, NSR, and other indicators, and the fatigue-detection accuracy reached 64.15% [8]. Zhang et al. found that, in the field environment, the correlation between fatigue and SRR using these calculation parameters was highest, with a value of 0.42. This indicated that calculation parameters could greatly influence the fatigue-identification performance of indicators [14].
Most researchers used unified parameters to calculate steering indicators for all drivers and pooled all drivers' data to build models [2,10,16]. Although much progress was made, these studies neglected the influence of individual differences, which reduced the adaptability and accuracy of fatigue-identification for individual drivers [8,18].
Previous studies have shown that individual differences are ubiquitous among drivers of different gender, age, driving experience, and other individual properties [17,19,20]. Individual differences are also important factors affecting the accuracy of driving behaviour models [20], such as drink-driving models and fatigue-driving models. When fatigue occurs, drivers differ individually in many aspects, including changes in physiological and psychological characteristics [21,22], susceptibility to driving fatigue [23], changes in driving operation, and fatigue countermeasures [24]. Therefore, the effects of individual differences should be considered in research on the indicators, identification models, and countermeasures of driving fatigue.
In studies of fatigue-driving indicators, Xu et al. [25] argued that individual differences are an important factor affecting the performance of indicators. They found significant individual differences in indicators such as SDSWA and percentage of eyelid closure (PERCLOS) in the fatigue-driving state. Inger et al. [26] used a mixed-effects regression model to analyse the relationship between subjective fatigue value and both blink time and the standard deviation of lane position (SDLP). The result indicates that, whether or not the driver is fatigued, there are significant individual differences in both blink time and SDLP. Zhang et al. [14] found significant individual differences in SDLP and SRR calculated with the optimal unified parameters of all drivers; they also suggested that the best calculation parameters of an indicator differ among individuals. In research on fatigue-identification models, scholars found that individual differences are important factors affecting identification accuracy [8,9]. When indicators are calculated with unified parameters for all drivers, individual differences can lead to systematic errors in fatigue-identification [26]. Therefore, establishing an accurate fatigue-identification model at the individual level requires considering the impact of individual differences. Wang et al. [8] used individual drivers' non-intrusive indicators to establish a personalized fatigue-identification model. The results showed that considering individual differences can indeed improve the accuracy of fatigue-identification, which is attributed to using driver-specific fatigue thresholds for individual drivers.
The above studies showed the necessity of considering individual differences. Extracting effective fatigue-driving indicators for individual drivers is a prerequisite for establishing an individualized fatigue-driving model [8], and the fatigue-identification performance of indicators is significantly affected by the calculation parameters [16,17]. In most identification studies of fatigue-driving, the indicators of all drivers are calculated with unified calculation parameters [27,28]. However, because of individual differences, an indicator's best calculation parameters differ among drivers [14,29]. Previous research indicated that, for some drivers, using unified parameters instead of their best parameters to calculate indicators reduces the fatigue-identification performance of the indicators [14]. It can be inferred that when there is no significant difference between indicators of the awake state and the fatigue state, the fatigue-identification accuracy may not be ideal even if the individual driver's data are used to build personalized models. This has also been mentioned in some studies [15,20]. To improve the fatigue-identification performance of an indicator, its calculation parameters should be optimized for each individual driver. Effective indicators calculated with the individual driver's best parameters are then chosen to constitute that driver's optimal fatigue-driving indicator set, from which an individual fatigue-driving identification model can be established.
Therefore, considering the individual differences in an indicator's best calculation parameters, we propose a novel model that optimizes the indicator's calculation parameters for individual drivers and constitutes an optimal fatigue-driving indicator set for each driver. Then, to verify the performance of the individual driver's optimal fatigue-driving indicator set, we build a fatigue-driving identification model based on the back propagation (BP) neural network, a multi-layer feedforward neural network trained with the error back-propagation algorithm.
The rest of the paper is organized as follows. In Section 2, the extraction and performance verification models of the individual driver's optimal fatigue-driving steering indicators are introduced. In Section 3, we give the experimental details of data collection, the calculation parameters, and the formulas of the indicators. In Section 4, the optimal fatigue-driving steering indicators of some individual drivers are displayed, and the indicators' fatigue-identification performance and other aspects are analysed. Finally, conclusions and prospects are presented in Section 5.

Extraction model of optimal fatigue-driving steering indicator set of an individual driver
The extraction model of an individual driver's optimal fatigue-driving steering indicator set is shown in Figure 1. An individual driver's dataset is input into the model to obtain that driver's optimal fatigue-driving steering indicator set. As shown in Figure 1, the model can be divided into three parts, I-III. The aim of part I is to construct the function that reflects the correspondence between the indicator's calculation parameters and the indicator's fatigue-identification performance, represented by |Z|. |Z| is obtained by performing the Wilcoxon test on indicators in the sober and the fatigue state. The sober and fatigue-driving states are referred to as the two kinds of driving states hereinafter. In part II, we use particle swarm optimization (PSO) to optimize the function from part I and obtain the individual driver's best calculation parameters for the indicators. Finally, in part III, we use the Wilcoxon test to examine each indicator calculated with the best parameters and choose the indicators with p-value < 0.05 to constitute the individual driver's optimal fatigue-driving steering indicator set. The detailed process of the model is introduced as follows.
First, we determine the individual driver and the indicator to be analysed, and then extract the pre-processed naturalistic driving data of that driver (introduced in Section 3.2). Wilcoxon test: Because the dataset contains abnormal data, the sample data are not normally distributed. The number of time windows used to calculate an indicator determines the indicator's sample size [16]. In the field experiments, because the durations of the sober state usually differed from those of the fatigue state [14], the indicator sample sizes of the sober state differed from those of the fatigue state. Thus, we selected the Wilcoxon test to analyse the difference in indicators between the sober and fatigue states. The Wilcoxon test is often used to test for significant differences between unpaired samples from two groups when the sample data are not normally distributed [30]. |Z| in the Wilcoxon test is commonly used to quantify the difference between two sample groups, so we use |Z| to represent the fatigue-identification performance of indicators. The larger |Z| is, the greater the indicator's difference between the sober state and the fatigue state. The formula of Z in the Wilcoxon test is introduced as follows.
Indicator sample values are calculated according to the formulas and calculation parameters (introduced in Section 3.3). $S: S_1, \ldots, S_g, \ldots, S_n$ are the indicator sample values in the sober driving state, with sample size $n$. $F: F_1, \ldots, F_k, \ldots, F_m$ are the indicator sample values in the fatigue state, with sample size $m$. The indicator sample values of the two kinds of driving states are pooled (sample size $Q = n + m$) and sorted in ascending order.
In the pooled sample, $R_g$ is the rank of sample value $g$ of $S$, and $R_k$ is the rank of sample value $k$ of $F$. $N(0,1)$ means that the distribution of $Z$ conforms to the standard normal distribution.
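For orientation, the standard large-sample form of the rank-sum statistic for two unpaired samples can be written with the symbols defined above. This is a sketch of the textbook form only: the symbol $W_S$ is introduced here for illustration, and the authors' own expression may differ, for example by including a tie correction or by summing the ranks of $F$ instead of $S$.

```latex
W_S = \sum_{g=1}^{n} R_g, \qquad
Z = \frac{W_S - \dfrac{n(Q+1)}{2}}{\sqrt{\dfrac{n\,m\,(Q+1)}{12}}} \sim N(0,1),
\qquad Q = n + m
```

Under this form, |Z| is the same whichever of the two groups supplies the rank sum, because the two rank sums add up to $Q(Q+1)/2$.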
Finally, based on the indicator's calculation parameters and |Z| from the Wilcoxon test, we construct the function H, which represents how |Z| changes with the calculation parameters:
$$|Z| = H(O), \qquad O = (O_1, \ldots, O_j, \ldots, O_D)$$
where $O$ is the independent variable vector consisting of the indicator's calculation parameters, $O_j$ is the indicator's calculation parameter $j$ ($j = 1, 2, \ldots, D$, with $D$ the number of calculation parameters), and |Z| is the dependent variable.
The function evaluates the goodness of an indicator's calculation parameters: the larger |Z| is, the better the calculation parameters are. We therefore use PSO in part II to optimize the function and obtain the indicator's optimal calculation parameters, which maximize |Z|.
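The evaluation of |Z| for one parameter setting can be sketched in a few lines of Python. The sketch assumes the rank-sum form of the Wilcoxon test for unpaired samples, with the normal approximation and no tie or continuity correction; the function names `average_ranks` and `abs_z` are hypothetical and are not taken from the paper.

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ascending ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda idx: values[idx])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1  # mean of 1-based positions pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks

def abs_z(sober, fatigue):
    """|Z| of the rank-sum test (assumed form: normal approximation, no corrections)."""
    n, m = len(sober), len(fatigue)
    q = n + m
    ranks = average_ranks(list(sober) + list(fatigue))
    w = sum(ranks[:n])                  # rank sum of the sober-state samples
    mean_w = n * (q + 1) / 2            # expected rank sum under no difference
    sd_w = sqrt(n * m * (q + 1) / 12)   # its standard deviation
    return abs((w - mean_w) / sd_w)
```

In this reading, H maps a parameter vector to |Z| by first computing the indicator samples of the two driving states with those parameters and then calling a routine such as `abs_z` on them.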

2.1.2 Part II: Obtain the individual driver's best calculation parameters of the fatigue-driving indicator
In part II, we mainly introduce the optimization procedure of the indicator's calculation parameters based on PSO. Particle swarm optimization: PSO is an intelligent optimal solution search algorithm based on group collaboration [31]. PSO has many advantages, such as global searching ability, simple structure, and fast convergence speed due to the cooperation of multiple agents [31]. Thus, we choose PSO to optimize function H and obtain an individual driver's best calculation parameters to improve the fatigue-identification performance of indicators.
In the PSO of this work, the indicator's calculation parameters constitute the position vector $O_i = (O_{i1}, O_{i2}, \ldots, O_{ij}, \ldots, O_{iD})$ of particle $i$ ($i = 1, 2, \ldots, N$), where $O_{ij}$ represents calculation parameter $j$ of the indicator, $j = 1, 2, \ldots, D$, $D$ is the number of the indicator's calculation parameters, and $N$ is the total number of particles in the swarm. We set the particle swarm size ($N = 30$) and the search space. Each particle moves randomly in the search space with the velocity vector $V_i = (V_{i1}, V_{i2}, \ldots, V_{ij}, \ldots, V_{iD})$. Function H is used as the fitness function of a particle: the particle's position vector $O$ is the independent variable of H, and the goodness of the particle is evaluated by the dependent variable |Z|. The larger |Z| is, the better the particle is, that is, the better the fatigue-identification performance of the indicator is. Thus, the best position, with maximal |Z|, can be found through PSO. The best position vector gives the best calculation parameters of the indicator. The process of PSO is shown in Figure 2.
In Figure 2, first, we randomly initialize the position and velocity of each particle. Second, we calculate the fitness values of the particles. Third, we obtain the personal best position of particle $i$, $P_{bi} = (P_{bi1}, P_{bi2}, \ldots, P_{bij}, \ldots, P_{biD})$, and the global best position by comparing fitness values. Fourth, we update the velocity and position of the $i$th particle in each iteration according to Equation (3):
$$V_i^{q+1} = w^q V_i^q + c_1\,\mathrm{rand}_1\,(P_{bi}^q - O_i^q) + c_2\,\mathrm{rand}_2\,(G_b^q - O_i^q), \qquad O_i^{q+1} = O_i^q + V_i^{q+1} \tag{3}$$
where $V_i^{q+1}$ is the velocity of the $i$th particle at the $(q+1)$th iteration; $w^q$ is the inertia coefficient at the $q$th iteration; $O_i^q$ is the position vector of the $i$th particle at the $q$th iteration; $c_1$ and $c_2$ are the learning factors, $c_1 = c_2 = 2$; $P_{bi}^q$ is the personal best position of the $i$th particle at the $q$th iteration; $G_b^q$ is the global best position up to the $q$th iteration; and $\mathrm{rand}_1$ and $\mathrm{rand}_2$ are random variables.
Finally, if the termination condition is met, the iteration finishes and the individual driver's best calculation parameters of the indicator, represented by $G_b$, are output. We found that once the number of PSO iterations reaches about 200, the fitness value changes little. Therefore, to save running time, the termination condition is that the number of iterations reaches 200.
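The procedure can be illustrated with a minimal PSO sketch that maximizes a fitness function over a box-shaped search space. The swarm size (30), learning factors (2) and iteration count (200) follow the text. The linearly decreasing inertia weight, the velocity clamp, the clipping of positions to the bounds, the uniform random numbers, and the immediate update of the global best are assumptions made for this sketch, as are all function names. The toy fitness function merely stands in for H and has no connection to the experimental data.

```python
import random

def pso_maximize(fitness, bounds, n_particles=30, n_iter=200,
                 c1=2.0, c2=2.0, w_start=0.9, w_end=0.4, seed=0):
    """Maximize fitness(position) over box bounds [(low, high), ...]."""
    rng = random.Random(seed)
    dim = len(bounds)
    v_max = [0.2 * (hi - lo) for lo, hi in bounds]  # assumed velocity clamp
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[rng.uniform(-vm, vm) for vm in v_max] for _ in range(n_particles)]
    p_best = [p[:] for p in pos]
    p_best_fit = [fitness(p) for p in pos]
    g_idx = max(range(n_particles), key=lambda i: p_best_fit[i])
    g_best, g_best_fit = p_best[g_idx][:], p_best_fit[g_idx]

    for q in range(n_iter):
        # Assumed schedule: inertia weight decays linearly from w_start to w_end.
        w = w_start - (w_start - w_end) * q / max(n_iter - 1, 1)
        for i in range(n_particles):
            for j in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][j] = (w * vel[i][j]
                             + c1 * r1 * (p_best[i][j] - pos[i][j])
                             + c2 * r2 * (g_best[j] - pos[i][j]))
                vel[i][j] = max(-v_max[j], min(v_max[j], vel[i][j]))
                lo, hi = bounds[j]
                pos[i][j] = max(lo, min(hi, pos[i][j] + vel[i][j]))
            fit = fitness(pos[i])
            if fit > p_best_fit[i]:          # new personal best
                p_best[i], p_best_fit[i] = pos[i][:], fit
                if fit > g_best_fit:         # new global best
                    g_best, g_best_fit = pos[i][:], fit
    return g_best, g_best_fit

def toy_fitness(o):
    """Hypothetical smooth stand-in for H with its maximum at (50, 12)."""
    t1, t2 = o
    return -((t1 - 50.0) ** 2) / 100.0 - (t2 - 12.0) ** 2

best, best_fit = pso_maximize(toy_fitness, bounds=[(10, 80), (1, 25)])
```

In the setting of this paper, `toy_fitness` would be replaced by a routine that computes the indicator samples for the candidate parameters and returns |Z|.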
Part III: Choose indicators calculated by the individual driver's best parameters to constitute the individual driver's optimal fatigue-driving indicator set
After part II, we obtain the indicator's best calculation parameters for an individual driver. However, whether the indicator calculated with the individual driver's best parameters is effective for identifying fatigue-driving still needs to be analysed. So, in part III, we test the differences between indicators in the two kinds of driving states calculated with the best parameters. For an individual driver, first, we calculate the indicator using the individual driver's best parameters output by PSO. Second, we perform the Wilcoxon test on the indicator in the two kinds of driving states and obtain the p-value. Lastly, we choose the indicators with p-value < 0.05 to constitute the optimal fatigue-driving indicator set of the individual driver. The indicators in the set are calculated with the individual driver's best parameters.
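The selection step in part III can be sketched as follows, assuming the two-sided p-value is obtained from Z through the standard normal distribution. The function names and the example |Z| values are hypothetical and serve only as an illustration.

```python
from math import erfc, sqrt

def two_sided_p(z):
    """Two-sided p-value of a standard normal statistic z."""
    return erfc(abs(z) / sqrt(2))

def select_effective(z_by_indicator, alpha=0.05):
    """Keep the indicators whose test is significant at level alpha."""
    return [name for name, z in z_by_indicator.items() if two_sided_p(z) < alpha]

# Hypothetical |Z| values for one driver, for illustration only.
example = {"SDSWA": 3.1, "MUSWA": 1.2, "SPSWM": 2.4}
selected = select_effective(example)
```

With these illustrative values, only the indicators whose |Z| exceeds roughly 1.96 are retained.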

Performance verification model for optimal fatigue-driving indicator set of an individual driver based on BP neural network
To verify the performance of an individual driver's optimal fatigue-driving indicator set, we build an individual fatigue-identification model based on the BP neural network. The BP neural network is a powerful model for solving complex nonlinear classification problems [32]. For an individual driver, we use that driver's optimal fatigue-driving indicator set to build a fatigue-identification model named FM_I. The indicators in FM_I are calculated with the individual driver's best parameters. For comparison, we also use indicators calculated with unified parameters to build an individual fatigue-identification model named FM_U. The types of indicators in FM_U are the same as those in FM_I. By comparing the identification accuracy of FM_U and FM_I, we can illustrate the fatigue-identification advantage of the individual driver's optimal fatigue-driving indicator set.

Experiment design
1. Experiment participants: The influence of gender on fatigue-driving is not analysed in this paper, and according to statistics, the proportion of female professional drivers in the district was low. Therefore, 35 male professional drivers and five female professional drivers were recruited, a ratio of men to women similar to the real situation. The occupations of the participants included taxi drivers, bus drivers, etc. All participants held a valid driving licence of class C1 or above. The average age of participants was 46.83 years (SD = 5.62). Participants were physically and mentally healthy and had regular work and rest schedules. The average driving experience of participants was 16.53 years (SD = 6.10), so they had good driving skills and stable driving habits. Coaches with 30 years of driving experience were recruited as observers in the front passenger seat; they were trained to use the KSS to evaluate the driving state of participants and to take safety measures in case of an emergency. The experiment protocol, including experimental details and possible risks, was introduced to all participants. Each participant signed informed consent before the experiments and was paid after finishing the whole experiment. The study was carried out according to the requirements of the local ethics board, and the private information of each participant was protected.
2. Experiment apparatus: Field experiments were carried out using a human-car-environment data acquisition platform built on a real vehicle. The original steering wheel angle was obtained by a steering angle sensor with 0.1° resolution and 20-Hz sampling frequency, installed on the vehicle's steering axle. A section of the Hanshi highway was chosen as the experiment route because the road is relatively straight, the surface is in good condition, and the traffic volume is small. The experiment apparatus and the route are shown in Figure 3.
3. Experiment process: The main purpose of this study was to address the effect of individual differences on fatigue-identification. Weather and traffic volume can affect naturalistic driving behaviour [33]. If these two factors were not controlled, heterogeneity of driving behaviour might be caused by changes in weather or traffic volume rather than by individual differences. However, in a field experiment, weather conditions such as light and humidity, as well as traffic volume, are dynamic and cannot be set as accurately and conveniently as in a driving simulator [33]. Therefore, referring to previous studies [14,29], these two factors were controlled by performing the experiment under well-lit weather conditions and low traffic volume, which reduces the complexity of the experiment, and by excluding periods where the data were significantly affected by these factors.

Data preprocessing
To obtain data that fit the requirements of the model in Figure 1, we preprocess the original experimental data according to the process shown in Figure 4. Before data preprocessing, we synchronize the time series of the multi-sensor data based on the timestamp of each sensor, so that the multi-source data correspond to the same driving state at the same time. The multi-sensor data of every 5 min correspond to the same KSS value because the KSS is evaluated every 5 min. Besides, this paper does not consider the impact of road curves and the traffic environment. Based on the video recorded outside the vehicle, the data from scenes with corners and with lane changes due to congestion are excluded to eliminate the influence of small-radius curves and traffic congestion on steering behaviour.

Design double-layer time window
Time windows are important indicator calculation parameters to be optimized, and they have significant impacts on the fatigue-identification performance of indicators [14,17]. According to the reference [16], we propose the double-layer time window to

Extract consecutive driving data on the highway
Because of the high speed and high risk of fatigue-driving on the highway, we focus on continuous driving on the highway. Based on the video recorded outside the vehicle, the experimental data in continuous driving scenarios are extracted. From the speed statistics under the continuous driving scenario, we find that the minimum speed of most observation units is above 80 km/h. We therefore select observation units whose lowest speed exceeds 80 km/h as the observation units of continuous driving on the highway.
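The speed criterion can be expressed as a simple filter. This is only a sketch: the data layout (a mapping from an observation-unit identifier to its speed samples in km/h) and the function name are assumptions made for illustration.

```python
def continuous_units(speed_by_unit, min_speed_kmh=80.0):
    """Return the ids of observation units whose lowest speed exceeds the threshold."""
    return [uid for uid, speeds in speed_by_unit.items()
            if speeds and min(speeds) > min_speed_kmh]

# Hypothetical speed samples (km/h) for three observation units.
units = {"u1": [92.0, 95.5, 90.1], "u2": [88.0, 79.5, 91.0], "u3": [101.2, 99.8]}
kept = continuous_units(units)
```

Here unit `u2` is dropped because one of its samples falls below 80 km/h.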

Indicator calculations
According to studies [2,8,9], the steering behaviour indicators shown in Table 2, calculated with unified parameters, are chosen for analysis.
To highlight the advantages of the individual driver's best calculation parameters, we also use unified parameters to calculate all participants' indicators. The more participants whose fatigue state can be detected using the indicator calculated with a unified parameter, the better that unified parameter is. Thus, we use the Wilcoxon test to examine each driver's indicator samples and use the number of participants with p-value < 0.05 as the optimization target. Based on related research [8,16] and several trials, we obtain suitable unified parameters for every indicator, which give more participants with p-value < 0.05.
Definitions and calculations of the indicators are as follows:
1. SDSWA reflects the stability of steering wheel rotation. $n$ is the total number of sample points in the $T_2$ time window, and $SWA_i$ is the steering wheel angle at sample point $i$; the value is positive for turning right and negative for turning left.
2. MLSWA and SDLSWA reflect the range and stability of large left turns, respectively. The $SWA_i$ values in the $T_2$ time window are sorted in ascending order, and the value at the 25% position is the lower quartile (LSWA). MLSWA and SDLSWA are the mean and standard deviation, respectively, of the $SWA_i$ values smaller than LSWA in the $T_2$ time window. $n_l$ is the total number of sample points smaller than LSWA.
3. MUSWA and SDUSWA reflect the range and stability of large right turns, respectively. The $SWA_i$ values in the $T_2$ time window are sorted in ascending order, and the value at the 75% position is the upper quartile (USWA). MUSWA and SDUSWA are the mean and standard deviation, respectively, of the $SWA_i$ values larger than USWA in the $T_2$ time window. $n_u$ is the total number of sample points larger than USWA.
4. MASWM and SDSWM reflect the speed and stability of the steering wheel movement, respectively. $SWM_i$ is the steering wheel speed in the $i$th second, and $nv$ is the total number of sample points for steering wheel movement in the $T_2$ time window.
5. SPSWM reflects whether the steering behaviour is timely and appropriate:
$$\mathrm{SPSWM} = \frac{nv_s}{nv} \tag{8}$$
where $nv_s$ is the number of sample points within the range of ±TSV in the $T_2$ time window.
For indicators ①-⑦, the calculation parameters to be optimized are $T_1$ and $T_2$. For indicator ⑧, the calculation parameters are $T_1$, $T_2$, and TSV.
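The definitions above can be sketched for a single $T_2$ window as follows. Only the SPSWM ratio is taken directly from Equation (8); the other expressions are assumptions made for illustration. In particular, the sketch uses the population standard deviation, the inclusive quartile method of Python's `statistics` module, and the mean of absolute steering speeds for MASWM. The function name and the input layout are hypothetical.

```python
from statistics import mean, pstdev, quantiles

def steering_indicators(swa, swm, tsv):
    """Indicators for one T2 window (assumed formulas, see the lead-in).

    swa: steering wheel angles; swm: steering wheel speeds;
    tsv: threshold bounding the range -tsv .. +tsv used by SPSWM.
    """
    lswa, _, uswa = quantiles(swa, n=4, method="inclusive")  # lower/upper quartiles
    left = [a for a in swa if a < lswa]    # large left turns (below LSWA)
    right = [a for a in swa if a > uswa]   # large right turns (above USWA)
    return {
        "SDSWA": pstdev(swa),
        "MLSWA": mean(left),
        "SDLSWA": pstdev(left),
        "MUSWA": mean(right),
        "SDUSWA": pstdev(right),
        "MASWM": mean([abs(v) for v in swm]),
        "SDSWM": pstdev(swm),
        "SPSWM": sum(1 for v in swm if -tsv <= v <= tsv) / len(swm),
    }

# Hypothetical window, for illustration only.
angles = [-4, -3, -2, -1, 0, 1, 2, 3, 4]
speeds = [0.05, -0.2, 0.1, 0.5]
ind = steering_indicators(angles, speeds, tsv=0.1)
```

A window with many identical angle values can leave `left` or `right` empty, in which case `mean` raises an error; a production version would need to handle that case.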

Individual participant's optimal steering indicator set of fatigue-driving
According to the reference [16], the duration of the fatigue state is generally 15-75 s, while the typical duration of fatigue characteristics is 5-20 s. To avoid missing the optimal time window, this paper appropriately expands the optimization scope of the time window. We set the optimization ranges of $T_1$ and $T_2$ to 10-80 s and 1-25 s, respectively. The data of each participant are input into the model shown in Figure 1, and the calculation parameters of the eight indicators are optimized individually. Because of the limit on article length, we randomly select six participants and display each participant's optimal fatigue-driving steering indicator set in Table 3. The sequence numbers of the six example participants are 4, 11, 18, 19, 25, and 29. As shown in Table 3, the same indicator has different best calculation parameters for different participants. Most indicators calculated with the best parameters of individual participants can effectively distinguish fatigue-driving (p-value < 0.05). However, for some participants, indicators calculated with the best parameters still cannot distinguish the fatigue-driving state. An example is the MUSWA of participant No. 4 (p-value > 0.05), which therefore cannot be chosen to constitute the optimal fatigue-driving indicator set of participant No. 4.
We take participant No. 25 as an example to show the optimization process of the calculation parameters of SDSWA. Figure 5 shows that the maximum |Z| of SDSWA gradually increases during the iterative process of PSO. This indicates that the model shown in Figure 1 can improve the fatigue-identification performance of the indicator by optimizing the calculation parameters.
Note (Table 3): * means p-value < 0.05, and the indicator with p-value < 0.05 is selected as the effective indicator of fatigue-driving.

FIGURE 5 The maximum |Z| of SDSWA in iterations of PSO

Comparison and individual difference analysis of indicators calculated by unified parameters and individual participant's best parameters
We take SDSWM as an example to compare the fatigue-identification performance of indicators calculated with unified parameters and with the individual participant's best parameters. Further, we explore the individual differences of the indicator among participants.
The unified parameters and the individual participant's best parameters are referred to as the two types of parameters. For each participant, we use the Wilcoxon test to examine SDSWM calculated with each of the two types of parameters. Figure 6(a) shows a radar map of the |Z| of SDSWM, where the SDSWM of each participant is calculated with each of the two types of parameters. Figure 6(b,c) shows the distribution of SDSWM in the two kinds of driving states calculated with the unified parameters and with the individual participant's best parameters, respectively.
In Figure 6(a), the dashed and the solid line are constituted by |Z| of SDSWM calculated by unified parameters and |Z| of SDSWM calculated by individual participant's best parameters, respectively. The solid line polygon wraps the dashed line polygon. It indicates that for every participant, the |Z| of SDSWM calculated by individual participant's best parameters is bigger than that calculated by unified parameters. The result shows that the fatigue-identification performance of SDSWM calculated by individual participant's best parameters is stronger.
Besides, both the solid and dashed lines in Figure 6(a) are irregular polygons rather than circles, which shows that the |Z| of SDSWM differs among participants. According to Figure 6(b,c), for each participant, the median value of SDSWM in the fatigue-driving state is higher than that in the sober driving state. The result shows that the stability of the steering wheel is poor when a participant is driving while fatigued, which is consistent with existing research conclusions [2]. Comparing the SDSWM box plots of the six example participants in Figure 6(b,c), the distribution of SDSWM also differs among participants. According to the SDSWM box plots of the same participant in Figure 6(b,c), for most participants, the difference between the distributions of SDSWM in the two kinds of driving states is larger when SDSWM is calculated with the individual participant's best parameters. For No. 4, the difference between the medians of SDSWM in the two kinds of driving states is 0.03 when calculated with unified parameters and 0.06 when calculated with the individual participant's best parameters. This indicates that SDSWM calculated with the individual participant's best parameters can better distinguish the fatigue-driving state, and the result corresponds to the conclusion obtained from Figure 6(a).
From the analysis of Figure 6, we can conclude that using the individual participant's best calculation parameters to calculate indicators can improve the fatigue-identification performance of a single indicator. However, the influence of the individual participant's optimal fatigue indicator set on the accuracy of fatigue-identification needs further study. Besides, among different participants, there are significant individual differences in the indicator's fatigue-identification ability and distribution. Therefore, we use each participant's optimal fatigue indicator set to build driver-specific fatigue-identification models to verify the performance of indicators calculated with the individual driver's best calculation parameters.

Performance verification for individual participant's optimal steering indicator set of fatigue-driving
For each participant, sample data of the indicator set calculated by the two types of parameters are input to the BP neural network respectively. We establish two kinds of driver-specific fatigue-identification models namely FM_U and FM_I. By comparing the identification accuracy of FM_U and FM_I, we can verify the performance and advantage of the individual participant's optimal steering indicator set of fatigue-driving.
Every observation unit ($T_1$ time window) is one sample for fatigue-driving identification. According to each participant's total sample size, a certain proportion of the samples is selected as the training sample set. Identification accuracy equals the number of correctly identified test samples divided by the total number of test samples.
Through multiple trials, we obtain the best number of hidden layers and the most suitable number of neurons in each hidden layer. On this basis, we build the best FM_U and FM_I of an individual participant and calculate their best identification accuracies. Meanwhile, to avoid random effects, the average identification accuracies of FM_U and FM_I are calculated from the identification accuracies of the models with different hidden layer structures produced in the trial process.
In Figure 7(a,b), participant No. 25 is taken as an example to show the training process of the best FM_U and the best FM_I. The sample set input to the neural network is divided into three parts: training, validation, and testing. Following convention, the proportions of the three parts are 70%, 15%, and 15%, respectively. Comparing Figure 7(a) and 7(b), the best validation mean square errors of FM_I and FM_U are 0.21435 and 0.22898 respectively. The best validation mean square error of FM_I is smaller than that of FM_U, which indicates that FM_I is more accurate. Besides, the best FM_I achieves its optimal fit at the seventh epoch, while the best FM_U achieves its optimal fit at the tenth epoch, which indicates that FM_I converges faster.
Figure caption: The best and average identification accuracies of FM_U and FM_I for the six example participants
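The 70%/15%/15% division of the sample set can be sketched as a random index split. This is only an illustration: the function name, the fixed seed, and the rounding rule are assumptions, and the paper does not state how fractional sample counts were handled.

```python
import random


def split_samples(n_samples, fractions=(0.70, 0.15, 0.15), seed=0):
    """Randomly split sample indices into training, validation and test parts.

    The training and validation sizes are rounded to the nearest integer;
    the test part receives whatever remains (an assumed convention).
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(fractions[0] * n_samples)
    n_val = round(fractions[1] * n_samples)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test
```

With 352 observation units, for example, this rule gives 246 training, 53 validation, and 53 test samples.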
As shown in Figure 7(c), the best identification accuracy of the best FM_I is higher than that of the best FM_U. Taking participant No. 19 as an example, the best identification accuracy of the best FM_I is 87.5%, which is 6 percentage points higher than the 81.5% of the best FM_U. The average identification accuracy of FM_I is also slightly higher than that of FM_U, which indicates that the improvement from FM_I is fairly consistent across network structures. Averaged over the 35 participants, the best identification accuracy is 79.1% for FM_U and 87.4% for FM_I. This comparison shows that using the individual participant's optimal fatigue-driving steering indicator set can effectively improve identification accuracy. The reason is that FM_I uses fatigue indicators calculated with the individual driver's best parameters, which have stronger fatigue-identification performance, as illustrated in Figure 6.
As reported in previous research, the sample size has an impact on the accuracy of the BP neural network [34]. Thus, we analyse the number of observation units (sample size) of the models. Because the number of observation units is determined by the driving time in continuous scenes and by T1, each participant has a different number of observation units. When the unified T1 of all participants is 60 s, the number of observation units ranges from 322 to 364 (Mean = 343, SD = 14). Because the unified T1 and an individual participant's best T1 differ, the sample sizes of FM_U and FM_I are different. To eliminate the impact of this difference on model accuracy, stratified random sampling is applied to the larger sample set so that the sample sizes of FM_U and FM_I are the same. For instance, for participant No. 29, the sample sizes of the indicators calculated with 60 s and 50 s windows are 352 and 416, respectively. According to the proportion (6:4) of sober and fatigue samples in the indicators calculated with the 50 s window, 196 and 156 samples are randomly selected from the sober and fatigue samples respectively, so that a total of 352 samples are drawn for building FM_I. Therefore, differences in accuracy between FM_U and FM_I are mainly caused by the difference in the fatigue-identification performance of the indicators.
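The sample-size equalization can be sketched as proportional stratified subsampling. The code below is an assumption-laden illustration rather than the authors' procedure: the function name, the largest-remainder rounding, and the fixed seed are hypothetical. The stratum sizes in the usage note (232 sober, 184 fatigue) are also hypothetical, chosen only because they sum to 416 and yield the reported draw of 196 and 156 samples.

```python
import random
from collections import defaultdict


def stratified_subsample(labels, target_size, seed=0):
    """Draw target_size sample indices while preserving the label proportions."""
    if not 0 < target_size <= len(labels):
        raise ValueError("target_size must be between 1 and len(labels)")

    # Group sample indices by state label (e.g. "sober" / "fatigue").
    strata = defaultdict(list)
    for index, label in enumerate(labels):
        strata[label].append(index)

    # Proportional allocation; leftover slots go to the largest remainders.
    exact = {lab: len(idx) * target_size / len(labels) for lab, idx in strata.items()}
    quota = {lab: int(value) for lab, value in exact.items()}
    leftover = target_size - sum(quota.values())
    by_remainder = sorted(exact, key=lambda lab: exact[lab] - quota[lab], reverse=True)
    for lab in by_remainder[:leftover]:
        quota[lab] += 1

    # Simple random sampling without replacement inside each stratum.
    rng = random.Random(seed)
    chosen = []
    for lab, idx in strata.items():
        chosen.extend(rng.sample(idx, quota[lab]))
    return sorted(chosen)
```

Called with 232 "sober" and 184 "fatigue" labels and a target size of 352, the allocation step assigns 196 sober and 156 fatigue samples.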
The analysis of Figure 7(c) shows that the fatigue-identification accuracy of FM_I is better than that of FM_U. Besides, the increase in the best identification accuracy achieved by FM_I differs among individuals. For instance, the increase for participant No. 11 is the largest at 8.9%, while the increase for participant No. 29 is the smallest at 3.2%. Thus, for participant No. 11, it is particularly important to adopt the individual participant's optimal fatigue indicator set when building the fatigue-driving identification model.

CONCLUSION
In ITSs, establishing an accurate fatigue-identification model is the key to developing anti-fatigue advanced assisted driving systems. However, individual differences are an important reason for the reduced performance of fatigue-identification models [8]. Therefore, considering individual differences, we propose a model that optimizes the individual driver's calculation parameters of indicators and extracts the individual driver's optimal indicator set of fatigue-driving. We apply the model to each participant's naturalistic driving data and obtain each participant's optimal fatigue-driving steering indicator set. The analysis shows that indicators calculated with the individual participant's best parameters are more effective and that there are significant individual differences in the distribution of indicators. Further, we use the indicators calculated with the unified parameters and with the individual participant's best parameters to establish driver-specific fatigue-identification models, and compare the identification accuracies of FM_U and FM_I. The averages of the best identification accuracy of the best FM_U and the best FM_I over all participants are 79.1% and 87.4% respectively. The result shows that the fatigue-driving identification model using the individual participant's optimal fatigue-driving steering indicator set is more accurate. The performance of FM_I is similar to that reported in ref. [8], where the fatigue-identification accuracy at the individual driver level was 88.6% in a simulated driving environment. Many factors interfere with fatigue, such as traffic volume, approaching vehicles, and outside noise, which makes fatigue identification in field experiments difficult [14]. Given these difficulties, the performance of FM_I in the real vehicle environment is satisfactory.
Admittedly, there are some shortcomings in our study. For example, limited by the actual conditions, the amount of naturalistic driving data for some participants is insufficient, which leads to inadequate model training and limits the fatigue-identification accuracy. In addition, this paper only studied the steering indicators of fatigue-driving. Besides, variations in steering wheel angle caused by curves and the traffic environment interfere with the correlation between fatigue driving and steering wheel angle. Because curve radius, traffic volume, and other traffic environment factors are difficult to control in field experiments, the influence of curves and the traffic environment was outside the scope of the current work and might be more suitably analysed in driving simulation experiments. In the future, more drivers will be recruited for field experiments, more abundant naturalistic driving behaviour data will be collected, and more types of fatigue-driving indicators, including PERCLOS and SDLP, will be analysed with the proposed model. Besides, more efficient identification models will be adopted to further refine the research conclusions. These efforts will highlight the advantages of individualized fatigue-driving research and promote the development of individualized anti-fatigue-driving active safety systems in ITS.