Critical ride comfort detection for automated vehicles

In a future connected vehicle environment, an optimized route and motion planning should not only fulfill efficiency and safety constraints but also minimize vehicle motions and oscillations, causing poor ride comfort perceived by passengers. This work provides a framework for a large-scale and cost-efficient evaluation to address AV's ride comfort and allow the comparison of different comfort assessment strategies. The proposed tool also gives insights to comfort data, allowing for the development of novel algorithms, guidelines, or motion planning systems incorporating passenger comfort. A vehicle-road simulation framework utilizable to assess the most common ride comfort determination strategies based on vehicle dynamics data is presented. The developed methodology encompasses a road surface model, a non-linear vehicle model optimization, and Monte Carlo simulations to allow for an accurate and cost-efficient generation of virtual chassis acceleration data. Ride comfort is determined by applying a commonly used threshold method and an analysis based on ISO 2631. The two methods are compared against comfort classifications based on empirical measurements of the International Roughness Index (IRI). A case study with three road sites in Austria demonstrates the framework's practical application with real data and achieves high-resolution ride comfort classifications. The results highlight that ISO 2631 comfort estimates are most similar to IRI classifications and that the thresholding procedure detects preventable situations but also over- or underestimates ride comfort. Hence, the work shows the potential risk of negative ride comfort of AVs using simple threshold values and stresses the importance of a robust comfort evaluation method for enhancing AVs' path and motion planning with maximal ride comfort.


INTRODUCTION
The deployment of Automated vehicles (AV) involves several challenges, such as the change of the human driver's role. With AVs, the role of a human driver shifts towards the one of a vehicle passenger. As a result, the importance of user acceptance of such technologies arises and should not be neglected during the design procedure. Although there is a broad spectrum of ongoing research (e.g., improvements and development of Advanced Driver Assistance Systems (ADAS)), not much attention has been devoted to the area of ride comfort. In this paper, ride comfort is defined as the overall perceived comfort by the passengers caused by vibrations in the passenger compartment. While some oscillations can be minimized by vehicle design, usually part of a NVH (noise, vibration and harshness) optimization, some are actuated by a rough road surface. Recent studies have pointed out that passengers in an AV are more likely to experience deficient ride comfort than passengers in a maneuvering non-automated vehicle Rosenzweig and Bartl (2015); Smyth et al. (2018); Sawabe et al. (2017). Hence, the data pool an AV utilizes for efficient and user-friendly motion planning should be enhanced with (a) comfort data representing current critical sections in a road network and (b) a suitable and robust method that does not allow over-or underestimation of critical comfort. To the best of the authors' knowledge, this work is the first to study AVs' ride comfort with an extensive data pool. Besides, studies tackled the correlation of different methods (e.g., ISO 2631 and IRI) but did not consider the specific comfort results and their importance for passenger comfort in AVs.
In order to tackle the challenge of mitigating deficient ride comfort, this paper compares different methodologies for determining ride comfort. First, we propose a sub-microscopic simulation framework allowing for a cost-efficient and generic generation of accurate vehicle dynamics from AVs. A physical vehicle model with accurate vehicle dynamics implementation and an Adaptive Cruise Control (ACC) has been applied. To calibrate the non-linear vehicle model for an optimum correlation to real behaviour, we utilize real-world vehicle dynamics measurements and solve an optimization problem to minimize the modeling error. After calibration and determination of model parameters, Monte Carlo (MC) simulations with Latin Hypercube Sampling (LHS) are performed. A representative set of input samples for the vehicle's speed profile and the lateral position in the lane is obtained to model different driver behaviors. Also, we sample the friction coefficient of the road surface as a proxy for different weather conditions. Finally, a representative number of simulations allows the utilization of output data to evaluate the ride comfort based on (a) most recent definitions of threshold values, (b) the guidelines from the International Standardization Organization (ISO) ISO 2631, and (c) the International Roughness Index (IRI). For proof of concept, the methodology is applied to three test sections in Austria.
The remainder of this paper is organized as follows: Fundamental works on ride comfort determination from vehicle dynamics and the research gap towards AVs are presented in Section 2. Sections 3 and 4 explain the architecture of the simulation framework and the model optimization procedure, respectively. The strategies to determine ride comfort estimates is introduced in Section 5. Finally, the conducted case study and the obtained results are presented in Section 6. The paper closes with a conclusion and potential future research in Section 7.

BACKGROUND AND CONTRIBUTIONS
Besides all novel technical methodologies and improvements in the automotive domain (e.g., ADAS), less attention is paid to human comfort that is potentially influenced by the introduction of AVs on the market Kyriakidis et al. (2017). The number of publications within the last decade in the AV domain shows a substantial increase. However, at the same time, the study of the overall comfort of humans during a vehicle trip in an AV is a relatively new research field Rosenzweig and Bartl (2015); Du et al. (2018). When vibrations in a vehicle exceed certain thresholds, the occupants feel discomfort, and ride quality decreases. Nevertheless, when these vibrations are low, the human body feels relatively comfortable, and negative influences on well-being, health, or motion sickness are insignificant.
In a non-automated vehicle that is maneuvered by a human driver, the occupant can adjust the driving behavior upon the level of (dis)comfort. However, this is not possible in an AV when a-priori knowledge about the road infrastructure and traffic conditions are not available and/or sensing systems face difficulties (e.g., weather conditions, covered obstacles, or reflections). Hence, methodologies are needed to benchmark ride comfort objectively. Holzinger et al. (2014) have shown how to objectively assess vehicle driveability with the software AVL-DRIVE™, focusing on ADAS/AD safety and comfort quality. However, there is limited literature about vibration-specific ride comfort evaluation caused by road roughness. Currently, most publications in this field address the public transportation sector. Therefore, in the following, this work analyses (a) studies of ride comfort determination that have been applied in the public transportation domain and/or (b) are relevant or applied to (automated) vehicles.
Recent studies aimed at evaluating and improving passenger comfort utilize methods of different levels of sophistication. First, several publications derive a model (linear regressions are commonly applied) to predict the ride comfort by combining empirical kinematic measurements and subjective survey data. An extensive study from Förstberg (2000) exploits the human response to different kinematic motions in trains. This work is one of the first, which combines empirical measurements (objective data) with data collected from a survey (subjective data). Regression analysis is performed, which finds that the most critical variables are motion dose from the lateral and roll acceleration in the horizontal plane. Due to the high low-frequency content of vehicle dynamic signals, Förstberg (2000) proposes a model of provocation of motion sickness showing that motion sickness has a time decay or leakage. Along with the mentioned study, other works such as Zhang et al. (2014); Eboli et al. (2016); Nguyen et al. (2019) combine objective and subjective data to determine ride comfort in public transportation vehicles. Zhang et al. (2014) proposes a bus comfort model based on multiple linear regressions. Subjective parameters derived from a questionnaire and kinematic parameters (such as noise, thermal comfort, vibration, and acceleration) are considered. Also, Nguyen et al. (2019) fits a linear regression model to explain the correlation between subjective passenger ratings and lateral acceleration and duration of a turning movement. However, the study also incorporates passenger posture. The presented performance of the model can be deemed modest. Eboli et al. (2016) again merges subjective and objective measurement data and derives thresholds of comfort levels.
Secondly, threshold values are widely used for ride comfort determination, as the implementation is straightforward, and adjustments are inexpensive. In Eriksson and Svensson (2015) different thresholds for the acceleration and jerk in public transport are collected, ranging from 1.00m/s 2 up to 1.47m/s 2 and 0.50m/s 3 up to 0.90m/s 3 , respectively. Elbanhawi et al. (2015) defines such thresholds ranging from 1.08m/s 2 and 1.47m/s 2 for the acceleration and 2.90m/s 3 as an upper bound for the jerk. The deviation of the jerk threshold can be explained by the fact that the study in Eriksson and Svensson (2015) considers standing passengers in public transport, whereas Elbanhawi et al. (2015) takes only sitting passengers into account. Bae et al. (2019) provide the most recent extensive overview of threshold values with different categories of driving styles. The study is also one of the few compiling public transport and different driving styles for a personal vehicle. Upper and lower thresholds (i.e., acceleration and deceleration, respectively) for comfortable public transportation (PT), normal driving (ND), and aggressive driving (AD) are collected. The lateral acceleration bands for PT, ND and AD are defined by -0.90m/s 2 -0.90m/s 2 , -4.00m/s 2 -4.00m/s 2 , and -5.60m/s 2 -5.60m/s 2 . For the longitudinal direction, the corresponding acceleration bands are listed with -0.90m/s 2 -0.90m/s 2 , -2.00m/s 2 -1.47m/s 2 , and -5.80m/s 2 -3.07m/s 2 , for PT, ND and AD, respectively. In all the publications mentioned above, no threshold values for accelerations in the vertical direction are defined.
Although thresholding might create comfortable vehicle trips on average, it does not guarantee the mitigation of discomfort in specific situations. Hence, thirdly, a more advanced approach is the analysis of the Power Spectral Density (PSD) of a given signal. By transforming the acceleration signals to the frequency domain with standard transformation methods (e.g., Continuous Wavelet Transform (CWT) Lee and Son (2001)) relevant frequency components can be analyzed. The ISO provides the guideline documents ISO 2631 that utilize the PSD approach to evaluate human exposure to whole-body vibrations. The guidelines provide a methodology to utilize acceleration signals in order to explore the effect of vibrations on (a) human health and comfort, (b) the probability of vibration perception, and (c) the emergence of motion sickness (International Standard Organization, 1997). ISO 2631 is widely applied in many different fields to evaluate human comfort/discomfort; e.g., in structural engineering Kralik and Kralik (2017), industrial engineering Zhang et al. (2018), or automotive engineering Xiao et al. (2020). Specifically targeting ride comfort, Castellanos and Fruett (2014) developed an embedded system to localize discomfort in public transportation networks and consequently derive and apply such thresholds. A comfort index measurement (the ISO 2631) and a jerk threshold detection are utilized to identify the comfort results; finally, the results are based on a threshold detection algorithm. Also, Förstberg (2003) investigates the International Union of Railways (UIC) standard to extend the evaluation to transition and circular curves. The UIC comfort tests are based on the ISO 2631 and specify the procedure for trains. The study aims again to find a correlation between passenger votes and UIC comfort parameters.
Furthermore, another infrastructure-based approach to estimate ride comfort is the utilization of the International Roughness Index (IRI). The IRI utilizes a longitudinal road profile of a test section. The road roughness is measured by taking the fraction of a quarter-car model's output and the section length Sayers (1995). Consequently, an aggregated index for the sections' road roughness is given, which can also be utilized to determine ride comfort. Nitsche et al. (2014) evaluate pavement roughness by utilizing vehicle response measurements and machine learning models. Results show promising results at constant vehicle speed. One of the most recent insights into the interplay of ride comfort and IRI is given by Múčka (2020). This work proposes IRI thresholds as a function of vehicle speed and road category. The methodology is based on finding a relationship between ISO 2631 and IRI. Results are compiled for nine different vehicles out of different categories (from a small family car, SUV, up to a cargo van).
As mentioned, vehicle ride comfort is only partially considered in the literature, and most of the studies focus on public transportation. Even less research has been conducted when looking at AVs. Elbanhawi et al. (2015) provide a review article that reviews the state-of-the-art comfort assessment methods and also highlights their importance in path-planning algorithms for AVs. Further, the work emphasizes a potentially growing research gap of AV's comfort. To derive optimal path-planning, e.g., Sun et al. (2018) proposes a Pareto optimization problem that guarantees an optimal strategy by considering comfort (methodology based on the ISO 2631) and driving requirements. Additionally, AVs can offer different seating positions, which potentially influence the user's ride comfort. Therefore, Salter et al. (2019) performed field experiments where test persons traveled in forward and rearward facing seats while performing office tasks. The results show a significant increase in the mean level of motion sickness with rearward sitting (especially in urban driving conditions). Hence, a conclusion can be drawn that there is a need for tools supporting large-scale ride comfort testing in AVs.
The current work contains three novelties that extend the currently applied methodologies for ride comfort determination of AVs. First, a generic simulation framework is provided based on the work from Nitsche et al. (2018) and Genser et al. (2019), allowing for the generation of vehicle dynamics data with arbitrary resolutions based on a user-specified simulation scenario. Monte Carlo (MC) simulations with LHS sampling are applied. This methodology allows for a representative determination of vehicle dynamics by varying the speed profiles and the lateral vehicle position to model varying driving behaviors. To also include weather conditions in our simulations, we utilize the road surface's friction coefficient as a proxy. Hence, the framework allows for an in-detail evaluation of available guidelines documents. Secondly, this is the first study that utilizes a high precision road surface model to determine (a) highly accurate vehicle dynamics from a simulation environment and (b) accurate ride comfort estimates. The data set for the surface model is measured with laser scanners, providing point data with a vertical accuracy of ±0.1 mm. Moreover, a high precision positioning is ensured with post processing of the recorded trajectories using post processing of base station data . The surface model is incorporated into our framework using the standard OpenCRG specification. Consequently, we utilize the simulation set-up for a model optimization with measurement data from test drives on predefined road segments. The well-known Levenberg-Marquardt algorithm is utilized to optimize the non-linear system and minimize the simulator's model error Levenberg (1944). This procedure can be easily extended to different test vehicles, new test sites, and/or different driving patterns.
Finally, we utilize high-resolution vehicle dynamics and extend the data to apply the most recent threshold values and the ISO 2631 procedure. Furthermore, the two methodologies are compared to comfort estimates based on the International Roughness Index (IRI) to underline the necessity of a robust and sophisticated determination procedure and draw the attention towards the importance of ride comfort data for motion planning of AVs to mitigate critical vehicle maneuvers. Comparison results are presented in the last section of the paper and derived insights are discussed.

ARCHITECTURE OF SIMULATION FRAMEWORK
This section introduces the proposed simulation framework to derive accurate ride comfort data. First, the simulation framework's architecture and the corresponding components are introduced, followed by an in-detail description of the road surface modeling and the MC/LHS techniques. Note that the methodologies for the derivation of ride comfort data are part of the simulation framework, but a detailed overview is provided in Section 5.

Simulation environment
The presented framework provides a flexible tool that can be utilized to determine ride comfort data. Figure 1 depicts the modular simulation framework and all its components. The inputs for the framework are configured in an XML-file. This provides maximal flexibility when specifying test scenarios and the corresponding probabilistic input parameters (block (1) in Fig. 1). The core is implemented with MATLAB/Simulink and provides the automation functionality, i.e. the MC/LHS techniques (see Section 3.3 for details), and the communication interface to the sub-microscopic simulation environment (block (2) in Fig. 1). CarMaker from IPG Automotive is used as a simulation tool (IPG Automotive GmbH, 2018), which offers the possibility to simulate AVs that computes a driving velocity based on an speed profile or utilizes an ACC. Also, a motion planning algorithm is incorporated that provides motion profiles based on the configured driving style. Here the interface between MATLAB/Simulink and CarMaker is utilized for the set-up, configuration of simulation inputs, as well as for exporting the output data (block (3) in Fig. 1).
In block (4) (Fig. 1), a database stores the output data, which are used for the ride comfort calculation. Here the methodologies for ride comfort estimation (i.e., thresholding, ISO 2631, and IRI) are applied. Finally, the ride comfort data are stored again in a database (block (5) in Fig. 1). Due to the generic design of the framework, input or output blocks can be readily replaced by different representation technologies. Furthermore, the simulation environment CarMaker can be replaced by any other simulator, compatible with MATLAB/Simulink and OpenCRG, as explained in the next section. In addition, the framework allows for an extension to Software in the loop (SIL) or Hardware in the loop (HIL) tests. Note, that such changes would not deteriorate the performance of the developed framework.

Road surface modeling
Given the objective to determine high-resolution ride comfort estimates, the simulation output should represent the real-world interactions between road infrastructure and vehicles as accurately as possible. Because the default CarMaker settings for replicating a road surface offer limited complexity, the modeling standard OpenCRG is used in the current implementation. OpenCRG uses a 2D Curved Regular Grid (CRG) for representing elevation data based on a proximal reference line. To derive a road surface model (i.e., road profile and reference line) the following empirical steps are necessary: Using a mobile mapping vehicle, data from a profile laser scanner are collected (900 points per profile over a transversal width of 4m, 800 profiles/sec.) together with a precise trajectory of the vehicle (sampling frequency 200 Hz). After post-processing the trajectory using a static base station, a point cloud is derived from the laser scanner's raw data and trajectory. From the trajectory, a Catmull-Rom spline is interpolated to be used as reference line. Consequently, the reference line is specified with a start/end location as well as heading angles. Along the reference line, equidistant transversal profiles are calculated with a sampling distance of 0.1×0.1 m Schmeitz et al. (2011). ASCII format is used for representing a CRG-file containing post-processed point data from a laser scanner. CarMaker allows importing such road surface files and interprets them according to the standard. For robust surface integration, an error detection procedure (i.e., measurement failures, point data outliers, etc.) must be implemented. The simulation software provides different interpolation methods. For this work, a cubic spline interpolation in the directions x, y, and z is utilized. For deriving an accurate road representation in the simulation, a spline smoothing is performed to define the best parameters λ x , λ y , and λ z that minimize the total error of the model.

Latin Hypercube Sampling technique
By default, a simulation run is performed with an AV following an ideal trajectory, i.e., in the middle of a road lane segment and a predefined speed profile. Essentially, this does not lead to representative vehicle dynamics that would allow for a reasonable ride comfort determination, as driving styles and speed choices usually diverge. Consequently, stochastic input parameters are introduced. All simulation runs are fed with random input samples to introduce a deviation from the idealized set-up. Consequently, a representative data set for every road section under investigation is collected. To reduce the computational complexity, the LHS technique is employed. The method follows the procedure of dividing the Probabilistic Density Function (PDF) of an input into n non-overlapping intervals, where each interval represents an equal probability value. Samples are generated in random order from every n with respect to the corresponding interval density value. Dependent on the number of input variables used for the sampling, denoted as m, an (n×m) matrix can be created Swiler and Wyss (2004). Note that the use of LHS requires a representative number of samples to obtain statistically meaningful simulation results. Different works in the literature (see e.g., Matala (2008)) propose guidelines on guaranteeing a specific confidence interval. In our work, we denote the deviation from a given speed profile sample v dev as a normal distribution with a mean value µ dev and standard deviation σ dev ; i.e., N v dev ∼ N (µ dev , σ dev ). For the lateral lane position, we follow the same approach and introduce the lateral deviation from the ideal vehicle trajectory as l p ; hence, we denote the PDF as N lp ∼ N (µ p , σ p ). To additionally model weather conditions, we utilize the friction coefficient of the road surface µ rs as a proxy. The input values are sampled from a uniform distribution denoted by U rs ∼ U(a rs , b rs ), where a rs and b rs are the distribution parameters.

VEHICLE MODEL VALIDATION AND PARAMETERS OPTIMIZATION
CarMaker software provides state-of-the-art modeling for the sub-microscopic simulation environment of virtual test driving. Although its models have been tested extensively, validated output data is not guaranteed. Because ride comfort estimation depends heavily on the simulated vehicle dynamics, a model validation procedure has been applied to determine the respective error. Therefore, a ground-truth data pool of vehicle dynamics should be available for comparison, such as, e.g., measurements by an Inertial Measurement Unit (IMU). For the validation, the velocity in the x-direction v x , accelerations in the three-dimensional space a x , a y , and a z and the Euler angle rates, i.e., roll Φ-, pitch Θ-and yaw Ψ-rate, are utilized. All parameters with their corresponding units are listed in Table 1.
Essentially, the real-world data can be used to evaluate the performance of the simulation. To assess the error of all quantities in Table 1, performance metrics such as Root Mean Square Error (RMSE) are commonly used. Nevertheless, the quantities in Table 1 have different physical where the function f (x, p) defines the non-linear simulation model with vectors x and p denoting the input and model parameter sets, respectively; y(x) denotes the measurement data (e.g., from an IMU device) and r(x, p) the corresponding residuals with respect to x and p. We can now utilize r(x, p) (by taking (1)) as our objective function to formulate a non-linear simulation-based optimization problem, aiming to minimize the residuals. We start describing the optimization procedure to minimize r(x, p) by explaining the algorithm steps depicted in Fig. 2. Initially, a simulation is performed with an initial guess of vector p, i.e., p i=0 . Based on the result r(x, p), the non-linear optimization Levenberg-Marquardt algorithm from Levenberg (1944) is initialized. For every iteration, the residuals are checked against a certain objective tolerance. When the result does not satisfy the postulated accuracy, p i is updated with p i+1 , which is the new gradient-based guess. When the objective function's magnitude falls below the desired tolerance level, a solution p i,end is returned. Note that the input S represents the simulation scenario's modeling, e.g., test vehicle, road surface, etc., which does not change over time/runs. The error is minimized by an optimization procedure described in Fig. 3. We select five parameters based on a sensitivity analysis from Traub et al. (2016). The parameter values for this   Fig. 3 are the initial input, output of the optimization procedure, and preliminary results between two optimization blocks OP i and OP i−1 , with i > 0, respectively. By defining reasonable lower and upper bounds for the parameters set, the optimization problem can be formulated as follows: minimize p∈R r(x, p) where K sf and K rf denote the front and rear spring stiffness, respectively, µ Tr the value of tire friction, K Tr the tire radial stiffness, and d Tr the tire radial damping. The constraints define the corresponding minimum and maximum values for all parameters. The optimization problem in (2) aims at computing the best parameter values, so as to minimize the error between the simulator and the real data. The obtained set of parameters is then applied to the simulation framework presented in Section 3 and allows the derivation of high accuracy comfort data. The methodologies for comfort estimation are discussed in the next section.

RIDE COMFORT DETERMINATION STRATEGY
To stress the importance of a robust ride comfort classification, we have based the determination of critical ride comfort on three common strategies in this work. First, established threshold values for vehicle accelerations are utilized to classify comfort. The compilation of different acceleration bands and the corresponding technical application is described in Section 5.1. Second, this work considers the commonly used ISO 2631 guideline, allowing for a comfort evaluation by deriving a frequency-weighted acceleration signal. Section 5.2 explains the procedure in detail. Finally, we compare the two evaluation strategies with empirical IRI data, allowing for comfort classification. The concept of the IRI is introduced in Section 5.3.

Comfort classification with thresholding
The application of threshold values in signal processing is a simple standard methodology to mitigate system exposure to extreme signal magnitudes. In the context of AVs, thresholding is applied (among others) to acceleration signals to ensure a realistic system behavior and a comfortable user experience. In this work, we utilize threshold values to determine comfort based on an extensive collection by Bae et al. (2019). This work collects comfort thresholds for public transportation (PT) and two different driving styles, i.e., normal driving (ND) and aggressive driving (AD) for the longitudinal and lateral acceleration, respectively. As the driver does not actively influence the AVs' motion planning, public transportation thresholds are of great interest and logically included in the analysis. In Bae et al. (2019) no threshold values for the acceleration and deceleration in the z-axis; a z and −a z (i.e., the vertical plane) are mentioned. To the authors' knowledge, no commonly used threshold for such quantities is available. Hence, the authors investigated a data set from Makridis et al. (2020), where the impacts of ACC on traffic flow and string stability are studied. Results show that a lower and upper border for a z and −a z can be derived at 0.1m/s 2 and −0.1m/s 2 , respectively. As Makridis et al. (2020) only investigate vehicle maneuvers that are classified as normal driving, this work investigates the vertical plane with an ND threshold. For the PT thresholds in the z-axis, magnitude is assumed to be equal to ND. For AD, we triple the magnitude and set the thresholds to -0.3m/s 2 for the deceleration and 0.3m/s 2 for the acceleration. Table 2 lists the utilized values for longitudinal acceleration a x , deceleration −a x , lateral acceleration a y and deceleration −a y , and vertical acceleration a z and deceleration −a z , respectively.
The thresholds are applied in the following to acceleration signal data to determine critical ride comfort roadway sections (Section 5.4).

ISO 2631 comfort classification
To provide an alternative classification for ride comfort concerning certain frequency ranges of acceleration signals, investigations based on ISO 2631 are performed. Generally speaking, low-frequency contents of a signal, i.e., for health, comfort, and perception between 0.5Hz and 80Hz, and for motion sickness ranging from 0.1Hz to 0.5Hz are considered. A frequency weighting technique based on ISO-guideline is used to examine the different applications and frequency ranges. The frequency weights are chosen upon the passengers' position (i.e., standing, sitting, or recumbent) and the application (i.e., human health and comfort, probability of vibration perception or occurrence of motion sickness). The weighting factor is chosen upon the application domain and indicated with the subscript b from ISO 2631, where b ∈ {c, d, e, f, j, k}. Depending on the situation one need to investigate (e.g., comfort for a seated person by utilizing acceleration a x ) the subscript influences the properties of the signal processing presented in the following.
Let a w b be the frequency-weighted acceleration signal by the type b. In order to obtain a w b , a filtering procedure with the corresponding filter design has to be performed. A high pass H h (p), low pass H l (p), acceleration-velocity transition H t (p), and an upward step filter H s (p) are introduced in International Standard Organization (1997). The filter equations are defined in the where ω i = 2πf i ; Q i and f i are the inputs for the transfer functions and defined as the resonant quality factors and corner frequencies, respectively; the index i ∈ {1, 2, 3, 4, 5, 6}; for further details the reader is referred to the guidelines in International Standard Organization (1997).
Equations (3) where H(z) denotes the total weighting function. By transforming H(p) in the digital domain z, which is obtained by replacing p = (2(1 − z −1 ))/(1 + z −1 ) in equations (3)-(6), a digital filter can be designed (Dyer and Dyer, 2000). A full list of the rearranged terms required to fulfill the requested form of the filter design can be found in Rimell et al. (2014). Finally, the filtering of the acceleration signal can be applied according to the following equation: Note that there are exceptions where certain filter operators of equation (8) where a w b ,RMS is the RMS value of a w b and T the duration of measurements for a w b . For the evaluation of comfort, ISO 2631 recommends the separate evaluation of directions x, y, and z, and the combination of the frequency weighted signals according to: Here, a v represents the total vibration value, a w b,x ,RMS , a w b,y ,RMS , and a w b,z ,RMS the frequency weighted RMS acceleration signals, and k x = k y = k z = 1 the corresponding ISO 2631 weighting factors per direction, respectively. The values of k x , k y , and k z diverge from the proposed values only in special designs, i.e., a seated person, affected by vibrations (a) of the backrest, (b) at the feet, or (c) around a rotation axis. The post-processed acceleration signal a v allows for the comfort level determination, according to the recommended threshold levels from International Standard Organization (1997) listed in Table 3. Consequently, the thresholding allows a classification into levels ranging from not uncomfortable (NU) up to extremely uncomfortable (EU). In addition, the guidelines recommend a threshold range for the probability of vibration perception from 0.01m/s 2 to 0.02m/s 2 . Investigations on human health and motion sickness have been neglected in this study, as vibrations have to be present for a certain amount of time and with a certain magnitude (e.g., negative impact on human health at ∼1m/s 2 for four hours International Standard Organization (1997)), which is not common in vehicle dynamics. Finally, the processed signal a v is utilized to determine critical ride comfort roadway sections with the proposed methodology in Section 5.4.

IRI comfort classification
Development of the International Roughness Index (IRI) started in the late 1960s and went through several stages of development after it was finally introduced by the World Bank as it is known today. IRI constitutes a dimensionless roughness index of a longitudinal road profile and is widely used in pavement assessments but is also deemed suitable to determine ride comfort.
To derive a robust and reproducible IRI, the methodology went through several stages of development. In the 1960s, researchers already choose a quarter-car-model to simulate the interaction between a vehicle and the road surface. As this model involves several parameters, a calibration by correlation procedure was carried out by processing data from the International Road Roughness Experiment (IRRE) (Gillespie et al., 1980). Consequently, the well-known Golden Car parameters were derived for a half-and quarter-car-model. As both models showed similar correlation level and the quarter-car-model allows usage with all profiling methods, today's applications are still based on the quarter-car-model. Also, researches widely agree that IRI should be measured at a velocity of 80 km/h Múčka (2020).
To define the quarter-car-model (definition based on Sayers (1995)) we assume a system consisting of a sprung mass (i.e., vehicle body mass supported by one wheel) denoted as m s and an unsprung mass (i.e., wheel or tire) denoted as m u ; m s and m u are connected through a suspension and a damper. For the quarter-car-model, the suspension spring rate k s and suspension damping rate c s are considered. Further, k t denotes the tire spring rate. As mentioned above, the Golden Car parameters derived during the model calibration process are utilized: c = c s /m s = 6.00; k 1 = k t /m s = 653.00; k 2 = k s /m s = 63.30; µ = m u /m s = 0.15. Note that for simplicity all parameters are normalized by sprung mass m s . Finally, the quarter-car-model defined by first-order ordinary differential equations (ODE) can be written in compact form as:ẋ = Ax + Bh ps , where vectorẋ and matrices A and B are denoted as and h ps denotes the smoothed elevation profile; z s and z u the vertical height of the sprung and unsprung mass, respectively; x the state vector holding z s and z u with its corresponding time derivativeṡ z s andż u , respectively. As IRI can be computed by accumulating the absolute difference of derivativesż s andż u , a standard algorithm for solving ODEs needs to be applied as suggested in Sayers (1995). Finally, the IRI is defined as follows: where L is the profile length and V is the measurement speed. The calculated index can be used to quantify road roughness. Therefore, thresholds allowing for classification of ride quality ratings into very good, good, fair, mediocre, and poor are defined by Federal Highway Administration (2000). Nevertheless, for classification, the assumption of a constant vehicle speed of V = 80 km/h must be full-filled. To overcome this limitation, Yu et al. (2006) defines speed-dependent IRI-thresholds (see Table 4).
As expected, the threshold magnitudes for all categories of ride quality increase with decreasing speed. To derive comfort estimates, thresholds from Table 4 are applied dynamically to the IRI measurement by considering a given speed profile. Again, we utilize the final signal to derive critical comfort road sections (Section 5.4).

Derivation of critical ride comfort roadway sections
To provide a quantitative analysis of critical comfort and obtain such road sections, we define a sliding window procedure. Let TS i be a test track with an index i. Assuming that on TS i multiple test drives are performed, consequently sets of accelerations A, B, and C are derived; i.e., a x (t) ∈ A, a y (t) ∈ B, and a z (t) ∈ C for all test drives. As all elements of A, B, and C contain different speed profiles, the position of the vehicle in space (denoted as s) at a specific time step t is not equivalent in all signals. Hence, a classification based on the time-dependent signals would not yield to the desired classification result. To overcome this issue, we transform all time-dependent acceleration signals to the space-domain. We merge samples of elements from A, B, and C at a specific point in space s. Hence, we derive a x (s), a y (s), and a z (s), which are utilized to apply a sliding window of w, where w represents the length of critical comfort road sections in meters. We perform a convolution of the acceleration signal and the corresponding thresholds from Tables 2, 3, and 4. The window is slid over the acceleration signals, and if for a complete window the applied threshold is exceeded, a critical comfort road section is found. All the simulation outputs from Section 3 are post-processed with the proposed methodology. Moreover, results from a case study in Austria are presented in Section 6.

CASE STUDY
The following case study discusses the application of the proposed framework. Demonstration scenarios are configured that consist of three test sites with available data measurements. First, we present the three test sites where measurement data were recorded. Afterwards, we set up the simulation framework and perform a first model validation, followed by the proposed model parameters optimization. Results of the improvements are presented in Section 6.2. To utilize our simulation framework with the MC/LHS option for deriving the vehicle dynamics data, we present the configuration in Section 6.3. Finally, we compare the ride comfort determination strategies, i.e., the thresholding procedure, ISO 2631 approach, and IRI classification. Classification of critical ride comfort roadway sections is presented for all studied methodologies, and a conclusion about the respective performance is drawn (see Section 6.4).

Test sections description and measurements
Data have been collected from three test sites in Austria (see Fig. 4) that differ in length, curvature, elevation profile, and average driving speed. T s1 is a federal road (classifier B in Austria) with a length of 3.10 kilometers and a corresponding speed limit of 100 km/h. Average slope of the test section is 0.90%. Second, T s2 represents a minor road (classifier L in Austria) with a length of 1.90 kilometers and a corresponding speed limit of 100 km/h. The average slop is again 0.90%. The third section is a minor road of 2.00 km length, denoted as T s3. Its speed limit is again 100km/h, and average slope is 7.1%. Figure 4 shows that the curvature of the test sections increase from T s1 to T s3; i.e., 129.50gon/km, 323.50gon/km and 354.70gon/km,  respectively. Hence, the average driving speed per test section will decrease, although the speed limits are identical. The road characteristics of the studied sections provide a variety of data that exploits the robustness of both the optimization procedure and comfort methodologies. The measurements were performed with an IMU capable of tracking high-resolution vehicle dynamics, mounted on the top of the test vehicle, and connected to a central unit. Data of the parameters listed in Table 1 have been collected and serve as an input for the model parameters optimization.

Vehicle model validation and parameters optimization
In order to validate the various model implementations of the simulator, a comparison between the measurement data and simulation outputs is performed. Table 5 provides the NRMSE for each test site. By averaging the NRMSE values per site, T s1 demonstrates the highest average error with 0.1455, followed by 0.1305 and 0.1198 for T s2 and T s3, respectively. These values justify the decision to optimize the models on T s1. Furthermore, Fig. 5 depicts a comparison of such corresponding signals from T s3. As the NRMSE allows comparability of the different errors and hinders the derivation of the error magnitude for one quantity, we state the RMSE in Fig. 5. The comparison plots show that the simulator models real-world behavior with sufficient accuracy and that there is room for improvement by performing an optimization procedure for model parameters.
Given the problem defined in equation (1), measurement data are defined as the reference y(x). The output of the simulation as a function of x and p is defined as f (x, p). The residuals shown in Table 5 represent the corresponding r(x, p). We utilize the defined optimization problem in (2) and constrain the problem with reasonable lower and upper bound values presented in Table 6.    In the following, the optimization procedure is applied to test site T s1. A simulation with standard model settings is defined as a benchmark (reference) case referred to as Ref. The optimized results are indicated with the abbreviation Opt. The results for T s1 are depicted in Table 7.
All three directions of accelerations a x , a y , and a z achieve an improvement of 1.22%, 1.02%, and 9.72%, respectively. Note that acceleration a z , is of particular interest here, as it induces forces that mainly originate at the road surface and then transferred to the human body. To further justify the optimization concept, the tuned models obtained from T s1 are then tested on sites T s2 and T s3. Tables 8 and 9 present the corresponding results. Once again, the three error values of the acceleration parameters have decreased. The improvement of 5.79% and 3.40% in the z-direction demonstrates the effectiveness of the optimization procedure. The negative gradient of improvement from T s1 to T s3 can be justified by the different sites' characteristics. Note that by optimizing each test site separately, these values could be further improved.

Stochastic simulation set-up
The speed and driving trajectory of human drivers frequently deviate from ideal trajectories; also, weather conditions change regularly in reality; all these factors of influence are modeled in our simulations. The lateral lane position of a vehicle l p , speed profile deviation v dev , and friction coefficient of road surface µ rs vary for each simulation run. The corresponding PDFs with the associated parameters are provided in Table 10.
It should be pointed out that the normal distribution for the speed deviation of T s3 is shifted, with parameters N (−5, 0.2), resulting mostly in negative speed deviation samples. This modeling approach has been chosen due to the high curvature of the specific test site. Note that positive speed deviations would have led to a high number of vehicles running off the road during simulations. To this end, 4.000 input samples have been created for every test site, and the corresponding simulations have been performed. This results in vehicle dynamics output data from 12.000 simulations in total.

Ride comfort results
As proposed in Section 5, the three ride comfort determination strategies are applied to the simulation output (i.e., the simulated vehicle dynamics data) and the IRI measurement data. Afterward, we compare the thresholding and the ISO 2631 strategy against the IRI comfort classifications.
First, we apply the thresholding procedure from Section 5.1. Before applying the methodology, the data needs to be pre-processed; i.e., signals must be transformed from time to space domain. Afterward, the thresholds are applied to the acceleration outputs a x , a y , and a z of all three test sections. Figures 6 and 7 depict the acceleration signals, the corresponding thresholds (only those thresholds which are applicable are utilized) and the resulting classification signal C Ts i ,{x,y,z} , where {x, y, z} stands for the set of investigated acceleration a x , a y , and a z .
A qualitative inspection of the classification outputs C Ts i ,{x} shows that the aggregated accelerations in the x-direction do not present significant deviations. Only two signal peaks at the beginning of TS 1 exceeding the threshold of normal driving and two signal peaks at the beginning and around 750 meters on TS 3 exceeding the threshold for public transportation comfort are identified. The a x signal of TS 2 does not exceed any threshold, and hence no criticalities are found. As the simulations are performed without any traffic interactions, and consequently the vehicle can move in free-flow conditions, the number of signal peaks of C Ts i ,{x} are expected. Note that fluctuations in the acceleration signals do occur as the road surface does influence the ACC, trying to maintain a velocity as close as possible to the desired one. The quantitative classification results draw the same picture and are collected in Table 11. Every test site is processed sequentially for critical sections, with a search window defined by an average car length of l cr = 5m. This value has been assumed to be reasonable in order to produce negative ride experiences; nevertheless, this parameter can be changed in the generic framework. The variable C represents the detected critical road sections; R c denotes the critical ratio of the test section in %; N and R n are the equivalent variables for the non-critical detections. Results in the x-direction are shown for T s1, T s2 and T s3 and for a x , a y and a z , respectively. Along with the qualitative inspection, T s1 shows 24 critical sections, which correspond to 3.97% of the test section's length that exceed the threshold for public transportation. T s3 shows 7 critical sections corresponding to 1.64% of its length.   An inspection of acceleration signals a y for all test sections shows that at all test sections signals exceed the public transportation threshold significantly. The number of detected critical Table 10: Representation of the probabilistic input parameters. The variables l and c denote the length and curvature of the test site, respectively.

Test Site
Variable Distribution Parameters l p Gaussian µ = 0 σ = 0.2 c = high µ rs Uniform a = 0.6 b = 1.0 Table 11: Detected critical ride comfort road sections based on the thresholding method for all test sites with a critical length l cr = 5m. Note that for a z → ±a x,ND = ±a x,PT . Note that PT = Public Transportation, ND = Normal driving, AG = Aggressive driving. C and N represent the critical and non-critical road sections; R c and R n the corresponding relative quantities.  Especially T s1 shows 194 critical sections for ±a z,ND and 38 critical sections considering the AG threshold. The relative numbers correspond to 32.07% and 6.28%, respectively. T s2 and T s3 show 51 (15.99%) and 19 (4.45%) critical sections for normal driving, respectively. For aggressive driving, T s2 and T s3 show 6 and 4 sections, respectively, corresponding to 1.88% and 0.94%. As the thresholding method does not combine the acceleration signals from all three directions, it is not possible to compare the comfort of test sections with one final performance metric. Nevertheless, the results show that T s3 contains significantly less critical sections for all three acceleration signals a x , a y , and a z . T s1 shows the most criticalities for a x and a z , whereas T s2 presents the most critical sections for a y .
Secondly, we apply the ISO 2631 method to the derived simulation results. First, the implementation of the ISO 2631 for comfort evaluation needs to be validated. Hence, a frequency weighting add-on from software DASYLab has been utilized. A sample acceleration signal with time length of t = 16sec was utilized. Let a we,org be the original acceleration signal, a we,matlab the frequency weighted signal of MATLAB implementation, and a we,dasy the output signal from DASYLab. The signals and frequency weighting results are depicted in Fig. 8. The output signals show that the implementation is valid and can be utilized for automated frequency weighting, which is a manual procedure in other software (e.g., DASYLab). Nevertheless, it should be pointed out that a small time drift can be observed in Fig. 8, which is due to the different integration methods used. Because of computational limitations, MATLAB numerical integration was implemented as a convolution integral, whereas DASYLab uses a different approach.
In the following, the automated post-processing has been again applied to the 12.000 simulation outputs by filtering, integrating, and combining the signals according to the procedure described in Section 5.2. Moreover, the time-dependent signals are transferred to the space domain in order to identify the critical sections. The final frequency-weighted acceleration signals are shown in Fig. 9 together with the recommended thresholds from ISO 2631. It can be seen that several signal peaks exceed the corresponding thresholds, thus indicating a negative ride comfort. Furthermore, the signal magnitude is an indicator of test site's smoothness; e.g., results of T s2 in comparison to T s3 show that the average signal magnitude below the threshold level LU is lower on T s3. Note that the threshold for the probability of vibration perception (0.01m/s 2 to 0.02m/s 2 ) is not shown in the results, as the magnitude is higher throughout the investigations, meaning that the perception of vibrations will always be present for the occupant.
To quantify the signal peaks, critical sections are determined per test site and shown in Table 12. Every site is processed sequentially for critical sections, again with a search window defined by l cr = 5m. It can be shown that T s1 has the highest number of little uncomfortable (LU) sections (54 sections/8.92% of the test site), followed by T s2 and T s3 with 20 and 8 sections, respectively. T s2 has the highest number of fairly uncomfortable (FU) and uncomfortable (U) sections, with 21 (6.58% of the test site) and 8 (2.51% of the test site), respectively. Consequently, the highest ratio of non-comfortable sections is detected on T s2, followed by T s1 and T s3. Note that no sections were classified as very (VU) or extremely uncomfortable (EU).
Finally, we utilize the empirical measurements, which allow for the derivation of IRI signals. Data were measured with a test vehicle from the Austrian Institute of Technology (AIT). The vehicle is equipped with a system (as introduced theoretically in Section 5.3) that allows a high-precision derivation of IRI. Note that IRI measurements were taken right before the vehicle dynamics measurements (utilized for model parameters optimization in this work). Hence, it can be guaranteed that the same road conditions are present on T s1, T s2, and T s3. The measurement equipment of the test vehicle from AIT allows for an IRI resolution of 50 meters. Therefore, we interpolate the data between two measurement samples. To derive ride comfort results, the speed-dependent thresholds from the work of Yu et al. (2006) are considered. To apply the semi-dynamic thresholding, and as the correct threshold for an IRI sample is a function of speed, all simulation runs' average speed profiles are utilized. Figure 10 depicts the results for T s1, T s2, and T s3.
As depicted in Fig. 10(a) T s1 overall shows very good and good ride quality. Only one signal peak at the beginning of the test section shows poor quality, and four signal peaks are classified with the category fair. T s3 shows a similar result with overall very good quality. Only a few signal peaks show good quality, and one signal peak identifies mediocre ride quality. Contrary to T s1 and T s3, T s2 shows several signal peaks identifying fair, mediocre, and even poor ride quality. Also, Fig. 10(b) shows long signal peaks in space, meaning that the number of critical sections is significantly higher than on the other test sections. To quantify the visual inspection, Table 13 presents the classification results for critical sections again with a length of l cr = 5m. Note that the category very good (VG) is not shown as all signals not classified by category good (G) to poor (P) are obviously a member of VG.
The category G identifies 30, 68 and 4 sections on T s1, T s2, and T s3, respectively; corresponding to 4.96%, 21.32%, and 0.94%. For categories F, M and P, T s1 shows low critical classification ratios, namely, 0.66%, 0.50% and 0.00%. Also, T s2 only classifies 3.04% as F and 0.00% as M and P. As already identified by ISO 2631 method, T s2 shows the highest number of critical comfort road sections: 11.29% are identified as F and 5.64% as M; no sections are classified as P.
A final comparison of the proposed methods shows that the thresholding method draws a different picture than the ISO 2631 and the IRI methods. The benchmark, i.e., IRI method, shows for all categories the highest number of criticalities on T s2 (68, 36, 18 and 0 critical sections for categories G, F, M, and P). Contrary, the thresholding shows for the acceleration a x the most criticalities on T s1 with 24 and 21 detected sections for the PT and ND threshold, respectively.   Thresholding a y shows for PT and ND T s2 as the most critical, although the result for AG is slightly higher (6 critical sections, 1.41%) on T s3. For a z , the most critical sections are detected on T s1 with a number of 194 or 32.07% of the whole test section. The ISO 2631 method results show the most detected sections exceeding the threshold for LU for T s1. Apart from LU, the number of most detected critical sections for FU and VU occur on T s2 with 21 and 8 or 6.58% and 2.51%. The algorithm detects no EU sections. Consequently, the detections of the ISO 2631 and IRI method are substantially similar, although the magnitude of detections is higher with the IRI method.
Although the thresholding procedure gives different results and draws a different picture for the accelerations a x , a y , and a z , the method is useful for the detection and/or mitigation of uncomfortable longitudinal or latitudinal vehicle motions. Uncomfortable longitudinal movements are detectable by a x exceeding a chosen threshold for uncomfortable acceleration/braking. By a y exceeding a chosen threshold, uncomfortable turns can be detected. Besides, a z gives insights into the roughness of the road surface.

CONCLUSION
This paper presents a novel method for deriving ride comfort data to improve AVs' route and motion planning systems. The proposed methodology provides a general framework for comfort evaluation and compares different methodologies of comfort derivation. The high-resolution outputs can be utilized for additional information to data fusion systems, stimulating AVs' efficient and passenger-oriented development. In addition, the modular structure of the framework allows for its integration to software or hardware in the loop tests that provide vehicle dynamics data as an output. The presented case study demonstrates that the selected test sites indeed show acceleration signal peaks that are critical concerning comfort. All three presented methodologies, i.e., thresholding, ISO 2631, and IRI method, identify critical comfort road sections. Although ISO 2631 and IRI results are substantially similar, the thresholding method can be utilized meaningfully to detect critical vehicle motions. The work suggests to utilize methodologies such as ISO 2631 as a minimum standard for AVs to quantify and mitigate negative ride comfort; i.e., the standalone thresholding of acceleration data can be misleading and over-or underestimate situations that potentially lead to negative ride comfort. The proposed framework can be implemented in future data fusion systems or added to a digitized map utilized by AVs, allowing for more efficient planning of routes and motions and avoidance of negative ride comfort experiences. However, a limitation of the proposed work is the necessity of accurate road data availability required to model the road surface in the simulation environment.
Future research should investigate the application of the derived comfort results to an AV's driver model to assess the different comfort evaluation methods' performance in real-time applications. Moreover, the parameters optimization results could be further improved by comparing different available solvers for non-linear problems. Also, several methods for defining the optimal initial values or application of a random restart under certain conditions could be considered. Finally, another interesting future research direction would be the application the obtained ride comfort data to data fusion systems of AVs and assessment the performance.