Data‐driven framework for warranty claims forecasting with an application for automotive components

Automakers spend billions of dollars annually towards warranty costs, and warranty reduction is typically high on their priorities. An accurate understanding of warranty performance plays a critical role in controlling and steering the business, and it is of crucial importance to fully understand the actual situation as well as be able to predict future performance, for example, to set up adequate financial reserves or to prioritize improvement actions based on expected forthcoming claims. Data maturation, a major nuisance causing changes in performance metrics with observation time, is one of the factors complicating warranty data analysis and typically leads to over‐optimistic conclusions. In this paper, we propose a sequence of steps, decomposing and addressing the main reasons causing data maturation. We first compensate for reporting delay effects using a Cox regression model. For the compensation of heterogeneous build quality, sales delay, and warranty expiration rushes, a constrained quadratic optimization approach is presented, and finally, a sales pattern forecast is provided to properly weigh adjusted individual warranty key performance indicators. The results are shown to dramatically improve prior modeling approaches.


INTRODUCTION
Automakers spend billions of dollars annually on warranty costs. Warranty reduction programs are among the highest priorities for financial departments, and accurate forecasting of automobile warranty performance plays a critical role in enhancing these efforts. The forecasting process concerns multiple key performance indicators (KPIs)/warranty metrics, among which the most important ones are the number of repairs per unit (RPU) and cost per unit (CPU) for specific time-in-service (TIS) values [3, 4]. Forecasting warranty claims is challenging, especially in the presence of external dynamic factors like seasonality, delays in service and reporting, heterogeneous manufacturing quality from different plants, and compounded failures.
[20] Survival models are often parametric and fail to capture non-trivial trends; they are further not suitable for long-term forecasting because of their sensitivity to parameters such as the power parameter in a Weibull distribution [18]. Parametric and non-parametric time-series models such as recurrent neural networks and auto-regressive models are excellent for capturing non-linear temporal trends [11, 21]; however, they often fail due to year-to-year variations and the limited temporal data availability for the specific component of interest. Machine learning models have been proposed to address the component-to-component variations by solving an optimization problem that utilizes historical data from other components that best mimic the component of interest [22]; however, those approaches are not designed to address the cutoff-to-cutoff variations within each component as shown in Figure 1. The cutoff is the last date of the calendar month for warranty data. Finally, knowledge-based models often learn adjusting factors from historical data to eliminate the cutoff-to-cutoff variations [23, 24]; however, those models are population-based and not targeted for individual components. Overall, existing literature either (i) ignores the within-component variations over observation times (cutoffs), (ii) fails to capture complex temporal trends, (iii) overfits the short history of data from the component of interest, or (iv) simplifies existing knowledge-based information to the population level. From practical expertise, warranty data maturation is one of the factors that make such a forecasting task challenging and often complicates warranty data analysis [18, 20].
The warranty data maturation phenomenon causes cumulative RPU and CPU estimates for a nominally homogeneous population to change as a function of the observation time, $t_0$ (see Figure 1). Overlooking the data maturation phenomenon may have serious implications for warranty budget allocation. There are, typically, four main reasons causing the data maturation phenomenon [18, 25]. (i) Reporting delays: claims are under-reported due to the time lag between the repair date and the date when the claim is documented in the warranty system. (ii) Sales delays: due to the variability in market demand, vehicles produced in the same time period are sold at various points in time, leading to variable reliability degradation at the time of sale. (iii) Production heterogeneity: the quality of the same component varies with the production date due to changes in production or supplier quality issues that are continuously being learned. (iv) Warranty expiration rush: customers tend to report non-critical failures after a certain delay or at the next maintenance appointment.
This paper proposes a non-parametric data-driven solution for warranty forecasting that addresses the four key causal factors for warranty data maturation. The contributions of the proposed approach are multi-fold and are described in Figure 2. Specifically, the approach (i) addresses warranty data maturation, including sales and production variations, (ii) captures non-linear and complex temporal trends like seasonality, and (iii) leverages an extensive historical database of warranty data from historical model years to forecast the warranty metrics for the component of interest. The remainder of the paper is structured as follows: Section 2 overviews relevant research for warranty data maturation. Section 3 details the proposed data-driven framework. Section 4 overviews the obtained results and provides directions for future work.

FIGURE 2 Contributions of the proposed data-driven approach.

Reporting delays
One of the possible culprits of data maturation is that the number of claims processed at each observation time is under-reported due to the lag between the repair date and the date when the claim is processed and paid, that is, when it is documented in the warranty system. From an insurance point of view, this effect is called Failed But Not Reported (FBNR). If these delays are not properly accounted for, this could lead to a downward bias in predicting future warranty claims [26][28][29][30]. Using the Poisson model, Kalbfleisch et al. [19] address the issue of reporting delay by adjusting the count of the number of vehicles at risk at a specific time in service and adjusting the under-reported count of claims at a particular time in service. There are recent efforts to utilize deep learning for survival analysis, and such models may be used to estimate the reporting delays (e.g., Dynamic-DeepHit [31]). However, such deep learning models are trained on historical reporting delay data from prior model years, which may result in prediction biases. Specifically, reporting delays are expected to have different dynamics every year based on economic performance, reporting system updates, warranty policy changes, and so forth. Several other research papers have also addressed the reporting delay problem by adjusting the count of the risk set using an empirical distribution of reporting delay based on actual warranty claims data for a population [32]. In the automotive industry, we conveniently have both the repair and load dates available in the system. As far as the failure date is concerned, it may indeed be earlier than the repair date (left censoring). However, most automotive components are subject to right censoring, which is easily accommodated by the Kaplan-Meier estimator.
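To illustrate how right-censored durations are accommodated, the following is a minimal sketch of the Kaplan-Meier product-limit estimator in Python. The function name `kaplan_meier`, the toy arrays, and the convention that `observed = False` marks a right-censored duration are assumptions made for this example; they are not taken from the study's data or code.

```python
import numpy as np

def kaplan_meier(durations, observed):
    """Kaplan-Meier estimate of the survival function for right-censored data.

    durations: time until the event, or until censoring (hypothetical units: days).
    observed:  True if the event was seen, False if the record is right-censored.
    Returns the distinct event times and the survival estimate just after each.
    """
    durations = np.asarray(durations, dtype=float)
    observed = np.asarray(observed, dtype=bool)
    event_times = np.unique(durations[observed])
    survival = []
    s = 1.0
    for t in event_times:
        at_risk = np.sum(durations >= t)                # still under observation at t
        events = np.sum((durations == t) & observed)    # events occurring exactly at t
        s *= 1.0 - events / at_risk
        survival.append(s)
    return event_times, np.array(survival)

# Toy example: five records, two of them right-censored.
times, surv = kaplan_meier([2, 3, 3, 5, 8], [True, True, False, True, False])
```

Censored records contribute to the risk set up to their censoring time but never trigger a drop in the curve, which is why the estimator handles them without any imputation.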

Sales delay
Another circumstance that may cause data maturation is the sales delay or the so-called "lot rot" effect. Due to variability in market demand, units of the same production month may spend varying amounts of time in the warehouse, on parking lots, or in shipping before they get sold. As such, they enter a given fixed time interval, $t_0$, at different observation times. Notably, those units entering $t_0$ at a later observation time would have spent a longer time on a lot and, thus, have degraded reliability compared to those that spend little time on a lot. Such a population of units, although belonging to the same production month with the same initial quality, is no longer nominally homogeneous. An obvious remedy for the resulting maturation issue is to stratify the subject population by the time-on-lot (i.e., the difference between the sales and production dates). The importance of estimating sales delay in the context of analyzing warranty data has been pointed out in prior research studies [33, 34]. Several research studies have considered the sales delay from a probability distribution estimation standpoint. Suzuki et al. [35] and Wang et al. [36] used a sample of warranty claims to provide an estimation of the sales delay distribution where the exact monthly sales and the exact number of failures at each month are not readily available. Wang et al. [36] developed a model to estimate monthly sales in the absence of date-specific sales data where only the total sales value is available. Lim [37] and Karim and Suzuki [38] used a Non-Homogeneous Poisson Process (NHPP) model, whereas Karim [39] used the lognormal distribution to find the sales delay distribution. Akbarov and Wu [40], on the other hand, suggested a method to analyze the impact of sales delay on the reliability of components under renewing and non-renewing warranty policies. Dorabati et al. [41] considered sales delay as well as the effects of NFBR (Non-Failed But Reported) and FBNR operators in estimating the expected warranty costs.

Warranty expiration "Rush"
According to Davis [42], engineering failures are categorized as hard and soft failures. The former category includes failures associated with the loss of function (e.g., an automobile's alternator failure), while the latter is associated with degraded function (e.g., an increased level of vibration of an interior trim element). It is intuitively clear that customers tend to report the hard failures right after they happen, and the soft ones (because of their non-criticality) after a certain delay. The upper limit for such a delay is the warranty expiration date, which is why manufacturers might see a sharp uptick in the number of claims at failure times close to the warranty expiration time. In their earlier research, Rai and Singh [43] addressed this bias by truncating the warranty claims data, removing claims below 1000 miles as well as claims happening beyond the last 1000 miles of warranty coverage, and consequently estimating the expected first failures at each month in service. In subsequent research, Rai and Singh [44] considered situations where mileage accumulation rates are available for the population of interest and proposed a methodology to overcome such bias by treating soft failures as left-censored data points. To identify these claims, they suggest using technician comments or engineering analysis.
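The truncation idea attributed to Rai and Singh above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `truncate_claims`, the 36,000-mile warranty limit, and the reading that claims within the final 1000 miles of coverage are the ones dropped are all hypothetical choices for the example.

```python
def truncate_claims(claim_mileages, warranty_limit_miles, margin_miles=1000):
    """Keep claims whose mileage lies between the lower margin and the
    warranty limit minus the margin (both bounds inclusive).

    Assumption: claims below `margin_miles` and claims in the last
    `margin_miles` of coverage are both removed before further analysis.
    """
    lower = margin_miles
    upper = warranty_limit_miles - margin_miles
    return [m for m in claim_mileages if lower <= m <= upper]

# Hypothetical claim mileages under an assumed 36,000-mile warranty.
kept = truncate_claims([500, 1000, 20000, 35500, 35000], warranty_limit_miles=36000)
```

The claims removed near the warranty limit are the ones most likely to be delayed soft-failure reports, which is the bias the truncation is meant to suppress.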

Month of production heterogeneity
The quality and reliability of automotive components in the same model year population could vary from one month of production (MOP) to another due to changes in production. For example, if there is an issue in the initial MOPs for a specific component and the problem is caught early on, it can be addressed and fixed in future MOPs. This often (but not always) leads to higher failure rates for specific components in the earlier MOPs compared to later MOPs (see Figure 3). The vehicles' failure rate is measured at different Time In Service (TIS) points, that is, the time the vehicles are on the road. Consequently, the line for a TIS of one month is a measure of the vehicles' quality after one month on the road. In some cases, the heterogeneity in MOPs could be attributed to quality problems of suppliers that support certain MOPs, in which case automakers can make recovery claims to suppliers to get reimbursed for the warranty costs associated with the supplier defects. Mittman et al. [45] and Lewis-Beck et al. [46] have investigated the issue of heterogeneous production populations in constructing prediction bounds for future failures. In their research, they use a Bayesian hierarchical model to borrow information across heterogeneous subpopulations, which is particularly beneficial in cases where some subpopulations have limited information. To capture the various failure modes within these subpopulations, they use the Generalized Limited Failure Population (GLFP) model to model the multiple lifetime distributions jointly.

Problem formulation and AI/ML framework
One of the most critical warranty performance indicators to monitor is the repairs per unit (RPU) metric as a time-dependent function of TIS and observation time point (cutoff). Every warranty repair is associated with a certain failure time, which is estimated as the difference between the repair date and the warranty start date; the repair date is thus assumed to coincide with the time of failure. Failure time and TIS are, strictly speaking, continuous variables, but due to the data collection procedures, they must be treated as discrete ones. Let $v_t$ denote the number of vehicles whose TIS is $t$. Let $c_{tq}$ denote the number of repairs with claim time $q$ for vehicles whose TIS is $t$, $q \le t$. The TIS and claim times are measured in time intervals (e.g., day, month, year), and neither can exceed the observation time Ω, that is, $q \le t \le \Omega$. It must be stressed that all values depend on the cutoff because the number of repairs and the number of sold vehicles often increase at later cutoffs. Based on that, we can formulate the incremental RPU at TIS $t$ at a given cutoff Ω by:

$$r_t(\Omega) = \frac{c_t}{n_t} \tag{1}$$

with

$$c_t = \sum_{s=t}^{\Omega} c_{st}$$

the number of repairs with claim time $t$, which is only possible for vehicles whose TIS is larger than or equal to $t$, and

$$n_t = \sum_{s=t}^{\Omega} v_s$$

the number of vehicles whose TIS is larger than or equal to $t$, resulting in the cumulative RPU at TIS $t$:

$$R_t(\Omega) = \sum_{q \le t} r_q(\Omega)$$

We assume that the warranty mileage limits are already considered in Equation (1) by updating the number of vehicles for mileage dropouts. A typical claims database at a given cutoff can be represented in the form of Table 1. The RPU development over cutoffs needs to be understood and foreseen to set the warranty reserves to the right level and to control the cash outflow. Figure 4 shows the changes over the cutoff for vehicles produced at different MOPs. From the figure, we can see that (i) individual MOPs show different RPU curves, and (ii) RPU values change significantly with observation time (cutoff). The first observation is due to MOP heterogeneity, and the second observation is expected due to changes over cutoffs (observation times). For warranty reserves and quality assessment, it is of interest to forecast the warranty claims until the end of the warranty, typically 36 or 72 months for automotive components.
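The incremental and cumulative RPU described above can be sketched numerically as follows. This is a minimal illustration, assuming the claims are available as a matrix indexed by (TIS of the vehicle, claim time) and that every at-risk count is positive; the function name `rpu_curves` and the toy numbers are hypothetical.

```python
import numpy as np

def rpu_curves(vehicles, claims):
    """Incremental and cumulative RPU at a single cutoff.

    vehicles[s]  : number of vehicles whose TIS is s.
    claims[s, q] : number of repairs with claim time q on vehicles whose TIS
                   is s (only entries with q <= s are meaningful).
    """
    vehicles = np.asarray(vehicles, dtype=float)
    # Discard impossible entries (claim time larger than TIS).
    claims = np.tril(np.asarray(claims, dtype=float))
    # Vehicles with TIS >= t: reverse cumulative sum over TIS.
    n_at_risk = np.cumsum(vehicles[::-1])[::-1]
    # Repairs with claim time t, summed over all vehicles with TIS >= t.
    repairs = claims.sum(axis=0)
    incremental = repairs / n_at_risk
    return incremental, np.cumsum(incremental)

# Toy data: TIS 0, 1, 2 with 10, 20, and 30 vehicles respectively.
incremental, cumulative = rpu_curves(
    [10, 20, 30],
    [[1, 0, 0],
     [2, 1, 0],
     [3, 2, 3]],
)
```

Recomputing these curves at successive cutoffs with growing `vehicles` and `claims` arrays is what makes the cutoff-to-cutoff changes of the RPU visible.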
In this work, we propose a data-driven framework for warranty forecasting. First, we disentangle the RPU metric into single MOPs to split potentially heterogeneous production quality into internally homogeneous populations. Second, we account for the reporting delay on each MOP and extrapolate the MOP corrected for the reporting delay to the needed extent (thus accounting for the remaining heterogeneity within the MOPs, lot rot (as a part of the sales delay), and warranty expiration rush). Third, after the correction of individual MOPs, a sales pattern should be derived and applied in order to ultimately calculate the weighted average of MOP-wise RPUs at a given point in the future, based on the already produced population. The overall framework is described in Figure 5.

Reporting delay model: Cox model
In this section, we uplift the recorded warranty data (number of claims) by a dynamic factor learned through a reporting delay model. This is necessary because not all claims are reported instantaneously: at any cutoff, some claims have already occurred but have not yet been reported to the system because of processing delays, so the number of claims in the system always underestimates the true number. With reporting delays, we extend the definition of the observed incremental RPU at TIS $t$ for cutoff (observation time) Ω from Equation (1) to:

$$r_t^{\text{observed}}(\Omega) = \frac{1}{n_t} \sum_{q \ge t} \; \sum_{l < \Omega - q - t} c_{qtl}(\Omega)$$

Here, $n_t$ is the number of in-service vehicles with TIS larger than or equal to $t$, and $c_{qtl}(\Omega)$ is the number of claims at cutoff (observation time) Ω with reporting delay $l$ present for vehicles with TIS $q$ and claim time $t$.

FIGURE 5 The proposed data-driven framework for warranty forecasting.

FIGURE 6 Effect of reporting delay dependent on a covariate.
Note that the reporting delay could, in principle, be arbitrarily long, but we only have information until time Ω. Hence, the claims with reporting delays that extend beyond the observation time (i.e., $l \ge \Omega - q - t$) are not yet in the system. Mathematically, the adjusted incremental RPU at TIS $t$ for cutoff Ω is:

$$r_t^{\text{adjusted}}(\Omega) = \frac{1}{n_t} \sum_{q \ge t} c_{qt}^{\text{adjusted}}(\Omega) \tag{6}$$

From domain knowledge, we know that there exist critical covariates in the automotive industry that impact the reporting delays, such as the cost of the repair or the location of the report. Therefore, we consider the situation in which the reporting delay $l$ depends on covariates. Specifically, we model the reporting delay effect with a Cox model as a function of covariates $X$. Figure 6 shows the cumulative reporting probability as a function of one of the covariates for a real case study for a specific vehicle line using our proposed Cox model described later in this section. It is clear from the figure that we observe different reporting delay effects at different covariate levels. Mathematically, the adjusted number of warranty claims with claim time $t$ on vehicles with TIS $q$ at cutoff Ω, which accounts for the claims that have occurred but are not yet reported in the system, is:

$$c_{qt}^{\text{adjusted}}(\Omega) = \sum_{x} \lambda_{qt}(\Omega, X = x)\, c_{qtx}(\Omega), \qquad \lambda_{qt}(\Omega, X = x) = \frac{1}{1 - S(\Omega - q - t \mid X = x)} \tag{7}$$

where $L$ is the random variable associated with the reporting delay, $X \in \mathbb{R}^p$ is a vector of $p$ covariates, and $c_{qtx}(\Omega)$ is the observed number of warranty claims with covariate $x$ and claim time $t$ for vehicles with TIS $q$. The uplift factor $\lambda_{qt}(\Omega, X = x)$ is the reciprocal of the reporting probability $P(L \le \Omega - q - t \mid X) = 1 - S(\Omega - q - t \mid X)$, where the survival function $S(\Omega - q - t \mid X) = P(L \ge \Omega - q - t \mid X)$ represents the probability that a warranty claim that occurred at claim time $t$ on a vehicle with TIS $q$ is not yet reported in the system at cutoff Ω. Under the assumption that the reporting delay pattern is unknown but presumably constant, $S(u \mid X)$ is estimated from historical data.
In Figure 6, the reporting probability is stretched or compressed according to the shape parameters for the four factor levels.
For modeling the survival function, we consider the Cox regression model [47] for the following reasons. (i) It is nonparametric in time, and it captures non-linear temporal trends. (ii) The proposed use of the Cox model is an extension of the Kalbfleisch model [20], originally proposed to model the reporting delays. (iii) It is a well-established model in survival analysis for integrating covariates (a.k.a. the explanatory variables). (iv) It is robust to prediction biases because it is trained separately for each population (e.g., vehicle line) or sub-population (e.g., a MOP for a vehicle line) of interest, unlike machine learning models. For our problem statement, the hazard function is interpreted as the probability of a warranty claim being reported on day $l$ after the repair, given that it has not been reported earlier. Last but not least, the parameters in the Cox regression model are interpretable. There is a direct dependency between the estimated parameters in the Cox model and their influence on the reporting delay: the curves in Figure 6 depend on the coefficients in the Cox model. Such interpretability supports decision-makers in developing mitigation strategies to reduce reporting delays. The hazard function for a warranty claim with reporting lag $l$ and covariate vector $X$ can be estimated as:

$$h(l \mid X) = h_0(l)\, \exp\left(\beta^\top X\right)$$

where $h_0(l)$ is the baseline hazard and $\beta \in \mathbb{R}^p$ are coefficients measuring the impact of the individual covariates. In the following, the covariates are assumed to be independent of $l$. The corresponding survival function is given by:

$$S(l \mid X) = \exp\left(-\exp\left(\beta^\top X\right) \int_0^l h_0(u)\, du\right) = S_0(l)^{\exp\left(\beta^\top X\right)}$$

where $S_0(l) = \exp\left(-\int_0^l h_0(u)\, du\right)$ is the baseline survival function. For practical applications, we should note the following:
1. Normally, the reporting delay pattern is unknown. Under the assumption that the reporting delay pattern is stable over time, historical data can be used for the estimation of the survival function $S(l \mid X)$. When selecting appropriate training data, it is necessary that the reporting delay pattern is fully reflected and not censored. In particular, the observation time Ω should be much larger than the time of the latest recorded claim.
2. When estimating the adjusted number of claims in Equation (7), we recommend restricting the adjustments by an upper bound.
3. Sometimes the number of forthcoming warranty claims within a given number of upcoming days, as addressed by the reporting delay, is of interest. In this situation, the parameters $t$, Ω, and $q$ are assumed to be fixed, and an adjusted uplift factor according to Equation (7) can be used.
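The following Python sketch shows how a Cox-type survival curve can be turned into a capped uplift factor, in the spirit of the practical notes above. It is an illustration under explicit assumptions: the baseline survival values and the coefficient are made up, the uplift is taken as the reciprocal of the reporting probability 1 − S(l | x), and the names `cox_survival`, `uplift_factor`, and the cap value of 5 are hypothetical.

```python
import numpy as np

def cox_survival(baseline_survival, beta, x):
    """Survival under proportional hazards: S(l | x) = S0(l) ** exp(beta . x)."""
    risk_score = np.exp(np.dot(beta, x))
    return np.asarray(baseline_survival, dtype=float) ** risk_score

def uplift_factor(survival_at_lag, cap=5.0):
    """Reciprocal of the reporting probability 1 - S, restricted by an upper bound.

    The cap keeps the adjustment finite when almost nothing has been
    reported yet (reporting probability close to zero).
    """
    reported_prob = 1.0 - np.asarray(survival_at_lag, dtype=float)
    return 1.0 / np.maximum(reported_prob, 1.0 / cap)

# Assumed baseline survival of the reporting delay at lags 0, 1, 2 (made-up values).
s0 = [1.0, 0.5, 0.25]
# One covariate with coefficient ln(2): it doubles the reporting hazard.
s = cox_survival(s0, beta=[np.log(2.0)], x=[1.0])
factors = uplift_factor(s, cap=5.0)
observed_claims = np.array([0.0, 30.0, 45.0])
adjusted_claims = factors * observed_claims
```

A positive coefficient shrinks the survival curve, so claims with that covariate level are reported faster and receive a smaller uplift at the same lag.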

MOPs sharing model: Quadratic optimization
Warranty claims from a component often suffer from MOP heterogeneity. By this, we mean that the quality of production differs by MOP due to continuous changes to the manufacturing design that the quality department or customers raise; refer to Section 2.4 for more details. This section models the warranty metric (KPI) at the MOP level for the component of interest to avoid uncertainty introduced by MOP heterogeneity. For simplicity, we consider the components at different MOPs as distinct components in this section, and we focus on RPU as the KPI of interest. To forecast warranty metrics for a component of interest, we develop an approach that searches an extensive database of historical components and identifies the ones that share similar trends at different maturity levels with the component of interest. The identified historical components are then used as surrogates for warranty forecasting. The observation time/cutoff defines the maturity level.

Data pre-processing: Bias reduction and data standardization
We first adjust the RPU for reporting delays to reduce the biases between different components using the reporting delay model developed in Section 3.2. Here, a reporting delay model is developed for each historical and in-service component to eliminate component-to-component differences resulting from reporting delay factors. Thus, the RPU adjusted for reporting delays is better suited than the observed RPU when searching for patterns between different model types and model years. We then standardize the data to reduce biases due to mean shifts. Here, Ω is the observation time (cutoff) of a particular system, and it defines the maturity level. For example, $R_i(t = 1, \Omega = 10)$ is the cumulative RPU for component $i$ at TIS 1 month when observed after 10 MOPs. $\tilde{R}_k(t, \Omega)$, $t \in \{0, \ldots, \Omega\}$, is the cumulative RPU adjusted for reporting delays described in Equation (6), and it is mathematically formulated as:

$$\tilde{R}_k(t, \Omega) = \sum_{q=0}^{t} r_q^{\text{adjusted}}(\Omega)$$

Here, $r_q^{\text{adjusted}}(\Omega)$ is the incremental RPU at TIS $q$ calculated using Equation (6).

Constrained quadratic optimization for warranty forecasting
This section focuses on the problem of forecasting the cumulative RPU for the next k months conditional on the currently observed warranty claims. An example is forecasting the cumulative RPU at the end of the first year of production given the observed RPU from the first six MOPs, that is, forecasting at maturity level Ω* = 12 given the observations up to Ω = 6. We model the cumulative RPU at TIS $t$ and maturity level (observation time) Ω* for component $k$ as a weighted mixture of the cumulative RPU for historical components at TIS $t$ and maturity level Ω*:

$$\hat{R}_k(t, \Omega^*) = \sum_{i=1}^{n} w_{ik}(\Omega^* \mid \Omega)\, R_i(t, \Omega^*)$$

where $n$ is the number of historical components. The weights measure the similarity between the warranty trend of the historical training components and the testing component of interest at an observed maturity level Ω. Here, $w_{ik}(\Omega^* \mid \Omega)$ is the importance weight of historical component $i$ for predicting the cumulative RPU of testing component $k$, given the observed cumulative RPU of both components up to maturity level Ω and the cumulative RPU of the historical components at maturity level Ω*. For notational simplicity, we denote the weights as $w_{ik}(\Omega^* \mid \Omega)$ and collect them in the vector $w_k(\Omega^* \mid \Omega)$. We consider two objectives when optimizing the weights $w_k(\Omega^* \mid \Omega)$. The first objective favors historical components that share similar RPU trends with the component of interest by minimizing the pairwise deviations between the selected historical components and the component of interest at all observed maturation levels up to Ω. The second objective reduces the likelihood of assigning high weights to conflicting (dissimilar) historical components by minimizing the deviations between the selected historical components at the future maturation level Ω*. For the first objective, we define the weighted sum of squared differences in Equation (16) to quantify the dissimilarity level between any two components $i$ and $j$. For the second objective, we define the squared difference in Equation (17) to quantify the dissimilarity between two historical components $i$ and $i'$ at the future maturation level Ω*.
Here, Δ is the number of prior maturation levels (cutoffs) considered for calculating the dissimilarity value.
The weights are then optimized by solving the constrained quadratic optimization problem in Equation (18). The constraints are introduced to avoid overfitting and trivial solutions. Specifically, the upper bound constraint on the weights is introduced to encourage selecting a bigger set of historical components to reduce the likelihood of overfitting, and the normalization constraint is introduced to achieve unbiased predictions. The similarity weights between historical MOPs provide partial explainability for the KPI forecast for the MOP of interest, unlike time-series deep learning models.
Here, $A \in \mathbb{R}^{n \times n}$ is a diagonal matrix of dissimilarities between the component of interest $k$ and the historical components with elements $d_{ik}(\Omega, \Delta)$, $n$ is the number of historical components, $B \in \mathbb{R}^{n \times n}$ is a matrix of pairwise dissimilarities between the historical components with elements $\delta_{ii'}(\Omega^*)$, and $\gamma$ is a tuning parameter learned to optimize the predictive performance.
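A small numerical sketch can make the structure of such a weight optimization concrete. The exact form of Equation (18) is not reproduced here, so the objective below, w'Aw + gamma * w'Bw subject to the weights summing to one and lying between zero and an upper bound, is an assumption, as are the function names, the projected-gradient solver, and the toy numbers. Because a pairwise dissimilarity matrix need not be positive semidefinite, the solver should be read as finding a local solution.

```python
import numpy as np

def project_capped_simplex(v, upper):
    """Euclidean projection onto {w : sum(w) = 1, 0 <= w_i <= upper} by bisection."""
    lo, hi = v.min() - upper, v.max()
    for _ in range(100):
        tau = 0.5 * (lo + hi)
        if np.clip(v - tau, 0.0, upper).sum() > 1.0:
            lo = tau
        else:
            hi = tau
    return np.clip(v - 0.5 * (lo + hi), 0.0, upper)

def sharing_weights(d, B, gamma=1.0, upper=0.6, n_iter=2000):
    """Projected gradient descent on sum_i d_i w_i^2 + gamma * w' B w.

    d : dissimilarity of each historical component to the component of interest
        (the diagonal of A).
    B : symmetric matrix of pairwise dissimilarities between historical components.
    """
    d = np.asarray(d, dtype=float)
    B = np.asarray(B, dtype=float)
    n = d.size
    if n * upper < 1.0:
        raise ValueError("upper bound too small for the weights to sum to one")
    w = np.full(n, 1.0 / n)
    # Step size from an upper bound on the gradient's Lipschitz constant.
    lipschitz = 2.0 * (d.max() + gamma * np.linalg.norm(B, 2))
    step = 1.0 / max(lipschitz, 1e-12)
    for _ in range(n_iter):
        grad = 2.0 * (d * w + gamma * (B @ w))  # valid for symmetric B
        w = project_capped_simplex(w - step * grad, upper)
    return w

# Toy example: components 0 and 1 resemble the target and each other.
d = [0.1, 0.1, 5.0, 5.0]
B = [[0.0, 0.01, 1.0, 1.0],
     [0.01, 0.0, 1.0, 1.0],
     [1.0, 1.0, 0.0, 1.0],
     [1.0, 1.0, 1.0, 0.0]]
weights = sharing_weights(d, B, gamma=1.0, upper=0.6)
```

In this toy setting the weight mass ends up on the two components that are close to the target and to each other, which mirrors the intent of the two objectives; lowering `upper` forces the solution to spread over more components.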

Sales pattern model for automotive components: Extra trees
For warranty forecasting, an accurate sales pattern prediction helps in (i) modeling the lot rot effect (reliability degradation due to sales delay) and (ii) accumulating the MOP-level predictions to a component-level prediction by normalizing the outcomes for the MOP-shared model based on forecasted sales contributions. While sales forecasting is a well-known problem, it poses unique challenges for different applications. For our application of interest, we need to forecast the contribution of the sales per MOP for the upcoming months of sales. Considering the variations in the total number of production months and sales months for each vehicle line in individual model years (e.g., a vehicle model could be produced for 12 months in model year 2020 and for 18 months in model year 2019), we need features that are consistent between new and historical components.
3) Sales patterns: The cumulative sum and the last three months of sales help capture long- and short-term sales trends, so that the model can adjust to the market dynamics.

Extra trees regression
Since we need to be able to forecast the warranty metrics for a set of consecutive future months, we develop a sequential multi-output model. One may exploit advanced models such as Long Short-Term Memory (LSTM) neural networks to develop such a sequential model. However, these models need a considerable amount of data to be well trained. The alternatives are decision-tree-based methods. These approaches are highly efficient, especially in handling missing values and in situations with few training samples. While these models do not consider sequential relationships by default, our approach to feature definition diminishes the need for inherent sequential modeling. In this work, we use Extra Trees regression as our predictive ML model, as opposed to other decision-tree-based approaches such as the Random Forest, since it has a more generalizable split selection strategy and produces a more diversified ensemble (please refer to Reference 48 for more information). Finally, the cumulative RPU at TIS $t$ for cutoff Ω for the component of interest, considering all MOPs, is estimated by:

$$\hat{R}(t, \Omega) = \sum_{k} \hat{\alpha}_k\, \hat{R}_k(t, \Omega)$$

Here, $\hat{\alpha}_k$ is the predicted sales contribution from MOP $k$ using the Extra Trees model, and $\hat{R}_k(t, \Omega)$ is the forecasted cumulative RPU for MOP $k$ using the MOP sharing model described in Section 3.3.
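The final aggregation step is a sales-weighted average of the MOP-level forecasts, which can be sketched as below. The function name `aggregate_rpu` and the toy numbers are hypothetical, and the sketch assumes that the sales contributions are normalized to sum to one before weighting.

```python
import numpy as np

def aggregate_rpu(mop_rpu, sales_contribution):
    """Combine MOP-level cumulative RPU forecasts into a component-level curve.

    mop_rpu            : array of shape (n_mops, n_tis), one forecast curve per MOP.
    sales_contribution : predicted sales volume (or share) per MOP.
    """
    mop_rpu = np.asarray(mop_rpu, dtype=float)
    share = np.asarray(sales_contribution, dtype=float)
    share = share / share.sum()  # normalize so the shares sum to one
    return share @ mop_rpu

# Two MOPs, two TIS points; the first MOP accounts for 75% of the sales.
component_rpu = aggregate_rpu([[0.1, 0.2], [0.3, 0.6]], [3, 1])
```

Because the weights are sales shares, a MOP with poor quality but few sold vehicles moves the component-level curve only slightly.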

Extensions beyond automotive case-studies
The proposed data-driven framework is extendable to forecast KPIs for various fields, including medical and financial sectors. Specifically, the proposed methodology is effective for case studies that can be decomposed into sub-groups (the equivalent of MOP-level), where each sub-group may exhibit a different trend and the impact of each sub-group varies.
Here, the MOP-sharing model could forecast the KPI at the sub-group level. A revised sales delay model could be used to predict the impact of each sub-group. For example, financial markets can be divided into different sectors; each sector is expected to have a different trend. The impact of each sector (sub-group) depends on the volume of investments, which varies daily. The presented framework can also be extended to case studies where the main KPIs are heavily biased due to immature data. Special attention can be given here to the medical sector in the case of infectious diseases or a pandemic situation. There is always a delay between the infection date of a disease and the outbreak of the disease. The proposed models could be used to provide a better overview of the real extent of a disease.

RESULTS
This section details the performance of the proposed data-driven approach on several automotive datasets that consist of multiple vehicle lines. Here, the production of each vehicle line spreads across different months, which gives rise to heterogeneity at the MOP level. Thus, we detail the forecasting performance at the MOP level using the MOP-sharing model to compensate for MOP heterogeneity. We also detail the performance of the sales delay model, which quantifies the relative weight/impact of each MOP for the entire vehicle line production, using K-fold cross-validation. Finally, we discuss the overall KPI forecasting performance for the proposed data-driven framework after combining the proposed ingredients (reporting delay, MOP-sharing, and sales delay). For confidentiality, we show the RPU prediction performance at the MOP level in Section 4.1 for one market and the CPU prediction performance at the vehicle-line level in Section 4.3 for another market.

MOP sharing model
All the results shown in this section are at the MOP level. We first show the predictive performance for two MOPs in Figures 7 and 8, and then extend that to an entire market with a total of 32 MOPs in Figure 9. Note that the MOP-sharing model is implemented for each MOP.
Figure 7 shows a real case study that demonstrates the predictions of the cumulative RPU using the MOP sharing model. The red cross curve represents the mature RPU curve, the black circle curve represents the predicted RPU curve, and the solid blue lines represent the scaled historical RPU curves with non-zero weights. The solid vertical black line splits the figure into the observed region in TIS to the left and the forecasting region in TIS to the right. Both sides of the vertical black line (the entire figure) are for the forecasted maturation level Ω* given the current maturation level Ω (not shown in the figure). The figure is also annotated to explain the contributions of the different parts of the objective function. The first objective, presented in Equation (18) and described in Equation (16), is based on the TIS points in the observed region at different maturation levels. The second objective, presented in Equation (18) and described in Equation (17), aims to minimize the green shaded bar at TIS 25 by minimizing the differences between the selected historical components. The solution of the MOP sharing optimization problem identifies historical components that are very close to each other in the observed region and considerably close to each other in the forecasted region. Many of the solid blue lines on the upper and lower edges of the green bar have small but non-zero weights. Finally, the predicted RPU curve is a near-perfect fit to the mature RPU curve, which indicates that our model accurately forecasted the RPU curve from observation time Ω = 13 to the end-of-warranty observation time Ω* = 25.
The major event at TIS 12 months in Figure 8 is not possible to account for using traditional survival models. Figure 8 also shows that the MOP sharing model clearly (i) adjusts for maturity at the observed TIS points with near-perfect accuracy and (ii) provides good forecasting performance that accounts for unexpected events. Using traditional survival models without adjusting for maturity would have resulted in extreme undershooting of warranty claims, which is a significant drawback for warranty reserves and manufacturing adjustments for newer model years.
Across the 32 case studies in Figure 9, the MOP sharing model (i) reduces the error on average by 15% and (ii) provides a tighter 95% confidence interval (−12.5%, 12.5%) compared to that of traditional survival models (−20%, 30%). This holistic study further shows the effectiveness and robustness of the MOP sharing model compared to the common practice approach.
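As a rough illustration of the kind of constrained quadratic problem behind the MOP sharing model, the following sketch fits weights over scaled historical RPU curves so that their blend matches the observed TIS points, and then reads the forecast off the blended curve. It is not a reproduction of Equations (16)-(18): the second objective (the spread between selected components at the forecast TIS) is omitted, the constraint that the weights are non-negative and sum to one is an assumption, the projected-gradient solver is a stand-in, and all names and numbers (`fit_sharing_weights`, `history`, `observed`) are hypothetical.

```python
def project_to_simplex(values):
    """Euclidean projection onto {w : w >= 0, sum(w) = 1}."""
    ordered = sorted(values, reverse=True)
    running, shift = 0.0, 0.0
    for count, value in enumerate(ordered, start=1):
        running += value
        candidate = (running - 1.0) / count
        if value - candidate > 0:
            shift = candidate
    return [max(v - shift, 0.0) for v in values]


def blend(history, weights, n_points):
    """Weighted combination of the historical curves over the first n_points TIS values."""
    return [sum(w * curve[t] for w, curve in zip(weights, history))
            for t in range(n_points)]


def squared_error(history, weights, observed):
    """Sum of squared differences between the blend and the observed curve."""
    fitted = blend(history, weights, len(observed))
    return sum((f - o) ** 2 for f, o in zip(fitted, observed))


def fit_sharing_weights(history, observed, iterations=2000):
    """Projected gradient descent on the squared error (assumed objective)."""
    n_obs, n_curves = len(observed), len(history)
    weights = [1.0 / n_curves] * n_curves
    # Step size 1/L, with L bounded by twice the squared Frobenius norm.
    step = 1.0 / (2.0 * sum(curve[t] ** 2 for curve in history for t in range(n_obs)))
    for _ in range(iterations):
        fitted = blend(history, weights, n_obs)
        residual = [f - o for f, o in zip(fitted, observed)]
        gradient = [2.0 * sum(r * curve[t] for t, r in enumerate(residual))
                    for curve in history]
        weights = project_to_simplex([w - step * g for w, g in zip(weights, gradient)])
    return weights


# Hypothetical scaled historical cumulative RPU curves over TIS 1..6.
history = [
    [0.10, 0.20, 0.30, 0.40, 0.50, 0.60],
    [0.01, 0.04, 0.09, 0.16, 0.25, 0.36],
    [0.30, 0.40, 0.45, 0.48, 0.50, 0.51],
]
observed = [0.08, 0.17, 0.26, 0.35]  # hypothetical current MOP, observed TIS 1..4

weights = fit_sharing_weights(history, observed)
forecast = blend(history, weights, len(history[0]))  # observed and forecast TIS
```

Because the forecast is a convex combination of the historical curves under these assumptions, it always stays within their envelope, which is one way to read the green bar at TIS 25 in Figure 7.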

Sales delay
The sales delay model is solved once for a large database of 32 historical vehicle lines from four prior model years. We explored single and multiple output regression models, including LSTM networks, autoregressive models, and ensemble methods. Figure 10 shows the performance of those models on 32 vehicle lines using 10-fold cross-validation to maximize the R-squared value. The figure shows that the ExtraTrees regressor performs best, with the highest average and median R-squared value over the 32 vehicle lines.
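The evaluation protocol described above can be sketched as follows, assuming a simple interleaved fold assignment. A one-variable least-squares line stands in for the regressors actually compared (ExtraTrees, LSTM, and so on), and the data and function names are hypothetical; only the 10-fold split and the R-squared summary by mean and median mirror the text.

```python
from statistics import mean, median


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    avg = mean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - avg) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (stand-in model)."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return slope, y_bar - slope * x_bar


def k_fold_r2(xs, ys, k=10):
    """Return one held-out R-squared score per fold."""
    n = len(xs)
    scores = []
    for fold in range(k):
        test_idx = set(range(fold, n, k))  # every k-th sample forms one fold
        train = [(xs[i], ys[i]) for i in range(n) if i not in test_idx]
        test = [(xs[i], ys[i]) for i in sorted(test_idx)]
        slope, intercept = fit_line(*zip(*train))
        predictions = [slope * x + intercept for x, _ in test]
        scores.append(r_squared([y for _, y in test], predictions))
    return scores


# Hypothetical data: a linear trend with a small deterministic wiggle.
xs = list(range(40))
ys = [3 * x + 2 + (1 if x % 2 else -1) for x in xs]
scores = k_fold_r2(xs, ys)
summary = {"mean_r2": mean(scores), "median_r2": median(scores)}
```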
We then fine-tuned the hyperparameters of the ExtraTrees model using grid search. For performance evaluation, we applied the ExtraTrees model to separate testing data with 40 vehicle lines; the results are shown in Figure 11. The majority of the test cases lie on the first bisector, showing an excellent match between the predicted and actual sales volume. We examined the test cases with poor performance and found that most of them were at an extremely early stage of production with a limited number of sales.
With ExtraTrees, we can identify which features were significant for training the model, which builds more confidence and trust in the model. The feature importance results are summarized in Figure 12. The most relevant features were the total cumulative sales volume per MOP and the relative sales with respect to production from the past three months per MOP. There were negligible effects from seasonality (sales date) and MOP number.
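Figure 12 reports permutation importance alongside the tree-based feature importance. A minimal sketch of the permutation idea is given here: one feature column is shuffled at a time and the resulting drop in R-squared is recorded. The `predict` function is a hypothetical stand-in for a trained model (it is not the fitted ExtraTrees regressor), and the three feature columns and their values are invented for illustration.

```python
import random
from statistics import mean


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    avg = mean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - avg) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def permutation_importance(predict, rows, targets, n_repeats=5, seed=0):
    """Mean drop in R-squared when each feature column is shuffled."""
    rng = random.Random(seed)
    baseline = r_squared(targets, [predict(r) for r in rows])
    importances = []
    for col in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled = [r[col] for r in rows]
            rng.shuffle(shuffled)
            permuted = [r[:col] + [v] + r[col + 1:] for r, v in zip(rows, shuffled)]
            drops.append(baseline - r_squared(targets, [predict(r) for r in permuted]))
        importances.append(mean(drops))
    return importances


def predict(row):
    """Hypothetical model: uses sales history, ignores the quarter of sale."""
    cumulative_sales, recent_relative_sales, quarter = row
    return 0.8 * cumulative_sales + 50.0 * recent_relative_sales


# Hypothetical feature rows: [cumulative sales, recent relative sales, quarter].
rows = [[100.0 * i, 0.1 * (i % 5), float(i % 4 + 1)] for i in range(1, 21)]
targets = [predict(r) for r in rows]
importances = permutation_importance(predict, rows, targets)
```

In this toy setup the quarter column receives zero importance because the stand-in model never reads it, which is the pattern the text describes for seasonality.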

Comprehensive results on an automotive case-study
FIGURE 12 A plot showing the associated feature importance (left). A plot showing the associated permutation importance (right).

Previous sub-sections show results from individual components of the data-driven approach. Here, we show a more comprehensive performance evaluation of the proposed data-driven approach compared to a benchmark. Specifically, we have implemented this approach to predict overall warranty performance across 40 vehicle lines for a US auto manufacturer. We used historical data from nine model years as a training dataset. Our target was to predict the RPU and CPU of the 40 vehicle lines at a cutoff that is close to the end of warranty (e.g., August 2022), starting with a cutoff that is 24 months earlier (e.g., August 2020). Forecasts for the target cutoff have been made at 11 successive cutoffs on the vehicle line level until a cutoff (e.g., July 2021). Consequently, at each cutoff all available vehicle lines were predicted and then compared to their corresponding truth. The results show a major improvement for the proposed data-driven framework over the benchmark method. The boxplots in Figure 13 represent the relative deviations between the forecasted and true CPU values. Note that the baseline values, which are reported to the system, are always over-optimistic regarding CPU performance, whereas the proposed data-driven framework is over-optimistic only at early cutoffs. We can also observe that the performance improves over cutoffs because more vehicles are sold and the number of claims reported matures until it stops changing after all vehicles are sold and reach the end of warranty. The figure clearly shows that the new solution significantly improves the prediction accuracy: already after the fourth observation time (cutoff) for the current model year, all predictions are within a ±5% interval around the truth. Such accuracy at early observation times enables early alarms for quality issues to be improved in later MOPs and helps to accurately assign warranty reserves periodically.
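The comparison summarized in the boxplots can be sketched as below, assuming the relative deviation is defined as (forecast − truth) / truth, so that negative values correspond to over-optimistic forecasts. This sign convention, the function names, and all CPU numbers are assumptions for illustration and are not taken from the study.

```python
def relative_deviation(forecast, truth):
    """Signed relative deviation of a forecast from the mature (true) value."""
    return (forecast - truth) / truth


def first_cutoff_within_band(deviations_by_cutoff, band=0.05):
    """1-based index of the first cutoff from which all deviations at that
    and every later cutoff stay within +/- band, or None if there is none."""
    first = None
    for index, deviations in enumerate(deviations_by_cutoff, start=1):
        if all(abs(d) <= band for d in deviations):
            if first is None:
                first = index
        else:
            first = None
    return first


# Hypothetical mature CPU per vehicle line and forecasts at successive cutoffs.
truth = {"line_a": 100.0, "line_b": 250.0}
forecasts_by_cutoff = [
    {"line_a": 88.0, "line_b": 230.0},
    {"line_a": 93.0, "line_b": 240.0},
    {"line_a": 97.0, "line_b": 245.0},
    {"line_a": 101.0, "line_b": 252.0},
]

deviations = [
    [relative_deviation(forecast[line], truth[line]) for line in truth]
    for forecast in forecasts_by_cutoff
]
first = first_cutoff_within_band(deviations)
```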

FIGURE 13 Relative deviations for baseline and forecasted cost per unit values at 11 consecutive cutoffs.

CONCLUSIONS
Automakers spend billions of dollars annually on warranty costs. Accurate forecasting of warranty performance plays a critical role. The data maturation phenomenon, which causes warranty performance to change with the observation time, makes such a forecasting task challenging. The diverse nature of the maturity factors calls for a hybrid approach rather than a single model. In this research, we propose a data-driven framework for warranty forecasting that (i) addresses warranty data maturation, (ii) considers non-parametric models to capture non-linear and complex warranty trends, and (iii) leverages adjusted historical warranty trends from historical components. Specifically, we first compensated for the reporting delay effect using a Cox regression model. Second, we developed an MOP sharing model as a constrained quadratic optimization problem that forecasts the warranty metrics at the MOP level, considering MOP heterogeneity and utilizing adjusted warranty trends from historical components at the MOP level to account for the warranty expiration rush. Finally, a sales delay model is used to forecast the sales pattern of already produced vehicles to weigh the adjusted MOP curves properly. It is important to note that the three suggested models are either fully explainable (Cox) or partially explainable, which is an advantage over "black box" AI/ML models. The results of the proposed data-driven framework show a significant improvement over the baseline, which uses population-based uplift factors to adjust for warranty data maturation and survival models to forecast the warranty metrics over time in service. Finally, the inferences from the proposed model have been used as decision support in scoping the warranty reserves of the company.

FIGURE 1 Key performance indicator (KPI) (e.g., CPU) estimates for a nominally homogeneous population changing as a function of the failure time (TIS) and the observation time. Each curve represents the same KPI at a different observation time (cutoff). Due to the cutoff dependence, the values of the three functions for the same given failure time (TIS) vary.

FIGURE 3 Repairs per unit (RPU) for different months of production (MOPs) measured at different time-in-service (TIS) points: (A) earlier MOPs have higher failure rates, (B) later MOPs have higher failure rates.

FIGURE 4 Repairs per unit (RPU) progression of different months of production (MOPs) at a fixed time-in-service over cutoff dates.

3.4.1 Relevant features
1. Relative MOP: This feature indicates the MOP order with respect to the initial mass production MOP (the mass production MOP is defined as the first MOP where the production volume exceeds a certain threshold).
2. Relative MOS: This feature indicates the MOS order with respect to the first MOS (regardless of sales volume).
3. Quarter of sale: This feature defines the calendar season.
4. Cumulative sales per MOP as a function of cutoff: This feature identifies historical information.
5. Last three months of relative sales per MOP: This feature identifies relative sales patterns.
The selected features offer the following advantages. (1) Consistency: The features are consistent between different model years. (2) Seasonality: The features track seasonality with the quarter of sale and the last three months of sales.
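One possible way to assemble the features listed above for a single cutoff is sketched here. The record layout (`production` keyed by month of production, `sales` keyed by month of production and month of sale), the use of the cutoff month as the month of sale being described, the threshold value, and every name in the code are assumptions made for illustration.

```python
def month_index(year, month):
    """Map a (year, month) pair to a running month count."""
    return year * 12 + (month - 1)


def build_features(production, sales, cutoff, mass_threshold=1000):
    """production: {(year, month): units built}
    sales: {(mop, mos): units sold}, with mop and mos as (year, month) tuples
    cutoff: (year, month) observation time."""
    # First MOP whose production volume exceeds the (assumed) threshold.
    mass_mop = min(m for m, units in production.items() if units > mass_threshold)
    first_mos = min(mos for _, mos in sales)
    cut = month_index(*cutoff)
    rows = []
    for mop, built in sorted(production.items()):
        sold_by_month = {
            month_index(*mos): units
            for (p, mos), units in sales.items()
            if p == mop and month_index(*mos) <= cut
        }
        # Sales relative to production for the last three months up to the cutoff.
        recent = [sold_by_month.get(cut - lag, 0) / built for lag in (2, 1, 0)]
        rows.append({
            "relative_mop": month_index(*mop) - month_index(*mass_mop),
            "relative_mos": cut - month_index(*first_mos),
            "quarter_of_sale": (cutoff[1] - 1) // 3 + 1,
            "cumulative_sales": sum(sold_by_month.values()),
            "recent_relative_sales": recent,
        })
    return rows


# Hypothetical production and sales volumes.
production = {(2020, 1): 200, (2020, 2): 1500, (2020, 3): 2000}
sales = {
    ((2020, 1), (2020, 2)): 50,
    ((2020, 1), (2020, 3)): 100,
    ((2020, 2), (2020, 3)): 300,
    ((2020, 2), (2020, 4)): 600,
    ((2020, 3), (2020, 4)): 400,
}
rows = build_features(production, sales, cutoff=(2020, 4))
```

Expressing MOP and MOS relative to the first mass-production month and the first month of sale, rather than as calendar dates, is what keeps the features comparable between model years.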

Figure 8 shows another real case study that demonstrates the efficacy of the MOP sharing model; the y-axis values were removed for confidentiality. Red crosses represent the mature RPU curve at maturation level Ω* = 25 months, the black circles represent the current observed RPU curve at maturation level Ω = 11 months, and the solid blue line represents the predicted RPU curve corrected for maturity using the MOP sharing model. The figure shows that the observed black curve cannot be used to predict the RPU at maturation level Ω* = 25 months, either for the observed or for the forecasted TIS points. The observed RPU is far below the mature RPU, and it misses a major event at TIS 12 months.

FIGURE 7 A real case study that demonstrates the forecasted cumulative repairs per unit (RPU) from maturation level Ω = 13 months to level Ω* = 25 months as a function of time-in-service (TIS). Red cross: mature RPU curve. Black circle: predicted RPU curve. Solid blue line: scaled historical RPU curves (closest cluster of similar historical RPU curves). Vertical black line: forecasted TIS to the right. Here, we focus on three prior maturation levels to measure the dissimilarities between components (Δ = 3).

FIGURE 8 A real case study that demonstrates the forecasted cumulative repairs per unit (RPU) from maturation level Ω = 11 months to level Ω* = 25 months as a function of time-in-service (TIS). Red cross: mature RPU curve. Black circle: observed immature RPU curve. Solid blue line: predicted RPU using month of production (MOP) sharing model. Here, we focus on three prior maturation levels to measure the dissimilarities between components (Δ = 3).

FIGURE 9 Histogram of relative errors for 32 real case studies at forecasted time-in-service = 25 months, adjusting for maturity from Ω = 13 months to Ω* = 25 months. Right figure: without month of production (MOP) sharing model. Left figure: with MOP sharing model.

Figure 9 shows the distribution of relative errors at TIS 25 months with and without the MOP sharing model for 32 case studies. The right figure shows the results without the MOP sharing model and the left figure shows the results with the MOP sharing model. The RPT CO represents the RPU at TIS 25 for maturation level Ω = 13 months and RPU final is the RPU at TIS 25 for the maturation level Ω* = 25 months. The figure clearly shows that the MOP sharing model reduces the relative error.

FIGURE 10 The plot shows R-squared values for 32 different vehicle lines using 10-fold cross-validation for ExtraTrees, ARIMA, LSTM, and Prophet models.

FIGURE 11 A scatter plot of the predicted vs. actual sales volume for a testing vehicle line over all MOP-TIS combinations (each represented by a point in the graph).
TABLE 1 General format of a typical failure data set at a given cutoff (observation time) Ω.
Abbreviation: TIS, time-in-service.