Determination of tolerance limits for the reliability of semiconductor devices using longitudinal data

Design and production of semiconductor devices for the automotive industry are characterized by high reliability requirements: the proper functioning of these devices must be ensured over the whole specified lifetime. Manufacturers therefore subject their products to extensive testing procedures that simulate the tough conditions the products have to withstand. Such tests are typically highly accelerated, so that the behavior of the products over their whole lifetime can be assessed. When electrical parameters drift, manufacturers need to find appropriate tolerance limits for their final electrical product tests such that the proper functioning of the devices over the whole specified lifetime is ensured. In this study, we present a statistical model for the determination of tolerance limits that minimize yield loss. The model considers longitudinal measurements of continuous features, based on censored data from stress tests. The tolerance limits are derived from multivariate distributions whose dependence structure is described by different copulas. Based on extensive numerical testing, we provide insights into the properties of our model for different drift behaviors of the devices.


INTRODUCTION AND PROBLEM DESCRIPTION
Semiconductor devices are used as key construction elements for a multitude of products such as cars, mobile phones, and personal computers. In most cases, these products heavily rely on the proper functioning of semiconductor devices. Therefore, procedures to ensure prespecified reliability requirements for these devices are of crucial importance for quality control of a wide range of electronic products.
Since semiconductor devices are key elements of many electronic products, each device needs to meet strictly defined requirements over its whole specified lifetime. As malfunctions can be caused by an abnormally high or low performance in one of the physical characteristics of the device, lower and/or upper specification limits S = [LSL, USL] are defined for each electrical feature of the device. To ensure that the devices will meet these strict requirements over their whole specified lifetime, manufacturers draw random samples from their production line and conduct a number of stress tests on them. These stress tests are highly accelerated, letting the devices age artificially fast by exposing them to high temperatures or voltages, for instance (see, eg, Elsayed 1 ). By means of these stress tests, the behavior of the devices during their whole specified lifetime can be simulated. Based on the measurements of these stress tests, we calculate tolerance limits L = [LTL, UTL] such that L ⊆ S and the respective parameter stays within S over the whole specified lifetime with a probability close to 1. More specifically, the probability that a physical parameter of a device moves outside the specification limit over its specified lifetime is, eg, 1 ppm. Using L, which is calculated based on the devices' measurements from the stress tests, every device leaving the production process is tested (without entering the stress test procedures) before it is delivered to a customer. If the measurements of all tests are within their respective L, the device is ready for further processing and shipment. If at least one of the physical parameters of a device is outside L, the device cannot be delivered and will be scrapped.
The present study addresses improvements of this quality control procedure by calculating optimal tolerance limits L ⊆ S such that all subsequent measurements are within S with a probability higher than q (for example, q could be 1 − 1 ppm) to ensure the specified functionality of the device. Thus, from a computational perspective, our study places considerable demands on accuracy, as the resulting probabilities need to be evaluated to very high precision.
Issues of quality control for semiconductor devices have attracted much attention in both academic research and practice. As highlighted in Jeong, 2 guardbanding, ie, setting tolerance limits tighter than the specification limits (see Figure 1), is a crucial tool for controlling future product performance. When setting tolerance limits, manufacturers need to balance stringent reliability requirements against an unnecessarily high yield loss and the resulting costs. This trade-off between reliability and yield considerations is described in Kuo. 3 However, in contrast to Jensen, 4 our tolerance limits are calculated based on longitudinal data from accelerated stress tests. While there is a large body of literature on the design of accelerated stress tests and the determination of stress levels, 5-7 the focus of this study is on the computation of tolerance limits based on given test conditions and sample sizes. In this work, we directly model the observations from stress tests using multivariate probability distributions. In this respect, our study differs from Zhai, 8 who compute tolerance limits based on models for degradation paths of components. Also focusing on degradation processes, Rafiee et al 9 study how the reliability of Micro-Electro-Mechanical Systems is influenced by several shock patterns.

Figure 1. Usage of guardbanding for the early separation of conforming and nonconforming devices. LSL, lower specification limit; LTL, lower tolerance limit; USL, upper specification limit; UTL, upper tolerance limit.
The studies of Healy, 10 Kim, 11 Chou, 12 Albers, 13 and Albers 14 are perhaps most closely related to the focus of our work, as these studies also focus on calculating optimal guardbands or tolerance limits. However, these studies differ from ours in 2 major aspects. First, these articles have a slightly different focus than our study: While Kim 11 specifically considers economic aspects of the inspection procedure, Albers 13 and Albers 14 tackle the calculation of guardbands for a characteristic of a device that may be difficult or costly to measure. They do so by investigating how the measurement of a physical parameter can be bypassed by measuring another parameter that is less costly to measure and is highly correlated with the parameter of interest. Our quality control task has a focus that is new to the literature, as we work with longitudinal datasets in which a number N of devices entering the stress test procedure is measured over several time periods t ∈ {1, … , T}. Based on the observations from these stress tests, we calculate optimal tolerance limits that can then be used by automated test equipment to conduct final testing of newly fabricated devices. Second, this focus also differentiates the present study from those of Healy 10 and Chou. 12 These studies assume multivariate or bivariate normal distributions of the measurements, whereas the present study considers multivariate distributions constructed by means of copulas with marginal distributions that allow for skewed data.
In our case, the number N of tested devices is relatively large (typically 77 for devices used in the automotive industry), and the number of measurements T is relatively small (in many cases below 4). Therefore, the task at hand is also beyond an analysis of variance for repeated measurements, 15 as these approaches generally do not provide the distribution functions necessary for guardband modeling. Usually, the time dependence of measurements is modeled using Markov chain models or stochastic processes. In our case, where N is large and T is small, these models are not applicable, since the small number of observations over time would result in unstable estimators that could not support our high reliability requirements.
Fatahi 16 and Long 17 apply copulas to dependence structures in control charts. Copula models are a highly flexible and convenient way of describing the dependence structure of a multivariate distribution independently of the marginal distributions. While copulas have become a well-established modeling tool in finance, copula models are becoming more and more popular in reliability modeling, as can be seen in the recent studies of Hsu 18 and Zhang. 19 The aim of this study, which was jointly conducted with Infineon Technologies Austria AG, is to find maximal and minimal allowable tolerance limits UTL and LTL for the initial measurements, ie, an upper and a lower guardband UG = USL − UTL and LG = LTL − LSL, respectively, such that the reliability and conformity requirements of the devices are fulfilled over their whole specified lifetime. Denoting the vector of measurements at time t by X_t, the main focus is, hence, to find L ⊆ S for a specific characteristic such that

P({X_1, … , X_T ∈ S} | {X_0 ∈ L}) ≥ q (1)

holds true. In Equation 1, q reflects the reliability requirement that needs to be ensured, where (1 − q) is predefined and represents the maximal allowable failure probability over all subsequent measurements. Additionally, as an unnecessarily restrictive choice of L will lead to a high yield loss, we require that P(X_0 ∈ L) should be as large as possible. This leads to the following optimization problem:

max_{L ⊆ S} P(X_0 ∈ L) subject to P({X_1, … , X_T ∈ S} | {X_0 ∈ L}) ≥ q. (2)

Figure 1 illustrates a case with T = 3 repeated measurements in which one device drifts outside of S in period t = 2, and demonstrates how guardbanding is used to eliminate such nonconforming devices. The tolerance limits calculated by solving optimization problem (2) should ensure that such devices do not successfully leave the production process. As there is no information available on the behavior of the characteristics between the measurement times, Figure 1 assumes linear interpolation between the measurement times.

THE MODEL
This section is organized in 3 parts: Section 2.1 first proposes a way to correct the observed measurements, dealing with issues of gage repeatability and reproducibility as well as with measurements of a control group of devices. The devices in this control group do not enter the stress test procedures and are primarily used to detect artificial drift of electrical parameters, ie, drift behavior that is not caused by the stress test. Based on the corrected measurements, Section 2.2 then formulates a model for the calculation of optimal tolerance limits that ensure the reliability and conformity requirements for the devices over their whole specified lifetime. To evaluate the performance of the proposed models, Section 2.3 presents the results of 2 simulation studies, numerically calculating optimal tolerance limits for a wide range of possible drift behaviors of devices.

Correction of measurements for artificial drift and GR&R
Before the calculation of tolerance limits can be done, the observed measurements have to be corrected for errors that may arise from the testers. As different testers may be used in each time period, or changing external factors may affect the measurement process, a control group of devices is formed. These devices are not stressed and are used for reference purposes to detect artificial drift, ie, drift of electrical characteristics that is not caused by the stress test. Hence, if external influences could be ruled out, the observations of the control devices, Ĉ_t, would be constant over time. Using the control group measurements, observed (and potentially biased) changes from measurement time t − 1 to t, X̂_t − X̂_{t−1}, are equal to the true changes of the unobserved and unbiased variables X_t plus the difference in the mean of the control group measurements, E(Ĉ_t − Ĉ_{t−1}). More formally, our transformation has the following rationale:

E(X̂_t − X̂_{t−1}) = E(X_t − X_{t−1}) + E(Ĉ_t − Ĉ_{t−1}). (3)

Note that at time t = 0, ie, for the initial measurements immediately after production, we conduct no transformation, because the devices have not been exposed to any stress levels and, thus, should have characteristics comparable to the devices in the control group. By replacing the mean values of the biased and unbiased observations in (3) and rearranging terms, the logic behind Equation 3 is translated into the following transformation:

X_t = X_{t−1} + (X̂_t − X̂_{t−1}) − E(Ĉ_t − Ĉ_{t−1}), with X_0 = X̂_0. (4)

Figure 2 shows several illustrative patterns of changes in the measurements of the test group and the control group as well as the resulting transformed measurements. If the changes in the control group and the test group are both increasing or both decreasing, we can conclude that the changes are solely caused by external influences and should not be considered in the further calculation. This situation can be observed in Example 3, where the changes in the true and corrected measurements are nullified.
On the contrary, when the changes of the control group devices and the biased test group devices have opposite signs, ie, one is increasing and the other is decreasing, the changes in the true and corrected measurements are artificially increased (see Examples 1 and 2 of Figure 2). This can theoretically produce negative corrected measurements even if the physical characteristics of the measurements only allow nonnegative values.
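The correction described above can be sketched in a few lines. The following is a minimal illustration, assuming the observed test-group measurements X̂_t and control-group measurements Ĉ_t are stored as NumPy arrays with one row per measurement time; the function name and array layout are our own choices, not part of the original procedure.

```python
import numpy as np

def correct_measurements(X_hat, C_hat):
    """Correct stressed-device measurements for artificial drift using a
    control group: X_t = X_{t-1} + (X_hat_t - X_hat_{t-1}) - mean change of
    the control group, with X_0 = X_hat_0 (no correction at time 0).

    X_hat : (T+1, N) observed measurements of the N stressed devices
    C_hat : (T+1, M) observed measurements of the M control devices
    """
    X = np.empty_like(X_hat, dtype=float)
    X[0] = X_hat[0]  # initial measurements are taken as unbiased
    for t in range(1, X_hat.shape[0]):
        # shift in the control-group mean estimates the artificial drift
        drift = C_hat[t].mean() - C_hat[t - 1].mean()
        X[t] = X[t - 1] + (X_hat[t] - X_hat[t - 1]) - drift
    return X
```

When test group and control group shift by the same amount (as in Example 3 of Figure 2), the corrected measurements stay constant over time.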
Based on sample datasets from previous stress tests provided by Infineon Technologies Austria AG, many observations X̂_t show a significant amount of skewness, which leads to a rejection of the normality assumption in many cases. Therefore, we model X̂_t using the skew normal distribution, ie, X̂_t ∼ SN(ξ_t, ω_t², α_t). The pdf of the skew normal distribution is defined by f(x) = 2φ(x)Φ(αx), with φ(z) and Φ(z) as the density and the distribution function of the standard normal distribution, respectively. In this density function, the parameter α mainly determines the skewness, where the coefficient of skewness is bounded on the interval [−0.9953, 0.9953]. 20 Using the transformation X = ξ + ωZ, this distribution is then generalized to allow for a mean and standard deviation different from 0 and 1. For a comprehensive discussion of the properties of the skew normal distribution, we refer to the work of Azzalini. 20 Assuming that all measurements are scattered by an additive error term ε ∼ N(0, σ²_GR&R) that reflects gage repeatability and reproducibility (GR&R), Equation 5 gives the distribution of the corrected measurements X_t, based on the transformation from (4) and using Proposition 2.3 of Azzalini. 20
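To illustrate the marginal model, the following sketch fits a skew normal distribution to a simulated skewed sample using SciPy's `skewnorm`, whose pdf matches the 2φ(x)Φ(αx) form above (SciPy reports the fitted triple as shape, location, scale, corresponding to α, ξ, ω); the sample and all parameter values are illustrative, not data from the study.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# draw a skewed sample: SN with shape alpha = 4, location xi = 10, scale omega = 2
sample = stats.skewnorm.rvs(4, loc=10, scale=2, size=2000, random_state=rng)

# maximum-likelihood fit of SN(xi, omega^2, alpha)
alpha_hat, xi_hat, omega_hat = stats.skewnorm.fit(sample)
```

The fitted shape parameter recovers the sign and rough magnitude of the skewness, which is what the bound of ±0.9953 on the skewness coefficient constrains.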

Model formulation and solution
To solve optimization problem (2), we start with a reformulation of the constraint that ensures that the reliability requirements are fulfilled, using the definition of conditional probabilities. For the computation of this probability, distributional assumptions for the multivariate random vector (X_0, … , X_T) are of crucial importance. As indicated in Section 2.1, real-world data from previous stress tests provided by Infineon Technologies Austria AG indicate that many observations X_t are significantly skewed and that the correlations between the measurement times are high in many cases. To construct such a multivariate distribution, we use copula modeling, which is based on a decomposition of a multivariate density into a term that accounts for the dependence and the marginal distributions that account for the marginal behavior of each variable.
A copula is a multivariate distribution, C, with marginals uniformly distributed on [0, 1]. Sklar 21 proved that every multivariate distribution F with marginals F_1, … , F_T can be written as

F(x_1, … , x_T) = C(F_1(x_1), … , F_T(x_T)).

The joint density function f for a continuous distribution function F with strictly increasing, continuous marginal distribution functions F_t and densities f_t is then given by

f(x_1, … , x_T) = c(F_1(x_1), … , F_T(x_T)) · f_1(x_1) ⋯ f_T(x_T),

where c denotes the copula density. While Nelsen 22 provides a comprehensive source of reference on copula theory in general, information on likelihood estimation of multivariate copula models can be found in Hofert. 23 While the skew normal distribution is capable of fitting almost all observations at a certain point in time reasonably well, the time dependence of the measurements that is modeled by a copula changes from case to case. As this quality control task requires a very high level of accuracy in calculating the probabilities in inequality (6), the search for an optimal copula model is restricted to Archimedean copulas such as the Gumbel-Hougaard, Clayton, Ali-Mikhail-Haq, and Frank copulas, since closed-form distribution functions can be derived in these cases.
In these cdfs, the u_i's denote the uniformly distributed margins and θ denotes the parameter of the respective copula. Using these closed-form expressions for the distribution functions, the probabilities necessary for the calculation of the optimal tolerance limits do not rely on numerical integration. Based on the closed-form expressions for the cdfs, Appendix B details the computation of the probabilities from the multivariate distributions. In the simulations presented in Section 2.3, we have chosen an appropriate copula by comparing the values of the maximized likelihood functions of each copula model.
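As an illustration of such closed-form Archimedean cdfs, the following sketch implements two of the named families, the Clayton and the Gumbel-Hougaard copulas, in their standard textbook parameterizations; these are generic formulas, not code from the study.

```python
import math

def clayton_cdf(u, theta):
    """Closed-form Clayton copula C(u_1, ..., u_d) for theta > 0."""
    s = sum(ui ** (-theta) for ui in u) - len(u) + 1
    return s ** (-1.0 / theta)

def gumbel_cdf(u, theta):
    """Closed-form Gumbel-Hougaard copula for theta >= 1;
    theta = 1 reduces to the independence copula."""
    s = sum((-math.log(ui)) ** theta for ui in u)
    return math.exp(-s ** (1.0 / theta))
```

Because these cdfs are available in closed form, the rectangle probabilities needed for the tolerance-limit search can be evaluated without numerical integration, as stated above.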
Note that the tolerance limits computed by solving optimization problem (2) are based on the corrected measurements described in Section 2.1. Thus, before the tolerance limits can be used by automated test equipment to conduct final testing of devices, they have to be transformed back from the corrected measurements X_t to the observed measurements X̂_t. Let x̂_0^ATE denote an initial measurement of a certain characteristic of a device using automated test equipment. Then, the device fulfills the reliability requirements for this characteristic if x̂_0^ATE is at least LTL + F^{−1}(γ) + E(Ĉ_0^ATE) − E(Ĉ_0) and not larger than UTL + F^{−1}(1 − γ) + E(Ĉ_0^ATE) − E(Ĉ_0). In these limits, E(Ĉ_0^ATE) and E(Ĉ_0) denote the means of the control group measurements at time 0 using automated test equipment and those from the previous stress test, respectively, and F^{−1} denotes the inverse cdf of the normal distribution of the GR&R error ε ∼ N(0, σ²_GR&R). Here, γ is a prespecified quantile for the GR&R error that could, for example, be set to 1 − 1 ppm.
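The back-transformation can be expressed as a small helper; this is a sketch with illustrative argument names, assuming a normal GR&R error with standard deviation sigma_grr, and using the limit expressions quoted above.

```python
from scipy.stats import norm

def ate_test_limits(ltl, utl, sigma_grr, gamma, mean_c_ate, mean_c_stress):
    """Transform tolerance limits on corrected measurements back to limits
    for observed ATE measurements. All argument names are illustrative.
    gamma is the prespecified GR&R quantile (eg, 1 - 1 ppm)."""
    # offset between ATE and stress-test control-group means at time 0
    shift = mean_c_ate - mean_c_stress
    lo = ltl + norm.ppf(gamma, scale=sigma_grr) + shift
    hi = utl + norm.ppf(1.0 - gamma, scale=sigma_grr) + shift
    return lo, hi
```

With gamma close to 1, the quantile terms tighten the limits symmetrically, so measurement error cannot push a marginal device past the guardbands.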
By means of the estimates for the joint density of the vector of corrected measurements (X_0, … , X_T), it is then possible to solve optimization problem (2). However, this task is nontrivial since, in general, convexity of the constraint that ensures the reliability requirements cannot be established, and values of (6) for given LTL and UTL can only be computed numerically. One immediate heuristic is therefore to start with LSL and USL as lower and upper boundaries for LTL and UTL. A heuristic approximation of the optimal tolerance limits can then be found by iteratively restricting S with a constant stepsize s: in each step, S is restricted from above or from below, whichever keeps the objective function as large as possible. The procedure stops when S has been narrowed sufficiently for the reliability requirement to be fulfilled. Pseudocode for this procedure is given by Algorithm 1 in Appendix A.
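Since Algorithm 1 itself appears in Appendix A and is not reproduced here, the following is only a sketch of the iterative narrowing idea as described above, with the reliability constraint and the objective P(X_0 ∈ L) passed in as callables; all names are illustrative.

```python
def iterative_limits(lsl, usl, s, reliability, yield_prob, q):
    """Iterative narrowing heuristic: start at the specification limits and
    tighten by stepsize s from the side that hurts the objective least,
    until the reliability constraint holds.

    reliability(ltl, utl) -> P(X_1, ..., X_T in S | X_0 in [ltl, utl])
    yield_prob(ltl, utl)  -> P(X_0 in [ltl, utl])
    """
    ltl, utl = lsl, usl
    while reliability(ltl, utl) < q and ltl + s < utl - s:
        # tighten whichever side keeps the yield probability larger
        if yield_prob(ltl + s, utl) >= yield_prob(ltl, utl - s):
            ltl += s
        else:
            utl -= s
    return ltl, utl
```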
As the number of computations of inequality (6) becomes large when L is narrow compared to S, a more efficient way of solving optimization problem (2) can be to perform a grid search for the optimal interval L. In this heuristic, we start with a grid on [LSL, USL] × [LSL, USL] with n × n separations. In the neighborhood of the grid point where the constraint of the optimization problem is fulfilled and the objective function is as large as possible, another, finer grid is built. This procedure is continued until a further narrowing of the search grid no longer changes the solution. The procedure is described in more detail by Algorithm 2 in Appendix A. Since the computation of the reliability constraint is, in general, computationally intensive, the total execution time of both algorithms is mainly driven by the number of evaluations of this inequality. It is thus important to compare both algorithms in terms of efficiency to learn about their respective benefits and drawbacks.
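The nested grid search idea can likewise be sketched (Algorithm 2 is in Appendix A and not reproduced here), again with the constraint and objective supplied as callables; the refinement rule around the incumbent point is a simplification of the described neighborhood search.

```python
def grid_search_limits(lsl, usl, n, k, reliability, yield_prob, q):
    """Nested grid search: lay an n-separation grid over [LSL, USL],
    keep the feasible (LTL, UTL) pair with the best objective, then
    refine the grid around it k more times."""
    lo, hi = lsl, usl
    best = (lsl, usl)
    for _ in range(k + 1):
        h = (hi - lo) / n
        candidates = [
            (lo + i * h, lo + j * h)
            for i in range(n + 1)
            for j in range(i, n + 1)  # enforce LTL <= UTL
        ]
        feasible = [c for c in candidates if reliability(*c) >= q]
        if not feasible:
            break
        best = max(feasible, key=lambda c: yield_prob(*c))
        # refine the search interval around the current best point
        lo = max(lsl, best[0] - h)
        hi = min(usl, best[1] + h)
    return best
```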
With n partitions of the interval [LSL, USL] and k repetitions of the grid search procedure, the respective reliability inequality needs to be computed c_gs = n(n + 1)/2 + n²(k + 1) times. This follows from considering that the computation of the first grid requires n(n + 1)/2 evaluations because of LTL ≤ UTL, and assuming that all subsequent computations of the grid require n² evaluations of the inequality. As the search procedure is conducted until the stop criterion is met, there will be k + 1 repetitions in total. Such a grid search procedure will result in an error square for the 2 variables that is not larger than (2 · (USL − LSL)/n^k)². With a stepsize s, the iterative procedure requires c_it = ((USL − UTL*) + (LTL* − LSL) + 1)/s computations of the reliability inequality and yields an error square of (2s)². Obviously, the iterative procedure will be less computationally expensive if the optimal tolerance limits are only slightly narrower than the specification limits S. In contrast, the grid search procedure will be less computationally expensive in situations where L is much narrower than S. It is thus straightforward to combine both approaches by building an initial grid that allows for a rough estimate of future computations of the reliability inequality. Let (x*_(0), y*_(0)) denote the initial guess of the tolerance limits after the initialization step, and let h = (USL − LSL)/n. Then, the number of evaluations of the reliability constraint is given by (7).
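The two evaluation counts can be written down directly; the following helpers simply implement the formulas for c_gs and c_it as stated above, so that the cheaper procedure can be identified for given parameters.

```python
def grid_search_cost(n, k):
    """c_gs = n(n+1)/2 evaluations for the first triangular grid (LTL <= UTL)
    plus n^2 evaluations for each of the k+1 refined grids."""
    return n * (n + 1) // 2 + n ** 2 * (k + 1)

def iterative_cost(lsl, usl, ltl_opt, utl_opt, s):
    """c_it = ((USL - UTL*) + (LTL* - LSL) + 1)/s evaluations, ie, the total
    narrowing distance (plus one) divided by the stepsize s."""
    return ((usl - utl_opt) + (ltl_opt - lsl) + 1) / s
```

As the text notes, c_it is small when the optimal limits sit close to the specification limits, while c_gs is insensitive to how narrow L ends up being.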
Assume that the optimal tolerance limits are in the proximity of the initial guess from the grid search. Then, for a given stepsize s of the iterative procedure and n partitions of the search interval for the grid search, it is computationally less expensive to continue with the grid search if condition (8) is fulfilled. This criterion can be used by a hybrid search algorithm that combines the iterative and the grid search procedures (see Algorithm 3 in Appendix A). For several choices of the stepsize s and several resolutions n of the search grid, the left panel of Figure 3 compares the efficiency of the 3 algorithms. Considering, on the one hand, the gain in efficiency provided by the hybrid algorithm and, on the other hand, the n(n + 1)/2 extra steps it requires compared to the iterative procedure, Figure 3 indicates parameter combinations where the hybrid algorithm is more or less efficient than the simple grid or iterative search procedure. Additionally, the left panel of Figure 3 reveals that the grid search procedure is especially efficient compared to the iterative process when the resolution of the grid is relatively low, ie, n is approximately lower than 10, and the desired accuracy of the solution is high (the stepsize s is small). The lower panel of Figure 3 sheds light on the optimal usage of the 3 algorithms for several specification limits LSL and USL: while the iterative search procedure is less computationally intensive when USL − LSL is relatively small, the hybrid algorithm will be faster if USL − LSL is relatively large.

Simulation results
Using our model and the proposed solution strategy, we test the performance of our technique for calculating optimal guardbands in the following simulation setup:

X_0 ∼ SN(μ, σ²_inter, α), X_t ∼ X_{t−1} + SN(0, σ²_intra, α), t = 1, … , T. (9)

Based on (9), we conduct 2 simulation studies: One set of simulations addresses the influence of the inter-device variability, ie, the spread σ_inter between the devices, on the resulting tolerance limits. The other set of simulations analyzes the model's performance when the variability σ_intra between the measurement times is varied. While Figure 4 shows the drift behavior of devices for selected intra- and inter-device variability as simulated from (9), Figure 5 shows the resulting tolerance limits and their performance evaluation.
The left panel of graphics in Figure 5 shows the results of variations of σ²_intra, ie, the intra-device variability, and the right panel of graphics shows the results of variations of σ²_inter, ie, the inter-device variability. While the top 2 plots depict how the length of L, ie, UTL − LTL, reacts to several choices of σ²_intra and σ²_inter, the lower 4 graphics contain performance evaluations of these simulations.
From these simulations, it can be seen that our model indicates stricter tolerance limits when the variability between the measurement times, σ²_intra, is increased than when the inter-device variability σ²_inter is increased. The middle 2 panels highlight that, while controlling P({X_1, … , X_T ∈ S} | {X_0 ∈ L}) ≥ q, the complementary probability, ie, P({X_1, … , X_T ∉ S} | {X_0 ∈ L}) ≤ 1 − q, may get relatively high. To evaluate this probability, the graphs in Figure 5 show the results of simulating the behavior of 10^7 devices and then analyzing which of these devices drift outside of S, given X_0 ∈ L. The results indicate that the model is better able to keep this probability constantly low when the inter-device variability σ²_inter is changed than in situations where the intra-device variability σ²_intra is changed. Also, by simulating the drift behavior of 10^7 devices for different values of σ²_intra and σ²_inter, we estimate the fraction of devices that have to be discarded because of the tolerance limits, ie, the resulting yield loss. As high values of the intra-device variability σ²_intra lead to tight tolerance limits, the resulting yield loss can become much higher in this case than in situations where the inter-device variability σ²_inter takes high values.

Figure 5. "□" denotes the results from our model for finding optimal tolerance limits, and "△" stands for the results when assuming UTL = USL and LTL = LSL. The left panel of graphics shows the results of changes in the intra-device standard deviation σ_intra, with X_0 ∼ SN(10, 2², 0) and X_t ∼ X_{t−1} + SN(0, σ²_intra, 0); the right panel of graphics shows the results of changes in the inter-device standard deviation σ_inter, with X_0 ∼ SN(10, σ²_inter, 0) and X_t ∼ X_{t−1} + SN(0, 0.2², 0).
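The simulation setup can be reproduced in a few lines; the following sketch draws device trajectories with SciPy's `skewnorm` (shape parameter 0, ie, normal increments, as in the Figure 5 setup) and estimates the yield loss for given tolerance limits. The device counts, seed, and limits here are illustrative, not those of the study.

```python
import numpy as np
from scipy import stats

def simulate_drift(n_dev, T, sigma_inter, sigma_intra, seed=0):
    """Simulate drift: X_0 ~ SN(10, sigma_inter^2, 0) and
    X_t = X_{t-1} + SN(0, sigma_intra^2, 0). With shape parameter 0 the
    skew normal reduces to the normal distribution."""
    rng = np.random.default_rng(seed)
    X = np.empty((T + 1, n_dev))
    X[0] = stats.skewnorm.rvs(0, loc=10, scale=sigma_inter,
                              size=n_dev, random_state=rng)
    for t in range(1, T + 1):
        X[t] = X[t - 1] + stats.skewnorm.rvs(0, loc=0, scale=sigma_intra,
                                             size=n_dev, random_state=rng)
    return X

def yield_loss(X, ltl, utl):
    """Fraction of devices whose initial measurement X_0 falls outside L."""
    inside = (X[0] >= ltl) & (X[0] <= utl)
    return 1.0 - inside.mean()
```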

CONCLUSIONS
Quality and reliability requirements are of crucial importance in the production of semiconductor devices, as the devices' reliability and conformity are key determinants for many complex products such as personal computers, cars, or telephones. To gather detailed information on the future behavior of their devices, manufacturers let their devices undergo stress test procedures that make the devices age artificially fast. This study proposes an approach for the calculation of optimal tolerance limits based on longitudinal datasets that arise from product stress tests. These tolerance limits ensure that the probability that the devices' physical characteristics remain within their specification limits is close to 1. Since the tolerance limits are used by automated test equipment to test in advance whether a device is expected to fulfill its reliability requirements, we require that these limits also minimize the potential yield loss that could arise from this procedure.
We compute these tolerance limits using a 2-stage procedure: In a first step, observations of devices' physical characteristics are corrected using measurements from a control group of devices and corrected for measurement errors (GR&R). The devices from the control group do not enter the stress test procedure and are kept solely for reference purposes to detect drift that may be caused by external factors. Based on these corrected measurements, optimal tolerance limits are calculated in a second step. This is done by modeling the corrected observations using multivariate distributions constructed from Archimedean copulas and skew normal marginal distributions. We propose a solution strategy for the resulting optimization problem and use it in a comprehensive simulation study that tests the performance of the tolerance limits when the devices' intra- and inter-device variability is changed. These simulations indicate that the model is capable of controlling the devices' reliability reasonably well.
This work can be extended in several directions: First, it is worthwhile to investigate effects that arise from censoring of observations. When manufacturers discard devices that drift outside of S at a certain point in time during the stress test, adequate probability distributions have to be found for the resulting interval-censored dataset. Another direction for future research is an adaptation of the model to incorporate consumers' future usage behavior in the computation of the tolerance limits. This could be especially relevant for manufacturers to prevent overengineering and to achieve a further reduction of the resulting yield loss from the quality control procedure.

APPENDIX B: CALCULATION OF PROBABILITIES FROM MULTIVARIATE DISTRIBUTIONS
For given lower and upper limits a and b, probabilities from univariate distributions can be computed using their cumulative distribution function F: P(a < X ≤ b) = F(b) − F(a). In the bivariate case, for the random variables (X_1, X_2), the probabilities are given by P(a_1 < X_1 ≤ b_1, a_2 < X_2 ≤ b_2) = F(b_1, b_2) − F(a_1, b_2) − F(b_1, a_2) + F(a_1, a_2).
In the multivariate case, this can be generalized to

P(a < X ≤ b) = ∑_{y ∈ {a_1, b_1} × ⋯ × {a_T, b_T}} (−1)^{|{a_1, … , a_T} ∩ {y_1, … , y_T}|} F(y), (B1)

where a, b ∈ R^T again denote the vectors of lower and upper limits and, trivially, a_i ≤ b_i for i = 1, … , T has to hold. In this formulation, |{a_1, … , a_T} ∩ {y_1, … , y_T}| gives the number of elements in y that are equal to the corresponding elements in a. An equivalent form of Equation B1 can be found, for example, in Bilodeau. 24 Note that for the numeric computations, the sum of the negative terms should be subtracted from the sum of the positive terms to minimize the numeric error that arises from cancellation.
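Formula B1, including the suggested separation of positive and negative terms, can be sketched as follows; the cdf F is passed in as a callable, and the function name is our own.

```python
from itertools import product

def box_probability(F, a, b):
    """P(a < X <= b) for a T-variate cdf F via inclusion-exclusion (B1):
    sum F over the 2^T corners of the box, with sign (-1)^(number of
    coordinates taken from the lower limit a). Positive and negative terms
    are accumulated separately to reduce cancellation error."""
    pos, neg = 0.0, 0.0
    for corner in product(*zip(a, b)):  # all 2^T corners of the box
        n_lower = sum(c == ai for c, ai in zip(corner, a))
        if n_lower % 2 == 0:
            pos += F(corner)
        else:
            neg += F(corner)
    return pos - neg
```

For T = 2 this reduces exactly to the bivariate expression F(b_1, b_2) − F(a_1, b_2) − F(b_1, a_2) + F(a_1, a_2).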