Research and discussion on the evaluation scheme of reagent lot‐to‐lot differences in 16 chemiluminescence analytes, established by the EP26‐A guidelines of the CLSI

Abstract
Background: Verification of new reagent lots is one of the crucial tasks in clinical laboratories. The Clinical and Laboratory Standards Institute (CLSI) EP26‐A guideline provides laboratories with an evaluation method for reagent lot verification. The purpose of this study was to compare the performance of EP26‐A with our laboratory's reagent lot verification protocol and to derive a final scheme.
Method: Sixteen chemiluminescence analytes, including estradiol (E2), progesterone (P), ferritin (FER), cortisol (COR), carbohydrate antigen 153 (CA153), and free prostate‐specific antigen (FPSA), were prospectively evaluated in two reagent lots. The laboratory's lot verification process consisted of testing 5 patient samples with the current and new lots and judging acceptability against predefined criteria. For EP26‐A, method imprecision data and critical differences at medical decision points were the important factors affecting the sample size requirements and rejection limits.
Result: The number of samples required for EP26‐A was 3 to 12; for P, CA153, and FPSA it increased by more than 5 samples compared with the current protocol. Of the 16 chemiluminescence analytes, 11 had higher rejection limits when using EP26‐A than with the current laboratory scheme. Our current protocol and EP26‐A were in agreement in 32 of the 32 (100%) paired verifications.
Conclusion: The EP26‐A protocol is an important tool for finding differences between reagent lots, and it addresses the weaknesses of the current protocol in statistical power, sample concentration and number, and the selection of rejection limits.

I2000 chemiluminescence instrument, and the other analytes were detected by Roche Cobas 8000 c601. 5,6 The reagents were all matched reagents from the respective manufacturers.

| Laboratory scheme
Samples from five clinically stable patients were selected for each analyte. The concentrations covered the measuring range of the assay, and each sample was tested with both the old and new lots. Bias was calculated based on the following formula:
One third of the total allowable error (TEa) of each analyte, as specified in the "Quality Evaluation Standard of Interventricular Quality in Clinical Examination," issued by the China Health Commission, was taken as the judgment standard. 7 If the bias was less than or equal to the standard in at least four of the five samples, the new reagent lot could be used. Otherwise, it was not acceptable.
This evaluation method will be referred to hereinafter as the old scheme.
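The acceptance rule of the old scheme can be sketched as follows. This is a minimal illustration, not the laboratory's actual implementation: the bias formula is not reproduced in the text above, so the sketch assumes the percent difference of the new-lot result relative to the old-lot result, and the function names and the example TEa value are hypothetical.

```python
def percent_bias(old_result, new_result):
    """Assumed bias definition: relative difference of new lot vs. old lot, in percent."""
    return (new_result - old_result) / old_result * 100.0


def old_scheme_accepts(pairs, tea_percent, min_pass=4):
    """Accept the new lot if at least `min_pass` samples have |bias| <= TEa/3.

    pairs: list of (old_lot_result, new_lot_result) for the patient samples.
    tea_percent: total allowable error of the analyte, in percent.
    """
    limit = tea_percent / 3.0
    n_within = sum(
        1 for old, new in pairs if abs(percent_bias(old, new)) <= limit
    )
    return n_within >= min_pass


# Hypothetical example: five samples, TEa = 25% (limit = 8.33%)
example_pairs = [(10, 10.5), (50, 52), (100, 95), (200, 230), (400, 410)]
print(old_scheme_accepts(example_pairs, tea_percent=25.0))
```

In this made-up example, four of the five samples fall within the limit, so the lot would be accepted under the four-of-five rule.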

| Lot verification based on the EP26-A guidelines
The critical difference (CD) was determined based on Appendix D1 of the EP26-A guideline. To reduce type I errors, the Z-value was set to 3.09, the one-sided value for a 99% confidence interval. Based on this, $\mathrm{CD} = 3.09 \times 1.41 \times \mathrm{CV}_{\mathrm{WRL}} = 4.36 \times \mathrm{CV}_{\mathrm{WRL}}$, where $\mathrm{CV}_{\mathrm{WRL}}$ is the within-reagent lot imprecision. Imprecision data came from performance verification. If the within-reagent lot imprecision and repeatability at the medical decision level could not be obtained directly, interpolation was used. If the medical decision level was within the laboratory-evaluated concentration range, the imprecision at the medical decision level was estimated with the TREND and OFFSET spreadsheet functions.
If the target concentration was not within the concentration range already evaluated in the laboratory, the imprecision at the nearest evaluated concentration was used for the evaluation. The statistical power was set to 0.80. This evaluation method will be referred to hereinafter as the new scheme.
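The CD calculation and the handling of the medical decision level can be illustrated with a short sketch. It assumes simple two-point linear interpolation between the neighbouring evaluated concentrations (an approximation of what a spreadsheet TREND call does with two points) and falls back to the nearest evaluated level outside the range. The function names and the example values are hypothetical; the multipliers 3.09 and 1.41 are those given in the text.

```python
def cv_at_level(levels, target):
    """Estimate CV (%) at `target` from evaluated (concentration, CV) pairs.

    Inside the evaluated range: linear interpolation between the two
    neighbouring levels. Outside the range: CV of the nearest level.
    """
    levels = sorted(levels)
    if target <= levels[0][0]:
        return levels[0][1]
    if target >= levels[-1][0]:
        return levels[-1][1]
    for (c_lo, cv_lo), (c_hi, cv_hi) in zip(levels, levels[1:]):
        if c_lo <= target <= c_hi:
            slope = (cv_hi - cv_lo) / (c_hi - c_lo)
            return cv_lo + slope * (target - c_lo)


def critical_difference(cv_wrl, z=3.09, sqrt2=1.41):
    """CD = z * sqrt(2) * CV_WRL, using the rounded constants from the text."""
    return z * sqrt2 * cv_wrl


# Hypothetical imprecision data: (concentration, within-reagent-lot CV in %)
evaluated = [(10.0, 6.0), (50.0, 4.0)]
cv_mdl = cv_at_level(evaluated, target=30.0)
print(cv_mdl, critical_difference(cv_mdl))
```

Keeping the rounded factor 1.41 reproduces the 4.36 multiplier quoted above; using the exact square root of 2 would give a slightly larger CD.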
The sample number and judgment limit of each analyte were obtained following the steps detailed in Figure 1. In the actual evaluation, the samples selected according to these results were tested with the new and old lots. The bias was calculated based on the formula mentioned above. The average bias at the medical decision level was obtained and compared with the judgment limit, which was taken from the look-up table. If the average bias was less than the judgment limit, the new lot was acceptable. Otherwise, it was unacceptable.
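The decision step described above can be sketched as follows. The sketch is hedged in two ways: the bias is assumed to be the percent difference of the new lot relative to the old lot, and the comparison uses the absolute value of the average bias, which the text does not state explicitly. The judgment limit is passed in as a number because it comes from the EP26-A look-up table, which is not reproduced here.

```python
from statistics import mean


def new_scheme_accepts(pairs, judgment_limit_percent):
    """Accept the new lot if the mean bias at the MDL is below the judgment limit.

    pairs: (old_lot_result, new_lot_result) for samples near the medical
           decision level; the required number comes from the look-up table.
    judgment_limit_percent: rejection limit from the look-up table, in percent.
    """
    biases = [(new - old) / old * 100.0 for old, new in pairs]
    # Assumption: the magnitude of the average bias is what is compared.
    return abs(mean(biases)) < judgment_limit_percent


# Hypothetical example with three samples near the medical decision level
samples = [(10.0, 10.4), (10.0, 9.8), (10.0, 10.3)]
print(new_scheme_accepts(samples, judgment_limit_percent=5.0))
```

Unlike the four-of-five rule of the old scheme, this rule judges the average of all samples, so individual biases of opposite sign partly cancel.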

| Comparison of scheme effect
The two methods were compared with respect to the number of samples, the range of judgment limits, the judgment results obtained when the new scheme was applied to previous data, and the ratio of the previous evaluation bias to the judgment limit.

| Scheme decision
The final scheme was decided according to the clinical use of each analyte, the number of evaluation samples, and the judgment limit.
If there was no result in the look-up table, or if the look-up table required more than 30 evaluation samples for an analyte, the current method was retained. If the analyte was used for disease diagnosis, we chose the medical decision level as the evaluation concentration. Otherwise, we retained the current linear range.
In terms of the judgment limit, if the ratio of the new scheme's limit to the old scheme's limit was between 0.5 and 2, we chose the new scheme. Outside this range, we followed the analysis results.
Table 1 shows the results of the evaluation based on the new scheme. Sixteen of the 32 concentrations (50%) were interpolated to obtain the imprecision at the target concentration. The Sr/Swrl of all analytes was greater than 0.3, a judgment limit of 0.7 CD was used, and the required number of samples ranged from 3 to 12 per test. Ten analytes required five or fewer samples, and six analytes required more than five. CA153 and P required many more samples than the other analytes, with 11 and 12 samples, respectively. Table 2 shows that the number of samples required for six of the analytes was higher in the new scheme than in the old scheme. The analytes with the largest increase were P and CA153. The judgment limit of the new scheme was wider than that of the old scheme in 11 analytes and narrower in four analytes, and the judgment limit of the two schemes was the same for FPSA. In two analytes (P and VB12), the judgment limits of the two schemes differed by more than a factor of 2.
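The selection rules from the "Scheme decision" subsection can be summarized in a small sketch. The function name, the argument names, and the returned labels are hypothetical, and the sketch assumes that the bounds 0.5 and 2 are inclusive, which the text does not specify.

```python
from typing import Optional


def choose_scheme(
    new_sample_size: Optional[int],
    new_limit: Optional[float],
    old_limit: float,
) -> str:
    """Return "old", "new", or "analyze" following the scheme-decision rules.

    new_sample_size / new_limit are None when the look-up table has no result.
    """
    # No result in the look-up table: keep the current method.
    if new_sample_size is None or new_limit is None:
        return "old"
    # More than 30 samples required: keep the current method.
    if new_sample_size > 30:
        return "old"
    # Judgment-limit ratio (new / old) between 0.5 and 2: adopt the new scheme.
    ratio = new_limit / old_limit
    if 0.5 <= ratio <= 2:
        return "new"
    # Otherwise the analyte needs a case-by-case analysis.
    return "analyze"


print(choose_scheme(new_sample_size=5, new_limit=12.0, old_limit=8.0))
```

An analyte whose limits differ by more than a factor of 2, as reported for P and VB12, would fall into the "analyze" branch of this sketch.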

| Using the new scheme to evaluate historical data
In 2019, a total of 49 lots covering the 16 analytes were evaluated, involving 245 tests. Only 62 tests, covering 14 analytes, were at the medical decision level, accounting for 25.3% of the total number of tests.
Previous sample results at the medical decision level were judged by the new scheme (Table 3). All 32 paired tests (100%) were consistent and passed the evaluation.

| Getting the final scheme according to the actual situation
The final proposal is shown in Table 4.

FIGURE 1 Evaluation steps of the improved EP26-A protocol. MDL, medical decision level; Sr, repeatability (within-run imprecision); Swrl, within-reagent lot imprecision; CD, critical difference

Owing to changes in raw materials during the production process and the decline in activity during transportation and storage, clinical laboratories should verify the performance of any new lot before using it for the detection of clinical samples. Between-reagent lot variation can affect results for QC materials, patient samples, or both. It is possible that a difference in patient sample results occurs between two different reagent lots while no difference is seen in QC results. This is because the manufacturing process for QC materials has a significant impact on the matrix of these samples, and the reagent manufacturer's first concerns must be accuracy and consistency with patient sample results. In addition, QC material supplied with the reagents may be "optimized" to perform correctly with each new reagent lot. Therefore, it is important that reagent lot-to-lot evaluations be performed using patient samples for all reagent lot changes. Ideally, the same evaluation scheme would be used by multiple users of reagents from the same manufacturer. Problems with abnormal lots can then be found. 8

From the results of the evaluation, the number of samples required for FPSA, CA153, and P was higher by more than five each. For more than 90% of the concentrations, the ratio of the judgment limits of the two methods was between 1 and 2. These findings indicate that the allowable limit of the new scheme is generally wider than that of the old scheme. Since the judgment limits for P and VB12 differed by more than a factor of 2, we analyzed the reasons from the following perspectives:
1. Swrl was too large.
The calculation of inter-batch imprecision was confounded by reagent lot-to-lot differences: in the past, the period over which inter-batch imprecision was calculated in our laboratory was delimited only by the replacement of quality control lots. The problem is that reagent lot-to-lot differences inflate the inter-batch imprecision. The bias caused by lot-to-lot differences should be regarded as test noise, and the smaller it is, the better. The appropriate statistical interval for inter-batch imprecision should be delimited by any replacement of either the quality control lot or the reagent lot. Reviewing the quality control history, the CVs of the two reagent lots for low-concentration P were 5.82% and 6.02%, respectively, and the total CV was 5.98%. The CVs of the three lots for high-concentration VB12 were 5.02%, 4.35%, and 3.27%, and the total CV was 5.62%; each total CV was less than 1/3 of the maximum allowable error. We compared the inter-batch precision with that of other laboratories in Jinyu. For P, the range was 2.48% to 7.52%, and at our laboratory it was 66%. For VB12, the range was 3.1% to 6.28%, and at our laboratory it was 58%. Therefore, we need to look for ways to decrease Swrl.
2. Whether the maximum allowable error at this concentration was appropriate. Referring to the external quality assessment standard of ESfEQA GmbH in Germany, the judgment limit of P at 0.47 ng/ml is 0.3 ng/ml, and the judgment limit of VB12 at 992 pg/ml is 89 pg/ml. In our laboratory, we use the percentage form of the maximum allowable error for P, so the limit at low concentration is narrow.
Finally, we tried to evaluate the above two analytes using the method proposed by George Klee, which calculates the allowed difference based on historical patient data. Based on this method, the allowed difference for low-concentration P was 0.06 ng/ml, which was between the CDs of the two schemes. The allowed difference for high-concentration VB12 was 152.25 pg/ml, which was closer to the EP26-A result.
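The first point above, that a shift between reagent lots inflates an imprecision estimate calculated across lots, can be illustrated numerically. The values below are invented purely for illustration and are not laboratory data; the sketch uses the sample standard deviation from Python's `statistics` module.

```python
from statistics import mean, stdev


def cv_percent(values):
    """Coefficient of variation in percent (sample SD / mean * 100)."""
    return stdev(values) / mean(values) * 100.0


# Invented QC results: same scatter within each lot, lot B shifted upward
lot_a = [9.6, 10.0, 10.4, 9.8, 10.2]
lot_b = [v + 1.0 for v in lot_a]

cv_a = cv_percent(lot_a)
cv_b = cv_percent(lot_b)
cv_total = cv_percent(lot_a + lot_b)  # CV calculated across both lots

print(round(cv_a, 2), round(cv_b, 2), round(cv_total, 2))
```

In this toy data set the CV across both lots is larger than either within-lot CV, which is why the statistical interval should be delimited by reagent lot changes as well as QC lot changes.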
To sum up, we chose to use the EP26-A method for subsequent verification.
When the new scheme was used to evaluate the historical data, all the previous data passed, but only 25.3% of the tests were at the medical decision level. The number of samples required for FER, AFP, CEA, FPSA, CA153, and P was greater than 5, so a significant lot-to-lot difference could not have been detected with the small amount of previous data.
In choosing the final scheme, we considered that CEA, CA125, CA153, and CA199 are used clinically mainly for monitoring. For CEA and CA153, the judgment limit ratio was greater than 1 and the ratio of previous evaluation bias to judgment limit was more than 50%, so we chose the new scheme. For CA125 and CA199, the ratio of past bias to the judgment limit was -11.22% and 0.7%, respectively, so both schemes are acceptable.
In this article, the EP26-A method was evaluated for its applicability to 16 chemiluminescence analytes in our laboratory. We think that the new scheme is very helpful for the quality improvement of the laboratory. However, for some analytes it requires more samples than the old scheme and increases the complexity and cost of laboratory evaluation.
In conclusion, reasonable and scientific evaluation is important and urgent, and the EP26-A scheme is an important tool for finding differences between reagent lots. Implementing the scheme can promote the applicability of laboratory review quality indicators and benefit the quality risk education of personnel.

CONFLICT OF INTEREST
The authors declare that there are no competing interests associated with the manuscript.

DATA AVAILABILITY STATEMENT
The data that support the findings of this study are available from the corresponding author upon reasonable request.