External quality assessment for PML‐RARα detection in acute promyelocytic leukemia: Findings and summary

Background Detection of PML‐RARα in acute promyelocytic leukemia (APL) is required to confirm the clinical diagnosis and molecular remission and for sequential minimal residual disease monitoring. The current status of PML‐RARα detection across laboratories remains unknown. Methods In 2018, an external quality assessment (EQA) for PML‐RARα detection was carried out in China. Three EQA sample panels, for PML‐RARα isoforms L, S, and V, were prepared from different mock leukocyte samples. The performance of PML‐RARα detection, including admission screening and qualitative and quantitative detection by real‐time quantitative reverse transcription PCR (RT‐qPCR), was assessed based on simulated APL clinical cases. Results The mock leukocyte samples met the requirements of a clinically qualified sample for the PML‐RARα EQA panel. Among the laboratories, 13/50 (26.0%) were classified as “competent,” 21/50 (42.0%) as “acceptable,” and 16/50 (32.0%) as “improvable.” One (1/50, 2.0%) laboratory reported one screening mistake. Twenty‐six (26/50, 52.0%) laboratories reported 29 false‐positive and 19 false‐negative results. Twenty‐three (23/50, 46.0%) laboratories reported 42 incorrect quantitative results. Conclusion No significant differences in PML‐RARα detection performance were found among laboratories that used different extraction methods. Qualitative and quantitative RT‐qPCR detection was less accurate for PML‐RARα isoform V. Quantitative variation was higher for low‐level samples. Continued external assessment and education are needed in the management of PML‐RARα detection.


| INTRODUCTION
Acute promyelocytic leukemia (APL) is a distinct subtype of acute myeloid leukemia (AML) with characteristic biological and clinical features, 1 comprising approximately 10% of de novo AML cases in younger adults. 2 The PML-RARα fusion gene (FG) is present in almost all APL cases and is a biomarker for APL diagnosis, disease burden, minimal residual disease (MRD) monitoring, and molecular remission. 5-7 Detection methods for t(15;17) or the PML-RARα FG include conventional chromosome analysis, fluorescence in situ hybridization, and polymerase chain reaction (PCR). Compared with conventional reverse transcription PCR (RT-PCR), real-time quantitative reverse transcription PCR (RT-qPCR) for PML-RARα has higher precision and reliability and is routinely used, especially in molecular hematology laboratories. 8
Clinical detection of the PML-RARα FG is important throughout APL management. APL can be diagnosed in patients with abnormal hematopoiesis and the characteristic cytogenetic abnormality t(15;17), regardless of the percentage of marrow blasts. 9 The PML-RARα FG transcript level reflects the leukemic blast load, quantitatively documents disease burden, and confirms molecular remission. 10 The goal of consolidation therapy for APL is a durable molecular remission, defined as undetectable PML-RARα FG. 7,11 Rigorous sequential MRD monitoring by RT-qPCR coupled with pre-emptive therapy can help reduce clinical relapse rates in APL patients. 5,8
External quality assessment (EQA) programs for conventional RT-PCR testing of the PML-RARα FG were first performed nearly 20 years ago. 12,13 These programs used RNA, cDNA, or plasmid as EQA samples and examined the heterogeneous sensitivities of PML-RARα FG RT-PCR detection. In 2003, the Europe Against Cancer (EAC) program established RT-qPCR standardization and quality control analysis for the PML-RARα FG transcript and recommended the ratio of FG copy number to control gene (CG) copy number (FG_CN/CG_CN) as the PML-RARα FG transcript level. 14,15 The MRD value is the ratio between the FG transcript level in the follow-up sample, (FG_CN/CG_CN)_FUP, and that in the diagnostic sample, (FG_CN/CG_CN)_DX. 14,15 These studies improved the sensitivity and accuracy of PCR detection of the PML-RARα FG, especially by EAC-sanctioned RT-qPCR.
However, these programs had limitations. Because of some detection defects, total RNA, cDNA, recombinant plasmid, and NB4 cells were not suitable as EQA samples. 16,17 Little is known about the evaluation of PML-RARα isoform V detection. These EQA programs assessed only the accuracy of the RT-PCR or RT-qPCR methodology and did not analyze MRD monitoring results for PML-RARα based on APL clinical information. 5-7 The EQA scoring criteria for BCR-ABL1 are unsuitable for PML-RARα, because only the accuracy of quantitative RT-qPCR detection was analyzed, with no admission screening or qualitative testing. 18 We prepared MS2 armored RNAs for the PML-RARα FG transcript, the CG transcript, and 23S rRNA. Armored RNAs are stable, nuclease-resistant, precisely quantifiable synthetic RNAs that have already been used as BCR-ABL1 and control gene standards. 19 for isoforms L/S/V 6,7 (see Appendix S1).
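The transcript level and MRD value defined in the Introduction reduce to two ratios. The following is a minimal sketch of that arithmetic, not the EAC protocol itself; the function names and all copy numbers are hypothetical and chosen only for illustration.

```python
def normalized_level(fg_copies: float, cg_copies: float) -> float:
    """Return the FG_CN/CG_CN ratio for one sample."""
    if cg_copies <= 0:
        raise ValueError("control gene copy number must be positive")
    return fg_copies / cg_copies


def mrd_value(fup_fg: float, fup_cg: float, dx_fg: float, dx_cg: float) -> float:
    """Return (FG_CN/CG_CN)_FUP divided by (FG_CN/CG_CN)_DX."""
    dx_level = normalized_level(dx_fg, dx_cg)
    if dx_level <= 0:
        raise ValueError("diagnostic sample must be FG positive")
    return normalized_level(fup_fg, fup_cg) / dx_level


# Hypothetical copy numbers (not taken from the study).
dx_level = normalized_level(5.0e4, 1.0e5)       # diagnostic sample
mrd = mrd_value(50.0, 1.0e5, 5.0e4, 1.0e5)      # follow-up relative to diagnosis
print(f"DX level = {dx_level:.3f}, MRD value = {mrd:.1e}")
```

Dividing by the control gene copy number in both samples compensates for differences in RNA input and quality between the diagnostic and follow-up specimens.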

| Preparation and evaluation of mock leukocyte samples for EQA panel
Total RNA extracted from bone marrow (BM) was divided into three components: PML-RARα FG transcript RNA, CG transcript RNA, and other non-target RNA. MS2 virus-like particles packaging these RNAs were expressed and purified as previously described. 22,23 The EQA panel was evaluated using a routine detection process.
Total RNA was extracted with TRIzol reagent and a spin column and quantified using a NanoDrop 2000c (Thermo Fisher). Qualitative and quantitative detection of the PML-RARα FG and CG was performed by one-step or two-step RT-qPCR on an ABI 7500 instrument (Applied Biosystems) according to the manufacturer's instructions.

| Organization of the EQA
Before sample processing, the EQA samples were to be centrifuged at 12 000 r/min for 1 min; they required neither reconstitution nor lysis of red blood cells. Total RNA was extracted using the routine operating procedure of each laboratory. Participating laboratories first performed screening tests on the admission samples (A1711, B1721, and C1731) based on the simulated APL clinical cases; RT-qPCR was then carried out, and the FG_CN/CG_CN ratio and MRD value were calculated. EQA panel A or B was randomly assigned to each participant, and EQA panel C was delivered to all laboratories.
Each participant was asked to report the results on the data sheet within 2 weeks.

| Laboratory performance scoring
Accurate detection of the PML-RARα FG is a prerequisite for APL diagnosis and MRD monitoring. 6,7,9 Any result that differed from the established value was considered an "incorrect result" that would affect the evaluation of treatment effect in APL MRD monitoring. Because any error in RT-qPCR is multiplicative rather than additive, data from RT-qPCR-based EQA testing programs follow a lognormal distribution, that is, an asymmetric distribution of results with a strong positive skew. 24 The log reduction was calculated using the admission sample in each EQA panel as the baseline. The reduction in PML-RARα level from this baseline value was then calculated for each EQA sample with a correct qualitative positive result and reported as a log reduction. 18,25,26 The log reduction was analyzed using a robust statistical Z-score, 27 and a score ≥3 was considered an "incorrect result".
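The scoring steps above can be sketched as follows. The text does not give the exact robust Z-score formula, so this example assumes a common median and median-absolute-deviation (MAD) form with the 1.4826 normal-consistency factor, and it flags on the absolute score; both choices, the function names, and the reported values are assumptions for illustration only.

```python
import math
from statistics import median


def log_reduction(baseline_level: float, sample_level: float) -> float:
    """log10 reduction of a sample relative to the admission (baseline) sample."""
    return math.log10(baseline_level / sample_level)


def robust_z_scores(values: list[float]) -> list[float]:
    """Robust Z-scores based on the median and scaled MAD (assumed form)."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        raise ValueError("MAD is zero; robust Z-score is undefined")
    scale = 1.4826 * mad  # makes the MAD comparable to an SD for normal data
    return [(v - med) / scale for v in values]


# Hypothetical log reductions reported by six laboratories for one sample.
reported = [2.0, 2.1, 1.9, 2.05, 1.95, 3.5]
scores = robust_z_scores(reported)
flagged = [abs(z) >= 3 for z in scores]  # threshold of 3, as in the text
for value, z, bad in zip(reported, scores, flagged):
    print(f"log reduction {value:.2f}  Z = {z:+.2f}  incorrect = {bad}")
```

Working on the log scale turns the multiplicative RT-qPCR error into an additive one, which is why the Z-score is applied to log reductions rather than to raw ratios.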

| Statistical analyses
All data were analyzed using SPSS version 16.0. PML-RARα detection sensitivity, specificity, accuracy, and variation distribution between different samples or groups were compared using the t test, one-way ANOVA, or Fisher chi-square test. P values < 0.05 were considered statistically significant.
Figure 1B). After digestion with RNase A and DNase I for 1 hour at 37°C, only a single band between 1 kb and 2 kb was visible on 1% agarose gel electrophoresis (Figure 1C). RT-PCR was performed to confirm encapsulation of each of the five target sequences (Figure 1D), followed by sequencing. To verify the stability and suitability of the armored RNAs for the EQA study before panel distribution, stability analyses were performed and confirmed that the armored RNAs were stable

| Performance of laboratories
The mock leukocyte samples adapted well to various RNA extraction methods. We found no significant differences in RNA extraction performance among laboratories that used different extraction methods (P = 0.79; Figure 2A). RNA yields extracted with TRIzol reagent were consistent across the EQA samples in panel C (P = 0.99; Figure 2B). All 50 laboratories used ABL1 as the control gene. Excluding 6 results from one laboratory, all laboratories had control gene ABL1 CN >10^4, and the median CG CN ranged from 1.14 × 10^4 to 4.57 × 10^7 (Figure 2C). The RNA extraction method had no significant effect on PML-RARα detection accuracy (P = 0.40; Figure 2E).
Among the laboratories, 13/50 (26.0%) were classified as "competent," 21/50 (42.0%) as "acceptable," and 16/50 (32.0%) as "improvable." Across the different RT-qPCR assays used for the qualitative and quantitative tests, overall accuracy, sensitivity, and specificity were 91.1%, 94.0%, and 86.0%, respectively; the accuracy of in-house methods was better than that of commercial kits, and detection of isoform V in EQA panel C was worse than detection in EQA panels A and B (Tables 1 and 3).
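For reference, accuracy, sensitivity, and specificity follow from the counts of true and false qualitative results. The sketch below uses the standard definitions; the function name and the counts are hypothetical and do not reproduce the study's data.

```python
def qualitative_metrics(tp: int, fn: int, tn: int, fp: int) -> dict[str, float]:
    """Standard accuracy, sensitivity, and specificity from result counts."""
    return {
        "sensitivity": tp / (tp + fn),               # positives correctly detected
        "specificity": tn / (tn + fp),               # negatives correctly reported
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Hypothetical counts for illustration only.
metrics = qualitative_metrics(tp=90, fn=10, tn=45, fp=5)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```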

TABLE 3 Qualitative incorrect results of different reagents and EQA panels
In the quantitative RT-qPCR test, 42 incorrect results were reported by 23 participating laboratories. The mean, median, standard deviation (SD), and coefficient of variation (CV) of the log reduction for PML-RARα quantitative results are summarized in Table 4. The CV of case set C was greater than that of case sets A and B. The CV increased at lower PML-RARα levels in each sample set (Table 4).
The quantitative accuracy of in-house methods was higher than that of commercial kits (P = 0.036; Table 2). The slope and R^2 value of the standard curve for quantitative RT-qPCR were analyzed for the participating laboratories. The slope ranged from −2.19 to −4.15 and the R^2 value from 0.96 to 1.00. According to their RT-qPCR quantitative results, we divided the participating laboratories into three groups: a correct detection group, a quantitative incorrect group, and a qualitative-only incorrect group. Using the difference in slope as an index, the inconsistency in amplification efficiency between the PML-RARα FG and CG was significantly greater in the quantitative incorrect group than in the other two groups (Figure 2D).

| DISCUSSION
We successfully designed mock leukocyte samples as an EQA panel for assessing qualitative and quantitative RT-qPCR detection performance.
Thirty-seven of the laboratories reported incorrect qualitative or quantitative results for PML-RARα detection. The detection performance of laboratories using in-house methods for PML-RARα was significantly better than that of laboratories using commercial reagents.
Among the three sample sets, detection of the rare isoform V was worse than that of isoforms L and S. Within the same sample set, detection accuracy was lower for low-level PML-RARα samples than for high-level samples, indicating that these participants need to improve the reliability of their RT-qPCR tests.
The mock leukocyte samples met the requirements of a clinically qualified sample for the PML-RARα EQA panel (Figure 2E, Table 2). We found that commercial reagents had lower sensitivity than in-house methods. Five laboratories using commercial unclassified reagents reported 11 false-negative results for medium-level and low-level isoform V samples. This may be due to the low detection sensitivity of the PML-RARα unclassified reagents for the rare isoform V, especially in low-level samples. These laboratories should improve their program documentation to accommodate detection of the rare PML-RARα isoform V. Only one laboratory reported ABL1 CG <10^4 copies; its false-negative results were due to incorrect preservation of RNA in that laboratory, which led to RNA degradation.
PML-RARα FG quantification varied considerably both between reagents and between case sets (Table 4). Laboratories using commercial reagents reported more incorrect quantitative results than those using in-house methods, especially for PML-RARα isoform V (Figure 2D). In addition, unequal amplification efficiency between the plasmid calibration standard and the RNA template may increase quantitative inaccuracy 29 ; thus, the standard curve should satisfy both the slope range (from −3.2 to −3.6) and R^2 > 0.980, as for BCR-ABL1. 30 Laboratories should optimize and validate their RT-qPCR procedures to achieve consistent quantitative detection of the different isoforms.
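The standard-curve criteria above can be checked mechanically. This is a minimal sketch: the relation E = 10^(−1/slope) − 1 between slope and amplification efficiency is the standard one for a log10 dilution curve, while the function names and example curves are hypothetical, with the first example using the extreme slope reported by participants.

```python
def amplification_efficiency(slope: float) -> float:
    """Efficiency implied by a standard-curve slope (1.0 means 100%)."""
    return 10 ** (-1.0 / slope) - 1.0


def curve_acceptable(slope: float, r2: float) -> bool:
    """Apply the slope window (-3.6 to -3.2) and R^2 > 0.980 from the text."""
    return -3.6 <= slope <= -3.2 and r2 > 0.980


# Hypothetical standard curves as (slope, R^2) pairs.
curves = [(-2.19, 0.99), (-3.40, 0.995), (-3.40, 0.96)]
results = [curve_acceptable(slope, r2) for slope, r2 in curves]
for (slope, r2), ok in zip(curves, results):
    eff = amplification_efficiency(slope)
    print(f"slope {slope:.2f}, R^2 {r2:.3f}: efficiency {eff:.0%}, acceptable = {ok}")
```

A slope near −3.32 corresponds to a doubling per cycle, so the −3.2 to −3.6 window keeps the implied efficiency roughly between 90% and 105%.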
Mock leukocyte samples can be used successfully to assess PML-RARα detection. Significant differences were not found in PML-RARα

CONFLICT OF INTEREST
The authors declare that they have no conflict of interest.

AUTHORS' CONTRIBUTIONS
Qisheng Wu designed the research study, performed the research,