Serum free light chain (sFLC) measurement has gained widespread acceptance and is incorporated into various diagnostic and response criteria. Non-linearity and antigen excess are the main causes of ‘variability’ in the measurement of sFLC by immunoassay, but their impact on measurement has been unclear. We performed a retrospective evaluation using a dilutional strategy to detect these phenomena. A total of 464 samples in 2009 and 373 samples in 2010 were analysed for sFLC. Non-linearity was detected in samples with both high and apparently normal sFLC. Major non-linearity of more than twofold was common in high kappa (20·2%) and lambda (14·1%) samples. It was less common in samples with apparently normal levels: kappa (6·4%) and lambda (9·5%). At screening dilutions, 9·4% of kappa and 15·5% of lambda samples showed antigen excess. Overall, 34·4% of the 2010 samples had either non-linearity or antigen excess. We conclude that significant measurement variability is common in the measurement of sFLC. There is currently no reliable technique to detect non-linearity phenomena unless a serial dilution strategy is applied to every analysis. We recommend that laboratories routinely reporting sFLC results for clinical services adopt appropriate strategies for addressing these issues. Clinicians should be aware of these limitations when interpreting sFLC assay results for individual patients. Future guidelines should adopt action thresholds which are grounded firmly in test performance parameters.
The monitoring of multiple myeloma (MM) involves clinical and laboratory assessment. In well-defined subsets of patients with light chain myeloma, non-secretory myeloma and AL-amyloidosis, where an intact immunoglobulin molecule paraprotein is not present, the sFLC assay has gained widespread acceptance as an adjunct to monitoring MM, and has been incorporated into both diagnostic and response criteria for clinical trials [1, 2]. The relative utility of sFLC in other categories of MM with detectable intact paraprotein, including ‘oligosecretory’ disease (defined arbitrarily as intact paraproteins of less than 10 g/l), is less well defined. For many laboratories, the majority of paraproteins are ‘oligosecretory’ by that definition.
Until recently, sFLC assays have been available from a single manufacturer in the United Kingdom. Although the assay provides clinically useful data, both in large cohort studies and in individual experience, there is an increasing literature on the challenges encountered in routine service provision of sFLC. These include individual samples which display non-linear dose–response relationships [3, 4], antigen excess (AgXs) in some samples [5-9], interferences, and difficulties in the use of sFLC measurements for assessing remission [11, 12]. Concerns have also been expressed about the accuracy of calibration for some monoclones [6, 13, 14], where apparent levels appear improbably high yet cannot be detected by other methods that should easily detect monoclonal light chains at that level. Most previous cohort studies have not taken any additional technical measures to detect AgXs, which is more common than once believed. Improved detection of antigen excess is a goal of newer assay platforms.
Despite best practice recommendations from the manufacturer, UK NEQAS (National External Quality Assessment Service) has determined that few centres perform routine serial dilutions to check for AgXs or non-linearity phenomena. New assay platforms with mechanisms to detect AgXs phenomena exist, but it remains to be seen whether these techniques will detect non-linear dose–responses in addition to true ‘pro-zoning’ AgXs phenomena. The two phenomena are often confused.
It is unclear what effect the routine detection of measurement variability will have on the clinical performance of the assay, as most of the literature was generated without any additional procedures to detect it. This needs to be studied. Furthermore, there is growing evidence of considerable interlaboratory and interplatform variability in measurement [15, 16] and in external quality assessment (EQA) distributions (http://www.immqas.org.uk), which must be taken into account. Previous publications acknowledge some of these issues, but more recent data suggest that non-linear dose–response relationships affect significant numbers of patients.
Thus, the potential impact of these issues on the use of the assay in routine patient management is yet to be evaluated fully, especially when some clinicians are: defining increases in sFLC or ratio alone as evidence of relapse; utilizing levels or ratios for monitoring/prognostication, often without considering measurement variability [18-24]; or monitoring across centres in national guidelines [2, 25] or via networks as a core part of routine haemato-oncology multi-disciplinary team (MDT) practice in the United Kingdom. The fact that many centres find routinely that greater than 40% of monoclones can have normal sFLC values and ratios, even in progressive disease [26, 27], is an added complication.
In order to evaluate further the issues and inform routine laboratory practice, we examined data from a retrospective survey of measurement variability in a high-throughput tertiary centre with a high prevalence of clonal disease, utilizing a careful dilutional strategy to detect AgXs and non-linearity.
Materials and methods
We surveyed non-linear behaviour in samples from 464 (2009) and 373 (2010) unselected new patients for whom sFLC was requested during a 12-month period in each study. The 2009 survey found large changes on dilution in a substantial number (8–20%) of patient samples. We therefore analysed the 2010 data in detail to determine whether our current analytical practice, clinical guidelines and manufacturer's recommendations were appropriate to mitigate these known problems (Table 1).
Table 1. Total frequency of non-linearity (NL) or antigen excess (AgXs) in samples over 2 years.
Year (n) | Isotype | No. sFLC with AgXs at screening dilution | No. sFLC changing >×2 but less than ×4 on dilution (NL2) | No. sFLC changing >×4 on dilution (NL4) | No. sFLC with implausibly high values >10 g/l
2009 (464) | kappa | 52 (11·2%) | 24 (5%) | 36 (8%) | 7 (1%)
2009 (464) | lambda | 44 (9·5%) | 9 (2%) | 7 (1·5%) | 2 (0·5%)
2010 (373) | kappa | 35 (9·4%) | 26 (6·9%) | 20 (5%) | 3 (0·8%)
2010 (373) | lambda | 58 (15·5%) | 16 (4·2%) | 16 (4·5%) | 1 (0·3%)
sFLC: serum free light chains; NL: non-linearity.
All results for sFLC were collated from the laboratory computer system within the time-period and duplicate samples from the same patients were removed to ensure that the true prevalence of AgXs was determined. NEQAS samples were excluded from analysis. All sFLC requests from all sources were analysed according to the protocol.
We assayed sFLC (Binding Site, Birmingham, UK) using a Siemens Dade Behring Nephelometer II Analyzer System (BNII). All assays were performed and calibrated according to the manufacturer's instructions and covered three different batches of reagents during the study. All assays were run with multi-point third-party control materials at target values of 24, 44–46 and 2500–2800 mg/l for kappa and lambda with coefficient of variation (CV) of less than 10% overall. We use validated ‘Westgard’ warning rules for flagging potential problems with quality control (QC). All internal QC (IQC) were satisfactory.
The following dilution steps were performed on all samples in the measurement of sFLC: (1) a screening dilution of 1/100; (2) an AgXs check at a dilution of 1/2000; (3) where this dilution did not occur automatically, a 1/2000 dilution was requested manually to detect whether non-linearity or AgXs had caused the initial result to appear measurable (prozoning); (4) no further dilutions were performed if there was a reportable value at 1/2000; (5) if the 1/2000 result was unmeasurable, the sample was examined routinely for the presence of AgXs or non-linearity by performing a serial dilution series of 1/400, 1/8000 and 1/16000 until a first reportable result was achieved; (6) any sample which produced a ‘greater than’ result at screening dilution was serially diluted (as above) until the first reportable number was generated; these samples are designated as ‘AgXs’, i.e. high, requiring dilution; and (7) samples with a ‘less than’ value were repeated at dilutions from 1/20 to undiluted until a reportable number was obtained; these are designated as ‘L’ (below measurable range). Few samples (less than 2%) were so low that no measurable sFLC value could be obtained, and this practice was never useful in our experience.
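The dilution steps above can be sketched as a small decision routine. This is a minimal illustration and not the laboratory's software: the `(flag, value)` reading format, the `measure` callback and all function names are hypothetical, values are assumed to be dilution-corrected concentrations in mg/l, and the order in which follow-up dilutions are tried for a ‘greater than’ sample is an assumption, since the text says only ‘as above’.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical reading format: (flag, value). flag is "ok" for a reportable
# result, ">" for a 'greater than' result, "<" for a 'less than' result.
Reading = Tuple[str, Optional[float]]
Measure = Callable[[int], Reading]   # argument: dilution factor (1 = undiluted)

SCREEN = 100                         # step (1): screening dilution 1/100
AGXS_CHECK = 2000                    # steps (2)-(4): AgXs check at 1/2000
SERIAL = (400, 8000, 16000)          # step (5), in the order given in the text
LOW = (20, 1)                        # step (7): 1/20, then undiluted


def first_reportable(measure: Measure,
                     dilutions: Sequence[int]) -> Optional[Tuple[int, float]]:
    """Try each dilution in order; return (dilution, value) for the first
    reportable reading, or None if no dilution gives one."""
    for dilution in dilutions:
        flag, value = measure(dilution)
        if flag == "ok" and value is not None:
            return dilution, value
    return None


def run_protocol(measure: Measure):
    """Return (designation, screening value, first reportable follow-up)."""
    flag, screen_value = measure(SCREEN)
    if flag == "<":
        # Step (7): below range at screening, repeat at lower dilutions.
        return "L", None, first_reportable(measure, LOW)
    # Assumption: the 1/2000 check is tried first, then the serial series.
    follow_up = first_reportable(measure, (AGXS_CHECK,) + SERIAL)
    if flag == ">":
        # Step (6): above range at screening, designated 'AgXs'.
        return "AgXs", None, follow_up
    return "measurable", screen_value, follow_up
```

The screening value and the follow-up value are returned side by side so that the two can be compared for non-linearity afterwards; a sample that looks measurable at 1/100 is still carried through the 1/2000 check.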
We have found no other way to identify reliably clones which display non-linear dilutional measurement behaviour. The manufacturer's guidance suggests that changes less than fourfold are ignored and the lower value is reported, but the higher number is reported if the change is greater than fourfold.
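As an illustration of the manufacturer's guidance as paraphrased above, the following sketch picks the value to report from a screening result and a diluted result. The function names are hypothetical, and the handling of a change of exactly fourfold (treated here as ‘not greater than fourfold’) is an assumption, because the text does not specify it.

```python
def fold_change(a: float, b: float) -> float:
    """Fold difference between two positive results, regardless of direction."""
    if a <= 0 or b <= 0:
        raise ValueError("results must be positive concentrations")
    return max(a, b) / min(a, b)


def manufacturer_report(screen: float, diluted: float,
                        threshold: float = 4.0) -> float:
    """Value to report under the guidance as paraphrased in the text:
    the higher value if the change exceeds the threshold, else the lower."""
    if fold_change(screen, diluted) > threshold:
        return max(screen, diluted)
    # Assumption: a change of exactly `threshold`-fold is treated as ignorable.
    return min(screen, diluted)
```

For example, results of 30 and 90 mg/l (threefold) would be reported as 30 mg/l, whereas 30 and 150 mg/l (fivefold) would be reported as 150 mg/l.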
Initial and final results were compared to look for changes which are used in the clinical decisions referenced in recent International Myeloma Working Group (IMWG) and British Committee for Standards in Haematology (BCSH) guidelines or in the manufacturer's recommendations for AgXs detection (Binding Site 2008).
On data analysis, we identified sFLC values which changed significantly, and which altered the sFLC ratio between normal and abnormal with dilution. Any sample which gave a reportable result within the apparent working range of the assay and subsequently gave a greater than twofold result (higher or lower) on dilution was defined as showing non-linearity (NL). For this study, we classified the NL samples into different groups, NL2 and NL4 (Fig. 1). There are significant numbers of samples with changes greater than 100 mg/l due to NL (Fig. 1).
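The NL2 and NL4 groupings can be expressed as a simple classifier using the thresholds given in Table 1. This is a sketch only: the function name and the ‘linear’ label are hypothetical, and a change of exactly fourfold is assigned to NL2 as an assumption, since the table defines NL2 as less than ×4 and NL4 as greater than ×4.

```python
def classify_nl(screen: float, diluted: float) -> str:
    """Classify the change between a screening and a diluted result.

    Returns "NL4" for a change greater than fourfold, "NL2" for a change
    greater than twofold (up to and including fourfold, by assumption),
    and "linear" otherwise. The direction of change is ignored.
    """
    if screen <= 0 or diluted <= 0:
        raise ValueError("results must be positive concentrations")
    fold = max(screen, diluted) / min(screen, diluted)
    if fold > 4:
        return "NL4"
    if fold > 2:
        return "NL2"
    return "linear"


def absolute_change(screen: float, diluted: float) -> float:
    """Absolute difference in mg/l, for flagging changes above 100 mg/l."""
    return abs(diluted - screen)
```

Because the classification uses the larger value divided by the smaller, a result that falls on dilution is treated in the same way as one that rises.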
Subsequently, we compared the sFLC results of the local (Sheffield Teaching Hospitals NHS Foundation Trust) samples with serum and urine immunofixation and densitometry results.
Dilutional and precision experiments
A separate dilutional and precision experiment on randomly selected high kappa sFLC samples (‘H’) was also carried out to demonstrate the variation in the nature and behaviour of monoclonal proteins and how measurement is affected markedly by small changes in the screening dilution used (Fig. 4).
Samples with high sFLC (‘H’) showing NL phenomena from the 2010 cohort were selected. Five ‘H’ samples were measured for the values in the following manual dilutions: neat, 1 in 2 and 1 in 4. The following ratio was calculated for each sample and lines drawn on a graph to demonstrate the effect of changes in values as a result of dilution: ‘ratio = observed value/expected value’.
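A minimal sketch of the observed/expected ratio follows. The text does not state how the expected value was derived; the sketch assumes that each result has already been multiplied back by its dilution factor and that the neat result is the expected value at every dilution, which is what a linear assay would give. The function name and the example values are hypothetical.

```python
from typing import Dict


def recovery_ratios(corrected: Dict[int, float]) -> Dict[int, float]:
    """Observed/expected ratio for each dilution factor.

    `corrected` maps dilution factor (1 = neat, 2 = 1 in 2, 4 = 1 in 4) to
    the dilution-corrected result in mg/l. Assumption: the neat result is
    the expected value, so a perfectly linear sample gives 1.0 throughout.
    """
    expected = corrected[1]
    if expected <= 0:
        raise ValueError("neat result must be a positive concentration")
    return {factor: value / expected
            for factor, value in sorted(corrected.items())}


# Hypothetical sample whose apparent concentration rises on dilution.
example = recovery_ratios({1: 200.0, 2: 260.0, 4: 380.0})
```

Plotting these ratios against dilution factor gives one line per sample; a flat line at 1 indicates linear behaviour, and departures from it show the direction and size of the non-linearity.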
‘H’ samples from three patients were assayed for precision analysis. The sFLC values were measured in three dilutions (neat, 1/ 2, 1/ 4).
EQA data review
We also reviewed the EQA data of distributions 101 and 095. These distributions were the same material. The sFLC values reported by participants showed significant variation between the two distributions as well as among laboratories (Table 2).
Table 2. External quality assessment (EQA) data show that the interassay coefficient of variation (CV) between centres is considerably higher than 25% for some samples; 101 and 095 are the same sample. Many participants appear to be missing antigen excess (AgXs) in the pilot national EQA (NEQAS) scheme for sample 095, where a plausibly but falsely normal level is produced on screening. The same happened when the same sample was distributed later as 101. Many laboratories reported correctly on one occasion but incorrectly on the other; approximately 10% were incorrect on both occasions.
[Table 2 body not recovered. Surviving column headings: SD (mg/l); monoclone in g/l by densitometry (CV%).]
Results
In 2009, 464 samples were tested at both 1/100 and 1/2000 dilutions. Of these, 96 (21%) showed AgXs and 43 (9·5%) showed NL4. One sample showed NL in both the kappa and lambda assays. Six samples showed non-linearity in a different light chain isotype from that of the monoclone. Thus, 30·5% of the samples had either non-linearity or AgXs (Table 1).
In 2010, 373 samples were tested at both 1/100 and 1/2000 dilutions. Of these, 93 (24·9%) had AgXs, 36 (9·5%) showed NL4, four samples showed non-linearity in both the kappa and lambda assays and five samples showed non-linearity in a different light chain isotype from that of the monoclone. Thus, 34·4% of the samples had either non-linearity or AgXs (Table 1, Fig. 1).
Non-linearity potentially affects any sample producing apparently measurable results at first screening (Table 1, Figs 1-4). In 5·6% (n = 21) of the 2010 cohort, non-linearity was detected in both kappa and lambda sFLC. Non-linearity greater than 1·25-fold (NL1·25) affected up to 56% and 35% of ‘H’ kappa and lambda samples, respectively (Fig. 1). Non-linearity greater than twofold (NL2+NL4) was seen in 13% (2009) and 20·2% (2010) of ‘H’ kappa measurements and in 3·5% (2009) and 14·1% (2010) of ‘H’ lambda measurements. Thus, in approximately one in 10 samples the result at the screening dilution was very different from the ‘optimal’ final report. More kappa than lambda samples were affected.
Marked variation was seen in sFLC on dilution of ‘H’ samples. Changes in substrate levels unpredictably change the optimal dilution required to produce a stable result, and definitive numbers cannot be delivered at a fixed dilution across time. Thirty-eight of 184 (20%) ‘H’ kappa and 17 of 120 (14%) ‘H’ lambda samples at screening showed non-linearity of greater than twofold (NL2+NL4). Approximately half of these were NL4 (17 kappa and seven lambda) (Figs 1 and 2). Initial AgXs was frequent, occurring in 9% of kappa and 15% of lambda sFLC measurements and requiring further automated dilutions.
While non-linearity was frequent in apparently raised sFLC levels (‘H’ samples), it was also common at apparently normal values. Twelve of 123 (9·7%) kappa and 38 of 157 (24%) lambda samples with normal-range sFLC showed changes of greater than 25% on repeat analysis. Overall, 35–56% showed changes of greater than 25% on repeat analysis.
Non-linearity affecting both isotypes or the isotype different to that of the monoclone is a less common event, but can be very marked (Fig. 2). This effect results in large uncertainties in the measurement produced (potential over- or underestimation). All such non-linear sample reports should be flagged to the clinician, as clinical decisions based on guideline algorithms will be affected.
Ten per cent of ‘H’ kappa and 5% of ‘H’ lambda samples with high but measurable sFLC levels at screening showed non-linearity. Their ratios were normal at screening, but became abnormal on dilution.
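The change from a normal to an abnormal ratio on dilution can be checked mechanically, as in the sketch below. The reference interval of 0·26 to 1·65 for the kappa/lambda ratio is a commonly quoted range and is an assumption here, as the text does not state the interval used; the function names and example values are also hypothetical.

```python
# Assumption: commonly quoted kappa/lambda reference interval, not taken
# from this study. Replace with the interval validated locally.
REF_LOW = 0.26
REF_HIGH = 1.65


def ratio_is_normal(kappa: float, lam: float,
                    low: float = REF_LOW, high: float = REF_HIGH) -> bool:
    """True if the kappa/lambda ratio lies within the reference interval."""
    if lam <= 0:
        raise ValueError("lambda result must be a positive concentration")
    return low <= kappa / lam <= high


def ratio_flips(kappa_screen: float, lambda_screen: float,
                kappa_final: float, lambda_final: float) -> bool:
    """True if the ratio is normal at screening but abnormal after dilution,
    or the reverse."""
    return (ratio_is_normal(kappa_screen, lambda_screen)
            != ratio_is_normal(kappa_final, lambda_final))
```

With hypothetical values, a kappa result that moves from 30 to 150 mg/l on dilution against a stable lambda of 20 mg/l takes the ratio from 1·5 to 7·5, so the sample would be reported as normal at screening and abnormal after dilution.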
Dilutional and precision experiments
Dilutional experiments show that the observed values are markedly different to the expected values which should be obtained in a linear dose–response assay. The measurement is affected markedly by small changes in the screening dilution used. Some values increase across the fixed dilutions and some fall; others rise and then fall, consistent with an ‘antigen excess’ phenomenon (Fig. 4).
Thus, measurement uncertainty is increased markedly by dilution in any sample – a caveat for all monitoring samples where dilution is required. Measurement uncertainty increases even more between separate samples at different times, as the dilution required will alter as the clone increases or decreases. Changes of less than two- to fourfold may be impossible to detect with certainty in these samples.
The variability of many samples on dilution is much greater than the 10% seen with selected IQC materials. This reflects the much greater CV of the assay seen in EQA distributions (Table 2) and in serial estimations [25] and should be built into monitoring threshold recommendations.
Discussion
We describe an unselected series of more than 800 samples analysed for sFLC in our laboratory and highlight the specific approach to identifying and dealing with these specimens. We demonstrate that performance issues identified in 2009 persisted into 2010 across multiple reagent batches and are consistent with previous publications from other sources.
The literature establishing the clinical utility of the sFLC assay in a variety of clinical contexts is extensive, with large studies emanating from a limited number of laboratories and services. However, samples with non-linear measurement variability seem to have been recognized rarely in these circumstances. Little discussion of analytical platform-related measurement variability has occurred. Several papers and reviews (including guidance from the manufacturer) have highlighted the potential for non-linear behaviour and the necessity to identify samples which demonstrate antigen or antibody excess [3-9, 14]. The frequency of such samples is likely to be very different in a population screening exercise compared to use in a tertiary centre with a high pretest prevalence. We therefore decided to look at the effect of dilution on results, as detected by our routine protocol on a BNII nephelometer, which is said to represent best practice.
Impact of measurement variability on sFLC measurement
We demonstrate that current clinical guidelines incorporate decision thresholds that we cannot detect with certainty in a substantial number of samples, and that results from several assays should be used to diagnose and monitor patients.
Checking for measurement variability is mandatory before reporting a sFLC level or ratio. Non-linearity is usually associated with abnormal sFLC measurements, but is a feature of samples with initial screening measurements at all levels, including those within the reference ranges. This reflects the behaviour of the sFLC itself. It cannot be detected when using a single screening dilution as there are no other clues to its presence, because the first result is apparently measurable and believable (whether normal, or raised but within the reportable range). The sFLC ratio at initial dilutions should not be used to exclude non-linearity, as has been suggested (Fig. 3). The ratio itself cannot predict reliably the behaviour of individual samples on dilution. Ratios and derived calculations cannot substitute for a dilution check and, indeed, increase uncertainty of measurement. It is safer to measure levels. Non-linearity checks are required at every measurement, including during subsequent monitoring. Without them, approximately one in 10 samples may be subject to fourfold uncertainty in measurement, and there is currently no other way to determine which samples are affected.
Our data showed significant numbers of samples with more than 25% changes, making monitoring at this level challenging. The fact that interassay CVs in EQA are, at best, approximately 25% in some samples has implications for clinical practice. No single measurement with a change of less than 50% is definitive proof of a change, even in samples which do not display non-linear behaviour. Hence, measurement with the precision required by the remission criterion of the IMWG and BCSH is virtually impossible at present in many samples, even in an experienced laboratory with very tight quality control (QC).
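One conventional way to relate an assay CV to the smallest change between two serial results that can be distinguished from noise is the reference change value (RCV). The study does not use this calculation; it is shown here only as an illustration, with within-subject biological variation set to zero by assumption, so the figures are lower bounds. The function name and the choice of a two-sided 95% limit (z = 1.96) are also assumptions.

```python
import math


def reference_change_value(cv_analytical: float,
                           cv_biological: float = 0.0,
                           z: float = 1.96) -> float:
    """Reference change value in per cent.

    RCV = sqrt(2) * z * sqrt(CVa^2 + CVi^2), where CVa is the analytical CV
    and CVi the within-subject biological CV, both in per cent. Setting CVi
    to zero (the default here) gives the analytical component alone.
    """
    return math.sqrt(2) * z * math.sqrt(cv_analytical ** 2 + cv_biological ** 2)


# CV of 10%, as quoted for pooled IQC material, and 25%, as quoted for EQA.
rcv_iqc = reference_change_value(10.0)   # about 27.7%
rcv_eqa = reference_change_value(25.0)   # about 69.3%
```

On these assumptions, a CV of 10% already implies that a change of about 28% is needed before two results differ with 95% confidence, and a CV of 25% pushes that to about 69%, which is in keeping with the statement that a change of less than 50% is not definitive.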
Although pooled IQC material performance is excellent, with a CV of less than 10%, it does not follow that changes of 25% (2·5 times the CV of the assay QC material) can be detected in patient material. They are not the same. Up to 20% of samples show non-linearity, which cannot currently be detected reliably by any means other than automatic dilution. This is where automated AgXs detection in platforms will have the most impact. It will be important to know whether improved AgXs detection on some platforms is able to detect this phenomenon. Validation of the performance of platforms with antigen excess detection steps in unselected cohorts with a high pretest prevalence of monoclones will be essential. Similar evaluation of any new methods is also required.
Non-linear measurement variability has implications for haemato-oncology networks
All assays need to be performed on the same analyser, in the same centre, to minimize uncertainty and to provide comparable measurements. This has implications for networks applying the same protocols on results from different laboratories, where results may not be strictly comparable.
Use of IQC-derived CVs (less than 10% in our laboratory) does not provide information about interassay CV across networks (25–70% in EQA) and during monitoring within the same laboratory. Review of EQA comparative data is required to validate recommendations, and guidelines should adopt realistic and achievable thresholds.
Between-laboratory agreement is very poor for those samples demonstrating AgXs or non-linearity. It is likely that overall assay performance, which is good on non-monoclonal samples and in many monoclonal ones, cannot be replicated on these samples showing non-linearity. This requires further evaluation. It is assumed, but not proved, that the ‘non-linearity’ phenomenon is due to ‘AgXs’ caused by variation in the detector/substrate ratio due to the heterogeneity of the abnormal light chains.
Recommendations for sFLC measurement
Previously published rates of ‘pro-zoning’ or ‘non-linearity’ have varied in different reports from under 1% to 9%. Our overall rates demonstrate that up to one in five kappa samples with high ratios may be affected by a greater than 100% change on dilution, and half these by greater than 200%. This can occur in high or normal apparent levels on screening dilution. We simply do not know if the higher or the lower value is more ‘correct’. Caution is required for interpreting smaller changes during monitoring. Our data suggest that changes of less than 400% should be viewed with caution. The problem is how to identify these samples without expensive and labour-intensive multiple dilutions.
Reporting of values or ratios where non-linearity is present is difficult to defend without flagging the uncertainty, even when the same dilutional strategy is used on the same sample at different times. The optimal dilution for any sample may change with sFLC level and possibly with reagent batch. Caveats should be applied to all samples where the values produced on dilution differ by twofold or more. In these samples no change of less than 200% can be regarded as certain, and the measurement should be repeated at least once to confirm a trend. Laboratories supporting haemato-oncology MDTs should have appropriate strategies for identifying and dealing with these samples. Performance data need to be reappraised where newer techniques have been introduced.
Management and treatment decisions for myeloma patients are complex, but are increasingly dependent upon guideline thresholds. Assessment of response (and calculation of response criteria by 50 or 90% reductions, etc.) may be more challenging using low-level intact paraprotein, which is an argument for the relative utility of sFLC as the primary measurement.
However, use of sFLC is subject to the limitations that we have highlighted. The interpretation needs to take into account the performance characteristics of protein electrophoresis and immunofixation, which can detect down to 1 g/l and 200 mg/l, respectively, reliably in experienced laboratories, with similar or better CVs than immunochemical assays. Thus, the relative utility of sFLC over other assays in monitoring ‘oligosecretory disease’ is an area for further consideration, requiring more stringent QC than for effective diagnosis. A reconsideration of the definition of ‘oligosecretory’ MM in future iterations of the IMWG and other guidelines is needed.
It is impossible to verify the calibration in all samples on an individual basis. Each light chain is potentially a different substrate with unique properties in the assay. Therefore, heterologous calibration against polyclonal light chains may not provide accurate calibration for all sFLC analytes. For each sample this relationship is essentially unknown. The fact that many validation studies have demonstrated clinical utility of sFLC measurement without non-linearity checks may reflect the fact that clonal products often change exponentially and treatment responders often fall precipitously.
sFLC results should always be interpreted with caution, being aware of their limitations and using them carefully. Triangulation of results with other methods and clinical details is required for interpretation, but will not detect non-linearity reliably. We recommend that national and international guidelines set action thresholds for tests which are grounded firmly in the reality of test performance parameters and individual differences in the behaviour of abnormal specimens in these assays.
We declare that there are no conflicts of interest. Ethical approval was not required for this assay validation study.