Acceptance of alternative methods in toxicology requires that they are valid. This means, according to the dictionary definition, that they are shown by independent review to be relevant and reliable for a specific purpose. There are good arguments that the standard against which new methods are being judged is much higher than those in place when existing assays were adopted. Furthermore, existing assays cannot necessarily be regarded as representing a ‘gold standard dataset’ against which novel tests are evaluated (1). However, the scientific and regulatory community charged with the acceptance of new methods has to make pragmatic decisions. The comparison of a new method against the one it is intended to replace and its quality in the context of good scientific principles form an important part of the process of acceptance. This process was undertaken several years ago for the local lymph node assay (LLNA) (2). The conclusions of that independent evaluation were that the LLNA was a methodology that was established on the basis of sound science and that was robust (readily transferred between laboratories and giving the same results therein) and reliable (giving the same results on repeated testing within a laboratory) for the purposes of skin sensitization hazard identification. More recently, it has become apparent that, in addition to hazard identification, the dose–response data available from the LLNA also permit further characterization of skin sensitization hazards, which confers additional important benefits compared with the standard guinea-pig tests that the LLNA has tended to replace. This extension is the measurement of the relative skin sensitizing potency of a substance, usually expressed as the estimated concentration of the chemical necessary to produce a threefold increase in proliferation in draining lymph nodes compared with concurrent vehicle-treated controls (the EC3 value) (3–5).
In this commentary, we have examined critically the relevance to humans, the robustness and the reliability of LLNA EC3 estimations, using as a benchmark the previous LLNA validation (2), and with reference to the validation criteria set out by various learned bodies (6–10). In essence, a key point is that using the LLNA EC3 value, it is possible to get an estimate of the relative potency of a sensitizer to humans (without the need for recourse to human testing) and so provide a starting point for hazard categorization and risk assessment.
For the prediction of skin sensitization potential, the local lymph node assay (LLNA) is a fully validated alternative to guinea-pig tests. More recently, information from LLNA dose–response analyses has been used to assess the relative potency of skin sensitizing chemicals. These data are then deployed for risk assessment and risk management. In this commentary, the utility and validity of these relative potency measurements are reviewed. It is concluded that the LLNA does provide a valuable assessment of relative sensitizing potency in the form of the estimated concentration of a chemical required to produce a threefold stimulation of draining lymph node cell proliferation compared with concurrent controls (EC3 value) and that all reasonable validation requirements have been addressed successfully. EC3 measurements are reproducible in both intra- and interlaboratory evaluations and are stable over time. It has been shown also, by several independent groups, that EC3 values correlate closely with data on relative human skin sensitization potency. Consequently, the recommendation made here is that LLNA EC3 measurements should now be regarded as a validated method for the determination of the relative potency of skin sensitizing chemicals, a conclusion that has already been reached by a number of independent expert groups.
LLNA EC3 validation status
In a formal validation process, the novel method/approach must be assessed so that its qualities and limitations as a practical assay are properly understood. Prevalidation includes optimization of the protocol, initial testing of interlaboratory transferability, and the optimization of the prediction model. In this respect, the LLNA itself has been exhaustively examined (11–14). Subsequently, initial approaches to the determination of a relative potency index using LLNA dose–response data were explored. The first work reported a quantitative structure–activity study (15). The fact that a quantitative structure–activity relationship (QSAR) could be derived provided a strong indication that the potency data available from the LLNA should be biologically meaningful. Subsequently, the initial suggestion concerning the utility of LLNA dose–response information for risk assessment was made, again with particular reference to the use of the concentration required to produce a threshold positive response in the LLNA, the EC3 value, as a valuable benchmark (16). In parallel, the final phases of the interlaboratory trials associated with formal validation of the LLNA for the purposes of hazard identification also included a demonstration of the reproducibility of LLNA threshold values across 5 laboratories (17, 18). Finally, the retrospective analysis of a large LLNA dataset (19) and an evaluation of a range of statistical approaches to the determination of the EC3 value (3) provided the basis for the protocol/prediction model paradigm of the LLNA EC3. Essentially, the method represents a simple linear interpolation between the points in the dose–response curve that lie immediately above and below the classification threshold, i.e. a stimulation index (SI) of 3. If the data points lying immediately above and below the SI value of 3 have the co-ordinates (a, b) and (c, d), respectively, then the EC3 value may be calculated using the following equation: EC3 = c + [(3 − d)/(b − d)](a − c).
This is presented graphically in Fig. 1. Where this equation cannot be applied, then an approach to modest extrapolation of LLNA dose–response data can be deployed (5).
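The interpolation described above can be sketched in a few lines of Python. This is a minimal illustration, not a validated implementation: the function name `ec3`, its parameters, and the dose–response values in the usage note are hypothetical and are not taken from any study cited here. The sketch assumes that each test concentration is paired with its stimulation index and that the first dose interval bracketing SI = 3 is the one of interest; it does not attempt the extrapolation approach mentioned for data that never bracket the threshold.

```python
def ec3(concentrations, stimulation_indices, threshold=3.0):
    """Estimate EC3 by linear interpolation across the SI = 3 threshold.

    (c, d) is the point immediately below the threshold and (a, b) the
    point immediately above it, following the notation in the text:
        EC3 = c + [(3 - d) / (b - d)] * (a - c)
    """
    # Order the dose-response points by increasing test concentration.
    points = sorted(zip(concentrations, stimulation_indices))
    for (c, d), (a, b) in zip(points, points[1:]):
        # Use the first adjacent pair that brackets the threshold.
        if d < threshold <= b:
            return c + (threshold - d) / (b - d) * (a - c)
    # Assumption of this sketch: no extrapolation is attempted.
    raise ValueError("dose-response data do not bracket the threshold")
```

For example, with hypothetical concentrations of 1, 2.5, 5, 10 and 25% giving stimulation indices of 1.2, 1.8, 2.4, 4.4 and 8.0, the bracketing points are (5, 2.4) and (10, 4.4), so `ec3([1, 2.5, 5, 10, 25], [1.2, 1.8, 2.4, 4.4, 8.0])` evaluates to 5 + (0.6/2.0) × 5, that is, approximately 6.5%.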
The LLNA has been shown to be relevant as a model for the predictive identification of chemicals with the potential to cause skin sensitization. The protocol provides a quantitative and objective measure of the crucial stage of the sensitization process, the clonal expansion of lymphocytes that results from the application of a test substance by the appropriate route, epidermal application (20, 21). Both the route of administration and the immunological mechanisms involved are the same as those in humans. However, the quantitative element of this response was also noted some years ago (22). Once the method for the determination of the EC3 value had been fixed (see above), the relationship between LLNA EC3 values and human skin sensitization potency was described in an extensive range of publications (23–30). These publications also served to show that the dynamic range of these measures covered some 4–5 orders of magnitude.
Two important points must be made here. First, potency refers to the intrinsic property of a sensitizing chemical, which is thus entirely independent of the frequency with which allergic contact dermatitis occurs in the general or a clinical population (as this depends heavily on exposure as well as on potency); second, there is a paucity of data indicating the intrinsic potency of chemical skin sensitizers in humans because this requires experimental studies of dubious ethics. Thus, the work that appears in the literature cannot offer the degree of accuracy in human/mouse correlations that one would ideally like, and a degree of judgement is inevitable to help compensate for the relatively poor quality of the limited human data that are available. Hence, it has been important that many of the publications in this area have involved independent partners closely associated with the LLNA, including dermatologists, regulators, and independent scientists (24–30).
However, the potency comparisons referred to above tended only to distribute human skin sensitizers into one of a number of categories (non, weak, moderate, strong, and extreme) and to use the LLNA EC3 value to show that it was possible to distribute the sensitizing chemicals into these categories if certain cut-off limits were applied. Although the outcome of this type of analysis was very successful, more interesting work was performed by a number of groups who attempted to compare experimental thresholds in humans, typically a no-effect level in a human repeated insult patch test (HRIPT), with the LLNA threshold, the EC3 value. Neither of these thresholds is of course absolute; they depend very much on the exposure conditions of the protocols. However, as each protocol is standardized, particularly the LLNA, they represent a reasonable point of departure for a comparison. Two groups published such comparisons in 2003. In one study, over 50 substances were assessed, and a satisfactory relationship between the LLNA and the HRIPT thresholds was shown (29). In the second study, a slightly different approach was chosen, but again a good relationship was shown (30). Last, in a more recent analysis, a very critical approach was taken to the selection of human data to try to ensure that only good quality HRIPT threshold information was used (31). This restricted the analysis to just 25 substances, but again a good relationship between EC3 values and HRIPT thresholds was shown; this analysis is reproduced in Fig. 2.
The LLNA is already accepted as a reliable method for hazard identification. Furthermore, 5 laboratories have used the assay with a set of sensitizers and nonsensitizers and, even with the technical variations that inevitably arise in the detail of test conduct, arrived at essentially identical threshold predictions for all the substances evaluated (17, 18). On this foundation, the reliability of the prediction of EC3 values has been further assessed within a single laboratory. Published data show that the Organisation for Economic Cooperation and Development (OECD) positive control, hexyl cinnamic aldehyde, a weak sensitizer, gives reproducible EC3 values over time in an individual laboratory (32). This has also been shown for other weak allergens. The reproducibility of EC3 values has also been tested at the opposite end of the potency spectrum for the very strong allergen, p-phenylenediamine, which was assessed in each of 2 laboratories (33). EC3 values were highly consistent over each of 4 determinations in each laboratory. Last, the EC3 value for a moderate allergen, isoeugenol, was assessed in a single laboratory (34). The outcome of these various assessments, supplemented with a small amount of additional unpublished data, has been collated in Table 1 for 15 chemicals of widely varying skin sensitization potency. What is of particular note here is that, while there is of course biological variation in the EC3 determination (e.g. isoeugenol, where 31 determinations give a mean and standard error EC3 value of 1.5% ± 0.1%), the values typically lie well within their order of magnitude banding. Expressed differently, the variation in EC3 value for a given substance is distinctly less than an order of magnitude, whereas when a wide range of skin sensitizers is examined, EC3 values for substances of different potency span several orders of magnitude.
| Substance | EC3 values (%) | Vehicle | Mean EC3 (%) ± SE | Reference(s) |
| --- | --- | --- | --- | --- |
| Bandrowski’s base | 0.04, 0.02 | AOO | 0.03 | (47) |
| 2,4-Dinitrochlorobenzene | 0.04, 0.02, 0.05, 0.03, 0.03, 0.02, 0.06, 0.03, 0.06, 0.05, 0.05, 0.06, 0.05 | AOO | 0.04 ± 0.004 | (17, 18, 48–50) and unpublished results |
| p-Phenylenediamine | 0.07, 0.12, 0.09, 0.08, 0.06, 0.14, 0.06, 0.18, 0.16, 0.13 | AOO | 0.11 ± 0.014 | (33, 47) |
| Methyldibromoglutaronitrile | 1.8, 0.9, 1.3 | AOO | 1.3 | (36, 52) and unpublished results |
| Isoeugenol | 1.7, 1.1, 1.4, 1.3, 1.3, 1.0, 1.4, 1.5, 2.9, 0.8, 1.3, 1.6, 2.8, 0.9, 1.0, 1.7, 1.2, 1.4, 0.8, 2.1, 2.3, 1.1, 1.2, 1.2, 0.7, 1.0, 2.3, 1.3, 2.0, 1.6, 1.3 | AOO | 1.5 ± 0.1 | (23, 34) |
| Cinnamal | 3.1, 1.7, 2.7 | AOO | 2.3 ± 0.4 | (36) and unpublished results |
| 1-Bromopentadecane | 5.2, 5.1 | AOO | 5.1 ± 0.02 | (36) and unpublished results |
| l-Perillaldehyde | 8.1, 7.8 | AOO | 8.0 | (53) and unpublished results |
| Hexyl cinnamal | 6.6, 11.3, 10.6, 4.4, 11.5, 8.8, 7.6, 11.0, 7.0, 10.6, 11.9, 11.7, 10.9, 11.7, 12.2 | AOO | 9.9 ± 0.6 | (19, 32) |
| Eugenol | 15.0, 4.9, 12.9, 7.5 | AOO | 10.1 ± 2.3 | (18) and unpublished results |
| Abietic acid | 14.7, 8.3, 10.6 | AOO | 11.3 ± 1.8 | Unpublished results |
| Penicillin G | 16.7, 17.9, 30 | DMSO | 21.5 ± 4.3 | (36) and unpublished results |
| Hydroxycitronellal | 33.0, 27.5, 23.0 | AOO | 27.8 ± 2.9 | (27) and unpublished results |
| 2-Ethylbutyraldehyde | 60, 76 | AOO | 68 | Unpublished results |
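The summary statistics quoted for isoeugenol can be checked directly from the individual EC3 values listed in Table 1. The sketch below is illustrative only. It assumes that the reported SE is the sample standard deviation divided by the square root of the number of determinations, which the article does not state explicitly, and the helper names `mean_and_se` and `fold_range` are hypothetical.

```python
import math
import statistics

# The 31 isoeugenol EC3 determinations (%) as listed in Table 1.
isoeugenol_ec3 = [
    1.7, 1.1, 1.4, 1.3, 1.3, 1.0, 1.4, 1.5, 2.9, 0.8,
    1.3, 1.6, 2.8, 0.9, 1.0, 1.7, 1.2, 1.4, 0.8, 2.1,
    2.3, 1.1, 1.2, 1.2, 0.7, 1.0, 2.3, 1.3, 2.0, 1.6,
    1.3,
]


def mean_and_se(values):
    """Return the arithmetic mean and the standard error of the mean.

    Assumption of this sketch: SE = sample standard deviation / sqrt(n).
    """
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(len(values))
    return mean, se


def fold_range(values):
    """Ratio of the largest to the smallest determination."""
    return max(values) / min(values)


mean, se = mean_and_se(isoeugenol_ec3)
print(f"n = {len(isoeugenol_ec3)}, mean = {mean:.1f}%, SE = {se:.1f}%")
print(f"fold range = {fold_range(isoeugenol_ec3):.1f}")
```

Under that assumption, the mean and SE round to the 1.5% ± 0.1% given in the text, and the highest and lowest of the 31 determinations (2.9% and 0.7%) differ by roughly fourfold, which is well within a single order of magnitude.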
The ability of the LLNA to transfer between laboratories was established extensively as a key component of the original validation (2). The interlaboratory assessment of the EC3 value was expected to be similarly robust. This has been shown to be the case in a number of studies (32, 33). Indeed, the measurement is sufficiently robust that even when the protocol and prediction model are not followed with great precision, very similar results are obtained (35), an outcome in accord with some of the early data also derived from nonstandard LLNAs (17, 18). The accumulated evidence that different laboratories achieve closely similar EC3 values is effectively summarized in Table 1.
LLNA EC3 database and its application
Using the approach described in this article, EC3 values for approximately 200 chemicals have been reported (36). Currently, this dataset is being expanded to approximately 300 chemicals (37). Two primary uses are suggested for this type of data. First, as has already been proposed by both industry (38) and regulatory (39) expert groups, potency data can lead to improvements in hazard classification and thus risk management. Second, potency data can facilitate improved risk assessments for skin sensitization (40–44). Both of these are, in our view, highly desirable goals, but the details fall outside the remit of this article, which is simply to discuss the status of validation of EC3 potency determinations in the LLNA.
Alternative tests should be valid. That is, they should be demonstrably relevant and reliable in the context of their specific purpose. The LLNA is relevant to the property being assessed (skin sensitization hazard) and has been shown to be both reproducible and reliable (2, 11–14, 18, 33–35). For simple skin sensitization hazard identification, the LLNA has been the subject of extensive and successful interlaboratory evaluations (both national and international) as well as of comparisons with older (guinea-pig) methods and with human data. Furthermore, the data that derive from the LLNA are quantitative and objective. Based on all these considerations, the assay was reviewed formally and independently and found to be fully validated (2). Against this background, the question is whether the use of LLNA dose–response data for the purposes of deriving measurements of relative potency based on EC3 values requires the same or a similar degree of independent scrutiny and validation. In this context, it is instructive to keep in mind the comparison with no observed effect levels in subacute toxicity studies. In such studies, and in the absence of any formal validation of any element, dose–response data have been used successfully for decades as the basis for risk assessment and risk management. Whatever view is taken of the need for any further validation, it is the case that the use of the LLNA for potency assessments has already been the subject of an extensive programme of evaluation and that this material has been compiled and reviewed in this article.
In order to protect human health, chemicals that have the intrinsic property of skin sensitization need to be identified, characterized, and subjected to appropriate risk assessments and risk management. Risk assessments already make extensive use of information on sensitization potency (40–45). In contrast, regulatory toxicology currently fails to exploit any information of this sort, despite several recommendations on this subject (38, 39, 46). One reason may be that, in the context of regulatory use, it is important to have methods that not only permit relative sensitization potency to be measured with confidence but also have been recognized to deliver such potency measurements using a standardized and transparent approach. This is what EC3 values from the LLNA offer. The validity of this approach has therefore been summarized in this review article. The EC3 measure has been shown to be a relevant indicator of human skin sensitization potency and is robust and reproducible in inter- and intralaboratory investigations. The really important issue here is one of improving risk assessment and risk management by embracing and exploiting information of considerable moment and relevance (and of making maximum use of information from in vivo studies). It has been estimated that skin sensitizing chemicals vary by up to 5 orders of magnitude with respect to relative potency. Any unwillingness to integrate data on relative potency would therefore represent nothing less than a major lost opportunity in developing the more accurate hazard categorization, risk assessment, and appropriate risk management strategies that will facilitate a reduction in the health burden that allergic contact dermatitis represents.