Immune Function Tests for Hazard Identification: A Paradigm Shift in Drug Development


Author for correspondence: Elizabeth R. Gore, UE0359, Department of Safety Assessment, GlaxoSmithKline Pharmaceuticals, P.O. Box 1539, King of Prussia, PA 19406, U.S.A. (fax +610 270 7712, e-mail


Abstract: Routine immune function testing in preclinical drug development was established as a regulatory requirement in June of 2000 under the Committee for Proprietary Medicinal Products (CPMP) Note for Guidance on Repeated Dose Toxicity (CPMP/SWP/1042/99). The purpose of the more stringent approach to immunotoxicology testing was to better identify unintended immunosuppression; however, the requirement was met with much discussion and debate. At the center of the discussion was an attempt to reconcile opposing regulatory directives from agencies outside of Europe that adhere to a more selective, weight-of-evidence approach to functional evaluations. Uncertainty over the predictive value of the recommended immune function tests relative to conventional toxicology parameters prompted an investigation by the International Conference on Harmonization (ICH). The results of a preliminary, industry-wide survey indicated that only a low percentage of pharmaceuticals adversely affect immune function without alterations to standard toxicology parameters. Expected ICH guidelines will ultimately determine to what extent and for what purpose immune function tests will be conducted. In the meantime, optimization of the recommended immune function tests is ongoing. The T-cell dependent antibody response (TDAR) by either conventional Sheep Red Blood Cell (SRBC) plaque assay or by the modified ELISA method using either SRBC or keyhole limpet hemocyanin (KLH) as antigen is being extensively evaluated to determine best practices and procedures for preclinical immunotoxicity evaluations. This review addresses some aspects of the debate concerning the appropriateness of immune function tests for hazard identification, along with recommendations for optimizing TDAR methodology to ensure adequate sensitivity and predictability in risk assessments for immunotoxicity.

Regulatory guidelines on immune function testing

It has been several years since the European Committee for Proprietary Medicinal Products released the Note for Guidance on Repeated Dose Toxicity (CPMP/SWP/1042/99). The guidance, particularly with regard to immunotoxicology screening of pharmaceuticals, has been the subject of considerable discussion, and a summary of the regulatory debate will be provided in this review. Routine evaluation of new chemical entities for their potential to cause adverse effects on immune function in preclinical species is at the center of the debate. The required functional tests (T-cell dependent antibody response, or Natural Killer Cell cytotoxicity and lymphocyte phenotype determination) were selected based on demonstrated comparability with models of host defense (Luster et al. 1992). The purpose of incorporating the functional test(s) in standard toxicology evaluations was to broaden the scope of immunotoxicity testing to better predict unintended immunosuppressive potential of new chemical entities.

Prior to the official release of the CPMP Note for Guidance, functional tests were performed in the pharmaceutical industry for new chemical entities flagged as a ‘cause-for-concern’ based on either pharmacology-related effects or findings in standard preclinical toxicity evaluations (i.e. haematology and histopathology). However, such testing occurred sporadically based on the individual company's philosophy and/or capabilities. No mandatory requirements were in place. Compounds with signals for immunotoxic potential would be evaluated in studies employing functional tests and/or models of host defense with consideration given to preliminary findings and laboratory expertise (Badger et al. 1999). In general, follow-on investigations, commonly referred to as Tier II testing, range in degree of complexity (Vos & van Loveren 1998) and are considered an important component of risk assessment as they can provide further evidence on immune alterations, mechanisms of observed toxicity, and potentially, clinical relevance.

The CPMP requirement to conduct immune function tests on all new chemical entities caused a paradigm shift in regulatory immunotoxicology, as it suggests that all new chemical entities are a ‘cause-for-(immunotoxic) concern’ and would therefore command more extensive, purportedly more sensitive, evaluations at the screening phase of drug development. Routine functional testing is a considerable undertaking with regard to methods validation, study design, data analysis and data interpretation, and yet there is no clear evidence that implementation improves upon conventional immunotoxicity evaluations for hazard identification. Furthermore, conflicting directives from regulatory agencies outside the EU suggest some uncertainty over the predictive value of routine functional evaluations. For example, US FDA guidance on immunotoxicity testing recommends immune function tests when ‘signs’ of unintended immunosuppression have been observed in standard toxicology studies (Hastings 2002; FDA 2002) and/or the intended patient population is immunocompromised. In other words, immune function testing is more relevant to assessing the risk of a given drug once immunotoxic potential has been identified by conventional toxicology parameters. Despite the discord in regulatory expectations, pharmaceutical companies have complied with the more rigorous approach (i.e. routine functional tests for hazard identification) to avoid registration delays in Europe due to incomplete safety testing.

Inconsistent regulatory expectations and uncertainty over the predictive value of immune function tests prompted action by the International Conference on Harmonization of Technical Requirements for Registration of Pharmaceuticals for Human Use (ICH). The ICH enlisted a committee to formulate a harmonized guideline for immunotoxicity testing of novel pharmaceuticals. As part of the harmonization effort, a survey on preclinical immunotoxicity evaluations was conducted to assess utility of immune function tests for hazard identification. Preliminary results of the survey indicated that only a low percentage of new chemical entities adversely affect immune function without alterations to standard toxicology parameters (van der Laan & van Loveren 2005), weighing significantly on the cost-benefit ratio of the CPMP directive. Interestingly, the new chemical entities identified belong to drug classes with established immunomodulatory properties, i.e. non-steroidal anti-inflammatory drugs (NSAIDs) (Paccani et al. 2003) and selective serotonin re-uptake inhibitors (SSRIs) (Pelligrino & Bayer 2000). Therefore, consideration for drug class and pharmacologic properties, in addition to any effects on standard immune parameters, will likely be the basis for conducting immune function tests, rather than routine screening. While the European position on immune function tests has been reconsidered, a middle ground approach will take into account the intrinsic properties of new chemical entities, including but not limited to, affinity for leukocyte receptors and/or soluble mediators that directly or indirectly influence immune responsiveness (van der Laan & van Loveren 2005).

In parallel to the ICH process, discussions regarding the need for more sensitive, perhaps standardized functional evaluations to better identify immunosuppressive potential are ongoing. The highly redundant and compensatory nature of the immune system is well recognized, yet it is important to appreciate its vulnerability and sensitivity to specific perturbations that could result in life-threatening disease (e.g. neoplasia, bacterial and viral infection). This concern is underscored by widely publicized reports on resistant bacteria (Karchmer 2000; Gholizadeh & Courvalin 2001; Ruef 2004), and latent viruses with the potential to wreak havoc in patients with compromised immune systems (Polo et al. 2004; Hahm et al. 2005). Although unintended drug-induced immunosuppression is considered a rare phenomenon (Hastings et al. 1997; Hastings 2002), even a low incidence of occurrence can have detrimental consequences for both patients and drug companies alike. A recent example is the halt in sales and testing of the novel therapeutic, natalizumab (www.fda.gov/cder). Natalizumab acts on the T-cell adhesion molecule, α4 integrin, and has demonstrated great potential in both multiple sclerosis and inflammatory bowel disease (Van Assche et al. 2005). However, several clinical cases indicate an association of natalizumab with progressive multifocal leukoencephalopathy after chronic treatment (Kleinschmidt-DeMasters & Tyler 2005; Langer-Gould et al. 2005; Van Assche et al. 2005). Progressive multifocal leukoencephalopathy is a life-threatening brain disorder caused by infection of the central nervous system with the JC virus. Reactivation of dormant JC virus, to which the majority of healthy human adults have been exposed, is not uncommon in transplant, HIV+, or otherwise immunocompromised patients, with often severe, if not fatal, consequences (Hou & Major 2000; Koralnik 2004).
While a causal relationship between natalizumab and progressive multifocal leukoencephalopathy has not been established, clinical events associated with progressive multifocal leukoencephalopathy have suspended further marketing. From a preclinical perspective, it is uncertain whether extended immunotoxicity evaluations could have predicted the immune-associated adverse event. Ongoing investigations with natalizumab may help identify potential mechanisms related to the clinical findings, and possibly, the appropriate experimental models to enhance current risk assessments. The risk/benefit ratio of natalizumab therapy, and an increased understanding of the progressive multifocal leukoencephalopathy-associated immunotoxicity, will determine further development of this and other related compounds for clinical use.

Unintended immunosuppression, if discovered late in development or postmarketing, has the potential to derail promising therapeutics and subject patients to less effective alternatives. Early detection of immunosuppressive potential is paramount to improved risk assessments for immunotoxicity and appropriate clinical monitoring. An example of enhanced clinical monitoring to protect against potential drug-induced immunosuppression has been reported for a tumour necrosis factor-α blocker with demonstrated risk of rare serious infections (Scheinfeld 2005). Specialized clinical monitoring is essential to managing the risk while gaining the benefits of otherwise safe and effective new medicines.

The predictive value of immune function tests relative to standard toxicology parameters has not been clearly and sufficiently established for the purpose of hazard identification. It is important, however, to continue efforts leading to improvement of both hazard identification and risk assessment for immunotoxicity.

Regulatory immunotoxicology at the crossroads

Over the last two decades immunotoxicologists have greatly influenced approaches to immunotoxicity evaluations across academic, pharmaceutical and regulatory sectors exemplified in the following statement: “pathology alone, while adequate to identify many potential immunotoxic agents, can not with 100% confidence identify all immunologically active agents that might be identified in one or more functional screening tests” (Dean 2005). That said, selected functional test(s) may not identify immunosuppressive potential otherwise indicated by standard toxicity evaluations. Taken together, the ideal paradigm for immunotoxicity assessment would consist of conventional parameters with established high predictability, i.e. histopathologic examinations (Basketter et al. 1995; Kuper et al. 2000), in addition to one or more functional tests. Striving for increased sensitivity and predictability in preclinical evaluations is essential for reducing unanticipated adverse events in the clinic. More preclinical information, derived from well-defined procedures with established high sensitivity, will likely improve identification and characterization of the immunotoxic potential of new drugs.

Comparative analyses by Luster et al. (1992) established the T-cell dependent antibody response (TDAR), NK cell cytotoxicity and lymphocyte phenotype determinations as reliable indicators of immunosuppressive activity. Furthermore, drug-induced alterations in TDAR have been detected in the absence of adverse haematologic and/or histopathologic findings, indicative of differential sensitivity (Dean et al. 1983; Ladics et al. 1995 & 1998). And yet, the examples of drugs used to support the CPMP Note for Guidance do not demonstrate adverse effects on the recommended immune function tests, but rather on crude to more specialized immune parameters, namely total IgG levels and host resistance to Trichinella spiralis infection (Putnam et al. 2003). The drug examples that were contested include the long-acting β2-adrenoreceptor agonist, salmeterol, and the opioids, morphine and methadone (Snodin 2004; Herzyk 2005; Ryle 2005). In the case of salmeterol, there is no evidence of immunosuppressive activity based on a thorough review of the preclinical and clinical data (Herzyk 2005). Similarly, the argument for routine functional tests based on opioid evaluations is tenuous. Both morphine and methadone have tested negative in the primary antibody response to a T-cell dependent antigen (DeWaal et al. 1998), despite immunosuppressive activity in the T. spiralis host defense model (e.g. ∼45% decrease in muscle larva count) (Putnam et al. 2003) and well-established immunomodulatory effects of opioids in the clinic (Alonzo & Bayer 2002). Ironically, these particular drug examples call into question the sensitivity and/or reliability of the current TDAR evaluations in identifying unintended immunosuppressive effects.

Establishing the sensitivity and reliability of immune function tests relative to the standard screening parameters (i.e. haematology and histopathology) is critical to determining their utility for hazard identification. Published reports suggest that histopathologic evaluations are sufficient to identify potential immunotoxicants (Descotes et al. 1996; Crevel et al. 1997; ICICIS Investigators 1998); however, these studies were conducted with potent immunosuppressive drugs (e.g. cyclosporine, azathioprine). More studies with a wider range of drugs, at concentrations causing mild to moderate effects on lymphoid organ morphology and/or haematology, are needed to fully establish differential sensitivity between immune function tests and conventional parameters.

Immune function tests: current practices and future directions

Since the release of the CPMP Note for Guidance, much effort has been placed on developing and validating functional tests suitable for routine preclinical evaluations (Cederbrant et al. 2003; Smith et al. 2003; Gore et al. 2004; Ulrich et al. 2004). Of the two options cited in the guidelines (TDAR or NK cytotoxicity and lymphocyte phenotype analysis), more attention has been given to the TDAR parameter. A likely explanation for the preference is two-fold: 1) TDAR is considered a general-purpose screen as it requires multiple immune processes involving various soluble mediators and cell types of the immune system and 2) the analytical method (e.g. ELISA) is amenable to batch testing, automation and standardization.

A review of the literature indicates that the sensitivity of TDAR measurement may depend on the methodology employed (Descotes et al. 1996; Dean et al. 1998; ICICIS Investigators 1998; Smith et al. 2003; Gore et al. 2004). Differences in TDAR study designs and analytical methods have demonstrated differential sensitivity in detecting immunosuppression using the reference compounds, cyclosporine and azathioprine. For example, in Dean et al. (1998), rat primary antibody response to keyhole limpet haemocyanin (KLH) was suppressed after cyclosporine treatment when measured by conventional plaque assay, but not by ELISA; azathioprine had no effect on TDAR by either method. In contrast, Gore et al. (2004) demonstrated significant suppression of the rat primary antibody response to KLH after treatment with cyclosporine and azathioprine using the ELISA method. Based on reported evaluations of TDAR methodology, the reproducibility and sensitivity of this regulatory-driven parameter have yet to be established. The sensitivity of the TDAR measurement will likely be influenced by the particular study design (e.g. rat strain, immunization procedure, duration of drug exposure, etc.) and method of analysis (e.g. plaque assay or ELISA by titer, relative or absolute quantification). Further evaluations and more data will help to determine the optimal TDAR approach for the detection of varying degrees of immunosuppression (e.g. mild to severe).

Optimization and standardization of the analytical methods for measurement of antibody response in TDAR tests would have obvious benefits for inter-laboratory immunotoxicity data comparison, and efforts in this area, led by S. Spanhaak (Johnson and Johnson Pharmaceuticals), are currently underway. The inherently high variability of the antibody response in outbred rat strains (e.g. Sprague Dawley) may be further affected by variations in analytical methods, e.g. ELISA, for which alterations in detection reagents can influence data output, specifically the dynamic range of the standard curve. Reported inconsistencies of immunosuppressive effects of reference compounds, often at concentrations below the maximum tolerated dose, suggest that current TDAR methods have not been optimized to detect moderate immunosuppression, that is, in the absence of other adverse effects (e.g. lymphoid organ histopathology). Furthermore, the absence of an established ‘normal response range,’ contingent upon standardization, has precluded use of biological indicators in TDAR evaluations. Consequently, statistical analysis has been the basis for data interpretations. A weight-of-evidence approach to data analysis, however, may safeguard against misinterpretations based solely on statistics. Additional considerations for such an approach include the following: dose-dependent effects, alterations to both IgM and IgG response, similar findings in both sexes, and the compound class or target. Generation of a historical database for antigen specific antibody responses, in the presence or absence of immunosuppressive drugs, using consistent practices and procedures (from immunization protocol to analytical method and reagents) is essential to establishing benchmark data that would afford additional perspective on future data analyses and interpretations.
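The weight-of-evidence considerations listed above can be illustrated with a minimal sketch. It is not a validated scoring scheme and is not taken from any of the cited reports: the class name, field names and the simple tally are hypothetical, and no decision threshold is implied. The sketch only shows how the four considerations might be recorded alongside a statistical result so that they are reviewed together.

```python
from dataclasses import dataclass


@dataclass
class TdarFindings:
    """Hypothetical summary of one TDAR study; all field names are illustrative."""
    dose_dependent: bool           # effect on antibody response tracks with dose
    igm_altered: bool              # antigen-specific IgM response altered
    igg_altered: bool              # antigen-specific IgG response altered
    seen_in_both_sexes: bool       # similar findings in males and females
    class_or_target_concern: bool  # compound class or target suggests immune activity


def lines_of_evidence(f: TdarFindings) -> list[str]:
    """Return the names of the considerations that are met (no threshold applied)."""
    checks = {
        "dose-dependent effect": f.dose_dependent,
        "both IgM and IgG altered": f.igm_altered and f.igg_altered,
        "similar findings in both sexes": f.seen_in_both_sexes,
        "compound class or target": f.class_or_target_concern,
    }
    return [name for name, met in checks.items() if met]


# Made-up example: IgG altered but not IgM, dose-dependent, seen in both sexes.
example = TdarFindings(True, False, True, True, False)
met = lines_of_evidence(example)
print(f"{len(met)} of 4 considerations met: {met}")
```

A tally of this kind is intended to accompany, not replace, the statistical analysis and expert judgment; how the considerations should be weighted is left open here.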

An example of a streamlined approach to TDAR methodology using immunization of rats with keyhole limpet haemocyanin (KLH) and specific antibody measurements by ELISA has been previously described (Gore et al. 2004; Herzyk & Gore 2004). Immunization of rats with KLH on day 14 of a 28-day repeated dose study enables measurement of both primary IgM and IgG responses (on days 19 and 29, respectively). A second immunization five days prior to termination would enable evaluation of a secondary IgG response. Measurement of a single endpoint (i.e. primary IgM) is unnecessarily limiting and potentially less predictive, particularly in a model with high inter-individual variability. It is important to note that in at least three published reports on separate TDAR evaluations, IgG analysis demonstrated greater sensitivity in the detection of immunosuppression relative to IgM (Smith et al. 2003; Ulrich et al. 2003; Gore et al. 2004). Thus, extending antibody evaluations beyond the primary IgM response may improve the sensitivity of the KLH immunization model. Furthermore, IgG measurement, in addition to IgM, provides a more comprehensive assessment of the primary response and increased likelihood for identifying a ‘true’ potential hazard. A final consideration for streamlining the TDAR approach within and across laboratories would be use of a commercially available KLH ELISA kit, complete with antigen, reference standard and quality controls. While currently unavailable, such an advance in immunotoxicology methods would relieve industry specialists of the cumbersome and time-consuming aspect of assay validations and change controls, and may provide more reliable and consistent data to identify and/or assess immunosuppressive potential of new drugs.
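The timing described above can be laid out as simple day arithmetic. The sketch below uses only the study days given in the text (immunization on day 14, sampling on days 19 and 29); the function names are hypothetical, and the termination day is not stated in the text, so day 29 is assumed here purely for illustration.

```python
def days_after_immunization(immunization_day: int, sample_day: int) -> int:
    """Interval between immunization and blood sampling, in study days."""
    return sample_day - immunization_day


def secondary_immunization_day(termination_day: int, lead_days: int = 5) -> int:
    """Study day for a second immunization given 'lead_days' before termination."""
    return termination_day - lead_days


PRIMARY_IMMUNIZATION_DAY = 14  # from the text
igm_interval = days_after_immunization(PRIMARY_IMMUNIZATION_DAY, 19)  # primary IgM
igg_interval = days_after_immunization(PRIMARY_IMMUNIZATION_DAY, 29)  # primary IgG

ASSUMED_TERMINATION_DAY = 29   # assumption: not specified in the text
boost_day = secondary_immunization_day(ASSUMED_TERMINATION_DAY)

print(f"IgM sampled {igm_interval} days after immunization")
print(f"IgG sampled {igg_interval} days after immunization")
print(f"Second immunization on study day {boost_day} (assumed termination day {ASSUMED_TERMINATION_DAY})")
```

Under these assumptions the primary IgM and IgG samples fall 5 and 15 days after immunization, and a second immunization would fall on study day 24. A different termination day shifts the latter accordingly.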

Performing TDAR evaluations as separate studies, or in satellite groups, as opposed to inclusion in regular toxicology studies, has garnered considerable support given the concern that immunization may interfere with standard toxicology evaluations. Interestingly, no data exist to support this concern, and in fact, evidence suggests that immunization does not interfere with the ability to detect characteristic pathologic and haematologic changes associated with well-established immunosuppressive drugs (Ladics et al. 1995; Gore et al. 2004). Nevertheless, advantages to conducting a separate study later in development (e.g. post proof of concept/Phase I) include more relevant dose selections and/or a reduction in effort due to early stage attrition.

Concluding remarks

Unintended drug-induced immunosuppression, albeit of low incidence and high likelihood for reversibility, has potentially life-threatening consequences and is therefore a legitimate concern in the development of new medicines. Identifying drugs with immunosuppressive activity early in development is paramount to protecting the health of individual patients, especially in times of increasing bacterial resistance and viral virulence. However, the existing data regarding utility of the recommended immune function test(s) in preclinical studies do not support an absolute requirement for screening all new drugs by the functional tests. Continued efforts in evaluating immune function tests, namely TDAR, using a range of concentrations of known immunosuppressive drugs will help establish the sensitivity of the method and its future application for either hazard identification or hazard characterization to support the risk assessment for immunotoxicity.


A special thanks to Dr. Danuta Herzyk and Dr. Patrick Wier for helpful comments and additional insight on TDAR approaches and applications in drug development.