Current challenges facing the determination of product bioequivalence in veterinary medicine


Marilyn N. Martinez, US Food and Drug Administration, Center for Veterinary Medicine, Office of New Animal Drug Evaluation, Rockville, MD 20855, USA. E-mail:


Martinez, M.N., Hunter, R.P. Current challenges facing the determination of product bioequivalence in veterinary medicine. J. vet. Pharmacol. Therap. 33, 418–433.

Despite the pharmacological and statistical advances that have occurred since the early days of bioequivalence assessments, there remain many unresolved issues associated with the bioequivalence evaluation of human and veterinary pharmaceuticals. While many of these issues are common to both human and veterinary medicine, there are also challenges specific to veterinary drug products. Examples of complex problems that remain to be resolved include the assessment of drugs associated with complex kinetics (e.g., sustained release formulations that produce multiple peaks), the evaluation of intramammary formulations, uncertainty associated with conditions under which specific enantiomers of metabolites need to be factored into the bioequivalence evaluation, the study design for products and active pharmaceutical ingredients that exhibit highly variable kinetics, equivalence of biomass products, methods for evaluating topical formulations or formulations with very long duration of release, the evaluation of products where destructive sampling is necessary (e.g., aquaculture products), and the evaluation of bioequivalence for Type A medicated articles. This manuscript highlights many of the unresolved challenges currently impacting the evaluation of product bioequivalence in veterinary medicine, and provides a summary of the associated scientific complexities with each of these issues.


Legal considerations

The plan for patent term restoration was initiated in 1978 when President Carter launched a major domestic policy review of industrial innovation. The result of that review was a recommendation for patent term restoration for pharmaceuticals (and for any other product that required regulatory review) as a mechanism for compensating for the time lost during the regulatory review process. Subsequently, under the Reagan administration, the then Secretary of Commerce, Malcolm Baldrige, established an intellectual property committee that recommended patent term restoration. That recommendation turned into a bill that passed the Senate but failed in the House of Representatives. Although the bill failed, it spurred Congressman Henry A. Waxman (D-Calif.), who was then Chairman of the Health Subcommittee, to draft a modified bill that included both patent term restoration and drug price competition. The resulting Public Law 98-417, the Drug Price Competition and Patent Term Restoration Act of 1984 (often referred to as the Hatch-Waxman Act), was enacted in 1984. This law was designed to promote generics while leaving intact a financial incentive for research and development. It allowed generics to receive Food and Drug Administration (FDA) marketing approval by submitting bioequivalence studies (as opposed to clinical data). It also granted a period of additional marketing exclusivity to compensate for the time a patented pipeline drug remains in development. This extension cannot exceed 5 years, and it is in addition to the 20 years of exclusivity granted by the issuance of a patent (Mossinghoff, 1999).

Defining product bioequivalence

Prior to Hatch-Waxman (1960s–1970s), the focus was on one fundamental assumption: if the drug concentration–time profiles resulting from the two formulations are superimposable, then the safety and the effectiveness of the two formulations will likewise be indistinguishable. Extending this assumption further, there was the basic belief that the primary impact of the formulation was control of drug absorption. Accordingly, so long as the excipient(s) did not have activity in the body, bioequivalence was considered to be an evaluation of the rate and extent of drug absorption.

These early bioequivalence determinations evaluated the extent of drug exposure by comparing the weights of individual subject concentration–time plots that were graphed on and cut from specially designed paper. This was followed in 1971 by a National Academy of Sciences bioequivalence symposium that resulted in recommendations for methods of determining the area under the concentration vs. time curve (AUC) via numerical integration and the evaluation of rate of absorption by assessing the observed peak drug concentrations (Cmax) and the time to Cmax, Tmax (Ronfeld & Benet, 1977). The bounds for defining bioequivalence were set as ±20% based upon consultation with physicians who concluded that this magnitude of difference would be without clinical significance. This determination was first published in the Federal Register in 1977 (Federal Register, 1977).
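The numerical-integration approach described above can be illustrated with a minimal sketch. It assumes the linear trapezoidal rule (one common choice; the symposium recommendations are not reproduced here) and uses hypothetical sampling times and concentrations for a single subject. The function name `pk_metrics` is illustrative only.

```python
def pk_metrics(times, concs):
    """Return (AUC, Cmax, Tmax) for one subject's concentration-time profile.

    AUC is computed with the linear trapezoidal rule from the first to the
    last sampling time. Cmax and Tmax are the observed values, not model fits.
    """
    if len(times) != len(concs) or len(times) < 2:
        raise ValueError("need at least two paired time/concentration points")
    auc = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        if dt <= 0:
            raise ValueError("sampling times must be strictly increasing")
        # Area of the trapezoid between two adjacent samples
        auc += dt * (concs[i - 1] + concs[i]) / 2.0
    cmax = max(concs)
    tmax = times[concs.index(cmax)]  # time of the first observed maximum
    return auc, cmax, tmax


# Hypothetical profile (time in h, concentration in arbitrary units)
times = [0, 0.5, 1, 2, 4, 8]
concs = [0, 4, 6, 5, 3, 1]
auc, cmax, tmax = pk_metrics(times, concs)
```

Because Cmax and Tmax are taken directly from the observed samples, both depend on the sampling schedule; a sparse schedule around the peak can miss the true maximum.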

The next step was determining the statistics to be used for establishing with confidence that the two products differed by no more than ±20%. In the early years (∼1985–1987) of the Division of Bioequivalence, Center for Drug Evaluation and Research (CDER) of the US FDA, the statistical basis for concluding that two products are bioequivalent involved the use of an analysis of variance (anova) to test the null hypothesis (H0) of ‘no difference’ between the average bioavailability of the two products. The results of such an evaluation (the ability to correctly accept or reject H0) are a function of the Type I and Type II error of the test (Table 1).

Table 1.   Defining Type I and Type II error. In a bioequivalence trial, we can think of α as the risk of declaring two products as being different when they in fact are bioequivalent; and we can think of β as the risk of declaring two products as being bioequivalent when they in fact are different

                  | Investigator accepts H0            | Investigator rejects H0
When H0 is true   | Valid conclusion                   | Type I error (α) sponsor risk
When H0 is false  | Type II error (β) patient’s risk   | Valid conclusion

The Type I error relates to the ‘α’ value [the degree of risk (e.g., α = 0.05)] associated with rejecting the null hypothesis when it is in fact true. This is also referred to as the level of significance. From a pharmaceutical perspective, one can view the Type I error as the level of the sponsor’s risk (i.e., the risk of failing to accurately define a product as being a bioequivalent product). On the other hand, from a patient perspective, there is the need to minimize the risk of failing to identify products that, in fact, are NOT bioequivalent. The probability of failing to reject the null hypothesis when it is in fact false is termed β. For example, a test may not have adequate power (where power = 1 − β) to reject the null hypothesis when it is in fact false. Power is a function of the variability in the parameter estimate, the number of observations included in the comparison, and the magnitude of differences that the investigator wishes to detect (e.g., 80% power to detect a 20% difference). The risk associated with failing to reject the H0 when it is false is known as the Type II error.

Initially, a two-tiered approach to this analysis was employed:

  • a) An F test in which the average bioavailabilities of the two products were compared at α = 0.05 (i.e., a nominal consumer risk of 5%) and β = 0.20 (i.e., 80% power).
  • b) If there was less than 80% power and if statistically significant differences were not observed, the 75/75 rule could be employed. The 75/75 rule stated that two products could be declared bioequivalent if 75% of the subjects had test/reference ratios for AUC and Cmax that were within the limits of 75–125% (Federal Regulation, 1977; Cabana, 1983).
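As an illustration only, the 75/75 rule can be written as a simple check on individual subject ratios. The sketch below assumes paired test and reference values for each subject, reads "75% of the subjects" as "at least 75%", and applies the check to one parameter at a time (it would be run separately for AUC and Cmax). The function name and data are hypothetical.

```python
def passes_75_75(test_values, ref_values, lower=0.75, upper=1.25, fraction=0.75):
    """Return True if at least `fraction` of subjects have an individual
    test/reference ratio within [lower, upper]."""
    if len(test_values) != len(ref_values) or not test_values:
        raise ValueError("need one test and one reference value per subject")
    ratios = [t / r for t, r in zip(test_values, ref_values)]
    n_within = sum(1 for ratio in ratios if lower <= ratio <= upper)
    return n_within / len(ratios) >= fraction


# Hypothetical AUC values for eight subjects (reference fixed at 100 for clarity)
auc_ref = [100.0] * 8
auc_test_a = [90.0, 110.0, 100.0, 80.0, 120.0, 95.0, 60.0, 140.0]   # 6 of 8 within limits
auc_test_b = [90.0, 110.0, 100.0, 80.0, 130.0, 95.0, 60.0, 140.0]   # 5 of 8 within limits

result_a = passes_75_75(auc_test_a, auc_ref)
result_b = passes_75_75(auc_test_b, auc_ref)
```

Note that the rule counts subjects rather than testing a hypothesis about the treatment means.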

As can be deduced from these relationships, there were inherent inconsistencies associated with the use of a traditional anova in the assessment of product bioequivalence. In particular, when using the power approach, the likelihood of declaring two products as being bioequivalent increased as the standard error increased (Schuirmann, 1987). In other words, the more variable the data (i.e., the greater the level of uncertainty), the greater the likelihood of declaring two products as being bioequivalent. The upper limit of the bioequivalence boundaries was defined by the 75/75 rule (Fig. 1). Considering the weaknesses in this set of statistical metrics, it is not surprising that the first several years after ratification of the Hatch-Waxman Act were met with much controversy as debates regarding FDA’s ability to ensure therapeutic equivalence abounded in the literature (e.g., Hamrell et al., 1987). During these early years, many issues remained to be resolved, from in vitro test methods to statistical procedures for evaluating product bioequivalence (i.e., shortcomings associated with the use of anova methods as described above).

Figure 1.

 Rejection region when using the power approach (based upon the work of Schuirmann, 1987). The x-axis label represents the difference between the treatment means. Note that as the standard error of the estimate increases, the size of the allowable difference between the treatment means likewise increases up to a maximum difference as defined by the need to achieve no less than an 80% power of the test. [Correction added after online publication 7 July 2010: The x-axis marking was updated from 1 to 0, the x-axis label was changed from T/R to the symbol for the difference between the treatment means, and the legend was described more accurately to reflect this change].

As the statistical principles associated with the power approach were called into question, alternative statistical approaches were suggested (e.g., Hauck & Anderson, 1984). Ultimately, in 1987, Don Schuirmann of the FDA published a landmark manuscript in which he split the statistical assessment into two one-sided test problems and applied two-sample t-tests for evaluating the upper and lower limits describing bioequivalence (Schuirmann, 1987). In so doing, the H0 was changed to an assessment that the test (T) and reference (R) products are different (i.e., >20% difference between treatment means). The corresponding alternative hypothesis (Ha) is that the two products are bioequivalent (i.e., ≤20% difference between treatment means). By changing the test in this manner, an investigator needed to reject H0 in order to conclude that the two products are bioequivalent. Thus, Schuirmann’s statistical method provided a logical approach to the assessment of product bioequivalence: i.e., the likelihood of declaring two products as bioequivalent improves as statistical power is increased (Fig. 2).
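The logic of the two one-sided tests procedure can be sketched as follows. This is a simplified illustration, not a reproduction of Schuirmann's worked examples: it assumes a parallel (two-sample) comparison on the untransformed scale, treats the ±20% limit as a known constant equal to 20% of the reference mean, and takes the one-sided critical t value from a standard table rather than computing it. All names and numbers are hypothetical.

```python
from math import sqrt


def tost_parallel(mean_t, mean_r, pooled_sd, n_t, n_r, t_crit, limit=0.20):
    """Two one-sided t-tests on the difference between treatment means.

    H01: (mean_t - mean_r) <= -theta   vs.  Ha1: difference > -theta
    H02: (mean_t - mean_r) >= +theta   vs.  Ha2: difference < +theta
    Bioequivalence is concluded only if BOTH null hypotheses are rejected.
    """
    theta = limit * mean_r                        # allowable difference (assumed known)
    se = pooled_sd * sqrt(1.0 / n_t + 1.0 / n_r)  # standard error of the difference
    diff = mean_t - mean_r
    t_lower = (diff + theta) / se                 # statistic for H01
    t_upper = (theta - diff) / se                 # statistic for H02
    equivalent = t_lower > t_crit and t_upper > t_crit
    # Equivalent view: the 100(1 - 2*alpha)% confidence interval must lie within +/- theta
    ci = (diff - t_crit * se, diff + t_crit * se)
    return equivalent, ci


# One-sided 5% critical value for 22 degrees of freedom, from a standard t table
T_CRIT_DF22 = 1.717

# Same means and sample sizes, two hypothetical levels of variability
low_var = tost_parallel(95.0, 100.0, 15.0, 12, 12, T_CRIT_DF22)
high_var = tost_parallel(95.0, 100.0, 30.0, 12, 12, T_CRIT_DF22)
```

With these hypothetical inputs, the low-variability data set leads to a conclusion of bioequivalence and the high-variability data set does not. This is the opposite of the behavior described for the power approach, where greater variability made a bioequivalence conclusion more likely.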

Figure 2.

 Relationship between allowable difference between treatment means and the corresponding standard error of the estimate for establishing product bioequivalence when using the two one-sided test procedure (confidence interval approach). Note that as the standard error increases, the allowable difference between product means decreases (based upon the work of Schuirmann, 1987). [Correction added after online publication 7 July 2010: The x-axis marking was updated from 1 to 0, the x-axis label was changed from T/R to the symbol for the difference between the treatment means, and the legend was described more accurately to reflect this change].

When comparing the bioavailability of two products using a confidence interval approach, the number of subjects needed for inclusion in a bioequivalence trial will be based upon the following statistical factors:

  • a) The targeted bioequivalence bounds (e.g., T/R = 0.80–1.25). The narrower the bounds, the greater the number of subjects that will be needed to meet the bioequivalence criterion.
  • b) The targeted Type I error (e.g., use of the 90% confidence interval where α = 0.05 for the upper and lower bounds). As one decreases the targeted Type I error (e.g., use a 95% confidence interval where α = 0.025 for the upper and lower bounds), the estimated width of the confidence interval will increase. The wider the estimated confidence interval, the more difficult it will be to achieve the targeted bioequivalence bounds (e.g., 0.80–1.25). Accordingly, as the targeted Type I error is reduced, the number of subjects needed to demonstrate product bioequivalence will increase.
  • c) The power of the test (which is determined by the risk we are willing to take of accepting the null hypothesis when it is in fact false). By reducing the risk of a Type II error (β), we increase the power of the test. If we hold the Type I error at α = 0.05 and keep the confidence bounds for declaring bioequivalence at 0.80–1.25, then, for any given magnitude of variability, the number of subjects needed for demonstrating product bioequivalence will increase as the targeted power of the test increases (e.g., aiming for a statistical power of 90% rather than 80%).
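The direction of each of these three effects can be illustrated with a rough sample-size sketch. The formula below is a large-sample normal approximation and rests on several assumptions that are not taken from the text: a two-period crossover design, log-normally distributed data analyzed on the log scale, bounds that are symmetric on the log scale, and a true T/R ratio of exactly 1. It tends to underestimate the numbers given by exact t-based methods, so it should be read as a demonstration of trends rather than as a planning tool. The function name `approx_total_n` is hypothetical.

```python
from math import ceil, log
from statistics import NormalDist


def approx_total_n(cv_within, alpha=0.05, power=0.80, upper_bound=1.25):
    """Approximate total number of subjects for a two one-sided tests analysis.

    Assumptions (illustrative): 2x2 crossover, log-scale analysis,
    true T/R ratio = 1, normal approximation to the t distribution.
    """
    z = NormalDist().inv_cdf
    sigma_w_sq = log(1.0 + cv_within ** 2)   # within-subject variance on the log scale
    delta = log(upper_bound)                 # distance from a ratio of 1 to the bound
    beta = 1.0 - power
    n = 2.0 * sigma_w_sq * (z(1.0 - alpha) + z(1.0 - beta / 2.0)) ** 2 / delta ** 2
    return ceil(n)


baseline = approx_total_n(0.20)                           # 0.80-1.25, alpha 0.05, 80% power
narrower_bounds = approx_total_n(0.20, upper_bound=1.11)  # roughly 0.90-1.11
smaller_alpha = approx_total_n(0.20, alpha=0.025)         # 95% confidence interval
higher_power = approx_total_n(0.20, power=0.90)
```

Under these assumptions, each change (narrower bounds, smaller α, higher power) increases the approximate number of subjects relative to the baseline, in line with points a) through c).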

During the time that bioequivalence concepts were still evolving within the human pharmaceutical community, the issue of bioequivalence within veterinary medicine was in its infancy. The importance of these concepts to animal health scientists within academia, industry, and government grew rapidly with the advent of the Generic Animal Drug Patent Term Restoration Act of 1988 (GADPTRA). GADPTRA legally allowed the US FDA Center for Veterinary Medicine (CVM) to approve generic animal drug applications (Public Law 100-670, Nov. 16, 1988, 102 Stat. 3971). In response to this new legislation, CVM released nine policy letters that fleshed out the issues and considerations associated with the regulation of veterinary generic drug products. In 1996, following several workshops on this topic (Martinez & Riviere, 1994), the in vivo bioequivalence test methods and considerations were solidified into the FDA/CVM guidance document #35 of 1996. This guidance has undergone several minor revisions, with the current version having been released in November 2006.

Furthermore, educational efforts were underway to provide an understanding of the basic statistical and pharmacological considerations associated with product bioequivalence to the veterinary community (Toutain & Koritz, 1997). These efforts reflected the need to translate the issues and methodologies associated with the determination of product bioequivalence from the human to the veterinary patient. Currently, these basic bioequivalence concepts and methods of data analysis are accepted within veterinary medicine and are being applied to both generic product approvals and bridging studies associated with new and investigational new drug evaluations. However, marked changes have occurred within our therapeutic landscape. This includes the development of novel release technologies (e.g., Martinez et al., 2008, 2010), and a growing awareness of the relationship between the physicochemical characteristics of the active pharmaceutical ingredient (API) and formulation effects, not only as they apply to human pharmaceuticals (e.g., Yu et al., 2002, Dahan et al., 2009) but also to veterinary medications (e.g., Martinez et al., 2002; Fahmy et al., 2008).

An understanding of drug-formulation relationships has led to new opportunities for the granting of biowaivers. For example, based largely upon the biopharmaceutics classification system (BCS) concepts that were incorporated into CDER’s August 2000 guidance titled ‘Waiver of In Vivo Bioavailability and Bioequivalence Studies for Immediate-Release Solid Oral Dosage Forms Based on a Biopharmaceutics Classification System’, CVM published its 2008 guidance titled ‘Waivers of In Vivo Demonstration of Bioequivalence of Animal Drugs in Soluble Powder Oral Dosage Form Products and Type A Medicated Articles’. The latter guidance describes conditions under which certain drug products intended for administration in food and water can be granted a waiver from conducting an in vivo bioequivalence trial.

Because of recent pharmaceutical advances both within human and veterinary medicine, we now face new and unresolved issues associated with the evaluation of product bioequivalence. While many of these issues are common to both human and veterinary medicine, there are also challenges specific to veterinary drug products. This manuscript highlights the currently unresolved challenges impacting the bioequivalence assessment of veterinary drug products and provides a summary of the associated scientific complexities.

Unique Complexities on Issues Shared with Human Drugs

Long acting and extended release formulations

The duration of blood sampling needed to ensure product bioequivalence was a question raised soon after enactment of the Hatch-Waxman Act. There was general agreement that so long as the absorption phase is adequately captured, the remainder of the profile reflects formulation-independent (drug-specific) effects. Therefore, treatment comparison of AUC values can be adequately defined by truncated profiles of long half-life (t½) drugs, so long as absorption is complete (Lovering et al., 1975; Martinez & Jackson, 1991; Endrenyi & Tothfalusi, 1997). However, the question of sampling duration is far more complex when the long t½ is reflective of prolonged absorption (i.e., when the formulations incorporate a sustained release technology). In these situations, terminating blood sampling prior to the completion of the absorption phase may result in failure to capture differences in product performance.

Within the human drug arena, a large proportion of the sustained release products are orally administered. Therefore, CDER’s 2003 Bioequivalence Guidance for Industry generally recommends a single dose pharmacokinetic study for both immediate- and modified-release drug products to demonstrate product bioequivalence: single dose studies are generally more sensitive than steady-state studies to differences in the rate of drug substance absorption. It is recommended that the duration of blood sampling continue until the completion of gastrointestinal (GI) transit of the dosage form. This recommendation reflects the need to characterize total drug absorption. GI retention times have been estimated as approximately 46 h in people. In contrast, for veterinary species, the GI retention time of particles has been reported to range from as little as 13 h in cats to longer than 60 h in cattle (Martinez et al., 2002).

A far more challenging study protocol is needed when drug release from the test and reference products can be measured for a duration of months. Enrolling subjects for trials of long duration of evaluation can lead to extensive within- and between-subject variability due to inherent variations in absorption, metabolism, distribution, and elimination processes. There can also be a much greater risk of subject drop-out. For these reasons, the possibility of alternative study designs (e.g., a random assignment of subjects to various segments of the absorptive phase such that no one subject is enrolled for the entire duration of the study) may be worthy of consideration. However, even if subjects are enrolled for the entire study duration, the bioequivalence study may need to employ a parallel (rather than crossover) design. Such designs have their own statistical complications, as discussed later in this manuscript.

Assessing product bioequivalence for extended release formulations will be problematic at best. Clearly, additional work is needed to identify and explore possible approaches for dealing with these types of formulations.

Topical formulations

Blood level bioequivalence considerations do not normally apply to dosage forms designed for topical administration and corresponding local effects, such as those products formulated as lotions/creams/ointments for application to the skin or mucous membranes or nonsystemically absorbed oral formulations. For certain locally acting (topical) agents, there may be an in vitro endpoint(s) that can be measured to support bioequivalence. However, for both human and veterinary medicine, methods for resolving bioequivalence for these formulations are not obvious. While clinical bioequivalence trials could be used in lieu of blood level bioequivalence trials for these formulations, the economic costs and study design complexities for such clinical endpoint trials can be prohibitive. This is further confounded by the potential for multiple local effects which may be associated with differing exposure–response relationships. While methods such as dermal microdialysis (Benfeldt et al., 2007) and tape stripping techniques (N’Dri-Stempfer et al., 2009) have been explored within human medicine, the applicability of these techniques in veterinary species has not been assessed. Considering the very high hair density of most veterinary species, it is not likely that tape stripping methods will be able to sample the outer layer (the stratum corneum) of the epidermis, rendering this method not technically feasible for the evaluation of veterinary topical product bioequivalence.

Assessing the comparability of the active pharmaceutical ingredient/formulation

Biomass products.  Medicinal products are pharmaceutically equivalent if they contain exactly the same amount of the same active substance(s) in the same dosage form. Pharmaceutical equivalence does not imply bioequivalence. The lack of bioequivalence may occur due to one or more of the following reasons: differences in the manufacturing process; differences in particle size or crystal structure of the active substance; differences in the excipients. The potential for inequivalence of biomass products results from possible differences in API particle size and in the fermentation bioproducts.

Premixes are, typically, not pure drugs. Similar to other oral dosage forms (such as tablets), they consist of an API(s) and excipients. However, because these represent fermentation processes, the nature of the ‘excipient’ profile cannot be exactly controlled. Accordingly, given the nature of the dosage form and the potential differences in relation to drug concentration, excipients, particle size, and manufacturing process between a pioneer and generic product, there are concerns associated with the evaluation of product bioequivalence. First, we need to be certain that we have adequately characterized the appropriate moieties to measure (considering that the biomass may have multiple APIs present at varying concentrations). Secondly, there is the uncertainty about potential safety concerns (including human food safety) associated with residual bacterial components, such as DNA fragments (Chakrabarty et al., 1990; Webb & Davies, 1993). In particular, we do not as yet know whether or not these transmissible elements could have an impact on the human gut flora.

Efforts to predict the rate and extent of absorption based upon API chemical structure and physicochemical properties (including aqueous solubility and dissolution rates) may not be straightforward. The many factors influencing the rate and extent of absorption of drugs in a veterinary species have been a subject of review elsewhere (Martinez et al., 2002). As discussed by Burton et al. (2002), drug absorption is a complex process that is dependent upon drug properties such as solubility and permeability, upon product formulation, and upon physiological variables, including regional permeability differences, pH, luminal and mucosal enzymology, and intestinal motility.

Biosimilars.  The evaluation of bioequivalence for large molecules is particularly important within the human drug arena where there is rapidly expanding interest in the applicability of the development of these molecules as therapeutics. Although this concern certainly impacts veterinary medicine, the relatively small number of large molecules for veterinary use allows this challenge to be of limited concern to the animal health scientist.

Currently, debates focus on the potential clinical implication of formulation-specific differences in the size and complexity of the active substance, and the nature of the manufacturing process. Excellent reviews on this topic have been published (Roger & Mikhail, 2007; Schellekens, 2009).

Stereoisomers.  Pharmacokinetic processes are often stereospecific. Reasons include the stereospecificity of enzyme systems (metabolism), protein binding characteristics (which can influence glomerular filtration), transporter-mediated excretion, and distribution (which includes specificity in concentration in erythrocytes, specific and nonspecific tissue binding, and penetration into body compartments such as synovial fluids) (Brocks, 2006). Certain stereospecific compounds distribute differently across the various plasma lipoprotein layers, and the characteristics of this stereospecific partitioning can be highly animal species specific, reflecting inter-species differences in the composition of the plasma lipoproteins (Brocks et al., 2000).

The importance of using stereospecific methods when evaluating the pharmacokinetics of chiral molecules is well recognized in human (e.g., Boulton & Fawcett, 2001; Mehvar et al., 2002) and veterinary medicine (Landoni & Lees, 1996). However, the issue of product bioequivalence is a slightly different question and we need to focus on the conditions under which the use of nonstereospecific methods would fail to detect formulation-related inequivalence of the separate enantiomers. There remains substantial debate on this issue.

When using the same route of administration, it was shown that for the majority of drug products, no difference in the product relative bioavailability assessment will occur regardless of whether the analytical method employed is or is not stereospecific (Midha et al., 1998a). Therefore, some have concluded that for racemic drugs with linear pharmacokinetics and minimal to modest stereoselectivity in their kinetic parameters, and for those with nonstereoselective pharmacodynamics, the use of stereospecific analytical methods is not warranted (Mehvar & Jamali, 1997). However, in other cases, concern has been expressed. For example (Mehvar & Jamali, 1997; Srichana & Suedee, 2001; Nerurkar et al., 2005):

  • a) For drugs which exhibit nonlinear pharmacokinetics, the results of bioequivalence studies based on the total drug may differ from those based on the individual enantiomers.
  • b) Discrepancies in equivalence conclusions for Cmax and Tmax (but not for AUC) can occur when testing racemic mixtures whose pharmacokinetics are linear but whose separate enantiomers differ substantially with regard to clearance and/or volume of distribution.
  • c) In some cases, drugs can interact in a stereospecific manner with certain chiral excipients.
  • d) In some cases, not only do the different enantiomers have different physicochemical properties (including differences in their respective aqueous solubility characteristics) but also exhibit markedly different magnitude of solubilization in the presence of solubilizing agents.

In considering the issue of stereospecific bioequivalence assessments, CDER, which evaluates human pharmaceutical products, recommends the following (see CDER, March 2003):

For bioavailability studies, measurement of individual enantiomers may be important. However, in the case of bioequivalence evaluations, it is adequate to measure the racemate using an achiral assay except when all of the following conditions are met:

  • i) the enantiomers exhibit different pharmacodynamic characteristics,
  • ii) the enantiomers exhibit different pharmacokinetic characteristics,
  • iii) primary efficacy and safety activity reside in the minor enantiomer,
  • iv) nonlinear absorption is present (as expressed by a change in the enantiomer concentration ratio with change in the input rate of the drug) for at least one of the enantiomers.

Within veterinary medicine, bioequivalence concepts may be applied to bridging between two routes of administration (e.g., parenteral vs. oral). In this situation, marked differences in the enantiomeric ratio may occur due to stereospecificity in first-pass drug metabolism (liver and gut). This could lead to unequal depletion of the two parent enantiomers or to the generation of active chiral metabolites (Capece et al., 2009). Therefore, additional consideration may be appropriate when the pharmacokinetics of chiral molecules is compared across routes of administration.

Metabolites.  Occasionally, bioequivalence assessments need to be based upon concentrations of an active metabolite. For example, the administered ingredient may be formulated as a prodrug. Reasons for administering the API as a prodrug include its improved solubility, intestinal stability, bioavailability, or even flavor improvement (Testa, 2009). In these situations, concentrations of the administered moiety may be too low to be adequately measured, leaving the active metabolite as the moiety upon which product bioequivalence is based.

Notwithstanding the above exception, there is an overwhelming consensus that product bioequivalence should be based upon comparative concentrations of the parent compound whenever possible (Midha et al., 2004). In this regard, CDER’s (2003) Guidance for Industry states that the ‘concentration–time profile of the parent drug is more sensitive to changes in formulation performance than a metabolite’. The validity of this position, even if the API undergoes extensive first-pass metabolism, was recently reconfirmed through simulation studies (Fernández-Teruel et al., 2009). Thus, there appear to be few cases where both the parent and metabolite (or just the metabolite) need to be assessed when evaluating product bioequivalence. The history of and international perspective on the debated role of metabolites in the evaluation of product bioequivalence is reviewed elsewhere (Jackson et al., 2004). That said, the one remaining question is as follows: if the concentrations of the parent compound are too low to measure, and if multiple active metabolites are present, what factors will determine the choice of metabolite(s) used to describe product comparability?

Statistical issues

Describing rate in the presence of multiple maxima.  The identification of a method to accurately compare the rate of product absorption has always been a challenge, even for immediate-release dosage forms. In particular, despite the wide variety of metrics that have been proposed, all of these metrics were found to be insensitive to changes in absorption rate constants (Bois et al., 1994). These metrics include: center of gravity (Veng-Pedersen & Tillman, 1989), mean absorption time (Jackson & Chen, 1987), maximum entropy (Charter & Gull, 1987) and Cmax/AUC (Endrenyi et al., 1991). Alternatively, others have suggested the use of partial AUCs (Midha et al., 1994). Chen (1992) suggested the use of AUC computed from time zero to some common timepoint for both the test and reference product within each individual. Endrenyi et al. (1998) extended this work further to show that the ability to accurately assess product bioequivalence in the early phase of concentration–time profiles by partial AUCs generally decreases when the duration for measuring the metric is extended.
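Two of the metrics named above, the partial AUC and Cmax/AUC, are simple to compute from an observed profile. The sketch below is illustrative only: it assumes the linear trapezoidal rule with linear interpolation at the cutoff time, and the choice of cutoff (here 1.5 h) is arbitrary rather than taken from any of the cited proposals. The data and function name are hypothetical.

```python
def partial_auc(times, concs, t_end):
    """AUC from the first sampling time to t_end (linear trapezoidal rule).

    If t_end falls between two sampling times, the concentration at t_end
    is obtained by linear interpolation between the neighboring samples.
    """
    auc = 0.0
    for i in range(1, len(times)):
        t0, t1 = times[i - 1], times[i]
        c0, c1 = concs[i - 1], concs[i]
        if t0 >= t_end:
            break
        if t1 > t_end:
            # Truncate the final trapezoid at the cutoff time
            c1 = c0 + (c1 - c0) * (t_end - t0) / (t1 - t0)
            t1 = t_end
        auc += (t1 - t0) * (c0 + c1) / 2.0
    return auc


# Hypothetical profile (time in h, concentration in arbitrary units)
times = [0, 0.5, 1, 2, 4, 8]
concs = [0, 4, 6, 5, 3, 1]

early_exposure = partial_auc(times, concs, 1.5)   # partial AUC over the early phase
total_exposure = partial_auc(times, concs, 8)     # AUC to the last sampling time
cmax_over_auc = max(concs) / total_exposure       # Cmax normalized to AUC
```

In a bioequivalence comparison, the same cutoff would be applied to the test and reference profiles within each individual, as in the approach attributed to Chen (1992).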

Further exploration into this issue resulted in the conclusion that as the goal of bioequivalence trials should be to assure similarity in the shape of the concentration–time curve of the test and reference products, the comparison should be one of ‘peak and total exposure’ rather than of ‘rate and extent of absorption’ (Tozer et al., 1996; Chen et al., 2001). This perspective was further supported in the CDER Guidance for Industry titled: Bioavailability and Bioequivalence Studies for Orally Administered Drug Products – General Considerations (2003) where it is stated that both direct (e.g., rate constant, rate profile) and indirect (e.g., Cmax, Tmax, mean absorption time, mean residence time, Cmax normalized to AUC) pharmacokinetic measures are limited in their ability to assess rate of absorption. Therefore, the focus should be changed from measures of absorption rate to measures of systemic exposure. With that switch in mind, Cmax (peak exposure) and AUC (total exposure) can continue to be used as measures of product bioavailability and bioequivalence.

Although the use of Cmax to define peak exposure may work well in situations where there is a single maximum concentration, its utility becomes blurred when products exhibit multiple absorption maxima. Multiple maxima may be attributable to factors such as enterohepatic recirculation (Roberts et al., 2002; Granero & Amidon, 2008), gastric drug retention and gastric motility cycles (Lipka et al., 1995; Wang et al., 1999; Higaki et al., 2008), and ion trapping (Veldhuyzen van Zanten et al., 1996). When the multiple peaks are associated with the API rather than with the formulation, they should not interfere with our ability to identify formulation-related differences in absorption. Multiple maxima can also occur as a result of product formulation. For example, in the case of diclofenac, a very rapid absorption rate allows for the drug product to pass through the gut without formation of the poorly soluble hydrated form of the API, tetrahydrodiclofenac. If the product is more slowly absorbed, hydration of the API occurs in the stomach and multiple peaks are subsequently observed (Marzo & Reiner, 2004). In the case of diclofenac, any differences in absorption rates are readily detectable as differences in Cmax.

However, in other situations, the presence of apparently random peaks and troughs leads to uncertainties regarding how the ‘rate of absorption’ can be compared. Such challenges are particularly prevalent with certain extended release implants, such as those containing the growth promotants zeranol (Pusateri & Kenison, 1993) and trenbolone (Henricks et al., 1982). These products can release drug for periods exceeding 90 days, during which time huge fluctuations in serum drug concentrations can occur. Assessing the bioequivalence of these products is problematic because no singular absorption rate constant or peak exposure can be defined.

Drug absorption from these implants follows a pattern that can be described by a series of random bolus inputs (Russek-Cohen et al., 1999). Accordingly, a demonstration of equivalent AUC and Cmax values does not necessarily indicate that two products will pay out similarly throughout the implant period. For example, consider the fictitious profiles shown in Fig. 3.

Figure 3.   Example of two curves with the same Cmax and AUC, but with very different profile shapes.
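The point made by the fictitious profiles can be reproduced numerically. The following is a minimal sketch, assuming hypothetical concentrations sampled on an evenly spaced grid and a linear trapezoidal AUC; the names `trapezoid_auc`, `early`, and `late` are illustrative only. On an even grid, a profile and its time-reversed copy share the same trapezoidal AUC and the same Cmax, yet release drug at opposite ends of the implant period.

```python
import math

def trapezoid_auc(times, conc):
    """Linear trapezoidal AUC over the sampled interval."""
    return sum((t1 - t0) * (c0 + c1) / 2.0
               for t0, t1, c0, c1 in zip(times, times[1:], conc, conc[1:]))

# Hypothetical sampling days and serum concentrations (arbitrary units).
times = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]
early = [0, 8, 4, 3, 2, 2, 1, 1, 1, 0]   # most drug released early
late = list(reversed(early))             # same values, released late

same_auc = math.isclose(trapezoid_auc(times, early), trapezoid_auc(times, late))
same_cmax = max(early) == max(late)
tmax_early = times[early.index(max(early))]
tmax_late = times[late.index(max(late))]
print(same_auc, same_cmax, tmax_early, tmax_late)
```

Both summary metrics agree, while the time of peak exposure differs by most of the dosing period, which is why AUC and Cmax alone cannot confirm a similar payout pattern.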

Alternative metrics for comparing absorption rates have been proposed, but currently none has been considered viable in a regulatory environment. For example, Rescigno (1992) proposed a dimensionless index that considers the difference in the concentrations of the test and reference product at each timepoint, with these differences weighted by some positive integer. That integer adjusts the assessment so that one may place more weight on the magnitude of the profile difference vs. the duration of the profile difference. However, a problem with that method was that the resulting value had little pharmacological meaning. Therefore, Mauger and Chinchilli (2000) modified the Rescigno bioequivalence metric to improve its pharmacological relevance. The latter was achieved by considering the integrated area about the relative difference in drug concentrations at any point in time and dividing that value by the total AUC of the reference product. However, how to define ‘equivalence’ when using this metric was not adequately resolved.
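To make the idea of a dimensionless profile-difference index concrete, the sketch below computes the index in the form in which it is commonly written, namely the j-th root of the ratio of the integrals of |R − T|^j and |R + T|^j. That form is an assumption here and should be checked against Rescigno (1992) before use. The trapezoidal integration over sampled timepoints and the function names are likewise illustrative choices.

```python
def _trapz(times, values):
    """Linear trapezoidal integral of sampled values."""
    return sum((t1 - t0) * (v0 + v1) / 2.0
               for t0, t1, v0, v1 in zip(times, times[1:], values, values[1:]))

def rescigno_index(times, ref, test, j=1):
    """Dimensionless difference index (assumed form, see lead-in).

    j is the positive integer weight: larger j emphasises the magnitude
    of the profile difference over its duration.
    """
    num = _trapz(times, [abs(r - t) ** j for r, t in zip(ref, test)])
    den = _trapz(times, [abs(r + t) ** j for r, t in zip(ref, test)])
    return (num / den) ** (1.0 / j)

# Hypothetical profiles: identical curves give 0, non-overlapping curves give 1.
times = [0, 1, 2, 3, 4]
ref = [0.0, 4.0, 2.0, 0.0, 0.0]
shifted = [0.0, 0.0, 0.0, 4.0, 2.0]
print(rescigno_index(times, ref, ref), rescigno_index(times, ref, shifted))
```

The index runs from 0 (superimposable profiles) to 1 (no overlap), which illustrates the difficulty noted in the text: the scale is bounded and dimensionless, but no value on it corresponds to a pharmacologically defined equivalence limit.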

Another ‘difference’ approach was suggested by Russek-Cohen et al. (1999), in which the equivalence determination for these types of products is based upon a two-stage approach. The first stage would involve the traditional approach for demonstrating an equivalent extent of drug absorption based on either the natural log (LnAUC) or AUC values. Assuming that the test product passes this first criterion, we could then compare the time series of two drug profiles by constructing a distance metric on log-transformed data (where xi and yi are the drug concentrations at time i for any two profiles). The distribution of distances within the reference drug treatment (dr-r) is then compared with the distribution obtained with the reference vs. test formulations (dt-r).
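The second stage can be sketched as follows. This example assumes, purely for illustration, a Euclidean distance between log-transformed concentrations; the metric actually used by Russek-Cohen et al. (1999) may differ. The function names and the profiles are hypothetical, and the sketch stops at producing the two distance distributions because the text does not specify how they are compared.

```python
import math
from itertools import combinations, product

def log_distance(x, y):
    """Assumed metric: Euclidean distance between log-concentrations."""
    return math.sqrt(sum((math.log(a) - math.log(b)) ** 2 for a, b in zip(x, y)))

def distance_distributions(ref_profiles, test_profiles):
    """Return (d_rr, d_tr): reference-vs-reference and test-vs-reference distances."""
    d_rr = [log_distance(a, b) for a, b in combinations(ref_profiles, 2)]
    d_tr = [log_distance(t, r) for t, r in product(test_profiles, ref_profiles)]
    return d_rr, d_tr

# Hypothetical profiles: one list of positive concentrations per animal,
# all measured at the same timepoints.
ref_profiles = [[2.0, 5.0, 3.0], [2.5, 4.0, 3.5], [1.8, 6.0, 2.9]]
test_profiles = [[2.2, 4.5, 3.1], [4.0, 2.0, 5.0]]
d_rr, d_tr = distance_distributions(ref_profiles, test_profiles)
print(sorted(d_rr), sorted(d_tr))
```

If the test product behaves like the reference, the d_tr values should be indistinguishable from the d_rr values; a d_tr distribution shifted toward larger distances would indicate a difference in profile shape.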

Niazi et al. (1997) used multiple partial areas and applied several methods of statistical comparison to this work. With regard to the latter, it is important to recognize that as investigators increase the number of tests needed to confirm product bioequivalence, consideration needs to be given with regard to conservation of the Type I error. In other words, if we test equivalence at the α = 0.05 (per tail), and if we have multiple tests (i.e., several partial areas) included in our bioequivalence assessment, then in fact, our overall evaluation of bioequivalence is not at α = 0.05 but rather at a value that exceeds that amount. Thus, there is a risk of inflating the Type I error.

Ultimately, no method has as yet been validated for its applicability to bioequivalence assessments for these complex situations. Therefore, additional work is needed.

Assessing the equivalence of highly variable drugs.  Both CVM’s 2001 Bioequivalence Guidance (#35) and CDER’s July 1992 guidance on Statistical Procedures for Bioequivalence Studies Using a Standard Two-Treatment Crossover Design (the 1992 guidance) recommended that a standard in vivo bioequivalence study design be based on the administration of either single or multiple doses of the test and reference products to healthy subjects. The corresponding statistical analysis of the pivotal pharmacokinetic parameters is based upon the two-one-sided tests procedure (termed average bioequivalence). The latter approach involves the calculation of a 90% confidence interval for the ratio of the averages of the measures for the T and R products. To establish bioequivalence, the calculated confidence interval is expected to fall within the limits of 80–125% for the ratio of the product averages (using Ln-transformed data). However, the applicability of the statistical models for assessing product equivalence is challenged when evaluating products with highly variable pharmacokinetics. A highly variable drug product is generally defined as one whose unexplained error associated with AUC and/or Cmax, when expressed as the coefficient of variation (%CV), is equal to or greater than 30% (Boddy et al., 1995).
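The average bioequivalence calculation described above can be sketched in a few lines. This is a simplified illustration, not the full crossover ANOVA: it treats the data as paired observations and ignores period and sequence effects, which is an assumption of the sketch. The critical value `t_crit` (the 0.95 quantile of Student's t with n − 1 degrees of freedom) is supplied by the caller, and the AUC values are hypothetical.

```python
import math
import statistics

def ci90_ratio(test_vals, ref_vals, t_crit):
    """90% CI for the T/R geometric mean ratio from paired, Ln-transformed data.

    Simplification: paired analysis only (no period or sequence terms).
    t_crit: 0.95 quantile of Student's t with n - 1 degrees of freedom.
    """
    diffs = [math.log(t) - math.log(r) for t, r in zip(test_vals, ref_vals)]
    mean_d = statistics.mean(diffs)
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return math.exp(mean_d - t_crit * se), math.exp(mean_d + t_crit * se)

def is_bioequivalent(ci, lower=0.80, upper=1.25):
    """True when the whole interval lies within the 80-125% limits."""
    return lower <= ci[0] and ci[1] <= upper

# Hypothetical AUC values for 12 animals receiving both products.
ref = [100, 120, 95, 110, 105, 130, 90, 115, 125, 98, 108, 112]
factors = [1.05, 0.95, 1.10, 0.98, 1.02, 0.93, 1.07, 1.00, 0.96, 1.04, 1.01, 0.99]
test = [r * f for r, f in zip(ref, factors)]

ci = ci90_ratio(test, ref, t_crit=1.796)   # t(0.95, 11 df)
print(ci, is_bioequivalent(ci))
```

Because the interval half-width is proportional to the standard error of the mean log difference, a larger unexplained variability widens the interval and makes it harder to fit within the fixed 80–125% limits, which is the difficulty with highly variable drugs.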

This ‘unexplained’ variability is generally estimated by the root mean square error (RMSE) associated with the ANOVA. When employing the traditional two-period, two-sequence, two-treatment crossover study design, the RMSE contains several components, including:

  • a) the within-subject variability
  • b) subject-by-formulation interactions
  • c) unexplained noise, including that associated with the analytical method, study effect, and formulation variation.

When a parallel study design is used (such as with growing animals, products associated with a very long duration of release, or when the use of a crossover will result in undue physiological or psychological stress), the RMSE also contains the between-subject variability that would have otherwise been removed from the treatment comparison if a crossover study design was employed. Accordingly, it is not uncommon to find that the magnitude of the error (the width of the confidence intervals) tends to be greater when using a parallel vs. crossover study design.

Regardless of the source of the variability, if a two-treatment, two-period, two-sequence crossover study design is used, there is a point where the magnitude of variability will lead to the need to include a very large number of subjects in order to meet the bioequivalence criteria. If the variability is sufficiently large, the necessary study size could be cost-prohibitive or not technically feasible. Examples of the relationship between variability and subject number are provided by Hauschke et al. (1999).
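The steep dependence of study size on variability can be illustrated with a common normal-approximation formula for a two-treatment, two-period crossover analysed on the log scale. This is a rough sketch rather than the exact calculation tabulated by Hauschke et al. (1999): it assumes a log-scale within-subject variance of ln(CV² + 1), a hypothetical true T/R ratio of 0.95, and 80% power, and it tends to slightly underestimate the exact t-based values.

```python
import math
from statistics import NormalDist

def approx_total_n(cv_within, true_ratio=0.95, alpha=0.05, power=0.80,
                   upper_limit=1.25):
    """Approximate total subjects for a 2x2 crossover (normal approximation)."""
    sigma2_w = math.log(cv_within ** 2 + 1.0)      # log-scale within-subject variance
    z = NormalDist().inv_cdf
    beta = 1.0 - power
    z_alpha = z(1.0 - alpha)
    # When the true ratio is exactly 1, both one-sided tests limit power equally.
    z_beta = z(1.0 - beta / 2.0) if true_ratio == 1.0 else z(1.0 - beta)
    margin = math.log(upper_limit) - abs(math.log(true_ratio))
    n = 2.0 * sigma2_w * (z_alpha + z_beta) ** 2 / margin ** 2
    return 2 * math.ceil(n / 2.0)                  # even total for balanced sequences

for cv in (0.15, 0.30, 0.50):
    print(f"CV = {cv:.0%}: about {approx_total_n(cv)} subjects")
```

Because the required number scales with the variance, moving from a CV of 15% to 50% multiplies the study size roughly tenfold under these assumptions.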

This statistical challenge is encountered both within human and veterinary medicine. To address this problem, CDER published their January 2001 guidance document titled ‘Statistical Approaches to Establishing Bioequivalence’. Within this document, they describe two alternative approaches for statistical analysis: population and individual bioequivalence. These approaches include inter-product comparisons of the treatment averages and variances. Despite the theoretical benefits associated with this guidance, debate remains on the applicability and benefits of its proposed methodologies (Haidar et al., 2008). The population BE approach assesses total variability of the measure in the population. The individual BE approach assesses within-subject variability for T and R, as well as the subject-by-formulation interaction. Population and individual bioequivalence approaches, but not the average bioequivalence approach, allow for two types of scaling: reference-scaling and constant-scaling. In reference-scaling, the criterion used is scaled to the variability of R. This effectively widens the acceptable confidence limit defining product bioequivalence, thereby allowing for the successful demonstration of product bioequivalence when the reference product is highly variable. The high variability may be due either to formulation or to the pharmacokinetics of the active ingredient. In fact, if the variability is attributable to a poorly formulated reference product, the test product will be rewarded if formulated such that it presents with less variable absorption kinetics.

One of the difficulties encountered when trying to apply the concepts forwarded in the CDER 2001 guidance to situations encountered in veterinary medicine is that although these guidance documents suggest that their algorithms can be applied to studies employing a parallel design, we found that the constants incorporated into the CDER guidance cannot be used unless a crossover study design is employed (M. Martinez, unpublished data). In this regard, we found that the width of the scaled ‘acceptance limits’ was of a magnitude that failed to provide any assurance of product bioequivalence. Accordingly, to remedy this problem, extensive additional work would be needed (i.e., simulations to allow for the determination of a value of the constant that would be consistent with its use in a parallel study design). An additional problem with the use of the scaled method is that when applied to a crossover study design, it necessitates the within-subject replication of treatments [or the use of a four-sequence, two-period Balaam design (e.g., the four sequence groups described as T T; R R; T R; R T for periods 1 and 2, respectively: Jones & Kenward, 1989)] to enable an estimation of the within-subject variability for the test and reference formulations. With the exception of the Balaam design (which has a cost in terms of the efficiency of defining the T/R parameter ratio), the replicate study design is rarely used in veterinary medicine because of issues pertaining to animal growth (e.g., studies in calves) or stress-induced changes in animal physiology.

From a prescribing perspective, within the human drug arena, concerns have been raised regarding scaling approaches, such as the possibility of interstudy differences in the scaling factors and the corresponding risk of lack of interchangeability between generic formulations (Midha et al., 1998b; Tothfalusi et al., 2001). Clearly, for both human and veterinary medicine, the statistical methods for assessing the bioequivalence of highly variable drug products are problematic and will continue to need extensive consideration.

Challenges Unique to Veterinary Medicine

Mastitis products

Intramammary products are injected directly into the mammary gland. In most cases, the drug remains in the udder until it is voided in the milk (Ziv, 1980). However, there are also cases reported whereby a drug, administered via intramammary infusion, does appear in the blood (e.g., Schadewinkel-Scherkl et al., 1993). Unlike parenteral mastitis products, where partitioning of the drug from the blood into the infected udder is a function of the API (unless we are dealing with a targeted delivery system), the transfer and retention of drug in the udder and the ability of the drug to migrate to the site of the infected tissue could in fact be a function of the intramammary product’s formulation. Moreover, as stated by Soback et al. (1990), ‘Mastitis is a disease of the mammary tissue. This is especially pronounced when S. aureus, considered to be a deep tissue invader, is involved. Antibiotic concentrations attained in the milk will have only a low, if any, correlation to the concentrations reached at the foci of infection, i.e., the tissue’. With this point in mind, we have reason for concern with regard to the use of drug concentrations in the milk to confirm product bioequivalence.

A formulation can influence the availability of the drug to the tissue, as well as the rate at which the drug is eliminated (Ehinger & Kietzmann, 2000a,b). Critical formulation-related in vivo performance characteristics include the rate at which the drug dissolves, the ionization of the compound, and the presence of surfactants to facilitate the interaction between drug and biological membrane (Gehring & Smith, 2006). Factors to consider include lipophilicity of the formulation, particle size of the API (if it is a suspension), and vehicle viscosity (Ehinger & Kietzmann, 2000a,b). A further complicating factor is that the distribution of drug into the various portions of the bovine udder is not homogeneous. In fact, in their perfused bovine udder model, Ehinger and Kietzmann (2000a) showed that there is an exponential decrease in tissue concentration as a function of distance from the teat base.

Therefore, when dealing with intramammary products, conventional bioequivalence approaches (e.g., drug measurement in blood or milk) may not be appropriate. The use of a clinical endpoint study may likewise not be an attractive alternative because of the large numbers of animals and the potential complexity of the study design needed to have confidence in a determination of product bioequivalence. Many fundamental questions need to be addressed in an effort to ascertain the appropriate methods for assessing the bioequivalence of these formulations.

Type A medicated articles

Type A medicated articles are used in the manufacture of a complete medicated feed (Type B or C) or for use in drinking water (see Table 2 for details). Medicated feeds may be used as a mechanism for drug delivery for a variety of animal species, including cattle, swine, poultry, horses, and fish (e.g., refer to FDA CVM Guidance #171). To improve stability and shelf-life, and to facilitate homogeneous dispersion of drug in the complete feed, granular premixes have become a standard manufacturing practice (del Castillo & Wolff, 2006).

Table 2.   Definitions of Medicated Articles and Feeds based upon the Code of Federal Regulations (CFR) volume 21, section 558.3
A Type A medicated article is intended solely for use in the manufacture of another Type A medicated article or a Type B or Type C medicated feed. It consists of a new animal drug(s), with or without carrier (e.g., calcium carbonate, rice hull, corn, gluten) with or without inactive ingredients. The manufacture of a Type A medicated article requires an application approved under 21 CFR Sec. 514.105 or an index listing granted under 21 CFR Sec. 516.151
A Type B medicated feed is intended solely for the manufacture of other medicated feeds (Type B or Type C). It contains a substantial quantity of nutrients including vitamins and/or minerals and/or other nutritional ingredients in an amount not less than 25% of the weight. It is manufactured by diluting a Type A medicated article or another Type B medicated feed. The maximum concentration of animal drug(s) in a Type B medicated feed is 200 times the highest continuous use level for Category I (zero withdrawal time) drugs and 100 times the highest continuous use level for Category II drugs (drugs with a withdrawal time). The term ‘highest continuous use level’ means the highest dosage at which the drug is approved for continuous use (14 days or more), or, if the drug is not approved for continuous use, it means the highest level used for disease prevention or control. If the drug is approved for multiple species at different use levels, the highest approved level of use would govern under this definition. The manufacture of a Type B medicated feed from a Category II, Type A medicated article requires a medicated feed mill license application approved under Sec. 515.20
A Type C medicated feed is intended as the complete feed for the animal or may be fed ‘top dressed’ (added on top of usual ration) on or offered ‘free-choice’ (e.g., supplement) in conjunction with other animal feed. It contains a substantial quantity of nutrients including vitamins, minerals, and/or other nutritional ingredients. It is manufactured by diluting a Type A medicated article or a Type B medicated feed. A Type C medicated feed may be further diluted to produce another Type C medicated feed. The manufacture of a Type C medicated feed from a Category II, Type A medicated article requires a medicated feed mill license application approved under 21 CFR Sec. 515.20

Bioequivalence is based upon comparability of the Type A medicated article (i.e., the premix). The Type A article consists of as little as 1 part per 2000 of active substance in a carrier composed of minerals and vitamins, and vegetable matter from various sources, e.g., corn, wheat, soybean, etc. The vegetable matter comprises the largest (w/w) component of the premix. The feed into which the premix is incorporated will vary (across feed manufacturers as well as across time due simply to yearly crop variation). However, the FDA regulates only the Type A medicated article (21 CFR 558.3).

Despite its small proportion to total feed, the premix formulation and method of manufacture can significantly influence the oral absorption of the API. To demonstrate this point, del Castillo and Wolff (2006) adapted the United States Pharmacopeia standard dissolution testing method to determine the dissolution into simulated porcine gastric fluid (pH = 1.6) of four commercial brands of granular chlortetracycline premixes. The test was conducted under standardized conditions with 21 samples collected for analysis over a 120-min period, to enable determination of the fraction of drug which becomes available for absorption. The results revealed complex patterns of chlortetracycline release and major interactions of the dissolving drug with feed particles. Type of premix was the greatest determinant of rate of drug release, with significant differences reported for the four premixes investigated. In addition, there were major differences in the amount of chlortetracycline released, as indicated by available-to-total concentration ratios.

Unlike traditional dosage forms where the administration (intake) is under the control of the investigator, for medicated articles, intake is dependent upon animal behavior. Variability in drug intake is recognized to be an important source of variability in drug response. The inconsistent intake of medicated feed, influenced by individual animal feeding behavior, is a determinant of the rate and extent of drug exposure from medicated feed (Li et al., 2008). This source of variability complicates the assessment of formulation effects because while other dosage forms allow for product administration at a fixed time, the inherent nature of premixes is that intake is free choice. Accordingly, even if presented at a specific time and even if nonmedicated feed has been withheld, the times of intake and the duration of intake at each feeding time are dictated by the animal. Furthermore, if housed as a group (pen), there is competition for feed that will further influence the rate and amount of drug consumed, with the more dominant animals consuming the greater quantity of food (and therefore drug).

Possibly one of the most complicating factors in the design of these studies is that the act of collecting blood samples will lead to an unavoidable disruption of the feeding behavior. Clearly, this will further influence the ability to render an unbiased bioequivalence determination.

To avoid this problem, it could be argued that formulation effects on the bioavailability of the API can be evaluated by administering the feed as a bolus (gavage) dose. Such studies would address the conventional bioequivalence issue: the influence of formulation on drug absorption characteristics. However, in situations when systemic (or gut) drug concentrations are determined by the ad libitum intake of medicated feed or drinking water, the relative exposure assessment also needs to account for a problem unique to this veterinary dosage form: the influence of formulation on the amount of drug ingested in the medicated article, and the rate at which the medicated article is consumed. If a medicated feed (or if medicated water) contains a drug that can lead to taste aversion, and if efforts to mask that bad taste are not comparably successful in the test and reference formulations, then inequivalent drug exposure can occur. For medicated feeds, the impact of formulation on voluntary food intake can be as important to product bioequivalence as is the effect of formulation on drug absorption. Therefore, further deliberations on the best methods for evaluating the bioequivalence of medicated articles are clearly needed.

Biopharmaceutics classification system

The challenges associated with interspecies extrapolation of BCS concepts have been described elsewhere (Martinez et al., 2002, 2004). The BCS is a classification system, originally developed by Dr. Gordon Amidon (Amidon et al., 1995), to define the formulation and bioavailability challenges associated with specific drug physicochemical characteristics. The system is based upon the classification of compounds in accordance with their solubility and membrane permeability:

  • a) CLASS I: High Solubility, High Permeability: generally very well absorbed.
  • b) CLASS II: Low Solubility, High Permeability: exhibits dissolution rate-limited absorption.
  • c) CLASS III: High Solubility, Low Permeability: exhibits permeability-limited absorption.
  • d) CLASS IV: Low Solubility, Low Permeability: very poor oral bioavailability.

By understanding the relationship between a drug’s permeability and solubility, and the dissolution characteristics of the formulation, it is possible to identify situations when in vitro dissolution data can serve as a surrogate for in vivo bioequivalence assessments. The use of this surrogate relies upon the validity of three fundamental assumptions:

  • a) That a comparison of product in vitro dissolution performance accurately reflects the relative differences in product in vivo dissolution behavior.
  • b) If two products present with equivalent in vivo dissolution profiles under all luminal conditions, they will likewise present equivalent drug concentrations at absorptive membrane surfaces.
  • c) For comparable dissolution profiles to assure comparable in vivo absorption, the rate and extent of drug presentation to absorptive membrane surfaces must determine the absorption characteristics of that drug product. In other words, we need to assume that there are no other components of the formulation that can influence drug absorption (Martinez & Amidon, 2002).

CDER has used this classification system to distinguish between those compounds that are unlikely to present with formulation-related problems in drug absorption, those compounds whose bioavailability will likely be dictated by product in vivo dissolution characteristics, and those drugs whose bioavailability will be limited by their ability to cross the GI mucosa (e.g., refer to the CDER Guidance titled, ‘Waiver of In Vivo Bioavailability and Bioequivalence Studies for Immediate-Release Solid Oral Dosage Forms Based on a Biopharmaceutics Classification System’ dated 8/2000).

When applied to humans, solubility is calculated on the basis of the largest strength manufactured. It is defined by the minimum solubility of drug across a pH range of 1–8 and at a temperature of 37 ± 0.5 °C. High-solubility drugs are those drugs that are associated with a ratio of dose to solubility volume that is less than or equal to 250 mL. Unfortunately, this straightforward definition of solubility cannot be extrapolated to veterinary medicine because of interspecies differences in the fluid volume of the GI tract (e.g., refer to CVM guidance #171). Particularly as we consider the dog, the issue of potential nonlinearity in volume/kg needs to be considered because most drugs are labeled for administration on a mg/kg basis (Martinez et al., 2004). Moreover, the pH of the GI tract is associated with marked interspecies differences. For example, while the pH of the human ileum is about 7, that of ruminants is approximately 8. The human posterior stomach has a fasting pH of about 2.5 while that of cats and dogs (carnivores) is closer to 4 (Martinez et al., 2002).
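The human solubility criterion and the four BCS classes can be expressed as a short calculation. The sketch below is illustrative only: the function names and the example dose and solubility values are hypothetical, and the 250 mL limit is the human value quoted above, which, as the text notes, cannot simply be carried over to other species. It is exposed as a parameter for that reason.

```python
def dose_solubility_volume_ml(highest_dose_mg, min_solubility_mg_per_ml):
    """Volume (mL) needed to dissolve the highest dose at the lowest
    solubility observed across the pH range of interest."""
    return highest_dose_mg / min_solubility_mg_per_ml

def is_high_solubility(highest_dose_mg, min_solubility_mg_per_ml,
                       volume_limit_ml=250.0):
    """Human BCS criterion: dose/solubility volume <= 250 mL."""
    volume = dose_solubility_volume_ml(highest_dose_mg, min_solubility_mg_per_ml)
    return volume <= volume_limit_ml

def bcs_class(high_solubility, high_permeability):
    """Map the two binary attributes onto BCS Classes I-IV."""
    if high_solubility:
        return "I" if high_permeability else "III"
    return "II" if high_permeability else "IV"

# Hypothetical drug: 500 mg highest strength, minimum solubility 1 mg/mL.
volume = dose_solubility_volume_ml(500.0, 1.0)
print(volume, bcs_class(is_high_solubility(500.0, 1.0), high_permeability=True))
```

Changing `volume_limit_ml` shows how the same compound could fall into a different class if the relevant gastrointestinal fluid volume differs between species.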

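To examine the effect of pH on drug ionization one can use a rearrangement of the Henderson–Hasselbalch equation:

\[
\frac{[\text{ionized}]}{[\text{unionized}]} = 10^{\,\mathrm{pH} - \mathrm{p}K_a} \ \text{(weak acid)}, \qquad
\frac{[\text{ionized}]}{[\text{unionized}]} = 10^{\,\mathrm{p}K_a - \mathrm{pH}} \ \text{(weak base)}
\]
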
Weakly basic drugs tend to have a slower dissolution rate at higher pH (when more drug exists in its unionized form), whereas weakly acidic drugs dissolve faster at higher pH (when more drug exists in its ionized form). Therefore, weak bases may precipitate when gastric pH is elevated during a meal, resulting in a significant reduction in AUC and Cmax. Conversely, that same meal can increase the dissolution rate of a weak acid by increasing the proportion of drug existing in its ionized state, thereby making it more water-soluble and better absorbed.
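The pH dependence described above can be checked numerically with the standard Henderson–Hasselbalch relationship. The sketch below is illustrative: the function name and the pKa values are hypothetical, and the two gastric pH values are the approximate fasting figures for humans (about 2.5) and for cats and dogs (closer to 4) cited earlier in this section.

```python
def fraction_ionized(ph, pka, kind):
    """Fraction of drug in the ionized form at a given pH.

    kind: "acid" for a weak acid, "base" for a weak base.
    """
    if kind == "acid":
        return 1.0 / (1.0 + 10.0 ** (pka - ph))
    if kind == "base":
        return 1.0 / (1.0 + 10.0 ** (ph - pka))
    raise ValueError("kind must be 'acid' or 'base'")

# Hypothetical weak base (pKa 5) and weak acid (pKa 4).
for ph in (2.5, 4.0):
    base = fraction_ionized(ph, 5.0, "base")
    acid = fraction_ionized(ph, 4.0, "acid")
    print(f"pH {ph}: weak base {base:.3f} ionized, weak acid {acid:.3f} ionized")
```

As pH rises, the ionized fraction of the weak base falls and that of the weak acid rises, which is the direction of change invoked above to explain meal effects on dissolution.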

Similarly, there can be marked interspecies differences in drug permeability (Fahmy et al., 2008). When applied to humans, permeability (Peff, expressed in units of 10⁻⁴ cm per second) is defined as the effective human jejunal wall permeability of a drug. High-permeability drugs are generally those with an extent of absorption greater than or equal to 90% (prior to first-pass metabolism) and are not associated with any documented instability in the GI tract. While the transcellular movement of highly permeable, lipophilic compounds is likely to be similar across animal species, we anticipate differences in the permeability of drugs that are absorbed via paracellular mechanisms due to interspecies differences in pore diameter and effective surface area (Martinez & Amidon, 2002). This can be seen when examining the equation that describes the rate of passive diffusion of any molecule, whether it is absorbed via transport between mucosal cells or through the mucosal membrane (Lennernas, 1995):


where Daq is the diffusion coefficient of the compound in water; λaq is the aqueous diffusion distance; Jfluid is the fluid flow between epithelial cells; α is the ratio of the water flow relative to the solute flux. This ratio is influenced by the existing pressure gradient, and is dependent upon molecular size, volume, charge, and hydration number. It may also be influenced by the dynamic width of the tight junction; Dm is the diffusion coefficient of the compound within the membrane, which is dependent upon factors such as drug lipophilicity, hydrogen bonding capacity, polar surface area of the molecule, molecular volume and shape; Jmax is the maximal transport capacity of the carrier-mediated process; Km is the substrate specificity of the membrane transporter (Michaelis constant); λ is the thickness of the rate-limiting diffusion barrier; Am and Ap are the available surface areas for transcellular and paracellular transport, respectively. Of these parameters, the rate-limiting differences expected to be seen between animal species include Jmax, Km, λ, Am, and Ap.

By understanding the relationship between a drug’s absorption, solubility, and dissolution characteristics, it is possible to define situations when in vitro dissolution data can provide a surrogate for in vivo bioequivalence assessments. This relationship provided the basis of CVM’s Guidance #171 (Guidance for Industry on Waivers of In Vivo Demonstration of Bioequivalence of Animal Drugs in Soluble Powder Oral Dosage Form Products and Type A Medicated Articles).

Defining bioequivalence with destructive sampling study designs

When repeated blood samples are not feasible, the true shape of the blood level profile within each subject cannot be ascertained. In these cases, bioequivalence assessments need to be based upon the use of composite curves. The profile shapes represent the ‘luck of the draw’ as animals are randomly assigned to specific times of sampling (or sacrifice).

For rapidly absorbed, immediate-release formulations, sparse sampling designs can provide a relatively accurate estimate of the AUC, but not of Cmax. Bailer (1988) and Nedelman et al. (1995) described methods for estimating the means and variances about the composite AUC values. Generally, these estimates closely approximate the true AUC and the variance of the study population. However, large errors in the profile shapes can occur (depending upon the randomization procedure). The ability to accurately reproduce the shape of the true population (i.e., to generate complete curves for each subject) is very difficult.
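The composite-curve calculation can be sketched as follows. The example follows the general idea attributed to Bailer (1988): the AUC is a weighted sum of the timepoint means, with trapezoidal-rule weights, and, because each animal contributes only one sample, the variance of the AUC is the corresponding weighted sum of the variances of those means. The function name and the data are hypothetical, and the sketch covers only the sampled interval, with no extrapolation.

```python
import math
import statistics

def bailer_auc(times, samples_by_time):
    """Composite AUC and its standard error from destructive sampling.

    times: sampling times; samples_by_time: one list of concentrations
    (one value per animal) for each sampling time.
    """
    m = len(times)
    weights = []
    for i in range(m):
        left = times[i] - times[i - 1] if i > 0 else 0.0
        right = times[i + 1] - times[i] if i < m - 1 else 0.0
        weights.append((left + right) / 2.0)       # trapezoidal-rule weight
    means = [statistics.mean(s) for s in samples_by_time]
    auc = sum(w * mu for w, mu in zip(weights, means))
    var = sum(w ** 2 * statistics.variance(s) / len(s)
              for w, s in zip(weights, samples_by_time))
    return auc, math.sqrt(var)

# Hypothetical study: five animals sacrificed at each of four timepoints.
times = [1.0, 2.0, 4.0, 8.0]
conc = [[10.0, 12.0, 9.0, 11.0, 13.0],
        [20.0, 18.0, 22.0, 25.0, 19.0],
        [12.0, 15.0, 11.0, 14.0, 13.0],
        [4.0, 5.0, 3.0, 6.0, 5.0]]
auc, se = bailer_auc(times, conc)
print(auc, se)
```

Note that the calculation yields a population-level AUC and its precision but says nothing about any individual animal's curve, which is why the profile shape and Cmax remain poorly characterized.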

To illustrate this point, we simulated 50 individual profiles for a single drug product, but we selected only one timepoint per profile for inclusion in our composite curves. We then repeated this process such that a different set of timepoints was included from each profile. In so doing, we obtained a range of composite curves. From these, we selected the most disparate profiles for inclusion in Fig. 4 to illustrate the potential impact of the subject randomization to a particular timepoint. The composite curves reflect one sample per profile, and the average of five samples (‘animals’) per timepoint. We contrasted these results against the true mean concentrations when averaged across all 50 ‘animals’. Obviously, the magnitude of disparity between composite curves will be a function of the inherent profile variability. Therefore, for example, we can anticipate far more variability (and accordingly, difficulty with the use of composite curves) when testing the equivalence of drugs administered in feed (where feeding behavior provides an additional level of noise to our measurements) vs. formulations where animal behavior does not influence systemic drug exposure.

Figure 4.   Influence of subject selection on estimated concentration vs. time profile when a small number of animals are used per sampling time.

Interestingly, despite the disparity in the shapes of the three curves, the AUC values were remarkably similar. This is shown in Table 3. The fundamental difference between the full profile, subset 1, and subset 2 was the Cmax estimate and the variability in the AUC and Cmax values. In fact, we can see that estimation of peak concentrations will be highly problematic, even in the relatively simple situation described in this simulation.

Table 3.   Impact of subject selection on destructive sample bioavailability estimates (variability of subset AUC estimates based upon the method of Bailer, 1988)
| Study group | AUC (ng·h/mL), mean, %CV | Ratio subset/true (AUC) | Cmax (ng/mL), mean, %CV | Ratio subset/true (Cmax) |
| --- | --- | --- | --- | --- |
| Full profile, mean for all simulated curves | 1197, 19% | | 105, 19% | |
| Subset 1 | 1190, 27% | 0.99 | 106, 31% | 1.01 |
| Subset 2 | 1198, 12% | 1.00 | 119, 15% | 1.13 |

These findings lead to several additional questions such as:

  • a) For any set of conditions, how does the number of subjects per timepoint influence the accuracy of our AUC and Cmax estimates?
  • b) Will the variability in time to peak concentration (due either to formulation or delays associated with oral consumption), affect our ability to accurately characterize the AUC when there is only one sample per animal?
  • c) Even if we can accurately estimate mean AUC values, how do we estimate the confidence interval about the ratio of treatment means?

Furthermore, to evaluate product bioequivalence, an estimate of parameter variance is needed. In these situations, selecting an appropriate method for estimating confidence intervals is an enormous hurdle. Numerous possibilities have been suggested, and the implications of these methods have been reviewed elsewhere (Bonate, 1998). That assessment included the jackknife, the bootstrap, and modifications of the original Bailer method. In examining the confidence interval about an AUC estimate (slightly different from, but still related to, the question of product relative bioavailability), Bonate noted that the primary considerations for method performance include the precision of the confidence interval (i.e., the width of the interval), whether the resulting interval is physiologically plausible (i.e., confidence bounds should not allow for the possibility of negative AUC values), and whether the interval contains the actual population mean.

More recently, methods have been proposed for generating the approximate 1 − α confidence interval about the ratio of mean AUC values when destructive sampling methods are employed (Takemoto et al., 2006; Wolfsegger, 2007; Jaki et al., 2009). Several points emerged from these various simulation studies.

  • a) Some form of resampling technique will likely be needed to stabilize the means and variances associated with the estimated pharmacokinetic parameters and therefore to estimate product relative bioavailability when destructive sampling is used.
  • b) The distributional underpinnings of the data will need to be understood when using these alternative methods for assessing product bioequivalence.
  • c) The relationship between the number of samples (‘animals’) included per timepoint, the number of timepoints evaluated, and the estimated ratio of the treatment means needs to be understood, regardless of the method employed.
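As one illustration of what a resampling approach might look like, the sketch below computes a percentile bootstrap confidence interval for the ratio of mean AUC values, resampling animals with replacement within each timepoint and treatment. It is a generic sketch and not the specific procedure of any of the cited authors. The function names, the 90% level (alpha = 0.10), the number of bootstrap replicates, and the simulated data are all assumptions made for the example. With very few animals per timepoint, a simple percentile interval of this kind may perform poorly, which is part of the reason the distributional and design questions listed above matter.

```python
import numpy as np

def trapezoid_weights(times):
    """Linear trapezoidal weights so that AUC = sum(weights * timepoint means)."""
    times = np.asarray(times, dtype=float)
    w = np.empty_like(times)
    w[0] = (times[1] - times[0]) / 2.0
    w[-1] = (times[-1] - times[-2]) / 2.0
    w[1:-1] = (times[2:] - times[:-2]) / 2.0
    return w

def mean_auc(weights, samples_by_time):
    """AUC of the composite (mean) curve from one-sample-per-animal data."""
    return float(sum(w * np.mean(s) for w, s in zip(weights, samples_by_time)))

def bootstrap_ratio_ci(times, test, ref, n_boot=2000, alpha=0.10, seed=0):
    """Percentile bootstrap CI for mean AUC(test) / mean AUC(ref).

    test, ref : lists (one entry per timepoint) of 1-D concentration arrays.
    Animals are resampled with replacement within each timepoint.
    """
    rng = np.random.default_rng(seed)
    w = trapezoid_weights(times)
    ratios = np.empty(n_boot)
    for b in range(n_boot):
        t_star = [rng.choice(s, size=len(s), replace=True) for s in test]
        r_star = [rng.choice(s, size=len(s), replace=True) for s in ref]
        ratios[b] = mean_auc(w, t_star) / mean_auc(w, r_star)
    lower, upper = np.quantile(ratios, [alpha / 2.0, 1.0 - alpha / 2.0])
    return mean_auc(w, test) / mean_auc(w, ref), float(lower), float(upper)

# Hypothetical data: 6 timepoints, 5 animals per timepoint and treatment,
# both treatments drawn from the same lognormal distribution.
rng = np.random.default_rng(1)
times = [1.0, 2.0, 4.0, 8.0, 12.0, 24.0]
typical = [60.0, 100.0, 90.0, 50.0, 30.0, 5.0]
test = [m * rng.lognormal(0.0, 0.3, 5) for m in typical]
ref = [m * rng.lognormal(0.0, 0.3, 5) for m in typical]

point, lower, upper = bootstrap_ratio_ci(times, test, ref)
print(f"ratio = {point:.2f}, 90% CI = ({lower:.2f}, {upper:.2f})")
```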

It is evident that the fundamental challenges in evaluating product bioequivalence with sparse datasets are as much statistical as they are pharmacokinetic and logistic. Much work is needed to determine the appropriate methods for assessing product bioequivalence in birds, fish, or any situation when destructive sampling methods need to be employed.

Bioequivalence within species but across age or production class

The anatomical differences between true monogastrics (canine or feline species), hindgut fermentors (rodents, rabbits, or horses), foregut fermentors (llamas and alpacas), and ruminants (cattle, goats, or sheep) can result in profound differences in oral drug absorption characteristics (Hunter, 2009). Furthermore, oral absorption characteristics can also vary within a species, as has been demonstrated by the differences in mouth-to-small-intestine transit times in Beagles and mongrel dogs (Sagara et al., 1995). Oral bioavailability studies facilitate our appreciation of the potential influence of physiological, formulation, or circumstantial factors (such as fasting vs. fed state, type of food, animal sex, breed, age, disease condition, time of day of dosing, and route of administration) on the rate and extent of absorption (Toutain & Bousquet-Melou, 2004).

In veterinary medicine, the most extreme example of class differences is seen when dealing with cattle. Energy and nutrient metabolism are profoundly different between lactating dairy cattle, dry cows, and growing beef animals. This is illustrated by experiments in which cattle underwent blood vessel catheterization along the digestive tract and liver to measure nutrient uptake and output. Eisemann et al. (1996) measured nutrient fluxes in beef steers at three stages of growth, while Reynolds et al. (2003) compared late gestation and lactation in dairy cattle. Table 4 provides the minimum and maximum values for steers during growth from 236 to 522 kg body weight and comparable data for dairy cows in the late dry period and near peak lactation. Dry matter intake by growing steers near their mature weight was similar to that of dry cows, while cows near peak lactation consumed almost twice as much on a dry matter basis. Generally, blood flows and nutrient fluxes across the portal-drained viscera and liver in dry cows in late gestation were similar to the maximum values found in growing beef animals. Blood flows and uptake of volatile fatty acids from the digestive tract into portal blood were higher in lactating dairy cows than in either dry cows or in growing beef cattle. There was no net uptake of glucose from the digestive tract, so the glucose required for lactose synthesis came entirely from gluconeogenesis. This is reflected in a markedly higher uptake of propionate, and output of glucose, by the liver in lactating cows.

Table 4.   Splanchnic metabolism in growing beef steers and dairy cows in late gestation and lactation (Eisemann et al., 1996; Reynolds et al., 2003)
Measure | Beef steers (236–522 kg average body weight) | Dairy cows, late gestation | Dairy cows, lactation
Dry matter intake (kg/day) | 5.8–9.9 | 9.6 | 22.1
Production (kg/day) | | |
 Milk fat | | | 1.543
 Milk protein | | | 1.282
 Milk lactose | | | 1.917
Blood flow (L/h) | | |
 Portal vein | 560–884 | 964 | 2093
Net flux of metabolites across portal drained viscera (mmol/h) | | |
 Glucose | −2 to −66 | −24.6 | −4.7
 β-hydroxybutyric acid (BHBA) | 90–171 | 127 | 299
Net flux of metabolites across liver (mmol/h) | | |
 Propionate | −229 to −383 | −322 | −1156
 Butyrate | −53 to −86 | −37.7 | −185.1

As is typical of production for Holsteins near their peak of lactation, the dairy cows in the experiment of Reynolds et al. (2003) produced more than 1.5 kg of fat, 1.2 kg of protein, and 1.9 kg of carbohydrate (lactose) in milk over a 24-h milking cycle (Table 4). In contrast, a 300 kg steer with a shrunk weight gain of 1.3 kg/day deposited approximately 140 g of protein and 460 g of fat and negligible carbohydrates in daily carcass gain. This translated to the lactating dairy cow producing more than three times as much fat and at least eight times as much protein as the growing beef animal deposited in carcass gain each day. The high level of milk production and the high nutrient intakes needed by lactating dairy cows to sustain this level of production demand very different ration formulation and feeding strategies from beef animals. The National Research Council finds it necessary to issue separate recommendations on the nutrient requirements of each class of animal (National Research Council, 2000, 2001).
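The fat and protein comparison above can be checked with simple arithmetic. The short Python sketch below uses the milk component values from Table 4 and the approximate steer deposition figures quoted in the text; the variable names are ours.

```python
# Daily output in milk (kg/day) for the lactating dairy cow (Table 4).
milk_fat_kg = 1.543
milk_protein_kg = 1.282

# Approximate daily deposition in carcass gain (kg/day) for a 300 kg steer.
steer_fat_kg = 0.460
steer_protein_kg = 0.140

fat_ratio = milk_fat_kg / steer_fat_kg
protein_ratio = milk_protein_kg / steer_protein_kg

print(f"fat: {fat_ratio:.1f}-fold, protein: {protein_ratio:.1f}-fold")
# fat: 3.4-fold, protein: 9.2-fold
```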

It is also recognized that there are class-related factors, such as regional blood flow, intrinsic hepatic metabolism, and renal clearance that can lead to differences in the elimination of xenobiotics (therapeutics, nutrients, and environmental toxins). This could possibly result in a decrease in the activity of a compound (Short, 1994). For example, these physiological differences have been shown to manifest as significant differences in the pharmacokinetics of ceftazidime and ketoprofen in lactating vs. nonlactating cattle (Rule et al., 1996; Igarza et al., 2004). There is also information indicating the presence of breed differences in drug pharmacokinetics in cattle (Sumano et al., 2001; Sallovitz et al., 2002; Giantin et al., 2008).

Based upon the available information on drug pharmacokinetics, it is evident that the effects of breed, class, age, gender, etc., may influence product prescribability (i.e., the relationship between dose, drug exposure, and drug effect). Accordingly, class differences often need to be studied when a drug sponsor seeks to expand a label indication to include an additional class of animal. However, what is not clear is whether these same physiological differences can influence drug switchability (i.e., whether there may be an interaction between the breed, class, age, gender, etc., of the animal and the influence of the product formulation on drug bioavailability). It is the latter question that is relevant to the assessment of product bioequivalence.

Although product bioequivalence needs to be confirmed for each major species included on the pioneer product label (CVM Guidance #35), it is considered unlikely that factors within a species, such as age, sex, weight, and breed, will lead to unique interactions with a product’s formulation. Accordingly, generally only one bioequivalence study per species is needed. However, there are certain obvious exceptions, such as when an oral product is intended for use in both young preruminant calves and in older animals with functional rumens. In this situation, separate bioequivalence studies in each physiological state might be needed. Similarly, Martinez et al. (2001) posed the question ‘are we certain that products found to be equivalent in young, healthy animals will also be equivalent in geriatric or pediatric populations?’

From a bioequivalence perspective, the important question is whether inherent physiological differences within an animal species can lead to a class-by-formulation interaction. To date, no solid evidence either confirming or refuting such interactions has been published. Therefore, additional work is needed in this area.

Concluding Thoughts

Clearly, these points are only the beginning of the challenges that both human and veterinary medicine will face when assessing the bioequivalence of future products and dosage forms. Will it even be feasible to have generics of products intended for targeted drug delivery? Or, what about stealth liposomes where the challenge is no longer absorption but rather clearance?

While such questions showcase the challenges associated with a burgeoning technology, it is difficult to begin addressing such concerns until we can adequately resolve the bioequivalence quandaries currently facing veterinary medicine. Each of the respective issues raised in this manuscript is a thesis project unto itself. Yet, the reality is that unless we can work toward scientifically and statistically sound solutions, we will be unable to evaluate product bioequivalence for many of the veterinary pharmaceuticals that are soon to come off patent. Within the same perspective, unless these issues can be resolved, pharmacokinetic bridging studies may not be a mechanism for evaluating the equivalence of the innovator formulations as they undergo the many changes often encountered during their market life.

These questions will be discussed in detail during the collaborative AAVPT and European College of Veterinary Pharmacology and Toxicology (ECVPT) workshop to be held in Bethesda, MD, during June 27–30, 2010. Additional information on the agenda is available at or

This workshop will provide a platform for dialog on each of these topics. It is also our goal that by the end of the workshop, Working Groups will be formed to explore each of these questions in detail.


The contents of this article are the sole responsibility of the authors. The opinions do not reflect those of the US FDA.