Replication Needed to Distinguish Alterations in Cell Ratios, the Frequency of Individual Stages of the Cycle of the Seminiferous Epithelium, or the Appearance of Abnormalities in the Testes of Rodents, Rabbits, or Humans
Department of Biological Sciences, University of New Hampshire, Durham, NH 03824 (e-mail: firstname.lastname@example.org).
ABSTRACT: The typical levels of inherent variability among untreated sexually mature rodents, rabbits, or men were determined for several endpoints used to characterize spermatogenesis. The latter included germ cell: germ cell and germ cell: Sertoli cell ratios, the frequency of specific stages of the cycle of the seminiferous epithelium, and the appearance of a variety of testicular abnormalities. Based on this variability, the number of replicates needed to provide future experiments of a predictable power and sensitivity was estimated. Replication requirements differed greatly as a function of species and/or among specific endpoints within a single species. In addition, the replication needed to provide robust experiments was substantially greater than that employed in most investigations. The application of these findings during the design and interpretation of future experiments was discussed.
Spermatogenesis involves a lengthy, complex process that is of tremendous importance to male fertility. It is also quite susceptible to disruption by a variety of agents (Amann and Berndtson, 1986; Berndtson and Clegg, 1992). Accordingly, numerous studies have been undertaken to increase basic understanding of this process, and there is an ongoing need to assess spermatogenic responses to known or suspected environmental toxins, proposed new human or animal drugs, or other factors. A variety of endpoints has been utilized for such purposes. One that has proven quite useful is the numerical ratio of various cell types to each other. This endpoint has several applications. First, a doubling in the number of daughter cells has been used to identify the number and timing of cell divisions when establishing models for the kinetics of spermatogenesis (eg, Clermont, 1954, 1962, 1972; Clermont and Leblond, 1959; Clermont and Antar, 1973; Berndtson and Desjardins, 1974a; Bilaspuri and Guraya, 1984). In addition, comparisons of actual vs theoretical yields of specific germ cells have permitted the efficiency of specific cell divisions to be quantified (eg, Berndtson, 1977; Huckins, 1978; Berndtson and Igboeli, 1989; Castro et al, 2002). Similar comparisons of the ratio of young vs old cells of the same type at different points during spermatogenesis provide insight into the magnitude of cell losses during the course of cell maturation (Johnson et al, 1983). Differences between specific germ cell ratios in control vs treated subjects allow quantification of treatment effects on cell losses and/or spermatogenic efficiency (Berndtson, 1977; Amann and Berndtson, 1986; Russell et al, 1990; Berndtson and Foote, 1997).
Finally, the number of germ cells per Sertoli cell is the basis for one method for quantifying relative rates of sperm production (Clermont and Morgentaler, 1955; Berndtson, 1977), and such ratios have been used to assess the functional capacity or workload of individual Sertoli cells (Russell and Peterson, 1984; Johnson, 1986; Jones and Berndtson, 1986; Berndtson et al, 1987a,b; Berndtson and Igboeli, 1989; Berndtson and Jones, 1989; Berndtson and Thompson, 1990b; Thompson and Berndtson, 1993; Castro et al, 2002).
Another useful endpoint is the frequency of a specific stage(s) of the cycle of the seminiferous epithelium. Although spermatogenesis involves a continuum of events, researchers often find it useful to characterize this process via the identification of stages, which constitute manmade divisions of this process. If one could observe a single cross section of a seminiferous tubule over time, one would detect changes in its appearance. Cell divisions and the progressive development of specific cells would produce a number of distinct combinations of cells or cellular associations. These cellular associations would continue to change until the seminiferous epithelium once again appeared as it did initially. The interval from the appearance of one cellular association until its reappearance constitutes 1 cycle of the seminiferous epithelium, and the different cellular associations that have been identified have been designated as stages of the cycle of the seminiferous epithelium (Clermont, 1972; Berndtson, 1977).
One reason that staging is valuable is that some cells divide or are present only during specific stages. If one wished, for example, to determine the number of germ cells of a particular type per seminiferous tubular cross section or the ratios of one type of germ cell to another, it would be important to select tubular cross sections at a stage(s) containing the cell(s) of interest. Several other applications of stage frequency data are based on the expectation that, with adequate sampling, the frequency at which one observes tubular cross sections at a particular stage would equal the relative duration of that stage. If the duration of the cycle of the seminiferous epithelium is known, the duration of a specific stage can be determined. For example, if 1 cycle of the seminiferous epithelium required 13.5 days and the frequency of stage I was 10%, one would know that stage I required 10% of the cycle, or 1.35 days. Knowledge of the duration of individual stages has been useful when examining the progression of adverse treatment effects during periods of exposure to antispermatogenic agents and/or to predict the time course for recovery after the cessation of treatment (Foote and Berndtson, 1992). The duration of individual stages has also been useful in determining time divisors that enable estimation of daily sperm production rates from data generated via volume density or homogenization techniques (Amann and Almquist, 1962; Johnson et al, 1980, 1981; Robb et al, 1978).
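The stage-duration arithmetic in the preceding example can be sketched in a few lines of Python. The function name `stage_duration_days` is hypothetical; the 13.5-day cycle and 10% frequency are the values used in the text, and the calculation assumes that stage frequency equals relative stage duration.

```python
def stage_duration_days(cycle_days: float, stage_frequency_pct: float) -> float:
    """Duration of one stage, assuming its frequency equals its relative duration."""
    if not 0.0 <= stage_frequency_pct <= 100.0:
        raise ValueError("stage frequency must be a percentage between 0 and 100")
    return cycle_days * stage_frequency_pct / 100.0


# Example from the text: a 13.5-day cycle and a stage I frequency of 10%.
print(round(stage_duration_days(13.5, 10.0), 2))  # 1.35
```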
The frequency of the stages can also be a valuable endpoint in itself. Specifically, differences in the frequency of individual stages between treated and untreated subjects could provide evidence of treatment effects on the timing of certain spermatogenic events. Alternatively, where extremely harsh treatments result in the complete absence of germ cells of a particular type, the composition of some cellular associations would be altered. This might cause tubules at one stage to be mistaken for those at another. For example, both stages VIII and IX of normal male rats contain type A spermatogonia, young and old primary spermatocytes, and spherical spermatids (Leblond and Clermont, 1952). Stage VIII also contains elongated spermatids, which distinctively line the lumen of the tubule prior to spermiation. Such spermatids are absent during stage IX. In the absence of elongated spermatids, as might result from exposure to a toxic agent, stage VIII tubules would appear very similar to those of stage IX. Unless these stages could be distinguished by features of the remaining cells, the frequency of stages might appear different because of incorrect stage identification. Consequently, the apparent frequency or duration of certain stages could be altered even if the timing of spermatogenic events, per se, was unchanged. Although investigators need to remain cognizant of such fundamentally different mechanisms by which stage frequencies might appear altered, it is clear that changes in apparent stage frequencies would, at a minimum, indicate that some alteration in spermatogenesis had occurred.
Cellular degeneration and the presence of other abnormalities represent additional endpoints of some interest. Such features can be observed routinely in normal individuals (Clermont, 1962; Swierstra, 1966; Roosen-Runge and Leik, 1968; Hochereau-de Reviers, 1970; Barr et al, 1971; Berndtson and Desjardins, 1974a; Russell and Clermont, 1977; Huckins, 1978; Wing and Christensen, 1982; Morton et al, 1986). However, an increase or unusually large number of abnormalities would provide evidence of spermatogenic disruption, whether naturally occurring or treatment induced. Where identification of the abnormal cell type(s) is possible, one can determine which cell(s) is the target of the causative factor.
The aforementioned endpoints have many applications, and some of the information that each provides can be quite unique. However, the reliability of the information resulting from their application is an important consideration. Inherent variability is a characteristic of all living things, and such variability is also apparent among the testes of normal, sexually mature mammals. Because of this, it is important to provide adequate replication when performing experiments to assess testicular function. Indeed, the likelihood that an experiment will be capable of detecting an actual treatment response or difference of some magnitude is a direct function of the inherent variability in the endpoint chosen for evaluation and the number of replicates used per treatment group. The inherent variability associated with a number of reproductive endpoints has been characterized, and this information has been used to identify the number of replicates needed to provide future experiments of predictable power and sensitivity (eg, Berndtson, 1989, 1990, 2008; Berndtson et al, 1989, 1997; Berndtson and Thompson, 1990a; Berndtson and Clegg, 1992). As used herein, power denotes the probability that an experiment will be capable of detecting a statistically significant treatment response, if there actually is a treatment effect, and sensitivity represents the minimal size of the response that would be detectable. Similar information is not readily available for use in planning experiments involving cell: cell ratios, the frequency of stages, or the incidence of testicular abnormalities as experimental endpoints. The purpose of this investigation was to characterize the inherent variability in these endpoints among rodents, rabbits, and humans, and to provide estimates of the number of replicates needed to provide future experiments of predictable power and sensitivity when these endpoints and species are to be used.
Materials and Methods
The coefficients of variability (CVs; SD expressed as a percentage of the mean) associated with the endpoints of interest were determined from data identified within the published literature or, alternatively, from raw data sets possessed by the author. Only data for untreated, control populations were used for this study. Because investigators do not normally publish CV values per se, it was usually necessary to calculate these from other data. This was only possible when a publication contained the mean ± SD, or the mean, the SE of the mean, and the number of replicates (n) in the mean. Where the latter information was available, the SD (s) was calculated by entering the SE and values of n into the equation $SE = \sqrt{s^2/n}$ (Berndtson, 1991). CVs were identified for each endpoint for rodents, rabbits, and humans. To avoid the potential for introducing personal biases, CVs were determined for each study that the author reviewed that contained the requisite data, and every value was reported herein. Because the resulting values were based on sample populations of differing size, the typical CV associated with each endpoint was determined as the weighted mean for all studies. However, because the primary objective was to characterize inherent variability and replication requirements for studies with breeding-age males, data for prepubertal or senescent males were excluded from calculation of the weighted mean. In such instances, any exclusion(s) of specific data was cited. The number of replicates required for future studies of predicted power and sensitivity was identified from tables constructed for that purpose (Berndtson, 1991). To use these tables, one must provide an estimate of the anticipated CV for the experimental population to be examined. The “typical” CVs determined herein as described previously were used for that purpose. The investigator must also choose a power and sensitivity combination that is appropriate for the experiment being planned. 
With this information, the required number of replicates can be read directly or extrapolated from the appropriate reference table. For this study, replication requirements were determined for studies that would provide a power of 80% or 90% and the sensitivity enabling detection of 10%, 20%, or 30% changes in each endpoint due to treatment with statistical significance at P ≤ .05.
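The conversion described above, recovering the SD from a published SE and n and then expressing it as a CV, can be sketched as follows. This is a minimal illustration: the function names and the numeric inputs are hypothetical and do not come from any of the studies reviewed.

```python
import math


def sd_from_se(se: float, n: int) -> float:
    """Recover the SD from the SE of the mean: SE = sqrt(s^2 / n), so s = SE * sqrt(n)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return se * math.sqrt(n)


def cv_percent(mean: float, sd: float) -> float:
    """Coefficient of variability: the SD expressed as a percentage of the mean."""
    if mean == 0:
        raise ValueError("CV is undefined for a mean of zero")
    return 100.0 * sd / mean


# Hypothetical published summary: mean = 10.0 cells, SE = 0.5, n = 16 animals.
sd = sd_from_se(0.5, 16)   # 0.5 * sqrt(16) = 2.0
cv = cv_percent(10.0, sd)  # 100 * 2.0 / 10.0 = 20.0
print(sd, cv)
```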
Results
The CVs associated with the number of spermatogonia per Sertoli cell are presented in Table 1. Several types of spermatogonia have been identified in most species (eg, hamster: Clermont, 1954; mouse: Oakberg, 1956; rabbit: Swierstra and Foote, 1963; human: Heller and Clermont, 1964). By classical definition, those possessing nucleoplasm having a smooth, coarse, or intermediate texture have been designated as types A, B, and intermediate (I), respectively. However, further morphological features and kinetic data have enabled the identification of a variety of spermatogonial subtypes (eg, types A1, A2, A3, etc. in the rat). For some investigations (eg, to establish the kinetics of spermatogenesis), it is important to obtain separate counts for each subtype, whereas for others such distinctions may be unnecessary. Thus, the data in Table 1 include several CVs based on the major types of spermatogonia (eg, type A) and others based on specific subtypes or groupings of subtypes (eg, types A1, A2). For one study with the mouse (Huckins and Oakberg, 1978b), all of the spermatogonia present within each stage were pooled, without attempting to distinguish among those of types A, I, or B. Because the use of such broad groupings is uncommon, the CVs identified within that study have been presented but will not be given further consideration herein.
Table 1. CVs associated with the number of spermatogonia per Sertoli cell
|Species||n||Cell type or stage||CV (%)||Reference|
|Rat|| || || ||Berndtson and Thompson (1990a)|
| 60-d-old||25||Type A||25.6|| |
| 150-d-old||23||Type A||23.4|| |
| 240-d-old||24||Type A||18.0|| |
|Mouse^a||10|| || ||Huckins and Oakberg (1978a)|
| Stage I|| ||Type A1 + A||13.2|| |
| Stage II|| ||Type A2 + A||18.2|| |
| Stage III|| ||Type A3 + A||9.8|| |
| Stage III|| ||Undifferentiated A||65.4|| |
| Stage IV|| ||Type A4||8.5|| |
| Stage IV|| ||Undifferentiated A||35.6|| |
| Stage V|| ||Type I||6.1|| |
| Stage V|| ||Undifferentiated A||35.4|| |
| Stage VI|| ||Type B||3.2|| |
| Stage VI|| ||Undifferentiated A||16.0|| |
|Mouse^b||3||Stage I||7.2||Huckins and Oakberg (1978b)|
| || ||Stage III||35.8|| |
| || ||Stage IV||19.5|| |
| || ||Stage V||19.4|| |
| || ||Stage VI||8.8|| |
|Rabbit^c||7||Type A||9.5||Thompson and Berndtson (1993)|
|Human^d||21||Type A—dark||23.6||Skakkebaek and Heller (1973)|
| || ||Type A—pale||23.4|| |
| || ||Type B||23.2|| |
| || ||All types||21.6|| |
For the rodent, the CVs for type A spermatogonia and the various type A spermatogonial subtypes ranged from 7.2% to 65.4%. This is a very wide range, but as noted elsewhere for other endpoints, CVs tend to be greater for those cells that are present in limited numbers, for components of the testis occupying a smaller proportion of the tissue, etc (Berndtson, 1989, 2008). Not surprisingly, the largest of the foregoing reported values (65.4%) was associated with a population of undifferentiated type A spermatogonia at stage III in the mouse. Undifferentiated spermatogonia are present only in very small numbers in sexually mature males. When values for undifferentiated type A spermatogonia were excluded, the CVs for the remaining type A populations ranged from 8.5% to 25.6%, with a weighted mean value of 18.8% (Table 2). Only a single value of 9.5% was identified for the rabbit (Table 1). This value, which was based on type A spermatogonia irrespective of subtype, was within the range but lower than most values reported for rodents. As the only value obtained for the rabbit, it was listed as the typical CV for this endpoint within Table 2. The corresponding CVs for dark and pale type A spermatogonia in one study with the human equaled 23.6% and 23.4%, respectively, for a weighted mean of 23.5% (Table 2).
Table 2. Typical CVs associated with selected endpoints
|Endpoint||Rodent CV (%)||Rabbit CV (%)||Human CV (%)|
|Round spermatids/type A||20.4||…||…|
|Old primary/young primary spermatocyte||3.3||…||…|
|Round spermatids/primary spermatocyte||7.6||…||25.0|
|Secondary spermatocytes/old primary spermatocyte||7.8||…||…|
|Frequency of stages:|| || || |
| Mean for all stages||24.4||18.9||…|
| Range for individual stages||9.0–43.4||9.9–33.0||…|
|Degenerating germ cells||21.8–316.2||…||44.4|
|Spermatid giant cells||…||112.0||…|
In the mouse, intermediate (type I) spermatogonia arise from the division of the most developmentally advanced type A spermatogonia, and they in turn divide to produce type B spermatogonia (Oakberg, 1956). Type I and type B spermatogonia are not present within all stages of the cycle of the seminiferous epithelium. However, because each spermatogonial division has the theoretical potential to yield a doubling in the number of resulting daughter cells, the numbers of type I spermatogonia should exceed the corresponding numbers of type A spermatogonia present during the preceding stage(s). Similarly, the numbers of type B spermatogonia should exceed the numbers of type I spermatogonia. The average numbers of type A, I, and B spermatogonia per Sertoli cell usually reflect these differences. Because of these numerical differences, it is not surprising that the CVs for type I and type B spermatogonia per Sertoli cell in the mouse were smaller than the corresponding values for type A spermatogonia, and equaled 6.1% and 3.2%, respectively (Table 1). Because these were the only CVs identified for these cell types, they are cited as the typical CVs within Table 2. The one CV identified for type B spermatogonia per Sertoli cell in the human equaled 23.2%, which was assumed to be the typical CV for this cell type (Table 2).
Spermatocytes are relatively abundant in the testes of normal, sexually mature males. Therefore, it is not surprising that the CVs associated with the spermatocyte: Sertoli cell ratio (Table 3) were generally lower than the corresponding values for the less numerous type A spermatogonia (Table 1). For primary spermatocytes (preleptotene, leptotene, zygotene, pachytene, and/or diplotene), individual values ranged from 5.4% to 16.4% for rats, from 8.2% to 24.7% for hamsters, from 12.7% to 14.8% for rabbits, and from 21.5% to 32.3% for humans. The weighted mean values for rodents, rabbits, and humans equaled 10.4%, 13.5%, and 28.1%, respectively (Table 2). Only 1 value, equaling 11.5% for the rat, could be identified to serve as the typical CV (Table 2) for the secondary spermatocyte: Sertoli cell ratio.
Table 3. CVs associated with the number of spermatocytes per Sertoli cell
|Species||n||Cell type||CV (%)||Reference|
|Rat||13–15||L||11.1||Berndtson et al (1989)|
|Rat|| || || ||Berndtson and Thompson (1990a)|
| 60-d-old||25||PL||15.5|| |
| 150-d-old||23||PL||6.7|| |
| 240-d-old||24||PL||8.3|| |
| 60-d-old||25||PACH||16.4|| |
| 150-d-old||23||PACH||7.7|| |
| 240-d-old||24||PACH||7.8|| |
|Rat^a||4||PL + L + Z||7.0||Wing and Christensen (1982)|
| || ||PACH + DIPL||5.4|| |
| || ||Secondary||11.5|| |
|Hamster^b||5||PL||11.8||Sinha Hikim et al (1988)|
| || ||PACH||8.2|| |
|Hamster^c||7||Primary||24.7||Barr et al (1971)|
|Rabbit^d||5||L||14.8||Berndtson et al (1989)|
|Rabbit^e||7||Young primary||12.7||Thompson and Berndtson (1993)|
| || ||Old primary||13.3|| |
|Human^f||21||PL||30.5||Skakkebaek and Heller (1973)|
| || ||L||32.3|| |
| || ||Z + PACH||21.5|| |
The CVs associated with the numbers of spermatids per Sertoli cell are presented in Table 4. Values for round spermatids (including step 1–10 spermatids in one study) in rodents ranged from 6.8% to 17.5%. The largest of these values was associated with 60-day-old rats, for which slightly greater among-animal variability might be anticipated because of differential rates of postpubertal testicular development. When that value was excluded, the weighted mean based on all of the remaining studies equaled 8.7% (Table 2). The corresponding CV for elongated spermatids was slightly greater, with a weighted mean of 13.8%, but this value was based on a relatively small sample of animals. Only single CVs of 13.7% (regarded as typical for this endpoint, Table 2) and 9.4% were identified for round and elongated spermatids per Sertoli cell, respectively, in rabbits. The CVs associated with round spermatids per Sertoli cell in 3 studies with the human ranged from 23.3% to 30.4%, with a weighted mean of 26.9% (Table 2).
Table 4. CVs associated with the number of spermatids per Sertoli cell
|Species||n||Cell type||CV (%)||Reference|
|Rat|| || || || |
| 60-d-old||25||Round||17.5||Berndtson and Thompson (1990a)|
| 150-d-old||23||Round||7.0|| |
| 240-d-old||24||Round||7.7|| |
|Rat||3||Elongated||15.7||Russell and Peterson (1984)|
|Rat^a||4||Step 1–10||6.8||Wing and Christensen (1982)|
|Rat||15^b||Round||11.9||Berndtson et al (1989)|
| ||15^c||Round||8.1|| |
|Hamster||5||Elongated||12.7||Russell and Peterson (1984)|
|Hamster^d||5||Round||15.3||Sinha Hikim et al (1988)|
|Rabbit||4||Elongated||9.4||Russell and Peterson (1984)|
|Rabbit^e||7||Round||13.7||Thompson and Berndtson (1993)|
|Human^f||15||Spermatids||23.3||Barr et al (1971)|
|Human|| || || ||Johnson et al (1984)|
| 20–48 y||37||Round||30.4|| |
| 50–85 y||34||Round||27.1|| |
|Human^g||21||Early||23.6||Skakkebaek and Heller (1973)|
| || ||Late||26.4|| |
Data from which the CVs associated with the ratio between the number of germ cells of one type to those for another (ie, the germ cell: germ cell ratio) could be determined were limited, but the values that were identified are summarized in Table 5. For the rat, the number of round spermatids per spermatogonium in stage VII (Leblond and Clermont, 1952) seminiferous tubular cross sections was associated with CVs of 18.3%–22.2% among 60-, 150-, and 240-day-old males, with a weighted mean value of 20.4% (Table 2). For the hamster, the CVs associated with the ratios (yields) of preleptotene primary spermatocytes per spermatogonium, pachytene primary spermatocytes per preleptotene primary spermatocyte, and round step 7 spermatids per pachytene primary spermatocyte in stage VII seminiferous tubular cross sections were 22.5%, 4.5%, and 6.2%, respectively. Although slightly different germ cell populations were examined, the CV of 4.5% for the pachytene: preleptotene ratio in hamsters was relatively similar to the value of 1.8% for the ratio of older (pachytene + diplotene) vs younger (preleptotene + leptotene + zygotene) primary spermatocytes in the rat. The weighted mean value of 3.3% for these 2 studies would seem to represent a reasonable estimate of typical variability for the ratio of old to young primary spermatocytes in untreated rodents (Table 2). Similarly, the CV of 9.4% for the ratio or yield of round step 1–10 spermatids per older (pachytene + diplotene) primary spermatocyte in the rat was similar to the value of 6.2% for the step 7 spermatid: pachytene primary spermatocyte ratio in the hamster. The weighted mean CV for the round spermatid: primary spermatocyte ratio based on these 2 studies equaled 7.6% (Table 2). This value was nearly identical to the one CV of 7.8% associated with the secondary spermatocyte: old primary spermatocyte (pachytene + diplotene) ratio, which was observed for the rat and assumed to be typical (Table 2). 
Only limited comparable information could be identified for the human, for which the CVs associated with the spermatid: spermatocyte ratio ranged from 18.9% to 27.8% (weighted mean = 25.0%, Table 2). CVs associated with germ cell: germ cell ratios could not be identified for the rabbit.
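The weighted means quoted above can be checked with a short calculation. The sketch below assumes that each CV is weighted by the number of males (n) in its study; the Methods state only that a weighted mean was used, but weighting by n reproduces the Table 2 values from the CVs and group sizes listed in Table 5. The function and variable names are hypothetical.

```python
def weighted_mean_cv(cvs_and_ns):
    """Mean of CVs weighted by the number of replicates (n) behind each value."""
    total_n = sum(n for _, n in cvs_and_ns)
    if total_n <= 0:
        raise ValueError("total number of replicates must be positive")
    return sum(cv * n for cv, n in cvs_and_ns) / total_n


# (CV %, n) pairs as listed in Table 5.
rat_spermatids_per_spermatogonium = [(20.7, 25), (22.2, 23), (18.3, 24)]
rodent_old_per_young_spermatocyte = [(1.8, 4), (4.5, 5)]
rodent_spermatids_per_spermatocyte = [(9.4, 4), (6.2, 5)]
human_spermatids_per_spermatocyte = [(18.9, 7), (27.8, 15)]

print(round(weighted_mean_cv(rat_spermatids_per_spermatogonium), 1))   # 20.4
print(round(weighted_mean_cv(rodent_old_per_young_spermatocyte), 1))   # 3.3
print(round(weighted_mean_cv(rodent_spermatids_per_spermatocyte), 1))  # 7.6
print(round(weighted_mean_cv(human_spermatids_per_spermatocyte), 1))   # 25.0
```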
Table 5. CVs associated with germ cell: germ cell ratios
|Species||n||Ratio||CV (%)||Reference|
|Rat|| || || ||Berndtson and Thompson (1990a)|
| 60-d-old||25||Round spermatids: spermatogonium||20.7|| |
| 150-d-old||23||Round spermatids: spermatogonium||22.2|| |
| 240-d-old||24||Round spermatids: spermatogonium||18.3|| |
|Rat^a||4||PACH + DIPL: PL + L + Z||1.8||Wing and Christensen (1982)|
| || ||Secondary: PACH + DIPL||7.8|| |
| || ||Step 1–10: secondary||16.0|| |
| || ||Step 1–10: PACH + DIPL||9.4|| |
|Hamster||5||PL: type A spermatogonia||22.5||Berndtson and Desjardins (1974b)|
| || ||PACH: PL||4.5|| |
| || ||Step 7: PACH||6.2|| |
|Human^b||7||Spermatids: spermatocyte||18.9||Barr et al (1971)|
| ||15||Spermatids: spermatocyte||27.8|| |
Because of the sequential distribution of cellular associations along the length of individual seminiferous tubules (Perey et al, 1961), round seminiferous tubular cross sections of most mammalian testes contain only a single cellular association or stage of the cycle of the seminiferous epithelium. Although stages of the cycle of the seminiferous epithelium have been identified for the human, specific cellular associations (or “stages”) are confined to small, discrete patches distributed along the seminiferous tubules of men (Heller and Clermont, 1964). The latter pattern of distribution has precluded the traditional classification of seminiferous tubular cross sections as belonging to a particular “stage.” Therefore, the CVs associated with the frequency of individual stages could be determined only for the rat and rabbit. Those values are presented in Table 6. These data were based on staging by the acrosomal system of Leblond and Clermont (1952) in the rat, which enables identification of 14 distinct stages. Staging of the rabbit seminiferous epithelium was based on the tubular morphology system of Swierstra and Foote (1963), which permitted identification of 8 different stages. The CVs associated with a single stage of the cycle in the rat (Table 6) ranged from a low of 7.0% for stage VII in the study by Van Beek and Meistrich (1990) to a high of 80.2% for stage XIII in the study by Bartlett et al (1990). The corresponding weighted mean CVs associated with the frequency of stages I–XIV based on these 3 studies equaled 18.9, 24.1, 29.7, 25.4, 17.1, 21.9, 9.0, 23.3, 32.0, 22.2, 35.2, 10.9, 43.4, and 29.0, respectively. Because values differed so markedly among individual stages, both the range of 9.0%–43.4% for individual stages and a mean value of 24.4%, based on all 14 stages, are presented within the summary of typical CVs given in Table 2. 
Similarly, the CVs associated with the frequency of individual stages in the rabbit ranged from a low of 6.0% for stage I in the study of Swierstra and Foote (1963) to a high of 35.8% for stage V in the study of Amann and Lambiase (1969). When averaged over all 3 investigations, the CVs associated with stages I–VIII equaled 9.9%, 10.2%, 21.3%, 17.6%, 33.0%, 18.2%, 18.2%, and 23.1%, respectively (range: 9.9%–33.0%, Table 2). The mean based on all 8 stages equaled 18.9% (Table 2).
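The range and overall mean reported for stage frequencies follow directly from the per-stage weighted mean CVs quoted in the text. The sketch below simply recomputes them; the helper name `summarize` is hypothetical, and the overall mean is taken as an unweighted average across stages, which reproduces the Table 2 entries.

```python
# Per-stage weighted mean CVs (%) as quoted in the text.
rat_stage_cvs = [18.9, 24.1, 29.7, 25.4, 17.1, 21.9, 9.0,
                 23.3, 32.0, 22.2, 35.2, 10.9, 43.4, 29.0]      # stages I-XIV
rabbit_stage_cvs = [9.9, 10.2, 21.3, 17.6, 33.0, 18.2, 18.2, 23.1]  # stages I-VIII


def summarize(cvs):
    """Return (lowest CV, highest CV, unweighted mean CV) across stages."""
    return min(cvs), max(cvs), sum(cvs) / len(cvs)


for label, cvs in (("rat", rat_stage_cvs), ("rabbit", rabbit_stage_cvs)):
    low, high, mean = summarize(cvs)
    print(f"{label}: range {low}-{high}, mean {mean:.1f}")
# rat: range 9.0-43.4, mean 24.4
# rabbit: range 9.9-33.0, mean 18.9
```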
Table 6. Coefficients of variability associated with the frequency of stages of the seminiferous epithelium
|IX||36.9||25.8||40.9|| || || |
|X||32.9||12.1||33.3|| || || |
|XI||31.7||25.8||63.6|| || || |
|XII||12.4||8.9||14.1|| || || |
|XIII||80.2||31.2||25.0|| || || |
|XIV||47.6||17.1||34.0|| || || |
Table 7 contains the CVs identified for the numbers of degenerating germ cells per 100 Sertoli cells (or Sertoli nucleoli) in the rat, and for rates of cellular attrition during the postprophase of meiosis in the human. For the rat, the CVs differed by more than an order of magnitude (21.8%–316.2%), depending on the specific type of germ cell examined and/or the individual study. Six of 10 CVs exceeded 100%, with one high value of 316%. Potential explanations for these large and sometimes inconsistent values are discussed subsequently. However, such inconsistencies precluded the identification of one single level of variability that one might consider as typical or reliable for estimating the power and sensitivity of future investigations. The range of CVs identified herein (Table 7) will be considered for the latter purpose, and is presented among the typical values summarized in Table 2. The CVs associated with cellular attrition during the progression of primary spermatocytes to round spermatids ranged from 37.6% to 51.7% in the human, for which the weighted mean CV was 44.4% (Table 2).
Table 7. Coefficients of variability (CV) associated with numbers of degenerating germ cells
|Species||Animals, cell type, or method||n||CV (%)||Reference|
|Rat||Sprague-Dawley (225–300 g)||10|| ||Russell and Clermont (1977)|
| || Pachytene spermatocytes (Stage VII)^a|| ||158.1|| |
| || Dividing spermatocytes (Stage I)^a|| ||112.9|| |
| || Step 7 spermatids (Stage VII)^a|| ||316.2|| |
| || Step 19 spermatids (Stages VIII–IX)^a|| ||126.5|| |
| || Unidentified cells (Stage VII)^a|| ||158.1|| |
|Rat||250–300 g||8|| ||Russell et al (1981)|
| || Spermatocytes (Stage XIV)|| ||127.3 (132.0)^b|| |
| || Spermatocytes (Stage VII)|| ||57.5 (59.7)^b|| |
| || Step 7 spermatids|| ||21.8 (27.9)^b|| |
| || Step 19 spermatids|| ||38.3 (45.7)^b|| |
| || Unidentified cells (Stage VII)|| ||23.0 (25.4)^b|| |
|Human||26- to 53-y-old|| || ||Johnson et al (1983)|
| || Histometric||10||47.2^c|| |
| || Homogenization||10||51.7^c,d|| |
| || Homogenization||15||37.6^c,d|| |
The CVs associated with the incidence of spermatid giant cells, hypospermatogenesis, spermatogonial swelling, and cytoplasmic vacuoles in rabbits of different ages are depicted in Table 8. For males greater than 52 weeks of age, the CVs associated with those variables and reported as typical CVs (Table 2) were 112.0%, 173.2%, 28.7%, and 37.1%, respectively.
Table 8. Coefficients of variability associated with the incidence of testicular abnormalities in New Zealand rabbits of different ages^a
The typical CVs identified herein have been summarized in Table 2. Those values, in turn, were used to estimate the approximate number of replicates needed to provide future experiments of 80% or 90% power for detecting statistically significant treatment responses of 10%, 20%, or 30% at P ≤ .05. Those replication requirements are summarized in Tables 9, 10, and 11.
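For readers without access to the reference tables, the relationship between CV, power, sensitivity, and replication can be approximated with a standard normal-approximation formula for comparing 2 group means. This is a generic textbook approximation offered as a rough sketch, not the procedure used to build the tables of Berndtson (1991); it tends to give slightly smaller numbers than t-based tables, particularly when groups are small. The function name and its defaults are hypothetical.

```python
import math
from statistics import NormalDist


def replicates_per_group(cv_pct, detectable_change_pct, power=0.80, alpha=0.05):
    """Approximate replicates per group for a two-sided comparison of 2 means.

    Normal approximation: n = 2 * (z_(1 - alpha/2) + z_power)^2 * (CV / change)^2,
    with CV and detectable change both expressed as percentages of the mean.
    """
    if detectable_change_pct <= 0:
        raise ValueError("detectable change must be positive")
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_power = NormalDist().inv_cdf(power)
    n = 2.0 * (z_alpha + z_power) ** 2 * (cv_pct / detectable_change_pct) ** 2
    # Round up, and never report fewer than 2 replicates per group.
    return max(2, math.ceil(n))


# Typical rodent CV for the round spermatid: type A spermatogonia ratio (Table 2).
print(replicates_per_group(20.4, 10, power=0.80))  # 66
print(replicates_per_group(20.4, 10, power=0.90))  # 88
```

For this endpoint, the approximation is close to the 66 and 89 rats per group listed in Table 9 for detecting a 10% change.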
Table 9. Approximate number of rats needed per treatment group to provide experiments of a given level of power and sensitivity^a
|Endpoint||80% power, 10% change||80% power, 20% change||80% power, 30% change||90% power, 10% change||90% power, 20% change||90% power, 30% change|
|Round spermatids/type A||66||18||10||89||24||12|
|Old/young primary spermatocyte||3||2||2||4||2||2|
|Round spermatids/primary spermatocyte||11||4||3||14||5||3|
|Frequency of individual stages^b||95||25||12||127||33||16|
|Degenerating germ cells^c||≥76||≥21||≥10||≥102||≥27||≥13|
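Table 10. Approximate number of rabbits needed per treatment group to provide experiments of a given level of power and sensitivity^a
|Endpoint||80% power, 10% change||80% power, 20% change||80% power, 30% change||90% power, 10% change||90% power, 20% change||90% power, 30% change|
|Frequency of stages^b||57||16||8||77||21||10|
|Spermatid giant cells^c||>1571||>393||>175||>2103||>526||>234|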
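Table 11. Approximate number of men needed per treatment group to provide experiments of a given level of power and sensitivity^a
|Endpoint||80% power, 10% change||80% power, 20% change||80% power, 30% change||90% power, 10% change||90% power, 20% change||90% power, 30% change|
|Round spermatids/primary spermatocyte||99||26||12||132||34||16|
|Degenerating germ cells||305||78||36||416||105||48|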
Discussion
Despite extensive literature on spermatogenesis, the characterization of the inherent variability associated with some endpoints proved quite challenging. In many instances, means were reported, but the additional data needed to calculate CVs were unavailable. Identifying the variability associated with the ratio of one cell type to another proved especially difficult. In most studies, researchers first determined the group mean for the number of cells of each type per unit of tissue. Those means, rather than the values for each male, were used to calculate the ratios between one cell type and another. In such instances, it was not possible to characterize the variability among the individual males in the sample population. Therefore, to expand the present database, the author found it necessary to calculate and include several CVs from sets of raw data in his possession. Where this was done, the reference for the original studies from which these data were taken has been cited.
Different approaches have been used to quantify daily sperm production or to determine the number of cells of a specific type per seminiferous tubular cross section, per gram of tissue, or per testis (Berndtson, 1977). Of these, only 2 have been used extensively to determine cell: cell ratios. One is the classical approach of Clermont and Morgentaler (1955) by which one performs direct counts of the number of germ cells and Sertoli cells per round seminiferous tubular cross section at a given stage of the cycle of the seminiferous epithelium. The second is the volume density approach, in which one employs a system of random “hits” of a pointer to determine the percentage of the testis occupied by specific nuclei. The total volume of those nuclei is calculated, and the total number of cells of that type is estimated by dividing the total volume of those nuclei by the volume of a single nucleus (Berndtson, 1977). A third and more specialized approach has been to examine the number of germ cells embedded in the apex of a single Sertoli cell via reconstruction of serial sections examined by electron microscopy (Russell and Peterson, 1984; Johnson, 1986). The fundamental basis for these approaches and a critical examination of the technical assumptions required with each have been described elsewhere (Berndtson, in press). The inherent variability and replication requirements associated with many endpoints derived with these methods have also been reported (eg, Berndtson, 1989, 1990, 2008; Berndtson et al, 1989, 1997; Berndtson and Thompson, 1990a; Berndtson and Clegg, 1992). Despite their relative merits and shortcomings, each would appear to provide data that are both accurate and reliable when applied to assess spermatogenesis in normal males. Accordingly, the CVs associated with cell ratios have been pooled across assessment methods for presentation and discussion herein.
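The arithmetic of the volume density approach described above might be sketched as follows. This is an illustrative outline only: the function name, the assumption that the reference volume is the testicular parenchyma, and all numerical values are hypothetical and are not taken from the cited studies.

```python
def cells_per_testis(hits_on_nuclei, total_hits,
                     parenchyma_volume_um3, single_nucleus_volume_um3):
    """Estimate cell number from point-count ("hit") data.

    Assumes the fraction of hits falling on nuclei of one cell type equals
    the fraction of the reference volume occupied by those nuclei.
    """
    volume_density = hits_on_nuclei / total_hits           # fraction of tissue
    total_nuclear_volume = volume_density * parenchyma_volume_um3
    return total_nuclear_volume / single_nucleus_volume_um3

# Hypothetical inputs: 50 of 1000 hits on the nuclei of interest,
# 1.5 cm^3 of parenchyma (1.5e12 cubic micrometers), 250 um^3 per nucleus.
estimated_cells = cells_per_testis(50, 1000, 1.5e12, 250.0)
print(f"estimated cells per testis: {estimated_cells:.3g}")
```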
The CVs associated with numbers of germ cells per Sertoli cell included a very wide range of values (Tables 1, 3, and 4). Although the cause(s) may not be understood in their entirety, some of this variability does have a probable or obvious explanation. First, the among-animal variability in the ratio of type A spermatogonia (Table 1), preleptotene and pachytene primary spermatocytes (Table 3), and round spermatids (Table 4) per Sertoli cell was greater for 60-day-old rats than among counterparts aged 150 or 240 days. Potential differences in the rate of testicular maturation could account for the greater variability among younger rats during the immediate postpubertal period. Similarly, the challenge in obtaining testicular tissues from humans has undoubtedly necessitated the use of subjects of different ages, unknown environmental backgrounds, and much less genetic homogeneity in comparison to that for the more highly inbred laboratory species. These factors probably contribute to the greater variation among men. Differences in the size of the reported CVs (Tables 1, 3, and 4) are also apparent among the specific populations of cells being evaluated within the same study. In the author's experience (Berndtson, 2010), CVs tend to be larger for those cells that are less prevalent and for specific tissue components that occupy only a small portion of the testis, and such trends are generally apparent among the typical CVs identified herein. The reason for this relationship can be readily appreciated by simple illustration, as follows. Imagine that the mean percentage of the testicular parenchyma occupied by seminiferous tubules and interstitial tissue for a group of males equaled 90% and 10%, respectively. Because these 2 components must collectively constitute 100% of the testicular parenchyma, any deviation from this mean would cause a proportionately larger change for the smaller component. 
For example, a male in which these 2 components occupied 81% and 19% of the testis, respectively, would deviate from the hypothetical mean by possessing 10% less seminiferous tubular tissue and 90% more interstitial tissue. Accordingly, a much larger CV should be expected for the smaller of these 2 components.
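The arithmetic behind this illustration can be checked with a short sketch. The function name is hypothetical, and the percentages are those of the hypothetical example above.

```python
def relative_change_percent(observed, group_mean):
    """Deviation of one male from the group mean, as a percentage of the mean."""
    return 100.0 * (observed - group_mean) / group_mean

# Group means and one male's values (percentage of testicular parenchyma).
mean_tubules, mean_interstitium = 90.0, 10.0
male_tubules, male_interstitium = 81.0, 19.0

tubule_change = relative_change_percent(male_tubules, mean_tubules)
interstitium_change = relative_change_percent(male_interstitium, mean_interstitium)

print(f"seminiferous tubules: {tubule_change:+.0f}%")       # -10%
print(f"interstitial tissue:  {interstitium_change:+.0f}%")  # +90%
```

The same absolute shift of 9 percentage points therefore represents a relative change 9 times larger for the smaller component.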
In most studies, germ cells were quantified at a specific stage of the cycle. However, whereas some investigators simply counted all of the spermatogonia that were present (eg, per seminiferous tubular cross section), others performed separate counts for each of the spermatogonial types or subtypes. Because some cells are present only in very low numbers, one would expect those cell types to be accompanied by larger CVs. Such a relationship between the numerical size of a population and its relative CV is apparent by comparing data in Tables 1, 3, and 4. Although the number of cells of each type is not stated, each successive mitotic or meiotic division has the potential to produce a theoretical doubling in the number of daughter cells. Thus, spermatogonia should be less numerous than primary or secondary spermatocytes, and spermatids should be the most numerous cells within the seminiferous epithelium. The CVs for many of the spermatogonial populations were in the 20s, whereas those for the spermatocytes and spermatids were more frequently in the teens or single digits. Also, undifferentiated spermatogonia are typically the least numerous spermatogonia in the testes of sexually mature males, and the CVs associated with those cells tended to be very large (Table 1).
Another factor that may have contributed to the variation among CVs in different studies is the sampling intensity employed. For example, because of rigorous technical requirements, the direct enumeration of germ cells embedded in the Sertoli cell apex has often been limited to a small number of Sertoli cells per male. Unless the number of germ cells per Sertoli cell is highly consistent within a single male, the examination of only a few Sertoli cells could not be expected to yield an accurate characterization of each male (Berndtson et al, 1989; Berndtson and Thompson, 1990a). Similarly, some studies involved a relatively small number of experimental subjects. Thus, some of the apparent variability among males observed in Tables 1, 3, and 4 might be attributable to limited replication and/or sampling intensity.
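One way to see how limited sampling intensity could inflate the apparent among-male CV is the following sketch. It rests on assumptions not made in the original text: observations within a male are treated as independent, both CVs are expressed relative to the same overall mean, and the numerical inputs are hypothetical. Under those assumptions, the variance of a per-male mean equals the true among-male variance plus the within-male variance divided by the number of observations per male.

```python
import math

def observed_among_male_cv(true_among_cv, within_male_cv, observations_per_male):
    """Apparent among-male CV (%) when each male is characterized by the
    mean of a limited number of independent observations."""
    return math.sqrt(true_among_cv ** 2
                     + within_male_cv ** 2 / observations_per_male)

# Hypothetical values: true among-male CV of 10%, within-male CV of 30%.
cv_few = observed_among_male_cv(10.0, 30.0, 4)     # few Sertoli cells per male
cv_many = observed_among_male_cv(10.0, 30.0, 100)  # intensive sampling

print(f"apparent CV with 4 observations per male:   {cv_few:.1f}%")
print(f"apparent CV with 100 observations per male: {cv_many:.1f}%")
```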
The variability associated with the ratio of old to young primary spermatocytes in the rodent was quite small (CV = 3.3%, Table 2). Accordingly, it is apparent that rates of attrition during the progressive development of the primary spermatocytes are quite similar among normal, sexually mature rodents. Although the CVs were still relatively small, slightly greater among-male variability was recorded (Table 2) for the progression of events by which primary spermatocytes produce secondary spermatocytes (CV = 7.8%) or round spermatids (CV = 7.6%). A larger CV of 20.4% was noted for the ratio of round spermatids to type A spermatogonia (Table 2). This might be anticipated, because the progression from spermatogonia to round spermatids would encompass a number of mitotic and meiotic divisions extending over multiple cycles of the seminiferous epithelium. Because comparable data for other species were either limited or unavailable, the potential for or magnitude of species differences remains a matter of conjecture. However, the one CV for the human of 25.0% for the ratio of round spermatids per primary spermatocyte was approximately 3 times greater than the corresponding CV in the rodent. The greater variability associated with the human was consistent with that noted for most other endpoints.
Data presented for the rat in Table 6 were based on the acrosomal system for identifying stages developed by Leblond and Clermont (1952), which enables identification of 14 distinct stages, whereas the 8-stage tubular morphology system of classification of Swierstra and Foote (1963) was employed for the rabbit. Because increases in the number of identifiable stages will, in general, decrease the frequency of most individual stages, one would expect to encounter larger CVs with the 14- vs 8-stage systems. In general, the CVs for most stages could be considered to be moderate to large, and it is clear that the values for specific stages sometimes differed substantially among the different studies.
It is interesting to note that the study by Hess et al (1990) with rats involved a much greater number of rats and observations per rat than were used in the other investigations with this species. This greater replication and sampling intensity (ie, number of observations per male) may have contributed to the absence of CVs for individual stages of the great magnitude noted in each of the other studies (eg, 80.2%, 63.6%, etc). Nonetheless, the CVs associated with most stages, even in the study of Hess et al (1990), ranged from the teens to the 30s. Corresponding values appear to be slightly lower for the rabbit, as would be expected at least in part because only 8 different stages were distinguished.
At one time, it was thought that the length of 1 cycle of the seminiferous epithelium, the duration of spermatogenesis (time required to produce a sperm cell from the least differentiated spermatogonium), and the timing of various spermatogenic events were constant and rigidly controlled (Clermont, 1972). If so, the frequency of stages should also be identical and constant among all males. However, moderate to large differences in the frequency of individual stages were apparent even among normal, untreated males (Table 6). When using this endpoint, the importance of taking a sufficient number of observations per male and of including an adequate number of males per treatment group should be apparent. This issue also has important implications for the validity of time divisors used to estimate daily sperm production (DSP). Time divisors are intended to represent the life span of specific testicular germ cells (ie, the length of time during which specific cells are present within the testis). For example, one might determine the total number of round spermatids per testis or per gram of testicular parenchyma. An estimate of DSP could then be derived by dividing the total number of such cells by the appropriate time divisor. Because the frequency of a given stage is directly proportional to its duration, most time divisors are determined from stage frequency data and knowledge of the duration of 1 cycle of the seminiferous epithelium. Because of inherent variability in stage frequencies (Table 6), a time divisor derived by examination of a relatively large number of males would be expected to yield a reliable estimate of the mean for a given population, from which the actual time divisor for some individual males might deviate.
In addition to the inherent variability noted herein, recent evidence has documented the potential for arrested development and stage synchronization (ie, most tubular cross sections containing the same stage) in response to specific experimental treatments (Morales and Griswold, 1987; Bartlett et al, 1990; VanBeek and Meistrich, 1990). The implications of these findings on the validity of time divisors have been described elsewhere at length (Berndtson, in press).
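The relationship among stage frequencies, time divisors, and DSP described above can be outlined in a brief sketch. The function names, the 12-day cycle length, the stage frequencies, and the cell count are all hypothetical values adopted for illustration; they do not represent any particular species or study.

```python
def time_divisor_days(stage_frequencies_percent, cycle_length_days):
    """Life span (days) of a cell type present during the listed stages,
    assuming stage frequency is directly proportional to stage duration."""
    fraction_of_cycle = sum(stage_frequencies_percent) / 100.0
    return fraction_of_cycle * cycle_length_days

def daily_sperm_production(cell_count, divisor_days):
    """DSP estimate: total number of cells divided by the time divisor."""
    return cell_count / divisor_days

cycle_length = 12.0      # hypothetical cycle duration, days
spermatid_count = 1.2e8  # hypothetical round spermatids per testis

# Population mean frequencies (%) of the stages containing the cell type.
population_divisor = time_divisor_days([20.0, 15.0, 15.0], cycle_length)
# One male whose stage frequencies deviate from the population mean.
individual_divisor = time_divisor_days([22.0, 16.5, 16.5], cycle_length)

dsp_population = daily_sperm_production(spermatid_count, population_divisor)
dsp_individual = daily_sperm_production(spermatid_count, individual_divisor)

print(f"population time divisor: {population_divisor:.1f} days")
print(f"DSP using population divisor: {dsp_population:.3g} per day")
print(f"DSP using this male's divisor: {dsp_individual:.3g} per day")
```

In this hypothetical case, applying the population divisor to a male whose own stage frequencies are 10% higher would overestimate his DSP by the same proportion.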
Many of the CVs associated with rates of germ cell degeneration were huge, with a value for Step 7 spermatids in one study with rats determined to equal 316%. Even the smallest recorded CV of 21.8% was of moderate size. Collectively, these data (Table 7) reinforce how a characteristic that is normally observed only at a low frequency will typically be characterized by a large CV. This relationship was further evident from the CVs for a number of different abnormalities observed in one study with the rabbit (Table 8).
The importance of considering inherent variability while designing and interpreting results of a given experiment should be apparent from the data in Tables 9 through 11. Whereas the use of only 4 rats per treatment group would be expected to provide an experiment with a 90% probability (ie, power) for detecting an actual 10% change in the number of type B spermatogonia per Sertoli cell, 89 rats would be required to provide equivalent power for detecting an alteration of similar magnitude in the ratio of round spermatids per type A spermatogonium. Indeed, the number of replicates needed to provide experiments of equivalent power and sensitivity differed by several orders of magnitude among the various endpoints one might wish to utilize, and replication requirements also differed for the same endpoint among species.
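The dependence of replication requirements on the CV can be explored with a minimal sketch based on the familiar normal approximation for a two-sided comparison of 2 group means. This is an assumption adopted for illustration and is not necessarily the procedure used to construct Tables 9 through 11; the function name and defaults are likewise hypothetical. The approximation yields values close to, and for small groups sometimes slightly below, those tabulated.

```python
import math
from statistics import NormalDist

def replicates_per_group(cv_percent, change_percent, power=0.80, alpha=0.05):
    """Approximate number of males per treatment group needed to detect a
    given percentage change between 2 group means (two-sided test).

    Normal approximation: n = 2 * (z_alpha/2 + z_beta)^2 * (CV / change)^2.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1.0 - alpha / 2.0)  # critical value for type I error
    z_beta = z.inv_cdf(power)               # quantile matching desired power
    n = 2.0 * (z_alpha + z_beta) ** 2 * (cv_percent / change_percent) ** 2
    return math.ceil(n)

# CV of 25.0% reported for round spermatids per primary spermatocyte in men.
for power in (0.80, 0.90):
    for change in (10.0, 20.0, 30.0):
        n = replicates_per_group(25.0, change, power)
        print(f"power {power:.0%}, {change:.0f}% change: about {n} per group")
```

Because the required number of replicates scales with the square of the ratio of the CV to the change of interest, doubling the CV roughly quadruples the replication needed.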
Although a relatively large number of replicates are needed to yield powerful and sensitive experiments for assessing treatment effects via most of the endpoints examined herein, many of the studies reviewed for this investigation involved a relatively small number of replicates (see Tables 1, 3, 4, 5, and 7). Such investigations are clearly of very limited power and sensitivity. This conclusion is not intended to imply that certain studies were poorly designed or inadequately replicated. Indeed, decisions about the number of replicates to use in a given study will continue to require consideration of cost and many other factors (Berndtson, 1991, 2008). However, it is important to remain cognizant of the relationship between levels of replication and the power and sensitivity of each experiment. In that regard, it is especially important to realize that most statistical analyses focus only on type I error (ie, the probability of error when declaring the existence of a treatment effect). In the absence of statistical significance, typically at P ≤ .05, many investigators conclude inappropriately that a treatment(s) was without effect. The failure to detect a treatment effect at P ≤ .05 simply indicates that one cannot say with 95% certainty that the treatment has had an effect. It does not assure the absence of a treatment effect, especially if the capacity of the experiment to detect a treatment effect was very limited. Indeed, if one did not detect a treatment response in an experiment possessing 95% power for detecting a 5% change, one could be relatively confident that the treatment was either without effect or that any treatment effect was likely to be quite small. In contrast, if the experiment provided a power of only 80% for detecting a 50% change, the need for a more cautious interpretation would be evident. 
For this reason, the author strongly recommends that investigators adopt the practice of considering and presenting the power and sensitivity of their experiments when interpreting and publishing their results.
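As one possible way of reporting the power of a completed experiment, the following sketch applies the same normal approximation for a two-sided comparison of 2 group means. The function name and the group sizes are hypothetical, and the approximation (which ignores the negligible probability of significance in the wrong direction) is an assumption adopted for illustration rather than the author's own procedure.

```python
import math
from statistics import NormalDist

def approximate_power(cv_percent, change_percent, n_per_group, alpha=0.05):
    """Approximate probability of detecting a true percentage change between
    2 group means with n_per_group males per group (two-sided test)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1.0 - alpha / 2.0)
    # True difference expressed in units of its standard error.
    standardized_difference = (change_percent / cv_percent) * math.sqrt(n_per_group / 2.0)
    return z.cdf(standardized_difference - z_alpha)

# CV of 25.0%, true change of 10%: a large group vs a small (hypothetical) one.
power_large = approximate_power(25.0, 10.0, 99)
power_small = approximate_power(25.0, 10.0, 5)

print(f"99 per group: power of about {power_large:.2f}")
print(f" 5 per group: power of about {power_small:.2f}")
```

With only 5 males per group in this example, the failure to detect a 10% change would say very little about whether such a change had occurred.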
As discussed elsewhere in detail (Berndtson, 2008), readers should not attempt to judge the relative ability of specific endpoints to detect treatment effects based on replication requirements alone. First, the responses to a given treatment usually differ among various endpoints, and such differences may be difficult to predict in advance. For example, a treatment that depressed sperm production would be likely to reduce the number of spermatids per Sertoli cell, but might be without effect on the frequency of specific stages of the cycle of the seminiferous epithelium. Accordingly, one should select from among those endpoints that are most meaningful and appropriate for the intended objectives of a given experiment. In addition, the endpoints characterized by larger CVs are sometimes associated with treatment responses of substantially greater relative magnitude. One hypothetical example presented previously illustrated how a 10% change in the volume density of the seminiferous tubules would produce a 90% change in the volume density of the interstitial tissue. A simple comparison of the number of replicates needed to detect changes of equal magnitude with these 2 endpoints would be clearly misleading. For such reasons, the identification of the endpoint(s) by which one would be most likely to detect a response, if there were an actual treatment effect, is a somewhat complex and challenging exercise. In another investigation (Berndtson, 2008), the author ranked several methods for quantifying sperm production rates based on their expected relative ability to detect actual treatment responses. The approach used in that investigation may serve as a model for others wishing to develop similar rankings for other endpoints.
Because the CVs identified within the literature (Tables 1 through 8) often differed for the same endpoint among individual studies, the replication requirements cited in Tables 9 through 11 should be regarded as approximate values intended to serve as a general guide. Investigators possessing data sets of their own or with access to other published studies may readily characterize the replication requirements that might be most appropriate for future investigations with similar or different populations of experimental subjects or endpoints via the general approach utilized herein. This same approach may also be used to confirm the actual power and sensitivity of completed studies, which can be invaluable during the interpretation of the results.
The reliability of one's results and conclusions is of great importance for most experiments. This investigation has identified and characterized large differences in the inherent variability associated with several endpoints that may be used for assessing testicular function in rodents, rabbits, and men, and the impact of this variability on the number of replicates needed to provide future experiments of a given power and sensitivity. Hopefully, this information will contribute to greater awareness and appreciation of the value of power and sensitivity considerations, while serving as a valuable resource for other investigators during the design and interpretation of future studies with these endpoints and species.