The male and female pubertal assays, which are included in the U.S. Environmental Protection Agency's (EPA) Endocrine Disruptor Screening Program (EDSP) Tier 1 battery, can detect endocrine-active compounds operating by various modes of action. This article draws on the collective experience of three laboratories to provide information on pubertal assay conduct, interlaboratory reproducibility, endpoint redundancy, and data interpretation. The various criteria used to select the maximum tolerated dose are described. A comparison of historical control data across laboratories confirmed reasonably good interlaboratory reproducibility. Because the assays rely on apical endpoints, interpreting pubertal assay effects as specifically endocrine-mediated or as secondary to other systemic effects can be problematic, and the mode of action may be difficult to discern. Across 21–23 data sets, relative liver weight, an endpoint that is not specific to endocrine activity, was the most commonly affected endpoint in both the male and female assays. For endocrine endpoints, patterns of effects were generally seen; rarely was an endocrine-sensitive endpoint affected in isolation. In males, the most frequently missed EPA-established performance criteria included mean weights for kidney and thyroid, and the coefficients of variation for age and body weight at preputial separation, seminal vesicle weight, and final body weight. In females, the frequently missed EPA-established performance criteria included mean adrenal weight and mean age at vaginal opening. To ensure specificity for endocrine effects, the pubertal assays should be interpreted using a weight-of-evidence approach as part of the entire EDSP battery. Given how frequently certain performance criteria were missed, an EPA review of these criteria is warranted.