## INTRODUCTION

The hypothesis test approach is commonly used in the National Pollutant Discharge Elimination System Program in the United States to analyze whole effluent toxicity (WET) test data. The test of significant toxicity (TST) was recently proposed as an alternative statistical approach for analyzing WET data. The TST approach uses hypothesis-testing techniques based on previous U.S. Environmental Protection Agency (U.S. EPA) guidance [1] as well as work by many researchers [2–5]. The TST examines whether the results of two treatments differ by an a priori prescribed amount rather than whether they are the same, as in traditional hypothesis testing [1, 4]. The TST approach for WET testing uses the null hypothesis µ_{T} ≤ *b* × µ_{C}, where µ_{T} and µ_{C} denote the mean responses in the treatment (effluent or receiving water) sample and in the control, respectively. This null hypothesis includes a specific value for the ratio µ_{T}/µ_{C}, designated *b* (where *b* is a constant, 0.0 < *b* < 1.0), to delineate unacceptable and acceptable levels of toxicity. It also reverses the inequalities relative to traditional hypothesis testing, so that the sample is assumed to have an unacceptable level of toxicity until demonstrated otherwise.
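The null hypothesis above can be illustrated with a minimal sketch of a Welch-type statistic for µ_{T} − *b* × µ_{C}. The function name `tst_welch_statistic`, the sample data, and the value *b* = 0.75 are hypothetical choices made for illustration only; the scaling of the control variance by *b*² follows from Var(x̄_{T} − *b* x̄_{C}) and is an assumption about how the statistic is formed, not a formula quoted from the text.

```python
import math
from statistics import mean, variance


def tst_welch_statistic(treatment, control, b):
    """Return (t, df) for the null hypothesis mu_T <= b * mu_C.

    Assumed form: the control mean is scaled by b, so its variance
    contribution is scaled by b**2 (illustrative sketch only).
    """
    n_t, n_c = len(treatment), len(control)
    v_t = variance(treatment) / n_t            # s_T^2 / n_T
    v_c = b ** 2 * variance(control) / n_c     # b^2 * s_C^2 / n_C
    t = (mean(treatment) - b * mean(control)) / math.sqrt(v_t + v_c)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (v_t + v_c) ** 2 / (v_t ** 2 / (n_t - 1) + v_c ** 2 / (n_c - 1))
    return t, df


# Hypothetical responses (e.g., counts per replicate); not data from the study.
control = [10, 12, 11, 13]
treatment = [9, 11, 10, 12]
t_stat, df = tst_welch_statistic(treatment, control, b=0.75)
print(t_stat, df)
```

For these made-up data the statistic is roughly 2.32 with about 5.6 degrees of freedom. Because the sample is assumed toxic until shown otherwise, the null hypothesis would be rejected only when the statistic exceeds the one-sided upper critical value of the *t* distribution at the chosen α.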

The TST approach uses Welch's *t* test to compare organism response in an effluent or receiving water sample with the response in a control or reference site sample. Welch's *t* test is a modification of Student's *t* test and is intended for two-sample comparisons in which the variances of the two treatments may be unequal [6]. Welch's *t* test accounts for different variances in the two groups and assumes that the data are normally distributed [6–10]. Many researchers report that when unequal variances are combined with nonnormal distributions, both the traditional *t* test and nonparametric methods (e.g., the Mann-Whitney-Wilcoxon test) have type I error rates that deviate strongly from the nominal rate [11–14]. In these situations, Welch's *t* test has been recommended because its type I error rate is more robust than that of other statistical tests [13–15]. However, for nonnormal data that have skewed, long-tailed distributions (e.g., log-normal or exponential), Welch's *t* test is known to have poor coverage [14]; that is, the realized error rate (α) under the null hypothesis is greater than the intended, nominal value.

WET data are subject to unequal variances between a control and an effluent treatment, particularly for those test methods that measure acute mortality [16]. Thus, use of Welch's *t* test rather than the traditional *t* test (which assumes equal variances) is advisable. However, nonnormality is also a concern when applying Welch's *t* test. If WET test data are nonnormally distributed in a way that does not substantially compromise coverage of Welch's *t* test, such as a leptokurtic distribution [10–14], Welch's *t* test would still be appropriate for analyzing two-sample (e.g., concentration) comparisons of WET data.
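The difference between the traditional *t* test and Welch's *t* test can be made concrete with a short sketch that computes both statistics and their degrees of freedom. The function names and the sample values are hypothetical; the formulas are the standard pooled-variance and Welch-Satterthwaite forms, shown here for a plain two-sample comparison rather than for the TST.

```python
import math
from statistics import mean, variance


def student_t(x, y):
    """Pooled-variance (equal-variance) two-sample t statistic and df."""
    n_x, n_y = len(x), len(y)
    df = n_x + n_y - 2
    pooled = ((n_x - 1) * variance(x) + (n_y - 1) * variance(y)) / df
    t = (mean(x) - mean(y)) / math.sqrt(pooled * (1 / n_x + 1 / n_y))
    return t, df


def welch_t(x, y):
    """Welch's t statistic with Welch-Satterthwaite degrees of freedom."""
    n_x, n_y = len(x), len(y)
    v_x, v_y = variance(x) / n_x, variance(y) / n_y
    t = (mean(x) - mean(y)) / math.sqrt(v_x + v_y)
    df = (v_x + v_y) ** 2 / (v_x ** 2 / (n_x - 1) + v_y ** 2 / (n_y - 1))
    return t, df


# Hypothetical samples with very different spread; not data from the study.
low_variance = [5.0, 5.1, 4.9, 5.0]
high_variance = [3.0, 6.0, 4.5, 7.5]
print(student_t(low_variance, high_variance))
print(welch_t(low_variance, high_variance))
```

With equal sample sizes the two statistics coincide, so the tests differ only in the reference distribution: here the pooled test uses 6 degrees of freedom, whereas the Welch approximation falls close to 3 because nearly all of the variability comes from one group.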

In the present study, we examined the distribution and variance of typical WET test data to determine the suitability of using Welch's *t* test, particularly within the TST approach. We demonstrate that (1) moderately unequal variances observed in WET test data have little effect on coverage of the *t* test or Welch's *t* test (for normally distributed data), and (2) for the type of nonnormally distributed data observed in WET tests using a two-sample design, Welch's *t* test yields coverage close to nominal under the TST approach.
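Coverage in this sense can be estimated by simulation: draw data at the boundary of the null hypothesis (µ_{T} = *b* × µ_{C}) and count how often the test rejects. The sketch below is one hedged way to do this and is not the simulation design used in the present study. All names, the normal sampling model, the parameter values, and the Simpson's-rule approximation to the *t* distribution are assumptions chosen to keep the example self-contained.

```python
import math
import random
from statistics import mean, variance


def t_cdf(t, df, steps=200):
    """Student's t CDF by Simpson's rule on the density (df may be non-integer)."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))

    def pdf(u):
        return math.exp(log_norm - (df + 1) / 2 * math.log1p(u * u / df))

    h = abs(t) / steps                      # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(abs(t))
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 + math.copysign(total * h / 3, t)


def tst_rejects(treatment, control, b, alpha):
    """True if H0: mu_T <= b * mu_C is rejected (assumed Welch-type statistic)."""
    n_t, n_c = len(treatment), len(control)
    v_t = variance(treatment) / n_t
    v_c = b ** 2 * variance(control) / n_c
    t = (mean(treatment) - b * mean(control)) / math.sqrt(v_t + v_c)
    df = (v_t + v_c) ** 2 / (v_t ** 2 / (n_t - 1) + v_c ** 2 / (n_c - 1))
    return 1.0 - t_cdf(t, df) < alpha       # one-sided upper-tail p value


def realized_alpha(mu_c, sd_c, sd_t, b, n=10, alpha=0.05, n_sim=2000, seed=1):
    """Fraction of rejections when the true treatment mean equals b * mu_c."""
    rng = random.Random(seed)
    mu_t = b * mu_c                         # boundary of the null hypothesis
    rejections = 0
    for _ in range(n_sim):
        control = [rng.gauss(mu_c, sd_c) for _ in range(n)]
        treatment = [rng.gauss(mu_t, sd_t) for _ in range(n)]
        rejections += tst_rejects(treatment, control, b, alpha)
    return rejections / n_sim


# Hypothetical scenario: treatment standard deviation twice that of the control.
print(realized_alpha(mu_c=100.0, sd_c=10.0, sd_t=20.0, b=0.75))
```

A realized rate near the nominal α indicates good coverage. Replacing the `rng.gauss` draws with a skewed or leptokurtic sampler would be one way to probe the nonnormal cases discussed above.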