Determining the strength of the ancient geomagnetic field (paleointensity) can be time-consuming and can result in high data rejection rates. The current paleointensity database is therefore dominated by studies that contain only a small number of paleomagnetic samples (n). It is desirable to estimate how many samples are required to obtain a reliable estimate of the true paleointensity and the uncertainty associated with that estimate. Assuming that real paleointensity data are normally distributed, an assumption adopted by most workers when they employ the arithmetic mean and standard deviation to characterize their data, we can use distribution theory to address this question. Our calculations indicate that, if we wish to have 95% confidence that an estimated mean falls within a ±10% interval about the true mean, as many as 24 paleomagnetic samples are required. This is an unfeasibly large number for typical paleointensity studies. Given that most paleointensity studies have small n, it is essential that confidence intervals around estimated means are adequately defined. We demonstrate that the estimated standard deviation is a poor means of defining confidence intervals for n < 7; instead, the standard error should be used to provide a 95% confidence interval, thus facilitating consistent comparison between data sets of different sizes. The estimated standard deviation should, however, retain its role as a data selection criterion because it is a measure of the fidelity of a paleomagnetic recorder. To ensure consistent confidence levels, within-site consistency criteria must depend on n. Defining such a criterion at the 95% confidence level results in the rejection of ∼56% of all currently available paleointensity data entries.
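The sample-size argument above can be illustrated with the standard normal-theory relationship between the relative half-width of a 95% confidence interval, the within-site scatter, and n. The sketch below is not the paper's calculation; the 25% coefficient of variation used here is an illustrative assumption (the abstract does not state the scatter value assumed), chosen because it reproduces the quoted n ≈ 24:

```python
import math

def rel_half_width(cv, n, z=1.96):
    """Normal-theory 95% CI half-width relative to the true mean,
    for within-site coefficient of variation `cv` (treated as known)
    and n independent samples: z * cv / sqrt(n)."""
    return z * cv / math.sqrt(n)

# Hypothetical within-site scatter of 25% (cv = 0.25):
for n in (5, 10, 24):
    print(n, round(rel_half_width(0.25, n), 3))
# With cv = 0.25, n = 24 gives a relative half-width of ~0.100,
# i.e. the estimated mean lies within +/-10% of the true mean
# at 95% confidence only once n reaches the mid-twenties.
```

Inverting the same formula, the required sample size is n = (z·cv/δ)² for a target relative half-width δ, which for cv = 0.25 and δ = 0.10 evaluates to ≈ 24.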