Oligotyping assumes that quality-filtering techniques have corrected or eliminated most reads that contain sequencing errors. However, even the most effective quality filtering (Qu, Hashimoto & Morishita 2009; Schroder et al. 2009; Bravo & Irizarry 2010; Leek et al. 2010; Meacham et al. 2011; Minoche, Dohm & Himmelbauer 2011; Quince et al. 2011; Benjamini & Speed 2012; Victoria et al. 2012) will not produce error-free data sets. By using only a fraction of each read to define closely related but distinct organisms, oligotyping drastically reduces the number of nucleotides used for read comparison. Nevertheless, any sequencing error that occurs at one of the selected sites during the generation of oligotypes will spawn a new oligotype. The pipeline provides several parameters that help to identify and discard such noisy oligotypes and reduce the impact of sequencing errors on results. These include (s) the minimum number of samples in which an oligotype is expected to be present, (a) the minimum per cent abundance of an oligotype in at least one sample, (A) the minimum actual abundance of an oligotype across all samples and (M) the minimum count of the most abundant unique sequence in an oligotype. The pipeline can also incorporate machine-reported quality scores to set (q) the minimum quality threshold for bases to be used for oligotyping. As with the selection of variable positions for oligotyping, the noise removal step requires user input. The default values are s = 1, a = 0, A = 0 and M = 4. These values perform well for data sets that contain 1000–10 000 reads and 1–10 samples. However, data set size and the number of samples should be considered when setting the value of each parameter. Our empirical tests with the oligotyping pipeline showed that the criteria s and M eliminate noise most efficiently.
For instance, if there are biological or technical replicates in the experiment, setting s to match the number of replicates will eliminate oligotypes that appear in fewer than s samples. For very large data sets, setting M to equal the average number of reads per sample divided by 1000 will eliminate oligotypes with very low substantive abundance. Although the two parameters are similar, M reduces noise more efficiently than A. Parameter A is comparable to the ‘minimum OTU size’ parameter used by OTU clustering pipelines. However, the number of reads that form an OTU is, on its own, rarely a reliable indicator of the robustness of that OTU. For instance, two OTUs, one composed of 10 unique sequences each with an abundance of 1 and another composed of 1 unique sequence with an abundance of 10, would have the same total abundance but differ in authenticity. Both would have an A value of 10, but the first has a substantive abundance, M, of 1 and the second a substantive abundance of 10. Hence, we suggest using M for noise reduction instead of the more conventional parameter A. The oligotyping pipeline tracks the fate of reads throughout the process and reports the number of reads lost to each quality-filtering criterion in each sample, which makes it possible to detect potential biases in eliminated reads among samples.
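The interplay of the four count-based criteria can be illustrated with a minimal Python sketch. It is not the pipeline's implementation: the `Oligotype` container, the function names and the sequence labels are hypothetical, an oligotype is assumed to be 'present' in a sample when it has at least one read there, M is assumed to be computed on unique sequences pooled across samples, and the thresholds are assumed to be inclusive minima. The quality threshold q is omitted.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Oligotype:
    """Hypothetical container, not the pipeline's own data structure."""
    name: str
    # sample name -> {unique read sequence: number of reads}
    reads_by_sample: dict

    def sample_totals(self):
        # Reads assigned to this oligotype in each sample.
        return {s: sum(c.values()) for s, c in self.reads_by_sample.items()}

    def total_abundance(self):
        # Quantity compared against A: all reads across all samples.
        return sum(self.sample_totals().values())

    def substantive_abundance(self):
        # Quantity compared against M: count of the most abundant unique
        # sequence (assumption: unique sequences are pooled across samples).
        pooled = Counter()
        for counts in self.reads_by_sample.values():
            pooled.update(counts)
        return max(pooled.values(), default=0)


def passes_noise_filters(oligo, sample_sizes, s=1, a=0.0, A=0, M=4):
    """Return True if the oligotype satisfies all of s, a, A and M.

    sample_sizes maps each sample name to its total read count and is
    used to compute per cent abundance. Defaults follow the text.
    """
    totals = oligo.sample_totals()
    present_in = sum(1 for n in totals.values() if n > 0)
    max_percent = max(
        (100.0 * n / sample_sizes[name]
         for name, n in totals.items() if sample_sizes[name] > 0),
        default=0.0,
    )
    return (present_in >= s
            and max_percent >= a
            and oligo.total_abundance() >= A
            and oligo.substantive_abundance() >= M)


def suggested_M(sample_sizes):
    # Heuristic from the text for very large data sets:
    # average number of reads per sample divided by 1000.
    return sum(sample_sizes.values()) / len(sample_sizes) / 1000


# The two cases from the text: equal total abundance, different M.
sample_sizes = {"sample_1": 1000}
noisy = Oligotype("noisy", {"sample_1": {f"seq_{i}": 1 for i in range(10)}})
clean = Oligotype("clean", {"sample_1": {"seq_a": 10}})

for oligo in (noisy, clean):
    print(oligo.name,
          "A-count:", oligo.total_abundance(),
          "M-count:", oligo.substantive_abundance(),
          "kept with defaults:", passes_noise_filters(oligo, sample_sizes))
```

Under these assumptions, a filter based only on A (for example A = 10, M = 0) keeps both oligotypes, whereas the default M = 4 discards the one made up entirely of singleton sequences.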