Ensemble methods aim to combine multiple learning machines to improve efficacy in a learning task, measured in terms of prediction accuracy, scalability, and other criteria. These methods have been applied to evolutionary machine learning techniques, including learning classifier systems (LCSs). In this article, we first propose a conceptual framework that allows ensemble-based methods to be categorized appropriately for fair comparison and that highlights gaps in the corresponding literature. The framework is generic and consists of three sequential stages: a pre-gate stage concerned with data preparation; a member stage accounting for the types of learning machines used to build the ensemble; and a post-gate stage concerned with methods for combining ensemble output. A taxonomy of LCS-based ensembles is then presented using this framework. The article then focuses on comparing LCS ensembles that use feature selection in the pre-gate stage. An evaluation methodology is proposed to systematically analyze the performance of these methods. Specifically, LCS ensemble methods based on random feature sampling and on rough set feature selection are compared. Experimental results show that the rough set-based approach achieves significantly better classification accuracy than the random subspace method on problems with many irrelevant features. The performance of the two approaches is comparable on problems with many redundant features.
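To make the random subspace baseline concrete, the following is a minimal sketch (not from the article) of a random-subspace ensemble: each member is trained on a randomly sampled feature subset in the pre-gate stage, and member outputs are combined by majority vote in the post-gate stage. The nearest-centroid base learner and the toy dataset with injected irrelevant features are illustrative assumptions, standing in for the LCS members used in the article.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 2 informative features plus 8 irrelevant
# noise features, mimicking a problem with many irrelevant features.
n = 200
X_inf = rng.normal(size=(n, 2))
y = (X_inf[:, 0] + X_inf[:, 1] > 0).astype(int)
X = np.hstack([X_inf, rng.normal(size=(n, 8))])

def fit_centroids(X, y):
    """Nearest-centroid base learner: store the per-class mean vector."""
    return {c: X[y == c].mean(axis=0) for c in np.unique(y)}

def predict_centroids(model, X):
    """Assign each row to the class with the nearest centroid."""
    classes = sorted(model)
    dists = np.stack([np.linalg.norm(X - model[c], axis=1) for c in classes])
    return np.array(classes)[dists.argmin(axis=0)]

def random_subspace_ensemble(X, y, n_members=11, k=5, rng=rng):
    """Pre-gate stage: train each member on a random feature subset."""
    members = []
    for _ in range(n_members):
        feats = rng.choice(X.shape[1], size=k, replace=False)
        members.append((feats, fit_centroids(X[:, feats], y)))
    return members

def ensemble_predict(members, X):
    """Post-gate stage: combine binary member votes by majority."""
    votes = np.stack([predict_centroids(m, X[:, f]) for f, m in members])
    return (votes.mean(axis=0) > 0.5).astype(int)

members = random_subspace_ensemble(X, y)
acc = (ensemble_predict(members, y := y) == y).mean() if False else \
      (ensemble_predict(members, X) == y).mean()
```

A feature-selection-based ensemble (such as the rough set approach compared in the article) would replace the random `rng.choice` step with subsets chosen by a selection criterion, leaving the member and post-gate stages unchanged.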