• catalyst development;
• CO oxidation;
• data mining;
• knowledge extraction;
• statistical analysis


The objective of this work is to demonstrate that valuable knowledge can be extracted from past publications with data mining tools, so that the continuously growing experience accumulated in the literature can be used more effectively. Selective CO oxidation over noble metal catalysts was chosen as the test case for this approach because a considerable number of papers have been published on the subject in the last decade. In total, 249 papers published in the last 12 years were inspected, and 80 of them were used to build a database containing 5610 data points. First, the database was analyzed by decision tree classification to determine the conditions that lead to high CO conversion. Then, the relative importance of various catalyst preparation and operational variables for CO conversion was determined by using artificial neural networks. Finally, the database was separated into smaller clusters by a genetic algorithm-based clustering technique, and the data in each cluster were modeled by artificial neural networks to predict the effects of individual catalyst preparation and operational conditions on the catalytic activity. All of these analyses were effective in extracting knowledge from the literature and in deducing useful trends, rules, and correlations that are otherwise not easily discernible.
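As a rough illustration of the first step, the core operation in decision tree classification is choosing the variable and threshold that best separate high-conversion records from low-conversion ones. The sketch below is not the authors' implementation: the records, the two variables (reaction temperature and metal loading), the "high"/"low" labels, and the function names `gini` and `best_split` are all hypothetical, and only a single split is searched, whereas a full tree would repeat the search recursively on each resulting subset.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure group)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_gini) of the best single split."""
    best = None
    n = len(rows)
    for j in range(len(rows[0])):
        values = sorted(set(r[j] for r in rows))
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (j, t, score)
    return best


# Made-up illustrative records, NOT taken from the paper's database:
# (reaction temperature in deg C, noble metal loading in wt%)
rows = [(80, 0.5), (90, 1.0), (100, 2.0), (110, 0.5),
        (150, 1.0), (160, 2.0), (170, 0.5), (180, 1.0)]
# Hypothetical class labels for CO conversion.
labels = ["low", "low", "low", "low", "high", "high", "high", "high"]

feature, threshold, impurity = best_split(rows, labels)
print(f"split on feature {feature} at {threshold}, weighted Gini = {impurity}")
```

With these invented records the search selects temperature as the splitting variable, which reads as a rule of the form "temperature above the threshold implies high conversion". On a real literature database the same search would run over many more preparation and operating variables.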