Standard Article

Data Mining, Software Packages for

  1. Dominique Haughton

Published Online: 15 JUL 2005

DOI: 10.1002/0470011815.b2a13093

Encyclopedia of Biostatistics

How to Cite

Haughton, D. 2005. Data Mining, Software Packages for. Encyclopedia of Biostatistics. 2.

Author Information

  1. Bentley College, Boston, MA, USA


The term data mining refers to the identification, within a typically large database, of new, valid, and interesting patterns. Although data mining is most popular in contexts such as database marketing, most of the methods under the data mining umbrella have also been widely applied in biostatistics. We describe the main applications of data mining that have recently arisen in biostatistics and introduce the reader to some of the available data mining software packages, with reference to biostatistical needs.


  • association analysis;
  • CART;
  • data mining;
  • Kohonen maps;
  • MARS;
  • microarray data;
  • neural nets;
  • pharmacovigilance;
  • SAS Enterprise Miner;
  • XLMiner