Use of pretransformation to cope with extreme values in important candidate features

Authors

  • Anne-Laure Boulesteix,

    Corresponding author
    1. Department of Medical Informatics, Biometry and Epidemiology, University of Munich, Marchioninistr. 15, 81377 Munich, Germany
    • Phone: +49-89-7095-7598, Fax: +49-89-7095-7491
  • Vincent Guillemot,

    1. Department of Medical Informatics, Biometry and Epidemiology, University of Munich, Marchioninistr. 15, 81377 Munich, Germany
    2. Supélec, Department of Signal and Electronic Systems, F-91192 Gif-sur-Yvette, France
  • Willi Sauerbrei

    1. Institute of Medical Biometry and Informatics, University Medical Center Freiburg, Stefan-Meier-Str. 26, 79104 Freiburg, Germany

Abstract

Extreme values in predictors often strongly affect the results of statistical analyses in high-dimensional settings. Although they occur frequently with most high-throughput techniques, the problem is often ignored in the literature. We suggest using a very simple transformation, previously proposed in a different context by Royston and Sauerbrei, as an intermediate step between array preprocessing and high-level statistical analysis. This straightforward univariate transformation identifies extreme values in continuous features and can thus be used as a diagnostic tool for outliers. The use of the transformation and its effects are demonstrated for diverse univariate and multivariate statistical analyses using nine publicly available microarray data sets.
