Evolution of PLS for Modeling SAR and omics Data

Authors

  • Kiyoshi Hasegawa

    1. Chugai Pharmaceutical Company, Kamakura Research Laboratories, Kajiwara 200, Kamakura, Kanagawa 247-8530, Japan
  • Kimito Funatsu

    Corresponding author
    1. The University of Tokyo, Department of Chemical System Engineering, Hongo 7-3-1, Bunkyo-ku, Tokyo 113-8656, Japan fax: +(81)03-5841-7771

Abstract

In quantitative structure-activity relationship (QSAR) studies, multivariate statistical methods are commonly used for data analysis. Partial least squares (PLS) is of particular interest because it can analyze data with strongly collinear, noisy, and numerous X variables, and can simultaneously model several activity variables Y. PLS also provides several prediction regions and diagnostic plots as statistical measures. PLS has evolved to cope with the severe demands imposed by complex data structures. In this review article, we outline the algorithms of five advanced PLS techniques and provide representative examples of each. The selected models are Nonlinear PLS, Multiway PLS, Hierarchical PLS, Orthogonal PLS, and Bi-modal PLS. Studies of particular aspects of living cells (such as the set of genes or proteins in the cell and their interactions) are collectively known as the -omics fields. The omics fields integrate heterogeneous scientific disciplines and include chemogenomics, proteomics, and metabolomics. The datasets produced within the omics fields are numerous, megavariate, and extremely complex. The data structures are frequently incomplete, noisy, nonlinear, and collinear, and they demand modern and powerful multivariate data analysis methods. In particular, the omics technologies have steered biology towards the adoption of orthogonal PLS. We also describe future prospects for the use of PLS algorithms in the omics fields.
