Volume 17, Issue 1
Research Article

O2‐PLS, a two‐block (X–Y) latent variable regression (LVR) method with an integral OSC filter

Johan Trygg

Corresponding Author

E-mail address: j.trygg@imb.uq.edu.au

Institute for Molecular Bioscience, University of Queensland, Australia

Smythe Group/Gehrmann Building Floor 7, Institute for Molecular Bioscience, University of Queensland, Brisbane QLD 4072, Australia.
Svante Wold

Research Group for Chemometrics, Institute of Chemistry, Umeå University, Umeå, Sweden

First published: 12 February 2003
Citations: 193

Abstract

The O2‐PLS method is derived from the basic partial least squares projections to latent structures (PLS) prediction approach. The importance of the covariation matrix (YᵀX) is pointed out in relation to both the prediction model and the structured noise in both X and Y. Structured noise in X (or Y) is defined as the systematic variation of X (or Y) not linearly correlated with Y (or X). Examples in spectroscopy include baseline, drift and scatter effects. If structured noise is present in X, the existing latent variable regression (LVR) methods, e.g. PLS, will have weakened score–loading correspondence beyond the first component. This negatively affects the interpretation of model parameters such as scores and loadings. The O2‐PLS method models and predicts both X and Y and has an integral orthogonal signal correction (OSC) filter that separates the structured noise in X and Y from their joint X–Y covariation used in the prediction model. This leads to a minimal number of predictive components with full score–loading correspondence and also an opportunity to interpret the structured noise. In both a real and a simulated example, O2‐PLS and PLS gave very similar predictions of Y. However, the interpretation of the prediction models was clearly improved with O2‐PLS, because structured noise was present. In the NIR example, O2‐PLS revealed a strong water peak and baseline offset in the structured noise components. In the simulated example the O2‐PLS plot of observed versus predicted Y‐scores (u vs û) showed good predictions. The corresponding loading vectors provided good interpretation of the covarying analytes in X and Y. Copyright © 2003 John Wiley & Sons, Ltd.
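The separation described in the abstract can be illustrated with a small numerical sketch. It is not the authors' published algorithm, which is only available in the full text. It is a minimal one-sided version that filters X only, assuming the commonly described scheme: predictive weights W come from a singular value decomposition of the covariation matrix YᵀX, and a Y-orthogonal component is then extracted from the part of X that W does not span. The function name `remove_y_orthogonal`, the simulated data (a Gaussian peak that covaries with y plus a baseline offset driven by an unrelated score) and the choice of a single y column with one predictive component are all assumptions made for illustration.

```python
import numpy as np


def remove_y_orthogonal(X, Y, n_pred=1, n_orth=1):
    """Illustrative filter: strip n_orth Y-orthogonal components from X.

    Hypothetical helper, not taken from the article. X is (n, k) and
    Y is (n, m); both are assumed to be column-centred already.
    """
    X = X.copy()
    # Predictive weights W: leading right singular vectors of Y^T X.
    _, _, Wt = np.linalg.svd(Y.T @ X, full_matrices=False)
    W = Wt[:n_pred].T
    T_orth, P_orth = [], []
    for _ in range(n_orth):
        T = X @ W            # predictive scores
        E = X - T @ W.T      # part of X not spanned by W
        # Weight for the structured noise: dominant direction of E^T T,
        # which is orthogonal to W by construction.
        U, _, _ = np.linalg.svd(E.T @ T, full_matrices=False)
        w_o = U[:, :1]
        t_o = X @ w_o                      # Y-orthogonal scores
        p_o = X.T @ t_o / (t_o.T @ t_o)    # corresponding loadings
        X = X - t_o @ p_o.T                # deflate X
        T_orth.append(t_o)
        P_orth.append(p_o)
    return X, np.hstack(T_orth), np.hstack(P_orth)


# Simulated data (assumed for illustration): one peak that follows y,
# plus a baseline offset whose size varies independently of y.
rng = np.random.default_rng(0)
n, k = 50, 20
y = rng.normal(size=(n, 1))
drift = rng.normal(size=(n, 1))
peak = np.exp(-0.5 * ((np.arange(k) - 8) / 2.0) ** 2)[None, :]
baseline = np.ones((1, k))
X = y @ peak + 3.0 * drift @ baseline + 0.01 * rng.normal(size=(n, k))

Xc = X - X.mean(axis=0)
Yc = y - y.mean(axis=0)
X_filt, T_orth, P_orth = remove_y_orthogonal(Xc, Yc)

removed = 1 - np.linalg.norm(X_filt) ** 2 / np.linalg.norm(Xc) ** 2
print("max |Y^T t_orth|:", float(np.abs(Yc.T @ T_orth).max()))
print("share of X variation removed:", float(removed))
```

In this setting the extracted scores are orthogonal to y up to rounding error, and the loadings in `P_orth` can be inspected to interpret the removed variation, which is the kind of reading of structured noise components the abstract describes. A full two-block treatment would apply the same idea to Y as well.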

