Journal of Chemometrics

Special Issue: Herman Wold Medal Winners 2007–2009

November - December 2010

Volume 24, Issue 11-12

Pages 635–789

Issue edited by: Jenny Forshed, Mats Josefson, Torbjörn Lundstedt, Lennart Eriksson, Johan Gottfries

  1. Editorial

    1. Herman Wold medal winners 2007–2009 (page 635)

      Jenny Forshed, Johan Gottfries, Mats Josefson, Lennart Eriksson and Torbjörn Lundstedt

      Article first published online: 29 DEC 2010 | DOI: 10.1002/cem.1373

  2. Reviews

    1. The evolution of partial least squares models and related chemometric approaches in metabonomics and metabolic phenotyping (pages 636–649)

      Judith M. Fonville, Selena E. Richards, Richard H. Barton, Claire L. Boulange, Timothy M. D. Ebbels, Jeremy K. Nicholson, Elaine Holmes and Marc-Emmanuel Dumas

      Article first published online: 9 DEC 2010 | DOI: 10.1002/cem.1359

      Metabolic information from spectroscopic profiles can be extracted using multivariate statistical methods. In this review, various approaches to partial least squares modeling are discussed, with a particular focus on visualization, interpretation and biomarker recovery. The removal of orthogonal variation and other derivative methods are evaluated, and avenues for future research are highlighted. Methods are presented together with relevant applications to highlight the uses and limitations of available techniques.

  3. Special Issue Articles

    1. On the separation of classes: can the Fisher criterion be improved upon when classes have unequal variance–covariance structure? (pages 650–654)

      K. Magnus Åberg and Sven P. Jacobsson

      Article first published online: 8 OCT 2010 | DOI: 10.1002/cem.1326

      A value of class separation can be used to optimise data-analytical protocols for multivariate classification problems. The Fisher criterion assumes equal variance-covariance structure, an assumption which is often violated in real datasets. Here, we show that about fifty samples are required to benefit from using the unequal variance-covariance structures in two dimensions; higher dimensions require more samples. The Fisher criterion is robust and accurate for selection between pre-treatments compared to a newly derived Cooke criterion.
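As a rough illustration of the kind of class-separation value discussed in this abstract, the sketch below computes the classical two-class Fisher criterion along a chosen direction: the squared distance between the projected class means divided by the sum of the projected within-class variances. It is a minimal sketch only; the function names, the toy data and the choice of directions are assumptions made here, and the article's own formulation (and its Cooke criterion) may differ.

```python
from statistics import mean, variance

def project(points, w):
    """Project each point onto direction w (dot product)."""
    return [sum(x * wi for x, wi in zip(p, w)) for p in points]

def fisher_criterion(class_a, class_b, w):
    """Squared distance between projected class means divided by the
    sum of the projected within-class sample variances."""
    a = project(class_a, w)
    b = project(class_b, w)
    return (mean(a) - mean(b)) ** 2 / (variance(a) + variance(b))

# Hypothetical two-dimensional toy data, not taken from the article.
class_a = [(1.0, 2.0), (2.0, 1.0), (3.0, 3.0), (2.0, 2.0)]
class_b = [(6.0, 5.0), (7.0, 7.0), (8.0, 6.0), (7.0, 6.0)]

j_x = fisher_criterion(class_a, class_b, (1.0, 0.0))     # first axis only
j_diag = fisher_criterion(class_a, class_b, (1.0, 1.0))  # diagonal direction
print(j_x, j_diag)
```

A larger value indicates better separation along that direction. The ratio does not depend on the length of `w`, so only the direction matters when comparing candidates.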

    2. Parallel factor analysis of EEM of the fluorescence of carbon dots nanoparticles (pages 655–664)

      João M. M. Leitão, Helena Gonçalves and Joaquim C. G. Esteves da Silva

      Article first published online: 21 JUL 2010 | DOI: 10.1002/cem.1327

      The effect of experimental factors [pH and Hg(II)] on the fluorescence excitation emission matrices (EEMs) of nanosensor carbon dots (CDs) was analyzed by parallel factor (PARAFAC) analysis. A Parallel profiles with Linear Dependences (PARALIND) model with three components in the excitation–emission spectral modes and two components in the Hg(II) or pH mode gave results similar to those of PARAFAC, but PARALIND showed that the two different-sized CDs have similar chemical reactivity toward Hg(II) and pH.

    3. Ridge and PLS based rational function regression (pages 665–673)

      Veli-Matti Taavitsainen

      Article first published online: 28 OCT 2010 | DOI: 10.1002/cem.1328

      This study introduces both ridge and PLS regression based rational function regression techniques. The results of four different cases are compared to those obtained using other regression techniques. The results are mostly good, and the proposed method should be considered as a noteworthy alternative in nonlinear modeling.

    4. Comparison of multivariate methods for quantitative determination with transmission Raman spectroscopy in pharmaceutical formulations (pages 674–680)

      Magnus Fransson, Jonas Johansson, Anders Sparén and Olof Svensson

      Article first published online: 21 JUL 2010 | DOI: 10.1002/cem.1330

      The use of transmission Raman spectroscopy for quantitative assessment of pharmaceutical tablets with varying paracetamol content was investigated using different multivariate approaches. The different multivariate models gave calibration errors between 2.4 and 6.5%. Models based on only two concentration levels (4 samples) performed almost as well as models calibrated on 9 concentration levels (18 samples). All calibration models were easy to interpret because of the selective Raman spectra.

    5. Is your QSAR/QSPR descriptor real or trash? (pages 681–693)

      Rudolf Kiralj and Márcia M. C. Ferreira

      Article first published online: 21 OCT 2010 | DOI: 10.1002/cem.1331

      This work studies the sign change problem of regression models in QSAR, QSPR and related studies: the inconsistency in the signs of the correlation coefficients and regression coefficients of a descriptor in univariate and multivariate regressions, before and after the data split. A significant fraction of the 50 investigated regression models from the literature, with 227 descriptors, exhibits the sign change problem. New criteria to reduce or even eliminate this problem are proposed.

    6. Assessment of the water quality of a river catchment by chemometric expertise (pages 694–702)

      Stefan Tsakovski, Aleksander Astel and Vasil Simeonov

      Article first published online: 8 OCT 2010 | DOI: 10.1002/cem.1333

      The advantages of the SOM algorithm for visualization and classification of large data sets are used to select the chemical parameters that are most effective in water quality assessment. The selected variables were then used to perform a new classification that separates the objects of interest (sampling sites) into specific patterns. In the final stage of the study, a decision support system approach (Hasse diagram technique) was introduced to establish the relationship between the selected quality parameters and the sampling locations in order to distinguish between specific reasons for water pollution.

    7. Shedding new light on Hierarchical Principal Component Analysis (pages 703–709)

      Mohamed Hanafi, Achim Kohler and El Mostafa Qannari

      Article first published online: 31 AUG 2010 | DOI: 10.1002/cem.1334

      Hierarchical Principal Component Analysis (HPCA) is a multiblock method. The computation of the parameters of this method is based on an iterative procedure. However, very few properties are known regarding the convergence of this procedure. The paper discloses a monotonicity property of HPCA and exhibits an optimization criterion for which the HPCA algorithm provides a monotonically convergent solution. This makes it possible to shed new light on this method and to pinpoint its relation to existing methods (CCSWA, INDSCAL and PARAFAC).

    8. Interaction study with rats given two flame retardants: polybrominated diphenyl ethers (Bromkal 70-5 DE) and chlorinated paraffins (Cereclor 70L) (pages 710–718)

      Katrin Lundstedt-Enkel, Daniel Karlsson and Per Ola Darnerud

      Article first published online: 30 NOV 2010 | DOI: 10.1002/cem.1354

      Female Sprague-Dawley rats were given (by gavage) two flame retardants that are common in the environment: one a mixture of polybrominated diphenyl ethers and the other a mixture of chlorinated paraffins. A Doehlert design was used to select the concentrations. Several endpoints were measured, and we show that hepatic enzyme activities were induced while plasma thyroxine decreased. Notably, the exposure combination causing the most marked effects represented intermediate doses of both substances.

    9. Gaussian mixture models for the classification of high-dimensional vibrational spectroscopy data (pages 719–727)

      Julien Jacques, Charles Bouveyron, Stéphane Girard, Olivier Devos, Ludovic Duponchel and Cyril Ruckebusch

      Article first published online: 9 DEC 2010 | DOI: 10.1002/cem.1355

      Gaussian mixture models designed for the classification of high-dimensional data are applied to multi-class problems in spectroscopy. The results are compared to state-of-the-art methods and the abilities of the proposed approach are demonstrated.

    10. Variable selection in regression—a tutorial (pages 728–737)

      C. M. Andersen and R. Bro

      Article first published online: 10 DEC 2010 | DOI: 10.1002/cem.1360

      This paper provides a practical guide to variable selection in chemometrics with a focus on regression-based calibration models. Several approaches such as genetic algorithms, jack-knifing and forward selection are explained, along with how to choose between different kinds of variable selection methods. The emphasis in this paper is on how to use variable selection in practice and avoid the most common pitfalls.

    11. Screening design for computer experiments: metamodelling of a deterministic mathematical model of the mammalian circadian clock (pages 738–747)

      Kristin Tøndel, Arne B. Gjuvsland, Ingrid Måge and Harald Martens

      Article first published online: 29 DEC 2010 | DOI: 10.1002/cem.1363

      Computer experiments require experimental designs in order to probe the parameter space efficiently, and determining the relevant ranges within which to set up a factorial experimental design is a critical and difficult step. Here we show how a sparse initial range-finding design based on a reduced factorial design method—an optimised multi-level binary replacement (MBR) design—can reveal the region of relevant system behaviour of a nonlinear dynamic mathematical model. The same MBR design is subsequently used for predictive metamodelling.

    12. Multi-level binary replacement (MBR) design for computer experiments in high-dimensional nonlinear systems (pages 748–756)

      Harald Martens, Ingrid Måge, Kristin Tøndel, Julia Isaeva, Martin Høy and Solve Sæbø

      Article first published online: 29 DEC 2010 | DOI: 10.1002/cem.1366

      Computer experiments are increasingly used for assessment of high-dimensional nonlinear systems. Since multivariate soft metamodeling (“modelometrics”) requires many design factors to be tested in combination, at many levels each, a design method is presented that allows controlled reduction in design size, avoiding combinatorial explosion. The multi-level binary replacement (MBR) design method applies fractional factorial design to a binary representation of design factors, and is here explained, and shown to facilitate the study of a simulated biological system.
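The idea stated in this abstract, applying a fractional factorial design to a binary representation of multi-level factors, can be sketched roughly as follows. This is an illustrative toy only, not the authors' MBR procedure: the helper names, the two four-level factors, and the parity-based half-fraction (the 0/1 analogue of a defining relation such as D = ABC) are all assumptions made here.

```python
from itertools import product

def half_fraction(n_bits):
    """Two-level half-fraction: enumerate n_bits - 1 free bits and set the
    last bit to their parity (XOR), so only half of all bit patterns are run."""
    runs = []
    for free in product((0, 1), repeat=n_bits - 1):
        runs.append(free + (sum(free) % 2,))
    return runs

def decode(bits, bits_per_factor):
    """Group the bits per factor and read each group as a binary level index."""
    levels = []
    for i in range(0, len(bits), bits_per_factor):
        group = bits[i:i + bits_per_factor]
        levels.append(int("".join(map(str, group)), 2))
    return tuple(levels)

# Hypothetical setup: 2 factors with 4 levels each, so 2 bits per factor.
design = [decode(run, 2) for run in half_fraction(4)]
for run in design:
    print(run)  # (level index of factor 1, level index of factor 2)
```

In this toy case the full factorial would need 4 × 4 = 16 runs, whereas the half-fraction on the four bits yields 8 distinct runs in which every level of each factor still appears equally often.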

    13. Multivariate assessment of virtual screening experiments (pages 757–767)

      C. David Andersson, Brian Y. Chen and Anna Linusson

      Article first published online: 29 DEC 2010 | DOI: 10.1002/cem.1367

      Chemometric methods such as DoE, PCA and PLS were applied to simulated virtual screening (VS) experiments to find and compare suitable conditions for performing VS against six proteins selected from the DUD databases. The study revealed that the choice of scoring function has the greatest influence on VS outcome, and that other parameters have varying influence depending on the protein. We also found that substantial bias can be introduced in VS by the lack of variation of molecule properties in the databases used in the screening.

    14. A chemometrical approach to study interactions between ethynylestradiol and an AhR-agonist in stickleback (Gasterosteus aculeatus) (pages 768–778)

      Carin Andersson, Katrin Lundstedt-Enkel, Ioanna Katsiadaki, William V. Holt, Katrien J.W. Van Look and Jan Örberg

      Article first published online: 29 DEC 2010 | DOI: 10.1002/cem.1368

      We show that ethoxyresorufin-O-deethylase (EROD) activity in gill and liver as well as sperm quality in male three-spined stickleback (Gasterosteus aculeatus) is affected after co-exposure to 17α-ethynylestradiol (EE2) and the AhR-agonist β-naphthoflavone (BNF). If the capacity to induce gill EROD activity is a general property of estrogen-acting chemicals, our finding is important. We suggest that a gill EROD assay could be used as a sensitive biomarker for biomonitoring purposes.

    15. A graphical index of separation (GIOS) in multivariate modeling (pages 779–789)

      Lennart Eriksson and Svante Wold

      Article first published online: 29 DEC 2010 | DOI: 10.1002/cem.1372

      We introduce a new measure for the importance of predictor variables, X, for the separation of two groups (classes) of observations. The measure is a Graphical Index of Separation (GIOS), and is, for each predictor, determined from the distribution of all possible pairs of observations with one from each group. GIOS is quantitative, intuitively simple, and easy to interpret.