Journal of Chemometrics

January 2009

Volume 23, Issue 1

Pages 1–65

  1. Research Articles

    1. Determination of rank by median absolute deviation (DRMAD): a simple method for determining the number of principal factors responsible for a data matrix (pages 1–6)

      Edmund R. Malinowski

      Article first published online: 8 SEP 2008 | DOI: 10.1002/cem.1182

      DRMAD is a statistical method designed to determine the rank of a data matrix. It applies the MAD statistic to the residual standard deviation obtained from principal component analysis. The method does not require strict adherence to normal distributions of experimental uncertainties. The computations are direct, simple and fast. An algorithm, written in MATLAB format, is presented.
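
      As a rough illustration of the MAD statistic mentioned in this abstract, the sketch below computes the median absolute deviation of a list of values. It is not the DRMAD algorithm from the article: the function name, the example residual standard deviations, and the choice to leave the MAD unscaled are all assumptions made for illustration.

```python
import numpy as np


def median_absolute_deviation(values):
    """Return the raw (unscaled) MAD: median of |x - median(x)|."""
    x = np.asarray(values, dtype=float)
    return float(np.median(np.abs(x - np.median(x))))


# Hypothetical residual standard deviations, one per trial number of factors.
# These numbers are invented for illustration and do not come from the article.
rsd = [5.2, 1.9, 0.11, 0.10, 0.09, 0.10, 0.11]
mad = median_absolute_deviation(rsd)
print(f"MAD of the example residual standard deviations: {mad:.4f}")
```

      Because the MAD is built from medians, the two large leading values in the example barely influence it, which is consistent with the abstract's remark that strict normality of the uncertainties is not required.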

    2. Powered partial least squares discriminant analysis (pages 7–18)

      Kristian Hovde Liland and Ulf Geir Indahl

      Article first published online: 19 SEP 2008 | DOI: 10.1002/cem.1186

      The powered partial least squares discriminant analysis (PPLS-DA) extends the powered PLS methodology to handle classification problems. For several real datasets the latent variables of models obtained by PPLS-DA are found to be simpler and/or fewer compared to their ordinary PLS counterparts.

    3. Use of cluster separation indices and the influence of outliers: application of two new separation indices, the modified silhouette index and the overlap coefficient to simulated data and mouse urine metabolomic profiles (pages 19–31)

      Sarah J. Dixon, Nina Heinrich, Maria Holmboe, Michele L. Schaefer, Randall R. Reed, Jose Trevejo and Richard G. Brereton

      Article first published online: 8 SEP 2008 | DOI: 10.1002/cem.1189

      Four indices that aim to quantify how well classes are separated are compared. Four sets of simulations, consisting of data with varying degrees of overlap and differing in the nature of outliers, and three experimental datasets, consisting of the GCMS of extracts from mouse urine used to study the effect of stress, diet, and age, illustrate the methods. The paper discusses the robustness of each index to outliers and its suitability for assessing class separation.

    4. Sorting variables by using informative vectors as a strategy for feature selection in multivariate regression (pages 32–48)

      Reinaldo F. Teófilo, João Paulo A. Martins and Márcia M. C. Ferreira

      Article first published online: 29 OCT 2008 | DOI: 10.1002/cem.1192

      A new variable selection procedure to enhance prediction of multivariate calibration models is presented. The methodology sorts the variables according to an informative vector, systematically investigates PLS regression models and finds the most relevant set of variables by comparing the cross-validation parameters of the models obtained. Seven informative vectors and their combinations were successfully tested on data sets from different applications.

    5. X-tended target projection (XTP)—comparison with orthogonal partial least squares (OPLS) and PLS post-processing by similarity transformation (PLS + ST) (pages 49–55)

      Olav M. Kvalheim, Tarja Rajalahti and Reidar Arneberg

      Article first published online: 24 SEP 2008 | DOI: 10.1002/cem.1193

      For the same number of latent variables, target projection (TP), orthogonal partial least squares (OPLS) and PLS post-processing by similarity transformation (PLS+ST) are shown to provide score and loading vectors for the predictive component that are equivalent except for a scaling factor. The TP approach can be extended to embrace systematic variation in X unrelated to the response. The method is called X-tended target projection (XTP) and explains more of the orthogonal variation than OPLS and PLS+ST.

    6. Statistical methods for evaluating the linearity in assay validation (pages 56–63)

      Eric Hsieh, Chin-fu Hsiao and Jen-pei Liu

      Article first published online: 6 OCT 2008 | DOI: 10.1002/cem.1194

      The sum of squares of deviations from linearity (SSDL) is proposed as an alternative metric for evaluation of linearity. Based on SSDL, the method of generalized pivotal quantities (GPQ) is applied to the inference for evaluation of linearity. The simulation results demonstrate that the proposed GPQ method not only adequately controls size at the nominal level but also provides sufficient power. A numeric example illustrates the proposed methods.

  2. Short Communications

    1. Short note: Estimating the number of pure chemical components in a mixture by maximum likelihood (pages 64–65)

      Edmund R. Malinowski

      Article first published online: 30 OCT 2008 | DOI: 10.1002/cem.1191

      Computations attributed to the Malinowski F-test are incorrect because they were based on determining the equality of secondary eigenvalues instead of the equality of reduced eigenvalues, as prescribed in the original work of Malinowski. When the correct expression was employed, the results were found to be in complete agreement with expectations. This does not, in any way, affect the validity of the proposed MLE methodology, which appears to be a valuable tool in the chemometric arsenal.
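
      To make the distinction between raw and reduced eigenvalues concrete, the sketch below computes reduced eigenvalues for an r x c data matrix using the commonly cited definition REV_j = lambda_j / ((r - j + 1)(c - j + 1)). The function name is hypothetical, the eigenvalues are taken as squared singular values of the uncentred matrix by assumption, and the F-test itself is not reproduced.

```python
import numpy as np


def reduced_eigenvalues(data):
    """Divide each eigenvalue by (r - j + 1)(c - j + 1), j = 1, 2, ..."""
    d = np.asarray(data, dtype=float)
    r, c = d.shape
    # Eigenvalues of D^T D are the squared singular values of D
    # (assumption: no centring or scaling is applied first).
    singular_values = np.linalg.svd(d, compute_uv=False)
    eigenvalues = singular_values ** 2
    j = np.arange(1, eigenvalues.size + 1)
    return eigenvalues / ((r - j + 1) * (c - j + 1))


# A 3 x 3 identity matrix has three eigenvalues equal to 1,
# so only the normalising factors differ between the entries.
print(reduced_eigenvalues(np.eye(3)))
```

      In this toy case the raw eigenvalues are all equal while the reduced eigenvalues are not, which shows why testing equality of one set is not interchangeable with testing equality of the other.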