Journal of Chemometrics

Special Issue: Conferentia Chemometrica 2011, Sümeg, Hungary

March-April 2012

Volume 26, Issue 3-4

Pages i–iii, 41–133

Issue edited by: Karoly Heberger

  1. Issue Information

    1. Top of page
    2. Issue Information
    3. Meeting Reports
    4. Review
    5. Special Issue Articles
    1. Issue Information (pages i–iii)

      Version of Record online: 20 MAR 2012 | DOI: 10.1002/cem.2439

  2. Meeting Reports

  3. Review

    1. Review of sparse methods in regression and classification with application to chemometrics (pages 42–51)

      Peter Filzmoser, Moritz Gschwandtner and Valentin Todorov

      Version of Record online: 17 FEB 2012 | DOI: 10.1002/cem.1418

      Sparse statistical methods lead to parameter estimates that contain exact zeros. This has advantages especially in the analysis of high-dimensional data because the contribution of single variables—potential noise variables—is set to zero. We review recent proposals for sparse methods in the context of regression and classification and compare the performance of these methods with their non-sparse counterparts, using several data examples from chemometrics.
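As a rough illustration of how a sparse estimator produces exact zeros, the sketch below applies soft-thresholding, which is the lasso solution in the special case of an orthonormal design with objective (1/2)||y − Xb||² + λ||b||₁. It is not taken from the review; the coefficient values and the threshold `lam` are hypothetical.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; any |z| <= lam becomes exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def sparse_coefficients(ols_coefs, lam):
    """Lasso solution for an orthonormal design: soft-threshold each
    ordinary least-squares coefficient."""
    return [soft_threshold(b, lam) for b in ols_coefs]


# Hypothetical least-squares coefficients: two strong variables and
# three weak (potential noise) variables.
ols = [2.5, -1.8, 0.05, -0.1, 0.02]
sparse = sparse_coefficients(ols, lam=0.2)

# Count the coefficients that were set exactly to zero.
n_zero = sum(1 for b in sparse if b == 0.0)
print(sparse, n_zero)
```

The weak coefficients drop out entirely, while the strong ones are only shrunk, which is the behavior the abstract describes for noise variables.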

  4. Special Issue Articles

    1. Chemometric methods to classify stationary phases for achiral packed column supercritical fluid chromatography (pages 52–65)

      Caroline West and Eric Lesellier

      Version of Record online: 22 FEB 2012 | DOI: 10.1002/cem.1414

      This paper investigates classification models for packed columns used in supercritical fluid chromatography (SFC). Forty-eight commercially available columns with varied stationary phases are evaluated. The retention factors of 134 test compounds are used for hierarchical cluster analysis (HCA), principal component analysis (PCA) and quantitative structure-retention relationships (QSRRs). In addition, different coefficients were calculated between all pairs of columns to identify a ranking index that would provide meaningful information whenever two columns need to be compared on an objective basis.

    2. Evaluation of target factor analysis and net analyte signal as processes for classification purposes with application to benchmark data sets and extra virgin olive oil adulterant identification (pages 66–75)

      Kevin Higgins, John H. Kalivas and Erik Andries

      Version of Record online: 20 MAR 2012 | DOI: 10.1002/cem.2419

      Presented is a study of the simple, well-known processes of target factor analysis (TFA) and net analyte signal (NAS) for classification purposes. Classification by TFA and NAS is compared with classification by the Mahalanobis distance and k-nearest neighbors (KNN). The measures are evaluated with three spectroscopic data sets, including one on extra virgin olive oil adulterant identification, and a fourth, archeological data set. Results indicate that the simple TFA and NAS processes are useful but underutilized classification tools.

    3. Comparison of six multiclass classifiers by the use of different classification performance indicators (pages 76–84)

      Dániel Szöllősi, Dénes Lajos Dénes, Ferenc Firtha, Zoltán Kovács and András Fekete

      Version of Record online: 15 MAR 2012 | DOI: 10.1002/cem.2432

      Six different multiclass classifiers were used to classify soft drink samples with different sweeteners. The model comparison was performed according to the classification accuracy value, Cohen's kappa, area under the ROC curve, sum of ranking differences method, and a corrected accuracy value that was developed for the purpose. The similarities between the samples were taken into account. The best classification model was the “K-nearest neighbor” for the tested samples, and the corrected accuracy value was a useful classification performance indicator.
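Two of the indicators named above, classification accuracy and Cohen's kappa, can be computed directly from a confusion matrix. The sketch below is a minimal illustration under that assumption; the 3-class matrix is made up, and the authors' corrected accuracy value is not reproduced because the abstract does not define it.

```python
def accuracy(cm):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def cohens_kappa(cm):
    """Agreement between true and predicted labels, corrected for the
    agreement expected by chance from the row and column totals."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    p_obs = sum(cm[i][i] for i in range(n)) / total
    row_tot = [sum(cm[i]) for i in range(n)]
    col_tot = [sum(cm[i][j] for i in range(n)) for j in range(n)]
    p_exp = sum(row_tot[k] * col_tot[k] for k in range(n)) / total ** 2
    return (p_obs - p_exp) / (1 - p_exp)


# Hypothetical confusion matrix (rows: true class, columns: predicted class).
cm = [[8, 1, 1],
      [2, 7, 1],
      [0, 2, 8]]

acc = accuracy(cm)
kappa = cohens_kappa(cm)
print(acc, kappa)
```

Kappa is lower than accuracy here because a third of the agreement would be expected by chance alone with three balanced classes.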

    4. Novel algorithm to select basis functions in spline regression: applications in quantitative structure–activity relationship studies (pages 85–94)

      Jyotsna Bahl, Narayanan Ramamurthi and Sitarama B. Gunturi

      Version of Record online: 20 MAR 2012 | DOI: 10.1002/cem.2415

      We describe herein a novel variable selection method, the random replacement method (RRM), which combines the principles of replacement methods (RM) and genetic algorithms (GA). We applied RRM with multiple linear regression to two model data sets and showed that the method outperforms other variable selection approaches. We extended the application of RRM to the selection of basis functions in spline regression; this approach is named random function approximation (RFA). We compared the performance of RFA with that of multivariate adaptive regression splines (MARS) and genetic function approximation (GFA) and demonstrated the improved performance of the proposed method in terms of R² and Q².

    5. Multivariate evaluation of the correlation between retention data and molecular descriptors of antiepileptic hydantoin analogs (pages 95–107)

      Tatjana Djaković-Sekulić, Adam Smoliński, Nemanja Trišović and Gordana Ušćumlić

      Version of Record online: 29 FEB 2012 | DOI: 10.1002/cem.1421

      Molecular properties relevant to the pharmacokinetics of 24 newly synthesized hydantoin derivatives based on two well-known drugs, Nirvanol and Phenytoin, were studied. The properties under consideration represent selected structural features that affect the absorption, distribution, metabolism, excretion, and toxicity of the compounds. To find appropriate quantitative relationships between RM, W for the tested compounds and calculated molecular descriptors, principal component analysis, stepwise regression, partial least squares (PLS), and robust PLS were used.

    6. Approximation of physicochemical properties of homologs using recurrent and related non-recurrent relations (pages 108–116)

      Igor G. Zenkevich

      Version of Record online: 20 MAR 2012 | DOI: 10.1002/cem.1419

      Linear recurrent relations A(n) = aA(n−1) + b, recently introduced into chemistry, provide precise estimates of most physicochemical properties of homologs from the data for preceding members of the series. A number of modifications of these recurrences can be suggested. The equation A(n) = a[A(n−1) − A∞] + A∞, where A∞ denotes the limiting value of the physicochemical property as the number of carbon atoms in the molecule hypothetically tends to infinity, appears to be the most convenient.
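The link between the two forms of the recurrence can be checked numerically. For |a| < 1 the relation A(n) = aA(n−1) + b has the fixed point A∞ = b/(1 − a), and substituting b = A∞(1 − a) gives A(n) = a[A(n−1) − A∞] + A∞. The sketch below fits a and b by ordinary least squares on consecutive pairs, which is an assumption about the fitting step rather than the author's stated procedure. The series is synthetic, generated from known coefficients, and is not measured property data.

```python
def fit_recurrence(values):
    """Least-squares fit of A(n) = a*A(n-1) + b from consecutive pairs."""
    x = values[:-1]
    y = values[1:]
    m = len(x)
    mean_x = sum(x) / m
    mean_y = sum(y) / m
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def limiting_value(a, b):
    """Fixed point A_inf = b / (1 - a); meaningful only when |a| < 1."""
    return b / (1 - a)


# Synthetic homologous series built from known coefficients (illustrative only).
a_true, b_true = 0.9, 20.0
series = [50.0]
for _ in range(9):
    series.append(a_true * series[-1] + b_true)

a, b = fit_recurrence(series)
a_inf = limiting_value(a, b)

# Both forms of the recurrence predict the same next member of the series.
next_plain = a * series[-1] + b
next_centered = a * (series[-1] - a_inf) + a_inf
print(a, b, a_inf, next_plain, next_centered)
```

Written in the centered form, the recurrence has a directly interpretable parameter, the limiting value, in place of the intercept b.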

    7. How to identify cross correlations: a statistical test with time lag and its application on air-pollutant time series (pages 125–133)

      Gergely Tóth and Bertalan Balogh

      Version of Record online: 22 FEB 2012 | DOI: 10.1002/cem.2414

      We developed a statistical test based on ANOVA and resampling to detect cross correlations in time series; in addition, a plot of the p-values against the time lag allows cross correlations to be recognized intuitively. The test is applied to meteorological and air-pollutant data measured in Budapest. We easily detected the periodic and odd cross correlations on the graphs, estimated the length of the correlation effects, and found different correlations for yearlong and seasonal time domains.
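The idea of scanning p-values over time lags can be sketched with a simpler stand-in for the authors' test: a lagged Pearson correlation with a permutation p-value. This is not the ANOVA-based procedure from the paper. All function names are hypothetical, and the two series are synthetic, with a built-in delay of three steps.

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def lagged_pairs(x, y, lag):
    """Pair x[t] with y[t + lag] for a non-negative lag."""
    return x[:len(x) - lag], y[lag:]


def permutation_p_value(x, y, lag, n_perm=500, rng=None):
    """Share of shuffled series whose |r| reaches the observed |r|.
    Shuffling ignores autocorrelation, so this is only a rough guide
    for real time series."""
    if rng is None:
        rng = random.Random(0)
    xs, ys = lagged_pairs(x, y, lag)
    observed = abs(pearson(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(xs, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Synthetic data: y follows x with a delay of 3 steps plus noise.
gen = random.Random(42)
x = [gen.gauss(0, 1) for _ in range(200)]
y = [0.0] * 3 + [x[t - 3] for t in range(3, 200)]
y = [v + gen.gauss(0, 0.5) for v in y]

r_by_lag = {lag: pearson(*lagged_pairs(x, y, lag)) for lag in range(6)}
p_by_lag = {lag: permutation_p_value(x, y, lag, rng=random.Random(lag))
            for lag in range(6)}
best_lag = max(r_by_lag, key=lambda k: abs(r_by_lag[k]))
print(best_lag, p_by_lag)
```

Plotting `p_by_lag` against the lag would show a sharp dip at the built-in delay, which is the kind of visual cue the abstract refers to.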