A diversity of multiresponse optimization methods has been introduced in the literature; however, their performance has not been thoroughly explored, and only a classical desirability-based criterion has been in common use. To help practitioners select an effective criterion for solving multiresponse optimization problems developed under the response surface methodology framework, and thus find compromise solutions that are technically and economically more favorable, the performance of several easy-to-use criteria is evaluated and compared with that of a theoretically sound method. Four case studies with different numbers and types of responses are considered. Less-sophisticated criteria were able to generate solutions similar to those of sophisticated methods, even when the objective was to depict the Pareto frontier of problems with conflicting responses. Two easy-to-use criteria that require less-subjective information from the user yielded solutions similar to those of a classical desirability-based criterion. The impact of the range and increment of the preference parameters on the optimal solutions was also evaluated.

Quantitative structure-activity and structure-property relationships of complex polycyclic benzenoid networks require expressions for the topological properties of these networks. Structure-based topological indices of these networks enable prediction of the chemical properties and bioactivities of these compounds through quantitative structure-activity and structure-property relationship methods. We consider a number of infinite convex benzenoid networks, including the polyacene, parallelogram, trapezium, triangular, bitrapezium, and circumcorone series of benzenoid networks. For all such networks, we compute analytical expressions for both vertex-degree-based and edge-based topological indices such as the edge-Wiener, vertex-edge Wiener, vertex-Szeged, edge-Szeged, edge-vertex Szeged, total-Szeged, Padmakar-Ivan, Schultz, Gutman, Randić, generalized Randić, reciprocal Randić, reduced reciprocal Randić, first Zagreb, second Zagreb, reduced second Zagreb, hyper Zagreb, augmented Zagreb, atom-bond connectivity, harmonic, sum-connectivity, and geometric-arithmetic indices. In addition, we obtain expressions for these topological indices for 3 types of parallelogram-like polycyclic benzenoid networks.
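Several of the degree-based indices listed above have simple closed forms over a graph's edge set; as a minimal illustration (not taken from the paper, which derives analytical expressions for whole network families), the following sketch computes the Randić and first/second Zagreb indices for an arbitrary edge list:

```python
def degree_based_indices(edges):
    """Compute a few degree-based topological indices for a simple graph
    given as a list of (u, v) edges: Randić = sum over edges of
    (d_u * d_v)^(-1/2), first Zagreb = sum of d_v^2 over vertices,
    second Zagreb = sum of d_u * d_v over edges."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    randic = sum((deg[u] * deg[v]) ** -0.5 for u, v in edges)
    first_zagreb = sum(d * d for d in deg.values())
    second_zagreb = sum(deg[u] * deg[v] for u, v in edges)
    return {"randic": randic, "M1": first_zagreb, "M2": second_zagreb}

# Benzene ring (C6 cycle): every vertex has degree 2, so the Randić
# index is 6 * (2*2)^(-1/2) = 3.
benzene = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
idx = degree_based_indices(benzene)
```

The analytical expressions in the paper amount to evaluating such sums in closed form for each benzenoid family.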

We present the response-oriented sequential alternation (ROSA) method for multiblock data analysis. ROSA is a novel and transparent multiblock extension of partial least squares regression (PLSR). According to a "winner takes all" approach, each component of the model is calculated from the block of predictors that most reduces the current residual error. The suggested algorithm is computationally fast compared with other multiblock methods because orthogonal scores and loading weights are calculated without deflation of the predictor blocks. It can therefore work effectively even when a large number of blocks is included. The ROSA method is invariant to block scaling and ordering. The ROSA model has the same attributes (vectors of scores, loadings, and loading weights) as PLSR and is identical to a PLSR model in the case of a single block of predictors.
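The "winner takes all" component selection can be sketched as follows. This is a deliberately simplified illustration under stated assumptions (no loading-weight bookkeeping for prediction, a single response), not the published ROSA algorithm:

```python
import numpy as np

def rosa_sketch(blocks, y, n_comp):
    """Minimal sketch of ROSA-style component selection: for each
    component, every block proposes a PLS-like candidate score, and the
    block whose (orthogonalized) score most reduces the residual wins.
    Hypothetical simplification of the method described in the abstract."""
    r = y.astype(float) - y.mean()
    T = []          # orthonormal score vectors
    winners = []    # index of the winning block per component
    for _ in range(n_comp):
        best = None
        for b, X in enumerate(blocks):
            w = X.T @ r                      # candidate loading weight
            nw = np.linalg.norm(w)
            if nw == 0.0:
                continue
            t = X @ (w / nw)                 # candidate score
            for t_prev in T:                 # orthogonalize (no block deflation)
                t = t - (t_prev @ t) * t_prev
            nt = np.linalg.norm(t)
            if nt == 0.0:
                continue
            t = t / nt
            res = np.linalg.norm(r - (t @ r) * t)
            if best is None or res < best[0]:
                best = (res, b, t)
        _, b, t = best
        r = r - (t @ r) * t                  # deflate only the residual
        T.append(t)
        winners.append(b)
    return winners, np.column_stack(T)
```

Because only the residual is deflated, the cost per component grows with the number of blocks but not with repeated deflation of every predictor block.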

Penalized regression with a combination of sparseness and an interframe penalty is explored for image deconvolution in wide-field single-molecule fluorescence microscopy. The aim is to reconstruct superresolution images, which can be achieved by averaging the positions and intensities of individual fluorophores obtained from the analysis of successive frames. Sparsity of the fluorophore distribution in the spatial domain is obtained with an *L*_{0}-norm penalty on estimated fluorophore intensities, effectively constraining the number of fluorophores per frame. Simultaneously, continuity of the fluorophore localizations over time is obtained by penalizing the total number of pixel status changes between successive frames. We implemented the interframe penalty in a sparse deconvolution algorithm (sparse image deconvolution and reconstruction) for improved imaging of densely labeled biological samples. For simulated and real biological data, we show that more accurate estimates of the final superresolution images of cellular structures can be obtained.
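The two penalty ingredients can be illustrated in isolation. The sketch below shows a hard-thresholding step (the proximal operator of an *L*_{0}-type penalty, assuming the number of active fluorophores per frame is given) and the interframe term that counts pixel on/off status changes between successive frames; it is a conceptual sketch, not the authors' full deconvolution algorithm:

```python
import numpy as np

def hard_threshold(frame, k):
    """L0-style sparsity step: keep the k largest intensities in a
    frame and zero everything else (k, the per-frame fluorophore
    budget, is an assumed input here)."""
    flat = frame.ravel()
    out = np.zeros_like(flat)
    keep = np.argsort(flat)[-k:]
    out[keep] = flat[keep]
    return out.reshape(frame.shape)

def interframe_changes(frame_a, frame_b):
    """Interframe penalty term: number of pixels whose on/off status
    differs between two successive frames."""
    return int(np.count_nonzero((frame_a != 0) != (frame_b != 0)))
```

In a full algorithm, both terms would enter one penalized objective so that estimated fluorophores are simultaneously sparse per frame and stable across frames.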

Osteoarthritis (OA) is an insidious joint disease that gradually leads to cartilage loss and morphological impairment of other joint tissues; early diagnosis and timely therapeutic intervention are therefore important. Although a few diagnostic techniques are used in clinics, these methods have various drawbacks. Infrared spectroscopy has emerged as an important analytical technique with wide applications in a variety of areas, including clinical diagnosis. Research has shown that the presence of OA is associated with biochemical changes that are presumed to be reflected in serum or joint fluid. Hence, OA may be detected provided that serum or joint fluid is measured by infrared spectroscopy and appropriate data analysis methods are used to extract the diagnostic information from the infrared spectra. In this work, 5 discrimination and classification methods ([1] principal component analysis coupled with linear discriminant analysis, [2] principal component analysis coupled with multiple logistic regression, [3] partial least squares discriminant analysis, [4] regularized linear discriminant analysis, and [5] support vector machines) were used to build OA diagnostic models based on mid-infrared spectra of serum and joint fluid. Useful diagnostic models were developed, indicating that infrared spectroscopy coupled with multivariate data analysis is very promising as a simple and accurate approach to OA diagnosis. The results also showed that the models built with the 5 methods differed, as did their predictive performances. The choice of data analysis method should therefore be considered carefully during model development.

The feasibility of direct (ie, without sample preparation) quantitative analysis of total hydrocarbons and water in oil-contaminated soils using mid-infrared spectroscopy and an attenuated total reflection (ATR) probe has been investigated. Spectral characteristics of unpolluted and oil-contaminated soils composed of sand, clay, dolomite, and humus have been studied over the full mid-infrared range (4000-400 cm^{−1}). Spectra of 25 typical soil samples containing varying levels of oil and water have been analyzed using a chalcogenide infrared fiber–based probe with a ZrO_{2} crystal as an ATR element. The spectral data were used to build calibration models for the analysis of hydrocarbon contamination as well as moisture content of soil samples. The low quality of ATR spectra of drier samples and the variable spectral intensity inherent in the ATR measurement of solids have been overcome by suitable data processing. Further improvement of the model performance has been achieved using variable selection based on a modified genetic algorithm. Our proposed method allows the determination of oil and moisture content in soils with accuracies of 1.1% and 0.6%, respectively, which is sufficient for a number of practical applications. The reported results may be used to develop portable devices for measuring petroleum and water content of soils.

Multivariate curve resolution (MCR) of absorption spectra is now a ubiquitously used tool. However, MCR methods that use the ordinary least squares (OLS) approach assume that the measurement uncertainties are unbiased and homoscedastic. This is not true for absorption measurements, in which uncertainty variance and bias both increase as the true absorbance increases. The bias produces a well-known flattening/saturation of the peaks at high optical densities, which makes the data nonlinear and unsuitable for OLS-based MCR analysis. This problem can be reduced by using weighted least squares (WLS).

In the present paper, the ability of WLS-based MCR to handle simulated and real datasets with realistic optical noise and flattening was assessed. Three weighting schemes were tested: OLS (unity weights), weights based on the maximum likelihood principle (MLP) and the physics of absorption measurement, and weights based on empirical cutoff (zero weights for saturated data points). The abilities of MCR to recover the true profiles and to evaluate rotational ambiguity of the solutions were compared for the 3 weighting schemes. MLP- and cutoff-based WLS-MCR produced better resolution of flattened data than OLS, but the success of the extension to strongly flattened spectra depended on data structure. MLP-based MCR was general and stable, while cutoff-based MCR was more sensitive to the data but could recover unbiased profiles. Generally, the use of WLS can expand MCR functionality to the analysis of flattened spectra.

The specifics of finding WLS bilinear solutions and approaches to migrate factor-based MCR methods from OLS to WLS are also discussed.
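The weighting idea can be made concrete with one half-step of an alternating least squares iteration. The sketch below updates concentrations C given spectra S so that D ≈ C Sᵀ, with a per-element weight matrix W; unity weights reproduce OLS, while zero weights implement the cutoff scheme for saturated data points. This is a sketch of the weighting mechanism only, not a full MCR-ALS implementation:

```python
import numpy as np

def wls_concentration_step(D, S, W):
    """Weighted least-squares update of C in D ~ C @ S.T.
    D: (n_samples, n_channels) data; S: (n_channels, n_comp) spectra;
    W: (n_samples, n_channels) nonnegative weights (1 = OLS,
    0 = ignore a saturated data point)."""
    n_samples = D.shape[0]
    n_comp = S.shape[1]
    C = np.zeros((n_samples, n_comp))
    for i in range(n_samples):
        Wi = np.diag(W[i])                 # row-specific diagonal weights
        A = S.T @ Wi @ S
        b = S.T @ Wi @ D[i]
        C[i] = np.linalg.solve(A, b)       # weighted normal equations
    return C
```

A maximum-likelihood (MLP-style) scheme would fill W with inverse uncertainty variances derived from the physics of the absorption measurement instead of 0/1 values.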

Borgen plots are geometric constructions that represent the set of all nonnegative factorizations of spectral data matrices for three-component systems. The classical construction by Borgen and Kowalski (Anal. Chim. Acta 174, 1-26 (1985)) is limited to nonnegative data and results in nonnegative factorizations. The new approach of generalized Borgen plots allows factors with small negative entries. This makes it possible to construct Borgen plots for perturbed or noisy spectral data and stabilizes the computation. The first part of this paper introduced the mathematical theory of generalized Borgen plots. This second part presents the line-moving algorithm for the construction of generalized Borgen plots. The algorithm is justified, and its implementation in the *FACPACK* software is validated.

Maintaining multivariate calibrations involves keeping models developed on an instrument applicable to predicting new samples over time. Sometimes, a primary instrument model is needed to predict samples measured on secondary instruments. This situation is referred to as calibration transfer. Sometimes, a primary instrument model is needed to predict samples that have acquired new spectral features (chemical, physical, and environmental influences) over time. This situation is referred to as calibration maintenance. Calibration transfer and maintenance problems have a long history and are well studied in chemometrics and spectroscopy. In disciplines outside of chemometrics, particularly computer vision, calibration transfer and maintenance problems are more recent phenomena, and these problems often go under the umbrella term *domain adaptation*. Over the past decade, domain adaptation has demonstrated significant successes in various applications such as visual object recognition. Since domain adaptation already constitutes a large area of research in computer vision and machine learning, we narrow our scope and report on penalty-based eigendecompositions, a class of domain adaptation methods that has its motivational roots in linear discriminant analysis. We compare these approaches against chemometrics-based approaches using several benchmark chemometrics data sets.
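As a point of reference for the chemometrics-based approaches mentioned above, classical direct standardization (DS) is one of the simplest calibration transfer methods: it estimates a linear transfer matrix mapping secondary-instrument spectra onto primary-instrument spectra. The sketch below is this textbook baseline, not the penalty-based eigendecomposition methods the abstract reports on:

```python
import numpy as np

def direct_standardization(X_primary, X_secondary):
    """Classical direct standardization: find F minimizing
    ||X_secondary @ F - X_primary||_F, so that spectra measured on the
    secondary instrument can be transformed before applying the
    primary-instrument calibration model. X_primary and X_secondary are
    (n_transfer_samples, n_channels) matrices of the same transfer
    samples measured on both instruments."""
    F, *_ = np.linalg.lstsq(X_secondary, X_primary, rcond=None)
    return F
```

In practice the transfer set is small, so DS is often regularized or restricted to local channel windows (piecewise direct standardization); the domain adaptation methods discussed above tackle the same mismatch without requiring paired transfer samples.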

No abstract is available for this article.

The relationship between linear independence, orthogonality, and uncorrelatedness of vectors is described.
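A small numerical illustration of how these three properties come apart (my own example, not taken from the article): linearly independent vectors need not be orthogonal or uncorrelated, and orthogonal vectors coincide with uncorrelated ones only after mean-centering.

```python
import numpy as np

# Linearly independent vectors that are neither orthogonal nor uncorrelated.
a = np.array([1.0, 2.0, 3.0])
b = np.array([1.0, 1.0, 0.0])
dot_ab = a @ b                       # nonzero -> not orthogonal
ac, bc = a - a.mean(), b - b.mean()
corr_ab = (ac @ bc) / (np.linalg.norm(ac) * np.linalg.norm(bc))  # nonzero -> correlated

# Orthogonal vectors that are perfectly (negatively) correlated after
# centering: orthogonality only implies uncorrelatedness for
# zero-mean vectors.
u = np.array([1.0, 0.0])
v = np.array([0.0, 1.0])
dot_uv = u @ v                       # 0 -> orthogonal
uc, vc = u - u.mean(), v - v.mean()
corr_uv = (uc @ vc) / (np.linalg.norm(uc) * np.linalg.norm(vc))  # -1
```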

Bootstrapping can be used to estimate parameter variances; it is straightforward to implement but computationally demanding compared with other methods for parameter error estimation. It is not bound to restrictions such as assumptions about the distribution of measurement errors, and because the probability densities of the parameters may be asymmetric, the parameter estimation errors obtained by bootstrapping are likely to be more accurate.

In this work, the feasibility of a bootstrap-based method for optimal experimental design was evaluated for the Peleg model. The optimal design was first performed based on the Cramér-Rao lower bound as a benchmark; afterwards, the optimal design was calculated based on the bootstrap method.

It is demonstrated that a bootstrap-based optimal design of experiments gives results comparable with those of the Cramér-Rao lower bound optimal designs, albeit with slightly different measurement points in time. When the parameter errors obtained from the two optimal experimental designs are compared, they deviate between the 2 methods by 1.5% on average.

Bootstrapping can be used for problems that cannot be solved using the Cramér-Rao lower bound because of necessary but invalid assumptions. However, the benefits of the bootstrap method come at the cost of a significant increase in computational effort: under similar conditions, the computation time for a bootstrap-based optimal design was 25 minutes, compared with 5 seconds for the Cramér-Rao lower bound method. As computers become faster, this increase in computational demand will probably become less relevant.
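The core of a bootstrap error estimate for the Peleg model, M(t) = M₀ + t/(k₁ + k₂t), can be sketched as follows. This is a minimal residual-bootstrap illustration under simplifying assumptions (M₀ taken as the first observation, k₁ and k₂ fitted via the standard linearized form), not the authors' optimal design procedure:

```python
import numpy as np

def peleg(t, m0, k1, k2):
    """Peleg sorption model: M(t) = M0 + t / (k1 + k2 * t)."""
    return m0 + t / (k1 + k2 * t)

def fit_peleg(t, m):
    """Fit k1, k2 via the linearized form t / (M - M0) = k1 + k2 * t,
    taking M0 as the first observation (a simplification for this sketch)."""
    m0 = m[0]
    mask = t > 0
    y = t[mask] / (m[mask] - m0)
    k2, k1 = np.polyfit(t[mask], y, 1)
    return m0, k1, k2

def bootstrap_peleg_sd(t, m, n_boot=200, seed=0):
    """Residual bootstrap: resample residuals with replacement, refit,
    and return the empirical standard deviations of (M0, k1, k2)."""
    rng = np.random.default_rng(seed)
    m0, k1, k2 = fit_peleg(t, m)
    fitted = peleg(t, m0, k1, k2)
    resid = m - fitted
    boot = [fit_peleg(t, fitted + rng.choice(resid, size=resid.size))
            for _ in range(n_boot)]
    return np.std(np.array(boot), axis=0)
```

An optimal-design loop would wrap such an estimator, choosing the measurement times t that minimize the bootstrapped parameter uncertainties, which is where the reported computational cost arises.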

Application of chemometric methods to mass spectrometry imaging (MSI) data faces a bottleneck concerning the vast size of the experimental data sets. This drawback is critical for high-resolution mass spectrometry data, which provide several thousand points for each considered pixel. In this work, different approaches have been tested to reduce the size of the analyzed data with the aim of allowing the subsequent application of typical chemometric methods for image analysis. The standard approach for MSI data compression consists of binning the mass spectra of each pixel to reduce the number of *m*/*z* values. In this work, a method is proposed to handle the huge size of MSI data based on adapting a liquid chromatography-mass spectrometry compression method that detects regions of interest. Results showed that both approaches achieve high compression rates, although the proposed regions of interest–based method attains this reduction with lower computational requirements while retaining the full spectral information. For instance, typical compression rates exceeded 90% without loss of information in images and spectra.
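The standard binning baseline mentioned above can be sketched in a few lines: intensities of each pixel spectrum are summed into fixed-width *m*/*z* bins, trading spectral resolution for size. This illustrates the baseline only, not the proposed regions-of-interest method:

```python
import numpy as np

def bin_spectra(mz, X, bin_width):
    """Compress MSI spectra by m/z binning.
    mz: (n_channels,) m/z axis; X: (n_pixels, n_channels) intensities.
    Returns a (n_pixels, n_bins) matrix with intensity summed per bin,
    so total ion count per pixel is preserved."""
    edges = np.arange(mz.min(), mz.max() + bin_width, bin_width)
    idx = np.clip(np.digitize(mz, edges) - 1, 0, edges.size - 1)
    binned = np.zeros((X.shape[0], edges.size))
    for j, b in enumerate(idx):
        binned[:, b] += X[:, j]
    return binned
```

A regions-of-interest scheme would instead keep only the *m*/*z* windows where real signal is detected across pixels, which is why it can compress heavily without merging distinct peaks into one bin.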

The ultimate goal of projection methods is to find "interesting" projections in a low-dimensional subspace that uncover the natural structure of the data. The aim of this work is to compare the ability of projection pursuit (PP) and principal component analysis (PCA) for dimension reduction. To this end, the scores of PP and PCA, for different numbers of factors, were used as inputs to a radial basis function (RBF) neural network. The RBF neural network was used as a nonlinear regression method in a quantitative structure-retention relationship study of 209 polychlorinated biphenyls (PCBs). The dependent variable was the high-resolution gas chromatographic relative retention time of the PCBs on 18 different stationary phases, and the independent variables were solvatochromic solute descriptors. The results demonstrate that the dimension reduction ability of PP is better than that of PCA for both the single-column and full-column retention models.

Surface fitting is one of the well-known retrospective methods for bias field estimation in magnetic resonance imaging (MRI). The bias field in MRI images is primarily caused by radio frequency–coil nonuniformity, an improper image acquisition process, patient movement, and so on. Because of its slowly varying nature, the bias field can be characterized by any smooth, slowly varying function. In this paper, we present a comparative study between polynomial and Gaussian surface fitting methods. In particular, we used both second- and third-order polynomial functions to estimate the bias field. We approximate the bias field in two ways: in the first method, surfaces are fitted to the anatomical tissue regions individually and then fused to estimate the bias field; in the second, the same is done over the entire image region. We tested the methods on three volumes of simulated and one volume of real-patient MRI brain images and validated the results by both qualitative and quantitative analyses, the latter reported in terms of standard deviation and coefficient of joint variation. The analysis of the simulation results shows that the Gaussian surface fitting method yields better results in both cases, whether the surface is fitted to the entire image or to individual tissue regions.
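The polynomial variant of such surface fitting reduces to an ordinary least-squares problem on the pixel coordinates. The sketch below fits a second-order polynomial surface, optionally restricted to a tissue mask, and evaluates it over the whole image as the bias-field estimate; it is a minimal sketch of the idea, not the paper's full pipeline (no region fusion, no Gaussian variant):

```python
import numpy as np

def fit_poly2_bias(image, mask=None):
    """Estimate a smooth bias field by least-squares fitting
    b(x, y) = c0 + c1*x + c2*y + c3*x*y + c4*x^2 + c5*y^2
    to the image intensities (within an optional boolean mask),
    then evaluating the surface over the full image grid."""
    h, w = image.shape
    yy, xx = np.mgrid[0:h, 0:w].astype(float)
    if mask is None:
        mask = np.ones((h, w), dtype=bool)

    def design(x, y):
        # Second-order polynomial basis in the pixel coordinates.
        return np.column_stack([np.ones_like(x), x, y, x * y, x**2, y**2])

    coef, *_ = np.linalg.lstsq(design(xx[mask], yy[mask]), image[mask],
                               rcond=None)
    return (design(xx.ravel(), yy.ravel()) @ coef).reshape(h, w)
```

Fitting per tissue region and fusing the resulting surfaces, as in the paper's first scheme, would amount to calling this with one mask per region and blending the outputs.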

The programmed cell death 4 (PDCD4) protein has recently been recognized as a new and attractive target in acute respiratory distress syndrome. Here, we attempted to discover new and potent PDCD4 mediator ligands from biogenic compounds using a combined strategy of statistical virtual screening and experimental affinity assays. In this procedure, a Gaussian process-based quantitative structure-activity relationship regression predictor was developed and validated statistically on a curated panel of structure-based protein-ligand affinity data. The predictor was integrated with pharmacokinetics analysis, chemical redundancy reduction, and flexible molecular docking to perform high-throughput virtual screening against a distinct library of chemically diverse, drug-like biogenic compounds. Consequently, 6 hits with top scores were selected, and their binding affinities to the recombinant protein of human PDCD4 were measured, 3 of which were determined to have high or moderate affinity, with *K*_{d} values at the micromolar level. Structural analysis of the protein-ligand complexes revealed that hydrophobic interactions and van der Waals contacts are the primary chemical forces stabilizing the complex architecture of PDCD4 with these mediator ligands, while a few hydrogen bonds, salt bridges, and/or π-π stacking interactions at the complex interfaces confer selectivity and specificity on the protein-ligand recognition. It is suggested that the statistical Gaussian process-based quantitative structure-activity relationship screening strategy can be successfully applied to the rational discovery of biologically active compounds. The newly identified molecular entities targeting PDCD4 are considered promising lead scaffolds for developing novel chemical therapeutics for acute respiratory distress syndrome.