Volume 5, Issue 16
Regular Article

Improved peak detection and quantification of mass spectrometry data acquired from surface‐enhanced laser desorption and ionization by denoising spectra with the undecimated discrete wavelet transform

Kevin R. Coombes Dr.

Corresponding Author

E-mail address: krc@odin.mdacc.tmc.edu

Department of Biostatistics and Applied Mathematics, The University of Texas M. D. Anderson Cancer Center, Houston, TX, USA

Department of Biostatistics and Applied Mathematics, Box 447, The University of Texas M. D. Anderson Cancer Center, 1500 Holcombe Blvd., Houston, TX 77030, USA. Fax: +1‐713‐745‐4949
Spiridon Tsavachidis

Department of Biostatistics and Applied Mathematics, The University of Texas M. D. Anderson Cancer Center, Houston, TX, USA

Jeffrey S. Morris

Department of Biostatistics and Applied Mathematics, The University of Texas M. D. Anderson Cancer Center, Houston, TX, USA

Keith A. Baggerly

Department of Biostatistics and Applied Mathematics, The University of Texas M. D. Anderson Cancer Center, Houston, TX, USA

Mien‐Chie Hung

Department of Molecular and Cellular Oncology, The University of Texas M. D. Anderson Cancer Center, Houston, TX, USA

Henry M. Kuerer

Corresponding Author

Department of Surgical Oncology, The University of Texas M. D. Anderson Cancer Center, Houston, TX, USA

Department of Biostatistics and Applied Mathematics, Box 447, The University of Texas M. D. Anderson Cancer Center, 1500 Holcombe Blvd., Houston, TX 77030, USA. Fax: +1‐713‐745‐4949
First published: 27 October 2005
Citations: 190

Abstract

Mass spectrometry is being used to find disease‐related patterns in mixtures of proteins derived from biological fluids. Questions have been raised about the reproducibility and reliability of peak quantifications using this technology. We collected nipple aspirate fluid from breast cancer patients and healthy women, pooled the samples into a single quality control sample, and produced 24 replicate SELDI spectra. We developed a novel algorithm to process the spectra, denoising with the undecimated discrete wavelet transform (UDWT), and evaluated it for consistency and reproducibility. UDWT efficiently decomposes spectra into noise and signal. The noise is consistent and uncorrelated. Baseline correction produces isolated peak clusters separated by flat regions. Our method reproducibly detects more peaks than the method implemented in Ciphergen software. After normalization and log transformation, the mean coefficient of variation of peak heights is 10.6%. Our method of processing spectra improves on existing methods. Denoising using the UDWT appears to be an important step toward obtaining more accurate results. It improves the reproducibility of quantifications and supplies tools for investigating variations in the technology more carefully. Further study will be required, because we do not have a gold standard providing an objective assessment of which peaks are present in the samples.
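
The denoising step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Haar wavelet, the circular boundary handling, the hard threshold, the noise estimate from the median absolute deviation of the finest-scale coefficients, and all function names and default parameters (`levels`, `multiplier`) are assumptions made for illustration, since the abstract specifies none of them.

```python
import math
import statistics


def udwt_haar(signal, levels):
    """Undecimated Haar transform (assumed: circular boundary, orthonormal scaling).

    Returns (approx, details). details[j] holds the level j+1 coefficients.
    Every output has the same length as the input because nothing is downsampled.
    """
    n = len(signal)
    approx = list(signal)
    details = []
    for j in range(levels):
        step = 2 ** j  # spread the filter taps apart instead of decimating
        shifted = [approx[(i + step) % n] for i in range(n)]
        details.append([(a - b) / math.sqrt(2) for a, b in zip(approx, shifted)])
        approx = [(a + b) / math.sqrt(2) for a, b in zip(approx, shifted)]
    return approx, details


def inverse_udwt_haar(approx, details):
    """Invert udwt_haar by averaging the two redundant reconstructions per level."""
    n = len(approx)
    current = list(approx)
    for j in reversed(range(len(details))):
        step = 2 ** j
        d = details[j]
        current = [
            0.5 * ((current[i] + d[i])
                   + (current[(i - step) % n] - d[(i - step) % n])) / math.sqrt(2)
            for i in range(n)
        ]
    return current


def denoise_spectrum(intensities, levels=4, multiplier=3.0):
    """Hard-threshold the detail coefficients and reconstruct the spectrum.

    The noise scale is estimated as MAD / 0.6745 of the finest-scale details
    (an assumption, not taken from the abstract).
    """
    approx, details = udwt_haar(intensities, levels)
    finest = details[0]
    center = statistics.median(finest)
    sigma = statistics.median([abs(c - center) for c in finest]) / 0.6745
    threshold = multiplier * sigma
    kept = [[c if abs(c) > threshold else 0.0 for c in d] for d in details]
    return inverse_udwt_haar(approx, kept)
```

Calling `denoise_spectrum(intensities)` on a list of raw intensities returns a smoothed spectrum of the same length. Subtracting it from the input gives the estimated noise component, which is the signal and noise decomposition the abstract refers to. Because the transform is undecimated, the result does not depend on where the spectrum happens to start, which is the usual reason for preferring it to the decimated transform when denoising.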

