Spectral Databases, Infrared
Published Online: 15 SEP 2006
Copyright © 2000 John Wiley & Sons, Ltd. All rights reserved.
Encyclopedia of Analytical Chemistry
How to Cite
Dębska, B. J. and Guzowska-Świder, B. 2006. Spectral Databases, Infrared. Encyclopedia of Analytical Chemistry.
Infrared (IR) spectroscopic analysis is one of the most important means of determining the structure of organic compounds because it provides a great deal of information about molecular structure. Innumerable IR spectra have been measured since the Coblentz collection was published in 1905, and many catalogs (printed collections and, later, computer IR databases) have been developed. These experimental data have been very useful in the structure elucidation process, and many studies have been conducted on the relationships between organic functional groups and their absorption bands. Chemists usually carry out structural analysis using two methods. One is to search a library of standard spectra and find the closest match to the unknown spectrum [the library search (LS) method]. The other is to find characteristic spectral features associated with particular parts of a molecule, an approach based on the empirical examination of a large number of spectra of known compounds. The conventional process of structure elucidation can be time-consuming, especially for complex, multifunctional compounds. Computers offer the promise of enhanced human productivity in this field. A number of systems have been developed for searching a collection of spectral data to find reference spectra identical with, or similar to, the spectrum of an unknown compound. These systems may also retrieve other information, such as molecular formulas, chemical names, Chemical Abstracts Service (CAS) registry numbers, molecular fragments or complete structures of compounds. Most search systems are designed for a single type of spectrum (e.g. IR spectra), but there are also multimethod systems that combine several spectral techniques [IR, nuclear magnetic resonance (NMR), mass spectrometry (MS), Raman or ultraviolet/visible (UV/VIS)] to improve the results of the structure recognition process.
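The LS idea described above can be illustrated with a minimal sketch. It assumes that every spectrum has already been resampled onto the same wavenumber grid and is stored as a list of absorbance values, and it uses cosine similarity as the match score; both are illustrative assumptions rather than the method of any particular search system. The names `cosine_similarity`, `library_search` and the toy `LIBRARY` entries are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two intensity vectors (1.0 = same shape)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero spectrum cannot be matched meaningfully
    return dot / (norm_a * norm_b)


def library_search(unknown, library, top_n=3):
    """Rank reference spectra by similarity to the unknown spectrum.

    `library` maps a compound name to its intensity vector; all vectors
    are assumed to be sampled on the same wavenumber grid as `unknown`.
    """
    for name, spectrum in library.items():
        if len(spectrum) != len(unknown):
            raise ValueError(f"grid mismatch for reference spectrum {name!r}")
    scored = [
        (name, cosine_similarity(unknown, spectrum))
        for name, spectrum in library.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)  # best match first
    return scored[:top_n]


# Hypothetical toy library: five grid points per spectrum, made-up values.
LIBRARY = {
    "compound_A": [0.1, 0.9, 0.2, 0.0, 0.7],
    "compound_B": [0.8, 0.1, 0.0, 0.6, 0.1],
    "compound_C": [0.1, 0.8, 0.3, 0.1, 0.6],
}

unknown = [0.1, 0.9, 0.2, 0.0, 0.7]
hits = library_search(unknown, LIBRARY, top_n=2)
for name, score in hits:
    print(f"{name}: {score:.3f}")
```

Real search systems typically add preprocessing (baseline correction, normalization, peak picking) and may use other metrics; the sketch only shows the ranking step that returns the closest reference spectra.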
The weakness of the LS approach is that it requires a powerful computing system and a large database. Moreover, even a large database may not contain a reference spectrum identical with, or highly similar to, the sample spectrum. The LS method can therefore solve the problem of structure identification only when the database contains a spectrum identical with (or highly similar to) the sample spectrum. Otherwise, to elucidate the structure of an unknown compound, it is necessary to use computer methods that recognize the structural fragments present in the molecule of the analyzed substance. These programs use correlation tables (lists of structural fragments and their characteristic spectral parameters), which can form a knowledge database or be an integral part of the program. These parameters are usually taken from the literature, but they can also be generated from a computer library of IR spectra by computer-simulated neural networks, statistical algorithms or other methods. Some structure recognition systems test a large number of spectral and structural features in order to calculate decision functions that discriminate between classes of compounds. When applied to the spectrum of an unknown structure, these functions indicate the presence or absence of the respective molecular fragments. Methods of this type include various forms of cluster analysis, pattern recognition methods and computer-simulated neural networks.
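A correlation table of the kind mentioned above can be sketched as a list of fragments with the wavenumber interval in which each is expected to absorb. The intervals below are approximate textbook group-frequency ranges chosen for illustration only, not values taken from this article, and the names `CORRELATION_TABLE` and `suggest_fragments` are hypothetical.

```python
# Hypothetical correlation table: (fragment label, low, high) in cm^-1.
# The ranges are approximate, illustrative group-frequency intervals.
CORRELATION_TABLE = [
    ("O-H stretch (alcohol)", 3200.0, 3600.0),
    ("C-H stretch (alkane)", 2850.0, 3000.0),
    ("C#N stretch (nitrile)", 2210.0, 2260.0),
    ("C=O stretch (carbonyl)", 1650.0, 1800.0),
]


def suggest_fragments(peaks, table=CORRELATION_TABLE):
    """Return labels of fragments whose interval contains at least one peak.

    `peaks` is a list of observed peak positions in cm^-1.
    """
    found = []
    for label, low, high in table:
        if any(low <= peak <= high for peak in peaks):
            found.append(label)
    return found


# Made-up peak list for an unknown sample.
peaks = [2960.0, 1715.0, 1365.0]
fragments = suggest_fragments(peaks)
print(fragments)
```

A peak falling inside an interval only suggests that the fragment may be present. Practical systems also weigh band intensity and shape, overlapping intervals and the absence of expected bands, which this sketch ignores.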
The first collections of IR spectra (printed and computer IR databases) were often of poor quality, owing to the low technological level of the spectrometers used. Contemporary IR databases contain spectra of considerably higher quality, owing to the advent of high-resolution spectrometers and proper sample preparation (according to the required standards). Apart from large spectral databases (SDBs) containing data on many different chemical compounds, a large number of collections dedicated to specialized groups of compounds are available. Recently, a new type of printed spectral catalogue has appeared on the market: the traditional book is supplemented by a diskette containing peak tables in digital format, allowing the reader to search for unknowns. As far as computer databases are concerned, many efforts are aimed at unifying spectral and chemical structure codes, which will ease the exchange of information between scientific centers. The International Union of Pure and Applied Chemistry (IUPAC) has also published a very favorable opinion of the Joint Committee on Atomic and Molecular Physical Data (JCAMP) format, which makes it very likely that this format will be widely used in the future.
All the above-mentioned issues are discussed in this article, with a detailed focus on (1) the history of printed and computer IR collections, (2) the more important IR databases, (3) standards in IR databases, (4) the structure of IR databases, (5) monomethod identification of chemical compounds and (6) the application of multimethod SDBs to structure elucidation.