Information assessment is fundamentally either qualitative or quantitative. The former is subjective, based on apparent qualities, and the latter is objective, involving measurable quantities and numerical descriptors. Medicine is primarily quantitative; routinely, we use biomarkers to make diagnoses, distinguish normal from abnormal physiological conditions and monitor treatments. According to Wikipedia, a biomarker is: ‘anything that can be used as an indicator of a particular disease state or some other physiological state of an organism’. In order for a biomarker to be clinically useful, it must be validated, and its assessment standardized and reproducible between different laboratories and equipment, and the information provided by the biomarker should be relevant to the physiological condition. A simple example is body temperature: fever is a biomarker of infection or inflammation.
Medical imaging, in contrast, is primarily qualitative. The details of the physical interactions that occur between an object and an imaging modality hold a wealth of information but, historically, we have ignored these details in favor of qualitative interpretation. In other words, we rely on pattern recognition of the spatially varying image brightness to identify a structure and distinguish normal from abnormal. This is changing, however, as the value of image quantification is being recognized increasingly. Quantitative imaging is: ‘the extraction of quantifiable features from medical images for the assessment of normal or the severity, degree of change, or status of a disease, injury or chronic condition’.
The process for evaluating a biomarker can be simple – most people keep thermometers at home to test for fever – or it can be fairly sophisticated, involving tracking of radiolabeled substances introduced into the body to identify sites of interest. For instance, positron emission tomography (PET) using radiolabeled 18F-fluorodeoxyglucose provides information about sites of metabolic activity (glucose uptake). Similarly, gadolinium-based magnetic resonance imaging (MRI) contrast agents can be tagged to improve target specificity and provide analogous information about tumor volume or pharmacokinetics. These complex imaging techniques are presently receiving a great deal of attention as the field of quantitative imaging is exploding. However, one of the earliest successful examples of quantitative imaging is simple and is quietly being used daily in clinical practice: fetal biometry. The metrics (biparietal diameter, head circumference, abdominal circumference and femur length) have been validated and standardized, they are applicable across different imaging systems and the information is relevant to the fetal condition and clearly clinically useful.
We in obstetrics and gynecology continue to push the boundaries of quantitative imaging. Evaluation of the fetal heart is an excellent example[2-5]. Recently, the Journal published a summary of the state of quantitative fetal heart imaging, in which Hornberger described how relating individual fetal cardiovascular measurements to normative data improves neonatal prognosis, because it allows interventions to be based on a more sophisticated understanding of individual cardiac pathophysiology. In other words, validation and standardization of specific measures that are relevant to the fetal condition have resulted in clinically useful interventions. This is precisely the role of an effective biomarker.
There are many other examples of quantitative imaging in the fetus, including imaging of the bones, craniofacial structures, lungs and brain. Quantitative ultrasound techniques are also being explored actively for assessment of the placenta and especially the cervix[11-17]. The first reference to cervical quantitative imaging was in this Journal in 2006, and the current issue holds a compelling paper by Hernandez-Andrade et al., which is likely the most recent word.
Why quantify the cervix?
We already evaluate the cervix quantitatively when we measure its length, and the value of this measurement is indisputable. It is also not enough; our approach to screening and treatment for preterm birth prevention remains controversial after hundreds of publications since the original 1996 paper establishing the inverse relationship between midtrimester cervical length and risk of preterm birth. Essentially, the short cervix does not tell the whole story; most women with a midtrimester short cervix (but no history of preterm birth) do not deliver preterm, the risk reduction for those who have an intervention is relatively modest (less than 50%), and most preterm births in low-risk women occur in those with a normal midtrimester cervical length.
Long before it is short, the cervix softens as its collagen microstructure begins rearranging and it takes on water[20-27]. Recent investigations thus attempt to assess this microstructure non-invasively, and generally fall into three categories, evaluating: hydration status, tissue softness/stiffness and/or actual collagen arrangement. Cervical hydration increases as pregnancy progresses, and attenuation (the loss of ultrasound signal power with depth as a function of frequency) decreases with increasing hydration. Measuring attenuation should therefore be useful for studying cervical remodeling in pregnancy, and the first report of attenuation was published in this Journal in 2010. In that cross-section of 41 women in all trimesters of pregnancy, transvaginal images of the cervix were obtained in the standard manner, several regions of interest (ROIs) were chosen manually from the B-mode image and attenuation was calculated offline (Figure 1). Attenuation was found to be a predictor of interval from ultrasound exam to delivery, but not of gestational age. The latter is surprising, given the known increase in hydration throughout gestation, but the technique's potential was nevertheless suggested by the findings in two cases: one woman at 18 weeks and another at 29 weeks had a high attenuation value but a short cervix and both delivered at term.
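The notion of attenuation as loss of signal power with depth can be illustrated with a minimal sketch. It assumes a single frequency, homogeneous tissue, and echo power already expressed in decibels at known depths. The function name and sample values are hypothetical, and this is a simplified illustration rather than the algorithm used in the 2010 study.

```python
def attenuation_db_per_cm(depths_cm, power_db):
    """Estimate one-way attenuation (dB/cm) from echo power versus depth.

    Assumption: tissue is homogeneous, so echo power in dB falls linearly
    with depth. The echo travels to the reflector and back, so the one-way
    attenuation is taken as half the magnitude of the round-trip slope.
    """
    n = len(depths_cm)
    if n < 2 or n != len(power_db):
        raise ValueError("need at least two paired depth/power samples")
    mean_z = sum(depths_cm) / n
    mean_p = sum(power_db) / n
    sxx = sum((z - mean_z) ** 2 for z in depths_cm)
    if sxx == 0:
        raise ValueError("depths must not all be identical")
    sxy = sum((z - mean_z) * (p - mean_p) for z, p in zip(depths_cm, power_db))
    round_trip_slope = sxy / sxx  # dB per cm of depth, negative when power is lost
    return -round_trip_slope / 2.0


# Hypothetical samples within one region of interest.
depths = [0.0, 1.0, 2.0, 3.0]
power = [0.0, -2.0, -4.0, -6.0]
print(attenuation_db_per_cm(depths, power))
```

The straight-line fit is only meaningful if the homogeneity assumption holds within the region of interest, which is why the choice of region matters so much in practice.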
Evaluating cervical softness is attractive because the cervix softens in preparation for delivery, and therefore several metrics that compare the cervix before and after deformation have been proposed. One simple method appeared in the Journal in 2011. An image of the cervix is obtained in the standard manner and a second image is acquired after pressure is applied with the transducer until the cervix will not deform further (indicated by maximal shortening of the anteroposterior diameter) (Figure 2). The metric is the ratio of the postdeformation to the predeformation anteroposterior diameter, and is termed the ‘cervical consistency index’ (CCI); a lower CCI is associated with a softer cervix, because a softer cervix shortens more under compression. In a cross-section of more than 1000 women across gestation, the CCI had a linear relationship with gestational age and was lower in pregnancies that ultimately delivered preterm.
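As a rough illustration of such a ratio metric, the sketch below assumes the index is the compressed anteroposterior diameter expressed as a percentage of the uncompressed diameter, which is the convention under which a lower value means a softer cervix. The function name and the measurements are hypothetical.

```python
def cervical_consistency_index(ap_before_mm, ap_after_mm):
    """Compressed anteroposterior diameter as a percentage of the original.

    Assumption: both diameters are measured on the same plane, before and
    at maximal compression. A softer cervix shortens more, giving a lower value.
    """
    if ap_before_mm <= 0:
        raise ValueError("uncompressed diameter must be positive")
    if not 0 <= ap_after_mm <= ap_before_mm:
        raise ValueError("compressed diameter must lie between 0 and the original")
    return 100.0 * ap_after_mm / ap_before_mm


# Hypothetical measurements in millimeters.
firmer = cervical_consistency_index(30, 24)
softer = cervical_consistency_index(30, 15)
print(firmer, softer)
```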
Elastography is a more sophisticated means of ultrasound assessment of tissue pre- and postdeformation, and there are several viable approaches. This technique has gained attention recently as equipment capable of generating elastographic data has become available commercially. Elastography is based on determining relative motion in areas of a tissue compared to neighboring areas, typically presented as a color map overlaid on the B-mode image (an ‘elastogram’ or ‘elastographic image’). Conceptually, this approach to imaging tissue elasticity is simple. It is analogous to describing the squeezing or stretching of a spring: elastographic images display mechanical strain, which is the change in the length of the spring relative to its original length. In other words, how much the tissue ‘squeezes’ when the transducer is pressed against it provides information about its relative elasticity (its stiffness/softness) compared to its surroundings. The most common cervical elastography method involves applying gentle rhythmic pressure with a transvaginal probe pressing on the cervix to generate small deformations.
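The spring analogy can be written out as a short sketch. The function name and the lengths are hypothetical, and compression is treated as positive strain purely for convenience.

```python
def compressive_strain(original_length_mm, compressed_length_mm):
    """Change in length relative to the original length (dimensionless)."""
    if original_length_mm <= 0:
        raise ValueError("original length must be positive")
    return (original_length_mm - compressed_length_mm) / original_length_mm


# Two hypothetical neighboring regions, each 10 mm thick before compression.
soft_region = compressive_strain(10, 8)   # shortens by 2 mm
stiff_region = compressive_strain(10, 9)  # shortens by 1 mm

# Strain ranks the two regions against each other; on its own it says
# nothing about the absolute stiffness of either one.
print(soft_region > stiff_region)
```

The comparison is only meaningful if both regions experience a similar applied force, which is the assumption the standardization debate turns on.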
The first mention of elastography for an obstetric purpose was in a ‘Picture of the Month’ feature in this Journal in 2006. It described a metric called the ‘tissue quotient’, a ratio of the percentage of soft tissue (easily deformed, indicated by a red color on the elastographic image), to that of stiffer tissue (green) (Figure 3). Although this approach held promise initially, results in 61 pregnant women indicated no significant stiffness changes in the cervix during normal pregnancy, when in fact there is ample evidence (including experiential) to suggest that the cervix softens throughout pregnancy.
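A ratio of this kind can be sketched as follows, assuming each pixel in the region has already been labeled with its elastogram color. The labeling step, the function name and the pixel counts are hypothetical and are not taken from the original report.

```python
def tissue_quotient(pixel_colors):
    """Ratio of the percentage of soft (red) pixels to stiffer (green) pixels."""
    total = len(pixel_colors)
    if total == 0:
        raise ValueError("no pixels supplied")
    pct_red = 100.0 * pixel_colors.count("red") / total
    pct_green = 100.0 * pixel_colors.count("green") / total
    if pct_green == 0:
        raise ValueError("no green pixels, so the quotient is undefined")
    return pct_red / pct_green


# Hypothetical region: 30% red, 60% green, 10% other colors.
region = ["red"] * 30 + ["green"] * 60 + ["blue"] * 10
print(tissue_quotient(region))
```

Because the quotient depends entirely on how the machine maps strain to color, any change in the color scale changes the result.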
Recent approaches have shown greater promise. In 2011, Swiatkowska-Freund and Preis evaluated elastography for determining success of labor induction in 29 pregnant women at term. After identifying the standard B-mode image of the cervix, the probe was held in place while the patient's breathing and arterial pulsation caused enough movement to create elastographic images. The metric, the ‘elastography index’ (EI), is based on the points assigned to each color on the map: purple (stiffest, 0), blue (1), green (2), yellow (3) and red (softest, 4) (Figure 4). They found the tissue around the internal os to be statistically significantly softer (higher EI) in patients with successful induction compared with those in whom induction failed.
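The color-to-points scale described above amounts to a simple lookup, sketched here. The point values come from the text; the function name, and the idea that an observer passes in a single color name for the region being scored, are assumptions made for illustration.

```python
# Points per color as described in the text: purple is stiffest, red is softest.
EI_POINTS = {"purple": 0, "blue": 1, "green": 2, "yellow": 3, "red": 4}


def elastography_index(color):
    """Return the points assigned to the color seen in a region of the cervix."""
    key = color.strip().lower()
    if key not in EI_POINTS:
        raise ValueError(f"unknown elastogram color: {color!r}")
    return EI_POINTS[key]


# Hypothetical reading at the internal os.
print(elastography_index("yellow"))
```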
In 2012, Molina et al. examined a cross-section of 112 women with a mean gestational age of 21 weeks. Deformation was created by advancing the probe by ‘about 1 cm’ into the cervical tissue. On the colored elastographic image, four ROIs 6 mm in diameter were evaluated (external/superior lip, internal/superior lip, internal/inferior lip and external/inferior lip) (Figure 5). The external and superior regions of the cervix were statistically significantly softer compared with the internal and inferior regions.
In this issue of the Journal, Hernandez-Andrade et al. address the problem that: ‘in most elastography reports in obstetrics and gynecology the definition of tissue stiffness/softness relies on this operator-dependent qualitative estimation’. In other words, deformation has not been standardized. They attempt to semiquantify the transducer force by watching the ‘pressure bar’ displayed on the screen, adjusting the pressure on the transducer to keep it within a certain range while applying continuous oscillatory pressure to deform the tissue. They found that strain (elastographic) images differed significantly according to history of previous preterm delivery and cervical length.
A lively debate
Measurement standardization is a critical requirement for a biomarker. It is therefore not surprising that a lively debate on the topic has appeared in the Journal. All of the above investigators mention their own challenges with standardization, and several use this as a means with which to criticize others' work. For instance, McFarlin et al. noted considerable intersubject variability in their attenuation measurements, complicated by an inability to standardize comparison between subjects because ROIs were selected manually. Parra-Saavedra et al. noted that careful training would be necessary for reproducibility because a limitation of their CCI method was standardization. With regard to elastography, years ago Thomas et al. noted that ‘assessment of compression should be performed under standardized conditions’. More recently, Swiatkowska-Freund and Preis discussed the necessity of ‘standardization of the cervical properties seen on elastography’, Molina et al. summarized that ‘additional useful information on pregnancy outcome, over and above what is currently achieved by measurement of cervical length, will require standardization’, and Hernandez-Andrade et al., in this issue of the Journal, discuss a semiquantitative method of standardization.
In the November 2012 issue of the Journal, Correspondence between Fruscalzo and Schmitz and Molina et al. highlighted the relative merits of each of their approaches to standardization. Fruscalzo and Schmitz criticized Molina et al.'s technique, stating that ‘it is obvious that the arbitrary guideline of compressing the tissue by 1 cm is not associated in any predictable way with the actual force applied’. They discussed their own method, in which compression is achieved by pressing the probe on the anterior cervical lip, perpendicularly to the longitudinal axis of the cervix, until the posterior lip is displaced. In a pilot study of 10 pregnant women, their interobserver reliability was high, and they stated that ‘even if not measurable, the force can be standardized’. Molina et al. argued that Fruscalzo and Schmitz's method ‘does not sound to us more reproducible’, and countered that, although their method demonstrated adequate intra- and interobserver reproducibility, ‘some mechanical parameters (e.g. stiffness modulus), cannot be measured as absolute values; only their relative values within a region of interest (e.g. rate-of-change of displacement) can be assessed’. Fruscalzo and Schmitz further criticized Molina et al.'s ROI dimensions as arbitrary, noting that ‘the dimensions and location of the region of interest (ROI) play a central role in the strain values obtained, independent from the force applied’. Certainly, exact location of the ROI with respect to the transducer needs to be taken into account, because tissue closer to the transducer will deform differently from that further away; Molina et al. in fact recognized this, stating that, while it may be possible to standardize the force at the contact point of the tissue with the transducer, it is impossible to do so throughout the cervix. Hernandez-Andrade et al. reiterate this concern, noting that it could affect the strain rate values if evaluation of the entire cervix is desired; even if the force from mechanical compression can be standardized at the transducer site, it may not reach more remote areas of the cervix, and thus regions located ‘deeper and laterally may not show strain values in agreement with real tissue stiffness’.
These important discussions illustrate the fundamental issue with cervical elastography: most elastographic imaging systems display relative displacement (mechanical strain) and therefore this approach to elastography is most useful for imaging adjacent areas of very different stiffness, rather than for determining the overall stiffness of a structure. The original intent of elastography was to discern regions of marked stiffness differences relative to neighboring tissue for the identification of isolated tumors (such as in the breast), and the approach works well for that purpose. Molina et al. note that ‘in contrast to the study of tumors, where the stiffness is compared with that of normal adjacent tissues equidistant from the tip of the transducer, the application of elastography in the study of the healthy cervix in pregnancy is limited by the lack of a reference tissue for comparison’. Swiatkowska-Freund and Preis also said it well: ‘Some authors have tried to assess the stiffness of whole organs, such as cirrhotic liver, but it is much more difficult to interpret such findings because elastography shows the relative stiffness of different parts of tissues rather than providing an objective measurement of stiffness.’
Intrinsic cervical heterogeneity further complicates quantitative assessment. As noted by Parra-Saavedra et al., the cervix is extremely heterogeneous; this is readily apparent on B-mode imaging, in which the macrostructural endocervical canal, cystic areas and blood vessels can be seen, not to mention the heterogeneity of the cervical microstructure, which is made up of layers of collagen of varying alignment, all of which may affect measurement results. Parra-Saavedra et al. criticized the work of McFarlin et al. on these grounds; specifically, they noted that the attenuation algorithm is violated because it assumes homogeneous tissue, when the cervix is decidedly not homogeneous. Gross cervical structure also affects elastography measurements. The fact that the cervix is approximately cylindrical with a canal in the center causes complexity in relative deformation (strain) that does not exist for a structure like the breast. Furthermore, because strain is a relative measure and not an intrinsic material property, elastographic images are highly dependent on not only the transducer force and structure of the tissue itself, but also boundary conditions, such as attachment to surroundings. This makes the cervix particularly complicated because it hangs in a potential space (the vagina), suspended by ligaments attached to the uterus in the pelvis, and is extremely susceptible to movement. Molina et al. noted that ‘previous studies have described that the elastographic color of the cervix is not homogeneous but different parts appear to have different degrees of stiffness. Our study has confirmed this apparent lack of homogeneity in the measurable stiffness of the cervix.’ On the other hand, they identify that ‘unlike the comparison of malignant tumor with benign adjacent tissue of the breast, there is no evidence of a true variation in the stiffness of different parts of the cervix’. 
In other words, for a number of reasons, standardizing the deformation of the cervix is very challenging, making result interpretation difficult.
A final, but significant, issue regarding standardization of elastographic measurements is elucidated by Hernandez-Andrade et al.: ‘the color code differs across ultrasound units and can be modified by the operator’. This point has also been raised by Malgorzata Swiatkowska, who noted that the same cervix, examined with different equipment, produces different results (pers. comm.).
In summary, these arguments demonstrate that interpreting strain images as surrogate quantitative measures of cervix stiffness is not trivial. It requires standardization of the ultrasound machine settings and the contact pressure, and it is hugely complicated by the boundary conditions of the deformation and the anatomy of the structure itself. Converting commercially available strain image data to absolute elastic parameters, such as Young's modulus, is therefore difficult, particularly because no commercially available ultrasound imaging system currently provides a sensor with which to measure contact (tactile) pressure.
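The dependence on contact pressure can be made concrete with a minimal sketch. It assumes the simplest possible model (uniform uniaxial stress and linear elasticity, so that Young's modulus is stress divided by strain), which the cervix does not satisfy; the function name and all numbers are hypothetical.

```python
def youngs_modulus_kpa(contact_stress_kpa, strain):
    """E = stress / strain under uniform uniaxial stress and linear elasticity."""
    if strain <= 0:
        raise ValueError("strain must be positive")
    return contact_stress_kpa / strain


# The same observed strain is compatible with very different stiffness values,
# depending on a contact stress that current systems do not measure.
observed_strain = 0.10
for assumed_stress_kpa in (1.0, 2.0, 5.0):
    print(assumed_stress_kpa, youngs_modulus_kpa(assumed_stress_kpa, observed_strain))
```

Even with a pressure sensor, this idealized relation would hold only near the transducer, since the applied stress is not uniform throughout the cervix.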
Lessons from the past
Standardization issues are not exclusive to the cervix. Because standardization is a critical requirement for a biomarker, it is an issue for all areas of quantitative imaging, many of which have been explored intensely for more than a decade. A recent review of quantitative methods using computed tomography (CT) biomarkers to measure lung tumor volume notes that, after nearly two decades, methods are still limited by measurement error, technical feasibility, image acquisition standardization among different systems and imaging techniques. Nuclear imaging techniques and dynamic function tests for the liver, including those using magnetized blood, gadolinium and radioactivity, have been on the edge of a revolution for years, but nothing has emerged quite yet, partly because issues of standardization are a significant challenge. These investigators suggest that success is likely to be achieved only through a combination of approaches and collaboration (see discussion below about the Quantitative Imaging Biomarkers Alliance)[30, 32].
Looking into the future
More information is on the horizon; many of the investigators mentioned above are currently collecting more data in pregnant women, and attempts to standardize measurements continue (M. Parra-Saavedra, pers. comm.; B. McFarlin, pers. comm.). An exciting technological development we may see in the near future is an ultrasonic sensor able to quantify the circumferential shear modulus of the cervix (F. Molina and G. Rus, pers. comm.). In addition, elasticity imaging methods based on ‘remote palpation’ are under active investigation, and may be promising for assessing tissue softness/stiffness. These methods are far less dependent on boundary conditions than is (strain imaging) elastography. Systems capable of shear wave elasticity imaging have recently become commercially available and, like (strain imaging) elastography, have shown some success in simple, relatively homogeneous organs such as the liver. Application of shear wave imaging to more complicated structures, such as the cervix, is likely to be problematic, because transverse (shear) wave behavior is much more complicated in a heterogeneous structure; as with attenuation, accurate measurement depends upon simple assumptions, including tissue homogeneity, an assumption which is violated in the structurally heterogeneous cervix.
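The simple assumptions behind transverse (shear) wave methods can be illustrated with the textbook relation between wave speed and shear modulus. The sketch assumes a homogeneous, isotropic, linear elastic and nearly incompressible medium with a density close to that of water; the function names, the density value and the example speed are assumptions for illustration only.

```python
TISSUE_DENSITY_KG_M3 = 1000.0  # assumption: soft tissue density close to water


def shear_modulus_kpa(shear_wave_speed_m_s, density_kg_m3=TISSUE_DENSITY_KG_M3):
    """mu = rho * c**2, valid for a homogeneous, isotropic, linear elastic medium."""
    if shear_wave_speed_m_s < 0:
        raise ValueError("wave speed must be non-negative")
    return density_kg_m3 * shear_wave_speed_m_s ** 2 / 1000.0  # Pa converted to kPa


def youngs_modulus_from_speed_kpa(shear_wave_speed_m_s, density_kg_m3=TISSUE_DENSITY_KG_M3):
    """E = 3 * mu, under the further assumption of incompressibility."""
    return 3.0 * shear_modulus_kpa(shear_wave_speed_m_s, density_kg_m3)


# Hypothetical wave speed of 2 m/s.
print(shear_modulus_kpa(2.0), youngs_modulus_from_speed_kpa(2.0))
```

In layered, heterogeneous tissue the measured speed varies with direction and location, so a single modulus computed this way should be read with caution.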
That said, the approach holds promise; in fact, our group is investigating transverse wave speeds for quantitative measurement of cervical softness, in addition to an ultrasound method using acoustic backscatter to infer its collagen microstructure. We are also systematically corroborating these measurements with (micron-scale) non-linear optical microscopy images of full (several centimeter-scale) cervical cross-sectional specimens for validation and detailed interpretation. Careful observation of wave behavior seems not only to provide an objective measure of differing softness/stiffness in specific regions of the cervix, but also to provide critical information about cervical microstructure, for example collagen layer boundaries. Frustratingly, however, the complexity of these problems, the underlying biological variability and the need for very careful standardization and validation of findings have forced slow progression towards clinical implementation.
No time like the present
Malgorzata Swiatkowska summarized the present situation perfectly: ‘It is essential now to standardize the measurements or to just decide we do not need quantitative assessment.’ (pers. comm.). Fortunately, because quantitative imaging of the cervix is still in its infancy, we have a unique opportunity to establish a collaborative effort to make progress conscientiously. This is very timely, because the issue of standardization is so critical that there have been recent changes in Food and Drug Administration (FDA) policy regarding the display of quantitative features on ultrasound images. The FDA is also evolving towards requiring more rigorous testing of accuracy and reproducibility, because, with the advent of quantitative imaging, we are seeing more complex measures that are often disease- and/or organ-specific.
Fundamental requirements for effective quantitative imaging include: (a) understanding and minimizing the sources of estimate variance (including system-based sources and biological sources); (b) paying close attention to the underlying assumptions so they are not violated; (c) ensuring result reproducibility across different systems; and (d) demonstrating clinical value. A major international effort organized by the Radiological Society of North America, called the Quantitative Imaging Biomarkers Alliance (QIBA), was established for this purpose in 2007. Its intent is to unite researchers, healthcare professionals and industry to advance quantitative imaging, including the establishment of effective biomarkers. It does so through the development of collaborative imaging protocols, data analysis and display methods. QIBA is organized around imaging modalities, such as MRI, CT, PET and, as of March 2012, ultrasound. The ultrasound committee's first task is to standardize shear wave speed assessment for evaluating liver fibrosis, but the basic principles will apply to any tissue. The QIBA approach requires pursuit of biomarkers that are likely to have a marked impact on public health and address a ‘critical gap’ in the biomarker qualification/validation process. The approach needs to be feasible, practical and, importantly, collaborative. Successful implementation should, of course, result in improved patient care.
In summary, cervical quantitative imaging has great promise. Hornberger notes that ‘[c]ritical to the use of any normative fetal data is an understanding as to exactly how a measurement is performed and to what biometric measure it is indexed’. With careful, methodological approaches such as those advocated by QIBA, we can certainly adapt and develop quantitative technologies for the cervix. Comprehensive assessment of the complex cervix, via a multifaceted approach with complementary quantitative methods that have been standardized and validated, will facilitate our understanding of the central role it plays in the pathways to preterm birth. As Hornberger articulated so well, ‘this approach facilitates diagnosis and contributes significantly to our understanding…which is critical for…the development of timely and effective intervention’. The time is now to develop a collaborative strategy for quantitative imaging of the cervix.
We wish to recognize the above cervical researchers, most of whom have expressed desire to participate in a collaborative effort to standardize cervical quantitative imaging, especially Drs Swiatkowska, Parra-Saavedra, Molina and Rus. The work of H.F. and T.J.H. was supported by NIH Grants R21HD061896 and R21HD063031 from the Eunice Kennedy Shriver National Institute of Child Health and Human Development.