Cell‐free fetal DNA coming in all sizes and shapes

Abstract Cell-free fetal DNA analysis has an established role in prenatal assessments. It serves as a source of fetal genetic material that is accessible non-invasively from maternal blood. Through the years, evidence has accumulated to show that cell-free fetal DNA molecules are derived from placental tissues, are mainly short DNA fragments and have rapid post-delivery clearance profiles. But questions regarding how they come to be short molecules from placental cells and in which physical forms they exist remained largely unanswered until recently. We now know that the distributions of ending sites of cell-free DNA molecules are non-random across the genome and correlate with the chromatin structures of the cells from which they originated. Such an insight offers ways to deduce the tissue-of-origin of these molecules. In addition, the physical nature and sequence characteristics of the ends of each cell-free DNA molecule provide tell-tale signs of how the DNA fragmentation processes are orchestrated by nuclease enzymes. These realizations offer opportunities to develop methods for enriching cell-free fetal DNA to facilitate non-invasive prenatal diagnostics. Here we aim to collate what is known about the biological and physical characteristics of cell-free fetal DNA into one article and explain the implications of these observations.

Cell-free DNA analysis in maternal plasma 1 has resulted in a paradigm shift in prenatal screening for fetal chromosomal aneuploidies. 2,3 Additional clinical applications of cell-free fetal nucleic acid analysis have emerged, such as the prenatal assessment of single gene diseases, 4,5 the investigation of early or recurrent pregnancy losses 6 and the assessment of pre-eclampsia. 7 To push the envelope of this field, researchers have been asking more fundamental questions about the biological and physical nature of such circulating cell-free fetal nucleic acid molecules. With advancements in analytical and informatics tools, much progress has been made on this front.
Interesting biological features of cell-free fetal DNA (cffDNA) have been uncovered, which have provided new insights into the development of an even wider range of clinical applications. On this occasion when we mark the 10th year since the wide adoption of cffDNA analysis for screening of fetal chromosomal aneuploidies, 8-11 we hope to summarize what has been uncovered to date about the nature of this non-invasive source of fetal genetic material. Alongside this discussion, we shall also comment on the analytical or diagnostic implications of some of these biological features (Table 1).

TABLE 1 Physical or biological features of cell-free DNA and the associated clinical or analytical implications

Physical or biological characteristic: Placental origin of cffDNA
Clinical or analytical implication: Chromosomal aneuploidies confined to the placenta are detectable in maternal plasma, resulting in the clinical implication that NIPT using cell-free fetal (i.e. placental) DNA for such aneuploidies is a screening, rather than diagnostic, test

| TRACKING ITS ORIGIN
There is little dispute that the placenta is the key tissue contributor of "fetal" DNA into the maternal circulation. This consensus view is built upon several lines of evidence. The placenta was suspected to be the tissue source because cffDNA was reported to be present in maternal circulation even in a conception where there was no embryo development. 12 Circulating fetal DNA has been detected early in pregnancy at a stage before fetal organ development. 13

Occult malignancies in pregnant women releasing cancer-associated chromosomal abnormalities into the circulation may also confound NIPT results. 28,29 Because the placental methylomic profile is rather distinct from that of other tissues, cell-free DNA methylation analysis may allow the tissue-of-origin of the aneuploid DNA to be discerned, i.e. whether from the fetus or other maternal organs. 30

2 | Quantities and abundance
cffDNA is detectable from the first trimester of pregnancy. 31,32 Most NIPT is conducted from the 9th to 10th week of gestation onwards. 3,8 If performed too early in gestation, the test may fail due to insufficient cffDNA in the sample. 33 During early pregnancy, tens to hundreds of genome-equivalents of cffDNA are present in each milliliter of maternal plasma. 31 The absolute quantity of fetal DNA increases as gestation advances. 31 However, fetal DNA molecules circulate within a background of maternal cell-free DNA and exist as a minor species. Fetal DNA accounts for about 10% to 20% of maternal plasma cell-free DNA in the first and second trimesters, a proportion termed the fetal fraction. 11,34 The fetal fraction (as a percentage of total DNA in maternal plasma) increases less dramatically during the first half of pregnancy than the amount of fetal DNA (in genome-equivalents or copies) per volume of maternal plasma. This suggests that the amount of maternally derived DNA also increases with gestation. 33

The quantity of cffDNA may vary in certain pregnancy-associated circumstances. Absolute amounts of cffDNA have been reported to be elevated in pregnancies with preeclampsia, preterm labor, fetomaternal hemorrhage, invasive placentation, oligohydramnios and trisomy 21. 35,36 The underlying pathologies responsible for such quantitative changes have not been fully elucidated but have been suspected to be related to increased placental cell death or apoptosis. 35,37,38 Circulating fetal DNA levels may therefore be reflective of placental health. Because massively parallel sequencing-based methods mainly report DNA quantities as a fraction, researchers also studied whether aberrant fetal DNA fractions were associated with pregnancy-related or maternal conditions. The most commonly reported association for low fetal fractions was high maternal body mass index. 11 For the fractional value to be reduced, either the amount of cffDNA is reduced, the amount of cell-free maternal DNA is increased, or both. Increased apoptosis in adipose tissue has been reported in obesity and hence has been postulated as a plausible factor contributing to low fetal fractions in maternal obesity. 39 Pregnancies involving embryo transfer after assisted reproduction technologies tended to have lower fetal fractions. 40
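The relationship between absolute quantities and the fetal fraction can be illustrated with a small calculation. The sketch below assumes the fetal fraction is simply the fetal share of total cell-free DNA; the function name and all genome-equivalent values are hypothetical and chosen only for illustration, not taken from any study cited here.

```python
def fetal_fraction(fetal_ge_per_ml: float, maternal_ge_per_ml: float) -> float:
    """Fetal share of total cell-free DNA (both inputs in genome-equivalents per mL)."""
    total = fetal_ge_per_ml + maternal_ge_per_ml
    if total <= 0:
        raise ValueError("total cell-free DNA must be positive")
    return fetal_ge_per_ml / total


# Hypothetical values, for illustration only.
baseline = fetal_fraction(100, 900)        # reference sample
less_fetal = fetal_fraction(50, 900)       # fetal DNA halved, maternal unchanged
more_maternal = fetal_fraction(100, 1900)  # fetal unchanged, maternal increased

for label, value in [("baseline", baseline),
                     ("less fetal DNA", less_fetal),
                     ("more maternal DNA", more_maternal)]:
    print(f"{label}: {value:.1%}")
```

Both altered samples show a lower fraction than the baseline, which is why a low fetal fraction alone cannot tell which of the two populations has changed.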

| Size matters
Cell-free DNA molecules are released into the blood stream upon cell death. Consequently, circulating DNA molecules are mostly short DNA fragments. When the lengths of cell-free DNA molecules in maternal plasma were measured and plotted in a frequency distribution curve, the predominant DNA size was 166 bp. 50 This length coincides with the length of DNA associated with a mononucleosome (Figure 1). This suggests that the generation of cell-free DNA is associated with the breakdown of chromatin into nucleosomal units. Interestingly, when the lengths of fetus-specific circulating DNA molecules were measured, the predominant size was about 142 bp. 50 This is the length of DNA wound around histone core proteins in nucleosomes and is some 20 bp shorter than the maternal cell-free DNA molecules. This observation meant that cffDNA has probably undergone more processing or metabolism than the bulk of the circulating maternal DNA molecules.
Knowing the sizes of plasma DNA has several practical implications for NIPT and cffDNA analysis. For example, PCR assays need to be designed with such molecular lengths in mind. 51 Assays targeting amplicons that are too long would detect fewer template DNA molecules. PCR assays with different amplicon lengths would therefore produce different fetal DNA quantification results.
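The effect of amplicon length on template availability can be sketched as follows. This is a simplified model, assuming only that a fragment shorter than the amplicon can never serve as a template; it ignores where the fragment ends fall relative to the primers, so it gives an upper bound. The function name and the fragment lengths are hypothetical.

```python
def amplifiable_fraction(fragment_lengths: list[int], amplicon_bp: int) -> float:
    """Upper bound on the share of fragments long enough to span an amplicon."""
    if not fragment_lengths:
        raise ValueError("no fragments supplied")
    long_enough = sum(1 for n in fragment_lengths if n >= amplicon_bp)
    return long_enough / len(fragment_lengths)


# Hypothetical fragment lengths (bp), loosely centred on the mononucleosomal size.
lengths = [120, 142, 142, 150, 166, 166, 166, 180, 320, 332]

for amplicon in (60, 150, 200):
    share = amplifiable_fraction(lengths, amplicon)
    print(f"amplicon {amplicon} bp: at most {share:.0%} of fragments usable")
```

In this toy data set the usable share falls as the amplicon lengthens, which mirrors why assays with different amplicon lengths report different quantities from the same sample.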
FIGURE 1 Pictorial glossary. Illustrations to depict some of the terms referred to in the text. Cell-free DNA molecules mostly circulate as short double-stranded fragments with end termini that are blunt or jagged in nature. A blunt end is when both strands of a double-stranded DNA molecule end at the same genomic location. A jagged end is present when the two strands of a double-stranded DNA molecule end at different genomic locations. If the 5′ end of one strand protrudes more, the end is said to show a 5′ overhang. If the 3′ end of one strand protrudes more, the end is said to show a 3′ overhang. A small proportion of cell-free DNA molecules are single-stranded. The ends of cell-free DNA molecules, whether double- or single-stranded, show characteristic sequences, termed motifs. For example, a 4-nucleotide motif is termed a 4-mer end motif. Double-stranded cell-free DNA molecules usually circulate in a form where they are wound around histone proteins in the form of a nucleosome subunit. When the double-helical structure of DNA is wound around histones, it exposes the minor grooves of the 3-dimensional structure at the external surface of the nucleosome, which are susceptible to nuclease digestion. When many cell-free DNA molecules are aligned to the genome coordinates, it is noted that more molecules cover certain regions than others. This periodic coverage pattern is reflective of where protein binding, e.g. by histones and transcription factors, is present in the cellular DNA; such sites are protected from nuclease enzymes during the production of cell-free DNA. One could also determine the genomic locations of cell-free DNA ending sites, which occur more frequently at certain locations than others. Sites with high ending frequencies are termed preferred ends

CHIU AND LO
On the other hand, the short size of cffDNA could be exploited to favor the detection of fetal DNA over maternal DNA molecules. 52,53 When a chromosomal aneuploidy is detected in plasma of a pregnant woman, occasionally, the finding may be of maternal origin. For example, monosomy X detected by NIPT is not infrequently a consequence of the mother being a mosaic for Turner syndrome. 54 If the reduction in chromosome X dosage is shown to be predominantly derived from the short DNA population in the maternal plasma sample, there is a higher likelihood the finding is of fetal rather than maternal origin. 52 In addition, measurement of the proportion of short-sized DNA in maternal plasma provides a reasonable estimate of the fetal fraction. 52
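The size-based idea can be illustrated by computing the share of short fragments in a sample. This is a minimal sketch: the 150 bp cutoff, the function name and the two synthetic samples are assumptions for illustration, and converting the proportion into an actual fetal fraction would require a calibration that is not shown.

```python
def short_fragment_proportion(fragment_lengths: list[int], cutoff_bp: int = 150) -> float:
    """Share of fragments shorter than cutoff_bp (the cutoff is an assumed value)."""
    if not fragment_lengths:
        raise ValueError("no fragments supplied")
    short = sum(1 for n in fragment_lengths if n < cutoff_bp)
    return short / len(fragment_lengths)


# Synthetic samples: 166 bp stands in for maternal-like fragments,
# 142 bp for fetal-like fragments.
sample_a = [166] * 8 + [142] * 2
sample_b = [166] * 6 + [142] * 4

prop_a = short_fragment_proportion(sample_a)
prop_b = short_fragment_proportion(sample_b)
print(f"sample A short-fragment proportion: {prop_a:.0%}")
print(f"sample B short-fragment proportion: {prop_b:.0%}")
```

The sample containing more fetal-like fragments yields the higher proportion, which is the trend a size-based fetal fraction estimate relies on.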

| Genomewide coverage and distributions
To use cffDNA as a source of genetic material for prenatal assessment, it is important to know if genetic sequences covering the entire fetal genome are present in maternal plasma. In a 2010 study, by sequencing cell-free DNA in a maternal plasma sample to an extent equivalent to covering the human genome up to 65 times and using polymorphic sequence differences to distinguish fetal DNA molecules from those of the mother, it was shown that fetal DNA molecules were distributed along and covered the whole genome. 50 After NIPT for fetal chromosomal aneuploidy screening became a clinical service in many centers, the volume of cffDNA sequence data available for in-depth bioinformatics analysis increased substantially. When a large amount of cffDNA data was pooled, the profile of the distribution of cffDNA molecules across the genome could be studied at much higher resolution. Interestingly, while cell-free DNA molecules were indeed contributed by all parts of the genome, there were microheterogeneities in the amount of DNA detectable from region to region. In particular, certain genomic regions revealed periodic peaks and troughs separated by about 147 bp in the cell-free DNA densities, also termed DNA coverage (Figure 1). 56,57 This characteristic pattern was thought to be reflective of the nucleosomal organization of DNA in cells. The peaks in coverage were considered to be regions where DNA was wound around histone proteins and was relatively protected from fragmentation (Figure 1). The troughs in coverage were therefore DNA regions more exposed to the cell-free DNA fragmentation process (Figure 1).
Analysis revealed that the genomic distributions of the coverage peaks and troughs in maternal plasma differed somewhat between maternal and fetal DNA. 57,58 We hypothesized that this might be attributed to differences in nucleosome packing between the cells or tissues that contributed maternal cell-free DNA and those that contributed fetal DNA. 57,58 In other words, the profile differences were reflective of the differences in chromatin organization between placental cells (the main contributor of cffDNA) and maternal blood cells (the major contributor of maternal cell-free DNA). Remarkably, these subtle differences in genomic region coverage between fetal and maternal cell-free DNA, which reflect nucleosome positioning, were exploited in some algorithms for determining the fetal fraction. 56
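The notion of DNA coverage used above can be made concrete with a short sketch that counts how many aligned fragments overlap each position in a region. It assumes fragments are given as half-open (start, end) genomic intervals; the function name and coordinates are hypothetical, and real analyses would read alignments from sequencing files rather than a hand-written list.

```python
def coverage_profile(fragments: list[tuple[int, int]],
                     region_start: int, region_end: int) -> list[int]:
    """Per-position count of fragments overlapping [region_start, region_end)."""
    length = region_end - region_start
    diff = [0] * (length + 1)  # difference array: +1 where a fragment starts, -1 where it ends
    for start, end in fragments:
        s = max(start, region_start) - region_start
        e = min(end, region_end) - region_start
        if s < e:  # skip fragments that fall outside the region
            diff[s] += 1
            diff[e] -= 1
    coverage = []
    running = 0
    for delta in diff[:-1]:
        running += delta
        coverage.append(running)
    return coverage


# Toy fragments on a 10-position region.
toy_fragments = [(0, 5), (2, 8), (6, 10)]
profile = coverage_profile(toy_fragments, 0, 10)
print(profile)
```

On genome-scale data, peaks in such a profile would correspond to the protected, histone-bound regions and troughs to the more exposed regions described above.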

| Preferred ending sites
Remarkably, there were locations in the genome where circulating DNA fragments ended far more often than could be accounted for by random chance. These locations have been referred to as preferred end locations or sites (Figure 1). 57 Certain genomic positions served as preferred ending sites for both circulating maternal and fetal DNA.
Interestingly, there were also genomic positions preferred by maternal DNA while other sites were preferred by fetal DNA. 57,58 Because the ending site locations tended to be related to chromatin structure, these ending site differences between fetal and maternal DNA might be reflective of the differences in the chromatin accessibility of the cells that contributed fetal and maternal DNA, respectively.
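One way to picture "more often than random chance" is to count fragment ends at each position and compare each count with a Poisson expectation. The sketch below is an assumption-laden illustration, not the procedure of the cited studies: it uses the region-wide mean as the null rate, a hypothetical significance threshold `alpha`, no multiple-testing correction, and synthetic fragments.

```python
import math
from collections import Counter


def poisson_upper_tail(k: int, mean: float) -> float:
    """P(X >= k) for X ~ Poisson(mean)."""
    cdf = sum(math.exp(-mean) * mean ** i / math.factorial(i) for i in range(k))
    return max(0.0, 1.0 - cdf)


def preferred_ends(fragments: list[tuple[int, int]],
                   region_start: int, region_end: int,
                   alpha: float = 0.01) -> list[int]:
    """Positions whose end count is unlikely under a uniform Poisson null."""
    ends = Counter()
    for start, end in fragments:          # half-open intervals
        for pos in (start, end - 1):      # both termini of the fragment
            if region_start <= pos < region_end:
                ends[pos] += 1
    mean = sum(ends.values()) / (region_end - region_start)
    return sorted(pos for pos, k in ends.items()
                  if poisson_upper_tail(k, mean) < alpha)


# Synthetic data: ten fragments share a start at position 100, the rest are scattered.
clustered = [(100, 250 + 7 * i) for i in range(10)]
scattered = [(300 + 13 * i, 470 + 17 * i) for i in range(10)]
hits = preferred_ends(clustered + scattered, 0, 1000)
print(hits)
```

Only the shared start position stands out in this toy example; comparing such site lists derived from fetal-specific and maternal-specific molecules is the kind of contrast the text describes.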

| Jagged ends
Because DNA is double-stranded, when it fragments it could be blunt-

production were investigated using knock-out mouse models and in vitro experiments. 61,64 Plasma of mice with the Dnase1l3 gene deleted showed higher frequencies of DNA fragments of di- and trinucleosomal units in length as well as those that were shorter than 120 bp. 64 Dnase1l3 was likely to be responsible for the internucleosomal fragmentation of DNA. 61,64 It also participated in the further processing of cell-free DNA of mononucleosomal unit in length. The role of DNase1L3 in human plasma cell-free DNA digestion has been shown to be similar to that observed in mice. 61,65 Further studies on the mouse model showed that in the absence of Dnase1l3, Dnase1 took on the predominant role of processing the mononucleosomal DNA. While histone proteins were intact within the mononucleosomal unit of cell-free DNA, Dnase1l3 and Dnase1 mainly cleaved the minor grooves exposed when a piece of double-helical DNA was wound around the core histone proteins. 61 This physical restriction resulted in the characteristic size profile of cell-free DNA, in which molecules shorter than 166 bp decreased in length in steps of about 10 bp, because the minor grooves on the double helix were about 10 bp apart (Figure 1). 50 When the physical structure of nucleosomes was disrupted, for example, by adding heparin to plasma and displacing the histone proteins, the 10-bp periodic pattern of peaks in cell-free DNA fragment sizes disappeared. 61 The size profile instead revealed much more short DNA (<166 bp) and was represented by a smoothed curve without the 10-bp periodic pattern. 61

69 In such surrogate pregnancies, polymorphic variants could be used to distinguish the fetal mitochondrial molecules from those of the gestational carrier. Notably, more mitochondrial DNA from the gestational carrier circulated in circular forms while the fetal ones were mainly in linear forms. 69 A median of 88% of the fetal mitochondrial DNA molecules were in linear form compared with 49% of the molecules from the gestational carrier. 69 Knowing this difference in physical characteristics makes it possible to design assays that preferentially analyze the linear mitochondrial DNA and hence possibly enrich for the fetally derived population to facilitate studies in plasma of pregnancies of biological mothers.

| A kaleidoscope of molecular features
Since the discovery of cffDNA in 1997, 1 intensive research efforts have been devoted to realizing the clinical potential of using it for noninvasive prenatal testing. In those initial years, insights into the biology of cffDNA were mainly gained from observing the physiological changes during pregnancy (e.g. gestational age progression, post-partum changes) or by comparing pregnancies with and without complications (e.g. preeclampsia, confined placental mosaicism). With technological advancements in molecular analysis platforms, such as massively parallel sequencing and bioinformatics, high-throughput analyses at much higher resolution became feasible.
The analytical resolution has reached a level that allows per-molecule, per-nucleotide analysis. Consequently, the pace at which we are uncovering the physical and molecular characteristics of cffDNA has accelerated substantially in recent years. This once enigmatic source of genetic material has now bared itself in front of our eyes, unveiling its genomic distribution, tissue origin, fragmentation processes, length characteristics, ending site features and topological forms. Because these physical features bear a relationship to how the circulating DNA fragments were derived from their cells-of-origin, they provide many options for us to distinguish the fetal from the maternal molecules. We foresee that this knowledge will inspire the development of new laboratory procedures to enrich cffDNA, new bioinformatics algorithms to home in on cffDNA signals, and the assembly of more discriminant cell-free DNA profiles to distinguish different pregnancy-associated conditions. These developments may enhance the analytical and clinical sensitivities of current NIPTs as well as spur the development of newer test applications aiding prenatal management and the monitoring of pregnancy health.