SMN1 and SMN2 copy numbers in cell lines derived from patients with spinal muscular atrophy as measured by array digital PCR

Proximal spinal muscular atrophy (SMA) is an early-onset motor neuron disease characterized by loss of α-motor neurons and associated muscle atrophy. SMA is caused by deletion or other disabling mutation of survival motor neuron 1 (SMN1). In the human genome, a large duplication of the SMN-containing region gives rise to a second copy of this gene (SMN2) that is distinguishable by a single nucleotide change in exon 7. Within the SMA population, there is substantial variation in SMN2 copy number; in general, those individuals with SMA who have a high SMN2 copy number have a milder disease. Because SMN2 functions as a disease modifier, its accurate copy number determination may have clinical relevance. In this study, we describe the development of an assay to assess SMN1 and SMN2 copy numbers in DNA samples using an array-based digital PCR (dPCR) system. This dPCR assay can accurately and reliably measure the number of SMN1 and SMN2 copies in DNA samples. In a cohort of SMA patient-derived cell lines, the assay confirmed a strong inverse correlation between SMN2 copy number and disease severity. Array dPCR is a practical technique to determine, accurately and reliably, SMN1 and SMN2 copy numbers from SMA samples.


Introduction
Spinal muscular atrophy (SMA; OMIM #253300) is an early-onset neurodegenerative disease characterized by the loss of α-motor neurons (α-MNs) in the anterior horn of the spinal cord (Crawford and Pardo, 1996). This loss of α-MNs is associated with muscle weakness and atrophy. SMA is an autosomal recessive disease and is a leading genetic cause of infant death worldwide, with an incidence of 1 in 6000-10,000 births (Pearn, 1978; Cuscó et al., 2002). The carrier frequency for SMA is 1:25-50 in most populations (Ben-Shachar et al., 2011; Su et al., 2011; Sugarman et al., 2012; Lyahyai et al., 2012), though it is lower for some ethnicities (Zaldívar et al., 2005; Labrum et al., 2007; Hendrickson et al., 2009; Sangaré et al., 2014). SMA results from the loss or mutation of SMN1 (survival motor neuron 1; OMIM #600354) on chromosome 5q13 (Lefebvre et al., 1995). In humans, a large tandem chromosomal duplication has led to a second copy of the gene, SMN2 (OMIM #601627). SMN2 can be distinguished from SMN1 by a single-nucleotide difference (c.840C>T) near the start of exon 7 that disrupts an exonic splice enhancer (Monani et al., 1999). As a result, most SMN2 mRNAs (about 80-90%) lack exon 7 (SMNΔ7) and produce a protein that is both unstable and less than fully functional (Lorson and Androphy, 2000; Burnett et al., 2009). Because just 10-20% of the SMN2 gene product is full length and functional, each additional copy of SMN2 partially complements the loss of SMN1 and diminishes the severity of the phenotype (Coovert et al., 1997; Lefebvre et al., 1997; McAndrew et al., 1997; Prior et al., 2005; Swoboda et al., 2005; Wirth et al., 2006; Tiziano et al., 2007; Elsheikh et al., 2009). The capacity of SMN2 copy number to modulate phenotype has been extended to transgenic mouse models (Monani et al., 2000; Hsieh-Li et al., 2000; Michaud et al., 2010).
Because SMN2 copy number influences disease severity in SMA, there is prognostic value in accurate measurement of SMN2 copy number in patients being evaluated for SMA. Molecular diagnosis of SMA (that is, loss of SMN1) has historically been made using a polymerase chain reaction (PCR)-based assay followed by digestion of the PCR product with specific restriction endonucleases (Lefebvre et al., 1995; van der Steege et al., 1995). Numerous assays have been developed to quantify SMN2 copy number in DNA samples from SMA patients. These assays include radioactive PCR (Coovert et al., 1997), quantitative, or real-time, PCR (qPCR; Feldkötter et al., 2002; Anhuf et al., 2003; Gómez-Curet et al., 2007), competitive PCR/primer extension (Gérard et al., 2004), denaturing high-performance liquid chromatography (Su et al., 2005), multiplex ligation-dependent probe amplification (Huang et al., 2007), quantitative capillary electrophoresis fragment analysis (QCEFA; Kirwin et al., 2013) and short-amplicon melt profiling (Dobrowolski et al., 2012). An important limitation of these established PCR-based copy number assays is the requirement for a parallel-run calibration curve to set the breakpoints that assign each sample an ordinal SMN2 value.
Digital PCR (dPCR) offers a means of measuring the abundance of a target molecule quantitatively without the need for a calibration curve. In dPCR, the template DNA is distributed across a large number of partitions by limiting dilution (Sykes et al., 1992; Vogelstein and Kinzler, 1999). As a result, some of the partitions will lack the template DNA and, as such, will not amplify the target molecule during PCR. By counting the number of partitions containing the amplified target PCR product (positive partitions) and the number of negative partitions, the absolute abundance of the target molecule in a sample can be measured. There are currently two platforms for dPCR: microfluidics and microdroplet emulsion (Day et al., 2013). Zhong et al. (2011) showed in a pilot study that droplet dPCR can be used to measure SMN2 copy number in a small number (n = 4) of SMA samples. We demonstrate here the feasibility of using an array dPCR system containing 20,000 partitions to determine the number of copies of SMN1 and SMN2 in DNA samples and show that there is a strong correlation between SMN2 copy number and SMA disease severity.
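The counting principle described above can be sketched in a few lines. Because a positive partition may hold more than one template molecule, dPCR analyses conventionally apply a Poisson correction: if a fraction p of the partitions is positive, the mean number of copies per partition is λ = -ln(1 - p). The function name and the example counts below are hypothetical and chosen only for illustration; they are not taken from this study or from the instrument software.

```python
import math


def copies_per_partition(positive: int, total: int) -> float:
    """Poisson-corrected mean number of target copies per partition.

    positive: number of partitions showing amplification
    total:    number of partitions analyzed
    """
    if total <= 0 or not 0 <= positive <= total:
        raise ValueError("need total > 0 and 0 <= positive <= total")
    if positive == total:
        # No negative partitions: the Poisson estimate is undefined.
        raise ValueError("all partitions positive; sample is too concentrated")
    negative_fraction = (total - positive) / total
    return -math.log(negative_fraction)


if __name__ == "__main__":
    # Hypothetical chip: 20,000 partitions, 6,000 of them positive.
    lam = copies_per_partition(6000, 20000)
    print(f"mean copies per partition: {lam:.4f}")
    print(f"estimated copies on the chip: {lam * 20000:.0f}")
```

The estimate depends only on the counts of positive and negative partitions, which is why no calibration curve is required.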

Ethics statement
Fibroblast lines generated at the Nemours/Alfred I. duPont Hospital for Children (N/AIDHC) were established from samples obtained following written informed consent or assent, under a protocol approved by the N/AIDHC Institutional Review Board. The cell lines were de-identified so that no protected health information related to these cell lines is known.

Cell lines
Fibroblast and lymphoblastoid cell lines (LCLs) were either established at N/AIDHC using standard procedures (Villegas and McPhaul, 2005) or obtained from Johns Hopkins University (T. O. C.; Baltimore, MD), the University of Utah (K. J. S.; Salt Lake City, UT) or a cell line repository. Some of the fibroblast lines established at N/AIDHC were obtained from the Molecular Diagnostics Laboratory, while others were established in the Motor Neuron Diseases Research Laboratory from skin samples obtained from the MDA Neuromuscular Clinic or the Nemours Biobank. The following cell lines were obtained from Coriell Cell Repositories (Camden, NJ): GM00232, GM00409, GM00489, GM03813, GM03814, GM03815, GM09677, GM10684, GM22592, GM23255, GM23603, GM23686, GM23687, GM23688 and GM23689. Fibroblast lines UMB-1897, UMB-4648 and UMB-4994 were obtained from the NICHD Brain and Tissue Bank for Developmental Disorders at the University of Maryland (Baltimore, MD). The number of cell lines obtained from non-SMA and SMA patients with varying degrees of disease severity is shown in Table 1.

Genomic DNA isolation
Genomic DNA (gDNA) was isolated from fibroblast and LCL cell pellets using the Gentra Puregene Cell Kit (QIAGEN, Germantown, MD). The pellets were disrupted in 500-1500 µL cell lysis solution, depending on the pellet size. The cells were lysed overnight at room temperature in a 15-mL conical tube and then divided into 500-µL aliquots. RNase A Solution (2.5 µL) was added to each aliquot, which was then incubated for 5-60 min in a 37°C water bath. After the digested aliquots were chilled on ice, 165 µL Protein Precipitation Solution was added to each aliquot, which was vortexed vigorously for 20 sec and then centrifuged for 1 min at 16,000 × g. The supernatant was then transferred to a clean microcentrifuge tube containing 500 µL 2-propanol. The samples were inverted 50 times to facilitate precipitation of DNA and then centrifuged for 1 min at 16,000 × g. After the supernatant was discarded, the DNA pellet was washed with 500 µL 70% ethanol and centrifuged for 1 min at 16,000 × g. The DNA pellets were allowed to air dry for 10 min. After drying, they were resuspended in 25 µL DNA Hydration Solution and incubated for 60 min at 65°C and then overnight at room temperature on an orbital shaker. The concentration of the purified gDNA was determined with an ND-2000C NanoDrop spectrophotometer (Thermo Fisher Scientific, Waltham, MA). The integrity of the gDNA was verified by agarose gel electrophoresis.

SMN sequencing
To determine whether each sample contained SMN1 and/or SMN2, exon 7 of SMN was amplified by PCR using SMN-SEQ7F and SMN-SEQ7R as primers (Vezain et al., 2010). The PCR conditions were as follows: 5 min at 96°C, followed by 40 cycles of 30 sec at 96°C, 30 sec at 56°C and 1 min at 72°C, followed by a final extension step of 7 min at 72°C. The PCR product was purified using the Wizard SV Gel and PCR Cleanup System (Promega, Madison, WI) according to the manufacturer's directions. The purified PCR product was sequenced on the ABI 3130xl Genetic Analyzer (Life Technologies) automated sequencer using the BigDye Terminator v3.1 Cycle Sequencing kit (Life Technologies).

SMN1 and SMN2 copy number assays
The copy numbers of SMN1 and SMN2 were measured in each gDNA sample using the QuantStudio™ 3D Digital PCR System (Life Technologies). Figure 1 shows the workflow of the SMN1/SMN2 copy number assays. The concentration of double-stranded DNA (dsDNA) in each sample was measured using the Qubit™ dsDNA Broad Range Assay kit (Life Technologies). For each sample, 400 ng dsDNA was digested with 20 U EcoRI (New England Biolabs, Inc., Ipswich, MA) for 60 min at 37°C. After thermal denaturation (20 min at 65°C), the digested DNA was diluted fourfold with nuclease-free ddH2O. RPPH1 and either SMN1 or SMN2 signals were amplified in each PCR reaction. Each PCR reaction contained 30-60 ng EcoRI-digested gDNA, RPPH1 primer/probe mix (1×), SMN1/SMN2 primer/probe mix (900 nmol/L SMN1-Ex7-116F, 900 nmol/L SMN1-Ex7-261R, and 200 nmol/L of either SMN1-Ex7-206T or SMN2-Ex7-anti) and QuantStudio™ 3D Digital PCR Master Mix. Each PCR reaction was then loaded into a QuantStudio™ 3D Digital PCR 20K chip according to the manufacturer's directions. The chips were then loaded onto the Dual Flat Block GeneAmp™ PCR System 9700, and PCR was performed using the following conditions: 10 min at 96°C, followed by 39 cycles of 2 min at 60°C and 30 sec at 98°C, followed by a final step of 2 min at 60°C. 6FAM and VIC images of each chip were then taken with the QuantStudio™ 3D Instrument, which provided the copies/µL of SMN1 or SMN2 (6FAM) and of RPPH1 (VIC). The raw data were subsequently analyzed using the QuantStudio™ 3D AnalysisSuite Cloud Software. The copy number of SMN1 or SMN2 was calculated with the following equation: SMN1 (SMN2) copy number = raw SMN1 (SMN2) number / (raw RPPH1 number / 2).
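The normalization in this equation can be illustrated with a short sketch. It takes the 6FAM (SMN1 or SMN2) and VIC (RPPH1) concentrations in copies/µL and divides the target by half of the RPPH1 value, because RPPH1 is present at two copies per diploid genome. The function names and the example readings are hypothetical, and rounding to the nearest integer is an assumption made here for illustration; the text does not state how ordinal copy numbers were assigned.

```python
def smn_copy_number(smn_copies_per_ul: float, rpph1_copies_per_ul: float) -> float:
    """SMN1 or SMN2 copies per diploid genome, normalized to RPPH1.

    RPPH1 is treated as a two-copy reference, so the reference signal is
    halved before the target signal is divided by it.
    """
    if rpph1_copies_per_ul <= 0:
        raise ValueError("RPPH1 signal must be positive to normalize")
    if smn_copies_per_ul < 0:
        raise ValueError("SMN signal cannot be negative")
    return smn_copies_per_ul / (rpph1_copies_per_ul / 2)


def nearest_integer_copy_number(value: float) -> int:
    """Assign an ordinal copy number by rounding half up (an assumption)."""
    return int(value + 0.5)


if __name__ == "__main__":
    # Hypothetical chip readings in copies/µL (not data from this study).
    smn2 = 310.0   # 6FAM channel
    rpph1 = 205.0  # VIC channel
    raw = smn_copy_number(smn2, rpph1)
    print(f"raw copy number: {raw:.2f}")
    print(f"assigned copy number: {nearest_integer_copy_number(raw)}")
```

Because both signals are read from the same chip, differences in DNA input between reactions cancel in the ratio.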

Statistical analysis
The coefficient of variation (CV) was used to measure the reliability of the QuantStudio™ 3D dPCR assay (Gómez-Curet et al., 2007). The CV is defined as the standard deviation (SD) of the copy number divided by the mean copy number for all subjects with the same copy number. Spearman's correlation analysis (r) was used to determine the relationship between SMN2 copy number and disease severity in the SMA samples. All statistical analyses were performed with SPSS v.22.0 (IBM, Armonk, NY).
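As a minimal sketch of the CV calculation defined above, the following groups raw copy number measurements by their assigned integer copy number and reports SD/mean as a percentage for each group. The use of the sample SD (n - 1 denominator) is an assumption, since the text does not say which form SPSS was asked to report, and the measurements in the example are invented for illustration.

```python
from collections import defaultdict
from statistics import mean, stdev


def cv_by_copy_number(measurements):
    """Return {assigned copy number: CV in percent}.

    measurements: iterable of (assigned_copy_number, raw_measured_value).
    Groups with fewer than two values, or with a mean of zero (for example,
    samples with 0 copies), are skipped because the CV is undefined there.
    """
    groups = defaultdict(list)
    for assigned, measured in measurements:
        groups[assigned].append(measured)

    result = {}
    for assigned, values in sorted(groups.items()):
        if len(values) < 2:
            continue
        average = mean(values)
        if average == 0:
            continue
        result[assigned] = 100 * stdev(values) / average  # sample SD (assumption)
    return result


if __name__ == "__main__":
    # Hypothetical raw values, not data from this study.
    data = [(2, 1.9), (2, 2.0), (2, 2.1), (3, 2.9), (3, 3.1), (0, 0.0), (0, 0.0)]
    for copies, cv in cv_by_copy_number(data).items():
        print(f"{copies} copies: CV = {cv:.1f}%")
```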

Results
We measured the number of SMN1 and SMN2 copies in gDNAs isolated from cell lines derived from SMA patients as well as from healthy non-SMA subjects using array dPCR (Fig. 1). SMN1 or SMN2 dPCR was multiplexed with RPPH1 because the copy number of RPPH1 does not vary amongst the human population (Baer et al., 1990). The gDNA templates were digested with EcoRI prior to PCR amplification as there are no EcoRI restriction sites within the SMN1, SMN2 or RPPH1 regions of amplification. The array dPCR assay accurately and reproducibly detected from 0 to 3 copies of SMN1 and 0 to 5 copies of SMN2 in the analyzed samples. The assay conditions were first tested on SMN1 and SMN2 reference standards (n = 7) obtained from the Clinical Laboratory Improvement Amendments (CLIA)-certified Molecular Diagnostics Laboratory at N/AIDHC. These standards were generated from genomic DNA extracted from blood specimens, with copy numbers assessed by QCEFA (Kirwin et al., 2013). One of these reference samples (SDC1) could not be accurately assessed for SMN2 copy number using QCEFA. Blinded array dPCR determination of SMN1 and SMN2 copy numbers in the reference samples matched that obtained by QCEFA (Table 2). Of interest, sample SDC1 measurements fell well within the detection capability of array dPCR, and this sample carried 4 copies of SMN2.
This validation phase also included a comparison between array dPCR and real-time TaqMan™ qPCR (qPCR). The SMN2 copy numbers for a subset of our SMA samples (n = 30) were determined previously (Gómez-Curet et al., 2007). We saw good concordance between the 2 techniques for samples carrying low SMN2 copy number (i.e., ≤2 copies); however, the concordance dropped to 80% (12/15) for samples carrying higher SMN2 copy numbers (Fig. 2). This result highlights the inability of TaqMan™ qPCR to accurately measure copy numbers greater than 3 (Gómez-Curet et al., 2007; Prior et al., 2011). To test the upper limit of detection for this new assay, we used gDNA from a set of 4 SMA (GM00232, GM03813, GM09677, and GM10684) and 1 carrier (GM03814) cell lines from Coriell Cell Repositories that were shown to contain 2, 3, 3, 2, and 5 SMN2 copies, respectively, using droplet dPCR (Zhong et al., 2011). Our copy number measurements were in complete concordance with the published droplet dPCR results, most notably for GM03814, which carried a high copy number of SMN2.
To determine the reliability of the array dPCR copy number assays, the mean CV was calculated for each SMN1 and SMN2 copy number measurement for both the SMA (Table 3) and non-SMA (Table 4) samples derived from cell lines (Gómez-Curet et al., 2007). In both SMA and non-SMA samples, the CVs for each SMN1 and SMN2 copy number were below 4%, demonstrating that our array dPCR copy number assays were reliable. Using these assay conditions, we determined the copy numbers of SMN1 and SMN2 for all of the cell lines within our collection. Our collection contained both fibroblasts and Epstein-Barr virus (EBV)-immortalized LCLs. One hundred cell lines, 63 of which were derived from SMA patients, were used in this study (Table 1). All but one of the 63 SMA DNA samples had a loss of both SMN1 alleles as determined by Sanger sequencing. The remaining cell line harbored one deletion allele and one missense mutation in SMN1 (c.5C>G; SMN1 p.A2G). Most of the SMA samples contained 2 or 3 copies of SMN2 (Fig. 3A). For those SMA patients harboring deletions of SMN1 and whose disease severities were known (n = 59), patients with higher SMN2 copy numbers (i.e., 3 or 4) had milder disease severities than those patients with fewer copies of SMN2 (i.e., 2; Fig. 3B). In fact, there was a strong, negative correlation between SMN2 copy number and SMA disease severity within this group (r = −0.830, P < 0.001).
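The rank correlation reported here was computed in SPSS; the sketch below only illustrates how Spearman's coefficient is obtained, using the Python standard library. Both variables are converted to ranks (tied values share their average rank) and the Pearson correlation of the ranks is returned. The ordinal severity coding (a higher score means more severe disease) and the example values are assumptions for illustration, not the coding or the data used in this study.

```python
from statistics import mean


def average_ranks(values):
    """1-based ranks; tied values receive the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / (var_x * var_y) ** 0.5


if __name__ == "__main__":
    # Hypothetical values: SMN2 copies and a severity score (3 = most severe).
    smn2_copies = [2, 2, 2, 3, 3, 3, 4, 4]
    severity = [3, 3, 2, 2, 2, 1, 1, 1]
    print(f"Spearman rho = {spearman_rho(smn2_copies, severity):.3f}")
```

With this coding, a negative coefficient means that higher SMN2 copy numbers accompany milder disease.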

Discussion
We established new array dPCR SMN1 and SMN2 copy number assays that accurately measured copy numbers in SMA as well as in non-SMA DNA samples isolated from whole blood cells and cell lines derived from fibroblasts and lymphoblasts. The dPCR-derived SMN1 and SMN2 copy numbers matched those found in reference standards used by a diagnostic laboratory and, in a limited number of cases, those obtained using microdroplet dPCR (Zhong et al., 2011). SMN2 copy numbers in SMA DNA samples were concordant with those measured by TaqMan™ qPCR (Gómez-Curet et al., 2007) at low SMN2 copy numbers, but the concordance was not as strong at higher (i.e., >3) SMN2 copy numbers. The majority of dPCR/TaqMan™ qPCR mismatches occurred at higher SMN2 copy numbers, where the TaqMan™ qPCR assay cannot easily distinguish unit differences (Gómez-Curet et al., 2007; Prior et al., 2011). Array dPCR detected unit differences in SMN2 copy number over a wide range of SMN2 copy numbers, similar to droplet dPCR (Zhong et al., 2011). Because of this wide range of detection, dPCR can be very useful in accurately quantifying SMN2 copy number in patients with milder forms of SMA, that is, type III SMA, who generally have higher SMN2 copy numbers.
The reliability of the array dPCR assays was determined by comparing the coefficients of variation (CV) for all samples with the same copy number. Our array dPCR results had a 1.6-3.7% CV for SMN1 and a 2.1-3.7% CV for SMN2. In contrast, the TaqMan™ qPCR assay showed a 5.2-9.7% CV for SMN1 and a 0.8-7.6% CV for SMN2 (Gómez-Curet et al., 2007). The greater reliability of the array dPCR assays when compared against the TaqMan™ qPCR assays is a result of the random distribution of template DNA molecules within the 20,000 partitions in the microfluidic dPCR array (Whale et al., 2014).
Using array dPCR, we have confirmed a very strong inverse correlation between SMN2 copy number and disease severity in our SMA patient samples. Numerous previous studies also document a similar relationship between SMN2 copy number and SMA disease severity (Coovert et al., 1997; Lefebvre et al., 1997; McAndrew et al., 1997; Prior et al., 2005; Swoboda et al., 2005; Wirth et al., 2006; Tiziano et al., 2007; Elsheikh et al., 2009; Crawford et al., 2012). SMN2 copy number is associated with many measures of SMA phenotype severity, including gross motor function, forced vital capacity, muscle mass, and denervation (Rudnik-Schöneborn et al., 2009; Kaufmann et al., 2011; Crawford et al., 2012; Kaufmann et al., 2012). Many current and future clinical trials for SMA will use these outcome measures along with changes in SMN expression to gauge efficacy (Nurputra et al., 2013). Because SMN2 copy number is a defining criterion of eligibility for many SMA clinical trials, accurate and reliable measurements will continue to be essential to clinical research.
In some cases within our pool of SMA samples, SMA patients with low SMN2 copy numbers exhibited a milder phenotype. A rare variant in SMN2, SMN2 c.859G>C, may explain this finding, as it results in a partial rescue of the truncated, exon 7-excluded transcript that characterizes most of the mRNA generated from SMN2 (Vezain et al., 2010). Array dPCR will aid in the identification of cases that deviate from the expected genotype-phenotype relationship. Identifying such mismatches could lead to the identification of potential complementing mutations in SMN2 like SMN2 c.859G>C.
Array dPCR can be easily used to measure SMN1 and SMN2 copy numbers accurately in DNA samples obtained from SMA patients and healthy, non-SMA controls. Array dPCR can accurately determine copy number within a wider range of SMN2 copies (0 to at least 5 copies) than either qPCR (0-3 copies) or QCEFA (0-4 copies). Unlike qPCR, dPCR does not require a calibration curve to assign a numeric measure of copy number (Day et al., 2013). Because the template DNA molecules are randomly distributed amongst the 20,000 partitions in dPCR, copy number measurements are more precise and reliable when compared against qPCR (Whale et al., 2014). For these reasons, array dPCR has advantages in comparison to conventional diagnostic measurements of SMN1 and SMN2 copy number in SMA patient DNA samples. Future work using a larger cohort of gDNA extracted from blood samples will determine the applicability of array dPCR for SMA diagnostics and as a prognostic tool.

Acknowledgments
We would like to dedicate this publication to the memory of Dr. Wenlan Wang, who passed away on 26 May 2011. N. J. C. and K. W. L. received fellowships from the University of Delaware Undergraduate Research Program. We thank Susan Kirwin and Priscilla Moses of the Nemours Molecular Diagnostics Laboratory for providing some of the fibroblast lines from their archive as well as their SMN1/SMN2 internal reference standards; Dr. Diana Corao-Uribe, Janell Raber, and Christine Brandt-Montone of the Nemours Biobank for obtaining some of the skin biopsies used to establish some of the fibroblast lines; and Dr. Ron Zielke and John Cottrell of the NICHD Brain and Tissue Bank for Developmental Disorders at the University of Maryland, Baltimore, MD for providing some of the fibroblast lines used in this study. Other fibroblast lines used in this study were obtained from the Intellectual and Developmental Disabilities Research Center at Kennedy Krieger Institute, Baltimore, MD, which is supported by the NICHD grant 1U54HD079123-01A1.