It is important to have a reliable and stable method to assess the quality of RNA samples generated from precious heterogeneous tissues, especially from small anatomical regions such as the substantia nigra and hypothalamus. The most widespread measure for estimating the integrity of RNA samples at present is the RNA Integrity Number (RIN) as calculated by the Agilent 2100 Bioanalyzer for electrophoresis (Agilent Technologies UK Ltd, Edinburgh, UK). The RIN ranges from undetectable to 10, with undetectable indicating completely degraded RNA and 10 the most intact RNA. The calculation of the RIN value is largely based on ribosomal RNA separation, although this measure has been shown to be inconsistent (Imbeaud et al. 2005; Schroeder et al. 2006; Sherwood et al. 2011).
We are building a publicly accessible database of regional human brain expression, the UK Human Brain Expression Consortium, to allow the assessment of the genetic variability in gene expression (expression quantitative trait loci, eQTLs) and splicing (splicing quantitative trait loci, sQTLs) as well as detailed genome-wide expression analysis (Hardy et al. 2009). To that end, we are collecting a large series of control human brain tissues (originating from ∼130 individuals) in which we are dissecting 13 different CNS areas: prefrontal cortex (Brodmann areas 9 and 46), parietal cortex (Brodmann areas 3, 1 and 2), occipital cortex (OCTX; Brodmann area 17), temporal cortex (Brodmann areas 21, 41 and 42), central white matter (WHMT) below Brodmann areas 39 and 40, hippocampus, thalamus, hypothalamus, putamen (PUTM), cerebellum (CRBL), substantia nigra, medulla and spinal cord. From each individual brain, we isolated DNA for whole genome genotyping analysis, and from each region we isolated RNA for whole transcriptome exon array analysis. This resulted in a total of 1266 RNA samples analysed on Affymetrix Exon arrays and represents by far the largest single CNS expression dataset at present. For this quality control study, we focused on analysing the factors that affected the reliability of the RNA samples.
In this study, we assess: (i) the effects of brain bank, age, gender, cause of death, region, post-mortem delay and brain pH on RIN-based RNA quality, and (ii) the effects of RNA quality on the performance of the array experiment, measured by a reliable and widely used parameter, the present call (%P). %P is the percentage of probe sets with signal detection above background noise. We also examine the effects of RNA quality on cDNA preparation and cRNA production as part of the quality control of the array experiment, and finally we confirm the reproducibility of the array data using QuantiGene (QG), a novel, PCR-independent platform (Canales et al. 2006; Arikawa et al. 2008; Hall et al. 2011).
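The definition of %P given above can be illustrated with a minimal sketch. It assumes that each probe set has a detection p-value against background and that a probe set is called present when this p-value falls below a cutoff. The function name `present_call_percentage`, the cutoff `alpha = 0.05` and the example values are hypothetical and are not taken from this study.

```python
def present_call_percentage(detection_p_values, alpha=0.05):
    """Return %P: the percentage of probe sets called 'present'.

    A probe set counts as present when its detection p-value is below
    `alpha`. The cutoff is an assumption made for illustration only.
    """
    if not detection_p_values:
        raise ValueError("at least one probe set is required")
    present = sum(1 for p in detection_p_values if p < alpha)
    return 100.0 * present / len(detection_p_values)


# Synthetic detection p-values for six probe sets on a single array.
example_p_values = [0.001, 0.20, 0.03, 0.70, 0.004, 0.049]
example_percent_present = present_call_percentage(example_p_values)
print(f"%P = {example_percent_present:.1f}")
```

In practice the detection calls would come from the array analysis software rather than from a hand-written threshold; the sketch only shows how the summary percentage is formed.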
To our knowledge, the UK Human Brain Expression Consortium data set is the largest control brain microarray data set generated to date. It is based on the analysis of tissue samples from 13 different CNS regions originating from 137 individuals and containing 2318 processed samples. The main goal of this project is to build a large reference database for eQTL and sQTL analysis.
This study showed considerable variation in RIN values among RNA samples. Sixty-seven per cent of the variation resides in differences among extractions from the same tissue blocks, and most of the remaining variation is unexplained by the available covariate information. We found that pH is the most important post-mortem factor influencing RIN-based RNA integrity, a result consistent with previous studies (Hardy et al. 1985; Mexal et al. 2006; Chevyreva et al. 2008; Monoranu et al. 2009; Durrenberger et al. 2010). Samples with very low pH values (ranging from 5.42 to 5.90) were responsible for the positive correlation seen between pH and %P (and also RIN). When these low pH samples were removed from the analysis, we no longer observed any significant correlation. This may in part explain contradictory observations regarding the effect of pH on RNA integrity and sample performance on arrays (Hardy et al. 1985; Monoranu et al. 2009; Birdsill et al. 2010) and confirms the findings of recent but smaller studies on control brain tissue (Tomita et al. 2004; Sherwood et al. 2011).
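The sensitivity of the pH correlation to the low pH samples can be sketched as follows. This is an illustration under stated assumptions, not the analysis performed in this study: the (pH, %P) pairs are synthetic, the helper `pearson_r` is hypothetical, and a plain Pearson correlation stands in for whatever model was actually fitted. Only the cutoff of 5.90 is taken from the text.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Synthetic (brain pH, %P) pairs; NOT data from this study.
samples = [
    (5.42, 40), (5.60, 45), (5.90, 48),              # very low pH samples
    (6.2, 65), (6.4, 62), (6.6, 66), (6.8, 63), (7.0, 64),
]

LOW_PH_CUTOFF = 5.90  # upper end of the low pH range quoted in the text

# Correlation across all samples, then with the low pH samples removed.
r_all = pearson_r([ph for ph, _ in samples], [p for _, p in samples])
kept = [(ph, p) for ph, p in samples if ph > LOW_PH_CUTOFF]
r_filtered = pearson_r([ph for ph, _ in kept], [p for _, p in kept])

print(f"r (all samples)     = {r_all:.2f}")
print(f"r (pH > {LOW_PH_CUTOFF:.2f} only) = {r_filtered:.2f}")
```

With data of this shape, a handful of low pH samples is enough to produce a strong overall correlation that disappears once they are excluded, which is the pattern described above.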
We found our array-based expression data to be reliably validated by the QuantiGene PCR-independent method when tested on two high expression genes (MAPT and SCN8A) and one low expression gene (LRRK2). The performance quality of the array, as defined by %P, was not profoundly affected by age, gender, region, PMI, RIN or cause of death. This confirms findings from previous studies on much smaller sample sizes (Tomita et al. 2004; Birdsill et al. 2010; Durrenberger et al. 2010). We found that only 2.7% of the variation in %P was explained by RIN. Indeed, 80 RNA samples with undetectable RINs performed well on the arrays, with %P values ranging from 45 to 76%. Thus, we found RIN to be a poor predictor of array performance even at the low end of the RIN scale. This was further supported by the observation that the lengths of the synthesised cDNA and cRNA were not affected by the wide range of RIN values (from 2 to 7) in our array experiments. The robust performance of the Affymetrix Exon arrays in the face of degraded RNA may be due to recent changes to the RNA amplification process. In keeping with the manufacturer’s instructions, amplification was performed using the Ambion® WT Expression kit, which uses both non-polyA and polyA-based mRNA priming for first strand cDNA synthesis. This meant that RNA amplification did not require an intact polyA tail. In addition, increasing the quantity of starting RNA from 500 to 750 ng improved the array performance.
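A statement such as "only 2.7% of the variation in %P was explained by RIN" corresponds to a coefficient of determination (R²) of about 0.027. The sketch below shows one common way to obtain such a figure, assuming a simple least-squares line of %P on RIN; the model actually used in this study may differ, and the function name `r_squared` and the example values are hypothetical.

```python
def r_squared(xs, ys):
    """Proportion of variance in ys explained by a least-squares line on xs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or ss_tot == 0:
        raise ValueError("R^2 is undefined for constant input")
    # Ordinary least-squares slope and intercept.
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return 1.0 - ss_res / ss_tot


# Synthetic (RIN, %P) values; NOT data from this study.
rin_values = [2.0, 3.5, 4.0, 5.5, 6.0, 7.0]
percent_present = [61.0, 58.0, 66.0, 60.0, 63.0, 64.0]

explained = r_squared(rin_values, percent_present)
print(f"Variation in %P explained by RIN: {100 * explained:.1f}%")
```

An R² close to zero means that knowing the RIN of a sample barely narrows the expected range of %P, which is why RIN is described above as a poor predictor of array performance.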
The analysis of this observational study has several limitations. For example, we had expected that cause of death would greatly influence both RIN-based RNA quality and %P-based array quality, but cause of death explained only 1.9% of the variation in RIN and we did not find any significant relationship with %P. It may be that cause of death is an imperfect reflection of the true medical and drug treatment history of the individual, and that access to that history, were it available, would reveal other factors of greater relevance. Likewise, PMI in the range of 28–114 h did not affect either RIN or %P, nor could we see a loss of RIN-based RNA quality over the 1–5 h range. It remains possible that there is selective loss of RNA within the first hour, because the half-life of some mRNA species has been reported to be as short as 15 min, while others may be as long as 22 days depending on the tissue type and storage conditions (Ross 1995; Barrachina et al. 2006; Bahar et al. 2007; Beach et al. 2008; Vennemann and Koppelkamm 2010). This issue has been studied in detail for a range of PMIs by Harrison et al. (1995) and confirmed by a more recent study by Tomita et al. (2004). Both studies concluded that PMI had a limited effect on mRNA (Harrison et al. 1995; Tomita et al. 2004).
Furthermore, it is possible that samples with the shortest PMIs (1–2.5 h) within the SHRI-USA sample set originate from individuals who suffered longer agonal states prior to death, and agonal stress has been shown to affect gene expression differently in different brain regions (Li et al. 2007). Other factors may also contribute, such as intermittent edge effects during dissection and tissue handling, especially on small samples.
Finally, we note that our brain samples were derived from only two sources: one is a rapid death brain bank with long post-mortem intervals and the other specialises in obtaining very short post-mortem intervals. Both brain banks had separately optimised their protocols to facilitate gene expression studies, and this may limit the generalisability of these conclusions to tissue collected in other ways.
These results are important for several reasons. Firstly, they confirm the practical feasibility of using post-mortem control brain tissue to study the transcriptome of the human brain by array technology. Secondly, they show that microarrays can give reliable results over a wide range of RIN values (1–8.5) and pH measurements, with a drop-off in array validity observed only below brain pH 5.9. Thirdly, they show that the results from Affymetrix exon arrays are reproducible by other technologies, allowing database users to use the data with confidence.
Furthermore, this study is the first step of an ongoing multi-regional human brain expression project that has been established to build an open-access database of genome-wide genetic variability in relation to gene expression (eQTLs) and splicing (sQTLs), as well as for detailed expression analysis (Hardy et al. 2009). We hope this will advance our understanding of the molecular mechanisms underlying complex neurological and psychiatric diseases, and will provide the neuroscience community with a resource that brings functional insights.