Microarray Identification of RAR Target Genes
Microarray analysis is a very powerful approach to identify target genes (Schena et al., 1995) and already has been used in the Xenopus system (Altmann et al., 2001). We decided to use microarray analysis to identify genes up- or down-regulated in response to modulating RAR signaling. The chip we used for microarray analysis contained 21,120 normalized, sequenced neurula (stage 13–15) cDNAs (Shin et al., 2005). These 21,120 cDNAs represent approximately 8,700 distinct transcripts. Fluorescently labeled probes were generated from poly A+ RNA prepared from TTNPB- or AGN193109-treated embryos, hybridized to the microarrays, washed, and scanned.
Despite the power of the approach, several technical difficulties must be addressed for the results of microarray analysis to be reliable. The first problem is that, because retinoids are present in normal embryos, there will be some background of induced mRNAs in the control embryos, reducing the sensitivity and dynamic range of the assay. To overcome this problem, we compared agonist-treated with antagonist-treated embryos directly and, during the validation step, determined how expression of the identified genes varied relative to control embryos. This approach improves the dynamic range of the assay compared with agonist/control and antagonist/control probe pairs. Previous experiments suggested that gastrula stage (stage 12) embryos did not show much difference in the expression of genes already known to be direct RAR targets (e.g., Hoxb-1, Hoxd-1) in response to TTNPB treatment, whereas these differences are more pronounced in neurula (stage 18) embryos. Therefore, we used RNA from stage 18 embryos for microarray analysis.
Another important issue is the difference in sensitivity, labeling efficiency, and signal/noise ratio between the red dye (cy5) and the green dye (cy3) commonly used for microarray analysis. We overcame this problem by changing from direct incorporation of fluorescent nucleotides to incorporation of aminoallyl dUTP during reverse transcription followed by chemical coupling of the dye molecules following the advice of the Brivanlou laboratory (Altmann et al., 2001). The single labeling followed by chemical dye coupling ensures that the labeling efficiency in each reaction is virtually identical. Further improvement was obtained by substituting Alexa Fluor 546 for Cy3 and Alexa Fluor 647 for Cy5. This change produced very intensely labeled probes that exhibited substantial increases in the red signal and dropped background to levels comparable to the green channel making normalization of channel intensity unnecessary.
Our initial experiments used embryos from several frogs, and the resulting mRNA was pooled. The probes were prepared, labeled, hybridized to the arrays in triplicate, washed, scanned, and analyzed. The fold induction was calculated as an average ratio of median feature pixel intensity at wavelength 635 nm to median feature pixel intensity at wavelength 532 nm. Northern blotting was used to establish the validity of the microarray analysis. To our surprise, 40% of the top 35 up-regulated genes (12 of 29) by microarray were not changed when checked by Northern blot (data not shown). Such a false-positive rate is incompatible with identifying the network of target genes regulated by RARs; therefore, we modified the approach to reduce false positives.
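The fold-induction calculation described above can be sketched as follows. This is a minimal illustration, assuming one (635 nm, 532 nm) pair of median feature intensities per replicate array; the function name and the example intensities are hypothetical and are not taken from the original analysis.

```python
from statistics import mean


def fold_induction(replicates):
    """Average the per-array ratio of median 635 nm to median 532 nm intensity.

    `replicates` is a list of (median_635, median_532) tuples, one per array.
    Spots with a non-positive 532 nm signal are skipped (an assumption).
    """
    ratios = [red / green for red, green in replicates if green > 0]
    if not ratios:
        raise ValueError("no usable replicate spots")
    return mean(ratios)


# Hypothetical intensities for one feature on three replicate arrays.
example = [(1800.0, 900.0), (2100.0, 1000.0), (1500.0, 800.0)]
fold = fold_induction(example)
print(f"fold induction: {fold:.2f}")
```

Averaging per-array ratios, rather than taking the ratio of averaged intensities, keeps each array's two channels paired.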
It has been suggested that biological variation is the largest source of error in microarray experiments (Baldi and Hatfield, 2002). We reasoned that embryos from an individual mating would exhibit stochastic fluctuations in gene expression that would be different from those in matings between different males and females. Therefore, embryos were obtained from matings between three different pairs of males and females and subsequently kept strictly separate and treated independently. Probes were prepared from the RNA populations isolated from treated groups and used as matched pairs derived from the same frog in competitive hybridization to the microarray. Each experiment was replicated, yielding a total of six data sets. Bad hybridization spots were filtered from each data set before statistical analysis. Thus, the total number of replicates varies by clone depending on hybridization quality, even though all experiments were performed six times. By using this approach and calculating fold-induction values for each agonist/antagonist pair, none of the true positives from the first experiment was lost, whereas all of the false positives disappeared.
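The per-clone bookkeeping implied by this design (six data sets, bad spots removed, a variable number of usable replicates per clone) might look like the sketch below. The data layout, the `flagged` field, and the function name are assumptions made for illustration; the minimum of three replicates follows the cutoff applied during Cyber-T analysis.

```python
def usable_ratios(datasets, clone_id, min_replicates=3):
    """Collect a clone's ratios from data sets where its spot passed filtering.

    Each data set is a dict: clone_id -> {"ratio": float, "flagged": bool}.
    Returns None when fewer than `min_replicates` usable spots remain.
    """
    ratios = [
        data[clone_id]["ratio"]
        for data in datasets
        if clone_id in data and not data[clone_id]["flagged"]
    ]
    return ratios if len(ratios) >= min_replicates else None


# Six hypothetical data sets (three matings, each experiment replicated).
datasets = [
    {"cloneA": {"ratio": 2.1, "flagged": False}, "cloneB": {"ratio": 1.2, "flagged": True}},
    {"cloneA": {"ratio": 1.9, "flagged": False}, "cloneB": {"ratio": 1.1, "flagged": False}},
    {"cloneA": {"ratio": 3.0, "flagged": True}, "cloneB": {"ratio": 0.9, "flagged": True}},
    {"cloneA": {"ratio": 2.4, "flagged": False}, "cloneB": {"ratio": 1.0, "flagged": True}},
    {"cloneA": {"ratio": 2.0, "flagged": False}, "cloneB": {"ratio": 1.3, "flagged": False}},
    {"cloneA": {"ratio": 2.2, "flagged": True}, "cloneB": {"ratio": 1.0, "flagged": True}},
]

print(usable_ratios(datasets, "cloneA"))  # four usable replicates
print(usable_ratios(datasets, "cloneB"))  # too few replicates, dropped
```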
To further improve the statistical validity of our results, we used Cyber-T software, which was developed by Pierre Baldi, Tony Long, and colleagues at UCI (http://visitor.ics.uci.edu/genex/cybert/; Baldi and Long, 2001; Long et al., 2001). Cyber-T analysis allows users to choose either the observed variance among replicates within treatments (simple t-test) or a Bayesian estimate of the variance among replicates within treatments based on a prior distribution obtained from a local estimate of the standard deviation (Bayesian statistics). We first tested which method generated the largest number of expressed sequence tags (ESTs) showing statistically significant changes in expression in response to ligand treatment. P values from Bayesian statistics were plotted against log-transformed fold changes (magnitude of gene expression, Fig. 1). A sliding window size of 101 and a Bayesian confidence value of 10 were used. The number of statistically significant clones (P < 0.01) identified was 3,759 for the simple t-test compared with 4,811 by Bayesian analysis. The smallest fold change observed in the Bayesian analysis was 1.18-fold, whereas it was 1.08-fold for the simple t-test. Therefore, Bayesian analysis is not expected to increase the number of false positives, and this expectation proved to be the case when we tested the response of these genes by Northern and QRT-PCR analysis (see below).
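The idea behind the Bayesian variance estimate can be illustrated with a short sketch. It assumes the regularized variance takes the form (ν0·σ0² + (n − 1)·s²)/(ν0 + n − 2), where s² is a clone's observed variance, σ0 is the mean standard deviation of the clones within a sliding window of neighbors ranked by mean expression, and ν0 is the confidence value. The function below is a hypothetical re-implementation for illustration only and is not the Cyber-T code.

```python
from math import sqrt


def regularized_variances(means, variances, n, window=101, conf=10):
    """Shrink each clone's variance toward a local background estimate.

    means, variances: per-clone mean expression and observed variance.
    n: number of replicates per clone (assumed equal for all clones here).
    window: number of neighboring clones, ranked by mean, used for background.
    conf: weight (pseudo-observations) given to the background variance.
    """
    order = sorted(range(len(means)), key=lambda i: means[i])
    sds = [sqrt(variances[i]) for i in order]
    half = window // 2
    result = [0.0] * len(means)
    for rank, idx in enumerate(order):
        lo = max(0, rank - half)
        hi = min(len(sds), rank + half + 1)
        background_sd = sum(sds[lo:hi]) / (hi - lo)
        result[idx] = (conf * background_sd ** 2 + (n - 1) * variances[idx]) / (
            conf + n - 2
        )
    return result


# Hypothetical clones: the second has an implausibly small observed variance.
print(regularized_variances([7.0, 7.1, 7.2], [0.40, 0.01, 0.35], n=3, window=3))
```

With few replicates, a clone whose observed variance happens to be very small no longer produces an inflated t statistic, because its variance is pulled toward that of clones with similar expression levels.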
Figure 1. Relationship between fold change and magnitude of the gene expression change as a function of the Bayesian statistical significance for all 18,437 clones. Significance is plotted as the negative log10 of the P value, and the magnitude represents the log base two-fold change in gene expression. The ordinate axes show arbitrary cutoff of 1.75-fold for down-regulated clones and 1.5-fold for up-regulated clones, and the secondary abscissa axis shows P value of 0.01. The points in section a represent clones that are changed less than arbitrary cutoff but are statistically significant. The points in section b represent clones that are changed less than the arbitrary cutoff and are not statistically significant. The points in section c represent clones both statistically significant and altered by more than the arbitrary cutoff. The points in section d represent clones that are changed more than arbitrary cutoff but are not statistically significant.
We chose 1.5-fold up-regulation and P < 0.01 as initial criteria for determining significance. Samples that had fewer than three repetitions were automatically eliminated during data analysis by Cyber-T. The resulting data set consisted of 18,437 ESTs, each with three to six repetitions. The number of up-regulated clones predicted to be significant by the simple t-test was 505, whereas Bayesian statistics yielded 571. Thus, 66 ESTs shown to exhibit significant changes using Bayesian analysis were not statistically significant using the simple t-test. Among these 66 ESTs are two known to be up-regulated by RA: XCYP26 (Hollemann et al., 1998) and XMeis1 (Mercader et al., 2000). We sampled these 66 ESTs and found that 18 of 19 were validated by QRT-PCR and 3 of 3 by Northern blotting. Overall, these results show that Bayesian statistics detected more true positives than simple t-statistics without increasing the number of false positives. Therefore, Bayesian statistics were used for all data analysis in this manuscript.
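The selection criteria and the four sections of Figure 1 can be expressed as a small classifier. This is a sketch under stated assumptions: fold change is given as an agonist/antagonist ratio (values below 1 meaning down-regulation), the cutoffs are treated as inclusive, and the function name is hypothetical.

```python
from math import log2


def classify(fold, p_value, up_cutoff=1.5, down_cutoff=1.75, alpha=0.01):
    """Assign a clone to section a, b, c, or d of the significance plot.

    a: significant, below the fold cutoff      b: neither
    c: significant, beyond the fold cutoff     d: beyond the cutoff only
    """
    log_fc = log2(fold)
    beyond = log_fc >= log2(up_cutoff) or log_fc <= -log2(down_cutoff)
    significant = p_value < alpha
    if significant:
        return "c" if beyond else "a"
    return "d" if beyond else "b"


# Hypothetical clones as (fold, P value) pairs.
for fold, p in [(2.0, 0.001), (1.2, 0.001), (1.2, 0.5), (0.5, 0.5)]:
    print(fold, p, classify(fold, p))
```

Only clones in section c meet both criteria; the asymmetric cutoffs mirror the 1.5-fold and 1.75-fold thresholds drawn in Figure 1.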
To more confidently estimate the minimum fold change required for a gene to be considered significant, we sampled a subset of the putative up-regulated and down-regulated genes and verified that they were regulated in embryos treated with ligands. Northern blot analysis was performed with selected genes, including known RA-responsive genes, anterior and posterior marker genes, and uncharacterized genes (Fig. 2). Expression of RA up-regulated genes was enhanced by TTNPB treatment and reduced by AGN193109 treatment; these genes included midkine (MK), which is induced by RA treatment in mammals (Muramatsu and Muramatsu, 1991), the RA degradation enzyme CYP26 (Hollemann et al., 1998), short-chain dehydrogenase/reductase (SDR; Cerignoli et al., 2002), and the posterior marker genes Xcad3 (Shiotsugu et al., 2004), XMeis3 (Kudoh et al., 2002), and Forkhead transcription factor HNF-3α (Jacob et al., 1994). Meanwhile, RA down-regulated anterior marker genes such as Otx-2 (Pannese et al., 1995) and XA-1 (Hemmati-Brivanlou et al., 1990) behaved as expected in response to ligand treatments. All tested clones, including unknown and hypothetical genes, altered their expression in response to ligand treatment in the same direction in Northern blotting or QRT-PCR as in the microarray results. Because Northern blot analysis is not suitable for testing the expression of large numbers of genes, we adopted QRT-PCR as our main gene expression analysis tool and compared the expression of selected genes in TTNPB-, AGN193109-, and solvent-treated embryos. We repeated validation of Xcad3, XMeis3, Fetuin B, MK, HNF-3α, and two ESTs (XL044f02 and XL041b23) by QRT-PCR and confirmed that QRT-PCR generated results similar to those of Northern blot analysis. A total of 113 up-regulated and 115 down-regulated genes were tested by QRT-PCR and/or Northern analysis. Table 1 shows overall QRT-PCR results.
Figure 2. Northern blot analysis. Equal amounts of total RNA isolated from controls or embryos treated with TTNPB or AGN 193109 were electrophoresed, blotted to nylon membranes, and hybridized with probes derived from each clone. The name of each clone used for probe-templates and its closest match found in public databases are given on the left of each panel. Fold and (PCR) indicate change in microarray analysis and a validation result of quantitative real-time reverse transcriptase-polymerase chain reaction (QRT-PCR). C, T, and 109 indicate control, TTNPB, and AGN193109 treatment, respectively. The asterisks indicate known RA target and/or posterior marker genes, and ISH indicates clones used as probes for whole-mount in situ hybridization. MK, Midkine; VLCS, very-long-chain acyl-CoA synthetase; SDR1, short-chain dehydrogenase/reductase; LCS, long-chain acyl-CoA synthetase.
Table 1. Validation of Putative RAR Target Genes
| Fold change | Number of genes | Tested (QRT-PCR) | Validation |
|---|---|---|---|
| Up-regulated genes | | | |
| >1.9 | 102 | 39 | 100% (39/39) |
| 1.6–1.9 | 130 | 37 | 95% (35/37) |
| 1.5–1.6 | 130 | 24 | 83% (20/24) |
| Total | 362 | 100 | 94% (94/100) |
| Down-regulated genes | | | |
| >2.0 | 95 | 67 | 66% (44/67) |
| 1.75–2.0 | 103 | 23 | 61% (14/23) |
| Total >1.75 | 198 | 90 | 64% (58/90) |
| 1.5–1.75 | 404 | 25 | 36% (9/25) |
The up-regulated genes were divided into three groups based on fold induction, and a fraction of genes in each group was validated by QRT-PCR (Table 1) and/or Northern blot analysis (100%; 12 of 12 in group 1, 6 of 6 in group 2, and 2 of 2 in group 3). Of the genes validated by Northern blot analysis, 7 of 7 were also validated by QRT-PCR. The overall validation rate was 92%, calculated by weighting the QRT-PCR validation rate of each group by the number of genes in that group: (102 × 1 + 130 × 0.95 + 130 × 0.83)/362 × 100 = 92%. Therefore, we performed clustering analysis on the entire group of up-regulated genes.
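The weighted validation rate can be checked with a few lines of code. The sketch below uses the exact validated/tested fractions for the three up-regulated groups in Table 1 rather than rounded percentages; the function name is hypothetical.

```python
def weighted_validation_rate(groups):
    """Weight each group's validation fraction by its number of genes.

    `groups` holds (genes_in_group, validated, tested) tuples.
    Returns the overall rate as a percentage.
    """
    total_genes = sum(n for n, _, _ in groups)
    weighted = sum(n * validated / tested for n, validated, tested in groups)
    return 100 * weighted / total_genes


# Up-regulated groups from Table 1: >1.9, 1.6-1.9, and 1.5-1.6 fold.
up_groups = [(102, 39, 39), (130, 35, 37), (130, 20, 24)]
rate = weighted_validation_rate(up_groups)
print(f"overall validation rate: {rate:.1f}%")
```

Weighting by group size matters because the groups were not sampled in proportion to their size, so the pooled 94/100 from the tested genes slightly overstates the expected rate for all 362 genes.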
The down-regulated gene set was much less robust than the up-regulated set. We set a significance threshold of 1.75-fold, P < 0.01 because the dimethyl sulfoxide–negative controls on the microarray appeared at 1.75-fold down-regulation. The down-regulated set was further subdivided into three groups, although there was no dramatic difference in the validation rate between these groups (Table 1). The validation of genes showing more than 3.0-fold change was 3 of 3 (100%) by Northern blot analysis. Considering the much lower rate of validation compared with the up-regulated genes, we limited cluster analysis to the subset of validated genes. However, it should be noted that, although genes below the selected thresholds validate at relatively low frequency, a significant fraction (36%) does validate and may be biologically important.
Expression Pattern of RAR Target Genes
We determined the expression patterns of 43 up-regulated genes encoding hypothetical/unidentified proteins in stage 18 and 22 embryos by whole-mount in situ hybridization. We also characterized the expression of two Xenopus genes encoding thyroid hormone receptor TR-βA and U8snoRNA. A significant portion of the clones tested (37/45) showed staining in neural tissue; nine genes exhibited expression in the brain, and 13 showed neural crest expression. Expression patterns of the genes tested are summarized in Table 5. The eye anlagen was a site of expression for 22 different genes, and somite expression was seen in 16 genes. Five genes showed strong staining in the lateral plate mesoderm, which is the major site of RALDH2 expression during neurula stages. Other notable staining patterns included the blastopore, cement gland, presumptive pituitary anlagen, pronephros, and blood islands. We were rather surprised to see that every clone tested exhibited a distinct expression pattern in either stage 18 or stage 22 embryos.
In cases where it was not possible to discern from whole-mounts which tissues exhibited staining, we examined transverse sections of the embryos (n = 23). Figure 4 shows sample staining of the somites (Fig. 4Q3), neural tube (Fig. 4N3), roof plate of the neural tube (Fig. 4M3), the neural tube and somites (Fig. 4P3), the lateral neural tube (Fig. 4O3), and the notochord (Fig. 4R3). The staining patterns in the whole and sectioned embryos were summarized and used to categorize genes broadly into four groups by expression pattern (Supplemental Table 5). The staining patterns by which we chose to divide the genes were the neural tube (nt), neural crest (nc), brain (br), eye anlagen (eye), notochord (no), somites (sm), lateral plate mesoderm (mes), and others, including pronephros (pr), blood islands (bi), cement gland (cg), and pituitary anlagen (pa).

The signature staining for group 1 (n = 12) was present in the neural tube, neural crest, and optic anlagen. Examples of group 1 (XL027f19 and XL010b24) showed predominant neural crest and eye anlagen staining with or without staining in the closing neural tube at stage 18 (Fig. 4A1,A2,B1,B2). Both expression patterns were similar in stage 22 embryos, with staining of the migrating neural crest, eye anlagen, and neural tube (Fig. 4A3,A4,B3,B4). Although these examples did not show brain expression, two genes (XL021i22 and XL037p18) included staining in the probable brain at stage 22 (Supplemental Figure). XL010g16 expression included staining in the brain and neural tube at stages 18 and 22 (Supplemental Figure). Somite expression was detected in eight genes in this group. Three genes had staining in the pituitary anlagen (data shown in Supplemental Figure), and one was expressed in the presumptive pronephros and the ventral blood islands (Fig. 4C3,C6,C7). Transverse sections of the embryos disclosed that expression of two genes showing neural tube staining was localized to the presumptive roof plate (Fig. 4C5,M3).
XL049b08 expression included staining in presumptive otic vesicle, pronephros, blood islands, and pituitary anlagen in addition to the neural tube and strong eye staining at stage 22 (Fig. 4C). This gene was clustered into this group based on the probable neural crest staining seen at stage 18 (Fig. 4C2), although the staining was unclear at stage 22. Xenopus TR-βA expression was indistinct at stage 18 but stained the neural tube, eye anlagen, neural crest, and somites at stage 22. The function of TR-βA during the neurula stage is currently unknown, because the ligand for this receptor, thyroid hormone, is not produced until much later in development.
Figure 4. Expression patterns of selected retinoic acid receptor (RAR) target genes. Expression patterns of selected up-regulated genes at stage 18 and stage 22 were determined by whole-mount in situ hybridization. Embryos were hybridized with probes made from the indicated expressed sequence tag (EST) clones. The staining patterns in whole and sectioned embryos were summarized and used to categorize the genes broadly into four groups (groups 1, 2, 3, and 4) by expression pattern. The signature staining of group 1 is present in the neural tube, neural crest, and optic anlagen; the staining patterns of group 2 are similar to those of group 1 but absent from the neural crest. Group 3 genes show staining in the neural tube but not in the optic anlagen. Group 4 genes have staining in the lateral plate mesoderm where RALDH2 is predominantly expressed. A, B, and C are in group 1; D, E, and F are in group 2; G, H, and I are in group 3; and J, K, and L are in group 4. A1–L1, A3–L3, M1–R1, and M2–R2: dorsal view; A2–I2 and A4–I4: anterior view; C6 and J2–L2: lateral view; C7: ventral view; C5, J4–L4, and M3–R3: sagittal section of stage 22 embryos. VLCS, very-long-chain acyl-CoA synthetase; Hypothetical, similar to ESTs from other organisms; Unidentified, no significant similarity to database sequences from any organism; sm, somites; rp, roof plate of the neural tube; no, notochord; nt, neural tube.
Group 2 (n = 10) staining is similar to that of group 1 but lacks neural crest staining. For example, two clones encoding hypothetical proteins (XL016p20 and XL043c08) were expressed throughout the neural tube and in regionalized domains of the brain and eye anlagen in both stage 18 and stage 22 embryos (Fig. 4D,E). Because the expression patterns of XL016p20 and XL043c08 are remarkably similar to each other at stages 18 and 22, they may be under the control of the same regulatory pathway. XL005g23 was expressed in anterior neural tissue as regionalized brain staining in stage 18 embryos and showed strong anterior neural tube staining with probable brain and weak regional eye staining patterns at stage 22 (Fig. 4F). All genes in this group encode hypothetical or unidentified proteins (Table 5).
Figure 5. The effects of modulated retinoid signaling on the expression patterns of retinoic acid receptor (RAR) target genes. The effects of AGN193109, TTNPB, or vehicle control on the expression patterns of three expressed sequence tags (ESTs), XL016p20 (S1–S6), XL049b08 (T1–T6), and XL043c08 (U1–U6), were determined in stage 18 embryos by whole-mount in situ hybridization. AGN193109 treatment reduced posterior neural expression of all three genes compared with controls (S1 vs. S2, T1 vs. T2, and U1 vs. U2) but increased medial staining of XL016p20 (S1 vs. S2), XL049b08 staining in the eye anlagen (T4 vs. T5), and XL043c08 expression in the forebrain (U1 vs. U2). In contrast, TTNPB treatment increased expression of XL016p20 throughout the embryo (S3, S6 vs. S2, S5) and reduced XL043c08 staining in the forebrain and presumptive trigeminal ganglion (U3, U6 vs. U2, U5). S1–S3, T1–T3, and U1–U3: dorsal view; S4–S6, T4–T6, and U4–U6: anterior view; TTNPB, TTNPB-treated embryos; Control, vehicle control embryos; AGN193109, AGN193109-treated embryos.
Group 3 genes (n = 13) showed neural tube but no eye staining. The expression patterns in group 3 are more variable than those in groups 1 and 2. For example, XL035a11 had very strong staining in the cement gland, neural tube, and probable hatching gland, whereas another gene (XL033j21) that showed significant staining in the cement gland had neural tube and regional staining in the presumptive brain but no staining in the hatching gland (Fig. 4G,H). XL038o17, encoding Xenopus U8snoRNA (GenBank accession no. AF375054), showed unique expression in the neural tube and in a punctate pattern in the epidermis (Fig. 4I). Although this punctate staining appears to be excluded from the epidermis overlying the brain and neural tube, transverse sectioning of embryos revealed neural tube staining in stage 22 embryos (Supplemental Figure). Additional staining patterns seen in group 3 included the neural crest (n = 1), somites (n = 3), brain (n = 2), and blood islands (n = 1). The fourth group (group 4; n = 5) showed significant staining in the lateral plate mesoderm where RALDH2 is predominantly expressed (Fig. 4J–L). XL010k11 (Fig. 4J) and XL053l12 (Supplemental Figure) showed somite staining in addition to the lateral staining. Three genes could not be classified into any of these four groups; these did not have neural tube staining but showed staining in the somites (XL040i08; Fig. 4Q), somites and presumptive blood islands (XL04e24; Supplemental Figure), and the notochord (XL038g12; Fig. 4R).
The similar staining patterns of several of these unidentified genes gave us some concern that these might be nonoverlapping segments of the same gene. To minimize this possibility, we extensively checked the available sequences of these ESTs against each other to rule out that they had been misclassified by the clustering programs. We also compared them with the NCBI and NIBB databases regularly up until submission of the manuscript to facilitate correct classification. The results suggest that all of these clones represent unique sequences; however, we cannot fully eliminate the possibility that some may ultimately represent different regions of the same transcript. The emerging X. tropicalis genome sequence will aid in resolving this issue.
Lastly, we tested whether treatment with TTNPB vs. AGN193109 elicited regional differences in staining for the putative RAR-responsive genes. Treatment with TTNPB caused posteriorization of neural tissues together with loss of anterior tissues and anterior neural markers. AGN193109 treatment caused loss of posterior neural markers and increased expression of anterior neural markers accompanied by enlarged heads (Koide et al., 2001). We chose XL049b08 from group 1 because its normal expression included pronephros and blood islands in addition to the neural tube and eye anlagen. XL016p20 and XL043c08 were chosen from group 2 because they were expressed in the neural tube and the anterior neural tissues in very similar patterns. As expected, posterior neural expression of all three genes was reduced or eliminated by AGN193109 treatment compared with controls (cf. Fig. 5S1, T1, U1, and S2, T2, U2) in stage 18 embryos. AGN193109 treatment led to increased medial staining of XL016p20 (Fig. 5S1 vs. S2) and expanded expression domains in the forebrain for XL043c08 (Fig. 5U1 vs. U2). In contrast, TTNPB treatment led to apparently increased expression of XL016p20 throughout the embryo (Fig. 5S3, S6 vs. S2, S5) and to a complete loss of XL043c08 staining in the forebrain and presumptive trigeminal ganglion (Fig. 5U3, U6 vs. U2, U5). These results indicate that RAR activation is essential for the expression of XL016p20 and XL043c08 in posterior neural tissues but not for their expression in anterior neural tissues. XL049b08 expression was enhanced in the eye anlagen by AGN193109 treatment (Fig. 5T4 vs. T5) and was expanded to the anterior neural region by TTNPB with apparent loss of regional boundaries (Fig. 5T3, T6 vs. T2, T5). The posterior and ventral expression domains of XL049b08 were apparently unaffected (data not shown). Thus, XL049b08 expression is probably regulated by RAR-mediated repression in the anterior neural tissues, but RAR is not likely to be the dominant regulator in the pronephros and blood islands.