Keywords:

  • chicken;
  • database;
  • embryo;
  • gallus;
  • gene expression;
  • in situ hybridization

Abstract

Despite the increasing quality and quantity of genomic sequence that is available to researchers, predicting gene function from sequence information remains a challenge. One method for obtaining rapid insight into potential functional roles of novel genes is through gene expression mapping. We have performed a high throughput whole-mount in situ hybridization (ISH) screen with chick embryos to identify novel, differentially expressed genes. Approximately 1,200 5′ expressed sequence tags (ESTs) were generated from cDNA clones of a Hamburger and Hamilton (HH) stage 4–7 (late gastrula) chick embryo endoderm–mesoderm library. After screening to remove ubiquitously expressed cDNAs and internal clustering and after comparison to GenBank sequences, remaining cDNAs (representing both characterized and uncharacterized genes) were screened for expression in HH stage 3–14 embryos by automated high throughput ISH. Of 786 cDNAs for which ISH was successfully performed, approximately 30% showed ubiquitous expression, 40% were negative, and approximately 30% showed a restricted expression pattern. cDNAs were identified that showed restricted expression in every embryonic region, including the primitive streak, somites, developing cardiovascular system and neural tube/neural crest. A relational database was developed to hold all EST sequences, ISH images, and corresponding BLAST report information, and to enable browsing and querying of data. A user interface is freely accessible at http://geisha.biosci.arizona.edu. Results show that high throughput whole-mount ISH provides an effective approach for identifying novel genes that are differentially expressed in the developing chicken embryo. Developmental Dynamics 229:677–687, 2004. © 2004 Wiley-Liss, Inc.


INTRODUCTION

With sequencing of the human genome completed, attention is focusing on understanding the function of individual genes, and how groups of interrelated genes function in concert. An important prerequisite is to determine when and where genes are expressed during embryogenesis and in the adult. For developmental biologists, expression and functional studies of differentially expressed genes have traditionally focused on one or a relatively few genes involved in specific developmental processes. While this approach has yielded important new information concerning the function of a limited number of differentially expressed genes, analysis of the human genome indicates that approximately half of all predicted genes encode proteins of currently unknown function (Lander et al., 2001; Venter et al., 2001). Methods for high throughput analysis of gene expression would greatly speed the rate at which differentially expressed genes are identified and characterized.

In recent years, several methods for high throughput analysis of gene expression have been developed. Microarray technology offers a powerful means for screening a large number of sequences for expression within defined tissues and/or organs, and can rapidly identify large numbers of differentially expressed sequences (Schena, 1995; Xiang, 2003). In general, however, microarrays profile expression levels for highly defined groups of cells or tissues, with more limited ability to provide spatial and temporal information. For developmental studies, whole-mount in situ hybridization (ISH), in which transcripts from a single gene are visualized within an entire embryo, provides a level of spatial information that is not possible with other techniques (Wilkinson, 1992; Nieto et al., 1996). By analyzing embryos at different embryonic stages, a global, although relatively qualitative, view of changes in gene expression can be obtained. Whole-mount ISH is relatively labor intensive, however, and so has traditionally been used to examine expression of one or at most a small group of genes. Protocol modifications and the use of robotics have allowed researchers to contemplate much larger screens, and high throughput ISH gene expression screens have now been reported in several organisms including fly, zebrafish, frog, and mouse (Gawantka et al., 1998; Neidhardt et al., 2000; Kudoh et al., 2001; Tomancak et al., 2002). These screens demonstrated the efficacy of using ISH as a means for gene discovery, and efforts are under way in zebrafish and fly to map expression of all differentially expressed genes.

Due to the obvious ethical limitations concerning use of human embryos, comprehensive gene expression mapping efforts in amniotes must rely on model organisms. The embryo of the chicken provides several advantages for large scale screening efforts. Research using chick embryos has a long history and has spanned a broad range of fields. The chicken is also an agriculturally important species, providing one of the most rapidly growing sources of meat and egg protein (Rosegrant, 2001). As an amniote, early stages of chick embryogenesis share close similarities with mammals, including human, but because development occurs in ovo chick embryos are easily and inexpensively accessible for observation and experimental manipulation. Furthermore, the high conservation of gene function observed across classes of organisms indicates that discoveries in chicken can be applied to mouse and human studies. Advantageous morphology and availability of large numbers of embryos at low cost make large scale screens in chicken significantly faster and less expensive than in mammalian species such as mouse. The chick embryo, therefore, provides an advantageous species for large scale mapping of gene expression patterns.

This report describes GEISHA (Gallus EST in situ hybridization analysis), a pilot gene discovery project in chick embryos that combines expressed sequence tag (EST) generation and BLAST analysis with whole-mount ISH to identify novel, differentially expressed genes. A MySQL database was developed to house data and to enable display and querying through a freely available user interface (http://geisha.biosci.arizona.edu). An overview is presented of the experimental and analysis pipeline, and representative images are presented of ISH patterns showing differentially expressed genes in the primitive streak, somites, cardiovascular system, and neural tube. Results show that ISH provides an effective means for identifying differentially expressed genes in chick embryos, and that databases such as GEISHA provide a framework for organizing and presenting expressed sequence and gene expression data to the research community.

RESULTS

Approach

The goal of this study was to assess the utility of linking EST analysis with high throughput whole-mount ISH as a means for identifying differentially expressed genes in the developing chicken embryo. The overall approach was as follows: individual colonies were picked from a directionally cloned cDNA library following a simple hybridization backscreen to identify and remove the most redundant sequences, and a single 5′ sequence read was obtained. Resulting ESTs were subjected to BLASTN comparisons against NCBI's nucleotide (nt) and EST (dbest) databases, and BLASTX against NCBI's protein (nr) database. All ESTs were also compared with each other through an internal BLASTN algorithm, and sequences were clustered to reveal redundancy and overlap within the EST set. Expression patterns of nonredundant cDNAs were determined by ISH using hybridization robots. Resulting ISH patterns were documented, and a database and Web-based user interface were created to facilitate presentation and querying of sequence, BLAST, and image information. An overall pipeline is shown in Figure 1.

Figure 1. Flow chart showing conceptual pipeline for the GEISHA project.

cDNA Library Preparation and EST Analysis

PolyA+ RNA isolated from the endoderm and mesoderm of approximately 2,000 gastrula-stage (Hamburger and Hamilton [HH] 4–7) White Leghorn chick embryos was used to generate a mixed random primed and oligo d(T) primed, directionally cloned cDNA library in the Uni-ZAP XR vector. Library analysis revealed a complexity of approximately 10^6 clones, with an average insert size of 1.8 kb. To facilitate processing large numbers of cDNAs, mass excision was performed and the resulting Bluescript XR phagemid library was used for all subsequent manipulations.

For preliminary analysis of cDNAs, 200 individual colonies were picked at random. After overnight culture and isolation of plasmid DNA, sequence reads were obtained from the 5′ end of cDNAs, and BLASTN and BLASTX searches were performed against the GenBank nt, dbest, and nr databases. cDNAs coding for ribosomal RNA, mitochondrial proteins, metabolic enzymes, and common cytoskeletal elements were pooled and used to backscreen the cDNA library. Approximately 1,200 nonhybridizing clones were picked and for each a single 5′ sequence read was obtained. Vector sequence and low-quality regions were removed from the raw sequence traces by using the PHRED base calling program (Ewing and Green, 1998; Ewing et al., 1998); resulting 5′ ESTs had an average length of 537 nt. BLAST comparisons were then performed against the GenBank databases. For each BLAST comparison, the top 50 hits above a threshold (Expectation [E] values better than 0.05 for BLASTN of nt and BLASTX of nr; E values better than 10^-10 for BLASTN of dbest) were saved, along with sequence traces and corresponding quality data, and imported into the database (see below).
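
The hit-retention rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's pipeline code: the `Hit` record, the `THRESHOLDS` keys, and the function name `retain_hits` are hypothetical, and only the E-value cutoffs and the limit of 50 hits come from the text.

```python
from dataclasses import dataclass

# E-value cutoffs taken from the text; the dictionary keys are hypothetical labels.
THRESHOLDS = {"blastn_nt": 0.05, "blastx_nr": 0.05, "blastn_dbest": 1e-10}
MAX_HITS = 50  # at most 50 hits are saved per comparison


@dataclass
class Hit:
    subject_id: str
    evalue: float


def retain_hits(hits, search):
    """Keep hits with an E value better (smaller) than the cutoff, best first."""
    cutoff = THRESHOLDS[search]
    passing = [h for h in hits if h.evalue < cutoff]
    passing.sort(key=lambda h: h.evalue)
    return passing[:MAX_HITS]
```

Note how the same hit list yields different results per search type: a hit with E = 0.01 would be kept for BLASTN of nt but discarded for BLASTN of dbest, where the cutoff is far stricter.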

To determine the degree of sequence redundancy within the EST set, each EST was compared with all other sequences in the database, and matching clones with an E value better than 0.05 and with a matching region of at least 100 nucleotides were clustered. Twenty-three percent of all ESTs were related to at least one other EST, leaving 729 unique sequences. Because cDNAs were chosen at random, the size of a cluster revealed the relative abundance of that sequence in the cDNA library. Median best hit E values for ESTs were 7e-36, 2e-54, and 1e-112, for BLASTN of nt, BLASTX, and BLASTN of dbest, respectively. The percentages of sequences showing no significant match (no BLASTN of nt or BLASTX hit better than E = 0.05; no BLASTN of dbest hit better than E = 10^-10) were 24% (226), 26% (247), and 26% (248), for BLASTN of nt, BLASTX, and BLASTN of dbest, respectively. Approximately 2,200 ESTs from a second cDNA library generated from HH stage 16–18 cardiac cushions were also added to the database, although these sequences were not included in the hybridization screen.
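
One way to turn pairwise matches into clusters is sketched below. The text gives the pairwise criteria (E value better than 0.05, matching region of at least 100 nucleotides) but not how pairs were merged, so single-linkage grouping with a union-find structure is an assumption here, as are the function name `cluster_ests` and the tuple layout of `matches`.

```python
def cluster_ests(est_ids, matches, max_evalue=0.05, min_overlap=100):
    """Group ESTs so that any qualifying pairwise match joins two clusters.

    `matches` is an iterable of (est_a, est_b, evalue, overlap_nt) tuples
    (a hypothetical layout for parsed all-against-all BLASTN results).
    """
    parent = {e: e for e in est_ids}

    def find(x):
        # Follow parent links to the cluster representative, shortening the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, evalue, overlap in matches:
        if a != b and evalue < max_evalue and overlap >= min_overlap:
            parent[find(a)] = find(b)

    clusters = {}
    for e in est_ids:
        clusters.setdefault(find(e), []).append(e)
    # Largest clusters first; under random picking, size tracks library abundance.
    return sorted(clusters.values(), key=len, reverse=True)
```

With single linkage, two ESTs that do not overlap each other can still land in one cluster if a third EST bridges them, which suits 5′ reads that start at different positions of the same transcript.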

High Throughput Whole-Mount In Situ Hybridization

After removal of redundant, mitochondrial, and ribosomal sequences, and cDNAs coding for common metabolic enzymes and cytoskeletal components, 864 cDNAs were chosen for high throughput whole-mount ISH. Digoxigenin-labeled antisense RNA probes were prepared from linearized plasmid or PCR generated templates, and ISH and visualization of bound probe were performed manually (Wilkinson, 1992; Nieto et al., 1996) or using two Abimed In Situ Pro hybridization robots to facilitate processing. Hybridization robots enabled high throughput analysis of a large number of probes, though the overall quality of ISH results was sometimes lower than that obtained by manual processing. In general, however, the increased throughput possible using the robots outweighed the somewhat reduced quality of ISH results. At least three embryos between 1 and 3 days of development (HH stages 3–14) were assayed with each probe. Processed embryos were examined and expression patterns initially classified as negative, ubiquitous, or restricted (i.e., with a locally specific expression pattern observed for at least one embryo stage). Digital images of embryos were acquired for all cDNAs showing a restricted expression pattern, and most of these were chosen for more detailed ISH analysis. Of 786 total cDNAs for which ISH was successfully performed (78 of the 864 cDNAs proved refractory to probe generation), approximately 30% showed ubiquitous expression, 40% were negative, and approximately 30% showed a restricted expression pattern.

A review of ISH images showed that cDNAs were identified with restricted expression patterns in virtually every embryonic region and at all stages examined (through HH stage 14), even though the cDNA library was generated using RNA from endoderm and mesoderm of HH stage 4–7 chick embryos. A summary of expression patterns according to stage and localization is shown in Figure 2. Expression patterns in certain structures were especially prevalent. Of the 786 cDNAs whose expression was examined by ISH, at least 11 (1.5%) showed localized expression in the primitive streak (Fig. 3). BLAST report information (Table 1) shows that these include cDNAs that are completely unrelated to any known sequence (e.g., P14 and Z10), cDNAs showing some relationship to known sequences (e.g., Q38, which has weak homology to a serine protease inhibitor), and cDNAs coding for well-characterized genes. Among the well-characterized cDNAs, some, such as E36 (goosecoid, Izpisua-Belmonte et al., 1993) and L11 (c-hairy1; Jouve et al., 2002), have been localized previously in the streak, while others have not (e.g., G40, a 78-kDa glucose regulated protein).

Figure 2. Bar graphs illustrating the number of cDNAs expressed in various embryonic structures between Hamburger and Hamilton stages (st) 3–14. Y-axes reflect the number of cDNAs; “n” values at right show the total number of cDNAs identified for each stage that showed expression in any structure.

Figure 3. Representative cDNAs identified in the GEISHA screen that are expressed in the primitive streak. Letter and number combinations correspond to the GEISHA nomenclature designation for that cDNA. Sequence descriptions and an expression summary are shown in Table 1.

Table 1. BLAST Report Information

        ------- Expression pattern -------
cDNA #  Streak  Somite  Hrt/Vasc  Neural  Description
D18     ++      ++                        moderately similar to dead box protein 15 (DDX15)
E36     ++                                goosecoid
G40     ++                                78-kDa glucose-regulated protein (GRP78)
I17     ++      ++      ++        ++      N-cadherin
J31     ++                                MRLC (myosin regulatory light chain)
L11     ++      ++                        c-hairy1
P14     ++                                unknown
P44     ++                                N-acetylglucosamine-6-sulfatase
Q38     ++                                weakly similar to Kunitz type I ser. prot. inhibitor
W24     ++                                Ghox-lab homoeobox
Z10     ++      ++                        unknown
A01             ++                        weakly similar dodecenoyl-CoA Delta-isomerase
A48             ++                        moderately similar to oxidoreductase
E54     ++      ++                        unknown
F15             ++                ++      unknown
F33             ++                ++      moderately similar to ribonuclease/angiogenin inhibitor
I35             ++                        weakly similar to immunoglobulin superfamily member 4
X04             ++                        unknown
Y11             ++                        retinaldehyde dehydrogenase 2 (Raldh2)
A38                     ++        ++a     very similar to DnaJ (Hsp40) homolog (DNAJA2)
C02                     ++                unknown
G15                     ++        ++a     unknown
H01                     ++        ++a     Moderately similar to glutamyl-prolyl-tRNA synthetase
L16                     ++        ++      unknown
M16                     ++                weakly similar to gag polyprotein
N17                     ++        ++a     moderately similar to sorting nexin 7
P38                     ++                DEAD box polypeptide 27
P48                     ++                weakly similar to equarin-S
W34                     ++                MIF2 suppressor (PSMT3)
Y48                     ++        ++      DEAD/H box polypeptide 19 (DDX19)
Y42                     ++        ++      unknown
A07                               ++      weakly similar to carnitine acetyltransferase
A15                               ++      unknown
A34                               ++      tight junction protein 1
A35                               ++      weakly similar to Vg1 binding protein
D03                               ++      unknown
D07                               ++      latrophilin 2
D19                               ++      similar to F37/esophageal cancer-related gene-coding leucine-zipper motif (FEZ1)
D35                               ++      weakly similar to reverse transcriptase, pol-like
E15                               ++      transformer-2 beta
E51                               ++      ADP-ribosylation factor 1
F14                               ++      lymphocyte-specific protein 1 (LSP1)
F18                               ++      weakly similar to ariadne homolog 2
F34                               ++      unknown
F50                               ++      weakly similar to h-Shippo 1

a  Expression patterns not shown in Figure 6.

cDNAs showing expression in the developing somites (at least 13 cDNAs, Fig. 4) and in the heart and vascular system (at least 12 cDNAs, Fig. 5) include a similar repertoire of “characterized” cDNAs, cDNAs related to genes of known function, and cDNAs coding for uncharacterized proteins. Surprisingly, cDNAs showing restricted expression in the developing neural tube and/or neural crest represented the largest cluster of differentially expressed sequences, even though the source RNA for production of the cDNA library specifically excluded the ectoderm. At least 26 cDNAs (3.3%) showed expression in the neural tube and/or neural crest (Fig. 6). Of these, only 10 sequences showed identity with, or were highly similar to, characterized genes (Table 1). Genes that share expression in multiple structures during development have been termed a synexpression group (Gawantka et al., 1998), and in several cases coexpression was observed. Five genes (of 19 total) were coexpressed in the primitive streak and somites, although of these only I17 (N-cadherin) was also expressed in neural structures and in the heart/vascular system. The largest synexpression group (nine genes) was coexpressed in neural structures and in the developing vasculature.
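
Finding candidate synexpression groups amounts to intersecting per-structure annotations. The sketch below uses a small hand-copied excerpt of Table 1 rather than the full data set, so its counts will not match the totals quoted above; the structure labels and function names are hypothetical.

```python
from itertools import combinations

# Excerpt of Table 1 only: cDNA -> structures in which expression was scored.
EXPRESSION = {
    "D18": {"streak", "somite"},
    "E36": {"streak"},
    "G40": {"streak"},
    "I17": {"streak", "somite", "heart/vasc", "neural"},
    "L11": {"streak", "somite"},
    "Y11": {"somite"},
}


def coexpressed(expression, structures):
    """Return cDNAs expressed in every structure named in `structures`."""
    wanted = set(structures)
    return sorted(c for c, where in expression.items() if wanted <= where)


def pair_counts(expression):
    """Count the cDNAs shared by each pair of structures."""
    all_structures = sorted(set().union(*expression.values()))
    return {
        pair: len(coexpressed(expression, pair))
        for pair in combinations(all_structures, 2)
    }
```

For example, `coexpressed(EXPRESSION, ["streak", "somite"])` lists the clones in this excerpt that are scored in both structures, and `pair_counts` gives a quick overview of which structure pairs share the most clones.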

Figure 4. Representative cDNAs identified in the GEISHA screen that are expressed in the presomitic mesoderm and/or somites. Sequence descriptions and an expression summary are shown in Table 1.

Figure 5. Representative cDNAs identified in the GEISHA screen that are expressed in the heart, blood vessels, and/or blood islands. Sequence descriptions and an expression summary are shown in Table 1.

Figure 6. Approximately 3.3% of cDNAs examined were expressed in the developing neural tube. A representative group of images is shown, grouped roughly according to staining localization. Sequence descriptions and an expression summary are shown in Table 1.

Database and User Interface

To organize sequence information and expression data in a format that would be helpful during data collection and for subsequent gene characterization efforts, a relational database and user interface were created that are freely accessible at http://geisha.biosci.arizona.edu. When information pertaining to a particular cDNA is requested, all information relating to that sequence (previously entered into the MySQL database) is retrieved and merged to create an HTML page. Each cDNA page contains the EST sequence, a link to the original sequence trace, ISH images, BLAST results, cluster information showing relatedness of the EST to other sequences in the GEISHA database, and open reading frame analysis of the 5′ EST in all six potential reading frames (Fig. 7). The top three hits from BLASTN of nt, BLASTX, and BLASTN of dbest are shown on each page, linked directly to the corresponding NCBI report. A link is also present to the full BLAST reports. With this design, all information relating to a single cDNA is readily accessible from one page. During construction of the database and user interface, an annotated chicken genome was not available, and so BLAST information provided the best means for characterizing a particular sequence. Availability of assembled genomic sequence will provide for alternative ways to organize and link information.
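
The retrieve-and-merge step can be illustrated with a toy relational layout. Everything in this sketch is an assumption made for illustration: GEISHA's actual tables are not described in the text, the table and column names are hypothetical, the clone `demo1` is made up, and Python's built-in `sqlite3` stands in for MySQL only so that the example is self-contained.

```python
import sqlite3

# Hypothetical schema: names are illustrative, not GEISHA's actual tables.
SCHEMA = """
CREATE TABLE cdna      (clone_id TEXT PRIMARY KEY, description TEXT);
CREATE TABLE est       (clone_id TEXT REFERENCES cdna(clone_id), sequence TEXT);
CREATE TABLE blast_hit (clone_id TEXT REFERENCES cdna(clone_id),
                        search TEXT, subject TEXT, evalue REAL);
CREATE TABLE ish_image (clone_id TEXT REFERENCES cdna(clone_id),
                        hh_stage INTEGER, path TEXT);
"""


def build_demo_db():
    """Create an in-memory database holding one made-up clone."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO cdna VALUES ('demo1', 'placeholder description')")
    conn.execute("INSERT INTO est VALUES ('demo1', 'ATGGCCTAAGG')")
    hits = [("demo1", "blastn_nt", f"subject{i}", 10.0 ** -(i + 5)) for i in range(5)]
    conn.executemany("INSERT INTO blast_hit VALUES (?, ?, ?, ?)", hits)
    conn.execute("INSERT INTO ish_image VALUES ('demo1', 8, 'images/demo1_st8.jpg')")
    return conn


def cdna_page(conn, clone_id, top_n=3):
    """Gather every record for one cDNA into a single dictionary (one 'page')."""
    row = conn.execute(
        "SELECT description FROM cdna WHERE clone_id = ?", (clone_id,)).fetchone()
    if row is None:
        return None
    seq = conn.execute(
        "SELECT sequence FROM est WHERE clone_id = ?", (clone_id,)).fetchone()
    searches = conn.execute(
        "SELECT DISTINCT search FROM blast_hit WHERE clone_id = ?",
        (clone_id,)).fetchall()
    top_hits = {}
    for (search,) in searches:
        # Best (smallest) E values first, limited to the top few per search type.
        rows = conn.execute(
            "SELECT subject FROM blast_hit WHERE clone_id = ? AND search = ? "
            "ORDER BY evalue ASC LIMIT ?", (clone_id, search, top_n)).fetchall()
        top_hits[search] = [subject for (subject,) in rows]
    images = [path for (path,) in conn.execute(
        "SELECT path FROM ish_image WHERE clone_id = ? ORDER BY hh_stage",
        (clone_id,))]
    return {"clone_id": clone_id, "description": row[0],
            "sequence": seq[0] if seq else None,
            "top_hits": top_hits, "images": images}
```

Keying every table on the clone identifier is what lets a single lookup assemble sequence, BLAST, and image data for one page; rendering the returned dictionary as HTML would be a separate templating step.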

Figure 7. Sample GEISHA cDNA page. All information associated with a sequence is displayed on a single page, including BLAST results, ISH images, EST sequences, relatedness to other sequences in the database, and EST open reading frame analysis.
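
Because the reading frame and strand of a single-pass 5′ EST are not known in advance, open reading frame analysis considers three frames on each strand. The sketch below shows one minimal way to do this with the standard genetic code; it is not the GEISHA implementation, and the function names and the "longest stretch without a stop codon" summary are assumptions.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate from position 0; codons with ambiguous bases become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq):
    """Return translations for frames +1..+3 and -1..-3, keyed by frame label."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


def longest_open_stretch(protein):
    """Longest run of residues not interrupted by a stop codon ('*')."""
    return max(protein.split("*"), key=len)
```

Comparing `longest_open_stretch` across the six frames gives a rough indication of which frame is most likely coding, which can then be checked against the BLASTX alignment for the same EST.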

Several options are available through the main user interface page for browsing and querying the database. Users can go directly to a particular cDNA using either GEISHA cDNA identifiers (for example, “G15”) or the common name for a cDNA (for example, “Fgf8”). Alternatively, cDNAs can be browsed through table listings of sequences. For comparative analyses, all images for a particular stage of development also can be viewed on a single page, permitting rapid scanning of thumbnail images for expression patterns of interest.

Query GEISHA allows searching of all BLAST output files for any parameter in the BLAST report, in either a basic or advanced mode. The basic mode allows users to choose the type of BLAST output to query, the cDNA library from which sequences were derived, and any text term within the BLAST reports. Advanced query allows more sophisticated filtering of searches based upon various additional BLAST parameters. Also available under Query GEISHA is the option to simultaneously compare ISH patterns obtained from multiple cDNAs. Querying of images using anatomic terms is not yet available and will require development of a consensus chick embryo anatomic ontology (see Discussion section).
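
A basic-mode query of the kind described above reduces to a text match combined with two optional filters. The following sketch assumes each stored BLAST report is available as a dictionary with `clone_id`, `search`, `library`, and `text` fields; those field names, and the function itself, are hypothetical and do not describe the actual Query GEISHA code.

```python
def query_reports(reports, term, search=None, library=None):
    """Return clone IDs whose BLAST report text contains `term`.

    `search` and `library` are optional filters; None means "any".
    """
    needle = term.lower()
    return sorted({
        r["clone_id"]
        for r in reports
        if needle in r["text"].lower()
        and (search is None or r["search"] == search)
        and (library is None or r["library"] == library)
    })
```

An advanced mode could extend the same pattern with numeric conditions, for example keeping only reports whose best hit falls below a chosen E value.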

DISCUSSION

It is generally acknowledged that embryonic development is driven largely by differential gene expression and that knowing when and where genes are expressed is an important prerequisite to understanding their function. ISH offers one approach to mapping gene expression patterns that provides a large amount of spatial and temporal information. High throughput ISH screens have been performed in several vertebrate and invertebrate organisms, although none have been reported in chick despite its agricultural importance and extensive history as a model organism for biomedical research. The objective of this study was to assess the utility of using EST analysis and high throughput ISH as a means for identifying differentially expressed genes in the chick embryo and to develop a relational database and user interface for display and query of sequence and image data. A non-normalized, directionally cloned cDNA library was generated, and subjected to a simple hybridization backscreen using the most abundant sequences obtained from a preliminary set of 200 ESTs. The 5′ ESTs were obtained from approximately 1,400 randomly selected cDNAs, and BLAST report information was used to select cDNAs for hybridization analysis. While some screens have performed ISH by using random cDNAs before sequencing, BLAST analysis provided a rapid and efficient selection mechanism. A total of 864 cDNAs were ultimately chosen for high throughput hybridization analysis, using at least three chick embryos between HH stages 3 and 14 (approximately 16–54 hr of incubation) for each probe. The use of ISH robots greatly increased the throughput rate of probe analysis, although in general the quality of hybridization patterns (increased background, occasional reduced signal intensity) was somewhat inferior to results obtained by manually performing the procedure.

Thirty percent of cDNAs analyzed showed restricted expression, and approximately one third of these showed highly restricted expression within one defined structure. These proportions are generally similar to those obtained in an ISH screen in Xenopus (Gawantka et al., 1998). Because only 1,400 ESTs in total were sequenced from a non-normalized library, it is expected that only a minority of these would correspond to rare mRNAs. It is, therefore, not clear why 40% of cDNAs showed no hybridization signal, although to reduce the possibility that technical errors contributed to this percentage, probe concentration and integrity were checked by gel electrophoresis. A high throughput ISH screen in mouse in which cDNAs were preselected for low abundance reported that 21% of cDNAs showed no or low ubiquitous expression (Neidhardt et al., 2000).

cDNAs showing restricted expression in the initial hybridization screen were further analyzed by using embryos at multiple intermediate stages, providing a more detailed view of their expression patterns. Embryos showing restricted expression patterns were scored according to their anatomic location of staining (Fig. 2), and for the most part, assignment of anatomic location was performed without sectioning. Genes were identified that showed localized expression in all major structures. Approximately 1.5% of all cDNAs, and 5% of differentially expressed cDNAs, showed expression in the primitive streak, the presomitic mesoderm, and/or somites or the heart and vasculature. Five genes were coexpressed in the primitive streak and somites, and nine genes showed coexpression in neural structures and the developing vasculature. It will be interesting to determine whether any of these coexpressed genes are regulated by similar signaling pathways in different embryo regions, or alternatively are themselves components of the same regulatory pathway. Because the cDNA library used specifically excluded the ectoderm and neural primordia, it was unexpected that the highest percentage of differentially expressed cDNAs was localized to the neural tube, neural crest, and developing brain (Fig. 6). This finding is consistent with the high percentage of neurally expressed genes identified in other screens (Gawantka et al., 1998; Neidhardt et al., 2000), reflecting the complexity of ongoing cellular processes in the developing central nervous system.

The overall screening strategy used in this study has been used, with some modifications, for large scale screens in other organisms, including frog, zebrafish, and mouse (Gawantka et al., 1998; Neidhardt et al., 2000; Kudoh et al., 2001; Tomancak et al., 2002). The goal for each of these studies was to identify differentially expressed genes, and because considerable evidence indicates that mRNAs showing a restricted expression pattern represent a minority of all expressed sequences, strategies were used to increase the percentage of differentially expressed sequences in the pool to be analyzed. Library normalization can reduce redundancy and increase the relative abundance of rare cDNAs (Soares et al., 1994; Neidhardt et al., 2000). It has also been reported that mRNAs for differentially expressed genes are found preferentially in the low-abundance pool, and hybridization backscreening has been used to select for these low-abundance cDNAs (Neidhardt et al., 2000). For the present study, we elected to use a non-normalized library and to perform a simple backscreen by using the most abundant cDNAs in a preliminary analysis of 200 ESTs. This approach eliminated approximately 20% of sequences from further analysis, although considerable redundancy was still observed in remaining cDNAs. For screens in which many thousands of cDNAs will be obtained from a single library, normalization will ultimately be both time- and cost-effective.

The availability of genomics related resources impacts the manner in which ISH screens can be approached. Availability of nonredundant cDNA sets, for example, provides characterized sequences for probe production, reducing the need to obtain sequences from embryonic libraries. Nonredundant cDNA sets have been developed for Drosophila that represent the majority of expressed sequences (Rubin et al., 2000; Stapleton et al., 2002a, b) and are being used to systematically map the expression of all genes (Tomancak et al., 2003). A chicken cDNA set representing approximately one third of all predicted expressed genes is presently under development (Boardman et al., 2002) and will provide a valuable reference resource for ISH screening in chick.

An important objective of this study was to create a database for storing image and associated sequence information, and a user interface to enable efficient browsing and querying of expression information. All data are entered into a MySQL database and accessed for querying and to generate Web pages. For each cDNA, all relevant information is displayed in a single Web page, enabling efficient navigation between data types. The GEISHA database was created before sequencing of the chicken genome; therefore, in its present format, BLAST reports provide the information around which image and sequence information is organized. Once an assembled chicken genome is available, it will be possible to map each expressed sequence to the genome, and information in the GEISHA database will then annotate the genomic sequence. One important feature presently absent from the database is the ability to query for expression patterns by anatomic structure. Anatomic querying requires a controlled vocabulary that describes chicken embryo anatomy at each stage of development. Several chicken embryo anatomy atlases have been published (Bellairs and Osmond, 1998; Schoenwolf and Mathews, 2003), and two universally accepted stage series describe recognizable stages from fertilization through laying (Hamburger and Hamilton, 1951; Eyal-Giladi and Kochav, 1976). Controlled vocabularies have been developed for several other organisms, including mouse (Davidson et al., 2001), which could serve as a framework for generating a consensus chicken anatomic ontology. This remains an important task for the chicken research community.

EXPERIMENTAL PROCEDURES

cDNA Library Construction and EST Generation

Embryos were isolated from fertile chicken eggs (Gallus domesticus) and staged (Hamburger and Hamilton, 1951) after 16–54 hr in a humid incubator at 37°C. After removal from the egg, embryos were placed in isosmotic (123 mM) NaCl solution. Endoderm and mesoderm were excised under a dissecting microscope with an eyebrow hair or tungsten probe and placed directly into RNA Stat-60 (Tel-test, Friendswood, TX) for total RNA purification, according to manufacturer's instructions. RNA was isolated from approximately 2,000 embryos. PolyA+ RNA was purified on oligo(dT) Sepharose and then mixed oligo(dT) and random primed for cDNA synthesis. Size fractionated cDNA (>0.4 kb) was directionally cloned into the EcoRI and XhoI sites of the Uni-ZAP XR phage vector (Stratagene). To facilitate high-throughput handling of cDNAs, mass excision and transformation into SOLR cells was performed according to standard protocols (Stratagene), and the remaining manipulations were performed with Bluescript phagemids. Plasmid purification from the cDNA clones was performed with multiples of QIAprep 8 8-well strips (Qiagen). The first 200 clones and subsequent clones that were negative in the backscreen (see below) underwent single-pass 5′ sequencing using the T3 primer. Vector sequence and low-quality regions (quality score < 20) were removed from the raw sequence traces using PHRED (Ewing and Green, 1998; Ewing et al., 1998).
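
Phred quality scores relate to the estimated base-call error probability P by Q = -10 log10(P), so the cutoff of 20 used above corresponds to an estimated error rate of 1 in 100. The sketch below shows a simplified end-trimming rule based on that cutoff. It is not PHRED's own trimming algorithm, and the function names are hypothetical; it only illustrates the idea of discarding low-quality read ends.

```python
def error_probability(q):
    """Convert a Phred quality score to an estimated base-call error probability."""
    return 10 ** (-q / 10)


def trim_low_quality_ends(seq, quals, min_q=20):
    """Strip leading and trailing bases whose quality is below `min_q`.

    Simplified illustration: interior low-quality bases are left in place.
    """
    if len(seq) != len(quals):
        raise ValueError("sequence and quality lists must have equal length")
    start, end = 0, len(seq)
    while start < end and quals[start] < min_q:
        start += 1
    while end > start and quals[end - 1] < min_q:
        end -= 1
    return seq[start:end], quals[start:end]
```

A read whose qualities never reach the cutoff trims to an empty sequence, which gives a natural way to flag failed sequencing reactions before BLAST analysis.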

Macroarray Backscreen

After obtaining an initial 200 ESTs and performing BLASTN to identify known genes, digoxigenin riboprobes (Roche Diagnostics) were generated for 15 ribosomal, mitochondrial, and cytoskeletal sequences by PCR of gel purified inserts, according to the manufacturer's protocol. A total of 0.5–1 μl from plasmid preps (0.2 μg/μl DNA) of remaining clones were spotted onto gridded nitrocellulose membranes. After hybridization using standard protocols, colonies that failed to hybridize were selected for further analysis.

BLAST Comparisons

ESTs were compared with the NCBI's GenBank (nt), GenPept (nr), and dbEST databases using BLASTN and BLASTX (nr only) set to an expect value cutoff of 0.05, and a maximum of 50 hits were recorded in HTML format. To aid in annotation, BLAST bit scores (in addition to other data, in some cases) were used to create a short description for each clone. Each EST was also compared using BLASTN to all GEISHA ESTs, and these results were parsed to create clone clusters. Custom software was used to generate cluster diagrams. Approximately 2,200 ESTs from a second cDNA library generated from HH stage 16–18 (Hamburger and Hamilton, 1951) cardiac cushions were also added to the database but were not included in the ISH screen.

Whole-Mount ISH

Riboprobes were prepared from all cDNAs that did not represent obviously ubiquitous genes based on 5′ EST BLAST results. Digestion of 1.5-μg plasmid preparations with NotI was verified by resolving 1 μl of each 15-μl reaction by 1% agarose gel electrophoresis. Riboprobe synthesis was performed in microcentrifuge tubes in a scaled-down reaction containing 2.5 μl of linearized plasmid (0.1 μg/μl), 0.5 μl of Dig RNA labeling mix (Roche), 0.5 μl of T7 RNA polymerase (50 U/μl, Stratagene), 0.25 μl of RNase inhibitor (10 U/μl), 1.5 μl of 5× RNA polymerase buffer (Stratagene), and 4.75 μl of water. After incubation at 37°C for 2 hr, 0.25 μl of RQ1 DNase was added and samples were incubated for an additional 30 min. Riboprobe quality and approximate size were determined by 1% agarose gel electrophoresis, and probes were stored at −80°C. Whole chicken embryos were isolated as described above, and after fixation in 4% paraformaldehyde in PBS, embryos were dehydrated in a graded methanol series. For hybridization, at least three embryos of varying stages between 3 and 14 were placed into each of 30 large incubation columns of an InSituPro ISH robot (Primm Labs, Cambridge, MA) along with riboprobes in hybridization solution. The robot hybridization protocol followed closely that of Nieto et al. (1996), except that the KTBT wash solution contained only 0.1% Triton X-100. This modification reduced adherence of embryos to one another in the wells. After hybridization and washing, embryos were removed from the robot and placed into 12-well plates for NBT/BCIP staining. Staining was stopped when needed, after 1–72 hr, and background staining was removed with a methanol wash. Embryo staining was classified as negative, ubiquitous, or restricted, and digital images were obtained of all embryos showing a restricted expression pattern.
Image processing was performed by using ImageMagick with a script interface to allow simultaneous manipulation of multiple images, and with Adobe Photoshop.
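
The riboprobe reaction listed above totals 10 μl before DNase addition. As a worked example, the Python sketch below checks that total and scales the shared reagents into a master mix for one robot run of 30 probes. The use of a master mix, the 10% pipetting overage, and the function names are assumptions made for illustration; the text states only the per-reaction volumes.

```python
# Per-reaction volumes in microliters, taken from the reaction described above.
TEMPLATE_UL = 2.5  # linearized plasmid (0.1 ug/ul), added to each tube separately
SHARED_REAGENTS_UL = {
    "Dig RNA labeling mix": 0.5,
    "T7 RNA polymerase (50 U/ul)": 0.5,
    "RNase inhibitor (10 U/ul)": 0.25,
    "5x RNA polymerase buffer": 1.5,
    "water": 4.75,
}


def reaction_volume_ul():
    """Total volume of one synthesis reaction before DNase is added."""
    return TEMPLATE_UL + sum(SHARED_REAGENTS_UL.values())


def master_mix(n_reactions, overage=0.10):
    """Scale the shared reagents for n reactions plus a pipetting overage.

    The overage fraction is an assumed convenience, not a value from the text.
    """
    if n_reactions < 1:
        raise ValueError("n_reactions must be at least 1")
    factor = n_reactions * (1 + overage)
    return {name: round(vol * factor, 2) for name, vol in SHARED_REAGENTS_UL.items()}


if __name__ == "__main__":
    print("Single reaction volume (ul):", reaction_volume_ul())
    for reagent, volume in master_mix(30).items():
        print(f"{reagent}: {volume} ul")
```

Each tube would then receive 7.5 μl of master mix plus 2.5 μl of its own linearized plasmid, which corresponds to 0.25 μg of template per reaction.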

Data Storage and Analysis

A MySQL database was used to store data and to facilitate display and querying of cDNA and sequence information, BLAST reports, and images. A data analysis pipeline was created with a series of Perl scripts to permit easy data input, processing, and updating, and all primary data and the results of sequence and image analysis were placed in the relational database. Data presentation uses the GEISHA MySQL database along with a custom Web interface.
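
A relational layout of this kind can be sketched as follows. The actual GEISHA schema is not given in the text, so every table name, column name, and sample row below is hypothetical, and Python's built-in sqlite3 module stands in for MySQL so that the example runs without a database server.

```python
import sqlite3

# Hypothetical schema: one row per cDNA clone, with ESTs, BLAST hits, and
# ISH images each linked back to the clone they describe.
SCHEMA = """
CREATE TABLE clone (
    clone_id    TEXT PRIMARY KEY,
    description TEXT,
    expression  TEXT CHECK (expression IN ('negative', 'ubiquitous', 'restricted'))
);
CREATE TABLE est (
    est_id   INTEGER PRIMARY KEY,
    clone_id TEXT NOT NULL REFERENCES clone(clone_id),
    sequence TEXT NOT NULL
);
CREATE TABLE blast_hit (
    hit_id    INTEGER PRIMARY KEY,
    est_id    INTEGER NOT NULL REFERENCES est(est_id),
    program   TEXT NOT NULL,
    subject   TEXT NOT NULL,
    bit_score REAL,
    evalue    REAL
);
CREATE TABLE ish_image (
    image_id  INTEGER PRIMARY KEY,
    clone_id  TEXT NOT NULL REFERENCES clone(clone_id),
    hh_stage  INTEGER,
    file_path TEXT NOT NULL
);
"""


def build_demo_db():
    """Create an in-memory database with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO clone (clone_id, description, expression) VALUES (?, ?, ?)",
        [
            ("demo_001", "placeholder description A", "restricted"),
            ("demo_002", "placeholder description B", "ubiquitous"),
            ("demo_003", "placeholder description C", "restricted"),
        ],
    )
    conn.executemany(
        "INSERT INTO ish_image (clone_id, hh_stage, file_path) VALUES (?, ?, ?)",
        [
            ("demo_001", 4, "images/demo_001_hh4.jpg"),
            ("demo_001", 10, "images/demo_001_hh10.jpg"),
        ],
    )
    return conn


def restricted_clones_with_image_counts(conn):
    """List clones scored as restricted with the number of stored ISH images."""
    query = """
        SELECT c.clone_id, COUNT(i.image_id)
        FROM clone AS c
        LEFT JOIN ish_image AS i ON i.clone_id = c.clone_id
        WHERE c.expression = 'restricted'
        GROUP BY c.clone_id
        ORDER BY c.clone_id
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    db = build_demo_db()
    print(restricted_clones_with_image_counts(db))
```

Keeping images and BLAST hits in separate tables keyed to the clone or EST lets a Web interface answer browsing queries, such as all restricted clones imaged at a given stage, with a single join.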

Acknowledgements

We thank Rob Baker for helpful discussions and comments on the manuscript, Nirav Merchant of the University of Arizona Biotechnology Facility for extensive assistance with design and implementation of the initial database and user interface, and Robert Schambach and Carl Cox for implementing the revised database model. We also thank Raymond Runyan for contribution of 2,200 ESTs from a chick embryo cardiac cushion cDNA library. G.W.B. was supported by an NIH postdoctoral institutional fellowship.

REFERENCES

  • Bellairs R, Osmond M. 1998. The atlas of chick development. London: Academic Press. 323 p.
  • Boardman PE, Sanz-Ezquerro J, Overton IM, Burt DW, Bosch E, Fong WT, Tickle C, Brown WRA, Wilson SA, Hubbard SJ. 2002. A comprehensive collection of chicken cDNAs. Curr Biol 12: 1965–1969.
  • Davidson D, Bard J, Kaufman M, Baldock R. 2001. The mouse atlas database: community resource for mouse development. Trends Genet 17: 49–51.
  • Ewing B, Green P. 1998. Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res 8: 186–194.
  • Ewing B, Hillier L, Wendl MC, Green P. 1998. Base-calling of automated sequencer traces using phred. I. Accuracy assessment. Genome Res 8: 175–185.
  • Eyal-Giladi H, Kochav S. 1976. From cleavage to primitive streak formation: a complementary normal table and a new look at the first stages of the development of the chick. Dev Biol 49: 321–337.
  • Gawantka V, Pollet N, Delius H, Vingron M, Pfister R, Nitsch R, Blumenstock C, Niehrs C. 1998. Gene expression screening in Xenopus identifies molecular pathways, predicts gene function and provides a global view of embryonic patterning. Mech Dev 77: 95–141.
  • Hamburger V, Hamilton HL. 1951. A series of normal stages in the development of the chick embryo. J Morphol 88: 49–92.
  • Izpisua-Belmonte JC, De Robertis EM, Storey KG, Stern CD. 1993. The homeobox gene goosecoid and the origin of organizer cells in the early chick blastoderm. Cell 74: 645–659.
  • Jouve C, Iimura T, Pourquie O. 2002. Onset of the segmentation clock in the chick embryo: evidence for oscillations in the somite precursors in the primitive streak. Development 129: 1107–1117.
  • Kudoh T, Tsang M, Hukriede NA, Ziongfong C, Dedekian M, Clarke CJ, Kiang A, Schultz S, Epstein JA, Toyama R, Dawid IB. 2001. A gene expression screen in zebrafish embryogenesis. Genome Res 11: 1979–1987.
  • Lander ES, Linton LM, Birren B, Nusbaum C, et al. 2001. Initial sequencing and analysis of the human genome. Nature 409: 860–921.
  • Neidhardt L, Gasca S, Wertz K, Obermayr F, Worpenberg S, Lehrach H, Herrmann BG. 2000. Large-scale screen for genes controlling mammalian embryogenesis, using high-throughput gene expression analysis in mouse embryos. Mech Dev 98: 77–93.
  • Nieto MA, Patel K, Wilkinson DG. 1996. In situ hybridization analysis of chick embryos in whole mount and tissue sections. In: Methods in cell biology. New York: Academic Press, Inc.
  • Rosegrant MR, Paisner MS, Meijer S, Witcover J. 2001. Global Food Outlook: Trends, Alternatives, and Choices. Washington, D.C.: International Food Policy Research Institute.
  • Rubin GM, Hong L, Brokstein P, Evans-Holm M, Frise E, Stapleton M, D.A. H. 2000. A Drosophila complementary DNA resource. Science 270: 467–470.
  • Schena M, Shalon D, Davis RW, Brown PO. 1995. Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science 270: 467–470.
  • Schoenwolf GC, Mathews WW. 2003. Atlas of descriptive embryology. Upper Saddle River: Pearson Education, Inc.
  • Soares MB, Bonaldo MF, Jelene P, Su L, Lawton L, Efstratiadis A. 1994. Construction and characterization of a normalized cDNA library. Proc Natl Acad Sci U S A 91: 9228–9232.
  • Stapleton M, Carlson J, Brokstein P, Yu C, Champe M, R. G, Guarin H, Pacleb J, Park S, Wan K, GM R, Celniker SE. 2002a. A Drosophila full length cDNA resource. Genome Biol 3: research 0080.0081–0080.0088.
  • Stapleton M, Liao G, Brokstein P, Hong L, Carninci P, Shiraki T, Hayashizaki Y, M. C, Pacleb J, Wan K, Yu C, Carlson J, George R, Celniker S, Rubin GM. 2002b. The Drosophila gene collection: identification of putative full-length cDNAs for 70% of D. melanogaster genes. Genome Res 12: 1294–1300.
  • Tomancak P, Beaton A, Weiszmann R, Kwan E, Shu S, Lewis SE, Richards S, Ashburner M, Hartenstein V, Celniker SE, Rubin GM. 2002. Systematic determination of patterns of gene expression during Drosophila embryogenesis. Genome Biol 3: 1–14.
  • Venter JC, Adams MD, Myers EW, Li PW, Mural RJ, Sutton GG, et al. 2001. The sequence of the human genome. Science 291: 1304–1351.
  • Wilkinson DG. 1992. In situ hybridization: a practical approach. Oxford: IRL Press. p 75–83.
  • Xiang Z, Yang Y, Ma X, Ding W. 2003. Microarray expression profiling: analysis and applications. Curr Opin Drug Discov Devel 6: 384–395.