The continuing development of molecular markers is rapidly increasing the resolution at which plant genome structure can be described. Using common sets of DNA markers, genetic linkage maps developed for different species have been compared. Such studies have revealed a high degree of linkage conservation both between the genomes of several monocot species (Moore et al. 1995; Tikhonov et al. 1999) and between a range of dicots (Cavell et al. 1998; Lagercrantz, 1998). From this work, collinear chromosome regions and orthologous loci have been identified. Synteny with practical applications has even been predicted to bridge the monocot–dicot divide (Paterson et al. 1996). A comparative study between Oryza sativa (rice) and Arabidopsis thaliana (Arabidopsis) found five genes whose synteny was altered by a single inversion, but these were interspersed with non-conserved genes (van Dodeweerd et al. 1999). Comparative physical mapping and sequence analysis of genetically identified collinear regions will clarify the extent to which macrosynteny is maintained at the inter- and intragenic levels. Sequencing data comparing orthologous regions in maize, sorghum and Arabidopsis indicate that a much higher degree of diversity exists at the genome microstructural level than predicted from genetic mapping studies (Tikhonov et al. 1999).
Brassica oleracea is a diploid species with many subspecies covering a wide range of commercially important vegetable crop forms such as broccoli, cauliflower, cabbage, kale and Brussels sprouts. To date, four genetic linkage maps have been reported for B. oleracea, all of which demonstrate that this genome is highly duplicated (Bohuon et al. 1996; Kianian & Quiros, 1992; Landry et al. 1992; Slocum et al. 1990). Its nuclear DNA content has been estimated to be approximately 600 Mb (Arumuganathan & Earle, 1991).
Arabidopsis is used extensively in many areas of plant biology research. Its small genome (approximately 130 Mb) and low content of repetitive DNA make it well suited to genetic and physical mapping studies. The complete genome sequence should be available by the end of the year 2000, and the sequences of the first two chromosomes have already been reported (Lin et al. 1999; Mayer et al. 1999). Arabidopsis sequencing data, and the accompanying information on the functions of the genes identified, will be important for understanding other species with more complex genomes, particularly related species. Brassica genes have been shown to share a high level of sequence conservation with their Arabidopsis orthologues, typically >85% nucleotide identity in coding regions (Cavell et al. 1998). Brassicas are the group of crops most closely related to Arabidopsis, and are therefore the obvious choice for evaluating comparative genomic approaches to understanding and manipulating biological processes and traits in crops.
Comparative genetic mapping between Brassica species and Arabidopsis (Cavell et al. 1998; Lagercrantz, 1998) has shown that the diploid Brassica genomes are extensively triplicated. The overall pattern of organization suggests that although many rearrangements have occurred, each unit is approximated by, and has extensive collinearity with, the Arabidopsis genome. These results raised the expectation that a general description of the collinearity between Brassicas and Arabidopsis would make it possible to use physically mapped Arabidopsis clones and DNA sequence data to directly assist Brassica genome analysis. The comparative information upon which this expectation is based is, however, derived from the genetic mapping of a relatively small number of genes and anonymous DNA fragments, often widely separated along the chromosome. Establishing the extent to which microsynteny (i.e. gene-by-gene organization) is conserved will provide clues to structure–function relationships of plant genome organization and insights into the mechanisms of plant genome evolution. It will also define vital parameters to guide researchers starting to conduct gene-isolation experiments based on comparative mapping approaches.
Large-insert clone libraries are an essential tool for understanding the details of genomic organization and microsyntenic relationships in large, duplicated genomes such as those of the Brassicas, which are never likely to be fully sequenced. Bacterial artificial chromosome (BAC) vectors can carry large inserts (up to 300 kb) and have lower frequencies of chimeric and rearranged clones (approximately 5%; Bent et al. 1998) than other clone types such as yeast artificial chromosomes. This has led to the current extensive use of BAC vectors for library construction. The Binary-BAC (BIBAC) vector (Hamilton, 1997) incorporates two technologies: the BAC (Shizuya et al. 1992), and the binary vector strategy for Agrobacterium-mediated plant transformation (Hoekema et al. 1983). Transforming large fragments of DNA (>100 kb) into plants would make it feasible to introduce genomic fragments encoding quantitative trait loci or gene clusters (as is frequently the case for disease resistance genes), and would be useful in studies to confirm that a particular clone contains the gene or allele of interest.
In this study, a BIBAC library (33 742 clones, average insert size 145 kb) was constructed using genomic DNA from B. oleracea var. alboglabra and was probed with 19 genes from a 222 kb sequenced region on Arabidopsis chromosome 4. Homoeologous regions within the Brassica genome were identified and microsynteny was evaluated.
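As a rough guide to what a library of this size might be expected to contain, the figures quoted above (33 742 clones, 145 kb average insert, a genome of approximately 600 Mb) can be combined in a back-of-envelope coverage estimate. The sketch below is illustrative only and is not a calculation taken from this study. It assumes random, unbiased cloning, treats every insert as having the average size, and uses the standard relation P = 1 - (1 - f)^N, where f is the fraction of the genome carried by one clone and N is the number of clones. The function names are hypothetical.

```python
def genome_equivalents(n_clones, mean_insert_kb, genome_size_kb):
    """Total cloned DNA expressed as multiples of the haploid genome size."""
    return n_clones * mean_insert_kb / genome_size_kb


def probability_of_recovery(n_clones, mean_insert_kb, genome_size_kb):
    """Probability that a given single-copy sequence is in at least one clone.

    Assumes clones sample the genome uniformly at random and that each
    insert has the mean size; real libraries deviate from both assumptions.
    """
    fraction_per_clone = mean_insert_kb / genome_size_kb
    return 1.0 - (1.0 - fraction_per_clone) ** n_clones


# Values quoted in the text; the genome size is itself an approximate estimate.
N_CLONES = 33742
MEAN_INSERT_KB = 145
GENOME_SIZE_KB = 600_000  # approximately 600 Mb

coverage = genome_equivalents(N_CLONES, MEAN_INSERT_KB, GENOME_SIZE_KB)
p_recovery = probability_of_recovery(N_CLONES, MEAN_INSERT_KB, GENOME_SIZE_KB)

print(f"Estimated coverage: {coverage:.2f} genome equivalents")
print(f"Estimated probability of recovering a given sequence: {p_recovery:.4f}")
```

Under these assumptions the library corresponds to roughly eight genome equivalents. Because the diploid Brassica genomes are extensively triplicated, a single Arabidopsis probe may detect clones from more than one homoeologous region, so the number of positive clones per probe need not match the single-copy expectation.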