Genome sequence of dwarf birch (Betula nana) and cross-species RAD markers


New sequencing technologies allow the development of genome-wide markers for any genus of ecological interest, including plant genera such as Betula (birch) that have previously proved difficult to study because of widespread polyploidy and hybridization. We present a de novo reference genome sequence assembly, from 66× short-read coverage, of Betula nana (dwarf birch), a diploid that is the keystone woody species of subarctic scrub communities but is of conservation concern in Britain. We also present 100 bp PstI RAD markers for B. nana and closely related Betula tree species. Assembly of RAD markers in 15 individuals by alignment to the reference B. nana genome yielded 44–86k RAD loci per individual, whereas de novo RAD assembly yielded 64–121k loci per individual. Of the loci assembled by the de novo method, 3k homologous loci were found in all 15 individuals studied, and 35k were found in 10 or more individuals. Matching RAD loci to RAD locus catalogues from the B. nana individual used for the reference genome showed similar numbers of matches for both methods of RAD locus assembly, but indicated that the de novo RAD assembly method may overassemble some paralogous loci. In the 12 individuals heterospecific to B. nana, 37–47k RAD loci matched a catalogue of RAD loci from the B. nana individual used for the reference genome, whereas 44–60k RAD loci aligned to the B. nana reference genome itself. We present a preliminary study of allele sharing among species, demonstrating the utility of the data for introgression studies and for the identification of species-specific alleles.