• Open Access

The genome of flax (Linum usitatissimum) assembled de novo from short shotgun sequence reads

Authors

  • Zhiwen Wang,

    1. BGI-Shenzhen, Bei Shan Industrial Zone, Yantian District, Shenzhen 518083, China
  • Neil Hobson,

    1. Department of Biological Sciences, University of Alberta, Edmonton, Alberta, T6G 2E9, Canada
  • Leonardo Galindo,

    1. Department of Biological Sciences, University of Alberta, Edmonton, Alberta, T6G 2E9, Canada
  • Shilin Zhu,

    1. BGI-Shenzhen, Bei Shan Industrial Zone, Yantian District, Shenzhen 518083, China
  • Daihu Shi,

    1. BGI-Shenzhen, Bei Shan Industrial Zone, Yantian District, Shenzhen 518083, China
  • Joshua McDill,

    1. Department of Biological Sciences, University of Alberta, Edmonton, Alberta, T6G 2E9, Canada
  • Linfeng Yang,

    1. BGI-Shenzhen, Bei Shan Industrial Zone, Yantian District, Shenzhen 518083, China
  • Simon Hawkins,

    1. Université Lille-Nord de France, Lille 1 Unité Mixte de Recherche Institut National de la Recherche Agronomique 1281, Stress Abiotiques et Différenciation des Végétaux, F-59650 Villeneuve d’Ascq Cedex, France
  • Godfrey Neutelings,

    1. Université Lille-Nord de France, Lille 1 Unité Mixte de Recherche Institut National de la Recherche Agronomique 1281, Stress Abiotiques et Différenciation des Végétaux, F-59650 Villeneuve d’Ascq Cedex, France
  • Raju Datla,

    1. National Research Council of Canada, Plant Biotechnology Institute, Saskatoon, Saskatchewan, S7N 0W9, Canada
  • Georgina Lambert,

    1. University of Arizona, School of Plant Sciences and BIO5 Institute, Tucson, AZ 85721, USA
  • David W. Galbraith,

    1. University of Arizona, School of Plant Sciences and BIO5 Institute, Tucson, AZ 85721, USA
  • Christopher J. Grassa,

    1. Department of Botany, University of British Columbia, Vancouver, British Columbia, V6T 1Z4, Canada
  • Armando Geraldes,

    1. Department of Botany, University of British Columbia, Vancouver, British Columbia, V6T 1Z4, Canada
  • Quentin C. Cronk,

    1. Department of Botany, University of British Columbia, Vancouver, British Columbia, V6T 1Z4, Canada
  • Christopher Cullis,

    1. Case Western Reserve University, Cleveland, OH 44106, USA
  • Prasanta K. Dash,

    1. National Research Centre on Plant Biotechnology, Indian Agricultural Research Institute, Pusa Campus, New Delhi 110012, India
  • Polumetla A. Kumar,

    1. National Research Centre on Plant Biotechnology, Indian Agricultural Research Institute, Pusa Campus, New Delhi 110012, India
  • Sylvie Cloutier,

    1. Agriculture and Agri-Food Canada, 195 Dafoe Road, Winnipeg, Manitoba, R3T 2M1, Canada
    2. Department of Plant Science, University of Manitoba, Winnipeg, Manitoba, R3T 2N2, Canada
  • Andrew G. Sharpe,

    1. National Research Council of Canada, Plant Biotechnology Institute, Saskatoon, Saskatchewan, S7N 0W9, Canada
  • Gane K.-S. Wong,

    Corresponding author
    1. BGI-Shenzhen, Bei Shan Industrial Zone, Yantian District, Shenzhen 518083, China
    2. Department of Biological Sciences, University of Alberta, Edmonton, Alberta, T6G 2E9, Canada
    3. Department of Medicine, University of Alberta, Edmonton, Alberta, T6G 2E1, Canada
      (e-mails deyholos@ualberta.ca, wangj@genomics.org.cn and gane@ualberta.ca).
  • Jun Wang,

    Corresponding author
    1. BGI-Shenzhen, Bei Shan Industrial Zone, Yantian District, Shenzhen 518083, China
    2. The Novo Nordisk Foundation Center for Basic Metabolic Research, University of Copenhagen, Denmark
    3. Department of Biology, University of Copenhagen, Denmark
  • Michael K. Deyholos

    Corresponding author
    1. Department of Biological Sciences, University of Alberta, Edmonton, Alberta, T6G 2E9, Canada


Summary

Flax (Linum usitatissimum) is an ancient crop that is widely cultivated as a source of fiber, oil and medicinally relevant compounds. To accelerate crop improvement, we performed whole-genome shotgun sequencing of the nuclear genome of flax. Seven paired-end libraries ranging in size from 300 bp to 10 kb were sequenced using an Illumina Genome Analyzer. A de novo assembly, composed exclusively of deep-coverage (approximately 94× raw, approximately 69× filtered) short-sequence reads (44–100 bp), produced a set of scaffolds with N50 = 694 kb, including contigs with N50 = 20.1 kb. The contig assembly contained 302 Mb of non-redundant sequence, representing an estimated 81% genome coverage. Up to 96% of published flax ESTs aligned to the whole-genome shotgun scaffolds. However, comparisons with independently sequenced BACs and fosmids showed some mis-assembly of regions at the genome scale. A total of 43 384 protein-coding genes were predicted in the whole-genome shotgun assembly, and up to 93% of published flax ESTs and 86% of Arabidopsis thaliana genes aligned to these predicted genes, indicating excellent coverage and accuracy at the gene level. Analysis of the synonymous substitution rates (Ks) observed within duplicate gene pairs was consistent with a recent (5–9 MYA) whole-genome duplication in flax. Within the predicted proteome, we observed enrichment of many conserved domains (Pfam-A) that may contribute to the unique properties of this crop, including agglutinin proteins. Together, these results show that de novo assembly, based solely on whole-genome shotgun short-sequence reads, is an efficient means of obtaining nearly complete genome sequence information for some plant species.
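
For readers unfamiliar with the N50 statistic quoted above, the following is a minimal Python sketch of how it is conventionally computed from a list of contig or scaffold lengths. It is an illustration only, not the authors' pipeline: the function name `n50` and the toy lengths are assumptions introduced here, and the example does not use flax data.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)  # longest sequences first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy lengths (arbitrary units), not taken from the flax assembly.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_lengths))  # total is 300; 80 + 70 reaches 150, so this prints 70
```

By the same kind of arithmetic, the summary's own figures (302 Mb of contig sequence at an estimated 81% coverage) imply a genome size of roughly 302 / 0.81 ≈ 373 Mb.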
