Genomic resources for lupins are coming of age

Lupins are underutilised pulse crops that are attracting increasing interest for human consumption of their high‐protein grain. They are also valued as a source of animal nutrition and make an excellent break crop in agricultural production systems. As for other orphan legumes, the genomic revolution has made it cost‐effective to apply modern genetic and genomic approaches in lupins. These have predominantly been applied in the two major domesticated lupin species, namely, narrow‐leafed lupin (NLL; Lupinus angustifolius) and white lupin (Lupinus albus), with transcriptome studies also emerging in other domesticated and undomesticated species. This review provides an overview of the current lupin genomic resources, including two reference genomes for NLL and white lupin, several transcriptome resources and the development of pan‐genomes for NLL and white lupin, and describes how these offer great potential to increase grain yield and quality for these recently domesticated pulse crops. Furthermore, we highlight the importance of lupins for furthering our understanding of many aspects of fundamental legume biology. Combined, these resources will aid breeders and growers in improving lupin crops to help meet the increasing demand for plant protein in more sustainable cropping systems.


| INTRODUCTION
The genomic revolution over the last two to three decades has improved our understanding of many aspects of plant biology and accelerated ongoing efforts to improve a wide range of crops. It has led to a suite of genomic resources, from complete reference genomes for many crops to a growing number of comprehensive pan-genomes and valuable gene expression atlases based on transcriptomic data. Complementary proteomic and metabolomic resources for lupins are also emerging but are not as extensive as the genomic and transcriptomic resources. These genomic advances have been pivotal in greatly increasing our knowledge of plant biology, initially in model plants such as Arabidopsis and major crops such as maize and rice, but increasingly in many of the world's other crops. In parallel, these genomic resources have paved the way for increasingly sophisticated plant-breeding tools, starting with the generation and deployment of various types of molecular markers such as simple sequence repeats (SSRs) and, now, for many crops, the use of massive single nucleotide polymorphism (SNP)-based chip and/or direct sequencing approaches to analyse specific breeding populations, in combination with high-throughput phenotyping methods.
The focus of this review is on the relatively recent development and application of genomic resources for lupins, which are members of the legume family. Legumes are an important group of plants that belong to the Fabaceae (or Leguminosae) family and are distributed throughout the world. Legumes constitute the third largest family of flowering plants based on total number of known species (Christenhusz & Byng, 2016) and are considered the second most important family of crop plants from an economic perspective (Smykal, Coyne, Ambrose, Maxted, & Schaefer, 2014). The family has about 19,000 known species, which make up about 7% of all flowering plant species. The legume family is varied, with a range of plant types, and many legumes have been used throughout the world as either grain or forage crops. This has occurred for a combination of reasons, including their unique ability to fix nitrogen, which reduces the need for fertilizers and enriches the soil for subsequent nonlegume crops, as well as their value as disease breaks in crop rotations. Grain legumes are also valued for their nutritional benefits for both human and livestock consumption (Duranti & Gius, 1997; Proskina & Pilvere, 2019; Rubio & Molina, 2016; Santa & Rekiel, 2020), where they can provide valuable levels of major nutrients such as protein, dietary fibre and carbohydrates. Recently, interest in grain legumes has grown due to the relatively high levels of protein in their grains and the rapidly growing demand for plant protein in human diets (Lonnie & Johnstone, 2020; Murphy-Bokern et al., 2017; Sa et al., 2020).
Our knowledge of the legume family has benefited greatly from the advent of genomic approaches. Initially, these were focused on two model legumes, Medicago truncatula and Lotus japonicus, as well as soybean, the major grain legume crop. However, as genomic approaches have become increasingly accessible, in large part through the development of high-throughput, low-cost next generation sequencing technologies, legume scientists have been able to deploy these technologies in an increasing number of legume crops. Many of these crops had previously been considered 'orphan' crops in terms of the genomic resources available, and progress in their improvement lagged behind that of crops with good genomic resources. Consequently, the development of advanced genomic resources has greatly facilitated research and breeding efforts for crops such as chickpea, peanut and pigeon pea (Pandey et al., 2016; Sahruzaini et al., 2020; Varshney et al., 2019), while there is growing interest in using proteomic and metabolomic approaches for legume crop improvement (Ramalingam et al., 2015). There has also been an expansion in the use of comparative genomic approaches, which have given valuable insight into many aspects of legume biology, such as syntenic relationships among various legume species, and a deeper understanding of legume evolution. For example, a major recent effort has involved large-scale surveying of sequence diversity in some legume species through low coverage sequencing of a range of accessions, including many wild relatives, and in some cases the use of this information to form pan-genomes that help illustrate the conserved and variable gene sets for the species.
Lupins are important members of the legume family with several interesting properties. Lupins are important ecological 'engineers' and are often among the first plant species to colonise new areas such as damaged sites. They are also able to colonise impoverished and low nutrient soils due to their nitrogen fixing ability and their efficient uptake of phosphorus from soils. Lupins belong to the genus Lupinus in the genistoid clade of legumes, which diverged early in papilionoid legume evolution (Lavin et al., 2005), and so genomic advances in lupin species provide valuable information on how this somewhat distant branch of the legume phylogenetic tree has evolved. Lupin grain is also receiving considerable interest for its potential to help with major human health issues around diabetes, obesity and cardiovascular diseases (Lee et al., 2006). There are around 800 known lupin species that are widely distributed geographically. They are found extensively in North and South America, where they occupy a wide range of habitats, and also in the Mediterranean region (Drummond et al., 2012). Particularly in South America, wild lupin species have been found to have one of the highest levels of diversification of any plant species, and genomic advances are shedding light on how this is achieved. From a crop perspective, only a few lupin species have been domesticated, and today, the most widely cultivated species are Lupinus angustifolius and Lupinus albus, whereas Lupinus luteus and Lupinus mutabilis are niche crops. Although overall lupin production has fluctuated over the last 20 years, over a million tonnes are produced every year, with Australia being the largest producer, followed by Poland and the Russian Federation.
This review on lupin genomics focuses primarily on L. angustifolius and L. albus, both of whose genomes have been recently sequenced.
The review also touches on some of the other genomic resources that have been or are being developed for lupins (Figure 1) and how these are being, and will be, used to further our understanding of lupin biology and evolution and to drive step changes in lupin crop improvement to help meet the growing demand for high-quality plant protein.

Some of the earliest genomic resources developed in lupin species were expressed sequence tag (EST) libraries (Fischer et al., 2015; Foley et al., 2011; Kroc et al., 2014; Nelson et al., 2006, 2010; Parra-González et al., 2012; Phan et al., 2007; Tian et al., 2009). These EST datasets were predominantly used to convert expressed sequences into gene-based molecular markers. This was achieved by identifying either sequence or length-based polymorphisms between expressed sequence reads of different lupin lines and converting these into polymerase chain reaction (PCR)-based molecular markers (reviewed in detail in Kamphuis et al., 2020). They have been successfully employed to construct the first genetic maps in white lupin (L. albus) (Croxford et al., 2008; Phan et al., 2007) and yellow lupin (L. luteus) (Iqbal et al., 2019; Lichtin et al., 2020) as well as to improve the density of genetic maps in narrow-leafed lupin (NLL) (Fischer et al., 2015; Nelson et al., 2006, 2010).
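As a rough illustration of how a length-based polymorphism between expressed sequences from two lines might be spotted, the sketch below scans sequences for perfect simple sequence repeats (SSRs) and compares repeat counts. It is not the pipeline used in the cited studies: the function name `find_ssrs`, the thresholds and the two toy reads are assumptions made purely for this example.

```python
import re

def find_ssrs(seq, min_unit=2, max_unit=6, min_repeats=4):
    """Return (start, unit, repeat_count) for each perfect tandem repeat.

    Hypothetical helper for illustration; the thresholds are arbitrary.
    """
    seq = seq.upper()
    # A unit of min_unit..max_unit bases followed by at least
    # (min_repeats - 1) further exact copies of that unit.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        unit = match.group(1)
        if len(set(unit)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAA
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits

# Toy reads from two hypothetical lines: same flanks, different (GA)n length.
read_line_a = "ACGTT" + "GA" * 6 + "CCTAG"
read_line_b = "ACGTT" + "GA" * 9 + "CCTAG"

ssr_a = find_ssrs(read_line_a)[0]
ssr_b = find_ssrs(read_line_b)[0]
# Amplicon length difference implied by the difference in repeat count.
length_difference_bp = (ssr_b[2] - ssr_a[2]) * len(ssr_a[1])
print(ssr_a, ssr_b, length_difference_bp)
```

Primers designed in the conserved flanking sequence would then amplify products of different lengths in the two lines, which is what makes such a repeat usable as a PCR-based marker.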
Length-based polymorphic markers derived from these ESTs have also been demonstrated to be transferable across lupin species: Parra-González et al. (2012) showed that EST-SSR markers from yellow lupin also generated PCR amplicons in L. hispanicus and L. mutabilis. This allowed them to assess the level of genetic diversity between different lupin species (Parra-González et al., 2012). Similarly, genetic maps constructed for white lupin (Phan et al., 2007) and NLL (Nelson et al., 2006) have EST-derived molecular markers that generated amplicons in both species, and even some that amplify across a range of legume species. With the new age of sequencing technology, EST libraries have been replaced by RNA sequencing (RNAseq) data generated using next generation sequencers. In NLL, the first comprehensive RNA sequencing study of five different tissue types (root, stem, leaf, flower and seed) for three different accessions (Tanjil, Unicrop and P27255) was conducted by Kamphuis et al. (2015). The RNA sequencing data for these three lines, together with RNAseq data for breeding line '83A:476' generated through the 1,000 plants initiative (Cannon et al., 2015), were utilised to predict in silico gene-based molecular markers evenly distributed across the NLL genome. The four lupin lines are the parents of two recombinant inbred line populations in NLL, a wide cross (83A:476 × P27255) and a narrow cross derived from two cultivars (Tanjil × Unicrop), and a set of 239 insertion/deletion markers and 96 SNP markers were generated, of which 179 and 88, respectively, were incorporated in the reference genetic map (Kamphuis et al., 2015). A recent study by Książkiewicz et al. (2017) utilised RNAseq data to generate a total of 3,597 transcriptome-derived markers to significantly improve the density of the reference genetic map for white lupin.
The generation of dense reference genetic maps in both NLL and white lupin has aided the identification of candidate genes for domestication traits such as vernalisation requirement and alkaloid content (Kamphuis et al., 2015; Kroc et al., 2014, 2019; Książkiewicz et al., 2017) as well as genes associated with resistance to key diseases of lupins such as anthracnose caused by Colletotrichum lupini and Phomopsis stem blight caused by Diaporthe toxica (Fischer et al., 2015; Hane et al., 2017; Książkiewicz et al., 2017). Similarly, genetic maps developed in yellow lupin and quantitative trait locus (QTL) analysis for flowering time and anthracnose resistance have identified candidate genes for these traits (Iqbal et al., 2019; Lichtin et al., 2020). More recently, massive analysis of cDNA ends (MACE) was employed for 126 NLL lines surveyed in the field for several agronomic traits to conduct a genome wide association study (GWAS) to identify genes associated with these traits (Plewinski et al., 2020). Strong marker trait associations were identified for the start and end of flowering, maturity, plant height, yield and total grain weight, where the deletion in the major flowering time locus LanFtc1 (Taylor et al., 2019) was strongly associated with both the start and end of flowering as well as maturity (Plewinski et al., 2020). The MACE technique was also used for expression QTL mapping in NLL, constituting the first e-QTL survey in lupins (Plewinski et al., 2019). This study highlighted numerous genes co-regulated at major domestication loci, including one conferring low alkaloid content.
EST and RNA sequencing libraries both investigate global gene expression in a species, and different treatments can be compared to determine changes in global gene expression. The latter is referred to as differential gene expression analysis, and such analyses have become popular over the last decade in lupin species, with Table 1 summarising RNA-sequencing studies generated to date. These transcriptome studies have also proven extremely useful for the annotation of the two reference genome assemblies for NLL (L. angustifolius) and white lupin (L. albus) (Hufnagel, Marques, et al., 2020; Xu et al., 2020), as discussed in more detail later in this review. The following sections will focus on some specific RNA sequencing studies for narrow-leafed, white, yellow and some wild lupin species.

TABLE 1 Overview of peer-reviewed published next generation sequencing (NGS)-derived transcriptome datasets (RNA, small RNA or degradome) in the various lupin species and their associated GenBank BioProject and/or short read archive identifiers

Whereas the first RNAseq study for this species focussed on identifying as many differentially expressed genes as possible from five different tissue types and aided the annotation of the reference genome, several other studies focussed on specific areas, such as seed biology, alkaloid regulation, phenology and yield. Two studies, conducted in the reference genotype, Tanjil, have focussed on the lupin grain to identify small RNAs that regulate aspects of seed biology (DeBoer et al., 2019) and to identify the key seed storage proteins (Foley et al., 2015). Conglutins are a major class of seed storage proteins, which provide the energy for germinating seedlings. When lupin grain is used as a human food, members of this conglutin family have been demonstrated to have both beneficial effects, for example the ability to reduce glycaemia and enhance satiety, and detrimental effects, as some members can cause allergic reactions.
Using RNAseq data, the authors demonstrated that Tanjil contains at least 16 members of the conglutin family, confirming earlier findings, and they used RNAseq data from five different lupin species to identify their homologous genes. Interestingly, the different members of the conglutin families showed distinct expression patterns between lupin species and high plasticity for conglutin gene expression, thus providing opportunities to remove detrimental conglutins and increase preferred ones to improve the lupin grain as a human food. The first insight into how these conglutins might be differentially regulated came from a small RNA study in 2019, where the authors identified key small RNAs that were highly abundant during seed development; the majority of these small RNAs were predicted to target transcription factors and hormone signalling pathway genes (DeBoer et al., 2019).
Several other studies generated or utilised RNAseq datasets of NLL to identify key genes in the production of quinolizidine alkaloids (QAs). QAs are produced by lupins to protect themselves from insect predation. These secondary metabolites also accumulate in the lupin grain and are toxic to animals and humans and must remain below an industry threshold of 0.02% (Frick et al., 2017).
Modern NLL varieties have a low alkaloid content in the grain but are not alkaloid free. The low alkaloid content in NLL varieties is controlled by a major locus termed iucundus, and Kroc et al. (2019) used a transcriptome-guided approach, comparing bitter and sweet accessions, to identify candidate genes for this locus. This resulted in the identification of 12 candidate genes associated with QA biosynthesis that were differentially expressed between bitter and sweet accessions. Of these, two genes, an APETALA2/ethylene response transcription factor (ERF) and a 4-hydroxytetrahydrodipicolinate synthase (DHDPS) gene, were closely linked to iucundus. DHDPS is involved in the biosynthesis of lysine, an essential building block of QAs, whereas the ERF gene could be the key regulator of the pathway, like the ERF in tobacco that regulates production of the alkaloid nicotine (Liu et al., 2019).
A similar approach comparing bitter and sweet NLL RNAseq data to identify key genes involved in QA biosynthesis was used in two earlier studies (Frick et al., 2018; Yang et al., 2017). As genes involved in secondary metabolite production are often co-expressed, these studies compared expression profiles against those of known QA biosynthesis genes such as a lysine decarboxylase gene (LaLDC); in the study by Yang et al. (2017), this identified 33 genes with expression profiles similar to the known QA biosynthesis genes. One of these was a gene with homology to a copper amine oxidase (LaCAO), and this gene was shown to have strong conservation with other CAO genes and to encode an enzyme able to oxidise cadaverine into 1-piperideine. The first two key enzymes in the QA biosynthesis pathway are thus now known, with LaLDC converting lysine into cadaverine (Bunsupa et al., 2012) and LaCAO oxidising cadaverine into 1-piperideine. A separate study by Frick et al. (2018) also identified LaCAO as co-expressed with LaLDC in a bitter accession. This study utilised the transcriptome datasets generated by Kamphuis et al. (2015) to identify additional candidate genes involved in QA production, where three major latex-like proteins (LaMLP1, LaMLP2 and LaMLP4) were co-expressed with LaLDC and LaODC (Frick et al., 2018). Other candidate genes identified that did not show co-expression with known QA biosynthetic genes were a fourth major latex-like protein (LaMLP3) and a berberine bridge-like enzyme (LaBBE-like) gene. Analysis of these genes by qRT-PCR in different tissues and at various stages of seed development showed that their expression levels remained unchanged during seed development, supporting our current understanding that in NLL, the majority of the QAs are transported to the grain rather than produced within the seed.
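The co-expression reasoning above can be sketched as ranking genes by how closely their expression profile tracks that of a known pathway gene (the 'bait'). The example below is a minimal sketch that assumes Pearson correlation as the similarity measure; the cited studies may have used other statistics. The expression values, the threshold and the names `candidate_1` and `candidate_2` are invented for illustration.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    sxx = sum(d * d for d in dx)
    syy = sum(d * d for d in dy)
    if sxx == 0 or syy == 0:
        return float("nan")  # a flat profile has no defined correlation
    return sum(a * b for a, b in zip(dx, dy)) / sqrt(sxx * syy)

def coexpressed_with_bait(profiles, bait, min_r=0.9):
    """List (gene, r) for genes whose profile correlates with the bait gene."""
    bait_profile = profiles[bait]
    hits = []
    for gene, profile in profiles.items():
        if gene == bait:
            continue
        r = pearson_r(bait_profile, profile)
        if r >= min_r:  # NaN fails this comparison and is skipped
            hits.append((gene, r))
    return sorted(hits, key=lambda hit: -hit[1])

# Invented expression values across five samples (e.g. tissues).
profiles = {
    "LaLDC": [1.0, 2.0, 3.0, 4.0, 5.0],        # bait: known QA gene
    "candidate_1": [2.1, 3.9, 6.2, 8.0, 9.9],  # rises with the bait
    "candidate_2": [5.0, 4.0, 3.0, 2.0, 1.0],  # falls as the bait rises
}
hits = coexpressed_with_bait(profiles, "LaLDC")
print(hits)
```

Correlation alone only nominates candidates; as in the studies described above, functional evidence such as enzyme activity is still needed to confirm a role in the pathway.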
The expression pattern of the functionally characterised QA biosynthesis genes and QA content in the mature grain was subsequently investigated following application of biotic (aphid predation and wounding) and abiotic (drought and high temperature) stress. Both drought and high temperature stress as well as their interaction could affect alkaloid content in the NLL grain, albeit in a genotype specific manner (Frick et al., 2018). In contrast, predation by the green peach aphid (Myzus persicae) did not change QA biosynthetic gene expression or grain QA content.

This is similar to the findings of O'Rourke et al. (2013), who showed that ethylene biosynthesis genes and auxin metabolism and sensing genes were induced in phosphate deficient cluster roots and that cytokinin degradation was also important for cluster root growth. This was confirmed by exogenous application of auxin to white lupin, which promoted cluster root formation, whereas application of exogenous cytokinin inhibited cluster root formation (Meng et al., 2013; O'Rourke et al., 2013). An RNA interference (RNAi) construct targeting a cytokinin oxidase, which was highly upregulated in phosphate deficient cluster roots, was also used to demonstrate the importance of cytokinin degradation.
The third RNAseq study (Secco et al., 2014) further highlighted the importance of auxin and specific transcription factors in the initiation and formation of cluster roots. It also identified transcripts associated with the TCA cycle and glycolysis leading to the production of organic acids, but the data suggested that the root tip is not producing and exporting malate and citrate. These two compounds are known to accumulate in mature cluster roots to facilitate orthophosphate mobilisation, and transcripts encoding malate dehydrogenase were previously shown to be highly upregulated in juvenile clusters in contrast to mature cluster roots, whereas a phosphoenolpyruvate carboxylase was highly expressed in mature clusters, suggesting a burst of malate to mobilise phosphorus.

Finally, the study helped with the generation of an improved dense reference genetic map that facilitated the assigning of scaffolds to pseudochromosomes and aided the fine mapping and identification of candidate genes for various domestication traits such as flowering time and seed shattering as well as disease resistance to anthracnose, an important issue for NLL and other lupins.
In 2020, the genome of white lupin (L. albus) cultivar 'Amiga' was sequenced and assembled by two different research groups in France and China (Hufnagel, Marques, et al., 2020; Xu et al., 2020). The French group utilised a combination of PacBio long reads, Illumina short reads and a Bionano optical map to generate an assembly of 451 Mb captured in 89 contigs. RNAseq data from 10 different tissue types and the proteome of M. truncatula were used to annotate the genome, resulting in the identification of 38,258 protein coding genes (Hufnagel, Marques, et al., 2020). This assembly is housed on the white lupin genome portal (https://www.whitelupin.fr/). The Chinese team used a combination of long read PacBio, Illumina short read and Hi-C data to generate a genome assembly of 559 Mb captured in 1,580 scaffolds. Comparisons of these two assemblies showed good chromosomal synteny with some small inversions. This second assembly was annotated using protein coding sequences of A. thaliana and several legume species as well as RNAseq data from root, leaf and stem tissue, resulting in 48,719 predicted coding sequences (Xu et al., 2020).
Consistent with the findings for NLL by Hane et al. (2017), white lupin has also undergone a whole-genome triplication (WGT) event, and the authors reconstructed the three subgenomes. Through the WGT, the genes involved in phosphorus use efficiency and cluster root formation were shown to have duplicated and diverged (Xu et al., 2020). In addition to the reference genome, 15 modern accessions, landraces and wild accessions were re-sequenced by the French team, and comparison of root formation in wild accessions and modern cultivars revealed that modern accessions establish lateral cluster roots much earlier than their wild counterparts (Hufnagel, Marques, et al., 2020). Furthermore, proteomic analysis of the white lupin seed protein composition showed different profiles for the seed storage protein beta conglutin between wild and modern white lupin varieties. Lastly, the genome assembly was used to identify candidate genes in the region underlying the QTL termed pauper, which controls alkaloid accumulation. Among the 66 candidates in the region of interest are transcription factors and genes encoding acyltransferases and cinnamoyl-CoA reductases (Hufnagel, Marques, et al., 2020). These findings support earlier research, which demonstrated a high association between a marker anchored in the acyltransferase-like (LaAT) gene and a low alkaloid pauper phenotype in white lupin.
While the white lupin assembly by Hufnagel and associates is less fragmented, the assembly by Xu and colleagues is larger in size and has a larger number of predicted coding sequences. This led us to compare the different NLL and white lupin genome assemblies through a Benchmarking Universal Single-Copy Ortholog (BUSCO) analysis (Seppey, Manni, & Zdobnov, 2019). BUSCO analysis allows one to assess the completeness of a genome assembly using the concept of single-copy orthologous genes that should be highly conserved (Table 2). Of these, two BUSCOs (31855at33090 and 40630at33090) are absent in both the NLL and white lupin assemblies. To determine if these are also absent in other legume genomes, we conducted BUSCO analysis on several published legume genomes including alfalfa (Shen, Du, Chen, Lu, & Zhu, 2020), adzuki bean (Yang et al., 2015), barrel medic (Young et al., 2011), chickpea (desi and kabuli; Parween et al., 2015; Varshney et al., 2013), common bean (Schmutz et al., 2014), cowpea (Lonardi et al., 2019), lotus (Sato et al., 2008), mungbean (Kang et al., 2014), pea (Kreplak et al., 2019), peanut (Bertioli et al., 2019), pigeonpea (Varshney et al., 2012), soybean (Schmutz et al., 2010) and subclover (Hirakawa et al., 2016).

the Australian lupin breeding programme wild accessions, seven accessions from the western Mediterranean were used and one from the eastern Mediterranean (Cowling, 2020). These studies also highlighted that there is much higher linkage disequilibrium in NLL cultivars relative to their wild counterparts. Therefore, breeding in novel beneficial loci can be hampered by unwanted linkage to undesirable loci.
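To make the linkage disequilibrium (LD) comparison concrete, a common pairwise measure is r², the squared correlation between alleles at two loci. The sketch below assumes phased biallelic haplotypes coded 0/1, which is a reasonable simplification for inbred lines; the function name `ld_r2` and the toy data are illustrative and are not taken from the cited studies.

```python
def ld_r2(locus_a, locus_b):
    """Pairwise LD (r squared) between two biallelic loci.

    locus_a and locus_b hold one 0/1 allele per haplotype, in the same
    haplotype order. Hypothetical helper for illustration only.
    """
    if len(locus_a) != len(locus_b) or not locus_a:
        raise ValueError("loci must be non-empty and of equal length")
    n = len(locus_a)
    p_a = sum(locus_a) / n  # frequency of allele 1 at locus A
    p_b = sum(locus_b) / n  # frequency of allele 1 at locus B
    p_ab = sum(1 for a, b in zip(locus_a, locus_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b    # disequilibrium coefficient D
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator == 0:
        raise ValueError("r2 is undefined when a locus is monomorphic")
    return d * d / denominator

# Toy haplotypes: alleles always travel together versus assort independently.
print(ld_r2([0, 0, 1, 1], [0, 0, 1, 1]))
print(ld_r2([0, 0, 1, 1], [0, 1, 0, 1]))
```

When r² between a beneficial locus and a neighbouring undesirable locus is high, selecting for one tends to carry the other along, which is the linkage drag problem noted above.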
To further understand the genetic diversity of NLL, a pan-genome is also being developed by our group that should capture the genetic diversity for the species. Here, genetically diverse wild accessions as well as domesticated European and Australian varieties have been selected based on the previous diversity studies for NLL. Similarly, in white lupin, the first pan-genome for a lupin species was recently published, where 38 white lupin accessions representing cultivars, landraces and wild accessions were de novo assembled, aligned to the reference and a pan-genome assembled.

helped to find a single origin for L. mutabilis and pinpointed the area and timing of domestication. In the case of lupin diversification, an initial landmark study used nuclear DNA sequences from Andean Lupinus species, involving 53 accessions from 36 species, to show that these Andean lupin species exhibited very recent and rapid diversification (Hughes & Eastwood, 2006). Using a similar approach, Drummond (2008) used three regions of rapidly evolving chloroplast DNA to study North American lupin species and showed that these were also undergoing rapid diversification. A follow-up study used further DNA analysis to provide additional support for the rapid rates of diversification for Andean lupin species and to support the view that the drivers of this high rate of diversification were a shift from an annual to a perennial life cycle and adaptation to higher altitude environments (Drummond et al., 2012). Overall, these initial studies showed that these diversification rates were among the highest documented for plants and were still accelerating (Drummond, 2008; Drummond et al., 2012; Hughes & Eastwood, 2006).
The mechanisms underlying these rapid diversifications in some lupin species were explored using transcriptome sequencing for 55 New World lupin species that included annual and perennial species from both North America and South America. In the rapidly diversifying perennial species, the study found major increases in the frequency of adaptive evolution in both coding sequences and regulatory regions compared with what was found for the annual lupin species that were undergoing low rates of diversification. These results indicated that both processes, changes in protein sequence and changes in gene expression, were underlying the rapid diversification observed in some New World lupins.

Genotyping-by-sequencing (GBS) is a powerful marker-assisted selection tool for crop breeding (He et al., 2014) that involves sequencing of many samples, which can be multiplexed to allow for both marker discovery and genotyping for a crop species. As mentioned briefly in the lupin transcriptome section, GBS approaches have been used in white lupin to identify loci associated with vernalisation responsiveness, anthracnose resistance and seed alkaloid content (Książkiewicz et al., 2017). An alternative high-throughput genotyping approach, DArT sequencing, has been utilised in NLL (Mousavi-Derazmahalleh), and such genotyping approaches can be utilised to conduct GWAS or genomic prediction/selection.

| OTHER DEVELOPING GENETIC AND GENOMIC RESOURCES
Genomic selection is another advanced breeding method that combines genotyping and phenotyping data from a germplasm sample of interest, such as a collection of inbred lines or landraces in the case of crops, into a prediction model. The prediction model is initially based on the genotypic and phenotypic data of a training population, which allow the prediction of individuals from a breeding programme that will perform better under certain situations, for example, drought conditions (reviewed in Bhat et al., 2016; Voss-Fels et al., 2019). Once the model has been validated, it can be applied to much larger germplasm populations without requiring the large amount of corresponding phenotyping and selection of a more classical breeding approach and therefore lends itself particularly well to tackling complex, polygenic traits such as yield.
In the case of lupins and genomic selection, good progress has been made in white lupin (Annicchiarico et al., 2019). Annicchiarico and colleagues examined the utility of genomic selection to predict grain yield and several other agronomic traits in various white lupin germplasm accessions. They evaluated two major statistical models for genomic selection in white lupin, the ridge regression BLUP method (Meuwissen et al., 2001) and the Bayesian Lasso method (Park & Casella, 2008). They found that both models had similar predictive ability when applied to the prediction of grain yield and other traits such as winter survival and the onset of flowering, based primarily on the analysis of 83 white lupin landraces from diverse cropping regions.
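The train-then-predict logic of genomic selection can be sketched with a ridge regression estimate of marker effects, the idea underlying ridge regression BLUP. The example below is a minimal sketch on simulated data, not the analysis of the cited study: the function names, the marker coding (0/1/2), the population sizes and the fixed shrinkage value `lam` are all assumptions. In practice the shrinkage is usually derived from estimated variance components or chosen by cross-validation.

```python
import numpy as np

def fit_rrblup(Z, y, lam):
    """Ridge estimate of marker effects: u = (Zc'Zc + lam*I)^-1 Zc'(y - mean).

    Z is a lines x markers genotype matrix; columns are centred internally.
    Hypothetical helper for illustration only.
    """
    Z = np.asarray(Z, dtype=float)
    y = np.asarray(y, dtype=float)
    col_means = Z.mean(axis=0)
    Zc = Z - col_means
    mu = y.mean()
    lhs = Zc.T @ Zc + lam * np.eye(Z.shape[1])
    effects = np.linalg.solve(lhs, Zc.T @ (y - mu))
    return mu, col_means, effects

def predict_rrblup(model, Z_new):
    """Predict phenotypes for new lines from their marker genotypes."""
    mu, col_means, effects = model
    return mu + (np.asarray(Z_new, dtype=float) - col_means) @ effects

# Simulated population: 120 lines, 50 markers coded 0/1/2 (all invented).
rng = np.random.default_rng(0)
n_lines, n_markers = 120, 50
Z = rng.integers(0, 3, size=(n_lines, n_markers)).astype(float)
true_effects = rng.normal(0.0, 1.0, n_markers)
y = 10.0 + Z @ true_effects + rng.normal(0.0, 1.0, n_lines)

# Train on the first 90 lines, then predict the 30 held-out lines.
train, test = np.arange(90), np.arange(90, n_lines)
model = fit_rrblup(Z[train], y[train], lam=1.0)
predicted = predict_rrblup(model, Z[test])

# Predictive ability: correlation of predicted and observed phenotypes.
predictive_ability = float(np.corrcoef(predicted, y[test])[0, 1])
print(round(predictive_ability, 2))
```

Predictive ability measured on held-out lines in this way is the kind of statistic used to compare models such as ridge regression BLUP and the Bayesian Lasso.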

| CONCLUSION AND FUTURE DIRECTION
This review has highlighted the development of a growing suite of genomic resources for two key lupin crops, NLL and white lupin, and these advances offer great promise to accelerate yield gains for these important crop species while helping to further our understanding of many aspects of lupin and indeed legume biology. Figure 1 summarises the major genomic resources that have been or are being developed for both crops. Other tools that will help in this regard are the development and utilisation of efficient gene editing processes, the widespread implementation of speed breeding approaches for both crops and the development of proteomic and metabolomic resources; for example, a pan-proteome of the lupin grain would be a valuable resource.
Collectively, these resources will help researchers and breeders tackle some of the pressing issues facing these crops. For example, while domestication of NLL has had some success, the genetic base of domesticated accessions has been shown to be quite narrow (Berger et al., 2013), which restricts efforts to improve the crop. However, the development of an NLL pan-genome with over 40 wild accessions will provide valuable insight into the core and variable gene set for the crop and facilitate efforts to increase diversity in the breeding of new cultivars. Efforts are also underway to develop genomic resources for the other two domesticated lupin species, namely, yellow lupin and pearl lupin. These combined genomic and related resources will help researchers and breeders to improve lupin crops to help meet the rapidly growing demand for plant protein and the establishment of more sustainable cropping systems, where lupin crops can be important contributors.