Genetics, genomics and breeding of groundnut (Arachis hypogaea L.)

Abstract Groundnut is an important food and oil crop in the semiarid tropics, contributing to household food consumption and cash income. In Asia and Africa, yields are low owing to various production constraints. This review highlights advances in genetics, genomics and breeding to improve the productivity of groundnut. Genetic studies concerning inheritance, genetic variability and heritability, combining ability and trait correlations have provided a better understanding of the crop's genetics to develop appropriate breeding strategies for target traits. Several improved lines and sources of variability have been identified or developed for various economically important traits through conventional breeding. Significant advances have also been made in groundnut genomics, including genome sequencing, marker development and genetic and trait mapping. These advances have led to a better understanding of the groundnut genome, discovery of genes/variants for traits of interest and integration of marker-assisted breeding for selected traits. The integration of genomic tools into the breeding process, accompanied by increased precision of yield trialing and phenotyping, will increase the efficiency and enhance the genetic gain for release of improved groundnut varieties.

Cultivated groundnut (Arachis hypogaea L.) is an allotetraploid (AABB, 2n = 4x = 40), which is believed to be the result of hybridization between two wild species, Arachis duranensis (AA-genome, 2n = 2x = 20), the "A-genome ancestor", and Arachis ipaensis (BB-genome, 2n = 2x = 20), the "B-genome ancestor", followed by chromosome doubling. Based on the patterns of reproductive and vegetative branching and on pod morphology, the cultivated species is divided into two subspecies, A. hypogaea subsp. hypogaea and A. hypogaea subsp. fastigiata.
Groundnut is grown in more than 100 countries, covering over 26 million (M) hectares (ha) in 2014, with a global production of about 44 M metric tons and an average yield of about 1,655 kg/ha (FAOSTAT, 2017). Asia (58.3%) and Africa (31.6%) accounted for about 90% of the world's production, with China (16.6 M tons), India (6.6 M tons) and Nigeria (3.4 M tons) being the three largest producers (FAOSTAT, 2017). The groundnut seed contains 22% to 30% protein and 35% to 60% oil and is a rich source of dietary fibre, minerals, vitamins and bioactive compounds, hence contributing to household nutrition. It is suitable for making nutrient-dense foods for alleviating malnutrition in vulnerable groups such as pregnant and breastfeeding women and children under 2 years, particularly in developing countries (Anim-Somuah, Henson, Humphrey, & Robinson, 2013). The haulms and groundnut cake are important sources of animal feed. In addition, groundnut fixes atmospheric nitrogen, benefitting the succeeding crop. As a cash crop, it is frequently traded locally, regionally and globally, contributing significantly to rural household cash income and the national economy. In West and Central Africa (WCA), for example, groundnut accounts for roughly half or more of rural household cash income in many countries: 46% in Mali, 54% in Nigeria, 66% in Niger and 80% in Senegal (GAIN, 2010; Ndjeunga et al., 2010). In Asia and Africa, a large number of women and youth are engaged in the cultivation, processing and marketing of groundnut, which contributes to their economic participation and empowerment. In Nigeria, for example, almost all small-scale groundnut oil processing is controlled by women. In Mali, about 85% of groundnut fields are owned by women (Ndjeunga et al., 2010).

Groundnut productivity varies significantly among regions, with Africa having the lowest mean yield of around 965 kg/ha (FAOSTAT, 2017). In Asia, productivity is relatively better, with an average yield of 2,370 kg/ha. In the USA and other developed countries, yields are high, exceeding 3,300 kg/ha. In general, groundnut productivity has increased significantly over the last five decades, with the global average yield rising from 849 kg/ha in 1961 to 1,655 kg/ha in 2014, which is attributed to significant advances in genetics, genomics, breeding and crop management. This paper reviews the advances in understanding the genetics of important traits, genome sequences, molecular marker development, QTL analysis, genetic resources, breeding for specific traits and integration of genomic tools into the groundnut breeding process to enhance the genetic gain and improve the productivity of the crop.
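The global yield figures quoted above can be turned into an average annual rate of gain. The short sketch below does this under the simplifying assumption of a constant compound growth rate between 1961 and 2014; the function name is illustrative and the calculation is not taken from FAOSTAT.

```python
def compound_annual_growth(start_value, end_value, years):
    """Average compound growth rate per year between two observations."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Global mean groundnut yield (kg/ha) reported in the text for 1961 and 2014.
rate = compound_annual_growth(849.0, 1655.0, 2014 - 1961)
print(f"Average yield growth: {rate * 100:.2f}% per year")
```

Under this assumption, the near doubling of yield over 53 years corresponds to a gain of roughly 1.3% per year.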

| GENOMICS
Limited genomic resources existed for groundnut prior to 2005. However, significant advances have been made in recent years in genome sequencing, development of molecular markers, construction of genetic maps and quantitative trait locus (QTL) analyses. Various marker systems including RFLP (restriction fragment length polymorphism), RAPD (random amplification of polymorphic DNA), AFLP (amplified fragment length polymorphism), DArT (diversity array technology), SSR (simple sequence repeat) and SNPs (single-nucleotide polymorphisms) were developed Varshney, 2016) and have been utilized for genetic diversity analyses, constructing genetic maps, mapping of traits of breeding interest and marker-assisted breeding. The emphasis has been on SSR and SNP markers because of their usefulness and practicality. SSR markers are codominant, more informative and easy to score in the tetraploid genome, while SNP markers are highly amenable to high-throughput genotyping approaches. Consequently, sets of expressed sequence tag (EST)-based SSR markers ranging in size from 26 (Hopkins et al., 1999) to 6,455 (Peng, Gallo, Tillman, Rowland, & Wang, 2016) have been reported. Similarly, large numbers of SNP markers have been developed, including 8,486 candidate SNPs from a screening of sequences of 17 genotypes assembled along with sequences from the reference 'Tifrunner' transcriptome (Alves et al., 2008; GCP, 2011), which were used to construct a 1,536-SNP GoldenGate assay (Nagy et al., 2012).
Another 768-SNP Illumina GoldenGate assay was developed at the University of California-Davis. These assays were found to be very informative for genotyping diploid species, but of limited use for tetraploid species (Pandey et al., 2012). Zhou et al. (2014) reported the development of 53,257 SNPs for tetraploid species. Additional SNPs have become available, including 62 SNPs (Hong et al., 2015), 263,840 SNPs and indel variants (Chopra et al., 2015), 11,902 SNPs and 6,965 SNPs (Peng et al., 2017). Besides, 96 SNP markers were converted to Kompetitive allele-specific PCR (KASPar) SNP markers to develop KASPar assays designated as GKAMs (groundnut KASPar assay markers) for use in LGC's KASP genotyping service (Khera et al., 2013). Similarly, easy-to-use KASP markers linked to root-knot nematode (RKN) resistance loci were developed and validated in a tetraploid context (Leal-Bertioli et al., 2015).
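To make concrete what SSR marker discovery looks for, the sketch below scans a DNA string for perfect tandem repeats of di- and trinucleotide motifs. It is a minimal illustration only: the function name, the thresholds and the toy sequence are assumptions made here, and the studies cited above used dedicated SSR mining tools with their own criteria.

```python
import re


def find_ssrs(sequence, min_repeats=5, motif_lengths=(2, 3)):
    """Return sorted (start, motif, repeat_count) tuples for perfect tandem repeats."""
    seq = sequence.upper()
    hits = []
    for k in motif_lengths:
        # ([ACGT]{k}) captures a motif; \1{n,} requires n or more further tandem copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if len(set(motif)) == 1:
                continue  # skip homopolymer runs such as AAAAAA
            hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


# Hypothetical toy sequence: an (AG)6 repeat and a (CTT)5 repeat in short flanks.
example = "GGCC" + "AG" * 6 + "TTCA" + "CTT" * 5 + "GA"
for start, motif, count in find_ssrs(example):
    print(f"position {start}: ({motif}){count}")
```

Primers designed in the flanking sequence of such repeats amplify fragments whose length varies with the repeat count, which is what makes SSRs codominant and multi-allelic.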
Genetic maps were constructed to understand the groundnut genome structure and organization and to identify QTLs for traits of breeding interest. Different marker systems such as RFLP (Halward, Stalker, & Kochert, 1993), RAPD (Garcia, Stalker, Schroeder, Lyerly, & Kochert, 2005), AFLP (Herselman, Thwaites, Kimmins, & Seal, 2004), SSR (Moretzsohn, Barbosa, Alves-Freitas, Teixeira, & Leal-Bertioli, 2009), SNP and DArT were employed to construct the genetic maps, but the majority of maps were based on SSR markers from biparental populations (Table 1). Earlier SSR-based genetic maps had lower marker density (e.g., 135 markers; Varshney et al., 2009), but as more SSR markers became available, denser maps were developed (e.g., 1,469 markers; Shirasawa et al., 2013). SNP and other markers were integrated into some of the genetic maps. Besides, six consensus maps were developed, the first with 175 loci (Hong et al., 2010) and the latest with 3,693 loci (Shirasawa et al., 2013), which are useful for the characterization of the groundnut genome. Specifically, the consensus map constructed by Shirasawa et al. (2013) from 16 segregating populations of diverse genetic backgrounds enabled the mapping of a larger number of loci with greater genome coverage than any of the genetic maps from single populations and was useful for determining the relative position of common markers across different mapping populations. While many genetic maps were developed with a focus on mapping the maximum number of loci onto a single map (e.g., Foncéka et al., 2009; Hong et al., 2008, 2010; Shirasawa et al., 2013; Wang et al., 2012), the majority were developed to facilitate QTL analysis (trait mapping) and the development of diagnostic markers for marker-assisted breeding.
QTL analysis studies to date have reported the identification of more than 1,380 small and major effect QTLs (Table 2) for various traits including agronomic and yield component traits (e.g., Selvaraj et al., 2009), quality traits (e.g., Sarvamangala et al., 2011; Shasidhar et al., 2017), biotic stress resistance (e.g., Khedikar et al., 2010; Kolekar et al., 2016; Pandey, Wang, et al., 2017; Pandey, Khan, et al., 2017; Zhou et al., 2016) and abiotic stress resistance, mainly for drought-related traits (e.g., Leal-Bertioli et al., 2016; Varshney et al., 2009).
Another significant advance in groundnut genomics has been the release of draft genome sequences for the A-genome progenitor (A. duranensis, accession V14167; genome size 1.1 Gb) and the B-genome progenitor (A. ipaensis, accession K30076; genome size 1.38 Gb). In addition, the draft genome sequence of another A-genome progenitor accession (A. duranensis, accession PI475845) was generated with a genome size of 1.07 Gb, which provided greater insights into the genome architecture and genes related to important traits such as geocarpy, oil biosynthesis and allergens. For the cultivated tetraploid, a high-quality genome assembly of 'Tifrunner', an important US variety with good market and growth characteristics and resistance to several diseases, was released in December 2017 (https://peanutbase.org/peanut_genome). The draft genome sequences have enabled large-scale genomewide discovery of 515,223 indels as well as SSRs, including 105,003 SSRs in the A-genome, 135,529 SSRs in the A-genome (Zhao et al., 2017), 199,957 SSRs in the B-genome (Zhao et al., 2017), 84,383 in the A-genome and 120,056 in the B-genome. In addition, 41 groundnut accessions and wild diploid ancestors were sequenced against the genomes of the two groundnut progenitors, A. duranensis and A. ipaensis (Pandey, Agarwal, et al., 2017), and the resulting data were used to identify signatures of selection and tetrasomic recombination in groundnut.
To understand the genetic architecture of domestication-related traits in groundnut, the specific-locus amplified fragment sequencing (SLAF-seq) method was employed for large-scale identification of 17,338 high-quality SNPs across the groundnut genome, and 1,429 candidate genes for eleven agronomic traits were found using genomewide association studies in 158 peanut accessions.

| Focus traits and breeding methods
Priority traits in groundnut breeding include high pod yield, early maturity, high shelling percentage, high oil content, resistance to biotic and abiotic stresses, fresh seed dormancy, confectionery quality, high oleic acid and dual-purpose types. In the USA and other developed countries, under high-input production systems, the breeding focus has been on maximizing yield, but in recent years, improved quality and flavour and resistance to drought and diseases have become important priorities. In Asia and Africa, the focus has been on increasing pod yield with enhanced resistance to biotic and abiotic constraints and high oil content. Conventional breeding approaches such as introduction, selection, mutation and hybridization (pedigree, backcross and single-seed descent, etc.) have been used to develop improved varieties. In the USA, although mutation breeding was used extensively from the late 1950s to the early 1970s, it is little used at present (Holbrook & Stalker, 2003).

| Genetic resources
Genetic resources are important sources of variability for traits of breeding interest and serve as reservoirs of many useful genes for present and future groundnut improvement programmes. Groundnut accessions are conserved globally in national and international gene banks, including those at ICRISAT and in the USA, Brazil, India and China (Ntare, Waliyar, Mayeux, & Bissala, 2006; Pandey et al., 2012).
The majority of these accessions have been characterized for various morphoagronomic and biochemical traits using groundnut descriptors (IBPGR and ICRISAT, 1992; Jiang & Duan, 2006; Pittman, 1995), and large variation for qualitative and quantitative traits, seed quality traits and resistance to biotic and abiotic stresses was observed (Barkley, Upadhyaya, Liao, & Holbrook, 2016). Diversity studies using molecular markers revealed generally low diversity within the cultivated types (e.g., Halward, Stalker, Larue, & Kochert, 1991; He & Prakash, 1997; Herselman, 2003; Hopkins et al., 1999; Moretzsohn et al., 2004), but moderate-to-high polymorphisms were also reported (e.g., Cuc et al., 2008 specific lines (Gowda, Upadhyaya, Sharma, Varshney, & Dwivedi, 2013). It is also costly to screen large collections for specific traits of breeding interest (Holbrook & Stalker, 2003). A subset that represents the genetic diversity, facilitates easier access to the genetic resources and enhances their use in crop improvement programmes was therefore required. Hence, core and minicore collections were established in China (Jiang et al., 2008) and the USA (Holbrook, Anderson, & Pittman, 1993; Holbrook & Dong, 2005) Dwivedi, Sharma, et al., 2014) and also offer important variability for agronomic traits including yield (Upadhyaya, Dwivedi, Sharma, et al., 2014). Hence, several lines have been developed through interspecific hybridization to increase the variability for important traits, and some improved varieties were released.

| Drought
With more than 70% of the groundnut area being in the semiarid tropics, drought is a major production constraint.
TABLE 3 Some sources of variability identified/developed for traits of breeding interest in groundnut
An empirical approach, a trait-based approach or a combination of both is used to phenotype for drought resistance. The empirical approach involves selection based on pod and grain yield under imposed drought stress conditions. The trait-based approach involves phenotyping for traits such as harvest index (HI), total amount of water transpired (T), transpiration efficiency (TE) and water use efficiency (WUE). Positive correlations were reported between TE and pod yield under water-stressed environments (Devi et al., 2011; Sanogo, 2016).
Note. a These are derivatives of a cross involving ICGV-SM 83708 (CG7) and ICGV-SM 90704 (Monyo & Varshney, 2016). b The varieties were released based on their performance for one or more important traits including high yield, drought resistance, foliar disease resistance, rosette resistance, etc.
Janila, Manohar, Rathore, and Nigam (2015) observed low heritability for SPAD chlorophyll meter reading (SCMR) and specific leaf area (SLA). On the other hand, high correlations of both SCMR and SLA with pod yield and other economic traits such as 100-seed weight were reported (Janila et al., 2015; Songsri et al., 2009; Upadhyaya, 2005; Upadhyaya et al., 2011). High heritability and a lower G × E interaction for the surrogate traits were also reported (Songsri et al., 2009; Upadhyaya et al., 2011). Varshney et al. (2009) reported moderate-to-high heritability for drought-related traits and identified alleles with moderate additive effects. Additive effects alone, as well as both additive and nonadditive effects, were also reported (Lal, Hariprasanna, Rathnakumar, Gor, & Chikani, 2006; Nigam et al., 2001). A combined use of the empirical and trait-based selection approaches has been suggested under drought stress conditions (Devi et al., 2011; Janila et al., 2015; Nigam et al., 2005), as it would be advantageous in selecting genotypes that are more efficient water utilizers or partitioners of photosynthates into economic yield.
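The trait-based approach is often motivated by a simple yield identity in which pod yield under water-limited conditions is the product of water transpired, transpiration efficiency and harvest index. That identity is an assumption brought in here for illustration and is not stated in this review; the function name, units and numerical values below are likewise hypothetical.

```python
def water_limited_pod_yield(transpiration_kg, te_g_per_kg, harvest_index):
    """Pod yield (g/plant) as T x TE x HI.

    transpiration_kg: water transpired per plant (kg), i.e. T
    te_g_per_kg:      biomass produced per kg of water transpired (g/kg), i.e. TE
    harvest_index:    fraction of biomass partitioned to pods (0 to 1), i.e. HI
    """
    if not 0.0 <= harvest_index <= 1.0:
        raise ValueError("harvest_index must lie between 0 and 1")
    return transpiration_kg * te_g_per_kg * harvest_index


# Two hypothetical genotypes that transpire the same amount of water.
efficient_user = water_limited_pod_yield(20.0, 2.5, 0.40)    # higher TE
better_partitioner = water_limited_pod_yield(20.0, 2.0, 0.50)  # higher HI
print(efficient_user, better_partitioner)
```

In this sketch the two genotypes reach the same pod yield by different routes, which is why combining empirical selection for yield with selection for component traits can help distinguish efficient water users from efficient partitioners.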

| Leaf spots
Early leaf spot (ELS) and late leaf spot (LLS), caused by Cercospora arachidicola Hori and Cercosporidium personatum (Berk. & Curt.) Deighton, respectively, are the most common and serious diseases of groundnut and can cause pod yield losses of over 50% (Mayeux & Ntare, 2001; McDonald, Subrahmanyam, Gibbons, & Smith, 1985). Field and laboratory screening methods involve sowing genotypes in replicated plots, with rows of a highly susceptible cultivar arranged systematically throughout the trial and good disease development ensured through the provision of inoculum. A 9-point disease scale is used to score reactions separately for the two leaf spots. Earlier germplasm screenings resulted in the identification of promising lines as resistance sources (Subrahmanyam, Moss, McDonald, Subba Rao, & Rao, 1985), and since then, many additional lines have become available as good sources of resistance (GCP, 2011; Izge, Mohammed, & Goni, 2007; Kanyika et al., 2015; Monyo & Varshney, 2016).

| Rosette
Groundnut rosette disease (GRD), caused by the groundnut rosette virus (GRV), groundnut rosette assistor virus (GRAV) and satellite RNA, is a devastating disease. A method for simultaneous detection of the three causal agents has been published (Anitha, Monyo, & Okori, 2014).
Resistance among these cultivars was effective against both chlorotic and green rosette forms of the disease and was governed by two independent recessive genes (Nigam & Bock, 1990; Olorunju, Kuhn, Demski, Misari, & Ansa, 1992). Breeding with these cultivars resulted in the development of long-duration Virginia cultivars and early- and medium-maturing Spanish types (GCP, 2011; Mayeux et al., 2003; Monyo & Varshney, 2016; Ntare et al., 2002).
(Mayeux et al., 2003). More recently, seven accessions with consistently very low aflatoxin accumulation were identified (Waliyar et al., 2016). However, G × E interaction remains a major issue in screening for aflatoxin resistance, and generally, little progress has been made in using conventional breeding to enhance host-plant resistance to aflatoxin contamination (Waliyar et al., 2016). Even if some elite lines were recommended for cultivation in India

| Quality
Oil and oleic acid content and confectionery traits are among the important quality traits. Various physical, sensory, chemical and nutritional factors determine the quality of groundnut, for which substantial genetic variability exists (Dwivedi & Nigam, 2005). Near-infrared reflectance spectroscopy (NIRS), a robust and nondestructive method, is gaining popularity for the estimation of oil, protein, carbohydrate and fatty acid contents. It is also cost-effective compared with wet chemistry. At ICRISAT, a large number of the accessions screened had 34%-55% oil content (Dwivedi & Nigam, 2005). Several advanced lines for high oil content have also been recently developed Janila unpublished

| Marker-assisted breeding
Genomic tools enhance the crop breeding process by increasing the efficiency, speed and precision of breeding to develop improved varieties. Diagnostic molecular markers linked with traits of breeding interest (or major effect QTLs) were identified for root-knot nematode (Choi et al., 1999; Chu, Holbrook, Timper, & Ozias-Akins, 2007; Church, Simpson, Burow, Paterson, & Starr, 2000; Garcia, Stalker, Schroeder, & Kochert, 1996; Simpson, 2001), rust (Khedikar et al., 2010; Mondal, Badigannavar, & D'Souza, 2012), rust and LLS (Kolekar et al., 2016; Sujay et al., 2012), nutritional quality traits (Chen, Wang, Barkley, & Pittman, 2010; Chu, Holbrook, & Ozias-Akins, 2009; Sarvamangala et al., 2011; Wilson et al., 2017), TSWV (Tseng, Tillman, Peng, & Wang, 2016) and growth habit. Some of these linked markers have been validated and deployed for marker-assisted selection (MAS) and marker-assisted backcrossing (MABC). In the USA, MAS has been used for pyramiding nematode resistance and the high oleic trait (Chu et al., 2011). At ICRISAT, MABC was employed to transfer a major rust resistance QTL from GPBD 4 to three popular varieties (ICGV 91114, JL 24 and TAG 24), resulting in the development of rust-resistant lines with a 56%-96% increase in pod yield. Some of these lines were also found to be resistant to LLS, with 39%-79% higher mean pod yield. Besides, MAS and MABC were used to enhance the oil quality traits in three groundnut varieties (ICGV 06110, ICGV 06142 and ICGV 06420) by transferring FAD2 mutant alleles from SunOleic 95R. A large number of lines with increased oleic acid in the range of 62%-83% were identified (Janila, Pandey, Shasidhar, et al., 2016), which are currently being evaluated for yield (Janila, pers. comm.). At the University of Agricultural Sciences, Dharwad, India, MABC was used to improve JL 24 with GPBD 4 as the donor parent (Yeri & Bhat, 2016).
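One reason MABC programmes run for several backcross generations is the gradual recovery of the recurrent parent genome. As a rough guide, and assuming no selection on the genetic background and no linkage drag (assumptions made here, not results from the studies cited above), the expected recurrent parent fraction after n backcrosses is 1 - (1/2)^(n+1). The function name below is illustrative.

```python
def expected_recurrent_parent_fraction(n_backcrosses):
    """Expected share of the recurrent parent genome after n backcrosses.

    n = 0 corresponds to the F1 (50%); each backcross halves the donor share.
    """
    if n_backcrosses < 0:
        raise ValueError("n_backcrosses must be zero or positive")
    return 1.0 - 0.5 ** (n_backcrosses + 1)


for generation in range(1, 5):
    share = expected_recurrent_parent_fraction(generation)
    print(f"BC{generation}: {share * 100:.2f}% recurrent parent genome")
```

Marker-based background selection aims to identify individuals that exceed these averages in early generations, which is how MABC can shorten the number of backcrosses needed.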
Similarly, MABC was employed to improve TMV 2 for LLS and rust resistance using GPBD 4, where two backcross lines showed enhanced resistance to LLS and rust along with 71.0% and 62.7% increases in pod yield over TMV 2 (Kolekar et al., 2017). For other important quantitative traits such as drought tolerance and yield components, QTL analyses using biparental populations revealed few major-effect QTLs and rather several small-effect QTLs. Genomewide association studies for 50 agronomic traits using 300 genotypes from the "reference set" identified a total of 524 highly significant marker-trait associations (MTAs) for 36 traits (Pandey, Upadhyaya, et al., 2014), indicating complex genetic control. Breeding approaches such as marker-assisted recurrent selection and genomic selection are preferred for the introgression of a larger number of small-effect QTLs, but such approaches have not been widely used in groundnut.
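To indicate how genomic selection handles many small-effect loci at once, the sketch below estimates all marker effects jointly by ridge regression and then predicts breeding values for untested candidates. It is a minimal illustration under stated assumptions: the data are simulated and noise-free, markers are coded as 0/1/2 allele dosages (a diploid-style simplification for an allotetraploid crop), and the function names and shrinkage value are hypothetical rather than taken from any groundnut study.

```python
import numpy as np


def fit_ridge_markers(genotypes, phenotypes, shrinkage):
    """Estimate all marker effects jointly with a ridge penalty."""
    X = np.asarray(genotypes, dtype=float)
    y = np.asarray(phenotypes, dtype=float)
    marker_means = X.mean(axis=0)
    Xc = X - marker_means                      # centre marker scores
    intercept = y.mean()
    # Solve (Xc'Xc + lambda * I) beta = Xc'(y - mean(y)).
    lhs = Xc.T @ Xc + shrinkage * np.eye(X.shape[1])
    effects = np.linalg.solve(lhs, Xc.T @ (y - intercept))
    return intercept, marker_means, effects


def predict_breeding_values(genotypes, intercept, marker_means, effects):
    """Genomic estimated breeding values for new genotypes."""
    X = np.asarray(genotypes, dtype=float)
    return intercept + (X - marker_means) @ effects


# Simulated training population: 60 lines scored at 8 markers.
rng = np.random.default_rng(0)
train = rng.integers(0, 3, size=(60, 8))
true_effects = np.array([0.5, -0.2, 0.1, 0.0, 0.3, -0.1, 0.05, 0.2])
observed = 10.0 + train @ true_effects         # noise-free for illustration

model = fit_ridge_markers(train, observed, shrinkage=1e-8)
candidates = rng.integers(0, 3, size=(5, 8))   # genotyped but not phenotyped
gebv = predict_breeding_values(candidates, *model)
print(np.argsort(gebv)[::-1])                  # candidates ranked best first
```

With real data the phenotypes are noisy and markers usually outnumber lines, so the shrinkage parameter would be chosen by cross-validation or from variance components rather than set near zero as in this toy case.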

| CONCLUSION AND FUTURE PERSPECTIVES
Significant progress has been made in groundnut genetics, genomics and breeding, thus contributing to the increased productivity and production of groundnut globally although the rate of increase varies among regions. It is worth mentioning that the progress has been achieved through strong partnership and collaborations between scientists from national research systems, international research institutes, universities, and private research organizations and service providers. Globally, large numbers of groundnut lines were identified or developed as sources of variability for important traits and many improved varieties were released for target environments by breeding programmes. The last decade has witnessed the rapid development of genomic tools helping to better understand the groundnut genome. MAS and MABC have proved useful for selected traits.
Emerging trait mapping approaches are expected to help the search for linked markers for other traits and the development of diagnostic markers for breeding applications. The availability of the diploid and tetraploid genome sequences will provide more opportunities to identify useful genetic variation for breeding at a genome scale, discover genes of breeding interest and identify additional molecular markers amenable to high-throughput genotyping. High-throughput genotyping technologies are advancing fast, and genotyping costs are falling. It will not be long before such technologies are routinely utilized by many breeding programmes, if not all, for screening segregating populations, purity testing, genetic mapping, targeted resequencing of specific genomic regions and other studies. In summary, groundnut improvement tools are available to exploit and build on past achievements for new discoveries to enhance and accelerate the genetic gain of breeding programmes, such that the processes for the development and release of improved varieties are speedy, technically efficient and cost-effective.

ACKNOWLEDGMENTS
The authors are thankful to the Bill & Melinda Gates Foundation (BMGF) for financial support under the TL III project: Opportunity/Contract ID OPP1114827. The BMGF financial support has contributed significantly to groundnut research and development globally, and in Asia and Africa particularly, through the tropical legume (TL) projects.