Plant genomics in Africa: present and prospects

Plants are the world's most consumed goods and they possess high economic and health values. In most countries in Africa, the supply and quality of food must rise to meet the growing population's increasing demand. Genomics and other biotechnology tools offer the opportunity to improve subsistence crops and medicinal herbs on the continent. Significant advances have been made in plant genomics, which have enhanced our knowledge of the molecular processes underlying both plant quality and yield. The sequencing of complex genomes of African plant species, facilitated by continuously evolving next-generation sequencing technologies and advanced bioinformatics approaches, has provided new opportunities for crop improvement. This review summarizes the achievements of genome sequencing projects of endemic African plants in the last two decades. We also present perspectives and challenges for future plant genomic studies that will accelerate important plant breeding programs for African communities. These challenges include a lack of basic facilities and a lack of skills in genomics study design, sequencing, and bioinformatics analysis. However, it is imperative to state that African countries have become key players in the plant genome revolution and genome-derived biotechnology. Therefore, African governments should invest in public plant genomics research and applications, establish bioinformatics platforms and training, and stimulate university-industry partnerships to fully deploy plant genomics, particularly in agriculture and medicine.


INTRODUCTION
Currently, the world is facing several challenges, including public health and food security for an ever-increasing population of approximately 8 billion people worldwide, with over 600 million living in Africa (Conway, 2012; Philp, 2018). In Africa, agriculture is one of the most prominent economic sectors, and about 70% of the continent's land is used for agriculture-related activities (Nhemachena et al., 2018). Agriculture can contribute significantly to achieving the continent's developmental goals, which include eradication of poverty and hunger, enhancing trade and investment, sustainable management of resources and environment, employment, and human security and prosperity. Although the continent possesses about 60% of the world's arable land, Africa accounts for just 4% of global agricultural output (Nhemachena et al., 2018). Researchers blame the low productivity on erratic weather conditions, drought, poor seed quality, and obsolete farming practices.
Regarding public health, the incidence and prevalence of non-communicable and infectious diseases have increased. Most of these diseases lack approved and effective treatments, while others are still being treated by decades-old protocols with many limitations. It is interesting to note that Africa hosts a large variety of plants, with about 60% of them being indigenous, and most have been reported to be used for the treatment of diverse diseases (Tuasha et al., 2018). The use of medicinal plants as therapeutics is a common practice. Ethnobotanical surveys have reported some plant species employed in the treatment of many diseases such as malaria, typhoid, tuberculosis, rheumatism, and asthma (Lawal et al., 2020).
Harnessing the full therapeutic potential of plants may contribute to bridging the gap between disease outcomes and therapy management, thereby improving health in the African population. Genome projects on medicinal plants could advance their use as a source of natural drugs coupled with the development and production of pharmaceutical agents. The search for bioactive compounds from medicinal plants is not a new research effort in Africa. However, most previous research efforts have been national or regional in scope. Furthermore, bioinformatics technology has not been applied extensively in any of the previous efforts. As a result, most approaches have been purely wet lab-based, and the experimental methods used to validate promising active compounds have often led to high attrition rates among the compounds that are finally validated.
The field of genomics has opened up a whole new plethora of opportunities and avenues, which can help achieve enhanced productivity and preserve biodiversity under the present-day challenges. The recent advances in genomics offer great potential in crop improvement and herb biodiversity programs. To this end, various crop improvement programs have been initiated (Aglawe et al., 2018; Butt et al., 2020; Moose and Mumm, 2008; Parry et al., 2009). Over the past few decades, genomics-based approaches have been extensively used to understand and dissect the genetic basis of crop improvement. The H3Africa Health Genomics Consortium undertakes efforts to integrate genomics knowledge and expertise, which have been acquired over the last decade, into African conventional crop improvement and biodiversity protection programs (H3Africa consortium et al., 2014). In all African countries, the supply and quality of food must rise to meet the demand. Genomics-driven biotechnology will generate prosperity for producers and consumers by lowering costs and increasing plant yield and quality. Furthermore, famine and malnutrition in Africa may be alleviated by applying genomics or other biotechnology tools to improving subsistence crops and investigating the potential of medicinal plants (for more detail, refer to http://africanorphancrops.org).
Modern genomics and genetic engineering technologies have emerged as powerful tools that can provide roadmaps to sustainable agriculture. These have been successfully utilized and have led to significant achievements like the development of high-yielding varieties, new hybrids, higher-quality products, and transgenic crops. Genetic engineering techniques to develop genetically modified (GM) crops have garnered much attention worldwide. Interestingly, most countries are unwilling to use GM seeds, but cross-bred seeds can provide a more workable solution. Therefore, it is necessary to pay attention to the other non-controversial, non-invasive, promising applications of genomic tools in African agriculture, plant biodiversity, and pharmaceutical and herbal industries.
BOX 1. Summary of the main points
1 Plant genomics and its applications
2 The progress in sequencing and analysis of African plant genomes
3 Application of bioinformatics tools to study plant genomes in Africa
BOX 2. Open question
1 What is the current state of the art of crop and medicinal plant genomics in Africa?
Furthermore, the major bottlenecks affecting the productivity of the so-called orphan (neglected or underutilized) crops in Africa include little or no selection of improved genetic traits and extreme environmental conditions. Nevertheless, some orphan crops have recently been brought to the attention of the global scientific community, where advanced research and development projects have been initiated (Chiurugwi et al., 2019; Epping and Laibach, 2020; Hendre et al., 2019; Jamnadass et al., 2020; Kamei et al., 2016; Kumar and Bhalothia, 2020; Tadele, 2018; Tadele, 2019; Tadele and Bartels, 2019). These initiatives, introducing a range of genetic and genomic methods, addressed significant constraints affecting orphan crop production and/or nutritional efficiency. There is an urgent need for concerted efforts to advance research and development of both major and orphan crops using plant genomics to achieve food security and ultimately improve the livelihood of the population.
Plants possess mitochondrial, chloroplast, and nuclear genomes. The nuclear genome, which is the largest and most complex, contains repetitive sequences and retrovirus-like retrotransposons. Plant genomics studies have elucidated the genetic composition, structure, organization, function, diversity, and interactions embedded in plant genome sequences, and high-throughput technologies, tools, and methodologies have been developed to study the plant genome at the molecular, chromosomal, biochemical, and phenotypic levels. Jackson et al. (2006) discussed how to choose the plant species to sequence and the sequencing technologies to be used. Genome assembly aims to generate the complete, ordered genomic sequence of a plant. Many bioinformatics algorithms for de novo assembly of sequencing reads are available. However, assembling plant genomes is challenging because of repeats in the genome sequence and/or polyploidy (Pirita et al., 2019). In general, it is highly recommended to use plant-specific genome assemblers that take the complexity of the plant genome into account (Michael and VanBuren, 2019; Schatz et al., 2012). Understanding a plant's genetic structure and organization enhances our comprehension of plant systems biology beyond the genomic level (Campos-de Quiroz, 2002). The bioinformatics approaches used to investigate plant genetic structure and organization include annotating functional and non-functional elements, studying the epigenetic status of genome regions, identifying repetitive sequences, and studying the dynamic nature of the genome.
The term pan-genome was introduced in 2005 by Tettelin et al. to refer to the number of genes shared by taxonomically related species (Tettelin et al., 2005). The pan-genome concept includes both function-based and structure-based analyses (Tranchant-Dubreuil et al., 2019). The function-based pan-genome definition considers extending the shared genes into gene families. In contrast, the structure-based definition of a pan-genome considers only the non-redundant set of sequences (about 100 base pairs) among the collection of individual genomes (Tranchant-Dubreuil et al., 2019). The study of the plant pan-genome could result in the creation of a pan-reference genome, which could be used to improve all sequence-based bioinformatics analyses, for example by increasing the accuracy of read mapping (Bayer et al., 2020). The bioinformaticians' roles in pan-genome studies involve genome-wide association study (GWAS) data analysis, transcriptomics, proteomics, and constructing and mining genomic networks. Pan-genomes have now been constructed for several plant species, including Asian rice (Oryza sativa) and African rice (Oryza glaberrima) (Tranchant-Dubreuil et al., 2019). Nonetheless, the study of plant pan-genomes still faces some challenges, such as storing the huge genomes of most plants.
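The function-based view described above can be illustrated with a minimal Python sketch. It assumes each genome has already been reduced to a set of gene-family identifiers; the accession names and family labels are hypothetical toy data, and the terms "core" (families present in every genome) and "dispensable" (families missing from at least one genome) follow common usage rather than any definition given in this review.

```python
from typing import Dict, Set


def summarize_pangenome(gene_sets: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Split gene families into pan, core, and dispensable sets.

    gene_sets maps a genome name to the set of gene families found in it.
    """
    if not gene_sets:
        raise ValueError("at least one genome is required")
    genomes = list(gene_sets.values())
    pan = set().union(*genomes)        # every family seen in any genome
    core = set.intersection(*genomes)  # families shared by all genomes
    return {"pan": pan, "core": core, "dispensable": pan - core}


# Hypothetical toy input: three accessions and five gene families.
toy_gene_sets = {
    "accession_A": {"fam1", "fam2", "fam3"},
    "accession_B": {"fam1", "fam2", "fam4"},
    "accession_C": {"fam1", "fam3", "fam4", "fam5"},
}

summary = summarize_pangenome(toy_gene_sets)
for label in ("pan", "core", "dispensable"):
    print(label, sorted(summary[label]))
```

A structure-based analysis would replace the gene-family identifiers with non-redundant sequence blocks, but the set logic would stay the same.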
Since the first plant genome sequence was released in 2000 (Arabidopsis Genome Initiative, 2000), plant functional genomics has entered the high-throughput stage. Understanding genomic functions helps in establishing links between phenotypes and their underlying genotypes. Key contributions of bioinformaticians to the study of plant genomic functions include providing and maintaining databases (Ong et al., 2016). The number of bioinformatics databases for plant genomic functions is growing rapidly. Examples of public databases for querying plant genomes include KEGG (Tokimatsu et al., 2011), PLAZA (Van Bel et al., 2018), Phytozome (Goodstein et al., 2012), GreenPhylDB (Rouard et al., 2010), and PlantsDB (Nussbaumer et al., 2013). None of these plant genomics databases is hosted in Africa.
Arabidopsis thaliana is commonly utilized as a model for researching multiple facets of plant biology. Because of its small genome (125 Mb), it was chosen as the subject of the first plant genome sequencing project, completed in 2000, which was a boost for plant genomics research. Following the sequencing of the Arabidopsis genome, hundreds of plant genomes have been sequenced, and presently a deluge of sequence data is available. The sequences of model plants and economically useful plants help to understand the plants' genome architectures and identify biotechnologically important genes.
Worldwide, plant genomic studies have had a revolutionary impact on agriculture and herbal medicine. Major success stories of plant genomic studies include those facilitated by the National Plant Genome Initiative established in 1998 by the US Government, which sought to comprehend the structures and roles of genes in crops that are essential to agriculture, environmental conservation, health, and energy. Significant work done includes the development of a new efficient method for sorghum (Sorghum bicolor) biofortification (Ashokkumar et al., 2020; Chandra et al., 2020; Cruet-Burgos et al., 2020; Siwela et al., 2020) and the development of innovative technology to enable high-throughput image capture and automated visual analysis of traits within mutant and naturally varying plant populations (Krishnamurthy et al., 2019; Li et al., 2020).
Furthermore, the plant genomics market, which was valued at USD 7.3 billion in 2019 and is projected to reach USD 11.7 billion by 2025 (https://icrowdnewswire.com/; https://www.marketsandmarkets.com/), continues to grow. This growth is likely to be driven by the research attention drawn to the application of genomics in plant breeding and conservation of genetic resources worldwide. Until recently, most plant genomics research had been confined to the industrialized world, although tropical Africa alone houses about 50 times more native plant species than countries in the temperate zone (Slik et al., 2015). Plant genomics research is currently gaining ground in Africa, and the continent is poised to become a hub of plant genomic research. In this review, we present the current status of plant genomics in Africa and explore the opportunities plant genomics might offer to benefit optimally from the rich flora of the continent.

PROGRESS IN AFRICAN PLANT GENOME SEQUENCING AND ANALYSIS
Plant genomics in Africa is developing, and most plant genomes in Africa were sequenced in the last decade, with a clear acceleration in the past 5 years thanks to the African Orphan Crops Consortium (http://africanorphancrops.org). For example, Chang et al. (2018) reported the draft genome sequences of five orphan crops, which are local plants that are either underused or neglected but are important to agriculture and agroforestry. They listed some of the potential applications of the released genomes, such as the improvement of crop breeds (Chang et al., 2018). To date, 60 species have undergone initial sequencing, and many plant genomes, including eight genome assemblies from the African Orphan Crops Consortium, have been published and six others are nearing release, with 20 currently underway (Hendre et al., 2019). Below we summarize the achievements of 'African' plant genome sequencing projects, which we define as 'plants that grow in Africa and that are sequenced by African-based groups or by groups involving African-based partners', by the end of 2020 (Table 1). Results and statistics of these plant genome sequencing, assembly, annotation, and transcriptome analyses are summarized in Table 2.
This century might see largely completed sequences for several branches of practically all angiosperm clades, including the main crops growing in Africa. These sequences can offer a strong framework for connecting genome-level events to all facets of morphological and physiological variation in crops worldwide. Tables 1 and 2 illustrate interesting features of some of the most common plants in Africa. For instance, the sizes of the African plant genomes sequenced so far vary from 278 Mb to approximately 1.5 Gb and are said to be influenced by climatic changes (Veselý et al., 2020). The olive tree has by far the largest genome (about 1.5 Gb) and the largest number of genes (50 684 genes), which mainly encode proteins; only 1982 RNA genes have been identified. By comparison, O. glaberrima, Artocarpus altilis, and Solanum aethiopicum have genome sizes of 316, 833, and 1020 Mb, respectively, yet they encode nearly the same number of conventional genes (33 164, 33 986, and 34 906, respectively). On the other hand, the genome of Sclerocarya birrea (331 Mb) is slightly larger than that of Moringa oleifera (278 Mb). Interestingly, they have about the same number of genes (18 937 and 18 451, respectively), which is almost half the number of genes in O. glaberrima, despite its similar genome size (316 Mb). Moringa oleifera has the lowest number of genes (18 451) and the highest numbers of rRNA (8406) and tRNA (1241) genes.
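The comparison above amounts to a gene-density calculation (genes per megabase). The short Python sketch below works it through using only the genome sizes and gene counts quoted in this paragraph; the olive genome is left out because its size is given only approximately, and the variable and function names are illustrative choices.

```python
# Genome size (Mb) and gene count for each species, as quoted in the text.
genome_stats = {
    "Oryza glaberrima": (316, 33164),
    "Artocarpus altilis": (833, 33986),
    "Solanum aethiopicum": (1020, 34906),
    "Sclerocarya birrea": (331, 18937),
    "Moringa oleifera": (278, 18451),
}


def genes_per_mb(size_mb: float, gene_count: int) -> float:
    """Return gene density as the number of genes per megabase."""
    return gene_count / size_mb


gene_density = {
    species: genes_per_mb(size_mb, gene_count)
    for species, (size_mb, gene_count) in genome_stats.items()
}

# List species from the most to the least gene-dense genome.
for species, density in sorted(gene_density.items(), key=lambda kv: -kv[1]):
    print(f"{species}: {density:.1f} genes/Mb")
```

Read this way, the three genomes with about 34 000 genes differ roughly threefold in gene density, which restates the point that gene number does not scale with genome size.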
The described genomes of these African plant species will serve as a valuable complementary resource for non-model food crops, especially legumes. They will be of use to both agroforestry and evolutionary science. Improvement of these plants using genomics-assisted techniques and methods could help deliver food security to millions of people. The genome data obtained could help identify agronomically significant genes and characterize their modes of operation, allowing genomics-based evolutionary studies and breeding strategies to advance more quickly and establish more targeted and predictable crop improvement programs.
One of the first 'African' plant genomes to be sequenced was that of Olea europaea, by the International Olive Genome Consortium, which includes two African-based teams, from Morocco and South Africa. The O. europaea genome offers significant insights into the evolution of oil biosynthesis and constitutes a valuable tool for the genomics of oil crops (Unver et al., 2017). Whole-genome sequencing and gene expression studies provide insight into the nature of oil biosynthesis and assist future research aimed at further increasing olive oil production.
Faidherbia albida is known for its traditional use to treat conditions such as fever, diarrhea, hives, vomiting, cough, rheumatism, or even hemorrhage (Tijani et al., 2008). The study of its genome promises to provide more insights into its anti-inflammatory properties (Chang et al., 2018).
The genome of Vigna subterranea, or Bambara groundnut, has also been sequenced (Chang et al., 2018). The plant originates from West Africa and is cultivated in sub-Saharan regions, particularly in Nigeria (Borget, 1992; Linnemann, 1992). This plant is drought-tolerant and has significant nutritional value (Chang et al., 2018; Gbaguidi et al., 2018). The study of this plant's genome will help to establish efficient breeding strategies to improve its decreasing yield.
Lablab purpureus, or Dolichos bean, a member of the Fabaceae family, is one of the most ancient (>3500 years) domesticated legume species (Maass et al., 2005). It is native to Africa (Maass et al., 2010) and is cultivated throughout the tropics for food (Ojo et al., 2013). It is often grown as forage for livestock and used as a medicinal plant (Al-Snafi, 2017). Lablab purpureus has a good nitrogen-fixing ability, and it is highly adaptable to diverse environmental conditions (Robotham and Chapman, 2017). Because the yields of this crop are reported to be limited, it is critical to identify and characterize its agronomically important genes; hence, the L. purpureus genome has been studied (Chang et al., 2018).
Marula (S. birrea) is a popular wild tree found in many African countries. Its leaves, stem, bark, roots, and fruits are used for food and traditional medicine (Mariod and Abdelwahab, 2012). The S. birrea fruit is not only renowned for its abundance in ascorbic acid (Mariod and Abdelwahab, 2012) but also contains a hard brown seed enclosing a soft white kernel rich in oil and protein (Mariod and Abdelwahab, 2012). The draft genome of this species will serve as a valuable complementary tool for non-model food crops to elucidate the medical and socio-economic importance of marula seeds, fruits, and oil and to discover genes involved in the biosynthesis of its metabolites (Chang et al., 2018).
The horseradish tree (M. oleifera), a member of the Moringaceae family, is a drought-tolerant plant distributed throughout tropical and subtropical countries, including those in Africa (Rivas et al., 2013). This plant covers the major agroecological regions in Nigeria (Busani et al., 2011) and has important nutritional and medicinal values (Busani et al., 2011). The seeds are used to extract oil, which can also be a good source of biodiesel (Ciolkosz, 2015; Rashid et al., 2008). The draft genome of this species (Chang et al., 2018) can be exploited to enhance biodiesel production. The use of biodiesel as fuel increases energy security, improves air quality in the environment, and provides safety benefits (Rashid et al., 2008).
Two 'African' members of the Moraceae family, Artocarpus heterophyllus (Lam.) and A. altilis, have recently had their genomes sequenced (Kumar and Bhalothia, 2020). Artocarpus heterophyllus, also known as jackfruit (Prakash et al., 2009), is native to the Western Ghats of India and Malaysia, but it is also found in Central and Eastern Africa (Prakash et al., 2009). Flakes of ripe A. heterophyllus fruits are high in nutritive value (Elevitch and Manner, 2006). Artocarpus altilis, also known as breadfruit, has been a reliable staple food in the Pacific region for more than 3000 years, with hundreds of named cultivars (Ragone, 2018). Breadfruit is grown and used in many tropical regions, including Africa, with most trees originating from just a few Polynesian cultivars that were introduced in the late 1700s (Ragone, 2018). Artocarpus altilis trees produce a variety of nutritious fruits and are easy to cultivate. Breadfruit is rich in vitamin C and carbohydrates, and may be used to support cardiovascular health, aid digestion, and treat dandruff (Ragone, 2018).
The study of the genomes of these two Moraceae species will help to elucidate the mechanisms underlying the high energy density in their fruits and the genes involved in sugar and starch metabolism.
The African eggplant (S. aethiopicum), an indigenous non-tuberous crop that mainly grows in tropical regions in Africa (Sunseri et al., 2010), is a member of the Solanaceae family. Owing to its morphological hypervariability (Adeniji et al., 2012; Plazas et al., 2014), S. aethiopicum is classified into four groups: Gilo, Shum, Kumba, and Aculeatum. The creation of a reference genome sequence will help identify disease resistance genes and discover the expanded gene family responsible for bacterial spot resistance in eggplant (Song et al., 2019).
Two members of the Poaceae family have undergone genomic studies. One is O. glaberrima, the first ever 'African' plant genome to be sequenced (Wang et al., 2014). It is an African rice species that was independently domesticated from the wild progenitor Oryza barthii approximately 3000 years ago (Sweeney and McCouch, 2007), 6000-7000 years after the domestication of Asian rice (O. sativa) (Portères, 1976). Oryza glaberrima (African rice) is an essential food crop in several sub-Saharan countries. Studying the genome of African rice and the population genomics of rice helps to reveal how such species evolved in Africa and how their sustainability can be maintained. The second 'African' Poaceae family member whose genome has been sequenced is fonio millet (Digitaria exilis), an African orphan cereal crop that has great potential for dryland agriculture. Abrouk et al. (2020) established high-quality genomic resources to improve fonio by molecular breeding. This study has provided new insights into the genetic diversity, population structure, and domestication of fonio millet (Abrouk et al., 2020).
White guinea yam (Dioscorea rotundata), a member of the Dioscoreaceae family, is a tuber-bearing crop. It is a very popular species in West and Central Africa, which are considered the main regions for yam production worldwide (Tamiru et al., 2017). Dioscorea rotundata represents a major source of food and income in these regions as well as an integral part of the socio-cultural life of the natives. Genomic analysis of orphan crops such as yam promotes efforts to improve food security and the sustainability of tropical agriculture (Tamiru et al., 2017). Finally, the genome of the argan tree (Argania spinosa) has also been sequenced (Khayi et al., 2018). This member of the African Sapotaceae family serves essential socio-economic and ecological roles in various regions, such as the arid zone of southwestern Morocco in North Africa. The tree is used as food, as animal feed, and for cosmetic and pharmaceutical purposes. This African endemic genetic asset has experienced serious threats from climate change, environmental factors, and human overexploitation in recent decades. Study of the genetics and genomics of the argan tree will help overcome these challenges (Ghazal et al., 2021). Elucidating the genomic basis of the production of important metabolites by A. spinosa, including fatty acids, tocopherols, squalene, sterols, and phenolic compounds, should strongly stimulate molecular breeding studies in the argan oil industry. Genomic studies of the argan tree could facilitate the discovery of the biosynthesis pathways of oil and other important metabolites, which will help improve the quality and quantity of these metabolites.

Genome-wide single-nucleotide polymorphism analysis of the Ethiopian Capsicum
Capsicum, a genus in the Solanaceae family, is commonly known as chili pepper. Capsicum species are major crop plants and are almost globally distributed (Panda et al., 2004). Chili pepper fruits are not only used as spices and vegetables but also for medicinal purposes due to their richness in vitamins A and C (van Zonneveld et al., 2015). Chili pepper fruits are also used as natural coloring agents, cosmetics, and active ingredients in host defense repellents. The genus includes 27 species, of which five are known to be domesticated (Ince et al., 2010). The five cultivated Capsicum species, namely C. annuum L., C. chinense Jacq., C. frutescens L., C. baccatum L., and C. pubescens Ruiz & Pav., are among the most economically important vegetables worldwide (González-Pérez et al., 2014). The characterization of genetic diversity in different Capsicum species growing in different regions has shown that the ecological distribution significantly influences plant genetic diversity (Lee et al., 2016). Therefore, analysis of genetic variation in Ethiopian Capsicum species involving many cultivars will help breeders utilize the germplasm collection to improve existing commercial cultivars (Solomon et al., 2019).
To capture genetic diversity, a collection of 142 Capsicum genotypes was established from different geographic areas of Ethiopia. High-resolution melting curves and morphological characteristics demonstrated that the collection included one Capsicum baccatum, nine Capsicum frutescens, and 132 Capsicum annuum accessions. Measurement of plant growth parameters revealed variation in traits such as plant height, stem thickness, internode length, the number of side branches, fruit width, and fruit length. Broad-sense heritability was highest for fruit weight, followed by leaf length and width. Genotyping by sequencing (GBS) (Solomon et al., 2019) was performed to identify single-nucleotide polymorphisms (SNPs) in the panel of 142 Capsicum accessions, and 2 831 791 genome-wide SNP markers were identified. Among these, 53 284 high-quality SNPs were selected and used to estimate the level of genetic diversity, the population structure, and phylogenetic relationships. From model-based ancestry analysis, the phylogenetic tree, and principal coordinate analysis (PCoA), two distinct genetic populations were identified: one composed of 132 C. annuum accessions and one consisting of nine C. frutescens accessions. GWAS analysis (Solomon et al., 2019) identified 509 SNP markers that were significantly associated with fruit, stem, and leaf traits.
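The criteria used to select the high-quality SNP subset are not given here, so the following Python sketch only illustrates the general idea of such a filtering step. It assumes, hypothetically, that each SNP is filtered on call rate (the fraction of accessions with a genotype) and minor allele frequency; the thresholds, SNP names, and genotype coding (0, 1, or 2 copies of the alternate allele, None for missing) are illustrative assumptions and not the settings of Solomon et al. (2019).

```python
from typing import Dict, List, Optional

Genotype = Optional[int]  # 0, 1 or 2 alternate-allele copies; None = missing


def call_rate(genotypes: List[Genotype]) -> float:
    """Fraction of accessions with a non-missing genotype."""
    if not genotypes:
        return 0.0
    return sum(g is not None for g in genotypes) / len(genotypes)


def minor_allele_frequency(genotypes: List[Genotype]) -> float:
    """Frequency of the rarer allele among called diploid genotypes."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1 - alt_freq)


def filter_snps(
    snps: Dict[str, List[Genotype]],
    min_call_rate: float = 0.8,  # hypothetical threshold
    min_maf: float = 0.05,       # hypothetical threshold
) -> List[str]:
    """Return the IDs of SNPs that pass both quality filters."""
    return [
        snp_id
        for snp_id, genotypes in snps.items()
        if call_rate(genotypes) >= min_call_rate
        and minor_allele_frequency(genotypes) >= min_maf
    ]


# Hypothetical toy panel: three SNPs scored in five accessions.
toy_snps = {
    "snp_1": [0, 1, 2, 0, 1],        # fully called and polymorphic
    "snp_2": [0, 0, 0, 0, 0],        # monomorphic, so MAF is 0
    "snp_3": [1, None, None, 2, 0],  # too many missing calls
}

kept_snps = filter_snps(toy_snps)
print(kept_snps)
```

In practice, such filters are usually applied to a VCF file with dedicated tools; the dictionary used here simply keeps the example self-contained.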

3000-Year-old Egyptian emmer wheat (Triticum turgidum subsp. dicoccum)
Tetraploid emmer wheat (Triticum turgidum subsp. dicoccum) (NCBI tax ID: 49225) is a progenitor of the world's most widely grown crop, hexaploid bread wheat (Triticum aestivum). Emmer wheat was one of the first cereals domesticated in the Old World; it was cultivated around 9700 BC in the Levant (Arranz-Otaegui et al., 2016) and subsequently in Northern Africa. Radiocarbon dating showed that the emmer wheat specimen from the Egyptian Museum, whose entire genome was sequenced, dates to the New Kingdom, between 1130 and 1000 BC. This wheat genome seems to be unusual, carrying haplotypes that are absent in modern emmer (Fuller and Lucas, 2014). Therefore, it is important to study the wheat genome to comprehend the history and diversity of ancient cereals, as ancient DNA sequences can reveal dispersal and domestication histories. Since 1921, emmer wheat samples have been excavated from the archaeological site of the Hememiah North Spur in Egypt (Brunton and Caton-Tompson, 1928). Sequencing was performed using an Illumina NextSeq 500 sequencer, resulting in 861 million 75-bp reads (Scott et al., 2019). Next, the reads were aligned to the Zavitan v.2 reference genome for emmer wheat (Avni et al., 2017) using the BWA aligner (Li and Durbin, 2009). SNPs and genotypes were called using the Genome Analysis Toolkit (GATK) described by McKenna et al. (2010).
Based on the 99 078 SNPs, several methods were used to map genome-wide similarity between the Egyptian wheat and modern ones. This confirmed that Egyptian wheat is genetically closest to the domesticated ones, specifically to the domesticated Indian Ocean subgroup. Not only the identity but also the state of these SNPs confirmed that Egyptian wheat is most concordant (86.4-87.4%) with the Indian Ocean subgroup compared with the other domesticated ones (mean, 81.7%; SD, 1.8%) or wild ones (mean, 79.1%; SD, 1.6%). Phylogenetic analysis showed that Egyptian wheat branches are closest to (but not classified in) the Indian Ocean clade. Moreover, the admixture (Patterson et al., 2012) source-population inference showed that Egyptian wheat shares most ancestry with the Indian Ocean subgroup (Scott et al., 2019).
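The concordance percentages quoted above can be read as the share of jointly genotyped SNPs at which two samples carry the same genotype. The Python sketch below illustrates that calculation under this assumption; the exact definition used by Scott et al. (2019) may differ, and the sample dictionaries, site names, and genotype coding are hypothetical toy data.

```python
from typing import Dict, Optional

Genotype = Optional[int]  # 0, 1 or 2 alternate-allele copies; None = missing


def genotype_concordance(
    sample_a: Dict[str, Genotype], sample_b: Dict[str, Genotype]
) -> float:
    """Fraction of jointly called SNPs with identical genotypes."""
    shared_sites = [
        site
        for site, genotype in sample_a.items()
        if genotype is not None and sample_b.get(site) is not None
    ]
    if not shared_sites:
        raise ValueError("no SNP is called in both samples")
    matches = sum(sample_a[site] == sample_b[site] for site in shared_sites)
    return matches / len(shared_sites)


# Hypothetical toy genotypes at five SNP sites.
ancient_sample = {"s1": 0, "s2": 2, "s3": 1, "s4": None, "s5": 0}
modern_sample = {"s1": 0, "s2": 2, "s3": 0, "s4": 1, "s5": 0}

concordance = genotype_concordance(ancient_sample, modern_sample)
print(f"{concordance:.1%}")
```

Sites missing in either sample are skipped rather than counted as mismatches, which matters for ancient DNA, where many positions have no confident call.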
Egyptian wheat has several specific haplotypes that were described using 50 SNP sliding windows (McKenna et al., 2010). Such unusual haplotypes may reflect missing alleles or incomplete sampling of modern emmer (in the Indian Ocean subgroup, none of the sequenced modern emmers is present, and none is from Africa). Emmer cultivation in the Nile Valley all but vanished after the Roman period; therefore, it is likely that much of the genetic richness of ancient Egyptian emmer has been lost (Scott et al., 2019).

Plant genetic resources and future trends in plant genomics in Africa
The strength of Africa lies in its natural resources, including genetic resources (Kent et al., 2003; Vicente and Schlebusch, 2020). The latter are the foundation for advancement in agriculture, environment, and the medical field, especially in the areas of drug discovery and disease treatment. Plant genomic studies have buttressed the fact that many plants and food crops originate from different parts of Africa, including West Africa's Niger River basin, North of the Niger River, a part of the Western Sahara Desert that today includes Northern Mali and Mauritania, and East, Central, and Southern Africa. Examples of such crops are African rice (O. glaberrima), African yam (D. rotundata), pearl millet (Cenchrus americanus), and wild millet (Pennisetum glaucum monodii) (Pennisi, 2019). An early cradle of agriculture existed around West Africa's Niger River basin, where the first plant genomics studies started appearing. Many traditional food crops on the continent began with a cereal called pearl millet, which was the African version of rice (Pennisi, 2019).
African cereals have been part of the Cereal Genome Sequencing Efforts on Crop Improvement (Laura, 2017). Resources are available for rice, sorghum, millet, and maize (Zea mays). The use of emerging high-end genomic technologies can be expanded from crop plants to traditional medicinal plants, where these technologies expedite breeding and can transform medicinal plants into living factories of medicinal compounds (Da-Cheng and Pei-Gen, 2015). The utility of molecular phylogeny and phylogenomics in predicting chemodiversity and bioprospecting is vital to the discovery and development of natural product-based drugs (Peterson et al., 2019).

The role of plant genomics in African agriculture
Agriculture is currently facing the 'perfect storm' of climate change, increasing fertilizer prices, and increasing food demand from a growing and wealthier human population (Abberton et al., 2016). These pressures may lead to a global food shortage unless crop production becomes both more productive and more resilient. Agricultural intensification has centered on increasing productivity under improved conditions with important agricultural inputs (Abberton et al., 2016). In addition, intensive cultivation of a limited number of crops has significantly narrowed the range of plant species on which humans rely. This calls for a new agricultural paradigm that decreases dependence on high inputs, increases crop diversity, and provides stability and resilience to the environment (Abberton et al., 2016). Genomics brings great opportunities to boost crop yield, quality, and output stability via advanced breeding techniques. Genomics also offers the potential to improve the tolerance of major crops to climate change, enhance productivity, and select minor crops to diversify Africa's food supply (Abberton et al., 2016). This requires a deep understanding of genomic-assisted breeding for the essential staples feeding the planet, and of how to use and adapt these genomic tools to improve the production of major and minor crops with desirable features that enhance adaptation to, or mitigate the effects of, climate change (Abberton et al., 2016).
Comprehension of genetic variation in a population at the level of the DNA sequence enables the identification of agronomically important genes and associated molecular genetic markers while offering a way of selecting certain genes in breeding programs (Abberton et al., 2016). The key driver of the current genomic revolution is the development of next-generation sequencing (NGS) technology (Abberton et al., 2016). This technology revolutionizes the production of crops as quickly as it has revolutionized medicine. This allows the sequencing of different crop genomes and promotes the correlation of genomic diversity with agronomic characteristics, thereby providing the basis for genomic-assisted breeding. Sequencing technology continues to develop and all the major crops will likely have benefited from sequence-based genomic advancement within a few years (Abberton et al., 2016).
Genome editing technology has great potential to revolutionize agricultural productivity on the African continent, particularly in sub-Saharan Africa. Komen et al. (2020) pointed out that CRISPR-Cas9 is already being used in Africa to improve primary staple foods including wheat, cassava (Manihot esculenta), and banana (Musa spp.), and the research results look promising (Komen et al., 2020). In addition, international agricultural science centers are now implementing genome editing in their research and development programs in partnership with national science organizations within Africa (Komen et al., 2020). Research is currently in progress in a variety of African countries. For example, the Kenya Agricultural and Livestock Research Organization and two international organizations are using CRISPR-Cas9 technology to further develop maize germplasm until it is immune to maize lethal necrosis (Boddupalli et al., 2020).

The role of plant genomics in herbal medicine in Africa
Herbal medicines have been widely used for more than 5000 years in traditional medicine and ethnomedicine worldwide (Tu, 2011). The World Health Organization has identified 21 000 medicinal crops, of which 2500 are located in Africa (Modak et al., 2007). Of the African and Indian populations, 90% and 70%, respectively, depend on traditional medicine. In China, traditional medicine accounts for around 40% of all healthcare delivered (WHO, 2005).
With the fast advances in high-throughput sequencing technologies, a new herbal research discipline, called 'herbal genomics' (herb-genomics), has been established. It combines herbal and genomic research and uses this information together with transcriptomic, proteomic, and metabolomic data to predict the secondary metabolic pathways in herbs. It therefore provides a general picture of the genetic background of traditional herbs, enabling researchers to investigate mechanisms for securing the sourcing of medicinal plants and their active compounds in the future. Furthermore, herb-genomics helps prevent and treat human diseases from an omics perspective (Chakraborty, 2018;Hu et al., 2019). Molecular techniques (such as the random amplified polymorphic DNA method, inter-simple sequence repeat-PCR, amplified fragment length polymorphism analysis, microsatellite markers, SNPs, expressed sequence tags, microchip-based genomic profiling, and DNA barcoding) have proven to be powerful tools for authenticating medicinal plants based on phylogenetic variation signatures in chloroplast and nuclear DNA (Gantait et al., 2014). In recent years, over 7200 plant chloroplast genomic sequences have been published, and over 20% of these plants are believed to have medicinal properties (Hu et al., 2019).
Su et al. (2008) reported the development of strain-specific sequence-characterized amplified region markers for strain detection and authentication of Ganoderma lucidum. Balasubramani et al. (2011) developed DNA markers from genomic DNA by amplifying and sequencing whole internal transcribed spacer regions (ITS1, 5.8S rRNA, and ITS2). In addition, universal primers have proven effective in authenticating Berberis species. These primers serve as a molecular pharmacognostic tool for quality assessment of raw drugs (Balasubramani et al., 2011;Su et al., 2008). Herb-genomics research has obtained great results, even though it is still in its early developmental stage (Hu et al., 2019). Advanced bioinformatics tools and omics approaches will allow herb-genomics to open a new field for fundamental and applied research into natural drugs (Hu et al., 2019). Moreover, genome sequencing projects may be initiated for medicinal plants. To this end, there is a major need to develop a sequence database, with the accompanying bioinformatics tools, covering all plants and to make it publicly available to all nations (Gantait et al., 2014).
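To make the idea of barcode-based authentication concrete, the following is a minimal sketch of how a query sequence might be compared against a small reference set of barcode sequences (for example ITS regions). It is not the method of the cited studies. It assumes the sequences have already been aligned to equal length, and the function names, the 97% identity threshold, and the toy sequences are all hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percent identity between two pre-aligned sequences of equal length.

    Positions where either sequence has a gap ('-') or an ambiguous
    base ('N') are skipped rather than counted as mismatches.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def authenticate(query, references, threshold=97.0):
    """Return (best_matching_name, identity), or (None, identity) when
    no reference reaches the threshold (an assumed cut-off)."""
    best_name, best_score = None, -1.0
    for name, ref in references.items():
        score = percent_identity(query, ref)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < threshold:
        return None, best_score
    return best_name, best_score


# Invented 10-base toy barcodes; real ITS regions are hundreds of bases long.
references = {
    "species_A": "ACGTACGTAC",
    "species_B": "ACGTTTGTAC",
}
match = authenticate("ACGTACGTAC", references)
no_match = authenticate("ACGTTTGTAA", references)
```

In practice the alignment step and the choice of threshold matter a great deal, and a curated public reference database of the kind called for above is what makes such a comparison meaningful.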
To date, only a small part of plant metabolism has been explored to produce new medicines and other products. Accordingly, there is interest in more herbal genomics studies and in developments in other omics fields that may enable the rapid discovery of previously unidentified metabolic pathways and enzymes (Chakraborty, 2018). Unfortunately, very little has been done so far in herbal genomics in Africa. There is still a huge gap to fill, and plenty of benefits to gain, in launching medicinal plant genomics for Africa soon.

The role of plant genomics in novel drug discovery in Africa
Novel drug discovery often proceeds by screening bioactive molecules from natural resources such as plants and fungi (Albarano et al., 2020), where metabolites and gene products are screened for their therapeutic activities. The advent of genomic technologies has led to significant progress in this process. Drug discovery based on plant genomics is expanding in two directions: (i) wet-lab experiments and (ii) in silico analyses. Wet-lab-driven drug discovery includes plant genome sequencing and chemogenomic analysis; in silico-driven drug discovery includes virtual screening via molecular simulation experiments, identification of novel pathways for new drug discovery, and functional genomics analysis (Chakraborty, 2018). Studies using in silico and modeling tools and methods for plant-based drug discovery have been performed in several African countries.

CONCLUSION AND THE WAY FORWARD
The African continent is currently facing various challenges, including global health and food security for an ever-increasing population. Furthermore, approximately 20% to 30% of the earth's surface is arable land, causing persistent reductions in productivity. As a result, millions may be facing protein deficiency, with three-quarters of the world population getting their daily protein dose from plants. Plant genomics may be able to provide the scientific advances that African societies need for crop improvement. NGS of agricultural crops will also allow for the development of more productive and sustainable practices so that Africa can meet the challenges of feeding and healing a growing population in a changing environment. The current generation of DNA sequencing technologies is making genome sequencing a reality without requiring generous funding sources. This momentous leap brings exciting opportunities to the botanical and plant communities, including in Africa. Whole-genome sequences may contain thousands of nuclear markers for phylogenetic and population-level analyses, allowing genome-wide investigations of fundamental evolutionary and ecological questions. Additionally, the creation of a pan-genome, recording the genomic richness of ecotypes, regional isolates, and associated organisms (Golicz et al., 2016), would make it possible for comparative methods and association analyses to identify the genetic components of such characteristics and adaptations. The possibilities run the gamut from systematics, ecology, and evolution to molecular genetics.
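As an illustration of what a pan-genome records, the sketch below partitions genes into those shared by every accession (often called core), those missing from at least one accession (accessory), and those found in only one accession. It is a simplified sketch, not a published pipeline: it assumes gene presence has already been determined for each accession, and the function name, accession labels, and gene identifiers are hypothetical.

```python
def classify_pangenome(presence):
    """Split genes into pan, core, accessory, and accession-private sets.

    presence maps an accession name (ecotype, isolate, ...) to the set
    of gene identifiers detected in that accession.
    """
    if not presence:
        raise ValueError("at least one accession is required")
    gene_sets = {acc: set(genes) for acc, genes in presence.items()}
    pan = set().union(*gene_sets.values())          # found in any accession
    core = set.intersection(*gene_sets.values())    # found in every accession
    accessory = pan - core                          # missing from at least one
    private = {}
    for acc, genes in gene_sets.items():
        others = set().union(*(g for a, g in gene_sets.items() if a != acc))
        private[acc] = genes - others               # found only in this accession
    return {"pan": pan, "core": core, "accessory": accessory, "private": private}


# Invented toy presence data for three accessions.
toy_presence = {
    "ecotype_1": {"g1", "g2", "g3"},
    "ecotype_2": {"g1", "g2", "g4"},
    "ecotype_3": {"g1", "g5"},
}
summary = classify_pangenome(toy_presence)
```

Accessory and private genes are the natural candidates for the association analyses mentioned above, since they vary among ecotypes and regional isolates.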
On the other hand, the incidence and prevalence of noncommunicable and infectious diseases have risen, affecting public health. In addition, the emergence of new pathogenic species exacerbates the incidence and prevalence of infectious diseases. Most of these diseases lack appropriate therapies, whereas others are still being handled through decades-old procedures with several limitations. The production of novel drugs for certain diseases nevertheless remains a major challenge. Notably, the pharmaceutical industry has entered an 'innovation crisis' and has been producing new drugs at a constant pace for nearly 60 years. Natural medicinal plant extracts have provided the basis for several Western medicines and for herbal medicines, a cornerstone of healthcare in the developing world. Many ancient civilizations, including African, Arab, Persian, Indian, and Chinese civilizations, have popularized herbal remedies in developing countries. Chinese and Indian research institutes have initiated multiple genome projects on medicinal plants of Chinese or Indian origin to further explore their economic benefits. A primary driver has been the effective synthesis of semi-synthetic artemisinin and its entry into commercial production for malaria treatment (Tiwari et al., 2019). This suggests that harnessing the full therapeutic potential of plants will help fill the gap between disease outcomes and therapy management, hence enhancing the health outcomes of the global population. Searching for bioactive compounds from medicinal plants is not a new African research initiative. Even so, in our view, two key components were lacking in previous studies: (i) Most of the studies have only been national or regional (e.g., Bouyahya et al., 2020;Kamau et al., 2016;Okot et al., 2020;Oladeji et al., 2020;Patrick et al., 2015;Spiegler, 2020;Twilley et al., 2020). To achieve the same effect and coverage as the traditional Chinese medicine effort, continental coverage is needed.
(ii) Bioinformatics or computational biology was not extensively applied in any past works. Hence, most of the strategies were based on the
The global plant community is getting more convinced than ever that this is the perfect time to be involved in plant genomics research, especially in Africa, for crop improvement and optimization of medicinal plant use. The sequencing of African plant genomes presents a conceptual framework for the establishment of a regional and integrative omics strategy for comprehensive crop and medicinal plant characterization. Plant genomics research will concentrate on predictive tools and functional analysis to annotate the genomes and decipher metabolic pathways through the study of metabolomes and analysis of the biosynthesis of key metabolites as well as the plant microbiome. To this end, collaboration through a pan-African plant genome (PAPG) consortium would be one of the most suitable and efficient ways to organize the broad interests of the plant genomics community for the precious African endemic and orphan crops and plants.
The lack of large-scale plant genomics studies in Africa is the result of many deep-seated issues. Some of these issues include the lack of awareness of the pertinence of genomics in improving breeding techniques and agriculture, the lack of African scientists with genomic science experience, the lack of research infrastructure, the minimal availability of computational expertise and tools, the lack of sufficient funding for research activities from African governments, and the involvement of many African researchers in partnership research at not more than the stage of sample collection. Overcoming these limitations will, in part, depend on African scientists themselves acquiring the expertise and facilities necessary to lead high-quality plant genomics research aimed at solving plant and agriculture problems relevant to the African population and to become internationally competitive in plant genomic science and its applications.
Various initiatives have been established at the regional and continental scales, such as H3Africa and the African Orphan Crops Consortium, to address these issues. H3Africa stands for Human Heredity and Health in Africa (h3africa.org). The NIH-funded H3Africa initiative for a sustainable African bioinformatics network, H3ABioNet (h3abionet.org), has been instrumental in building bioinformatics facilities and skills in many African countries. H3ABioNet projects have demonstrated the value of bioinformatics in the context of health issues. Moreover, the NIH has sponsored many bioinformatics training programs in Africa. However, these measures have not been applied systematically to the plant-related biological field. To overcome this issue, we propose establishing a Pan African Medicinal Plant Bioinformatics Network (AMPBioNet) to support efforts to expand the reach of H3ABioNet and spread experience with plants across the continent. A primary emphasis on medicinal plants would be an ideal missing link between the human/health and plant/agriculture fields, with the ambition of extending the existing tools and learned methods and strategies to other plants such as major cultivars, endemic species, orphan crops, and tropical trees. AMPBioNet will do so by (i) pinpointing already sequenced plants and, where targeted plants are not yet sequenced, sequencing them based on their pharmacological properties to generate assembled and annotated plant genomes; (ii) developing both analytical and statistical models for screening such genomes to elucidate their pharmacological properties and completely characterize them; and (iii) building an interdisciplinary and multidisciplinary network of African researchers designing and implementing models using biocomputing tools and facilitating translation into novel plant-based health interventions.
Accordingly, the Pan-African medicinal plant bioinformatics network AMPBioNet will examine traditional African medicines for novel compounds to treat most common or re-emerging diseases in Africa, such as malaria, leishmaniasis, trypanosomiasis, tuberculosis, and colon and prostate cancers, through network pharmacology and genomics approaches. As mentioned above, although many genomes of African plants have been sequenced, most funding is external. Thus, the research is outsourced and training is mostly given by foreign experts. This often raises sustainability issues and brain retention concerns. From this viewpoint, H3Africa for human and health is a very convenient model for large-scale actions to follow in plant sciences and agriculture. The already launched genomics revolution for human health in Africa through the pan-African H3Africa project could support agriculture and speed up the discovery of new plant genomics methods provided the lessons learned from H3Africa are well implemented from a similar plant, physiology, pathology, and progeny (P4Africa) perspective.
The availability of affordable NGS technologies has boosted research on plant genomics worldwide. Africa is on the right track to benefit optimally from this situation. In recent years we have seen the emergence of a dozen sequenced African plant genomes thanks to the contributions of African scientists. The NGS technologies enabled sequencing, assembly, and annotation of key African plant genomes, thereby generating an unprecedented volume of information for local African researchers and breeders. Using this deluge of plant genomics data within the continent places an unprecedented challenge upon the few local bioinformaticians to deal with the volume, speed, and variety of the data generated by local academia and to improve storage, sharing, and analysis capacities. Plant bioinformatics is expected to bring new insights into the genetics of the crops, associated with phenotyping, enabling the optimal exploration of plant genetic resources for a range of significant end-user characteristics to promote more uses to improve food production, nutritional safety, and plant biodiversity. The efforts of the African scientific community and their international colleagues will not in themselves be sufficient. National governments and regional political and economic organizations must support sustained funding of all research fields, including the development of plant genomics and associated research infrastructure.
With respect to data mining and sharing, the H3Africa consortium can be inspiring, as it has developed an approach that attempts to balance (i) protecting the ability of African scientists to be the first to analyze and publish their research findings, given their limited resources and capacities to deal with data as quickly as scientists in developed countries, and (ii) the benefit of global accessibility to H3Africa data and biospecimens. To meet these not fully compatible goals, the H3Africa consortium has decided that data will initially be made accessible to consortium members through H3ABioNet before they are sent to the European Genome-phenome Archive, through which they will be publicly released (through an 'Independent Data and Biospecimen Access Committee'). As is typical in genomics, there will be a short lag (1 year) between submission and publication of the data. This is marginally longer than the norm (6 to 9 months), to give resource-challenged African investigators a bit more time to analyze their data and submit their manuscripts for peer review.
Similar considerations were given to the development of the policy for the release of biospecimens collected in Africa. Biospecimens will indeed be stored within an African biorepository (with copies elsewhere on the continent) and circulated globally for further study. However, data and biospecimen sharing do raise the often contentious issues of ownership and commercialization rights. The prospective P4Africa consortium will address this issue while embracing an ethos that promotes research for the global common good. Resources generated by P4Africa will be expected to be useful in future plant genomic research in Africa and globally.
In conclusion, famine, malnutrition, and the health burden in the African continent may be alleviated by applying genomics or other biotechnology tools to improve subsistence crops and plants of nutritional and medicinal value. The role of the public sector could be great, but government policies on these issues are still unclear. Through the African Union and other affiliated institutions, African nations should invest in public plant genomics research with suitable data deposition and sharing policies. It is imperative that publicly funded plant genomics databases be equally accessible to everyone in the continent. Funds should be increased for improving crop germplasm in the continent. Indeed, agricultural plant genomics studies should be publicly funded as plants' DNA sequences are necessary for the continued progress to understand crop and plant ecology and biodiversity. Furthermore, plant genomics research is naturally organized in a team-based manner. This would constitute another tool to enhance South-South partnership along with the traditional South-North schema. Finally, ethical and social implications of plant genomics studies, genome editing, and breeding should be considered, and patent laws affecting plant genomics must be clarified to evaluate and improve the perception and acceptance by the African population, in order to ensure fair benefit sharing and biodiversity conservation. Policymakers have to be informed and well introduced to this new but quickly accelerating field. Undoubtedly, the future of plant genomics in Africa looks bright for both consumers and producers of food and herbal medicines. Still, without proper measures and targeted actions, the poorest nations of the continent might miss the plant genomics revolution, which will also allow for the development of more productive and sustainable practices. 
Together, we can meet the challenges of increasing demands for feeding and healing of a fast-growing population in a changing environment. If the dearth of plant genomics research involving Africans persists, the potential nutrition, health, and economic benefits emanating from genomic science may once again elude the entire continent.