Microsatellite characterization and marker development for the fungus Penicillium digitatum, causal agent of green mold of citrus

Abstract Penicillium digitatum is one of the most important postharvest pathogens of citrus on a global scale, causing significant annual losses due to fruit rot. However, little is known about the diversity of P. digitatum populations. The genome of P. digitatum has been sequenced, providing an opportunity to determine the microsatellite distribution within P. digitatum and to develop markers that could be valuable tools for studying the population biology of this pathogen. In the analyses, a total of 3,134 microsatellite loci were detected; 66.37%, 23.23%, 8.23%, 1.24%, 0.16%, and 0.77% were detected as mono‐, di‐, tri‐, tetra‐, penta‐, and hexanucleotide repeats, respectively. Consistent with other ascomycete fungi, the genome size of P. digitatum does not seem to correlate with the density of microsatellite loci. However, significantly longer motifs of mono‐, di‐, and tetranucleotide repeats were identified in P. digitatum compared to 10 other published ascomycete species, with repeats of over 800, 300, and 900 motifs found, respectively. One isolate from southern California and five additional isolates from other countries (“global isolates”) were used to initially screen microsatellite markers developed in this study. Twelve additional isolates, referred to as the “local isolates,” were also collected from citrus at the University of California Riverside agricultural experiment station and were subsequently used to screen the primers that sequenced well and were polymorphic based on the global isolates. Thirty‐six primers were screened, and nine trinucleotide loci and one hexanucleotide locus were chosen as robust markers. These loci yielded two to seven alleles and will be useful for studying the population genetic structure of P. digitatum populations.

population genetic structures of many types of organisms (Putman & Carbone, 2014). Within the eukaryotes, studies investigating microsatellite variability within the Fungal Kingdom have lagged behind those in both the Plant and Animal Kingdoms. Two factors contribute to this: (a) the mycological research community is small compared to groups studying other organisms such as mammals or plants, and (b) distinct microsatellite loci in fungi appear to be much less abundant than in other organisms studied, and the standard techniques to isolate microsatellites, such as selection by hybridization of specific biotin-labeled microsatellite motif probes, are difficult to apply to many of the fungal species studied thus far (Dutech et al., 2007).
However, full genome sequencing of many different fungal species has enabled mycologists not only to investigate microsatellite distribution within specific fungal genomes but also to compare this information with other sequenced fungi (Karaoglu, Lee, & Meyer, 2005; Lim, Notley-McRobb, Lim, & Carter, 2004; Simpson, Wilken, Coetzee, Wingfield, & Wingfield, 2013). Sequenced fungal genomes provide access to a plethora of potential microsatellites from which markers can be developed to address various questions regarding the population biology of specific fungal species and/or species complexes (Simpson et al., 2013). Moreover, next-generation sequencing technologies have become relatively inexpensive, which has allowed researchers to develop microsatellite markers from incomplete genomic data for population-level studies of nonmodel organisms in general, including fungi (Abdelkrim, Robertson, Stanton, & Gemmell, 2009; Cai, Leadbetter, Muehlbauer, Molnar, & Hillman, 2013; Yu, Won, Jun, Lim, & Kwak, 2011).
Penicillium digitatum is one of the most important postharvest pathogens of citrus on a global scale and can be responsible for up to 90% of total crop loss after packing, storage, transportation, and marketing (Eckert, Sievert, & Ratnayake, 1994). Penicillium digitatum is a haploid fungus that belongs to the Phylum Ascomycota. It is only known to reproduce asexually but is found essentially everywhere citrus is produced. Despite the economic importance, interesting ecology, and global distribution of this fungal pathogen, few studies have been published on the population biology of P. digitatum. Most studies on P. digitatum diversity have focused on various aspects of fungicide resistance within "agricultural" populations and/or citrus packing plants from various citrus growing regions (e.g., Sánchez-Torres & Tuset, 2011).
The genome of P. digitatum has been sequenced, thus allowing this genome to be mined for microsatellite markers (Marcet-Houben et al., 2012). Therefore, the objectives of this study were to (a) determine the distribution of microsatellite loci in the published genome of P. digitatum and (b) design primers from specific loci to develop markers that would have utility for future population genetic studies of this important pathogen. To accomplish this, a global representation of isolates of P. digitatum was used to screen various loci to determine whether variation could be found by sequencing each locus. Variable loci were then further screened in a local California collection of isolates to gain insight into the utility of the markers from both fine-scale and global perspectives.
The genome of P. digitatum used in this study was published by Marcet-Houben et al. (2012) from an isolate (PHI126) recovered from orange in Valencia, Spain. The final assembly resulted in a genome size of approximately 26 Mb with an average GC content of 48.9%. The program Msatcommander 1.0 (Faircloth, 2008) was used to characterize all mono-, di-, tri-, tetra-, penta-, and hexanucleotide microsatellites within the genome of P. digitatum isolate PHI126. The parameters were set to restrict mononucleotide repeats to 12 bp and above, while all other motif classes were restricted to five repeats and above. The output files from Msatcommander 1.0 were sorted in Microsoft Excel 2016, and the results were compared to other published genomes of ascomycete fungi (Karaoglu et al., 2005; Lim et al., 2004; Simpson et al., 2013). For this comparison, the original data from the published manuscripts were used; the comparisons are therefore only relative, because slightly different parameters and different software were used to calculate microsatellite density in the different studies. The length distribution of mononucleotide repeats was also compared using SigmaPlot software (Systat Software, San Jose, CA).
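The search thresholds described above (mononucleotide tracts of at least 12 bp, all other motif classes with at least five repeat units) can be illustrated with a minimal Python sketch. This is not the Msatcommander algorithm; it is a simplified regular-expression scan for perfect repeats, and the function names and the example sequence are hypothetical.

```python
import re

# Minimum number of repeat units per motif size, following the thresholds in the text:
# mononucleotides >= 12 bp, di- to hexanucleotides >= 5 repeat units.
MIN_REPEATS = {1: 12, 2: 5, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """True if the motif is not itself built from a shorter repeated unit (e.g. 'ATAT')."""
    n = len(motif)
    return not any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))


def find_perfect_microsatellites(seq):
    """Return a sorted list of (start, motif, repeat_count) for perfect repeat tracts."""
    seq = seq.upper()
    hits = []
    for size, min_rep in MIN_REPEATS.items():
        # One motif copy followed by at least (min_rep - 1) further identical copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip e.g. 'AA' tracts here; they are reported once, as mononucleotides.
            if is_primitive(motif):
                hits.append((m.start(), motif, len(m.group(0)) // size))
    return sorted(hits)


# Hypothetical test sequence: one A tract (12 bp), one ACG tract (5 units),
# and one AC tract (4 units) that falls below the threshold.
example = "GC" + "A" * 12 + "GT" + "ACG" * 5 + "TT" + "AC" * 4 + "G"
for start, motif, count in find_perfect_microsatellites(example):
    print(start, motif, count)
```

Because the scan is non-overlapping within each motif size, tract boundaries next to another repeat may differ slightly from those reported by dedicated software.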

Microsatellite search and primer design
Primer pairs for di-, tri-, tetra-, penta-, and hexanucleotide loci (n = 43) were designed using the online version of Primer3 (Koressaar & Remm, 2007; Untergasser et al., 2012) with the default settings. The microsatellite motifs were identified by searching the genome for various repeats and choosing those with more than nine repeats. Only perfect microsatellite loci within the target genome (isolate PHI126) were used to design primers. The flanking regions were also scanned for potential repetitive elements directly outside of the perfect repeats. When a locus was found acceptable, approximately 100-200 bp on either side of the repeat was included in the template to find robust primers with annealing temperatures of at least 60°C.
All primers were purchased from Integrated DNA Technologies, Coralville, Iowa. Our approach was biased toward trinucleotide and longer repeat motifs, given the difficulty of differentiating dinucleotide repeats using fluorescently labeled primers and capillary sequencing methodologies.
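The template preparation step for primer design, in which roughly 100-200 bp on either side of a repeat is taken together with the repeat itself, can be sketched as follows. This is only an illustration of the windowing arithmetic; the function name, the default flank of 150 bp, and the example contig are assumptions, and primer selection itself was done in Primer3 and is not reproduced here.

```python
def primer_template(contig, repeat_start, repeat_len, flank=150):
    """Return (template, target_start, target_len) for a repeat plus flanking sequence.

    Coordinates are 0-based. The window is clipped at the contig ends, so a repeat
    close to a contig edge yields a shorter flank on that side.
    """
    left = max(0, repeat_start - flank)
    right = min(len(contig), repeat_start + repeat_len + flank)
    # target_start is the repeat position relative to the extracted template.
    return contig[left:right], repeat_start - left, repeat_len


# Hypothetical contig: 200 bp of flank, a 10-unit ACG repeat, then 100 bp of flank.
contig = "G" * 200 + "ACG" * 10 + "T" * 100
template, target_start, target_len = primer_template(contig, 200, 30)
print(len(template), target_start, target_len)
```

The returned target coordinates mark the region that primers should flank rather than overlap.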

Fungal isolates
To screen the initial primers, five P. digitatum isolates, originally isolated from South Africa, Uruguay, Argentina, Chile, and Cyprus, were acquired in cetyltrimethylammonium bromide (CTAB) extraction buffer from Dr. Mareli Kellerman at Stellenbosch University. One North American isolate, collected at the Agricultural Operations (AgOps) facility at the University of California Riverside, was also included in the initial primer screen; the six isolates are referred to as the "global isolates" within this manuscript. Twelve additional isolates, referred to as the "local isolates," were also collected at AgOps and were subsequently used to screen the primers that sequenced well and were polymorphic based on the global isolates. The local isolates were collected from fallen diseased citrus fruit located approximately 3-1,400 m apart from one another. To isolate the fungi, spores were swabbed directly from colonized fruit in the field, placed in sterile H2O, and dilution plated onto potato dextrose agar (PDA; Becton, Dickinson and Company, Franklin Lakes, NJ). The cultures were incubated at 25°C for 1-2 weeks. The leading edge of a single colony was then transferred to a new agar plate and incubated for an additional 1-2 weeks, after which the spores and mycelium were scraped off the agar into CTAB extraction buffer (Gardes & Bruns, 1993).

DNA extraction and microsatellite amplification
DNA was extracted from all P. digitatum isolates using a slightly modified version of the chloroform/CTAB method of Gardes and Bruns (1993). The genomic DNA was electrophoresed in 0.8% gels and visualized as described below. The DNA was di- PCR products were purified using a solution of 1% exonuclease I (10 U/μl), 10% phosphatase (1 U/μl; Affymetrix, Santa Clara, CA), and 89% sterile H2O. To purify the PCR products, 1.5 μl of this cocktail was added to 6 μl of PCR product and incubated in a Bio-Rad MyCycler for one cycle of 37°C (15 min) followed by 80°C (15 min). The purified PCR products were Sanger sequenced at the Institute of Integrative Genome Biology Genomics Core facility at the University of California Riverside. During the first screening, only a single primer was used per locus; once loci with clean reads for all tester isolates under our conditions were identified, the other primers were used to acquire the opposite reads. Contigs were edited using Sequencher software (Gene Codes Corporation, Ann Arbor, MI, USA). The sequences were aligned using ClustalX (Thompson, Gibson, Plewniak, Jeanmougin, & Higgins, 1997), and alleles were counted in MacClade (Maddison & Maddison, 2005).

Data analysis
The program POPGENE was used to calculate allele frequencies and various population genetic summary statistics to compare the global and local isolates (Yeh, Yang, Boyle, Ye, & Mao, 1997). Randomization procedures in FSTAT were also used to test for population differentiation between the local and global samples by comparing the allele frequencies using Weir and Cockerham's population differentiation statistic θ (Goudet, 2002). The estimated θ values were tested under the null hypothesis of no differentiation between the global and local isolates by comparing the observed values of θ to values estimated for data sets in which alleles were resampled without replacement 10,000 times (Goudet, 2002). To determine multilocus genotypes (MLG), the data were sorted in Microsoft Excel 2016.
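The kind of summary statistics and randomization test described in this section can be sketched in Python for a single locus in haploid isolates. This is a simplified illustration, not the POPGENE or FSTAT implementation: the differentiation statistic below is a plain (H_T - H_S) / H_T ratio used as a stand-in for Weir and Cockerham's θ, the gene diversity carries no sample-size correction, and the allele data in the usage example are hypothetical.

```python
import math
import random
from collections import Counter


def allele_freqs(alleles):
    """Allele frequencies at one locus from a list of haploid allele calls."""
    counts = Counter(alleles)
    n = len(alleles)
    return {allele: c / n for allele, c in counts.items()}


def nei_gene_diversity(alleles):
    """H = 1 - sum(p_i^2), without sample-size correction."""
    return 1.0 - sum(p * p for p in allele_freqs(alleles).values())


def shannon_index(alleles):
    """I = -sum(p_i * ln p_i)."""
    return -sum(p * math.log(p) for p in allele_freqs(alleles).values())


def simple_fst(pop_a, pop_b):
    """(H_T - H_S) / H_T for two samples; a simplified stand-in for theta."""
    h_t = nei_gene_diversity(pop_a + pop_b)
    n_a, n_b = len(pop_a), len(pop_b)
    h_s = (n_a * nei_gene_diversity(pop_a) + n_b * nei_gene_diversity(pop_b)) / (n_a + n_b)
    return 0.0 if h_t == 0 else (h_t - h_s) / h_t


def permutation_p_value(pop_a, pop_b, n_perm=10000, seed=1):
    """Shuffle isolates between samples (no replacement) and compare to the observed value."""
    rng = random.Random(seed)
    observed = simple_fst(pop_a, pop_b)
    pooled = pop_a + pop_b
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if simple_fst(pooled[:len(pop_a)], pooled[len(pop_a):]) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical repeat-count alleles for 6 "global" and 12 "local" isolates at one locus.
global_sample = [10, 10, 10, 10, 11, 12]
local_sample = [13, 13, 14, 14, 15, 15, 16, 16, 17, 17, 12, 13]
print(nei_gene_diversity(local_sample), shannon_index(local_sample))
print(permutation_p_value(global_sample, local_sample, n_perm=2000))
```

A multilocus test would permute whole isolates (all loci together) rather than alleles at one locus, which is a straightforward extension of the same shuffling step.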

Microsatellite characterization
A total of 3,134 microsatellite loci were found in the genome of P. digitatum isolate PHI126 based on the parameters used in Msatcommander 1.0 (Table 1). A total of 2,080, 728, 258, 39, 5, and 24 mono-, di-, tri-, tetra-, penta-, and hexanucleotide repeats were identified, respectively (Table 1). Mono-, di-, and trinucleotides were more common than tetra-, penta-, and hexanucleotides and also had a much greater range of repeat lengths (Table 1). As with the other ascomycete fungi compared to P. digitatum, mononucleotides were the most abundant repeats, with a general trend of repeats becoming less abundant as repeat length increases (Table 2). The longest microsatellites found were mononucleotide repeats with lengths up to 795 and 893 bp for A/T and C/G repeats, respectively, and there were many long (>200 bp) mononucleotide repeats within the P. digitatum genome (Figure 1). When comparing the lengths of the longest microsatellite repeats published for ascomycete fungi, P. digitatum had significantly longer mono-, di-, and trinucleotide repeats (Table 3).
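The share of each motif class and the overall locus density follow directly from the counts reported above. The short Python sketch below reproduces that arithmetic; the density figure assumes the approximate genome size of 26 Mb given for the PHI126 assembly and is therefore only an approximation.

```python
# Locus counts per motif class as reported in the text.
counts = {"mono": 2080, "di": 728, "tri": 258, "tetra": 39, "penta": 5, "hexa": 24}

total = sum(counts.values())

# Percentage of all loci contributed by each motif class, rounded to two decimals.
percent = {motif: round(100 * n / total, 2) for motif, n in counts.items()}

# Approximate density in loci per Mb, assuming a genome size of about 26 Mb.
GENOME_SIZE_MB = 26
density_per_mb = total / GENOME_SIZE_MB

print(total)
for motif, share in percent.items():
    print(motif, share)
print(round(density_per_mb, 1))
```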

Microsatellite variability
Forty-one of the 43 primer pairs tested amplified successfully from all of the initial global tester isolates under the PCR parameters used in this study. Two loci, 3-6 and 3-23, amplified, but not from all tester isolates, despite multiple attempts. From the initial sequencing results, eight loci were not variable and 25 loci produced reads that could not be easily scored for all tester isolates despite multiple attempts to produce "clean" reads (Appendix 1). Of the loci that yielded quality sequencing reads from both the global and local isolates, 10 were characterized further (Table 4).
Nine trinucleotide loci and one hexanucleotide microsatellite locus were characterized that yielded two to three and two to seven alleles within the global and local isolates, respectively (Table 4).
Nine out of the 10 loci were perfect repeats, whereas locus 3-3 was found to be a compound repeat in the isolates tested, although this locus was a perfect repeat within the sampled genome. Additionally, this repeat was only 18 repeats long in the reference isolate from Spain but was significantly longer in the isolates sequenced in this study. The opposite, with respect to length differences, was found at locus 3-12, where the repeat length in the Spain isolate was 128 repeats but was considerably shorter in the isolates sequenced.
Comparisons of the flanking sequence regions of all sequenced microsatellite loci with that of the Spanish isolate confirmed that the characterized sequenced microsatellite loci were all homologous.
There were also considerable differences between the global and local isolates with respect to allele frequencies (Table 5) and other summary statistics (Table 4). Obvious differences in allele frequencies between the global and local "populations" can readily be seen in Table 5, as can many private alleles, which resulted in significant population differentiation based on the Fst analysis (p < 0.001). A larger number of alleles and higher estimates of diversity were also found within the local isolates compared to the global isolates (Table 4).
Out of the 18 isolates genotyped based on direct sequencing, only four isolates had clonal genotypes, all from the global sample: Chile, Uruguay, Argentina, and Cyprus. The California and South African isolates used in the initial screening accounted for the variable loci observed prior to screening the local isolates. All of the other isolates represented unique MLG, and no single locus was monomorphic, even in this limited sampling of isolates used to develop these markers. of fungi, yet all MLG found within the local population were unique, even isolates collected ~3 m from one another. These markers can now be utilized in future studies to investigate diversity, population structure, the potential for recombination/sexual reproduction, and other ecological and evolutionary processes that shape P. digitatum populations.
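Identifying multilocus genotypes amounts to grouping isolates that carry the same allele at every scored locus. The study did this by sorting the data in a spreadsheet; the Python sketch below shows an equivalent grouping step under the assumption that each isolate is represented by a tuple of allele calls in a fixed locus order. The isolate labels and allele values are hypothetical.

```python
from collections import defaultdict


def multilocus_genotypes(genotypes):
    """Group isolate names by their tuple of alleles across all loci."""
    groups = defaultdict(list)
    for isolate, alleles in genotypes.items():
        groups[tuple(alleles)].append(isolate)
    return dict(groups)


def clonal_groups(groups):
    """Return only the genotypes shared by more than one isolate."""
    return {mlg: names for mlg, names in groups.items() if len(names) > 1}


# Hypothetical repeat-count alleles at three loci for five isolates.
genotypes = {
    "iso_A": (10, 7, 22),
    "iso_B": (10, 7, 22),
    "iso_C": (12, 7, 22),
    "iso_D": (12, 8, 25),
    "iso_E": (10, 7, 22),
}
groups = multilocus_genotypes(genotypes)
print(len(groups))
print(clonal_groups(groups))
```

Isolates with missing data at a locus would need explicit handling before grouping, since a missing call is not evidence of a distinct genotype.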

DISCUSSION
Within the genus Penicillium, the only two studies to our knowledge to develop microsatellite markers have been for P. marneffei (Lasker & Ran, 2004) and P. roqueforti (Ropars et al., 2014).
Penicillium marneffei is a human pathogen, and Lasker and Ran (2004) developed these markers to assist in epidemiological studies of this pathogen. However, P. marneffei is actually taxonomically distinct from the genus Penicillium and is more closely related to the genus Talaromyces (LoBuglio & Taylor, 1995). Penicillium roqueforti is the famous fungal species used to produce the marbled effect and taste of blue cheese. In the latter study, Ropars et al. (2014) found significant microsatellite variation among over 100 isolates of P. roqueforti and actually induced viable sexual structures of this species in vitro. This was unexpected because this species, like many other Penicillium species, was thought to be strictly asexual.
In this study, we also found significant variation, which is consistent with a potential for sexual reproduction in this species, although this has not been demonstrated. Most isolates were unique MLG, which is especially interesting for the local population, which was collected from one general location. Similar results were found by Lee (2002) who only found two clonal isolates out of 100 sampled primarily homologue which suggests that this species may need to outcross if sexual reproduction is possible. Therefore, based on our results of significant variation even at a fine sampling scale, P. digitatum may reproduce sexually, but the sexual phase has yet to be discovered. This is an intriguing hypothesis, but further studies will be needed to investigate it.

Microsatellite repeats exist in other fungal species and generally share the same pattern in which shorter repeats are more common and variable than larger repeats (Karaoglu et al., 2005; Lim et al., 2004; Simpson et al., 2013), which was also found in this study. In comparison with 10 other ascomycete fungi, P. digitatum had significantly longer microsatellite motifs for mono-, di-, and trinucleotide repeats (Lim et al., 2004). Moreover, consistent with other ascomycete fungi, the genome size of P. digitatum does not seem to correlate with the density of microsatellite loci. What was unusual in this study compared to other published studies on microsatellites was the extremely large repeats of three loci in the published P. digitatum genome. It appears that microsatellites with large numbers of repeats, or long microsatellites, are rare in fungal genomes compared to the human genome (Dutech et al., 2007). However, the biological importance of these large repeats is essentially unknown at this time.
TABLE 3 Comparison of the longest microsatellite repeats between various ascomycete fungi and Penicillium digitatum isolate PHI126
Note. H: Nei's gene diversity (Nei, 1972); I: Shannon information index (Lewontin, 1972); NA: observed number of alleles.
For this study, the global isolates were artificially pooled to represent a population so that population differentiation could be compared to the local population. Based on allele frequencies alone (Table 5), it was clear that the two "populations" were differentiated, which was confirmed statistically (p < 0.001). Many private alleles were also found in both "populations," and when shared alleles were found, their frequencies were mostly very different. Further studies sampling "local" populations throughout citrus growing areas will help to elucidate the population structure of this pathogen and may provide insight into the importance of the mechanism(s) of reproduction and spore movement.
Supplemental data for primers that amplified but could not be sequenced reliably are also provided (Appendix 1). Most of these loci produced clean PCR products; however, it was not possible to obtain clean sequencing reads from all six tester isolates, although variation within loci was observed. The microsatellite loci that were unreadable were sequenced at least twice, and each time similarly unclear reads were obtained. The difficulty of performing Sanger sequencing on microsatellites could be due to the instability of long stretches of nucleotide repeats, which has been well documented (Wierdl, Dominska, & Petes, 1997). We took a conservative approach and sequenced until we found 10 loci that could be scored unambiguously. However, these additional loci may also be useful in the future using size selection via fluorescently labeled primer methods, as we plan to do in future work on this important pathogen of citrus.

CONFLICT OF INTEREST
The authors declare no conflict of interest.

AUTHORS CONTRIBUTION
E.S.V. and G.W.D. conceived and designed this study, collected and analyzed data, and drafted the article. S.B. and G.V. helped in collecting and analyzing data and critically revised the article.

ETHICS STATEMENT
Not required.
TABLE 5 Allele frequencies between the local and global "populations" of Penicillium digitatum isolates used in this study (loci: 3-2, 3-3, 3-5, 3-8, 3-9)
APPENDIX 1 Primers for microsatellite loci that amplified but for which clean sequencing reads were not obtained from all six tester isolates