Construction of Host Plant Insect‐Resistance Mutant Library by High‐Throughput CRISPR/Cas9 System and Identification of A Broad‐Spectrum Insect Resistance Gene

Abstract Insects pose significant challenges in cotton‐producing regions. Here, we describe a high‐throughput CRISPR/Cas9‐mediated large‐scale mutagenesis library targeting endogenous insect‐resistance‐related genes in cotton. This library targeted 502 previously identified genes using 968 sgRNAs, generated ≈2000 T0 plants, and achieved 97.29% genome editing with efficient heredity, reaching up to 84.78%. Several potential resistance‐related mutants (10% of 200 lines) were identified that may contribute to cotton‐insect molecular interaction. Among these, we selected lines 139 and 144, which target major latex‐like protein 423 (GhMLP423) and showed decreased resistance to pest infestation, for in‐depth study. Overexpression of GhMLP423 enhanced insect resistance by activating the plant systemic acquired resistance (SAR) of salicylic acid (SA) and pathogenesis‐related (PR) genes. This activation is induced by an elevation of cytosolic calcium [Ca2+]cyt flux eliciting reactive oxygen species (ROS), which were reduced in GhMLP423 knockout (CR) plants. Protein‐protein interaction assays revealed that GhMLP423 interacted with a human epidermal growth factor receptor substrate15 (EPS15) protein at the cell membrane. Together, they regulated the systemically propagating waves of Ca2+ and ROS, which in turn induced SAR. Collectively, this large‐scale mutagenesis library provides an efficient strategy for functional genomics research of polyploid plant species and serves as a solid platform for genetic engineering of insect resistance.

Table S3. List of all sgRNA target site sequences used for the construction of the pooled sgRNA library.
The sequence of 968 sgRNA target sites

Figure S1. GO function enrichment analysis of 502 target genes. Pathways with higher confidence are shown in red, with the adjusted p-value presented. The dot size represents differences in gene number among the enriched GO terms.

Figure S2. Genetic transformation and plant regeneration of cotton, with greenhouse and field views of cotton mutants. (a) Seven-day-old explants grown in the dark and used for transformation. (b-f) Callus induction and differentiation. (g) Plant regeneration. (h-i) Acclimatization of regenerated plants grown in the growth room, greenhouse, and field, respectively. Scale bars = 1.1 cm.

Figure S3. Sanger sequencing validated the high-throughput sequencing data.

Figure S4. Detailed editing profiles of T0 plants in the mutant library. (a) Distribution of the length of deletions. (b) Different base numbers of substitutions. (c) Statistics of short fragment insertions with different lengths.

Figure S5. Inheritance of gene editing from the T0 to the T1 generation. (a) Frequency of editing types in 276 T1 plants. Each line represents a T1 plant, and lines of the same color come from the same T0 plant. (b, c) Genome editing profiles of line 24 and line 552 in T0 and three T1 plants, compared with the reference sequences TTTCCTTTTTGTTCATTTCCTCATTGAGGTTGGAGCGCAT and ACCTCCTCCACAGCTACCATCATCGATATATGGCGATTCT, respectively. The position of the protospacer adjacent motif (PAM) is marked by a red line, the insertion of nucleotides is marked by a red border,

Figure S6. Determination of aphid population density on five screened genome-edited lines with distinct phenotypes.

Figure S7. The editing profiles of lines no. 139 and no. 144 and the expression level of the GhMLP423 gene in the generated mutants.

Figure S8. Evolution and selection of the GhMLP423 locus. (a) Principal component analysis (PCA) of the Ghlandraces, GhImpUSO and GhImpCHN populations, with PC1 on the X-axis and PC2 on the Y-axis. (b) Nucleotide diversity (π) of the Ghlandraces, GhImpUSO and GhImpCHN populations. The X-axis spans 20 kb upstream and downstream of the GhMLP423 gene, and the Y-axis shows nucleotide diversity (×10⁻⁴). (c) Structure of the GhMLP423 gene and its single-nucleotide polymorphism (SNP) variants. The five haplotypes with the highest proportion in the Ghlandraces population are shown; blank boxes indicate nucleotides identical to the reference genome (Hap1). The pie charts show the proportion of the five haplotypes. (d) Expression of the GhMLP423 gene, based on transcriptome data from 10 Ghlandraces samples and 40 GhImpCHN samples. * indicates a significant difference (P ≤ 0.05). FPKM, Fragments Per Kilobase of transcript per Million mapped reads. (e) Three haplotypes of SNP variants in the promoter of the GhMLP423 gene (>5%); blank boxes indicate nucleotides identical to the reference genome (Hap1). The pie chart shows the proportions of the three haplotypes and the other haplotypes.
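
For readers unfamiliar with the π statistic plotted in panel (b), the following is a minimal sketch of how nucleotide diversity is commonly computed: the average number of pairwise differences per site among aligned sequences. It is an illustration only and not the pipeline used in the study; the function name `nucleotide_diversity` and the toy sequences are hypothetical, and a real analysis would work from variant calls in sliding windows and handle gaps and missing data.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average pairwise differences per site among aligned sequences.

    Assumption: all sequences are already aligned and have equal length.
    """
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must share the same non-zero length")

    pairs = list(combinations(seqs, 2))
    # Count mismatched positions for every pair of sequences.
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    # Normalize by the number of pairs and the number of sites.
    return total_diffs / (len(pairs) * length)


# Toy, made-up haplotypes (not data from the study).
toy_haplotypes = ["ACGTACGT", "ACGTACGA", "ACGAACGT"]
pi = nucleotide_diversity(toy_haplotypes)
print(f"pi = {pi:.4f}")
```

A lower π in a window around a gene, relative to the surrounding region or to another population, is often read as reduced diversity at that locus, which is the kind of comparison panel (b) presents across the three populations.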

Figure S12. Expression level analysis of PR genes, GhEPS15, and GhMLP423 in different plant materials.

Table S1. List of 502 target gene IDs used for targeted editing by CRISPR.

Table S2. Sequencing results of 100 clones randomly selected to test gene coverage in a small pooled vector library.