Pathogenic variants in the Longitudinal Early‐onset Alzheimer's Disease Study cohort

One goal of the Longitudinal Early‐onset Alzheimer's Disease Study (LEADS) is to investigate the genetic etiology of early onset (40–64 years) cognitive impairment. Toward this goal, LEADS participants are screened for known pathogenic variants.


INTRODUCTION
Although early-onset Alzheimer's disease (EOAD) has been estimated to be highly heritable (>90%), only ≈5%-10% of individuals with EOAD carry a known autosomal dominant pathogenic variant in the APP, PSEN1, or PSEN2 genes.[1] Similarly, about 30% of frontotemporal dementia (FTD) incidence is attributed to pathogenic variants in the GRN and MAPT genes and to expansion of a hexanucleotide repeat in the C9ORF72 gene, with a small amount accounted for by rare pathogenic variants in several additional genes; a large portion of the genetic etiology of this disease has not yet been identified.[2]
The Longitudinal Early-onset Alzheimer's Disease Study (LEADS) targets enrollment of individuals with early-onset (age 40-64 years) cognitive impairment who lack a strong family history of EOAD (the study excludes individuals with >1 immediate relative with EOAD) and who do not have a known genetic etiology such as a pathogenic PSEN1 variant.[3] LEADS is designed to fill a gap in EOAD research by recruiting individuals who do not qualify for studies of Mendelian EOAD such as the Dominantly Inherited Alzheimer Network (DIAN); LEADS data are being used to investigate longitudinal cognitive impairments, fluid and neuroimaging biomarkers, and genetic causes of EOAD.
Enrolled patients are screened for brain amyloid positivity (EOAD) or negativity (EOnonAD) using positron emission tomography (PET) neuroimaging. Cognitive impairment at any age can be caused by a host of etiologies, the most common being Alzheimer's disease (AD). Thus, although the majority of individuals screened for LEADS are amyloid positive, it is not surprising that some of the cognitively impaired LEADS participants are amyloid negative. These participants are also followed in LEADS, and their clinical profile is similar to suspected non-Alzheimer's disease pathophysiology (SNAP) in older individuals.[4]
An exploratory aim of LEADS is to investigate the genetic etiology of EOAD and EOnonAD, with the goal of identifying novel genetic variants that cause or contribute to risk for EOAD and EOnonAD.
LEADS includes a genetic testing pipeline, wherein all participants are screened for previously reported pathogenic variants in APP, PSEN1, PSEN2, GRN, or MAPT, and pathogenic repeat expansions in C9ORF72.
The objective of this report, including patients enrolled during the first half of LEADS, is to investigate the frequency of these identified pathogenic variants, as well as the potential contribution of other rare functional variants in APP, PSEN1, PSEN2, GRN, and MAPT to disease.
The goal of this analysis is to confirm that variants in the screened genes APP, PSEN1, PSEN2, GRN, MAPT, and C9ORF72 are not contributing to the genetic etiology of most LEADS EOAD/EOnonAD patients, who are selected based on a lack of extensive family history of disease.

Participants
This study included 299 individuals with early-onset cognitive impairment enrolled in LEADS; cognitively normal controls were not submitted for sequencing and were not included in the analysis. Affected individuals had biospecimens including deoxyribonucleic acid (DNA) collected at baseline and were assessed with a neurocognitive battery as well as neuroimaging including PET amyloid and tau imaging.
Collected data included demographics such as age at enrollment and age at symptom onset for cognitive impairment, sex, race, ethnicity, and family history of AD in parents and siblings, as well as results of neurocognitive examinations including the Mini-Mental State Exam (MMSE).[5] Study protocols have been described extensively in Apostolova et al.[3] More information on LEADS leadership, resources, and data sharing policies is available on the LEADS website (https://leadsstudy.medicine.iu.edu/).

Ethics statement
Written informed consent was obtained from all participants or their authorized representatives prior to study inclusion. A central institutional review board (IRB) at Indiana University approved this study, which was conducted according to the ethical standards of the Helsinki Declaration of 1975.

Genetic data processing
LEADS whole exome sequencing (WES) data were processed following GATK Best Practices using Sentieon Genomics software (Sentieon, Inc., San Jose, CA).[7] Briefly, paired-end FastQ files were aligned to Genome Reference Consortium Human Build 38 (hg38) using the recommended pipeline for Sentieon's proprietary BWA-MEM function fused with a process to account for the many alternate contigs included in hg38. BWA-MEM typically assigns reads that map to more than one location a mapping quality score of zero, leading to highly divergent genome regions being excluded from downstream analyses. The alternate-contig-aware process adjusted read tags and de-coupled paired-end mates to prevent MAPQ dead zones, allowing the variant caller to include reads with mates mapping to different contigs.[7] Initially, three Picard functions were implemented: RevertSam to produce unmapped BAM files, AddOrReplaceReadGroups to assign all reads in individual files to a single new read-group, and MergeBamAlignment to merge all aligned and unaligned reads per sample.
Files were then sorted with a Sentieon Util sort function. Duplicates were then removed, and the BAM files were further processed (realigned, recalibrated, sorted, tagged, and indexed). Processed BAM files were used to generate gVCFs with Sentieon Haplotyper. Next, gVCFs were joint-called using Sentieon GVCFtyper.

Variant review
Data were reviewed for all affected participants. For genes APP, PSEN1, or PSEN2, genetic variants were considered pathogenic if they were included in the list of variants qualifying individuals for the Dominantly Inherited Alzheimer Network Trials Unit (DIAN-TU).[12,13] For genes GRN and MAPT, variants were manually reviewed and identified as pathogenic based on previous reports in ClinVar (https://www.ncbi.nlm.nih.gov/clinvar/), the Human Gene Mutation Database (HGMD, https://www.hgmd.cf.ac.uk/), and the Leiden Open Variation Database (LOVD, https://www.lovd.nl/).[15][16][17][18][19] Patients were not informed about variants not meeting the criteria for pathogenicity.
Copy number variant (CNV) calls for each subject from both the CANOES and CoNIFER results were aligned and compared for overlap to identify high-confidence calls. Overlapping CNVs were then reviewed to determine whether any occurred within genes APP, PSEN1, PSEN2, GRN, or MAPT.
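The overlap comparison described above can be illustrated with a minimal Python sketch that keeps only CNV calls reported by both callers for the same subject. The record layout, the requirement of matching CNV type, the any-base-overlap criterion, the function names, and the example coordinates are all assumptions made for illustration; they are not taken from the LEADS pipeline or from the CANOES or CoNIFER output formats.

```python
from typing import NamedTuple


class CnvCall(NamedTuple):
    """One CNV call; this field layout is a hypothetical simplification."""
    subject: str
    chrom: str
    start: int  # inclusive start coordinate
    end: int    # inclusive end coordinate
    kind: str   # "DEL" or "DUP"


def overlaps(a: CnvCall, b: CnvCall) -> bool:
    """True if two calls share subject, chromosome, type, and at least one base."""
    return (
        a.subject == b.subject
        and a.chrom == b.chrom
        and a.kind == b.kind
        and a.start <= b.end
        and b.start <= a.end
    )


def high_confidence_calls(caller_a, caller_b):
    """Return (call_a, call_b) pairs that are supported by both callers."""
    return [(a, b) for a in caller_a for b in caller_b if overlaps(a, b)]


# Made-up example calls, not real data.
canoes_calls = [
    CnvCall("S1", "chr21", 1000, 5000, "DUP"),
    CnvCall("S2", "chr17", 200, 900, "DEL"),
]
conifer_calls = [
    CnvCall("S1", "chr21", 4000, 8000, "DUP"),
    CnvCall("S2", "chr17", 1000, 2000, "DEL"),
]

pairs = high_confidence_calls(canoes_calls, conifer_calls)
print(pairs)
```

In this toy input only the S1 duplication is retained, because the two S2 deletions do not share any bases. A stricter criterion, such as a minimum reciprocal overlap fraction, could be substituted inside `overlaps` without changing the rest of the sketch.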

Pathogenic variant confirmation
For participants with an identified, previously reported pathogenic variant in one of the six screened genes who signed an informed consent to have genetic test results returned, a separate 6 mL tube of blood was transferred to the Indiana University Genetics Testing Laboratories, or to GeneDx (GeneDx, LLC, Gaithersburg, MD), for DNA extraction and PCR-based genotype confirmation in a Clinical Laboratory Improvement Amendments (CLIA)-certified laboratory.

Controls for statistical analyses
… were calculated in PLINK.[22,23] Population frequencies for analyzed SNPs were obtained from gnomAD.[24] Post hoc testing was also performed to investigate the contribution of individual variants to enrichment analysis results via logistic association testing in PLINK, covarying for age, sex, and APOE ε4 carrier status; testing was limited to SNPs with at least one minor allele in cases and controls.

Pathogenic variants identified
Of the 299 LEADS EOAD and EOnonAD participants with sequencing data, a total of eight pathogenic variant or repeat expansion carriers were identified (carrier frequency of 2.68%), including three EOAD participants heterozygous for PSEN1 variants, two EOnonAD participants heterozygous for GRN variants, two EOnonAD participants with heterozygous C9ORF72 pathogenic repeat expansions, and one EOnonAD participant heterozygous for a MAPT variant (Table 1). The two heterozygous C9ORF72 repeat expansion carriers both had full (beyond assay quantifiable detection limit) repeat expansions. The rate of previously reported pathogenic variants was 1.35% in EOAD (3/223) and 6.58% in EOnonAD (5/76) (Figure 1).
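The carrier frequencies quoted above follow directly from the reported counts. The short Python check below reproduces the arithmetic; the function and variable names are illustrative only.

```python
def carrier_rate(carriers: int, total: int) -> float:
    """Carrier frequency as a percentage, rounded to two decimals."""
    return round(100 * carriers / total, 2)


# Counts reported in the text above.
eoad_carriers, eoad_total = 3, 223
eononad_carriers, eononad_total = 5, 76

overall = carrier_rate(eoad_carriers + eononad_carriers, eoad_total + eononad_total)
eoad = carrier_rate(eoad_carriers, eoad_total)
eononad = carrier_rate(eononad_carriers, eononad_total)

print(overall, eoad, eononad)  # 2.68 1.35 6.58
```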

Gene burden results
Assessment of participant demographics for the LEADS cases (excluding previously reported pathogenic variant carriers) and PPMI controls identified significant diagnostic group differences for sex, enrollment age, and APOE ε4 carrier status (Table 3). Rare functional variants in PSEN2 showed significant variant enrichment (p = 0.0121, Table 4); however, post hoc association analysis of SNPs in PLINK showed that this result was driven by rs140501902, which was more common in controls than in cases (p = 0.04; Table 5), rather than by enrichment of rare variants in cases compared to controls. No genes showed significant enrichment in only EOAD or only EOnonAD compared to controls, although in EOAD there was a trend for enrichment of variants in PSEN2 (p = 0.059), again driven by rs140501902 minor allele enrichment in controls. This SNP has also been reported as Benign/likely-benign in ClinVar.

DISCUSSION
Screening indicates that the frequency of previously reported pathogenic variants in APP, PSEN1, PSEN2, GRN, MAPT, or C9ORF72 is low for both EOAD and EOnonAD LEADS participants, although variants are more frequent in EOnonAD than in EOAD. Results from the gene burden analysis of rare functional variants in these genes also support this conclusion, showing that unidentified rare variants in these genes are also not responsible for a significant portion of EOAD or EOnonAD cases. This highlights the importance of future studies to investigate other genetic factors and genes that may play roles in genetic risk or etiology of early-onset cognitive impairment in the LEADS study. Preliminary results from genetic screening of LEADS participants also indicate that study exclusion criteria for individuals with extensive AD family history have been successful in avoiding enrichment of autosomal dominantly inherited pathogenic variants for AD and FTD.
We observed that both EOAD and EOnonAD diagnostic groups included more APOE ε4 heterozygotes compared to controls, and participants with EOAD had more APOE ε4 homozygotes compared to controls.[27] Although autosomal dominant pathogenic variants in APP, PSEN1, and PSEN2 are estimated to account for ≈10%-15% of EOAD, the observed frequency of these variants is lower in LEADS, showing that, as expected, variants in these genes do not account for most disease risk in this cohort.[28] Up to 70% of EOAD following a Mendelian …
TABLE 5 SNP summary from gene burden analysis of all cases.

Limitations
Although the sample size of the LEADS cohort is impressive given the rarity of EOAD in the general population, it is still small for a genetics study. LEADS sample size and diagnostic heterogeneity currently limit the ability to perform discovery-based genetic analyses; however, the use of gene burden testing allows us to leverage summary-level data to investigate the contribution of rare functional variants in screened genes to case status. Given the small sample size, we did not remove individuals with diverse races or ethnicities. However, we did perform gene burden testing in only White non-Hispanic individuals to check sensitivity; results were not significantly different from results including all participants. An important limitation to note is that we did not have WES data for LEADS controls for this analysis. It is possible that merging data for LEADS cases and PPMI controls may introduce batch effects to the gene burden analysis; it will be important for future work to conduct gene burden testing including LEADS control sequencing data once available, to verify and expand these results. In addition, this report focused on the genetic screening pipeline, which includes six genes accounting for the majority of known pathogenic variants in AD and FTD; however, it is possible that rare variants in other neurodegenerative disease-related genes could account for some portion of the genetic etiology of the LEADS cohort.
Future work as enrollment continues will expand to encompass additional genes. Finally, current CNV results are based on WES data; it is possible that whole genome sequencing will identify additional CNVs not detectable with the data currently available.

Future directions
It will be important to expand these analyses to the entire LEADS cohort once enrolled, to validate preliminary findings regarding pathogenic variant frequency in APP, PSEN1, PSEN2, GRN, MAPT, and C9ORF72 in LEADS cases, as well as to expand analyses to include additional neurodegenerative disease-related genes. Future plans also include performing whole genome sequencing (WGS) on all participants, which will enable assessment of the contribution of non-coding variants in genes of interest, as well as a more complete assessment of CNVs. It will also be important for future studies to leverage planned enrollment of more heterogeneous individuals to investigate the contribution of genetic ancestry and genetic background in diverse geographic and racial/ethnic cohorts to disease risk and progression.

CONCLUSIONS
These initial findings highlight the LEADS cohort as an excellent source of early-onset cognitive impairment cases for future analyses of novel genetic etiology for EOAD and EOnonAD and support the important complementary role of LEADS compared to studies such as DIAN in AD research and future clinical trials.

RESEARCH IN CONTEXT
1. Systematic review: Literature relating to the genetics of early-onset Alzheimer's disease (EOAD) and frontotemporal dementia was reviewed, referencing traditional sources such as PubMed and the collective expertise of the Longitudinal Early-onset Alzheimer's Disease Study (LEADS) consortium. Studies have investigated the contribution of pathogenic variants in APP, PSEN1, and PSEN2 to EOAD and variants in GRN, MAPT, and C9ORF72 to frontotemporal dementia; these findings are cited. However, literature is limited on the efficacy and impact of the selection of non-carriers of pathogenic variants based on a family history of disease.
2. Interpretation: In the Longitudinal Early-onset Alzheimer's Disease Study (LEADS, N = 299), pathogenic variants in APP, PSEN1, PSEN2, GRN, MAPT, and C9ORF72 genes were detected in eight (2.7%) affected individuals, highlighting the utility of LEADS for discovery-based research of novel variants.
3. Future directions: Future work will include replication of these results in future LEADS participants and investigation of rare variants in other neurodegenerative disease-related genes.

In the genetic data processing pipeline, duplicates were removed with Sentieon functions LocusCollector and Dedup. The resulting BAM files were realigned, and scores were recalibrated using Sentieon Realigner and QualCal. Picard SortSam sorted recalibrated files. Then, NM, MD, and UQ tags were calculated by Picard SetNmMdAndUqTags, and the 0x1 paired flag was removed from reads with a piped gawk/samtools command. Files were finally indexed using the Sentieon Util index.

Abbreviations: AD, Alzheimer's disease; EOAD, early-onset Alzheimer's disease; EOnonAD, early-onset non-Alzheimer's disease (cognitively impaired, amyloid negative); FTD, frontotemporal dementia.

TABLE 2 LEADS pathogenic variant carrier demographics.
Abbreviations: EOAD, early-onset Alzheimer's disease; EOnonAD, early-onset non-Alzheimer's disease (cognitively impaired, amyloid negative); MMSE, Mini-Mental State Exam; SD, standard deviation.
*Some participants missing data: for EOAD, 1 missing race, 9 missing symptom onset age, 3 missing MMSE, and 10 missing family history; for EOnonAD, 2 missing race, 4 missing symptom onset age, and 4 missing family history. Percentages for variables with missing data were calculated based on non-missing group size.
**% of participants with a first-degree relative with Alzheimer's disease.

FIGURE 1 Pie chart of EOAD and EOnonAD pathogenic variant carriers. Percentages of LEADS EOAD participants carrying a PSEN1 previously reported pathogenic variant and non-carriers (left) and LEADS EOnonAD participants carrying a GRN, C9ORF72, or MAPT previously reported pathogenic variant and non-carriers (right). Screening did not identify any APP or PSEN2 pathogenic variant carriers.

… AD, although for EOAD and EOnonAD non-carriers, <40% of each group had a first-degree relative with AD. The majority of variant carriers were male, and there was a greater percentage of APOE ε4 allele carriers among the EOAD variant carriers than among the EOnonAD carriers (Figure 2). All cases had a similar mean age at symptom onset; EOAD pathogenic variant carriers (mean age 55.67) and non-carriers (mean age 55.31) as well as EOnonAD carriers (mean age 56.20) and non-carriers (mean age 54.24) had an average onset in their mid-fifties. There were no identified CNVs overlapping genes APP, PSEN1, PSEN2, GRN, or MAPT.

TABLE 3 LEADS and PPMI participant demographics for gene burden analysis.

TABLE 4 Gene burden analysis results.

FIGURE 2 Pie chart of APOE ε4 allele carriers and non-carriers. Participant counts and percentages of LEADS EOAD and EOnonAD pathogenic variant carriers and non-carriers, as well as PPMI controls, carrying one or two APOE ε4 alleles (blue) compared to individuals with no APOE ε4 alleles (orange).

… APP, PSEN2, GRN, and MAPT, covarying for age, sex, and APOE ε4 carrier status within all cases, EOAD, or EOnonAD cases compared to controls are reported in Table 4. PSEN1 gene burden testing was not performed because there was only one variant meeting inclusion criteria, with <3 minor alleles in the data set. Rare functional variants in PSEN2 showed significant variant enrichment (p = 0.0121, Table 4).