Genetics of coronary artery disease in the post‐GWAS era

During the past decade, genome‐wide association studies (GWAS) have transformed our understanding of many heritable traits. Three recent large‐scale GWAS meta‐analyses now markedly expand knowledge of coronary artery disease (CAD) genetics by doubling the number of loci with genome‐wide significant signals. Here, we review the unprecedented discoveries of CAD GWAS on low‐frequency variants, underrepresented populations, sex differences and integrated polygenic risk. We present the milestones of CAD GWAS and post‐GWAS studies from 2007 to 2021, and the trend, year by year, towards identification of variants with smaller odds ratios as sample sizes increase. We compile the 321 CAD loci discovered thus far and classify candidate genes as well as distinct functional pathways on the road to in‐depth biological investigation and identification of novel treatment targets. We draw attention to systems genetics in integrating these loci into gene regulatory networks within and across tissues. We review the traits, biomarkers and diseases scrutinized by Mendelian randomization studies for CAD. Finally, we discuss the potential of, and concerns about, polygenic scores in predicting CAD risk in patient care as well as future directions of GWAS and post‐GWAS studies in the field of precision medicine.


Introduction
Despite improved lifestyle and the successful targeting of related risk factors, such as dyslipidemia and hypertension, coronary artery disease (CAD) remains the most prevalent cardiovascular disorder [1]. New therapeutic strategies are urgently needed to battle the disease. Since 2007, genome-wide association studies (GWAS) have uncovered significant associations between 321 chromosomal loci and CAD [2,3] (Figs 1a and 2). By scrutinizing the entire genome, CAD GWAS not only rediscovered current drug targets and genes known to affect risk of the disease, but also identified many novel genes with hitherto unrecognized importance for atherosclerosis, suggesting new disease-causing mechanisms.
In the post-GWAS era, the rich findings of GWAS are increasingly explored for translational purposes. The aims of these studies are to (a) elucidate the disease-associated mechanisms underlying CAD loci; (b) prioritize causal genes and potential novel drug targets; and (c) harness CAD genetic variations for risk stratification, disease prevention and personalized medicine [2].

Ever-refined CAD GWAS
Since 2007, GWAS have generated a wealth of knowledge on the genetics of CAD [4][5][6][7][8][9]. Improved sequencing power, larger sample sizes and more specific phenotypes continue to drive novel discoveries of CAD loci and genes (Fig. 1a) [10][11][12]. Recent endeavours of three large CAD consortia and collaborations across the globe have now pushed the number of loci displaying genome-wide significant association with CAD to 321, along with increasingly accurate prioritization of the causal genes [3,13,14] (Tcheandjieu et al., unpublished data - Research Square 2021 and Aragam et al., unpublished data - medRxiv 2021). Larger sample sizes profoundly refined the sensitivity of these analyses and led to the discovery of many new loci harbouring common risk alleles. However, it has also become clear that most of the novel risk alleles confer relatively small odds ratios (Fig. 1b). Based on these findings, numerous post-GWAS studies and applications are sprouting as CAD GWAS provide ever-refined risk estimates for common and rare variants in a sex-, age- and population-specific way.

CAD GWAS of diverse ethnic- and sex-stratified groups
To date, CAD GWAS have been conducted predominantly within cohorts of European ancestry, hampering clinical translation on a global scale. Thus, population studies with diverse ethnic groups are entering the spotlight. Koyama et al. (BioBank Japan, BBJ CAD) genotyped or imputed almost 20 million variants on population- and CAD-specific haplotypes of 25,892 cases and 142,336 controls, and discovered eight new CAD loci in Japanese subjects [3]. Three of these new loci and many rare variants were found to be specific for this ethnic group. Moreover, trans-ancestry analyses together with CARDIoGRAMplusC4D and UK Biobank (UKBB) datasets dramatically increased the number of identified loci to 175, amongst which 35 were novel.
In another study, Tcheandjieu et al. combined Whites, Blacks and Hispanics from the Million Veteran Program (MVP) with existing studies including CARDIoGRAMplusC4D, UKBB and BBJ and performed a multiethnic GWAS of CAD including a quarter of a million cases (Tcheandjieu et al., unpublished data - Research Square 2021). The authors identified 107 novel loci and the first eight CAD loci amongst Blacks and Hispanics, and observed that the two major risk haplotypes of the 9p21 locus, the strongest locus in Western Europeans [4], are completely absent in people of African origin. They also found almost equivalent heritability of CAD across ancestries. Indeed, trans-ancestry analyses show advantages in (a) uncovering new variations, and (b) testing the relevance and robustness of polygenic risk scores (PRS) [3]. Regarding the latter, it nevertheless appears that the precision of a PRS is best when the discovery of variant effect sizes and the testing of the score are carried out within the same population [15].
Gender-specific differences in the phenotypic appearance of CAD and its risk factors have been known for many decades. The extent to which these differences, and their implications for risk prediction, are driven by genetic underpinnings has remained unclear. Lately, Aragam et al. conducted sex-stratified GWAS in 16 studies comprising 22,997 female and 54,083 male CAD cases in Million Hearts GWAS (1MH) (Aragam et al., unpublished data - medRxiv 2021). They found 10 associations that reached p ≤ 5.0 × 10⁻⁸ and showed evidence for sex heterogeneity (p ≤ 0.01). Nine of the 10 had stronger effects in the male participants, which might reflect the larger sample size for males. Conversely, the MYOZ2 (rs7696877) locus had a significantly larger effect in females. It will be interesting to see GWAS with larger sample sizes, particularly of female CAD cases, for a better understanding of the genetic architecture of CAD and a more accurate prediction of outcomes based on the interaction amongst genetics, sex effects and environment. Furthermore, the authors identified 30 novel CAD loci with genome-wide significance and 656 suggestive loci at a false discovery rate (FDR) <1%, representing the largest number of novel loci discovered in an individual CAD GWAS.
To date, 321 loci have been associated with CAD in a genome-wide significant fashion (Fig. 2). The sample size and diversity of participants are still increasing in multiple biobanks and consortia, such as CARDIoGRAMplusC4D, UKBB, BBJ, 1MH, MVP and the All of Us Research Program [16]. Thus, this number is likely to increase. The majority of future loci reaching genome-wide significance will arise from current suggestive loci with a low FDR (see Section CAD loci thresholding by FDR) and will have a relatively low odds ratio (Fig. 1b).

CAD loci thresholding by FDR
Given the linkage disequilibrium (LD) pattern of the genome, Bonferroni correction-based thresholding, which defines genome-wide significance for association at p ≤ 5 × 10⁻⁸, might generate too many false negatives and thus overlook a number of loci with important biological implications. An alternative method, the FDR, was adopted for GWAS thresholding in order to derive more candidates with suggestive evidence of association. FDR controls the expected proportion of false positives amongst the rejected null hypotheses and is therefore less conservative than Bonferroni correction. Nelson et al. employed this method in 2017 and identified, in addition to the 13 genome-wide significant loci, 304 independent variants at 243 loci associated with CAD at FDR <5%, which explained 21.2% of CAD heritability [12]. In the recent 1MH, a large number of these variants became significant at the p ≤ 5 × 10⁻⁸ threshold or were found amongst the 656 loci that now have an FDR of less than 1% (Aragam et al., unpublished data - medRxiv 2021).
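The difference between the two thresholding strategies can be illustrated with a minimal Python sketch. It assumes the Benjamini-Hochberg step-up procedure as the FDR method (the text above does not state which procedure the cited studies used), and the p-values, function names and cut-offs are hypothetical toy values rather than GWAS results.

```python
def bonferroni_reject(pvalues, alpha=0.05):
    """Reject each test whose p-value is at most alpha divided by the number of tests."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]


def benjamini_hochberg_reject(pvalues, q=0.05):
    """Benjamini-Hochberg step-up procedure (assumed here as the FDR method)."""
    m = len(pvalues)
    # Indices of the tests, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k such that p_(k) <= (k / m) * q.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * q:
            max_rank = rank
    # Reject every hypothesis ranked at or below that k.
    reject = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            reject[idx] = True
    return reject


# Hypothetical toy p-values for eight variants (not real data).
toy_pvalues = [0.001, 0.008, 0.012, 0.03, 0.2, 0.5, 0.9, 0.04]

n_bonferroni = sum(bonferroni_reject(toy_pvalues))
n_fdr = sum(benjamini_hochberg_reject(toy_pvalues))
print(n_bonferroni, n_fdr)
```

On these toy values the FDR procedure retains more candidates than the Bonferroni cut-off, which mirrors why FDR thresholding yields additional suggestive loci.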

Low- versus high-frequency variants
By design, GWAS has the largest power for detecting effects of common single-nucleotide polymorphisms (SNPs). In parallel, the increasing sequencing coverage of the human genome has enabled identification of many low-frequency variants, which made it possible to study whether the signals at risk loci originate from common or rare variants. Interestingly, these analyses clearly revealed that the overall genetic susceptibility to CAD conferred by GWAS loci is largely driven by common variants with small effect sizes [10]. These common variants are mostly located in presumed regulatory regions of the genome, that is, they usually modulate gene expression and thus the complex network that connects the activity of many genes.
Other studies focused on the coding sequence of genes, using exome arrays and whole-exome sequencing. Stitziel et al. identified a few rare loss-of-function mutations in ANGPTL4, LPL and SVEP1 in association with CAD [11], offering new drug targets. Whilst ANGPTL4 and LPL are already under intense clinical evaluation for the treatment of hypertriglyceridemia [17,18], SVEP1, a poorly studied gene, was functionally validated to have an atheroprotective role in mice [19]. Whole-exome sequencing has been used to identify rare variants for CAD/myocardial infarction (MI) in a large patient cohort with 9793 participants [20]. Only rare LDLR and APOA5 alleles showed significant association with MI at exome-wide significance (p ≤ 8 × 10⁻⁷), suggesting that larger sample sizes might be required to identify more rare variants. At the end of 2020, UKBB released exome sequencing data of 200,000 individuals. Together with similar datasets from other biobanks, more novel rare variants for CAD are expected to be discovered, which will further empower therapeutic discoveries.

GWAS in a petri dish
It is important to understand which cell types mediate the effects of GWAS loci. GWAS linking genotypes to disease-relevant phenotypes in a specific cell type represents a new approach to pinpoint causal genes and elucidate the risk-related mechanisms. Aherrahrou et al. quantified 12 atherosclerosis-relevant phenotypes related to proliferation, migration and calcification in vascular smooth muscle cells (VSMCs) isolated from 151 multiethnic heart transplant donors, who were well genotyped. Approximately 6.3 million genetic variants were genotyped or imputed for GWAS [21]. Four novel loci (p ≤ 5 × 10⁻⁸) were associated with at least one of the VSMC phenotypes, and an additional 79 nominally significant loci overlapped with known CAD GWAS loci. The authors validated one of the overlapping loci, 1q41, and provided further proof of the causal role of MIA3 at the locus. Despite this success, a larger number of donors would be needed to identify more CAD loci related to VSMC functions with increased statistical power. Such studies might be performed in many other CAD-relevant cell types to facilitate post-GWAS functional validation and translational applications.

Identification of causal variants and plausible genes in post-GWAS era
Post-GWAS studies elucidating risk-related biological mechanisms come with several challenges. First, association signals at a given GWAS locus do not specify which variant(s) and gene(s) are causal. Due to co-inheritance of variants in LD blocks, the statistically most significant (lead, tag or sentinel) SNP is not necessarily the causal one. Consequently, empirical validations are required to determine functional variants and the genes they affect. The latter is likewise difficult, as over 90% of risk-associated variants are located in the noncoding regions of the genome. Thus, it becomes necessary to untangle the regulatory elements the risk alleles act on as well as their downstream effector gene(s). Often risk variants are found at predicted transcriptional regulatory regions of nearby genes (cis-regulatory elements, CREs) and many of such genes are known causal genes for CAD. This suggests that many GWAS loci contribute to the manifestation of a phenotype by altering regulation of one or more target genes in cis and less often in trans. Nevertheless, it may be difficult to pinpoint the putative CREs and their regulated genes in silico. A combination of genomic datasets and experimental validation is thus needed to facilitate identification of causal variants and genes.

Causal variants at CAD loci
Fine-mapping studies aim to identify causal variants for a locus amongst SNPs in LD. The Encyclopedia of DNA Elements (ENCODE) provides a genome-wide catalogue of regulatory regions [22], including expression quantitative trait loci (eQTL), which are enriched in regions of GWAS hits [23]. Recently, it became possible to integrate ENCODE data with GWAS associations to functionally fine-map known loci to a few likely causal SNPs per locus. The largest effort so far in this respect is the 1MH GWAS, which used chromatin state datasets from the NIH Roadmap Epigenomics Consortium. These investigators identified 1421 potentially causal variants at 115 GWAS loci, and in 14 regions just a single highly probable variant explaining the association signal, such as rs2107595 near HDAC9/TWIST1 and rs9349379 near PHACTR1/EDN1.

Causal genes at CAD loci
Locus- and similarity-based approaches have been developed to objectively and systematically prioritize genes that are responsible for the elevated risk of CAD, that is, the causal culprits. Predictive features of the candidates include (a) the gene in proximity to the sentinel variant, (b) genes containing monogenic mutations for CAD or its risk traits, (c) genes harbouring coding variants previously associated with CAD, (d) genes having protein-altering variants in LD with the sentinel, (e) genes with promoters in chromatin contacts of the CAD variants, (f) genes with eQTLs in CAD-relevant tissues [24,25], (g) genes being targets for cardiovascular drugs [26,27], (h) genes encoding proteins with causal relevance to CAD in Mendelian randomization (MR) studies, and (i) genes associated with relevant phenotypes in experimental studies.
As previous studies based prediction of the causal gene mostly on proximity to the lead SNP, it was reassuring to see that at most loci this was still the case after all other factors were considered in the prediction model. Aragam et al. explored a new similarity-based method, the polygenic priority score (PoPS), to inform gene prioritization, in which instead of a locus of interest, the full genome-wide association data were used (Aragam et al., unpublished data-medRxiv 2021). The analysis resulted in 196 prioritized genes at 196 loci. Despite exclusion of locus-specific information, PoPS still prioritized many genes in proximity to the sentinel variant. It also identified candidate causal genes related to well-established lipid genes, and many nonlipid genes affecting CAD pathogenesis by a number of diverse mechanisms [28,29] (Fig. 2). Figure 2 lists all genes, prioritized by proximity to genome-wide significant GWAS loci and PoPS, in their most relevant functional context.

Functional validation for protein-coding genes
In silico prioritization provides a starting point for validation of candidate genes. Human cell lines, mouse models and human induced pluripotent stem cells (hiPSC) are commonly used for such studies. Not surprisingly, given the many genes and tissues involved, a large spectrum of mechanisms comes into play.
For example, the CAD-related function of SORT1 was confirmed in hepatocyte cell lines, that of MIA3 and GUCY1A3 in VSMC, and that of EDN1 in human endothelial cells (ECs) [21,30,31,32,33]. Atherosclerosis mouse models (mainly Apoe−/− and Ldlr−/−) facilitated functional studies for many CAD GWAS genes. In this regard, the atherogenic role of ADAMTS7 was found in Adamts7−/−/Ldlr−/− and Adamts7−/−/Apoe−/− mice, which displayed a reduction in aortic lesion formation [34]. One of the underlying mechanisms could be that ADAMTS-7 inhibits re-endothelialization of injured arteries and promotes vascular remodelling through cleavage of thrombospondin-1 [35]. Conversely, Apoe−/−/Svep1+/− mice showed enhanced leukocyte recruitment from blood into plaque and more atherosclerosis, suggesting an atheroprotective role of SVEP1 [19]. In addition, hiPSCs make it possible to resolve subtle effects of common variants on gene expression or function. These cells can be obtained by reprogramming human somatic cells, are highly proliferative and can be differentiated into the major cell types of the human body. Taking advantage of CRISPR-based genome engineering, genetically matched hiPSCs can be readily generated via various approaches [36]. The combination of hiPSC and CRISPR technologies enables titrating the effect of the nonrisk and risk alleles of a common variant in an isogenic setting. A study that successfully applied these techniques in hiPSC-derived endothelial cells explored rs9349379, a variant at the PHACTR1 locus, and showed that in fact EDN1, a gene about 600 kb away from this SNP, was transcriptionally regulated [33].
The majority of these studies are designed for a specific locus or gene, lacking the scale to elucidate over 300 CAD risk loci. The CRISPR system has also been modified for large-scale and genome-wide functional screenings, which annotate gene functions at scale and could accelerate the field of functional genomics of CAD. For instance, CRISPR sgRNAs targeting the entire set of CAD candidate genes could be designed and produced as an arrayed panel or a pooled library, and researchers could use them to target cell types of interest and assay CAD-relevant phenotypes in batch.

Functional validation for RNA-coding genes
Approximately 2500 miRNAs and over 50,000 lncRNAs have been annotated in the human genome, practically double the number of protein-coding genes, indicating the important role of this part of the genome [37]. Many GWAS hits are mapped near or within noncoding RNA (ncRNA) genes such as miRNAs, lncRNAs and antisense RNA.
Very few ncRNAs have been studied so far. Several studies unveiled miRNA-dependent regulation of CAD risk variants [38], which often involves the effect of 3' UTR variants on the miRNA binding site. A well-investigated ncRNA resides at the 9p21 CAD locus, a 'gene desert' without protein-coding genes [39] that harbours the first discovered and so far still strongest risk allele in the human genome (Aragam et al., unpublished data - medRxiv 2021). The key region of the 9p21 locus overlaps with exons 13-19 of the lncRNA ANRIL, also known as CDKN2B-AS1, which is transcribed on the reverse strand of the INK4b-ARF-INK4a gene cluster. Accumulating evidence suggests that ANRIL is expressed in EC, VSMC, mononuclear phagocytes and atherosclerotic plaques [40,41]. ANRIL occurs as a linear or a circular transcript, and the two forms appear to contribute differentially to atherosclerosis.

GWAS candidate genes -From bench to bedside
Several candidate causal genes at CAD loci are clinical drug targets such as 3-hydroxy-3-methylglutaryl-coenzyme A reductase (HMGCR) (statins), PCSK9 (respective antibodies or inhibitors) and APOB (Mipomersen), or gene targets under preclinical evaluation (e.g., LPA, APOC3, ANGPTL4, ROCK1 and ROCK2), suggesting many CAD GWAS genes might be druggable [42,43]. In fact, associations of rare variants with CAD risk, GWAS hits and related systems genetics are increasingly used for preclinical target prioritization and drug design [20,26,44,45]. For example, loss-of-function variants in ANGPTL4 (angiopoietin-like 4), a locally released LPL inhibitor, are associated with hypolipidemia and atheroprotective effects. Many ANGPTL4 inhibitors are now in clinical trials and have shown benefits in ameliorating atherogenic dyslipidemia. In addition to genes related to lipid metabolism, efforts have also been made to explore potential targets related to GWAS hits affecting inflammation and dynamics in the artery wall [46]. Tragante et al. applied publicly available GWAS results on CAD to identify drug-gene interactions and prioritized candidates amongst existing drugs (drug repurposing) [26]. They also considered the potential side effects related to candidate drug targets and therefore included pleiotropic traits of CAD loci. A druggability score was calculated based on accessibility of CAD genes and predicted side effects. They prioritized LMOD1, HIP1 and PPP2R3A proteins as novel drug targets, and CHRNB4, ACSS2 and GUCY1A3 as the targets for repurposing drugs such as guanylyl cyclase activators [47,48].
In recent years, human genetics has increasingly been used to infer causal effects of presumed drug targets or biomarkers. MR studies use genetic variants as instruments for testing causality of presumed risk factors leading to a disease [49]. If genetic variants have no immediate effect other than modulation of a risk factor (e.g., LDL cholesterol, LDL-C) and also demonstrate association with a downstream trait (e.g., CAD), the latter is likely caused by the risk factor, indicating a causal relationship of the risk factor with CAD [50][51][52][53][54]. Thereby MR studies go beyond traditional epidemiological studies in inferring causality, as they are largely independent of confounding. MR studies take advantage of the random assortment of risk-conferring alleles from parent to offspring at conception, which implies that genotypes are not affected by disease status (ruling out reverse causality) [55].
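The logic of using variants as instruments can be made concrete with a short Python sketch. It assumes two commonly used summary-statistic estimators, the per-variant Wald ratio and the fixed-effect inverse-variance weighted (IVW) estimate; the text above does not specify which estimators the cited studies applied, and the effect sizes below are hypothetical toy numbers.

```python
import math


def wald_ratio(beta_exposure, beta_outcome):
    """Causal effect implied by one variant: outcome effect per unit of exposure effect."""
    return beta_outcome / beta_exposure


def ivw_estimate(variants):
    """Fixed-effect inverse-variance weighted MR estimate.

    `variants` is a list of (beta_exposure, beta_outcome, se_outcome) tuples,
    one per genetic instrument. Returns (estimate, standard_error).
    """
    numerator = sum(bx * by / se ** 2 for bx, by, se in variants)
    denominator = sum(bx ** 2 / se ** 2 for bx, by, se in variants)
    return numerator / denominator, math.sqrt(1.0 / denominator)


# Hypothetical instruments: effect on the risk factor (e.g., LDL-C, in SD units),
# effect on the outcome (log odds of CAD) and its standard error. Not real data.
toy_instruments = [
    (0.10, 0.050, 0.01),
    (0.20, 0.100, 0.02),
    (0.05, 0.025, 0.01),
]

per_variant = [wald_ratio(bx, by) for bx, by, _ in toy_instruments]
estimate, standard_error = ivw_estimate(toy_instruments)
print(per_variant, estimate, standard_error)
```

In this toy set every instrument implies the same ratio, so the pooled estimate equals that ratio; with real data, heterogeneity between instruments would be a warning sign of pleiotropy, that is, a violation of the "no other immediate effect" assumption described above.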
MR lends further support for many classic causal factors for CAD, such as LDL-C, triglycerides, Lp(a), blood pressure, obesity, T2D, smoking and alcohol. Indeed, several causal risk markers were confirmed by MR, including APOC3, IL-1β, IL-6, height, insulin resistance, nonfasting glucose, telomere length, Niemann-Pick C1-Like 1 (NPC1L1) and HMGCR (Fig. 3, Table S1). However, MR also challenged many presumed risk factors by showing a lack of association, such as HDL-C, C-reactive protein, uric acid, fibrinogen, phospholipase A-2, lipoprotein-associated phospholipase A-2, and folic acid (Fig. 3, Table S1). Furthermore, MR can evaluate effects of drugs in a personalized context. For instance, Ference et al. applied the principle of MR to assess the effect of LDL-C lowering on the risk of CAD mediated by SNPs in NPC1L1 (the target of ezetimibe), HMGCR (the target of statins) or both (combined therapy with ezetimibe and a statin) [56]. The authors found that NPC1L1 and HMGCR genetic variations have nearly the same effect on CAD risk. Moreover, when NPC1L1 and HMGCR risk alleles are combined, they appear to have independent and linearly additive effects on LDL-C levels and log-linearly additive effects on CAD risk. The results agreed with the reported results of the IMPROVE-IT trial, suggesting that lowering LDL-C by inhibiting NPC1L1 with ezetimibe has a similar effect on CAD risk as suppressing HMGCR with a statin, and that combined NPC1L1 and HMGCR targeting has independent and additive effects on both LDL-C level and the related risk of CAD. In addition, MR studies might also shorten the long journey to FDA approval of a drug. MR studies can determine whether drug targets are causative and whether their modulation is related to adverse effects. MR studies are inexpensive, not time consuming and relatively easy. Indeed, MR studies might be more sensitive than randomized controlled trials and are therefore recommended for drug targets before proceeding to clinical trials.

Epistasis of CAD GWAS
Evidently, a mutation can have different effects in different individuals, and this is mainly due to the genetic context dependency of a mutation [57]. This genetic dependency is known as epistasis or genetic interaction [58]. In fact, only a small fraction of CAD heritability can be explained by the common variations identified to date. Interactions between genes for cardiovascular regulation may account for part of the missing heritability [59], given that the biological mechanisms mediated by genetic effects usually involve multiple genes. Uncovering gene-gene interactions may yield novel insight into biological mechanisms underlying CAD. Sporadic gene-gene interactions amongst CAD GWAS loci have been investigated through candidate gene approaches based on known biology. Ma et al. tested pairs of GWAS SNPs filtered by biological knowledge, including protein-protein interactions (PPIs) and pathway annotations. Using published cohorts from ARIC, the Framingham Heart Study and the Multi-Ethnic Study of Atherosclerosis, they identified an interaction between HMGCR and LIPC loci affecting HDL-C levels. Using a similar approach, many reports showed gene interactions within the renin-angiotensin system (RAS) contributing to CAD risk and modifying the disease process, such as the genetic interaction between ACE and AGTR1, ACE and AGT, and ACE and AT1R [60]. However, systematic epistasis studies have not been successfully pursued. Other epistatic mechanisms apply locally, affecting gene expression by allele-specific differential regulation of gene promoters. The key challenges for epistasis investigation were (a) the lack of large sample sizes with individual-level data, and (b) the missing analytical approaches to distinguish true interactions amongst the large number of possible tests. Lately, evidence has emerged that hypothesis-free searches can identify epistasis, and pairwise searching for epistasis is rapidly becoming relatively effortless. Sophisticated computational methodologies have enabled fast, interpretable and potentially routine epistasis analysis at the level of individual GWAS. Progress in detecting epistasis of CAD is likely to continue with enlarged sample sizes, increased SNP density and rigorous reporting.
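The multiple-testing burden named as challenge (b) can be quantified with a small Python sketch. It counts the number of unordered SNP pairs and derives the corresponding Bonferroni-style threshold; testing one lead variant per locus (321 variants) and a panel of one million SNPs are illustrative assumptions, not analyses reported in the cited studies.

```python
import math


def pairwise_tests(n_variants):
    """Number of unordered variant pairs, n * (n - 1) / 2."""
    return math.comb(n_variants, 2)


def pairwise_bonferroni_threshold(n_variants, alpha=0.05):
    """Per-test p-value threshold if every pair were tested once."""
    return alpha / pairwise_tests(n_variants)


# Assumption: one lead variant for each of the 321 CAD loci.
pairs_lead_variants = pairwise_tests(321)
# Assumption: a hypothesis-free search over one million SNPs.
pairs_genome_wide = pairwise_tests(1_000_000)

print(pairs_lead_variants, pairwise_bonferroni_threshold(321))
print(pairs_genome_wide, pairwise_bonferroni_threshold(1_000_000))
```

Restricting the search to known loci keeps the number of tests in the tens of thousands, whereas an unrestricted pairwise scan runs into the hundreds of billions, which is why filtering by biological knowledge or very large samples has been needed.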

Systems genetics in networking CAD loci with high-order molecular interaction
The complex etiology of CAD involves not only genetics but also environmental and lifestyle risk factors, age-related changes (e.g., clonal haematopoiesis of indeterminate potential, CHIP), sex differences and gut-microbiome interactions, which all jointly affect the molecular landscape of vascular and metabolic tissues. The commonly used 'reductionist' approach that evaluates one element at a time is certainly not sufficient to elucidate causality. Instead, systems genetics applies population-based approaches to address this complexity. It employs high-throughput multiomics to measure many molecular differences amongst individuals in populations. These measurements can be integrated into higher order interactions across many tissues or cell types to elucidate mechanisms at different phases of complex diseases, combining the subtle effects of genetic and environmental influences in molecular networks [61][62][63][64]. These efforts will eventually facilitate the understanding of gene function in the context of pathways and networks, which helps researchers to formulate hypotheses for experimental examination [65].
In CAD research, Björkegren et al. have focused on genetics of gene expression studies across multiple tissues, the central mission of the Stockholm Atherosclerosis Gene Expression (STAGE) and Stockholm Tartu Atherosclerosis Reverse Network Engineering Task (STARNET) studies [25,66]. Parallel sampling of seven CAD-relevant tissues from each patient undergoing open-thorax surgery made these studies unprecedented resources for systems genetic studies of CAD. RNA expression datasets identified eQTLs in a tissue-specific manner and inferred multiple causal regulatory gene networks acting both within and across tissues, all affecting CAD risk [63]. Interestingly, these networks were interconnected and formed a super-network across all seven tissues, and perturbation in one of the causal regulatory gene networks influenced the function of others in the super-network.

CAD PRS versus classic clinical risk stratification
A risk-based stratification strategy is widely used to guide clinician-patient decision-making for prevention and treatment of CAD. According to this strategy, the intensity of preventive efforts is matched to the estimated risk of the individual. For instance, pooled cohort equations (PCEs), which integrate traditional risk factors to provide race- and sex-specific risk estimation of atherosclerotic cardiovascular diseases, are commonly used for initial risk assessment. However, clinical risk assessment for CAD is known to be imprecise and the search for better strategies continues to be a focus of cardiovascular research [67]. With the advent of large-scale gene sequencing and genotyping chip techniques at ever lower price and faster speed, the use of PRS to improve risk assessment of CAD seems inevitable.
Commonly tested CAD PRS aggregate an individual's genetic variations, weighted by GWAS associations, into simplified scores that can be used for risk prediction [68]. CAD PRS have evolved from the numeric sum of a handful of CAD risk alleles to a score calculated from millions of variants, weighted by their strength of CAD association [69,70]. Khera et al. developed and validated genome-wide polygenic scores by leveraging more than 6 million variants for CAD (GPS CAD) [70]. The approach identified 8.0% of the population with a more than three-fold increase of CAD risk. Thus, many more people carry profoundly elevated risk for CAD because of such polygenic exposure rather than a rare monogenic mutation [70,71]. The data also suggest that preventive interventions such as statins and lifestyle change recommendations could be focused on high-risk individuals indicated by GPS [70,72].
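The two forms of score described above, a plain count of risk alleles and a sum weighted by strength of association, can be sketched in a few lines of Python. Using the natural logarithm of the GWAS odds ratio as the weight is an assumption for illustration (genome-wide scores typically use weights further adjusted for LD), and the allele dosages and odds ratios are hypothetical toy values.

```python
import math


def unweighted_score(risk_allele_dosages):
    """Numeric sum of risk alleles carried (0, 1 or 2 per variant)."""
    return sum(risk_allele_dosages)


def weighted_prs(risk_allele_dosages, odds_ratios):
    """Sum of risk-allele dosages, each weighted by log(odds ratio).

    Assumption: the per-allele log odds ratio from GWAS is used directly
    as the weight, without LD adjustment.
    """
    if len(risk_allele_dosages) != len(odds_ratios):
        raise ValueError("one odds ratio is required per variant")
    return sum(d * math.log(o) for d, o in zip(risk_allele_dosages, odds_ratios))


# Hypothetical individual genotyped at three variants (not real data).
toy_dosages = [2, 1, 0]
toy_odds_ratios = [1.20, 1.10, 1.05]

count_score = unweighted_score(toy_dosages)
prs = weighted_prs(toy_dosages, toy_odds_ratios)
print(count_score, prs)
```

Because the weights are log odds ratios, exponentiating the weighted score gives the combined odds ratio under an assumed multiplicative model with independent variants, here 1.2 × 1.2 × 1.1.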
Two further studies likewise incorporated more than 6 million SNPs for CAD risk prediction in 7237 and 352,660 middle-aged participants without history of CVD at baseline, but found no significant improvement in discriminative accuracy, calibration or net reclassification by the GPS CAD [73,74]. Thus, despite the remarkable success of PRS in stratifying CAD risk in some parts of the population, whether it surpasses the application of traditional risk scoring methods is still an ongoing debate.

Predictive value of CAD PRS in individuals with extreme genetic risks
Whilst the overall predictive value of current PRS for large parts of the general population might be low, PRS offers a unique angle to assess future risk, and thus the potential benefits of early statin therapy, for select groups in primary prevention. Mega et al. leveraged a PRS combining 27 variants and observed that individuals with the highest PRS derived larger relative and absolute risk reduction with statin therapy [75]. Notably, an approximate three-fold decrease in CAD events was observed amongst individuals at a high genetic risk in these primary prevention statin studies.
In secondary prevention, efforts have been made to evaluate PRS as an instrument to identify CAD patients benefiting from PCSK9 antibodies. Marston [76]. Patients with neither high genetic risk nor clinical risk factors had fewer coronary and vascular events and did not benefit from evolocumab treatment for 2.3 years. Patients with low genetic risk but multiple clinical risk factors had intermediate risk and experienced some evolocumab-related risk reduction. Interestingly, regardless of clinical risk, patients with a high genetic risk exhibited a high event rate and derived a profound benefit from evolocumab. Similar findings were made with a PRS comprising over 6 million variants in 11,953 patients from the ODYSSEY OUTCOMES trial (Evaluation of Cardiovascular Outcomes After an Acute Coronary Syndrome During Treatment With Alirocumab) [77]. Indeed, both studies showed that in the placebo groups, a high CAD PRS was strongly associated with CVD events, even after adjusting for traditional risk factors. In the treatment groups, patients with high genetic risk were found to derive the greatest absolute and relative risk reduction from PCSK9 inhibitors. These studies strongly support that (a) a CAD PRS predicts recurrent events independent of traditional risk factors, (b) patients with a high CAD PRS are most likely to benefit from PCSK9 inhibition, and (c) a CAD PRS could be applied clinically in this setting to titrate the intensity of LDL-lowering therapy. Thus, prospective studies investigating the clinical utility and implementation of CAD PRS are warranted [68].

Caveats of current CAD PRS
Several outstanding questions should be answered before broad application of PRS in clinics. In particular, more information is needed on sex differences, age of first event, time to recurrent events, mixed endpoints of the disease (e.g., CAD, stroke, peripheral arterial disease) and populations outside European ancestry, to name a few. Most importantly, none of the current studies examined the value of PRS earlier in the life course. Moreover, neither the most effective form of PRS (only genome-wide significant SNPs vs. millions of variants) nor its stratification (e.g., risk groups vs. a continuous variable) is settled. Nevertheless, genome-wide SNP typing is available at low cost ($25 to $50) and may systematically provide information for predicting genetic risk of many common complex diseases in parallel, and for integrating other risk factors for clinical applications. Finally, guidelines and clinical support systems should be established to implement such genetic-risk stratification [68].

Summary and future directions
CAD GWAS and post-GWAS studies have led to substantial progress in understanding the genetic architecture of this complex disease. Beyond discovery of candidate genes, these studies informed drug development and improved stratification of disease risk as well as preventive measures. Indeed, GWAS may build a foundation for precision medicine of CAD. From our perspective, post-GWAS applications will be further empowered by integration of additional omics data, as well as individual and environmental influences, to constitute a systems view of this complex disease, which could eventually lead to unveiling of the missing heritability. It will likely require deep learning algorithms and artificial intelligence to integrate and interpret these comprehensive data.