Genetics, its role in preventing the pandemic of coronary artery disease

Abstract
Epidemiologists have claimed for decades that about 50% of the predisposition for coronary artery disease (CAD) is genetic. Advances in technology have made possible the discovery of hundreds of genetic risk variants predisposing to CAD. Multiple clinical trials have shown that cardiac events can be prevented by drugs that lower plasma low-density lipoprotein cholesterol (LDL-C). A major barrier to primary prevention is the lack of markers to identify individuals at risk before symptoms of the disease develop. Conventional risk factors are age-dependent, occurring mostly in the sixth or seventh decade, which is too late for early primary prevention. A polygenic risk score, derived from the number of genetic risk variants predisposing to CAD inherited by an individual, has been evaluated in over 1 million individuals. The risk for CAD is stratified into high, intermediate, and low. Polygenic risk scores derived from retrospective genotyping of several clinical trials evaluating the effect of statin therapy or PCSK9 inhibitors show that the genetic risk is reduced 40%–50% by decreasing plasma LDL-C. Prospective randomized placebo-controlled clinical trials document a 40%–50% reduction in cardiac events in individuals at high genetic risk associated with favorable lifestyle changes and increased physical activity. The polygenic risk score is not age-dependent and remains the same throughout life. Thus, the genetic risk score (GRS) is superior to conventional risk factors in identifying asymptomatic individuals at risk for CAD early in life for primary prevention. These results indicate that clinical adoption of the GRS in primary prevention would be a paradigm shift in the treatment of the number one killer, CAD.


INTRODUCTION
Advances in technology, starting with the sequencing of the human genome in 2001, 1 have enabled the discovery of DNA variants that predispose to many diseases. Most common diseases have a significant genetic predisposition, which could not be explored until the recent advances made possible by HapMap. 2 The availability of single nucleotide polymorphisms (SNPs) as DNA markers spanning the genome led to the widespread application of genome-wide association studies. The present review will summarize these genetic discoveries and how they can be utilized for the early prediction of disease risk, with a focus on coronary artery disease (CAD). Coronary artery disease has been known for some time to be preventable, and clinical trials assessing drugs that lower plasma low-density lipoprotein cholesterol (LDL-C) have consistently shown a significant reduction in cardiac events. Secondary prevention of CAD is very successful. Primary prevention of CAD promises to be more rewarding. A major barrier to primary prevention has been the lack of markers to identify individuals at risk before symptoms of the disease develop. The inadequacy of conventional risk factors for primary prevention of CAD will be discussed, as will the role of predicting genetic risk for CAD and how it enables primary prevention to be implemented early in life in males and females. Results of randomized, placebo-controlled clinical trials will be summarized that show genetic risk for CAD can be significantly decreased by drugs that lower plasma LDL-C and by favorable changes in lifestyle.

Heredity and heart disease
Epidemiologists have claimed for years that predisposition for CAD is about 50% due to hereditary causes, with the remainder acquired or due to unfavorable lifestyle. 3 The powerful influence of hereditary factors is displayed in the Utah study, in which 14% of the population have a family history of heart disease and, in this cohort, 72% of all premature myocardial infarctions and 48% of all coronary events occur. 4

The human genome
One common mechanism is for non-coding RNA to bind to the 3′ end of the mRNA, which determines its stability and ultimately its translation into protein. However, non-coding RNAs, based on the number of nucleotides, are divided into three categories (micro, intermediate, and long) and have many functions, as illustrated in Figure 1.
Many human DNA sequences have their origin in simple life forms that originated about 4 billion years ago. It is not surprising that as one moves up the chain into rodents and other mammals, their DNA shows considerable overlap with the human DNA sequence. In the interval of 7 million years during which humans and chimpanzees evolved along separate pathways, their DNA sequences diverged by approximately 4%. 5 The difference in genomes among Homo sapiens is less than 1%. 6 The sequences that contribute to this difference are primarily large structural variants whose function in large part remains to be determined. 2 The sequences that contribute most to the unique features of each individual human are primarily single nucleotide polymorphisms (SNPs). 16 The number of SNPs per genome is fairly constant at about 5 million. 6 Furthermore, these SNPs are fairly evenly distributed throughout the genome. Evolution and environmental adaptation are made possible by the new mutations that arise in each generation. These mutations are induced by copy errors 7 made in the replication of DNA. The DNA molecule is renewed every few days and its fidelity is maintained by complementary base pairing. This mechanism is very precise, making about one error per billion bases generated. Nevertheless, copy errors occur, and errors occurring in germline cells can be transmitted to the next generation. It is expected that most of these errors would involve single nucleotides, since DNA is synthesized one nucleotide at a time. Of the errors, 96% are single nucleotide polymorphisms, another 2% are duplets or triplets, and the remaining 2% might be several nucleotides. 7 Thus, it is not surprising that over 80% of the unique features of each individual, such as skin color and predisposition to disease, are due to these SNPs. 8

Genetic risk variants for polygenic diseases
Rapid advances have been made, starting in the 1980s, with the discovery of genes responsible for single-gene disorders, often referred to as Mendelian disorders. These disorders, such as familial cardiomyopathies, are rare, occurring in less than 1% of the population, and are highly penetrant, with a single gene predominating in the expression of the phenotype. 10 Only a few hundred DNA markers were required to genotype pedigrees of two or three generations. Utilizing genetic linkage analysis, it was possible to localize the chromosomal locus of the responsible gene and, through cloning and sequencing of the region, identify the precise gene and its mutation.
In contrast, polygenic disorders are due to multiple genes, each of which contributes only minimally to the phenotype. These disorders are very common and the phenotype is significantly influenced by environmental and lifestyle factors. It was recognized very early that genetic linkage analysis would not be the appropriate approach to pursue polygenic disorders. Several advances occurred, making it possible to pursue the genetic architecture of polygenic diseases, such as CAD.
The initial advance was the sequencing of the human genome, 11 followed by the annotation of millions of single nucleotide polymorphisms by HapMap. 2,12 These SNPs provided DNA markers that were distributed throughout the genome. It was now possible to take an unbiased approach utilizing the case-control association study. Genotyping was performed with SNPs as markers distributed throughout the genome, an approach referred to as genome-wide association studies (GWAS). 13 Rapid genotyping techniques made it possible to genotype large populations of cases and controls with millions of SNPs. The millions of markers utilized required statistical correction of the conventional p-value of 0.05. There was general agreement that a Bonferroni-corrected p-value of 10⁻⁸ would be adopted, referred to as genome-wide significance. 14 In addition, those markers reaching a p-value of 10⁻⁸ were required to be replicated in an independent population.
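The Bonferroni correction mentioned above divides the conventional significance level by the number of tests performed. The following is a minimal sketch of that arithmetic, not the procedure used by any particular study. The function names and the figure of 5 million tests are illustrative assumptions, chosen only because 0.05 divided by 5 million reproduces the 10⁻⁸ threshold quoted in the text.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Return the per-test significance threshold after Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be a positive integer")
    return alpha / n_tests


def is_genome_wide_significant(p_value: float, threshold: float = 1e-8) -> bool:
    """Check a marker's p-value against the genome-wide threshold quoted in the text."""
    return p_value < threshold


# Hypothetical number of tested markers (an assumption, not taken from the text).
ASSUMED_N_TESTS = 5_000_000

threshold = bonferroni_threshold(0.05, ASSUMED_N_TESTS)
print(f"Corrected threshold: {threshold:.1e}")
print(is_genome_wide_significant(3e-9))   # a marker that passes
print(is_genome_wide_significant(2e-6))   # a marker that fails
```

The sketch covers only the threshold. A marker that passes must still be replicated in an independent population, as the text requires.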

Discovery of genetic risk variants for CAD
The application of GWAS led to the simultaneous discovery of 9p21 as the first genetic risk variant for CAD by two independent groups. 15,16 Features of the 9p21 risk variant confirmed our initial hypothesis that there would be many risk variants contributing to CAD and that they would occur commonly in the population, each conferring only minimal risk for CAD. Both groups showed that the 9p21 risk variant was associated with only a 25% increased relative risk per copy and occurred in 75% of the world's population. Of considerable interest was the observation that the 9p21 risk variant mediated its risk for CAD independently of all known conventional risk factors. One group's discovery of the 9p21 risk variant utilized a sample size of 23 000 cases and controls, and the Icelandic group utilized a sample size of over 17 000 cases and controls. The 9p21 risk variant was subsequently confirmed by the Wellcome Trust group with a sample size of over 14 000 cases and controls. 17 While these initial studies were primarily performed in individuals of European descent, subsequent studies confirmed that the 9p21 risk variant predisposed to CAD across many other ethnic groups. 18 The features of 9p21 as a risk variant for CAD confirmed the necessity of even larger sample sizes in the pursuit of genetic variants predisposing to CAD. This led to the largest international cardiovascular collaboration, referred to as Coronary Artery Disease Genome-wide Replication And Meta-Analysis (CARDIoGRAM) 19 and subsequently as CARDIoGRAMPlusC4D, which played a major leadership role in the pursuit of DNA variants predisposing to CAD. In the decade since the discovery of the 9p21 risk variant, the architecture of genetic predisposition for CAD has been unfolding rapidly, as documented in a recent review. 18 The international effort, along with many individual groups, has discovered 173 genetic risk variants 20,21 that satisfy genome-wide significance and have been replicated in an independent population.
Genetic risk variants as targets for novel drug development

Genetic risk for CAD can be summarized in a single number
An individual's predisposition for CAD is proportional to the number of genetic risk variants that individual has inherited. The total burden of risk is reflected by the cumulative number of risk variants inherited. The number of copies of a given genetic risk variant inherited by an individual varies from zero (no copy from either parent) to one (a copy from one parent; the individual is heterozygous) to two (a copy from each parent; the individual is homozygous). The total CAD risk score is determined by summing, across all risk variants, the number of copies of each variant multiplied by its derived odds ratio.
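The summation described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the scoring method of any specific study. The variant names other than 9p21, all odds ratio values, and the function name are hypothetical. The text describes weighting each copy count by the odds ratio; many implementations instead weight by the natural logarithm of the odds ratio, so the sketch exposes that as an optional flag.

```python
import math
from typing import Mapping


def polygenic_risk_score(
    copies: Mapping[str, int],
    odds_ratios: Mapping[str, float],
    use_log_odds: bool = False,
) -> float:
    """Sum, over all variants, copies inherited (0, 1, or 2) times a per-variant weight.

    By default the weight is the odds ratio, following the wording of the text.
    With use_log_odds=True the weight is ln(odds ratio), a common alternative.
    """
    score = 0.0
    for variant, n_copies in copies.items():
        if n_copies not in (0, 1, 2):
            raise ValueError(f"{variant}: copy number must be 0, 1, or 2")
        weight = odds_ratios[variant]
        if use_log_odds:
            weight = math.log(weight)
        score += n_copies * weight
    return score


# Hypothetical genotype and illustrative odds ratios (assumptions, not study data).
genotype = {"9p21": 2, "variant_b": 1, "variant_c": 0}
illustrative_or = {"9p21": 1.25, "variant_b": 1.10, "variant_c": 1.05}

print(round(polygenic_risk_score(genotype, illustrative_or), 3))
print(round(polygenic_risk_score(genotype, illustrative_or, use_log_odds=True), 3))
```

A variant with zero copies contributes nothing to the score, so the result depends only on the risk variants the individual actually carries.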
Polygenic risk score is superior to conventional risk factors in risk stratification for primary prevention of CAD
Clinical events due to coronary artery disease have consistently been shown to be preventable. Secondary prevention, such as drugs that lower plasma cholesterol, has been consistently associated with a 30%-40% reduction in relative risk. 22 A similar approach to primary prevention has also been shown to be very effective. 23 Coronary artery disease, due to coronary atherosclerosis, develops early and progresses slowly. It usually does not reach a clinical threshold until the fifth or sixth decade, with the incidence of myocardial infarction peaking in males at age 58 and in females at age 68. A limiting factor in primary prevention is identifying those at risk for CAD before the development of symptoms.
Conventional risk factors such as hypertension or diabetes are age-dependent and usually not present until the sixth or seventh decade of life. The National Lipid Association has recently 24 summarized the potential role of coronary artery calcium scoring, which is also age-dependent. In contrast to other conventional risk factors, plasma LDL-C increases early in life and is associated with a doubling of the risk for CAD for each additional decade of exposure. 25 Ference et al. have shown that primary prevention is more effective when initiated early as opposed to later in life. 26 The genetic risk for CAD is randomly determined at conception; thus, it is not age-dependent and can be determined at any time after birth. An example would be a 40-year-old male or female with only one risk factor, such as a plasma LDL-C of 160 mg/dl, who according to current guidelines of the AHA, the ACC, and the ESC would not qualify for any form of preventive therapy. Yet such an individual would be close to the ideal candidate for primary prevention. One solution is to treat everyone with increased plasma LDL-C. However, since the average plasma LDL-C of a 40-year-old Westerner is almost twice the recommended level, this would require treating just about everyone. It is also an epidemiological observation 27 that only about 50% of these individuals, even if they live a normal lifespan, will experience a cardiac event. It is most desirable to select for treatment those among the 50% who are at risk for a coronary event. The use of the polygenic risk score as a complement to the conventional risk factors to identify those individuals at risk will be discussed.
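The statement that CAD risk doubles for each additional decade of LDL-C exposure implies a simple exponential relationship. The sketch below works through that arithmetic only. The function name and the assumption that the doubling applies smoothly to fractions of a decade are illustrative and are not claims made by the cited studies.

```python
def relative_risk_from_exposure(extra_years: float, doubling_years: float = 10.0) -> float:
    """Relative risk after extra_years of added exposure, doubling every doubling_years.

    Assumes a smooth exponential, 2 ** (extra_years / doubling_years), which is an
    illustrative simplification of the "doubles per decade" statement in the text.
    """
    if extra_years < 0:
        raise ValueError("extra_years must be non-negative")
    return 2.0 ** (extra_years / doubling_years)


# Relative risk after 0, 1, 2, and 3 additional decades of exposure.
for decades in range(4):
    print(decades * 10, relative_risk_from_exposure(decades * 10))
```

Under this simplification, two additional decades of exposure correspond to a fourfold relative risk, which is consistent with the argument that lowering LDL-C earlier is more effective than lowering it later.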
Retrospective evaluation of the polygenic risk score in clinical trials assessing the effect of statins and PCSK9 inhibitors on cardiac events
The initial evaluation of the polygenic risk score to predict those at risk of CAD was performed in 2012 utilizing only 12 genetic risk variants. 28 The results showed only a slight improvement over that of
The results of these studies are summarized in Table 1.

Limitations to the current polygenic risk score
The polygenic risk score for CAD is based on genetic risk variants discovered by GWAS performed on populations that are 77% of European descent. 38 It is reasonable to expect that different ethnic groups may have developed variants unique to these groups.

Genetic risk stratification identified the group at greatest risk, which was also the group that benefited most from statin therapy. The FOURIER and ODYSSEY clinical trials evaluated the effect of PCSK9 inhibitors vs. placebo in patients with CHD. Genetic risk stratification identified the group at greatest risk, which was also the group that benefited most from PCSK9 inhibitors. The study by Khera et al. evaluated the effect of a favorable diet versus an unfavorable diet. Genetic risk stratification identified the group at risk, which was also the group that benefited most from a favorable diet. NNT, number needed to treat; RRR, relative risk reduction; Sample size, total number of patients studied.
Each of the populations was genotyped with an array, one containing 1.7 million and the other 6.6 million genetic risk variants. The polygenic risk score was strongly associated with risk for CAD in all three cohorts. Figure 2.
The other conventional risk factors are seldom present in the third and fourth decades and cannot be used to initiate primary prevention.
Given that plasma cholesterol is elevated in most Americans, 27 if not in most Westerners, in the third and fourth decades of life, one might ask: why not treat everyone? Therapy to decrease plasma cholesterol is accessible and inexpensive, and its side effects are minimal. Even so, the cost of treating everyone would be significant with therapies such as statins, and even more so with PCSK9 inhibitors. A further consideration is that epidemiologists 27 have long recognized that only about 50% of the population will experience a cardiac event, even if they live the normal mean lifespan. In an era where medicine is supposed to be more precise and cost-effective, one would prefer to initiate primary prevention in those at risk for the disease.
FIGURE 2 Genetic versus traditional risk stratification for coronary artery disease (CAD). Early primary prevention is limited based on traditional risk factors as shown in this figure. Traditional risk factors such as age, hypertension, or diabetes are infrequent until the 50s or 60s. Cholesterol (red) is an exception, which increases early in life, and the risk for CAD doubles every 10 years. In contrast, the genetic risk score for CAD (blue) is independent of age and remains the same throughout life. The genetic risk, obtainable at any time after birth, provides a major advantage by enabling one to predict risk for CAD early in life. This could be a paradigm shift for the implementation of primary prevention.

The polygenic risk score is superior to conventional risk factors, and clinical trials have shown genetic risk to be markedly reduced by statin therapy, PCSK9 inhibitors, and lifestyle changes. The age dependency of conventional risk factors does not apply to the polygenic risk score, since one's genetic risk is determined and randomly assigned at conception and does not change in one's lifetime. Thus, the GRS can be determined early in life and serve as the basis to initiate primary prevention of CAD, as illustrated in Figure 3. While the polygenic risk score has not yet been incorporated into the cardiology clinical guidelines, the recent revision of the clinical guidelines offers some flexibility by proposing the use of enhancers to stratify for risk. Genetic risk stratification as proposed in Figure 3 could be incorporated as an enhancer, similar to the recent adoption of the calcium score. In a recent study 44

FIGURE 3 Genetic risk screening for coronary artery disease (CAD). The sample for DNA analysis can be obtained from either saliva or blood. The DNA will be genotyped for the genetic risk variants predisposing to CAD. The polygenic risk score is calculated as a single number. Based on the GRS, the patients are stratified into three separate groups: high, intermediate, and low risk.
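The three-group stratification described for Figure 3 can be sketched as a ranking of individuals by score. The text does not specify the cutoffs, so the fractions used here (lowest 20% as low risk, highest 20% as high risk) are purely hypothetical, as are the function name and the sample scores.

```python
from typing import Dict


def stratify_by_grs(
    scores: Dict[str, float],
    low_fraction: float = 0.2,   # hypothetical cutoff, not taken from the text
    high_fraction: float = 0.2,  # hypothetical cutoff, not taken from the text
) -> Dict[str, str]:
    """Assign each individual to 'low', 'intermediate', or 'high' genetic risk by rank."""
    ranked = sorted(scores, key=lambda person: scores[person])
    n = len(ranked)
    n_low = int(n * low_fraction)
    n_high = int(n * high_fraction)
    groups: Dict[str, str] = {}
    for rank, person in enumerate(ranked):
        if rank < n_low:
            groups[person] = "low"
        elif rank >= n - n_high:
            groups[person] = "high"
        else:
            groups[person] = "intermediate"
    return groups


# Ten hypothetical individuals with made-up scores 1.0 through 10.0.
example_scores = {f"p{i}": float(i) for i in range(1, 11)}
for person, group in stratify_by_grs(example_scores).items():
    print(person, group)
```

Ranking within a reference population, rather than applying a fixed score threshold, is one possible design choice. It keeps the group sizes stable when the number of variants in the score changes.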