American College of Rheumatology Basic Research Conference: Genetics and genomics in rheumatic disease



Several major steps toward unraveling the complexities of the genetic basis of autoimmune and rheumatic diseases have been made recently. These advances were discussed at the 2000 Basic Research Conference: Genetics and Genomics in Rheumatic Disease, held as a preface to the American College of Rheumatology 2000 Annual Meeting in Philadelphia, PA, October 28th–29th, 2000. This summary highlights the major areas reviewed. Topics included the genetic basis of autoimmune disease, strategies and tools used in gene discovery, and the potential impact these discoveries will have on patient care.

Gene identification and biology in Mendelian disease

Progress with several different syndromes illustrates the clinical benefits of genomic advances. How the discovery of a single gene affects the clinical management of a disease is exemplified by the hereditary periodic fever syndromes. Familial Mediterranean fever (FMF) is the most common and best understood of the hereditary fever syndromes. Recessively inherited with incomplete penetrance, it is a disease characterized by episodic fever along with some combination of severe abdominal pain, pleurisy, arthritis, and a characteristic rash. Episodes typically last up to 3 days. Most patients are asymptomatic between attacks, and symptoms appear to be due to the accumulation of neutrophils in affected sites. FMF responds well to prophylactic colchicine.

Historically, FMF was diagnosed by clinical presentation. In 1997, the gene responsible for this disease was identified and cloned, allowing subsequent researchers to develop molecular diagnosis and better prophylaxis. This syndrome is an example of a simple genetic disorder and demonstrates how knowledge of the genetic basis of a disease can improve therapy. It fosters the hope that a similarly detailed molecular understanding of the “complex” genetic diseases (e.g., osteoporosis, rheumatoid arthritis, systemic lupus erythematosus [SLE], osteoarthritis) would provide the opportunity for important progress in conquering them.

The genetic linkage for the FMF gene was first localized to the short arm of chromosome 16 (16p). Later the responsible gene, encoding a protein called pyrin, was shown to have expression relatively restricted to granulocytes and activated monocytes, perhaps partially explaining the accumulation of neutrophils at affected sites. Current hypotheses suggest that the wild-type gene acts as an upregulator of an antiinflammatory molecule (or molecules), or as a downregulator of a proinflammatory molecule (or molecules). Many different missense mutations of this gene have been identified that are associated with variations in phenotypic expression of the syndrome among different populations, demonstrating the impact of different mutations within a gene on the resulting phenotype.

Autoimmune lymphoproliferative syndrome (ALPS) provides another example of a syndrome whose pathophysiology was elucidated following the discovery of the responsible genes and description of the function of their products. It has a clear genetic component but is not a pure, single-gene disorder. ALPS is associated with abnormal lymphocyte apoptosis, and is characterized by variable expression and greater phenotypic prevalence in males. People with ALPS usually present in childhood with unexplained lymphadenopathy, but no fever or other constitutional findings. Biopsy samples demonstrate characteristic changes, but are not collected as frequently because the elevated proportion of CD4 and CD8 double-negative αβ T cells in the peripheral blood is virtually diagnostic of ALPS.

Defects in the TNFRSF6 gene have been detected in 75% of patients with ALPS. This gene encodes Fas, which triggers apoptosis. Family studies indicate that the clinical syndrome of ALPS is inherited in an autosomal dominant manner. All family members bearing fas mutations exhibit altered lymphocyte apoptosis, although some of these individuals do not have the clinical syndrome of ALPS.

Investigators have recently determined that the location of the mutation within the gene affects whether or not clinical effects will be present. Mutations that affect the intracellular domain of the protein (also referred to as the “death domain”) have the highest clinical penetrance, whereas mutations in the extracellular domain are less likely to cause clinical features of ALPS. The result of failed apoptosis is that autoreactive lymphocytes are not properly eliminated and lead to autoimmune disease. ALPS patients without mutations in the fas gene have been found to have mutations in the Fas ligand and in genes, such as caspase-10, that encode proteins involved in the downstream cascade of Fas.

Genetic strategies also have been critical for the clinical separation and identification, as well as the elucidation, of the tumor necrosis factor receptor–associated periodic syndrome (TRAPS). A person with TRAPS presents with attacks of fever and severe localized inflammation lasting up to several weeks at a time. Fever may be accompanied by abdominal pain, pleurisy, arthritis, a migratory erythematous skin rash, myalgia, or conjunctivitis. This syndrome is caused by dominantly inherited mutations in TNFRSF1A (formerly referred to as TNFR1), the gene encoding the 55-kDa TNF receptor. All known mutations in this gene affect the first two cysteine-rich extracellular subdomains of the receptor, and several mutations are substitutions directly disrupting conserved disulfide bonds. One likely mechanism of inflammation in TRAPS involves the impaired cleavage of the TNFRSF1A ectodomain upon cellular activation, with diminished shedding of the potentially antagonistic soluble receptor. Preliminary experience with recombinant p75 TNFR:Fc fusion protein in the treatment of TRAPS has been favorable. Indeed, this is an example of a disease that has gone from identification, through genetic explanation, to an effective therapy for at least some affected patients in less than 2 years.

The molecular pathogenesis of autoimmune polyendocrinopathy–candidiasis–ectodermal dystrophy (APECED) has revealed additional clues to the molecular basis of autoimmunity. APECED is characterized by variable combinations of autoimmune endocrinopathies as well as many other nonendocrine symptoms, including candidiasis, dental enamel hypoplasia, and keratopathy. This syndrome is thought to be the result of the combination of a single gene mutation along with several modifier genes that affect its expression. The gene for APECED was isolated using positional cloning and was named “autoimmune regulator” (AIRE). The AIRE gene maps to 21q22.3 and consists of 14 exons. It is expressed in immune-related organs such as the thymus, lymph nodes, and fetal liver, implying that the AIRE gene plays a pivotal role in immune function. AIRE is predicted to encode a transcription factor, because the protein resides mainly in the nucleus and has been shown in vitro to be a powerful transactivator.

Gene identification technologies

The availability and appropriate use of various tools for genetic and genomic analysis was another major topic explored during the conference. Historically, gene sequence data were obtained from the mRNAs that encode known proteins. As we near completion of the Human Genome Project, the opportunity and challenge is that, although DNA sequences are known for segments of chromosomes, the products of the genes and their functions largely remain a mystery. Determining the functional characteristics of these genes requires many complex tools. Among the new tools are cDNA expression profiling, bioinformatics, and single-nucleotide polymorphisms (SNPs).

cDNA expression profiling is often used for determining the genotype of tumors to guide oncologic research. In such an approach, a common reference standard is used against which all tumor samples are compared. Statistical analyses are then performed to find the samples that cluster together based on their gene expression profiles. Various clusters may be used to map relationships between different tumors. This approach identifies areas of “expression space” and allows researchers to look for genes that are associated with that space. However, it is important to distinguish associations that are due to actual biologic phenomena from those that appear as a result of random variation. A weighted list of genes for distinguishing between different tumors can be developed to help identify the true positive results. Clearly, the enormous quantity of data generated from these approaches will benefit from efficient strategies to reduce them to their essential lessons for application.
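
The comparison against a common reference can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the gene names, tumor names, intensity values, and the 0.5 distance threshold are hypothetical, and the single-linkage grouping on correlation distance is one simple choice among many, not the method used by any speaker at the conference.

```python
import math

def log_ratios(sample, reference):
    """Return log2(sample / reference) per gene, in a fixed (sorted) gene order."""
    return [math.log2(sample[g] / reference[g]) for g in sorted(reference)]

def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def cluster(profiles, max_distance=0.5):
    """Single-linkage grouping: a sample joins a cluster if its correlation
    distance (1 - r) to any member is at most max_distance (hypothetical cutoff)."""
    clusters = []
    for name in sorted(profiles):
        linked = [
            c for c in clusters
            if any(1 - pearson(profiles[name], profiles[m]) <= max_distance for m in c)
        ]
        merged = {name}
        for c in linked:
            merged |= c
            clusters.remove(c)
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)

# Hypothetical intensities; every tumor is compared with the same reference.
reference = {"geneA": 100, "geneB": 100, "geneC": 100, "geneD": 100}
tumors = {
    "tumor1": {"geneA": 400, "geneB": 380, "geneC": 50, "geneD": 100},
    "tumor2": {"geneA": 350, "geneB": 420, "geneC": 60, "geneD": 110},
    "tumor3": {"geneA": 40, "geneB": 55, "geneC": 300, "geneD": 90},
}
profiles = {name: log_ratios(values, reference) for name, values in tumors.items()}
groups = cluster(profiles)
```

With these made-up values, tumor1 and tumor2 share a profile and fall into one group, whereas tumor3 stands alone. With only four genes such a grouping could easily arise by chance, which is the caution raised above.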

Bioinformatics applications involving the computerized storage, access, and analysis of data on gene structure and function are critical to present progress. Although bioinformatics cannot explain the biology of a gene, the application of these tools can provide many potentially important insights and can help determine which mutations are associated with occurrence of disease. This is a very complex task, however, as hundreds of thousands and, sometimes, millions of data points must be evaluated and analyzed. Furthermore, observed associations must be explained by biologic phenomena, and not simply by chance variation or by the result of some other artifact.

Several online tools are available to assist the association of genetic structure with function, including the databases and search engines available through the National Center for Biotechnology Information (NCBI). For example, Online Mendelian Inheritance in Man (OMIM) allows researchers to identify known genes for syndromes. LocusLink allows users to link to sequence and structure information about genes and the proteins they encode. UniGene contains the sequences of thousands of mRNAs, and Basic Local Alignment Search Tool (BLAST) allows users to search existing databases for nucleotide or amino acid sequences that match known sequences. This tool can identify similar sequences in other life forms, such as bacteria and yeast, which may facilitate the determination of gene function because such organisms are easier than mammals to manipulate in experimental systems.

SNPs are one group from among the possible different types of sequence variation. Sometimes SNPs produce functional variants, meaning that the mutation is found in an exon, which changes the amino acid incorporated into the resulting protein. SNPs can also be used as polymorphisms to help locate the origin of genetic effects within the genome. This form of analysis assumes that there was a single ancestral chromosome at some point in time, and that SNPs are the result of evolution. Thus, it is possible to estimate the genetic distance by measuring the linkage disequilibrium between SNPs or between a SNP and a candidate disease marker.
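
As a worked illustration of measuring linkage disequilibrium between 2 SNPs, the sketch below computes the standard pairwise statistics D, |D'|, and r² from haplotype counts. It assumes phased haplotypes for two biallelic SNPs (alleles A/a and B/b); the function name and the counts are hypothetical and serve only to show the arithmetic.

```python
def linkage_disequilibrium(hap_counts):
    """Compute D, |D'|, and r^2 from counts of the haplotypes AB, Ab, aB, ab."""
    total = sum(hap_counts.values())
    p_ab_hap = hap_counts["AB"] / total                   # frequency of the AB haplotype
    p_a = (hap_counts["AB"] + hap_counts["Ab"]) / total   # frequency of allele A
    p_b = (hap_counts["AB"] + hap_counts["aB"]) / total   # frequency of allele B

    # D: departure of the observed haplotype frequency from independence.
    d = p_ab_hap - p_a * p_b

    # D' rescales D by the largest value it could take given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    # r^2: squared correlation between the alleles at the two sites.
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r_squared

# Hypothetical counts: complete disequilibrium versus no disequilibrium.
complete = linkage_disequilibrium({"AB": 50, "Ab": 0, "aB": 0, "ab": 50})
none = linkage_disequilibrium({"AB": 25, "Ab": 25, "aB": 25, "ab": 25})
```

In the first example only 2 of the 4 possible haplotypes are observed, so |D'| and r² both equal 1; in the second all 4 occur equally often and D is 0.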

When investigating or using SNPs, one should consider the important possibility that the code available at NCBI contains errors. Researchers who are studying autoimmune disorders should consider the SNPs that are in T cell receptor genes and in major histocompatibility complex proteins. An unusually high number and proportion of SNPs in these regions lead to amino acid substitutions. These SNPs are well distributed throughout the T cell receptor molecule, although they are particularly associated with the variable domain of antigen recognition (i.e., the V genes).

Homology and animal models

Homology, the similarity in genetic sequences among different species that arises because of a common ancestor, provides the philosophical basis for suspecting that functional studies of various genes in model organisms will reveal human biology. The use of model organisms allows for experiments that involve knockouts or mutagenesis, and is particularly useful for studying rare Mendelian disorders. Two types of genes may be found between species and can be studied in model organisms: orthologous genes and paralogous genes. An orthologous gene has a similar sequence and the same function in different species, whereas paralogous genes are close genetic relatives with similar sequences that subserve different functions in their respective species.

Because much chromosomal organization has been conserved over mammalian evolution, many genes may have syntenic relationships (i.e., they are found in a very similar order in the genomes of different species). For example, long stretches of chromosomal segments are known to overlap between mice and humans; approximately 200 putative homology segments have been identified from more than 2,000 orthologues (see the human–mouse homology map on NCBI's web site, which provides virtual gene maps of multiple mammalian species). Once a sequence is identified, the totality of knowledge in any species concerning its function, structure, relationship to other genes, and participation in particular pathways, can be used to better understand the particular context under which it is being studied. The facile access to much of this information will greatly accelerate progress and will provide many otherwise unexpected but important insights.

Once a responsible gene has been identified in an animal model, that gene becomes a candidate for the analogous phenotype in another species. Under these circumstances, synteny can be applied to great advantage to find genetic linkage or association.

Highlights of the meeting illustrated application of this approach by the announcement of the discovery of 2 sets of candidate genes potentially responsible for the expression of the lupus phenotype in 2 related murine models.

Brian Kotzin, MD, described the work from his team, which conducted a series of experiments with NZB and NZW crossed mice. (The NZB × NZW cross generates a mouse strain that serves as a model of severe human lupus glomerulonephritis.) Congenic NZB mice were generated that produce autoantibodies and develop glomerulonephritis. These mice were found to have increased IgG anti-chromatin antibodies and an increase in IgG antihistone antibodies. An Affymetrix microarray analysis was used to analyze the DNA of these mice and to identify a gene having a 100-fold difference in expression in the mice with glomerulonephritis.

Ward Wakeland, PhD, leads a group that has been isolating the genetic effects for lupus in the NZM murine model (a hybrid strain of the NZB and NZW mouse strains studied by the Kotzin group). They found multiple genetic effects that contribute to the initiation and progression of autoimmunity in mice with systemic lupus erythematosus (SLE). Sle1 is a genetic locus that is essential for fatal SLE development; Sle2, Sle3, Sle4, yaa, or lpr in combination with Sle1 result in death from lupus. These genes are syntenic in human and mouse DNA, and are considered attractive targets for therapeutic modalities. Sle1 has been found to mediate the spontaneous loss of tolerance to chromatin antigens in T cells and B cells and to underlie antinuclear antibody production, but not necessarily pathology.

Four epistatic modifiers have been identified by linkage analysis: Sles1, Sles2, Sles3, and Sles4. Sles1, in particular, can suppress the activity of Sle1, but apparently does not suppress Sle2 or Sle3. In addition to demonstrating the complexities of the genetic interactions leading to the development of autoimmunity, such discoveries shed light on possible mechanisms that identify pathways involved in autoimmune disease.

Wakeland and colleagues have shown that one of the Sle1 genetic effects maps to a region on chromosome 1 containing a cluster of genes related to CD2. The CD2 subset of the immunoglobulin superfamily of cell surface receptors (CD2, CD48, CD58, CD84, 2B4, Ly-9, and signaling lymphocytic activation molecule [SLAM]) impacts cellular activation and inhibition. These molecules, especially CD2, 2B4, and SLAM, are expressed on various leukocyte subsets, including T and B lymphocytes, monocytes, and natural killer cells. The receptor-ligand pairing of these cell-surface molecules is complex and not well characterized; however, the genes for these proteins show increased variation in structure in mice with SLE. In addition, CD48, SLAM, and Ly-9 are expressed in B cells, and potentially function in the development of autoimmunity. Thus, mutations in one or more of these genes may lead to dysregulation of the immune system, resulting in end-organ toxicity. In addition, there are probably many unidentified modifier genes that have interactive effects.

Genome scans

In humans, genome scans are often used in an attempt to associate genetic loci and specific genes with changes in incidence of disease or presence of particular phenotypes. The results from genome scans of people with rheumatoid arthritis (RA) and SLE have been the most interesting.

RA is a polygenic disease with an unknown degree of underlying etiologic heterogeneity. Although the general population rate for RA is only 0.24–1%, the concordance rate is 2–4% in dizygotic twins and up to 15% in monozygotic twins. Many different factors affect the development of RA, including the influence of chance and time on the development of disease, and there are variable outcomes for individuals who do develop the disease.

The North American Rheumatoid Arthritis Consortium (NARAC) is in the process of collecting DNA from 1,000 families nationwide in which at least 2 siblings have RA that was first diagnosed between the ages of 18 and 60 years. As of October 2000, 875 families had been entered into the program, 515 of which were fully entered and validated, and 257 of which had completed genome-wide screens for allele sharing.

The interim analysis of NARAC has found 6 regions that appear to be associated with RA (P < 0.005), one of which is located in the HLA region (P < 0.00005). These regions of possible linkage are similar to genomic regions linked to other autoimmune diseases, including SLE, inflammatory bowel disease, and multiple sclerosis; this is consistent with the possibility that all these disorders might share common pathways. As more family data are collected, the regions that are confirmed will be analyzed for polymorphisms that are associated with the development of RA. Because such an undertaking will require a huge collaborative effort, the data generated by NARAC are being openly provided without cost to stimulate progress, and will serve as a national resource for researchers studying the genetics of RA.

Research in the genetics of SLE has advanced from the genome scan stage toward trying to identify the actual genes. In 1997, Tsao and colleagues reported the first evidence of linkage in lupus. Since then, four other groups have performed genome scans, looking at a total of more than 400 pedigrees and 2,000 individuals. The results of these genome scans have been compared with each other in an attempt to identify regions that were identified in more than one scan, and to assess the additive effects from the different studies to help eliminate false positives and false negatives. This analysis revealed 5 regions that were significant and 2 that were suggestive for linkage from at least 2 independent data sets, demonstrating substantial concordance among studies. However, it is important to note that additional relevant loci are likely to be present. Future analyses will match specific American College of Rheumatology criteria for SLE with specific regions, to help redefine the phenotype with a genetic basis.

Finding the true positive effects

Finding different regions of significant linkage in different genome scans presents a prickly problem. Which result is true? Is there a linkage in the genomic region identified or was the putative linkage a false-positive result? This is a conundrum facing virtually all genetic studies of complex disease phenotypes. No one knows how to best deal with this problem, and there is much discussion among investigators. Its importance cannot be overemphasized, mainly because the reproduction of findings is the foundation of the scientific method.

One approach to address this issue in SLE has been a cooperative effort between otherwise competing investigators. A Specialized Center of Research for SLE, sponsored by the National Institute of Arthritis and Musculoskeletal and Skin Diseases (NIAMS), has been established to perform joint analysis, to resolve multiple regions, and to provide additional resources for searching for genes associated with SLE. A cooperative joint analysis increases statistical power and allows investigators to explore differences in results much more completely than is possible in the comparison of published reports by competing and uncooperative scientific groups. The potential for reconciling differences and for explaining their origins is greatly improved. These genetic studies are, after all, very complicated undertakings. Hopefully, understanding the origins of differences will permit progress and validation of the important findings.

Among the issues confronted are the different genetic markers evaluated in the different genome scans. These are in addition to differences in collection and ascertainment methods, ethnic differences between the populations studied, and analysis options. Fortunately, newer analytic methods are providing ways to integrate results from different genetic marker sets, thereby providing the prospect that this will not be a serious impediment to progress.

When analyzing pedigrees, age of onset can be an important variable used to partition the phenotype into subsets with different genetic explanations. Also, penetrance as a function of age is important. A patient who is very young might not currently express a disease phenotype, but could develop it later in life. Thus, incorporating age of onset and age of contact into pedigrees provides a more complete picture of the phenotype. To assign a relative risk ratio to any gene, the incidence within a pedigree must be compared with the cumulative probability that family members would not develop the disease.

Novel methods are required to incorporate such variables as age into genome screening. Some strategies that have been employed include considering age as a quantitative trait, using survival analytic methods, and partitioning the variance into components. For example, when time to onset is modeled with martingale residuals in RA pedigrees, there is evidence for linkage to chromosome 17, a finding that could have been missed if age had not been considered.
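
To make the idea of a martingale residual concrete, the following sketch computes one residual per person as the observed affection status (1 or 0) minus the cumulative hazard expected by that person's age. It assumes a covariate-free baseline estimated with the Nelson-Aalen estimator, which is a simplification chosen for illustration; the function name and the 4 records are hypothetical, and the source does not specify which model was fitted in the RA pedigrees.

```python
def martingale_residuals(records):
    """records: list of (age, affected) pairs, where age is the age at onset for
    affected people and the age at last contact for unaffected people.
    Returns one residual per record: affected (1/0) minus cumulative hazard at age."""
    event_ages = sorted({age for age, affected in records if affected})

    # Nelson-Aalen estimate: at each onset age, add (number of onsets) / (number at risk).
    cumulative = {}
    running = 0.0
    for t in event_ages:
        at_risk = sum(1 for age, _ in records if age >= t)
        onsets = sum(1 for age, affected in records if affected and age == t)
        running += onsets / at_risk
        cumulative[t] = running

    def cumulative_hazard(age):
        """Step function: value at the latest onset age not exceeding `age`."""
        value = 0.0
        for t in event_ages:
            if t <= age:
                value = cumulative[t]
        return value

    return [(1 if affected else 0) - cumulative_hazard(age) for age, affected in records]

# Hypothetical pedigree members: (age, affected).
records = [(30, True), (45, True), (50, False), (25, False)]
residuals = martingale_residuals(records)
```

An older unaffected relative receives a negative residual (fewer events than expected), whereas a young unaffected relative receives a residual near 0, so the residual can serve as a quantitative trait that takes age into account.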

In humans, the prevalence of SLE is 15–50 per 100,000 individuals. The prevalence is higher in African Americans, and 95% of affected individuals are women. The heritability of SLE is approximately 60%. These statistics suggest that different genes are potentially associated with different characteristics of the disease. Such characteristics can be evaluated for their genetic contribution by an analysis of principal components.

Principal components analysis has more power to detect linkage in the presence of locus heterogeneity when the source of heterogeneity is known. The purpose of principal components analysis is to try to combine the data contributing to the phenotype in a way that is more powerful. Instead of being used as the dependent variables, the principal components are used as covariates. Such an analysis examines the data for significant signals associated with specific principal components of a disease. For example, in SLE, linkage to chromosome 7 involves variability in the presentation of malar rash. These additional phenotypic data are often extremely valuable in detecting linkage, but often are overlooked. Methods exist and continue to be developed that allow additional phenotypic data to be included in linkage analysis, a great advantage that may potentially identify various diagnostic criteria associated with different loci that act independently of each other.


Perhaps the greatest goal of genetic analysis in autoimmune disease is the identification of pathophysiologic pathways that provide targets for pharmacologic intervention, resulting in improved therapies. Although 70–90% of human genes are currently represented by sequences in human databases as a result of the Human Genome Project, the functionality of the vast majority remains undescribed. It is estimated that currently available prescription medications target a total of 600 different genes, yet the theoretical number of gene targets is estimated to be 10,000. Thus, the description of gene functionality is likely to be associated with increased numbers of available pharmaceutical agents.

The utility of pharmaceutical agents is also affected by SNPs in these gene targets and by the resulting effects of SNPs on protein function. Thus, pharmaceuticals of the future may be individually tailored depending on phenotype and genotype. For example, certain SNPs may correlate with the response to certain medications, or the likelihood of various adverse events.

Genetic screening to estimate the likelihood of response to a pharmaceutical agent is another potentially important application of genome research. Such tests would accelerate drug development because investigators could exclude patients who are unlikely to respond to the agent, even before they are enrolled in clinical trials. Genotyping of tumors and infecting organisms can help guide treatment selection.

An excellent example of the effect of SNPs on the clinical utility of various pharmacologic agents is illustrated by the CYP450 hepatic enzyme system, one of the most important mechanisms for metabolism of medications. The CYP2C9 isozyme has 2 major variants, R144C and I359L, although a total of 5 variants have been identified. (R144C, for example, means that the arginine at position 144 in the amino acid sequence has been changed to cysteine. This is an example of an SNP in the coding region.) In addition, several other polymorphisms (also SNPs) exist within the gene for the protein, but do not affect the resultant amino acid sequence. Individuals with 2C9 variants exhibit reduced metabolism and clearance of medications that are primarily metabolized by this isozyme. One day, knowledge of the CYP2C9 genotype should guide selection of pharmacologic agents for individual patients.
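
The substitution notation explained above is regular enough to read mechanically. The sketch below is a minimal parser for labels of the form used in the text (one-letter reference residue, position, one-letter replacement residue); the function name is hypothetical, and the parser handles only simple single-residue substitutions.

```python
import re

# Standard one-letter codes for the 20 common amino acids.
AMINO_ACIDS = {
    "A": "alanine", "R": "arginine", "N": "asparagine", "D": "aspartic acid",
    "C": "cysteine", "Q": "glutamine", "E": "glutamic acid", "G": "glycine",
    "H": "histidine", "I": "isoleucine", "L": "leucine", "K": "lysine",
    "M": "methionine", "F": "phenylalanine", "P": "proline", "S": "serine",
    "T": "threonine", "W": "tryptophan", "Y": "tyrosine", "V": "valine",
}

def parse_substitution(label):
    """Split a label such as 'R144C' into (original residue, position, new residue)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a single-residue substitution: {label!r}")
    original, position, replacement = match.groups()
    if original not in AMINO_ACIDS or replacement not in AMINO_ACIDS:
        raise ValueError(f"unknown amino acid code in {label!r}")
    return AMINO_ACIDS[original], int(position), AMINO_ACIDS[replacement]

# The 2 major CYP2C9 variants named in the text.
variants = {label: parse_substitution(label) for label in ("R144C", "I359L")}
```

Here `variants["R144C"]` is `("arginine", 144, "cysteine")`, matching the reading given in the text, and `variants["I359L"]` is `("isoleucine", 359, "leucine")`.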


The data presented during this conference demonstrated that genomic approaches may reveal unanticipated relationships between genotype and phenotype. This increased understanding of the molecular basis of disease is likely to result in the development of antiinflammatory therapies with improved efficacy and reduced toxicity for the treatment of RA, SLE, and other autoimmune diseases.


The support of the American College of Rheumatology is appreciated. This article was prepared with the assistance of Connie Herndon and a written summary was prepared by Judith Crespi-Lofton.


Speakers. The list of speakers at The American College of Rheumatology 2000 Basic Research Conference: Genetics and Genomics in Rheumatic Disease, held in Philadelphia, PA, October 28–29.

Conference Cochairs

John B. Harley, MD, PhD (University of Oklahoma and Oklahoma Medical Research Foundation, Oklahoma City, OK)

Daniel L. Kastner, MD, PhD (Arthritis & Rheumatism Branch, NIAMS, National Institutes of Health, Bethesda, MD)

Jeffrey Trent, PhD (National Institutes of Health, National Human Genome Research Institute, Bethesda, MD)

Invited Speakers

Chris Amos, PhD (Department of Endocrinology, M.D. Anderson Cancer Center HMB, University of Texas, Houston, TX)

Christopher Austin, MD (Merck Laboratories, West Point, PA)

Grant Gallagher, MD (Department of Surgery, Glasgow Royal Infirmary, Glasgow, Scotland)

Peter K. Gregersen, MD (Department of Biology and Human Genetics, North Shore University Hospital, Manhasset, NY)

Marie M. Griffiths, PhD (University of Utah School of Medicine, Salt Lake City, UT)

Bina Joe, PhD (National Institutes of Health, Bethesda, MD)

Kimberly D. Klonowski, MD (Temple University School of Medicine, Philadelphia, PA)

Brian L. Kotzin, MD (Division of Clinical Immunology, University of Colorado Health Sciences Center, Denver, CO)

David Landsman, PhD (National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD)

Ali Manir, MD (Molecular Medicine Unit and Research Rheumatology Unit, University of Leeds, Leeds, UK)

Kathy Moser, PhD (Department of Epidemiology and Biostatistics, Case Western Reserve University, Cleveland, OH)

Ulf Mueller-Ladner, MD (Department of Internal Medicine, University of Regensburg, Regensburg, Germany)

Deborah A. Nickerson, PhD (Department of Molecular Biotechnology, University of Washington, Seattle, WA)

Yoshinori Nonomura, MD (Department of Bioregulatory Medicine and Rheumatology, Tokyo Medical and Dental University, Tokyo, Japan)

Jane Olson, PhD (Department of Epidemiology and Biostatistics, Case Western Reserve University, Cleveland, OH)

Leena Peltonen, MD, PhD (Department of Human Genetics, University of California Los Angeles, Los Angeles, CA)

Jennifer M. Puck, MD (Genetics and Molecular Biology Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD)

Allan Rettie, PhD (Department of Medicinal Chemistry, University of Washington, Seattle, WA)

Stephen Rich, PhD (Section on Epidemiology, Wake Forest University School of Medicine, Winston-Salem, NC)

Jerrold Schwaber, PhD (Thomas Jefferson University, Philadelphia, PA)

Michael F. Seldin, MD, PhD (Rowe Program in Genetics, University of California School of Medicine, Davis, CA)

Nan Shen, MD (Department of Rheumatology, Ren Ji Hospital, Shanghai, China)

Roger Sturrock, MD (Centre for Rheumatic Diseases, University Department of Medicine, Glasgow Royal Infirmary, Glasgow, Scotland)

Betty Tsao, MD (University of California Los Angeles, Rehabilitation Center, Los Angeles, CA)

Ward Wakeland, PhD (Department of Immunology, University of Texas Southwestern Medical School, Dallas, TX)

William C. Whitworth, MPH (Centers for Disease Control and Prevention, Atlanta, GA)

Xiaodong Zhou, MD (Department of Rheumatology, University of Texas Houston Medical School, Houston, TX)


Internet resources. Information pertinent to this article can be found at the Web sites of the following: American College of Rheumatology; lupus genetic studies at Oklahoma Medical Research Foundation, University of California at Los Angeles, and University of Minnesota; National Center for Biotechnology Information (NCBI) and its affiliated pages BLAST, LocusLink, Online Mendelian Inheritance in Man (OMIM), and UniGene; National Institute of Allergy and Infectious Diseases (NIAID); National Institute of Arthritis and Musculoskeletal and Skin Diseases (NIAMS); and North American Rheumatoid Arthritis Consortium.