The molecular definition of red cell antigens


  • 6B-S47-01

Geoff Daniels, Bristol Institute for Transfusion Sciences, NHSBT, North Bristol Park, Bristol BS34 7QH, UK

For half a century after their discovery, blood groups almost entirely represented inherited patterns of reactions in haemagglutination tests and there was no recognition of their molecular definition. In the 1950s, the carbohydrate structures defining the ABO antigens started to be resolved and the molecular phase of blood group science had begun. Gel electrophoresis technology in the 1970s demonstrated that the MNS antigens were defined by proteins and, in 1986, cloning of the gene for glycophorin A, which expresses the MN antigens, heralded the start of the current molecular genetic era of blood group definition. Isolation of the ABO and Rh genes soon followed, in 1990 and 1992, respectively.

The ISBT currently recognizes 308 blood group specificities, with 270 of these belonging to one of 30 blood group systems [1,2], and several more will be added at the ISBT Congress in Berlin, 2010. Each system represents either a single gene or a cluster of two or three closely linked homologous genes. The genes for all of these systems have been cloned and the molecular bases for all the polymorphisms within the 30 systems are known with the exceptions of Xg and MER2 [2,3]. Twenty-six of the systems appear to comprise a single gene, but Rh, Xg and Chido/Rodgers have two genes each and MNS has three.

The molecular definition of blood groups comprises a variety of factors involving the structure of the red cell surface antigens and the sequence of the genes encoding them, either directly or indirectly.

Red cell membrane molecules expressing blood groups

Antigens of the ABO, P, Lewis, H, I and Globoside systems are carbohydrate structures on glycoproteins and glycolipids; their corresponding genes encode glycosyltransferases, which catalyse the synthesis of oligosaccharide chains. The antigens of the remaining 24 systems are on proteins or glycoproteins and depend primarily on the amino acid sequence, though in some cases glycosylation may be required for complete expression of the antigen.

Proteins and glycoproteins that express blood groups can be classified into four types, based on their integration into the membrane. Type 1 and Type 2 pass through the membrane once. Type 1 proteins (e.g. glycophorins, Lutheran, Knops) have an external N-terminal domain and a cytoplasmic C-terminal domain, whereas in the Kell glycoprotein, the only known red cell surface Type 2 glycoprotein, the C-terminus is external and the N-terminus internal. Type 3 proteins are polytopic; that is, they cross the membrane several times. Usually, both termini are cytoplasmic (e.g. Rh, Kidd, Diego), but the Duffy glycoprotein has an odd number of membrane-spanning domains and an extracellular N-terminal domain. Type 5 proteins have no membrane-spanning domain, but are anchored to the membrane by a glycosylphosphatidylinositol (GPI) anchor, which is attached to the C-terminus of the protein through carbohydrate.
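
The classification rules above can be written down as a short decision procedure. The following Python sketch is only an illustration of those rules: the `Topology` class, its field names and the `membrane_type` function are hypothetical, and the span counts used in the examples (7 for Duffy, 12 for Rh) are assumptions chosen to satisfy "odd number of spans" and "both termini cytoplasmic", not values taken from this text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Topology:
    """Hypothetical description of how a protein sits in the membrane."""
    spans: int                  # number of membrane-spanning domains
    n_terminus_external: bool   # True if the N-terminus is extracellular
    gpi_anchored: bool = False  # True if attached through a GPI anchor


def membrane_type(t: Topology) -> str:
    """Return the membrane protein type following the rules in the text."""
    if t.gpi_anchored:
        # Type 5: no membrane-spanning domain, GPI anchor at the C-terminus.
        if t.spans != 0:
            raise ValueError("a GPI-anchored protein should have no spans")
        return "Type 5"
    if t.spans < 1:
        raise ValueError("a non-GPI membrane protein needs at least one span")
    if t.spans == 1:
        # Single pass: orientation distinguishes Type 1 from Type 2.
        return "Type 1" if t.n_terminus_external else "Type 2"
    # Polytopic: crosses the membrane several times.
    return "Type 3"


examples = {
    "glycophorin A": Topology(spans=1, n_terminus_external=True),
    "Kell": Topology(spans=1, n_terminus_external=False),
    "Rh": Topology(spans=12, n_terminus_external=False),     # span count assumed
    "Duffy": Topology(spans=7, n_terminus_external=True),    # span count assumed
    "Cromer": Topology(spans=0, n_terminus_external=True, gpi_anchored=True),
}

for name, topo in examples.items():
    print(f"{name}: {membrane_type(topo)}")
```

Note that Duffy and Rh both come out as Type 3 even though their N-termini face opposite sides of the membrane; in this scheme orientation only matters for single-pass proteins.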

Most red cell surface proteins are glycosylated, the only exceptions being the Rh and Xk proteins. This glycosylation may be (i) N-glycosylation, in which large, branched sugars are attached to asparagine residues of the amino acid backbone and often express ABO, H and I antigens, or (ii) O-glycosylation, in which smaller glycans (usually tetrasaccharides) are attached to serine or threonine residues.

Blood group genes

There are two basic types of blood group genes: (i) those encoding the amino acid sequence of the polypeptide backbone of proteins and glycoproteins; and (ii) those encoding glycosyltransferases that catalyse the biosynthesis of carbohydrate chains.

Polymorphisms and mutations within the blood group genes

Some of the variety of mechanisms that account for blood group polymorphisms and variants are listed below.

  • Single nucleotide polymorphisms (SNPs). Most blood group polymorphisms result from one or more SNPs encoding amino acid substitutions in either a glycosyltransferase or the extracellular domain of a membrane protein. In the Duffy system, a SNP in an erythroid transcription factor binding site in the promoter region of the FY gene is responsible for the erythroid Duffy-null phenotype, common in Africans. Some exonic SNPs are called ‘silent’, because they encode no amino acid change, and yet they can affect antigen expression by altering RNA splicing.
  • Gene deletion. Deletion of RHD between two regions of sequence identity (Rh boxes) is the main cause of the D-negative phenotype in the Rh system. Homozygosity for deletion of the whole coding region of GYPB accounts for the S– s– U– phenotype in people of African origin.
  • Intergenic recombination. Recombination between closely linked homologous genes, giving rise to hybrid proteins, occurs in the MNS and Rh systems and creates numerous variants, some of which are polymorphic. These include RHD–CE–Ds in the Rh system, which produces no D and is polymorphic in Africans, and the GYP(B–A–B) gene in the MNS system responsible for the GP.Mur phenotype, which is polymorphic in East Asia.
  • Inactivating mutations. Some mutations result in expression of little or no protein product. The most commonly encountered examples are: frameshift mutations, resulting from nucleotide deletions or insertions that alter the reading frame of the gene and usually introduce a premature stop codon; splice-site mutations, in which changes in the invariant nucleotides of the intronic splice sites alter RNA splicing, often with the loss of one or more exons from the mRNA; deletions of whole genes or parts of genes; and missense mutations encoding amino acid changes that are incompatible with enzyme activity or have a dramatic effect on the expression of the protein within the membrane.
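
The difference between a silent SNP, a missense SNP and a frameshift can be illustrated with a toy translation. The following Python sketch uses the standard genetic code; the 21-nucleotide `reference` sequence is invented for illustration and is not taken from any blood group gene, and the function and variable names are hypothetical. It models translation only, so it cannot show the effect of a ‘silent’ SNP on RNA splicing.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds: str) -> str:
    """Translate from the first base until a stop codon or the last full codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":  # stop codon ends translation
            break
        protein.append(aa)
    return "".join(protein)


# Invented toy coding sequence: ATG GAC TTA GCT GGA AAA TAA
reference = "ATGGACTTAGCTGGAAAATAA"

# Silent SNP: third base of codon 2 changed (GAC -> GAT), both encode Asp.
silent = reference[:5] + "T" + reference[6:]

# Missense SNP: second base of codon 2 changed (GAC -> GGC), Asp -> Gly.
missense = reference[:4] + "G" + reference[5:]

# Frameshift: one nucleotide deleted from codon 2, shifting the reading frame
# so that a stop codon (TAG) is read at codon 3.
frameshift = reference[:3] + reference[4:]

for label, seq in [("reference", reference), ("silent", silent),
                   ("missense", missense), ("frameshift", frameshift)]:
    print(f"{label:10s} {translate(seq)}")
```

In this toy case the single-nucleotide deletion truncates a six-residue peptide to two residues, which is the kind of premature termination described for frameshift mutations above.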

Transcription factors

Transcription factors are responsible for the regulation of gene expression within specific tissues. Mutations within the genes encoding the erythroid transcription factors EKLF and GATA1 are responsible for aberrant blood group expression within the Lutheran, Indian (CD44), Knops, MER2 and P1 systems [4,5].

Interaction between molecules within the red cell membrane

Blood group proteins do not exist in the red cell membrane in isolation. The Xk and Kell proteins, for example, are linked by a disulphide bond despite being genetically independent. Mutations within the XK gene have profound effects on expression of the Kell antigens. The Rh-associated glycoprotein (RhAG) is associated with the Rh protein in the membrane, and mutations in RHAG affect expression of Rh antigens and may lead to the Rhnull phenotype [6].

The membrane protein complex of Rh proteins and RhAG is part of the band 3/Rh ankyrin membrane protein macrocomplex, which also includes band 3 (Diego antigen), glycophorins A and B (MNS antigens), LW and CD47 [7]. This is attached to spectrin of the red cell cytoskeleton through ankyrin and protein 4·2. The complex may also involve interaction with CD44, the Indian antigen [8]. Evidence from studies on mice suggests that the Rh proteins are also part of another protein complex, together with band 3, glycophorin C, Duffy, Kell and Xk proteins, and linked to the actin–spectrin junction of the cytoskeleton through p55, protein 4·1R and adducin [9]. Inherited changes to any one of these proteins will often affect antigens expressed by one or more of the other proteins within the same complex.
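
The practical consequence of this arrangement, that a change in one protein may affect antigens carried on its partners, can be expressed as a simple membership lookup. The Python sketch below is a minimal illustration built only from the memberships listed above; the dictionary keys and the function name `complex_partners` are hypothetical labels, CD44 is left out because its involvement is described only as possible, and cytoskeletal linker proteins are not included.

```python
# Membrane protein complexes as described in the text. The complex names
# used as keys are labels chosen for this sketch.
COMPLEXES = {
    "ankyrin-linked macrocomplex": {
        "Rh", "RhAG", "band 3", "glycophorin A", "glycophorin B", "LW", "CD47",
    },
    "junctional complex": {
        "Rh", "band 3", "glycophorin C", "Duffy", "Kell", "Xk",
    },
}


def complex_partners(protein: str) -> set[str]:
    """Return the other proteins sharing at least one complex with `protein`."""
    partners: set[str] = set()
    for members in COMPLEXES.values():
        if protein in members:
            partners |= members
    partners.discard(protein)
    return partners


# Proteins whose antigens might be affected by an inherited change in Kell:
print(sorted(complex_partners("Kell")))

# A protein not listed in either complex has no recorded partners here:
print(sorted(complex_partners("Lutheran")))
```

The lookup only identifies candidates; whether antigen expression is actually altered depends on the particular mutation and protein.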

Competition between transferases

The definition of carbohydrate antigens is extremely complex. Carbohydrate antigens are determined not only by sequence changes within the specific genes encoding glycosyltransferases, but also by competition between these enzymes for substrates, which affects the nature and quantities of the antigenic structures expressed and so changes the blood group phenotype.

Selection of erythroid cells in acquired blood group variants

Antigens of the Cromer, Dombrock, Yt and JMH systems are on Type 5 glycoproteins, which are attached to the membrane through a GPI anchor, in which a lipid, linked to the protein through a short carbohydrate chain, is inserted into the outer leaflet of the membrane within the lipid rafts. Paroxysmal nocturnal haemoglobinuria (PNH), a disease characterized by intravascular haemolysis, venous thrombosis and blood cytopenias, is caused by somatic mutations in PIGA, an X-linked gene that encodes a subunit of an enzyme essential for the biosynthesis of the GPI anchor. Over 100 different mutations, most of which are small insertions or deletions, have been identified and occur within haemopoietic stem cells [10]. Somatic cells contain only one active X-chromosome, and consequently only one active PIGA gene. For the disease to manifest, the affected clone must expand at the expense of normal cells. It appears that normal, healthy people have low numbers of PIGA-mutated cells. In PNH, an autoimmune process develops in which NK cells of the patient target GPI-linked proteins on normal blood cells and selectively destroy the normal blood cell population, leaving the mutated clone, which lacks GPI-linked proteins, to proliferate [11,12]. The affected red cells in patients with PNH (PNHIII cells) are deficient in all GPI-linked proteins and lack all antigens of the Cromer, Dombrock, Yt and JMH systems.

Another acquired blood group phenotype is Tn polyagglutination, in which a proportion of red cells are agglutinated by most human sera. Tn results from the absence of active T-synthase, a galactosyltransferase that is involved in the synthesis of O-glycans. T-synthase catalyses the conversion of Tn (GalNAcα–Ser/Thr) to T (Galβ1,3GalNAcα–Ser/Thr). Exposure of α-linked N-acetylgalactosamine residues when Tn remains unconverted appears to be responsible for the polyagglutination. Like PNH, Tn results from clonal expansion of haemopoietic cells containing somatic inactivating mutations in an X-linked gene: C1GALT1C1 [13,14]. This gene encodes a molecular chaperone, which is essential for T-synthase activity. If a process similar to that in PNH occurs in Tn donors, then NK cells targeting the normal O-glycans of blood cells would be expected in donors having large populations of Tn-positive blood cells.




The serological definition of blood groups is vital in blood transfusion and in solid organ transplantation and has made these two practices possible. The molecular genetic definition of blood group polymorphism means that it is now possible to predict serologically defined phenotypes from DNA. This is extremely valuable when suitable red cell samples are not available, such as for fetal testing or for testing red cells of recently transfused patients or patients whose red cells are coated with immunoglobulin. In the near future, molecular genetic testing for prediction of blood group phenotypes may replace much serological testing simply because it is more easily automated and often more accurate. It will not, however, replace serological ABO typing in the foreseeable future.
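
The logic of predicting a phenotype from DNA can be sketched for the simplest case mentioned earlier, the D antigen. The following Python example is a deliberately reduced illustration, not a description of any genotyping platform: the allele labels, the `PRODUCES_D` table and the function `predict_d_phenotype` are hypothetical, and only the three situations described in the text (intact RHD, deleted RHD, and the RHD–CE–Ds hybrid that produces no D) are modelled. Real testing must account for many further variant alleles.

```python
# Whether each allele is expected to produce D antigen. Labels are chosen
# for this sketch; only alleles discussed in the text are listed.
PRODUCES_D = {
    "RHD": True,           # intact gene
    "RHD deleted": False,  # whole-gene deletion between the Rh boxes
    "RHD-CE-Ds": False,    # hybrid gene that produces no D
}


def predict_d_phenotype(allele_1: str, allele_2: str) -> str:
    """Predict the D phenotype from the two alleles a person carries."""
    if allele_1 not in PRODUCES_D or allele_2 not in PRODUCES_D:
        # Unlisted alleles are not guessed at in this sketch.
        return "indeterminate"
    if PRODUCES_D[allele_1] or PRODUCES_D[allele_2]:
        # One functional copy is sufficient for D expression.
        return "D-positive"
    return "D-negative"


print(predict_d_phenotype("RHD", "RHD deleted"))          # one intact copy
print(predict_d_phenotype("RHD deleted", "RHD deleted"))  # homozygous deletion
print(predict_d_phenotype("RHD deleted", "RHD-CE-Ds"))    # no allele produces D
print(predict_d_phenotype("RHD", "unlisted allele"))      # not in the table
```

Returning "indeterminate" for anything outside the table reflects the design choice that a prediction should only be made from alleles whose effect on the antigen is known.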

The molecular definition of blood groups at the nucleic acid, protein and carbohydrate levels has provided a vast amount of information on the red cell membrane, its interaction with the red cell cytoskeleton, and the functions of its various components.