Nomenclature for factors of the HLA system, 2010

The WHO Nomenclature Committee for Factors of the HLA System met in Melbourne, Australia, in December 2005, following the 14th International HLA and Immunogenetics Workshop, and in Buzios, Brazil, in September 2008, during the 15th International HLA and Immunogenetics Workshop. This report documents the additions and revisions to the nomenclature of HLA specificities following the principles established in previous reports (1–18).


Naming of HLA genes and alleles
A number of HLA gene fragments have been reported and named. These are HLA-T previously known as HLA-16 (19), HLA-U previously known as HLA-21 (19), HLA-V previously known as HLA-75 (19), HLA-W previously known as HLA-80 (19), HLA-P previously known as HLA-90 (19) and HLA-Y previously known as HLA-BEL/COQ/DEL (20,21). A full list of all recognised HLA genes is given in Table 1.

a. Conditions for acceptance of new allele sequences
As emphasised in previous reports, there are required conditions for acceptance of new sequences for official names.
1. Where a sequence is obtained from cDNA, or where PCR products are subcloned prior to sequencing, several clones should have been sequenced.
2. Sequencing should always be performed in both directions.
3. If direct sequencing of PCR amplified material is performed, products from at least two separate PCR reactions must have been sequenced.
4. In individuals who are heterozygous for a locus, and where one of the alleles is novel, the novel allele must be sequenced in isolation from the second allele. Thus an allele sequence that is derived using a sequence-based typing (SBT) methodology, where both alleles of a heterozygous individual are sequenced together, is insufficient evidence for assignment of an official designation.
5. Sequence derived solely from the primers used to amplify an allele must not be included in the submitted sequence.
6. Where possible, a novel sequence should be confirmed by typing of genomic DNA using a method such as PCR-SSOP or PCR-SSP. Where a new sequence contains either a novel mutation or a previously unseen combination of nucleotides (sequence motif), this must be confirmed by a DNA typing technique. This may require the use of newly designed probes or primers to cover the new mutation; these reagents should also be described.
7. An accession number in a databank should have been obtained. Sequences may be submitted to the databases online at the following addresses:
EMBL: www.ebi.ac.uk/Submissions/index.html
GenBank: www.ncbi.nlm.nih.gov/Genbank/submit.html
DDBJ: www.ddbj.nig.ac.jp/sub-e.html
8. Full-length sequences are preferable though not essential; the minimum requirements are complete exons 2 and 3 for an HLA class I sequence and complete exon 2 for an HLA class II sequence.
9. Where a novel sequence differs only within an intron or other non-coding part of the gene, a full-length sequence must be obtained, which covers all coding and non-coding regions. In the absence of a full-length genomic sequence from the most closely related allele that is identical in its exon sequence, it may be required that this also be sequenced and submitted before a name can be assigned to the novel sequence.
10. Where possible, a paper in which the new sequence is described should be submitted for publication. Copies of draft publications can be submitted to the database by email or FAX.
11. Sequences derived solely from tumour material will not be considered for nomenclature.
12. The complete HLA type for the HLA-A, -B and -DRB1 genes should be submitted for the material in which a novel allele has been defined. In addition the sample should have been characterised for the second allele at the locus of interest in a heterozygous individual.
13. DNA or other material, preferably cell lines, should, wherever possible, be made available in a publicly accessible repository or alternatively, at least in the originating laboratory. The WHO Nomenclature Committee will maintain documentation on this material.
14. Submission of a sequence to the WHO Nomenclature Committee should be performed using the online submission tool available at www.ebi.ac.uk/imgt/hla/subs/submit.html. Researchers are expected to complete a questionnaire relating to the sequence and provide a comparison of their new sequence with known related alleles. If the sequence cannot be submitted using the online web tools, researchers should contact hla@alleles.org directly for details of alternative submission methods.
Although at present it is only a recommendation that full-length sequences of the coding region of novel alleles be submitted, it was widely felt that in the future this should become a requirement for submission. Such a requirement would remove many of the currently encountered ambiguities in the assignment of names to alleles for which partial sequences have been submitted, and should not be burdensome as sequencing techniques have improved substantially since the submission conditions were first devised. In cases where novel mutations or polymorphisms are detected in non-coding regions of the gene, it will be a requirement that full-length sequences be submitted of both the novel allele and its most closely related allele. It should be noted with some caution that cells from which only partial sequences have been obtained may later be shown to have different or novel alleles when further sequencing is performed. This is of particular importance in cases where partial sequences of what appears to be the same allele have been obtained from several different cells. In such cases, all cells studied have been listed in this report.
Current practice is that official designations will be promptly assigned to newly described alleles in periods between Nomenclature Committee meetings, provided that the submitted data and its accompanying description meet the criteria outlined above. A list of the newly reported alleles is published each month in nomenclature updates in the journals Tissue Antigens, Human Immunology and the International Journal of Immunogenetics. The listing of references to new sequences does not imply priority of publication. The use of numbers or names for alleles, genes or specificities which pre-empt assignment of official designations by the Nomenclature Committee is strongly discouraged.
The list of those genes in the HLA region considered by the WHO Nomenclature Committee is given in Table 1.

b. New Allele Sequences
A total of 2558 HLA alleles have been named since the last report (18). The newly named alleles are shown in bold typeface in Tables 2 to 11. Twenty-two HLA-DPB1, one HLA-DMB and four HLA-DOA alleles were named, making a total of 1198 class II alleles with official names. Eleven MICA alleles were named, bringing their total to 68, and 12 MICB alleles, bringing their total to 30 (see Table 12). The total number of alleles at each locus assigned official names as of 31st December 2009 is given in Table 13. A full list of all allele names that have been deleted is given in Table 14.
In February 2005 the allele A*30:14L was named. The allele has a mutation in codon 164 encoding a cysteine residue contributing to a structurally critical disulphide bond in the α2 domain of the HLA molecule. Expression studies performed on cells with this allele showed its protein to have a much-reduced expression compared to normal, and the allele name was thus given the suffix 'L' to indicate this low expression. Since then several other alleles have been reported that have also lost one of the two cysteine residues (positions 101 and 164) that form the α2 domain disulphide bond. It has not, however, been possible to ascertain the expression status of these alleles, due to a lack of viable material. The Nomenclature Committee considered the naming of these alleles during the 14th HLA and Immunogenetics Workshop. As a result of these discussions, it was decided to introduce an additional suffix, Q, to indicate a 'Questionable' expression level. The first seven alleles to receive this suffix have been named and are included in this report: A*23:19Q, A*32:11Q, B*13:08Q, B*35:65Q, B*39:38Q, C*02:25Q and C*03:22Q. It is anticipated that when further examples of these alleles are described, their expression status will be determined and the suffix changed accordingly.
As the database of HLA allele sequences has expanded, it has become increasingly difficult to maintain consistent linkage between allele names assigned on the basis of nucleotide sequences and the serological profiles of the encoded proteins. These difficulties are in part technological and in part due to the inherent biological properties of the HLA system. In the first category there is the increasing emphasis on DNA technology and consequent lack of a serological description for many newly discovered HLA alleles. In the second category is the finding that a newly defined antigen does not comfortably fit within any known serological grouping. This is especially true of the HLA-DRB1*03, *11, *13, *14 and *08 family of alleles, for which the description of new alleles has revealed a continuum of allelic diversity rather than five discrete sub-families. It should be stressed that, although a goal is to indicate the serological grouping into which an allele will fall, this is not always possible. Most importantly the allele name should be seen as no more than a unique designation.

Serological specificities associated with alleles
Where this information is known, lists of the serological specificities or antigens associated with the alleles are given in Tables 2-7. In most cases these data are based on the serological typing obtained for the cells that were sequenced for the individual alleles and on information submitted to the Committee. In many cases no serological information is available and the entry in the table has been left blank. This is also true for cases when the serological pattern associated with an expressed allele does not correspond to a single defined specificity. A comprehensive dictionary of antigen and allele equivalents is published periodically by the World Marrow Donor Association (WMDA) Quality Assurance Working Group on HLA Serology to DNA Equivalents (22–26). Where additional or superior serological data are available from the dictionary, these have been included in Tables 2-7 and the source of this information indicated. A full list of all officially named serological specificities is given in Table 15. The specificity B82 was assigned following the 14th International HLA and Immunogenetics Workshop in 2005, having been clearly identified as a novel antigen in a number of UCLA cell exchanges.

Introduction of colon delimited HLA allele names
The convention of using a four-digit code to distinguish HLA alleles that differ in the proteins they encode was introduced in the 1987 Nomenclature Report (8). Since that time additional digits have been added, and currently an allele name may be composed of four, six or eight digits dependent on its sequence.
The first two digits describe the allele family, which often corresponds to the serological antigen carried by the allotype. The third and fourth digits are assigned in the order in which the sequences have been determined. Alleles whose numbers differ in the first four digits must differ by one or more nucleotide substitutions that change the amino-acid sequence of the encoded protein. Alleles that differ only by synonymous nucleotide substitutions within the coding sequence are distinguished by the use of the fifth and sixth digits. Alleles that only differ by sequence polymorphisms in introns or in the 5′ and 3′ untranslated regions that flank the exons and introns are distinguished by the use of the seventh and eighth digits.
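The digit-pair structure described above can be illustrated with a minimal Python sketch. The function and type names (`parse_old_style`, `OldStyleAllele`) are hypothetical and chosen only for illustration; the sketch assumes a name of the form locus, asterisk, four, six or eight digits, and an optional single-letter expression suffix such as 'L' or 'Q'. It only splits the name into its digit pairs and does not check that the name has been officially assigned.

```python
import re
from typing import NamedTuple, Optional


class OldStyleAllele(NamedTuple):
    """Digit pairs of a pre-2010 allele name (hypothetical helper type)."""
    locus: str
    family: str                 # digits 1-2: allele family
    protein: str                # digits 3-4: distinct protein sequence
    synonymous: Optional[str]   # digits 5-6: synonymous coding substitutions
    noncoding: Optional[str]    # digits 7-8: intron / untranslated differences
    suffix: str                 # expression suffix, empty if absent


# Locus, asterisk, 4/6/8 digits, optional one-letter suffix.
_OLD_NAME = re.compile(r"^([A-Za-z0-9]+)\*((?:\d{2}){2,4})([A-Z]?)$")


def parse_old_style(name: str) -> OldStyleAllele:
    """Split an old-style allele name into its two-digit fields."""
    match = _OLD_NAME.match(name)
    if match is None:
        raise ValueError(f"not a four-, six- or eight-digit allele name: {name!r}")
    locus, digits, suffix = match.groups()
    pairs = [digits[i:i + 2] for i in range(0, len(digits), 2)]
    pairs += [None] * (4 - len(pairs))  # absent fields become None
    return OldStyleAllele(locus, pairs[0], pairs[1], pairs[2], pairs[3], suffix)


if __name__ == "__main__":
    print(parse_old_style("Cw*07020101"))
    print(parse_old_style("A*0201"))
```

Note that this reading of the digit pairs does not hold for the DPB1 names assigned after DPB1*9901, which reuse the four-digit space as described in the following paragraph.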
In 2002 we faced the issue of the A*02 and B*15 allele families having more than 100 alleles (17). At that time the decision taken was to name further alleles in these families in the rollover allele families A*92 and B*95 respectively. For HLA-DPB1 alleles, it was decided to assign new alleles within the existing system, hence once DPB1*9901 had been assigned, the next allele would be assigned DPB1*0102, followed by DPB1*0203, DPB1*0302 etc.
When these conventions were adopted it was anticipated that the nomenclature system would accommodate all the HLA alleles likely to be sequenced. Unfortunately this is not the case, as the number of alleles for certain genes is fast approaching the maximum possible with the current naming convention.
With the ever increasing number of HLA alleles described it has been decided to introduce colons (:) into the allele names to act as delimiters of the separate fields. To facilitate the transition from the old to the new nomenclature, a single leading zero must be added to all fields containing the values 1 to 9 but beyond that no leading zeros are allowed. This will help to lessen any confusion in the conversion to the new style of nomenclature.
For allele families that have more than 100 alleles, such as the A*02 and B*15 groups, it will be possible to encode these in a single series. Thus the A*92 and B*95 alleles have been renamed into the A*02 and B*15 series. The names A*02:100 and B*15:100 will not be assigned. In cases of other allele families where the number of alleles reaches 100 these will be numbered sequentially, for example A*24:99 will be followed by A*24:100.
The DPB1 alleles that were previously assigned names within the existing system have also been renamed. The 'w' will be removed from the HLA-C allele names, but will be retained in the HLA-C antigen names, to avoid confusion with the factors of the complement system and with the epitopes on the HLA-C molecule, often termed C1 and C2, that act as ligands for the Killer-cell Immunoglobulin-like Receptors. For example:

Cw*0103 becomes C*01:03
Cw*020201 becomes C*02:02:01
Cw*07020101 becomes C*07:02:01:01 etc.

Details of the new format allele names are given in column 1 of Tables 2-12, with the previous name listed in column 2. These changes to the HLA Nomenclature will be officially introduced in April 2010. A full listing of old and new HLA allele names will be made available through the IMGT/HLA Database (www.ebi.ac.uk/imgt/hla) (27,28) and be implemented with the April 2010 release of the database.
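For the straightforward cases, the conversion from the old to the colon-delimited format can be sketched in Python as follows. This is an illustrative sketch only: the function name `to_colon_delimited` is hypothetical, and the handling of the A*92 and B*95 rollover families (A*92xx mapped to A*02:1xx, B*95xx mapped to B*15:1xx) is an assumption inferred from the statement that A*02:100 and B*15:100 will not be assigned. The renamed DPB1 alleles cannot be derived mechanically and are not handled; the official listing of old and new names in the IMGT/HLA Database remains the authoritative source.

```python
import re

# Locus, asterisk, 4/6/8 digits, optional one-letter expression suffix.
_OLD_NAME = re.compile(r"^([A-Za-z0-9]+)\*((?:\d{2}){2,4})([A-Z]?)$")

# Assumed mapping of rollover families back to their parent family.
_ROLLOVER = {("A", "92"): "02", ("B", "95"): "15"}


def to_colon_delimited(old_name: str) -> str:
    """Convert an old-style allele name to the colon-delimited format.

    Does NOT handle the DPB1 names that were reassigned within the
    four-digit space; those require a lookup table.
    """
    match = _OLD_NAME.match(old_name)
    if match is None:
        raise ValueError(f"not an old-style allele name: {old_name!r}")
    locus, digits, suffix = match.groups()

    # The 'w' is dropped from HLA-C allele names.
    if locus == "Cw":
        locus = "C"

    # Each pair of digits becomes one field; pairs already carry a leading zero.
    fields = [digits[i:i + 2] for i in range(0, len(digits), 2)]

    # Assumption: rollover alleles continue the parent series from 101.
    parent = _ROLLOVER.get((locus, fields[0]))
    if parent is not None:
        fields[0] = parent
        fields[1] = "1" + fields[1]

    return f"{locus}*{':'.join(fields)}{suffix}"


if __name__ == "__main__":
    for name in ("Cw*0103", "Cw*020201", "Cw*07020101"):
        print(name, "becomes", to_colon_delimited(name))
```

Run as a script, the sketch reproduces the three HLA-C examples given above.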

Reporting of ambiguous HLA allele typing
The level of resolution achieved by many of the HLA typing technologies employed today does not always allow for a single HLA allele to be unambiguously assigned. Often it is only possible to resolve the presence of a number of closely related alleles. This is referred to as an ambiguous 'string' of alleles. In addition, typing strategies are frequently aimed at resolving alleles that encode differences within the peptide binding domains, but fail to exclude those that differ elsewhere. For some purposes it is helpful to provide codes that aid the reporting of certain ambiguous allele 'strings'. The decision was taken to introduce codes to allow for the easy reporting of:

a. HLA alleles that encode for identical peptide binding domains
HLA alleles having nucleotide sequences that encode the same protein sequence for the peptide binding domains (exons 2 and 3 for HLA class I and exon 2 only for HLA class II alleles) will be designated by an upper case 'P' which follows the allele designation of the lowest numbered allele in the group.
For example, a string of allele names that share the same α1 and α2 domain protein sequence, encoded by exons 2 and 3, can be reduced to A*02:01P.

b. HLA alleles that share identical nucleotide sequences for the exons encoding the peptide binding domains
HLA alleles that have identical nucleotide sequences for the exons encoding the peptide binding domains (exons 2 and 3 for HLA class I and exon 2 only for HLA class II alleles) will be designated by an upper case 'G' which follows the allele designation of the lowest numbered allele in the group.
For example, a string of alleles that have identical nucleotide sequences in exons 2 and 3 can be reduced to A*02:01:01G.

These reporting codes will be made available through the IMGT/HLA Database (www.ebi.ac.uk/imgt/hla) (27,28) and will be implemented with the April 2010 release of the database.

Gene and protein nomenclature
Discussions took place on the use of nomenclature for defining HLA allele sequences at the gene and protein level. The committee recommended the use of standard genetic nomenclature, where gene symbols are in uppercase and italicised and protein symbols are the same as the gene symbols but are not italicised. Using this approach it is possible to discriminate between an allele of the HLA-A gene, for example A*03:01 (italicised), and the expressed protein product of the same gene, A*03:01 (not italicised).
Additionally it was recommended that when reporting an ambiguous string of HLA alleles, a forward slash (/) should be used as the separator to indicate 'or'. When reporting genotypes it was recommended to use a comma (,) to indicate 'and'. Hence an HLA type may be reported as:
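As an illustration of these two separators, the following minimal Python sketch joins alternative alleles with a forward slash and the two alleles of a genotype with a comma. The function names and the allele names used in the example are hypothetical inputs chosen only for illustration, and the exact spacing around the separators is an assumption.

```python
from typing import Sequence


def format_ambiguity(possible_alleles: Sequence[str]) -> str:
    """Join alleles that could not be distinguished, using '/' for 'or'."""
    if not possible_alleles:
        raise ValueError("at least one allele is required")
    return "/".join(possible_alleles)


def format_genotype(first: Sequence[str], second: Sequence[str]) -> str:
    """Join the two alleles (or ambiguous strings) at a locus, using ',' for 'and'."""
    return ",".join([format_ambiguity(first), format_ambiguity(second)])


if __name__ == "__main__":
    # Hypothetical typing result: first allele ambiguous, second resolved.
    print(format_genotype(["A*02:01", "A*02:07"], ["A*03:01"]))
```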

The IMGT/HLA Sequence Database
The IMGT/HLA Sequence Database continues to act as the official repository for HLA sequences named by the WHO Nomenclature Committee for Factors of the HLA System (27,28). The database contains sequences for all HLA alleles officially recognised by the WHO Nomenclature Committee for Factors of the HLA System and provides users with online tools and facilities for their retrieval and analysis. These include allele reports, alignment tools, and detailed descriptions of the source cells. The online IMGT/HLA submission tool allows both new and confirmatory sequences to be submitted directly to the WHO Nomenclature Committee. New releases of the database are made every three months, in January, April, July and October, with the latest version (release 2.28.0 January 2010) containing 4447 HLA alleles. The database may be accessed via the worldwide web at www.ebi.ac.uk/imgt/hla.