
Tools for analysing ambiguous HLA data


  • José M. Nunes

    Corresponding author
    1. LGB, Laboratoire de Génétique et Biométrie, Université de Genève, Geneva, Switzerland
    2. ICBAS, Instituto de Ciências Biomédicas Abel Salazar, Universidade do Porto, Porto, Portugal

José Manuel Nunes
Laboratory of Genetics and Biometry
University of Geneva
12, rue Gustave-Revilliod
CH-1227 Carouge
Tel: 4122 3797745
Fax: 4122 3793194


Analyses of large projects involving human leukocyte antigen (HLA) data often have to combine data sets gathered using distinct techniques and resolution levels. Furthermore, missing and ambiguous data frequently arise at one or several loci. This article describes a set of computer programs that can be used to work efficiently with these kinds of data. The tasks covered include format conversion, data recoding and replacement, and Expectation–Maximization (gene-counting) frequency estimation for data sets with ambiguous cases, either under the assumption of Hardy–Weinberg equilibrium or when some deviation from it exists (measured by a one-degree-of-freedom inbreeding coefficient). This set of utilities is built on top of a formally defined data format. The formal definition makes it possible to express all kinds of observable ambiguities, to define a simpler form for writing and manipulating data set files, and to substantially modify the actual symbols chosen to describe the data.
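
To make the gene-counting idea concrete, the following is a minimal single-locus sketch of Expectation–Maximization allele frequency estimation under Hardy–Weinberg equilibrium. It is not the implementation of the programs described here: the function name `em_allele_frequencies`, the input representation (each individual given as a list of the unordered genotypes compatible with the typing result) and the allele labels are assumptions made for illustration only, and the inbreeding-coefficient extension is not covered.

```python
from collections import defaultdict


def em_allele_frequencies(samples, max_iter=500, tol=1e-12):
    """Estimate allele frequencies by gene counting (EM) under HWE.

    samples: one entry per individual; each entry is a list of unordered
    genotypes (allele, allele) compatible with that individual's typing.
    An unambiguous individual has exactly one genotype in its list.
    This input layout is an illustrative assumption, not the paper's format.
    """
    alleles = sorted({a for cands in samples for g in cands for a in g})
    # Start from uniform frequencies so that every genotype is possible.
    freq = {a: 1.0 / len(alleles) for a in alleles}

    for _ in range(max_iter):
        counts = defaultdict(float)
        for cands in samples:
            # E-step: HWE probability of each compatible genotype.
            weights = [freq[a] * freq[b] * (1.0 if a == b else 2.0)
                       for a, b in cands]
            total = sum(weights)
            if total == 0.0:
                continue  # no compatible genotype has positive probability
            for (a, b), w in zip(cands, weights):
                counts[a] += w / total
                counts[b] += w / total
        # M-step: expected allele counts divided by the number of gene copies.
        n_copies = sum(counts.values())
        new_freq = {a: counts[a] / n_copies for a in alleles}
        change = max(abs(new_freq[a] - freq[a]) for a in alleles)
        freq = new_freq
        if change < tol:
            break
    return freq


# Hypothetical data: the third individual's typing cannot distinguish
# between genotypes A*01/A*02 and A*01/A*03.
example = [
    [("A*01", "A*01")],
    [("A*01", "A*02")],
    [("A*01", "A*02"), ("A*01", "A*03")],
]
estimates = em_allele_frequencies(example)
```

In this sketch an ambiguous individual contributes fractional allele counts in proportion to the current Hardy–Weinberg probability of each compatible genotype, which is what lets ambiguous typings be used in estimation rather than discarded.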