Rapid, highly accurate and cost‐effective open‐source simultaneous complete HLA typing and phasing of class I and II alleles using nanopore sequencing

Accurate rapid genotyping of the genes within the HLA region presents many difficulties because of the complexity of this region. Here we present the results of our proof of concept nanopore‐based long read polymerase chain reaction (PCR) solution for HLA genotyping. For 15 HLA anthropology‐based samples and 13 NHS Blood and Transplant derived samples 40 ng of genomic DNA underwent long‐range PCR for class I and II HLA alleles. Pooled PCR products were sequenced on the Oxford Nanopore MinION R9.4.1 flow cell. Sequenced reads had HLA genotype assigned with HLA‐LA. Called genotypes were compared with reference derived from a combination of short‐read next‐generation sequencing, Sanger sequencing and/or single‐site polymorphism (SSP) typing. For concordance, accuracy was 100%, 98.4%, 97.5% and 95.1% for the first, second, third and fourth fields, respectively, to four field accuracy where it was available, otherwise three field in 28 samples for class I calls and 17 samples for class II calls. Phasing of maternal and paternal alleles, as well as phasing based identification of runs of homozygosity, was shown successfully. Time for assay run was 8 hours and the reconstruction of HLA typing data was 15 minutes. Assay cost was £55 ($80USD)/sample. We have developed a rapid and cost‐effective long‐range PCR and nanopore sequencing‐based assay that can genotype the genes within the HLA region to up to four field accuracy, identify runs of homozygosity in HLA, reconstruct maternal and paternal haplotypes and can be scaled from multi‐sample runs to a single sample.

immunosuppressive agents 2 and the identification of the HLA/major histocompatibility complex 3 as the determinant of recognition of a transplanted organ as 'foreign'.
During the process of organ transplantation, HLA matching is performed in order to determine suitability for transplant. The HLA genetics system uses an international classification standard based on observed allelic variation and a common system of representation of the genes that make up the HLA region contiguously within chromosome 6 (HLA-A, -B, -C, -DQA1, -DPB1, -DRB1/3/4/5 and so on). The HLA region is highly polymorphic, 4-6 with considerable variation observed across the entire region, as well as difficulties in resolving complex changes because of the structure of this region, including multiple paralogs of the gene (ie, HLA-DRB1/3/4/5). Primer design around these regions using short-read technology is challenging as variation makes it difficult to design primers that span anything other than very short regions, targeting specific alleles, whereas long reads have the advantage of spanning the entire gene of interest.
The nomenclature of the HLA region is necessarily complex, in order to allow a standardised reporting system between laboratories. 7,8 This nomenclature is known as the WHO Nomenclature for factors of the HLA system, which starts with the name of the locus (ie, HLA-A) followed by up to four fields indicating different levels of variation in the DNA sequence and the resulting protein. The first field defines a group of alleles that corresponds to the serologically defined specificity of HLA. The second field equates to nonsynonymous base pair changes that lead to a change in the protein sequence and the third field shows synonymous base pair changes that do not cause protein changes. The fourth field represents changes in the noncoding (ie, intronic) regions. The sequence is sometimes appended with a letter code (C, N, L, Q or S) which represents an expression variant (N = null). All currently known HLA alleles are recorded in the IPD-IMGT/HLA database. 9 Kidney, pancreas, heart and liver transplantation rely on at least a one field match, 10 whereas the ideal with allogeneic stem cell transplantation would be a two field match. 11 Currently the predominant techniques used for this are Sanger sequencing, which provides second field resolution, 12 and sequence-specific polymerase chain reaction (SS-PCR) 13 for first field resolution, which uses groups of primers to span specific loci in the HLA regions. Although relatively quick (2 hours), this technique is limited by poor resolution to the first or second field only and requires the use of a dedicated real time PCR instrument. Recent advances in short-read next-generation sequencing (NGS) have allowed resolution of the region in greater detail (four field resolution) by enrichment chemistry and Illumina sequencing. 14 Short reads preclude effective analysis of the haplotype and phasing of the HLA region, causing problems with accurate classification of parts of the HLA region, including regions with runs of homozygosity. 15 Use of this technology remains expensive, with a large capital outlay required for the sequencing instrument as well as the use of proprietary software. Short-read technology is also slow compared with SS-PCR, as the library preparation and NGS steps take more than 24 hours, meaning that accurate four field deceased donor typing is an impossibility.
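The field structure of the WHO nomenclature described above can be made concrete with a short parser. This is a minimal sketch, not part of the published pipeline: the names `HlaAllele`, `parse_allele` and `at_resolution` are hypothetical, and the regular expression assumes the simplified form "locus*field:field:field:field" with an optional expression-variant letter.

```python
import re
from typing import NamedTuple, Optional, Tuple


class HlaAllele(NamedTuple):
    locus: str                # e.g. "C" or "DRB1"
    fields: Tuple[str, ...]   # one to four colon-separated fields
    suffix: Optional[str]     # expression-variant letter (C, N, L, Q or S), if present


# Assumed simplified grammar: optional "HLA-" prefix, locus, "*", 1-4 numeric fields, optional suffix.
_ALLELE_RE = re.compile(
    r"^(?:HLA-)?(?P<locus>[A-Z0-9]+)\*"
    r"(?P<fields>\d+(?::\d+){0,3})"
    r"(?P<suffix>[CNLQS])?$"
)


def parse_allele(name: str) -> HlaAllele:
    """Split an allele name into locus, fields and optional suffix."""
    match = _ALLELE_RE.match(name.strip())
    if match is None:
        raise ValueError(f"not a recognised HLA allele name: {name!r}")
    return HlaAllele(
        match.group("locus"),
        tuple(match.group("fields").split(":")),
        match.group("suffix"),
    )


def at_resolution(allele: HlaAllele, n_fields: int) -> str:
    """Render the allele truncated to the first n_fields fields (suffix dropped)."""
    return f"{allele.locus}*{':'.join(allele.fields[:n_fields])}"


allele = parse_allele("HLA-C*07:02:01:03")
print(allele.fields)              # ('07', '02', '01', '03')
print(at_resolution(allele, 2))   # C*07:02
```

Truncating to two fields gives the protein-level resolution discussed above for stem cell transplantation, while the full tuple carries the synonymous and noncoding differences.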
The development of long-read technology has allowed a solution to these problems. Zero-mode waveguide 16 sequencing (ie, Pacific BioSciences, PacBio) allows accurate, mid-length (typically 10 kb) reads but at high cost and with no runtime advantage over conventional approaches, mainly because of the need to batch samples to make sequencing cost-effective. Nanopore sequencing 17 (ie, Oxford Nanopore, ONT) systems allow very long read lengths (>1 megabase) with rapid runtimes but at the cost of accuracy compared with short-read sequencing, although nanopore technology is continuously improving with increases in accuracy and the current ONT technology has higher raw read accuracy than the competitor PacBio technology. Long-read sequencing of the HLA region has considerable advantages as although it is more error-prone than short read, the haploblock structure is maintained as with other genomic regions allowing accurate resolution of HLA alleles using haplotype inference 18 and techniques such as population reference graphing. 19 Development of an assay that provides 'whole gene' sequencing of the HLA region, along with high-resolution reconstruction of the alleles (known and novel) within it, phasing into maternal and paternal haplotypes and identification of regions of homozygosity, all within a cost-effective, rapid and portable test has the potential to change the field of HLA diagnostics making this type of testing available to all.
A unique technical feature of nanopore sequencing is its scalability: from rapid, one sample, single-gene sequencing through a single flow cell to high volume, whole-genome sequencing. The method is remarkably cost-effective even for a single sample, which means not having to resort to sequencing in large batches. Thus, for full gene HLA sequencing this could mean a fast turnaround for individual patients or recipient/donor pairs, including in a near-patient setting, to multiplex testing of large cohorts, and anything in between. The single-molecule sequencing reads full-length genes in real time so it includes any DNA variations (in-phase) that, for instance, correspond to expression level or other phenotypes. 20 Although nanopore sequencing is not constrained by the technically demanding nature of other NGS systems this in no way bypasses the requirement for HLA expertise in the authorisation and clinical and scientific use of the information generated by these sequencers. The extreme complexity of HLA requires a corresponding level of understanding to verify, apply and exploit this information. But the agility of our method also means that, where necessary, this can be achieved by the movement of data rather than the movement of DNA or blood samples. This could occur by devolving HLA typing equipment and consumables to the local laboratory level (rather than a central reference) and data uploaded to central reference laboratories for computational analysis and specialist scientific input for resolving allelic ambiguities.
Here we present proof of concept of an assay using long-range PCR to generate full gene length amplicons of the three class I and five class II HLA genes followed by sequencing using a nanopore system and graph-based reconstruction in order to meet these aims and resolve the HLA region to four field accuracy.

| Patient samples
Anonymised patient samples from organ donors were received from NHS Blood and Transplant (NHSBT) under ethical approval (05/Q1605/66). Samples consisted of whole blood taken for routine HLA typing.
A further set of samples (the Frederick Hutchinson HLA anthropology panel) was also chosen, comprising 15 samples from different regions of the world, allowing us to understand the applicability of the assay to non-CEPH samples and resolve unusual alleles. Unfortunately, for some samples from the Fred Hutchinson database only two or three field reference data were available, so the data from Creary et al 21

| DNA extraction
DNA extraction was performed using the Qiagen DNeasy kit using the standard manufacturer's protocol. DNA was quantified on the Qubit broad range v3 DNA assay (for quantity) and Agilent Tapestation and Nanodrop (for DNA quality). DNA from the Frederick Hutchinson Centre was supplied pre-extracted but was quantified prior to use using the same methodology.

| HLA Reference typing
Donor DNA was typed initially by PCR-single-site polymorphism (SSP) (LinkSēq, supplied by One Lambda) and/or sequence specific oligonucleotide (Lifecodes, supplied by Immucor) as part of standard patient care. For Illumina-based NGS typing, pre-amplification fluorometric DNA quantitation was performed using the Qubit Broad Range kits (Thermo Fisher, UK). Prior to amplification genomic DNA was diluted to a concentration of 25 ng/μL. HLA loci were amplified using the AllType (One Lambda, California) 11 locus kit, amplifying HLA-A, -B, -C, -DRB1, -DRB345, -DQA1, -DQB1, -DPA1 and -DPB1 in a multiplex PCR. Post-amplification, products were purified using AMPure XP (Agencourt) beads and fluorometric quantitation was repeated using the Qubit (Invitrogen) High Sensitivity kit (dsDNA HS assay).
Amplicons were normalised, then enzymatically fragmented. Barcode ligation was followed by size selection (AMPure XP beads), resulting in products of optimal size (300 bp-1000 bp). A secondary amplification was performed prior to subsequent purification (AMPure XP beads), quantification (Qubit dsDNA HS assay) and final equimolar pooling. The pooled library was denatured with NaOH (20%) and loaded onto an Illumina micro flow cell on the MiSeq platform (Illumina, California). HLA types were analysed using the Type Stream Visual version 1.2 (One Lambda, California) software.

| HLA class I
Primer sequences are shown in Tables 1 and 2. Amplicons for class I HLA targets (whole gene including exons, introns and untranslated regions (UTRs) of HLA-A, -B, -C, -E, -F and -G) were generated in a multiplex reaction using the following conditions: 25 μL PCR reactions were performed using 60 ng DNA, 100 μM primer mix, 1× GoTaq Long (Promega, UK). HLA-E, -F and -G were not used in downstream analysis as no reference sample data existed for these genes. The cycling conditions were as follows: 95°C for 2 minutes followed by 30 cycles of 94°C for 30 seconds and 65°C for 4 minutes, with a final extension of 10 minutes at 72°C.
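As a worked check on the class I cycling conditions, the programmed thermocycler time can be summed directly. This is a rough sketch under the assumption of instantaneous temperature ramps (ramp times are not stated in the text); the function name `cycling_minutes` is hypothetical.

```python
def cycling_minutes(initial_denature_min: float,
                    cycles: int,
                    denature_s: float,
                    anneal_extend_min: float,
                    final_extension_min: float) -> float:
    """Total programmed PCR time in minutes, ignoring temperature ramping."""
    per_cycle_min = denature_s / 60.0 + anneal_extend_min
    return initial_denature_min + cycles * per_cycle_min + final_extension_min


# Class I conditions from the text: 95°C for 2 min; 30 x (94°C for 30 s, 65°C for 4 min); 72°C for 10 min.
class_i_total = cycling_minutes(2, 30, 30, 4, 10)
print(class_i_total)  # 147.0
```

The result of 147 minutes is in line with the approximately 150 minutes reported for the multiplex long-range PCR in the workflow timings.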

| HLA class II
Primer sequences are shown in Table 2. Amplicons for class II (whole gene including exons, introns and UTRs of DRB1, DQB1, DQA1, DPA1 and DPB1) were generated with primer mixes as shown in Table 2 using the following conditions: 25 μL PCR reactions were performed using 60 ng DNA, 20 μM primer mix, 1× GoTaq Long (Promega, UK). The cycling conditions were as follows: 95°C for 2 minutes followed by 30 cycles of 94°C for 30 seconds and 65°C for 5/7/9/10 minutes, with a final extension of 10 minutes at 72°C. Amplicons were then quantified by Qubit (Thermo Fisher Scientific, UK) according to the manufacturer's instructions and pooled in equimolar amounts for sequencing. All primer sequences were taken from the existing literature or redesigned from this in order to ensure complete coverage of all known HLA alleles by spanning the entirety of the HLA coding regions.
Custom primer design was also carried out for risk alleles in APOL1 that predispose to focal segmental glomerulosclerosis (FSGS) in African patients. The risk alleles were rs73885319 (GRCh38 Chr22:36265860) and rs60910145 (GRCh38 Chr22:36265988). The PCR primers for this region were spiked into the HLA primer mix as proof of concept.

| Library preparation and sequencing
Barcoded libraries were generated using the native barcoding (EXP-NBD104, EXP-NBD114) and sequencing by ligation kits (SQK-LSK109) from ONT. Briefly, 1.3 μg of amplicon pools were end-repaired and A-tailed using NEBNext Ultra II module E7546 (3.5 μL End Repair Buffer, 2 μL FFPE repair mix, 3.5 μL Ultra II end-prep reaction buffer and 3 μL of Ultra II end-prep enzyme mix to 1.3 μg DNA in a total reaction volume of 60 μL). This was incubated at 20°C for 5 minutes followed by 65°C for 5 minutes. Clean up was performed using AMPure XP beads (Beckman Coulter) in a 1× ratio. Quantification was performed using fluorimetry (Qubit) and 500 ng taken through to barcode ligation.
Native barcodes were ligated to 500 ng end-repaired/A-tailed DNA using NEB blunt/TA ligase M0367 (2.5 μL Native barcode, 25 μL Blunt/TA Ligase Master Mix to 500 ng DNA in a total volume of 50 μL). Following a 10-minute incubation at room temperature the barcode ligated DNA was cleaned using AMPure XP beads (Beckman Coulter) in a 1× ratio. DNA quantification was performed using fluorimetry (Qubit) and a pool of all samples was created containing 700 ng of DNA in total. To reduce the volume a further clean-up was performed using 2.5× AMPure beads and eluting into 65 μL.
Adaptors were ligated by adding 20 μL barcode adaptor mix (ONT), 20 μL quick ligation buffer and 10 μL T4 ligase (NEB Module E6056). Following a 10-minute incubation at room temperature the adaptor ligated DNA was cleaned using AMPure beads in a 0.4× ratio and washed using long fragment buffer (ONT) before eluting in 15 μL of elution buffer (ONT).

| Bioinformatics analysis
All data analysis was carried out on an Ubuntu 18.04 LTS server (with 16 cores and 256 GB memory) and the University of Birmingham BEAR high-performance computing (BEAR-HPC) facility. The jobs submitted to the BEAR-HPC facility used 32 cores and 256 GB of system memory with a wall time of 30 minutes per sample. Raw data underwent run management with MinKNOW v19.05.0 and basecalling using the Guppy 3.1.5+781ed57 basecaller using standard parameters. Quality control plots were generated with NanoPlot 1.26.3. 23 Basecalled FASTQ files were demultiplexed using Guppy barcoder 3.1.5+781ed57 (parameters: -t 32 --trim_barcodes --require_barcodes_both_ends -q 0 --compress_fastq). Canu 2.0.0 24 was then used to correct the raw FASTQ files before alignment (parameters: canu correct -nanopore-raw genomeSize=3.1g). Binned reads were aligned to the Illumina Platinum GRCh38 reference genome using minimap2 v2.12 (parameter: -ax map-ont, setting a default mismatch penalty of four), 25 sorted and indexed using Samtools 1.3.1 using htslib 1.3.1. 26,27 The aligned BAM file was then input into the HLA-LA v1.2 pipeline. 28 Output at four field resolution (via the R1_bestguess.txt output) was taken as consensus output to compare to reference Illumina/Sanger/SSP calls. For FSGS risk alleles, the aligned BAM files were filtered for the region of interest (GRCh38 Chr22:36265800-36266100) and then variant calling was performed using FreeBayes v1.0.0 29 outputting all sites in genome variant call file mode.
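The alignment, sorting and indexing steps described above can be sketched as shell command lines assembled in Python. This is an illustration only, not the authors' script: the function name `alignment_commands`, the file names and the thread count passed to minimap2 and samtools are assumptions, and only the `-ax map-ont` preset comes from the text. The commands are built as strings and not executed.

```python
import shlex
from typing import List


def alignment_commands(reference_fasta: str,
                       reads_fastq: str,
                       out_bam: str,
                       threads: int = 32) -> List[str]:
    """Build (but do not run) the align, sort and index command lines."""
    ref = shlex.quote(reference_fasta)
    reads = shlex.quote(reads_fastq)
    bam = shlex.quote(out_bam)
    # minimap2 writes SAM to stdout; samtools sort reads it from stdin ("-").
    align_and_sort = (
        f"minimap2 -ax map-ont -t {threads} {ref} {reads} "
        f"| samtools sort -@ {threads} -o {bam} -"
    )
    index = f"samtools index {bam}"
    return [align_and_sort, index]


# Hypothetical file names for one demultiplexed barcode.
for command in alignment_commands("GRCh38.fa", "barcode01.fastq", "barcode01.bam"):
    print(command)
```

The resulting sorted, indexed BAM is the form of input that the text describes passing to the HLA-LA pipeline.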
Haplotype phasing of the HLA amplicon data was carried out using WhatsHap v.0.18. 30 Initially, variant calls for the amplicon data were produced using FreeBayes (parameters: -C 2 -0 -O -q 20 -z 0.10 -E 0 -X -u -p 2 -F 0.6), and WhatsHap was then used to produce a phased variant call file (parameters: -o phases.vcf input.BAM). A phased haplotype gene transfer format (GTF) file and a haplotagged BAM file were then produced (using the WhatsHap stats and WhatsHap haplotag commands, respectively) for visualisation. For identification of homozygosity, visual inspection of the variant calls in the Integrative Genomics Viewer (IGV) was carried out.
Concordance between reference and nanopore sequenced HLA alleles was defined at each field level according to whether there was an exact match. If there was, this was marked as correct. The number of correct alleles was divided by the total number of reference fields present across all the samples (Supporting Information). If there was no third or fourth field, the total number of fields was reduced by the number of samples missing the third/fourth field. A number of samples from the Creary and Turner papers gave more than one string of alleles as potential references. As the nanopore system by default gave a 'most likely' string of alleles, this was initially chosen as the one for comparison and only these references are given.
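The per-field concordance calculation described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: it assumes that a call is "correct" at field n when the locus and the first n fields match the reference exactly, and that a reference lacking the third or fourth field is excluded from the denominator at that level. The function name `field_concordance` is hypothetical, and apart from the HLA-C pair reported for NHSBT sample 1, the example alleles are illustrative.

```python
from typing import List, Sequence, Tuple


def _split(name: str) -> Tuple[str, List[str]]:
    """Return (locus, fields) for a name such as 'C*07:02:01:03'."""
    locus, _, rest = name.partition("*")
    return locus, rest.split(":")


def field_concordance(pairs: Sequence[Tuple[str, str]],
                      max_fields: int = 4) -> List[float]:
    """Fraction of (reference, call) pairs matching at each field level."""
    correct = [0] * max_fields
    total = [0] * max_fields
    for reference, call in pairs:
        ref_locus, ref_fields = _split(reference)
        call_locus, call_fields = _split(call)
        # Only field levels present in the reference count towards the total.
        for n in range(1, min(len(ref_fields), max_fields) + 1):
            total[n - 1] += 1
            if call_locus == ref_locus and call_fields[:n] == ref_fields[:n]:
                correct[n - 1] += 1
    return [c / t if t else float("nan") for c, t in zip(correct, total)]


pairs = [
    ("C*07:02:01:03", "C*07:123"),        # MiSeq reference vs nanopore call, NHSBT sample 1
    ("A*02:01:01:01", "A*02:01:01:01"),   # illustrative exact match
    ("B*07:02:01", "B*07:02:01:01"),      # illustrative three field reference
]
print(field_concordance(pairs))
```

In this toy input the first field is concordant for all three pairs, while the fourth field is scored over only the two references that report it.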
All data is available on request.

| Data delivery
For the NHSBT sample typing, in total 2.7 Gbases of sequencing data was produced, with a median read length of 3377 bases, a read length N50 of 3606 bases and a median read quality of 9.4. For the anthropology panel sample typing a total of 3.8 Gbases of sequencing data was produced, with a median read length of 3170 bases, a read length N50 of 3513 bases and a median read quality of 9.9. Runtime was standardised at 8 hours for both panels. For the single Flongle sequenced sample, 43 266 reads with a median read length of 1080 bases were produced with a total output of 110 megabases of sequence.
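For readers unfamiliar with the read length N50 statistic quoted above, the following minimal sketch computes it alongside the median. The N50 is taken here as the length of the read at which the running total of bases, over reads sorted from longest to shortest, first reaches half of the total yield; the function name and the toy read lengths are illustrative and are not data from this study.

```python
from statistics import median
from typing import Sequence


def read_length_n50(lengths: Sequence[int]) -> int:
    """Length L such that reads of length >= L contain at least half of all bases."""
    if not lengths:
        raise ValueError("no read lengths supplied")
    ordered = sorted(lengths, reverse=True)
    half_yield = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_yield:
            return length
    return ordered[-1]  # not reached for non-empty positive input


toy_lengths = [1000, 2000, 3000, 4000]   # illustrative values only
print(read_length_n50(toy_lengths))      # 3000
print(median(toy_lengths))               # 2500.0
```

Because the N50 weights reads by the bases they contribute, it is at least as large as the median for typical length distributions, as in the values reported for both panels.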

| Workflow
The timings for the total workflow are shown in Figure S1. The multiplex long-range PCR reaction took 150 minutes, followed by a modified LSK-109 protocol taking 30 minutes, followed by 120 minutes on the nanopore system and 30 minutes of assembly of the HLA calls. The yield of the flow cells over the project determined the runtime. Typically, a run of 2 hours for a single sample on the Flongle (40 Mb yield) and 50 minutes for 12 multiplexed samples on the MinION (396 Mb yield) allowed sufficient data for 500× coverage (Figure S2). We therefore set the runtime at 2 hours.

| Class I and class II HLA call accuracy
In a preliminary analysis, it was found that at least 500× coverage of each amplicon was required for accurate HLA calling; samples with low coverage were therefore rerun. For the first set of NHSBT samples, 11 samples underwent analysis for class I alleles (Table 3). All samples were correct for the first field. NHSBT sample 1 had a reference blood transfusion service (BTS) HLA-C allele of C*07; the MiSeq call was C*07:02:01:03 (although C*07:123 was given as the second option in the BTS typing) and the nanopore call was C*07:123.
For the second set of NHSBT samples, a more challenging set of two samples was chosen. Concordance for class I and class II calls was 100% with 0% error.
For the anthropology panel, 15 samples underwent analysis for class I and class II alleles (Table 4). All samples were an exact match apart from sample IHW09376.

TABLE 3 List of results for samples within NHSBT experiment

| FSGS/APOL1 allele calling
In order to understand the use of the nanopore system for single nucleotide polymorphism (SNP) variants that may predispose to clinically relevant diseases, primers for the G1 and G2 risk alleles for FSGS were spiked into the mix. The G1 alleles (rs73885319, Chr22:36265860, NC_000022.10:g.36661906A>G and rs60910145, Chr22:36265988, NC_000022.10:g.36662034T>G) were called in all the NHSBT samples. Of the 12 samples, all had the A reference allele. The G2 allele is a 6 bp deletion in APOL1 (rs71785313, Chr22:36266000, NC_000022.10:g.36662046_36662051delTTATAA). The indel was not seen in any of the 12 samples. Of note, several small common SNPs within 200 bp of the region of the SNPs of the APOL1 gene were observed, for example, rs1403581130.

| R9.4.1 vs R10 pores
As part of an early access programme, the project was given access to the new R10 nanopore on which to run HLA typing samples (Figure 1). The R10 data were called using the identical pipeline to the R9 data and displayed significantly higher single-base accuracy (Figures 2 and 3). Median number of mismatches (NM, where fewer mismatches is better) as reported by MiniMap2 was 51 for the R10 pore vs 551 for the R9.4.1 pore (Mann-Whitney P < .0001, Figure 3).
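The mismatch metric compared above is carried in the optional `NM:i:` tag of each SAM alignment record. The sketch below shows one way such values could be collected and summarised from SAM text; it is not the authors' analysis, the function name `nm_values` is hypothetical, and the two alignment records are invented toy data.

```python
from statistics import median
from typing import Iterable, List


def nm_values(sam_lines: Iterable[str]) -> List[int]:
    """Collect the NM (edit distance) tag from each SAM alignment line."""
    values = []
    for line in sam_lines:
        if line.startswith("@"):       # skip header lines
            continue
        columns = line.rstrip("\n").split("\t")
        for tag in columns[11:]:       # optional fields follow the 11 mandatory columns
            if tag.startswith("NM:i:"):
                values.append(int(tag[len("NM:i:"):]))
                break
    return values


# Toy records (invented): 11 mandatory SAM columns followed by optional tags.
toy_sam = [
    "@SQ\tSN:chr6\tLN:170805979",
    "\t".join(["read1", "0", "chr6", "100", "60", "10M", "*", "0", "0",
               "ACGTACGTAC", "*", "NM:i:2", "AS:i:16"]),
    "\t".join(["read2", "16", "chr6", "200", "60", "10M", "*", "0", "0",
               "ACGTACGTAC", "*", "AS:i:20", "NM:i:0"]),
]
values = nm_values(toy_sam)
print(values)          # [2, 0]
print(median(values))  # 1.0
```

Lists of NM values gathered this way for each pore type are the kind of input a rank-based test such as Mann-Whitney would compare.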

| Single sample calling on the Flongle device
In order to understand whether the output of a miniaturised nanopore device, the Flongle flow cell, was sufficient for HLA typing, a single sample (NHSBT sample 27) was run on an R9.4.1 Flongle. Data output was 0.9 Gb and 100% accuracy was seen at four field level for both class I and class II fields for this sample.

| HLA phasing and identification of homozygosity in HLA-DRB1
Identification of maternal and paternal contributions to HLA alleles is vital to identify runs of homozygosity, which may affect organ matching and are difficult to detect using short-read technologies. In order to show the ability of nanopore long-read sequencing to phase HLA as well as to identify runs of homozygosity, a single sample (anthropology panel sample 1, IHW09377) was chosen for analysis. After variant calling with FreeBayes, haplogroups were generated with WhatsHap. For this sample, two haplogroups were derived, presumably the maternal and paternal contribution to the inherited HLA of the proband. This could be clearly seen in IGV for HLA-DRB1 (Figure 1) by generating a haplogroup-tagged BAM file. In this figure, the separate contributions from maternal and paternal alleles can be seen in the differently coloured reads (green for haplogroup 1, blue for haplogroup 2). Each haploblock spanned the entire amplicon, reinforcing the codominant inheritance of the HLA system. Visual inspection of sample IHW09377 in the anthropology panel showed that HLA-DRB1 was homozygous (Figure 4).
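Homozygosity was assessed here by visual inspection in IGV. As a complementary illustration only, and not the method used in this study, the proportion of homozygous variant calls in a single-sample VCF could be summarised programmatically. The sketch below assumes a VCF with one sample column and a `GT` entry in the FORMAT field; the function name and the toy records are hypothetical.

```python
from typing import Iterable


def homozygous_fraction(vcf_lines: Iterable[str]) -> float:
    """Fraction of variant genotypes that are homozygous for a non-reference allele."""
    hom_alt = het = 0
    for line in vcf_lines:
        if line.startswith("#"):                      # skip header lines
            continue
        columns = line.rstrip("\n").split("\t")
        format_keys = columns[8].split(":")
        sample_values = columns[9].split(":")
        genotype = sample_values[format_keys.index("GT")]
        alleles = genotype.replace("|", "/").split("/")  # phased or unphased
        if "." in alleles:                            # missing call
            continue
        if len(set(alleles)) > 1:
            het += 1
        elif alleles[0] != "0":                       # ignore homozygous reference sites
            hom_alt += 1
    total = hom_alt + het
    return hom_alt / total if total else 0.0


# Toy records (invented positions and alleles).
toy_vcf = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsample",
    "chr6\t100\t.\tA\tG\t50\t.\t.\tGT:DP\t1/1:600",
    "chr6\t150\t.\tC\tT\t50\t.\t.\tGT:DP\t1/1:580",
    "chr6\t210\t.\tG\tA\t50\t.\t.\tGT:DP\t0|1:610",
    "chr6\t300\t.\tT\tC\t50\t.\t.\tGT:DP\t1|1:590",
]
print(homozygous_fraction(toy_vcf))  # 0.75
```

A high fraction across an amplicon would be consistent with the mostly homozygous pattern described for HLA-DRB1, although any threshold would need to be validated against known samples.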

| Speed and cost-effectiveness
The nanopore-based assay showed considerable speed-based advantages over conventional typing. DNA extraction took 1 hour, library preparation 3 hours and sequencing 4 hours to 20 hours depending on the volume of sequence data required. Bioinformatics analysis took 1 hour on a 16 core Intel Xeon server with 256 GB of system memory running Ubuntu LTS 18.04, meaning that in total the assay could be run within 8 hours which is a considerable time saving over NGS and SSP methods. In terms of cost-effectiveness, assuming the pooling of 15 samples on a MinION flow cell, we calculated that the blood DNA extraction costs £25, the long-range PCR £3, barcoding and library preparation £10 and flow cell costs £27.36, in total giving a cost of £38 without DNA extraction. Typical commercial HLA typing costs range from £300 to £800. For users who do not wish to invest in dedicated hardware, an Ubuntu VM with 64 cores and 256 GB of RAM costs £2.87 an hour (as of February 2020) on the Microsoft Azure cloud (https://azure.microsoft.com/en-gb/pricing/calculator/), meaning a sample compute cost of £2.87/sample (as time is billed per hour).

| CONCLUSIONS
In this study, we have shown that full-length HLA typing using long-range PCR and sequencing on a nanopore sequencing system is highly accurate, cheaper than the nearest alternative and feasible for deployment into the field using a 'laboratory in a suitcase' approach. This approach uses the portability of nanopore sequencing, coupled to a laptop computer and portable PCR equipment to allow HLA typing in resource-poor conditions. Current methodologies for typing of HLA rely on highly specific, but not broad assays such as SSP assays 31 that can sequence individual alleles but not provide in-depth reconstruction of the entire region of interest. This means that, for rarer alleles, although SSP provides accuracy, this comes at the cost of not having a single assay that can be used for all patients. Long amplicons, provided by long-range PCR, have previously been sequenced using short-read sequencing, 32 but our method, coupled with the long-read capability of the nanopore system, provides a unique ability to accurately understand the HLA region.
FIGURE 3 Violin and whisker plots of log10 of: Left (A), the alignment score (higher is better) for a representative sample comparing R9.4.1 pore (blue) and R10 pore (red). Right (B), the number of mismatches (lower is better) for a representative sample comparing R9.4.1 pore (blue) and R10 pore (red)
Our use of long-range PCR 33 has advantages in that the entire gene can be encompassed in one PCR reaction, allowing reconstruction of haplotypes 34 and accurate resolution of complex parts of the HLA region. It also requires limited sample input (typically 50 ng of genomic DNA). The disadvantage is the time needed for the extension phase of the PCR reaction, with the longest PCR amplicon (>10 kb) requiring over 10 minutes per cycle which means that a typical long-range PCR reaction for HLA typing takes up to 3 hours. This methodology has the advantage that it can be performed in relatively resource-poor environments, enabling its use in lower and middle-income countries (LMIC). When conceived, our original experimental plan for this assay was one that could be used in LMIC as an alternative to expensive and slow out of country HLA typing. An alternative to overcome the disadvantage of extensive long-range PCR amplification may be the use of loop-mediated isothermal amplification (LAMP-PCR), 35 which has the advantage of rapidity but would require extensive primer redesign.
One potential solution to this problem would be the use of Cas9 enrichment to pull out HLA regions from samples of genomic DNA, 36,37 which are then subject to a ligation reaction and sequenced using the nanopore device. This has multiple advantages, the first of which is the ability to retrieve in an unbiased fashion the HLA region of interest. In addition, methylation of HLA regions could be natively called off the retrieved DNA because of the inherent ability of nanopore systems to detect methylated bases as signal change within the pore. 38 This would offer the ability to understand the relationship between HLA expression 39 and methylation. A disadvantage of the Cas9 method is the relatively large amount of genomic DNA required for input (>3 μg); although blood samples will typically provide this, DNA extraction has to be performed carefully in order to maximise retrieved DNA fragment length. Spin column-based extraction can typically obtain >20 kb fragments, although for reads similar to the full length of HLA (2 Mb), more exotic methods of DNA extraction such as the Sambrook and Russell method 40 or pulsed-field gel electrophoresis must be used. These methods would make it considerably less accessible to LMIC and would restrict this type of application to research environments only.
FIGURE 4 IGV plot showing that HLA-DRB1 is homozygous, represented by the VCF allele call plot (panel below ideogram), which is composed of mostly homozygous (red) SNPs and occasional heterozygous (blue) SNPs. IGV, Integrative Genomics Viewer
Another potential benefit to the nanopore system would be parallel sequencing of an expression assay of HLA in order to understand the effect of HLA expression on transplant outcome. It is highly likely that accurate 3 to 4 field resolution of HLA alleles along with methylation and expression data within the same assay would transform our understanding of the importance of this region in HLA typing. There is evidence that HLA typing of the class II system is of considerable importance in haemopoietic stem cell transplantation, 41 and that HLA expression has a bearing on outcomes in this type of transplantation. 42 The algorithm we have used for reconstruction of the HLA region (HLA-LA) has significant advantages as it uses a population reference graph of HLA alleles 28 to reconstruct the HLA region to high accuracy. One issue with population reference graph reconstruction is that it is both computationally and memory intensive, especially with long-read nanopore data. We are working with the authors to modify the algorithm to work on nanopore-based data such that reconstruction is feasible on computers in the field. Another option is the use of a cloud-based infrastructure where nanopore sequencing data is uploaded from the field and HLA types called in real time. This has the advantage of centralised control of the algorithm and quality assurance, but the disadvantage of requiring a method of transfer of nanopore sequencing runs (typically 5-6 GB) which may be difficult in LMIC.
In conclusion, we present our methodology for using nanopore sequencing for four field resolution of all class I and class II alleles. It is cost-effective, rapid and has many practical advantages over short-read sequencing and we suggest it may represent the most suitable future methodology for HLA typing.