Name: imgt2aa

Owner: NMDP/Be The Match Bioinformatics Research

Description: extract aligned amino acid sequences from IMGT/HLA

Created: 2017-06-21 04:12:15.0

Updated: 2017-06-21 04:28:21.0

Pushed: 2017-06-22 18:31:26.0


Size: 8

Language: Perl


extract aligned amino acid sequences from IMGT/HLA


First, download and unzip the hla.xml file:

curl ftp://ftp.ebi.ac.uk/pub/databases/ipd/imgt/hla/xml/hla.xml.zip -o hla.xml.zip
unzip hla.xml.zip


Then run the script:

perl imgt2aa.pl >DPB1.db

The output file is tab-delimited with:

How it works

The hla.xml file contains nucleotide sequences and cDNA coordinates for HLA alleles. The exon 2 nucleotides are extracted and parsed. The cDNA coordinates are then used to offset the AA sequence so that the first position in the string (possibly "*") corresponds to AA 1 of the mature protein.
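
The offset step can be illustrated with a minimal Perl sketch. This is not the code in imgt2aa.pl. The function name exon_to_aligned_aa, the leader-peptide length argument, and the sample values are assumptions made for illustration. The sketch translates an exon with the standard genetic code, uses the exon's cDNA start coordinate to find the first complete codon, and pads with "*" so that the first character of the result lines up with AA 1 of the mature protein. Stop codons and unrecognized codons are written as "X" here so they are not confused with the "*" padding.

```perl
use strict;
use warnings;

# Standard genetic code, with bases ordered T, C, A, G at each codon position.
my @bases = qw(T C A G);
my $aas   = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG';
my %codon2aa;
my $i = 0;
for my $b1 (@bases) {
    for my $b2 (@bases) {
        for my $b3 (@bases) {
            $codon2aa{"$b1$b2$b3"} = substr($aas, $i++, 1);
        }
    }
}

# Hypothetical helper (not taken from imgt2aa.pl).
#   $nuc        exon nucleotide string
#   $cdna_start 1-based cDNA coordinate of the exon's first base
#   $leader_len number of leader-peptide codons before mature AA 1 (assumed input)
sub exon_to_aligned_aa {
    my ($nuc, $cdna_start, $leader_len) = @_;

    # Bases at the start of the exon that finish a codon begun in the previous exon.
    my $skip = (3 - ($cdna_start - 1) % 3) % 3;
    # 1-based number, within the coding sequence, of the first complete codon.
    my $first_codon = ($cdna_start - 1 + $skip) / 3 + 1;

    my $aa = '';
    for (my $p = $skip; $p + 3 <= length($nuc); $p += 3) {
        my $res = $codon2aa{ uc(substr($nuc, $p, 3)) } // 'X';
        $res = 'X' if $res eq '*';   # keep "*" reserved for padding
        $aa .= $res;
    }

    # Position of the first translated residue relative to mature AA 1.
    my $mature_pos = $first_codon - $leader_len;
    return ('*' x ($mature_pos - 1)) . $aa if $mature_pos >= 1;

    # The exon starts inside the leader peptide: drop the leader residues.
    my $drop = 1 - $mature_pos;
    return $drop < length($aa) ? substr($aa, $drop) : '';
}

# Made-up input: exon starting at cDNA position 101 with a 29-codon leader.
print exon_to_aligned_aa('AGATGGCCTTTGA', 101, 29), "\n";   # prints *****MAF
```

In this made-up case the first complete codon is codon 35 of the coding sequence, which is position 6 of the mature protein, so five "*" characters precede the translated residues.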


Martin Maiers mmaiers@nmdp.org

This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.