nmdp-bioinformatics/imgt2aa

Name: imgt2aa

Owner: NMDP/Be The Match Bioinformatics Research

Description: extract aligned amino acid sequences from IMGT/HLA

Created: 2017-06-21 04:12:15.0

Updated: 2017-06-21 04:28:21.0

Pushed: 2017-06-22 18:31:26.0

Homepage: null

Size: 8

Language: Perl

README

imgt2aa

extract aligned amino acid sequences from IMGT/HLA

Prerequisites

First download and unzip the hla.xml file:

curl ftp://ftp.ebi.ac.uk/pub/databases/ipd/imgt/hla/xml/hla.xml.zip -o hla.xml.zip
unzip hla.xml.zip

Run

perl imgt2aa.pl >DPB1.db

The output file is tab-delimited with:

How it works

The hla.xml file contains nucleotide sequences and cDNA coordinates for HLA alleles. The nucleotides for exon 2 are extracted and parsed. The cDNA coordinates are then used to offset the AA sequence so that the first position in the string (possibly "*") corresponds to AA 1 of the mature protein.
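
The offset step can be illustrated with a minimal Python sketch. It is not the code in imgt2aa.pl, and it does not read hla.xml: the function name `exon_to_mature_aa` and its parameters (`cdna_start`, `mature_offset`) are hypothetical, and the input values in the usage lines are made up. It assumes cDNA position 1 is the first base of the start codon, that the leader peptide length is known, and that positions not covered by the exon are padded with "*".

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def exon_to_mature_aa(exon_nt, cdna_start, mature_offset):
    """Translate one exon and align it to mature-protein numbering.

    exon_nt       -- nucleotides of the exon (hypothetical input)
    cdna_start    -- 1-based cDNA coordinate of the exon's first base
    mature_offset -- number of leader-peptide codons before mature AA 1
    """
    exon_nt = exon_nt.upper()
    first = cdna_start - 1      # 0-based cDNA index of the exon's first base
    skip = (-first) % 3         # bases to skip to reach a codon boundary
    first_codon = (first + skip) // 3  # 0-based codon index of first full codon

    # Translate complete codons only; unknown codons (and stop codons,
    # in this sketch) are written as "*".
    aa = "".join(
        CODON_TABLE.get(exon_nt[i:i + 3], "*")
        for i in range(skip, len(exon_nt) - 2, 3)
    )

    # 1-based mature-protein position of the first translated residue.
    first_pos = first_codon - mature_offset + 1
    if first_pos >= 1:
        # Pad so that string index 0 corresponds to mature AA 1.
        return "*" * (first_pos - 1) + aa
    # The exon starts inside the leader peptide: drop the leader residues.
    return aa[1 - first_pos:]


# Made-up inputs, for illustration only.
print(exon_to_mature_aa("CCATGGCTAAAG", cdna_start=8, mature_offset=1))
print(exon_to_mature_aa("ATGGCTAAA", cdna_start=1, mature_offset=2))
```

With these toy inputs, the first call skips two bases to reach a codon boundary and pads two positions before the translated residues. The second call starts inside the leader peptide, so the leader residues are dropped rather than padded.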

TODO

Martin Maiers mmaiers@nmdp.org


This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.