marcottelab/hmm_proteome_annotation

Name: hmm_proteome_annotation

Owner: The Marcotte Lab

Description: null

Created: 2016-04-15 00:45:35.0

Updated: 2016-04-20 04:21:20.0

Pushed: 2017-01-12 18:48:05.0

Homepage: null

Size: 108798

Language: Shell

README

This is a process for sorting a whole proteome into HMM profiles using hmmscan. Outputs:

  1. A flat file of the best hit protein - HMM profile matches with annotations (tophit)
  2. A flat file of all protein - HMM profile matches with annotations (annotated)
  3. A flat file of all protein - HMM matches (all)
  4. A flat file of proteins with no HMM hit (nohit)
  5. The input FASTA annotated with the EggNOG HMM group annotations, written to the proteomes/[species] folder
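As a rough illustration of how the tophit and nohit outputs relate to the full set of matches, the sketch below picks the highest-scoring profile per protein from a standard HMMER `--tblout` table and lists FASTA entries that have no match. It assumes the standard tblout column order (profile name in column 1, protein name in column 3, full-sequence E-value and bit score in columns 5 and 6). The function names `top_hits` and `no_hits` are hypothetical, and the repository's own scripts may select and format hits differently.

```shell
# Sketch only: assumes a standard HMMER --tblout file,
# not the repository's own output formats.

# top_hits TBLOUT
# Print "protein<TAB>profile<TAB>evalue<TAB>score" for the
# best-scoring profile of each protein, sorted by protein name.
top_hits() {
  awk '
    /^#/ || NF < 6 { next }            # skip comment and short lines
    {
      q = $3                           # query (protein) name
      if (!(q in best) || $6 + 0 > best[q]) {
        best[q] = $6 + 0               # full-sequence bit score
        row[q] = q "\t" $1 "\t" $5 "\t" $6
      }
    }
    END { for (q in row) print row[q] }
  ' "$1" | sort
}

# no_hits FASTA TBLOUT
# Print the IDs (first word of each header) of FASTA entries
# that never appear as a query in TBLOUT, in FASTA order.
no_hits() {
  awk '
    FILENAME == ARGV[1] { if ($0 !~ /^#/ && NF >= 3) hit[$3] = 1; next }
    /^>/ { id = substr($1, 2); if (!(id in hit)) print id }
  ' "$2" "$1"
}
```

Ranking by full-sequence bit score is a design choice made for this sketch; ranking by E-value would usually pick the same profile.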

Instructions

  1. Place the annotation and HMM files for a phylogenetic level in hmms/

ex. euNOG_hmm.tar.gz euNOG.annotations.tsv.gz

HMM profiles come from http://eggnogdb.embl.de/#/app/downloads
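Before pressing, it can help to confirm that both files for a level are in place. The sketch below checks for the two file names implied by the euNOG example (`[level]_hmm.tar.gz` and `[level].annotations.tsv.gz`). The function name `check_level_files` is hypothetical, and the naming pattern is an assumption generalized from that single example.

```shell
# check_level_files LEVEL [HMM_DIR]
# Succeeds only if both expected files exist and are non-empty.
# HMM_DIR defaults to hmms (relative to the current directory).
check_level_files() {
  local level=$1 dir=${2:-hmms} status=0 f
  for f in "$dir/${level}_hmm.tar.gz" "$dir/${level}.annotations.tsv.gz"; do
    if [ ! -s "$f" ]; then
      echo "missing or empty: $f" >&2
      status=1
    fi
  done
  return "$status"
}

# Usage, from the main directory:
#   check_level_files euNOG
```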

  2. From the main directory run: bash masterscripts/startPress.sh [level]

ex. bash masterscripts/startPress.sh euNOG

This step takes about 10 minutes on TACC.
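One way to tell whether pressing has finished is to look for the index files that HMMER's `hmmpress` writes next to an HMM database (`.h3m`, `.h3i`, `.h3f` and `.h3p`). The sketch below assumes that the press step uses `hmmpress`, which the source does not state explicitly. The location of the pressed database is also not given, so it is passed in as an argument, and `is_pressed` is a hypothetical helper name.

```shell
# is_pressed HMM_DB
# Succeeds if all four hmmpress index files exist next to HMM_DB.
is_pressed() {
  local db=$1 ext
  for ext in h3m h3i h3f h3p; do
    if [ ! -s "$db.$ext" ]; then
      echo "not pressed: $db.$ext is missing or empty" >&2
      return 1
    fi
  done
  return 0
}

# Usage (the database path here is a placeholder, not a path from the repository):
#   is_pressed path/to/level.hmm && echo "ready for hmmscan"
```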

  3. Make a directory in proteomes/ for the species that you want to run, ex. mkdir proteomes/arath

  4. Place the species' FASTA file in its folder in proteomes/, ex. proteomes/arath/uniprot-proteome%3AUP000006548.fasta

  5. After the HMMs are pressed, from the main directory run: bash masterscripts/startHmmscan.sh [species] [proteome] [level]

ex. bash master_scripts/startHmmscan.sh arath proteomes/arath/uniprot-proteome%3AUP000006548.fasta euNOG

This step takes up to 20 hours on TACC, depending on the number of proteins in the proteome and the number of HMM profiles.
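Because the scan can run for many hours, a quick check of the inputs before launching it may save a failed job. The sketch below verifies that the species directory and FASTA file exist and reports the protein count, which is one of the two factors that drive the run time. The helper names `count_proteins` and `preflight` are hypothetical and are not part of the repository. Both assume they are run from the main directory.

```shell
# count_proteins FASTA
# Print the number of sequences (header lines starting with ">").
count_proteins() {
  grep -c '^>' "$1" || true   # grep -c exits non-zero on a count of 0
}

# preflight SPECIES FASTA
# Check the layout described in the steps above and report the protein count.
preflight() {
  local species=$1 fasta=$2 n
  if [ ! -d "proteomes/$species" ]; then
    echo "no directory proteomes/$species" >&2
    return 1
  fi
  if [ ! -f "$fasta" ]; then
    echo "no FASTA file at $fasta" >&2
    return 1
  fi
  n=$(count_proteins "$fasta")
  if [ "${n:-0}" -eq 0 ]; then
    echo "no sequences found in $fasta" >&2
    return 1
  fi
  echo "$species: $n proteins in $fasta"
}

# Usage, from the main directory:
#   preflight arath proteomes/arath/uniprot-proteome%3AUP000006548.fasta
```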

The tophit and nohit files are combined to create lookups for the orthology mass spec analysis.
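A minimal sketch of such a combination is shown below. The source does not give the file layouts or say how unmatched proteins are handled, so everything here is an assumption: the tophit file is taken to be tab-separated with the protein ID in column 1 and the HMM group in column 2, the nohit file to hold one protein ID per line, and each unmatched protein is kept as its own singleton group. `make_lookup` is a hypothetical name.

```shell
# make_lookup TOPHIT NOHIT
# Print a two-column "protein<TAB>group" lookup table.
# Assumed inputs (not confirmed by the source):
#   TOPHIT: tab-separated, protein ID in column 1, group in column 2
#   NOHIT:  one protein ID per line
make_lookup() {
  awk -F '\t' -v OFS='\t' '
    FILENAME == ARGV[1] { if (NF >= 2) print $1, $2; next }  # protein -> group
    NF { print $1, $1 }              # no hit: protein is its own group
  ' "$1" "$2"
}

# Usage (file names are placeholders):
#   make_lookup tophit.txt nohit.txt > lookup.txt
```

Keeping unmatched proteins in the table means every protein in the proteome resolves to some key, which avoids dropping them silently from a downstream join.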


This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.