Name: ProphET
Owner: Hurwitz Lab
Description: null
Forked from: jaumlrc/ProphET
Created: 2017-10-21 16:25:36.0
Updated: 2017-10-21 16:25:38.0
Pushed: 2017-08-22 14:04:25.0
Homepage: null
Size: 25809
Language: Perl
João L. Reis-Cunha (1,2), Daniella C. Bartholomeu (2), Ashlee M. Earl (1), Bruce W. Birren (1), Gustavo C. Cerqueira (1)
(1) Broad Institute of Harvard and MIT, Cambridge, Massachusetts, United States
(2) Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Brazil
gustavo@broadinstitute.org
Broad users don't need to install any of the programs and libraries listed below. If you are a Broadie, please follow the instructions in README_BROAD_USERS.md before installing and running ProphET.
EMBOSS suite
BEDTools suite
BLAST
Perl module Bio::Perl
Perl module SVG
Perl module GD
Perl module GD::SVG
Perl module Bio::Graphics
Perl module LWP::Simple
Perl module XML::Simple
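Before running the installer, it can be handy to see which of the Perl modules listed above are already available. The following is a minimal sketch, not part of ProphET: it assumes `perl` is on the PATH and relies on the standard `perl -MModule -e 1` idiom, which exits with a non-zero status when a module cannot be loaded. The helper names `module_check_command` and `missing_modules` are hypothetical.

```python
import shutil
import subprocess

# Module names copied from the requirements list in this README.
PERL_MODULES = [
    "Bio::Perl", "SVG", "GD", "GD::SVG",
    "Bio::Graphics", "LWP::Simple", "XML::Simple",
]

def module_check_command(module):
    """Build the command that tries to load one Perl module and does nothing else."""
    return ["perl", f"-M{module}", "-e", "1"]

def missing_modules(modules):
    """Return the modules that perl fails to load (non-zero exit status)."""
    missing = []
    for module in modules:
        result = subprocess.run(
            module_check_command(module),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if result.returncode != 0:
            missing.append(module)
    return missing

if shutil.which("perl") is None:
    print("perl was not found on PATH; cannot check modules")
else:
    absent = missing_modules(PERL_MODULES)
    print("Missing Perl modules:", ", ".join(absent) if absent else "none")
```

This only checks the Perl modules; the external programs (EMBOSS, BEDTools, BLAST) are not covered, since the exact executables ProphET calls are not stated here.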
To install ProphET, or to update the ProphET bacteriophage database, execute the following command from ProphET's home directory:
INSTALL.pl
This will search for the required libraries, set the paths of the required programs, and download from GenBank (NCBI) all genomes associated with 16 families of bacteriophages (listed in config.dir/Prophages_names_sem_Claviviridae_Guttaviridae-TxID).
Some warnings will be issued during the setup of ProphET DB. See some examples below:
Warning: bad /anticodon value '(pos:complement(13054..13056),aa:Met,seq:cat)'
Warning: NC_022920: Bad value '(pos:complement(13054..13056),aa:Met,seq:cat)' for tag '/anticodon'
These warnings refer to an unexpected format in the coordinates of tRNA features; they do not affect the execution.
If the script fails and reports missing Perl modules or libraries, please follow the instructions in the file README_INSTALLING_PERL_MODULES.md on how to install them.
From ProphET's home directory execute the following command:
ProphET_standalone.pl --fasta_in test.fasta --gff_in test.gff --outdir test
The execution should take about 5 minutes.
Three putative prophages should be reported, with their coordinates listed in the file test/phages_coords:
FORMAT:
<scaffold> <#prophage> <genomic.start.coord> <genomic.end.coord>
CONTENT:
NC_005362.1 1 327710 378140
NC_005362.1 2 1292553 1330556
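For downstream scripting, the coordinates file can be read with a few lines of Python. This is a minimal sketch under stated assumptions: each row has exactly four whitespace-separated columns (scaffold, prophage number, start, end), the file has no header row, and the coordinates are 1-based and inclusive when computing lengths. The names `Prophage` and `parse_phage_coords` are illustrative and not part of ProphET.

```python
from typing import List, NamedTuple

class Prophage(NamedTuple):
    scaffold: str
    index: int
    start: int
    end: int

def parse_phage_coords(text: str) -> List[Prophage]:
    """Parse rows of: scaffold, prophage number, start coordinate, end coordinate."""
    prophages = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or unexpected lines
        scaffold, index, start, end = fields
        prophages.append(Prophage(scaffold, int(index), int(start), int(end)))
    return prophages

# Sample rows taken from the example output shown in this README.
sample = "NC_005362.1 1 327710 378140\nNC_005362.1 2 1292553 1330556\n"
for p in parse_phage_coords(sample):
    # Length assumes inclusive coordinates (an assumption, see above).
    print(p.scaffold, p.index, p.end - p.start + 1)
```

To use it on a real run, pass the contents of the coordinates file, for example `parse_phage_coords(open("test/phages_coords").read())`.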
The nucleotide sequence of each prophage can be found in:
test/NC_005362.1.phage_1.fas
test/NC_005362.1.phage_2.fas
The program also renders a simple diagram depicting all coding genes in the bacterial genome, coding genes with significant matches to phage genes and the location of predicted prophages:
test/NC_005362.1.svg
Check that the GFF file provided to ProphET follows the GFF3 format specified by The Sequence Ontology Consortium.
Check that every sequence ID in the FASTA file (the header of each sequence) exactly matches a sequence ID in the first column of the GFF file (the seqid field), and vice versa.
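The ID consistency check above can be automated before launching a run. The sketch below is not part of ProphET and makes two assumptions that may need adjusting: the FASTA ID is taken to be the first whitespace-delimited token after `>`, and GFF comment lines start with `#`. The function names are hypothetical.

```python
from typing import Set, Tuple

def fasta_ids(fasta_text: str) -> Set[str]:
    """Collect the first token of every FASTA header line."""
    return {
        line[1:].split()[0]
        for line in fasta_text.splitlines()
        if line.startswith(">") and line[1:].strip()
    }

def gff_seqids(gff_text: str) -> Set[str]:
    """Collect the first (tab-separated) column of every GFF feature line."""
    ids = set()
    for line in gff_text.splitlines():
        if line.startswith("##FASTA"):
            break  # embedded sequence section; no feature lines follow
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        ids.add(line.split("\t")[0])
    return ids

def compare_ids(fasta_text: str, gff_text: str) -> Tuple[Set[str], Set[str]]:
    """Return (IDs only in the FASTA, IDs only in the GFF)."""
    f, g = fasta_ids(fasta_text), gff_seqids(gff_text)
    return f - g, g - f

# Tiny made-up inputs for illustration only.
fasta = ">NC_005362.1 example description\nACGT\n>plasmid1\nGGCC\n"
gff = "##gff-version 3\nNC_005362.1\tRefSeq\tgene\t1\t100\t.\t+\t.\tID=g1\n"
only_fasta, only_gff = compare_ids(fasta, gff)
print("Only in FASTA:", sorted(only_fasta))
print("Only in GFF:", sorted(only_gff))
```

Both sets should be empty for a valid input pair. If ProphET compares the full header line rather than its first token, `fasta_ids` would need to keep the whole header instead.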
ProphET_standalone.pl --fasta_in <file> --gff_in <file> --outdir
<string> [--grid] [--gff_trna <file> ] [--help]
Options:
--fasta_in - Bacterial genome Fasta file
--gff_in - Bacterial GFF file
--gff_trna - GFF file with tRNA annotations, in case the tRNAs are reported in a
separate GFF (Optional)
--outdir - output directory
--grid - Use UGER for BLAST jobs (Currently only works in the Broad
Institute UGER grid system) (Optional)
--help - print this and some additional info. about FASTA and GFF input
format (Optional)