Name: taxodb_ncbi
Owner: C3BI-pasteur-fr
Description: NCBI Taxonomy Database formatter
Created: 2015-10-12 14:27:29.0
Updated: 2016-05-26 15:29:09.0
Pushed: 2016-11-23 09:10:58.0
Size: 34
Language: Python
taxodb_ncbi.py is a simple Python script that formats the NCBI Taxonomy database.
It requires the bsddb3 Python library and the Berkeley DB library.
Install Berkeley DB
Mac OSX
brew install berkeley-db4
Ubuntu/Debian
apt-get install libdb-dev
CentOS
yum install libdb-devel
Install bsddb3
pip install bsddb3
Install taxodb_ncbi.py
python setup.py install
taxodb_ncbi.py only requires the 'Taxonomy nodes' (nodes.dmp) and 'Taxonomy names' (names.dmp) files to work.
These files are provided with the NCBI Taxonomy database.
Download the required files from the NCBI Taxonomy database:
wget ftp://ftp.ncbi.nih.gov/pub/taxonomy/taxdump.tar.gz
tar zxf taxdump.tar.gz
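For orientation, the two extracted files can be read with a few lines of Python. The sketch below is not taken from taxodb_ncbi.py; it only relies on the published taxdump layout (fields separated by a tab, a pipe and a tab, with each row ending in a tab and a pipe) and on the root node being its own parent. The function names and the three toy rows are hypothetical and stand in for the real nodes.dmp and names.dmp.

```python
import io


def parse_dmp_line(line):
    """Split one taxdump row into its fields."""
    line = line.rstrip("\n")
    # Each row ends with "\t|"; fields are separated by "\t|\t".
    if line.endswith("\t|"):
        line = line[:-2]
    return [field.strip() for field in line.split("\t|\t")]


def load_nodes(handle):
    """Map tax_id -> (parent_tax_id, rank) from nodes.dmp rows."""
    nodes = {}
    for line in handle:
        fields = parse_dmp_line(line)
        nodes[fields[0]] = (fields[1], fields[2])
    return nodes


def load_scientific_names(handle):
    """Map tax_id -> scientific name from names.dmp rows."""
    names = {}
    for line in handle:
        tax_id, name, _unique_name, name_class = parse_dmp_line(line)[:4]
        if name_class == "scientific name":
            names[tax_id] = name
    return names


def lineage(tax_id, nodes, names):
    """Return (name, rank) pairs from the root down to tax_id."""
    path = []
    while True:
        parent, rank = nodes[tax_id]
        path.append((names.get(tax_id, ""), rank))
        if parent == tax_id:  # the root node is its own parent
            break
        tax_id = parent
    return path[::-1]


# Toy rows (made up for illustration, not real NCBI records).
TOY_NODES = (
    "1\t|\t1\t|\tno rank\t|\n"
    "10\t|\t1\t|\tgenus\t|\n"
    "11\t|\t10\t|\tspecies\t|\n"
)
TOY_NAMES = (
    "1\t|\troot\t|\t\t|\tscientific name\t|\n"
    "10\t|\tToygenus\t|\t\t|\tscientific name\t|\n"
    "11\t|\tToygenus example\t|\t\t|\tscientific name\t|\n"
    "11\t|\ttoy example\t|\t\t|\tcommon name\t|\n"
)

nodes = load_nodes(io.StringIO(TOY_NODES))
names = load_scientific_names(io.StringIO(TOY_NAMES))
result = lineage("11", nodes, names)
print(result)
```

With the real files, the same functions would be called on open("nodes.dmp") and open("names.dmp") instead of the in-memory toy strings.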
python taxodb_ncbi.py -h
usage: taxodb_ncbi.py [-h] -n file -d file [-k File] [-t File] [-b File]
                      [-f string]
Program uses to format the NCBI Taxonomy Database
optional arguments:
  -h, --help            show this help message and exit
Options:
  -n file, --names file
                        names.dmp from NCBI taxonomy databank (default: None)
  -d file, --nodes file
                        nodes.dmp from NCBI taxonomy databank (default: None)
  -k File, --flatdb File
                        Output file: flat databank format. (default: None)
  -t File, --tab File   Output file: tabulated format. Organism with
                        classification. (default: None)
  -b File, --bdb File   Output file: Berleley db format (default: None)
  -f string, --format string
                        By default, reports only full taxonomy ie taxonomies
                        that have 'species', 'subspecies' or 'no rank' at the
                        final position. Otherwise, reports all taxonomies even
                        if they are partial (default: full)
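The rule behind the format option can be illustrated as a small filter over lineages. This is a sketch of the rule as the help text states it, not the script's own code: the function names, the (name, rank) lineage representation and the toy lineages are assumptions, and any mode other than "full" is simply treated as "report everything".

```python
# Ranks that mark a taxonomy as "full" according to the help text.
FULL_TERMINAL_RANKS = {"species", "subspecies", "no rank"}


def is_full_taxonomy(ranks):
    """True when the last rank of a root-to-leaf lineage is terminal."""
    return bool(ranks) and ranks[-1] in FULL_TERMINAL_RANKS


def select_lineages(lineages, mode="full"):
    """Keep only full lineages in 'full' mode, otherwise keep them all."""
    if mode != "full":
        return list(lineages)
    return [
        lin for lin in lineages
        if is_full_taxonomy([rank for _name, rank in lin])
    ]


# Toy lineages (made up for illustration), each a list of (name, rank).
toy_lineages = [
    [("root", "no rank"), ("Toygenus", "genus"), ("Toygenus example", "species")],
    [("root", "no rank"), ("Toygenus", "genus")],  # partial: stops at genus
]

full_only = select_lineages(toy_lineages, mode="full")
everything = select_lineages(toy_lineages, mode="any-other-value")
print(len(full_only), len(everything))
```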
Create the Berkeley DB database and the associated flat databank and tabulated files:
python taxodb_ncbi.py -n names.dmp -d nodes.dmp -k taxodb_ncbi.out -t taxodb_ncbi.out -b taxodb_ncbi.bdb