Name: taxizedb
Owner: rOpenSci
Description: Tools for Working with Taxonomic SQL Databases
Created: 2016-03-15 16:40:20.0
Updated: 2017-12-16 23:38:19.0
Pushed: 2018-01-05 19:53:45.0
Size: 125
Language: R
GitHub Committers
User | Most Recent Commit | # Commits |
---|
Other Committers
User | Most Recent Commit | # Commits |
---|
taxizedb
- Tools for Working with Taxonomic Databases on your Machine
taxize is a heavily used taxonomic toolbelt package in R - However, it makes web requests for nearly all methods. That is fine for most cases, but when the user has many, many names it is much more efficient to do requests to a local SQL database.
Not all taxonomic databases are publicly available, or possible to mash into a SQLized version. Taxonomic DB's supported thus far:
Get in touch in the issues with any ideas on new data sources.
This package for each data sources performs the following tasks:
db_download_*
db_load_*
dplyr
SQL backend - src_*
cran version
all.packages("taxizedb")
dev version
ools::install_github("ropensci/taxizedb")
ary("taxizedb")
ary("dplyr")
Remember to start your PostgreSQL database for ITIS and ThePlantList and your MySQL database for COL
ITIS
db_download_itis()
oad_itis(x)
The Plant List (TPL)
db_download_tpl()
oad_tpl(x)
Catalogue of Life (COL)
db_download_col()
oad_col(x)
ITIS
<- src_itis(user = "<user name>", password = "<password>")
TPL
<- src_tpl()
COL
<- src_col()
collect(src, "select * from hierarchy limit 5")
A tibble: 5 x 5
hierarchy_string tsn parent_tsn level childrencount
<chr> <int> <int> <int> <int>
202422 202422 0 0 154282
202422-846491 846491 202422 1 2666
202422-846491-660046 660046 846491 2 2654
202422-846491-660046-846497 846497 660046 3 7
202422-846491-660046-846497-846508 846508 846497 4 6
pipe the src to sql_collect
%>% sql_collect("select * from hierarchy limit 5")
A tibble: 5 x 5
hierarchy_string tsn parent_tsn level childrencount
<chr> <int> <int> <int> <int>
202422 202422 0 0 154282
202422-846491 846491 202422 1 2666
202422-846491-660046 660046 846491 2 2654
202422-846491-660046-846497 846497 660046 3 7
202422-846491-660046-846497-846508 846508 846497 4 6
get a tbl
s <- src %>% tbl("hierarchy")
Source: table<hierarchy> [?? x 5]
Database: postgres 9.6.0 [sacmac@localhost:5432/ITIS]
hierarchy_string tsn parent_tsn level childrencount
<chr> <int> <int> <int> <int>
1 202422 202422 0 0 154282
2 202422-846491 846491 202422 1 2666
3 202422-846491-660046 660046 846491 2 2654
4 202422-846491-660046-846497 846497 660046 3 7
5 202422-846491-660046-846497-846508 846508 846497 4 6
6 202422-846491-660046-846497-846508-846553 846553 846508 5 5
7 202422-846491-660046-846497-846508-846553-954935 954935 846553 6 3
8 202422-846491-660046-846497-846508-846553-954935-5549 5549 954935 7 2
9 202422-846491-660046-846497-846508-846553-954935-5549-5550 5550 5549 8 0
0 202422-846491-660046-846497-846508-846553-954936 954936 846553 6 0
... with more rows
select certain fields
s %>% select(tsn, level)
Source: lazy query [?? x 2]
Database: postgres 9.6.0 [sacmac@localhost:5432/ITIS]
tsn level
<int> <int>
1 202422 0
2 846491 1
3 660046 2
4 846497 3
5 846508 4
6 846553 5
7 954935 6
8 5549 7
9 5550 8
0 954936 6
... with more rows
taxizedb
in R doing citation(package = 'taxizedb')