ropensci/taxizedb

Name: taxizedb

Owner: rOpenSci

Description: Tools for Working with Taxonomic SQL Databases

Created: 2016-03-15 16:40:20.0

Updated: 2017-12-16 23:38:19.0

Pushed: 2018-01-05 19:53:45.0

Homepage:

Size: 125

Language: R

GitHub Committers

UserMost Recent Commit# Commits

Other Committers

UserEmailMost Recent Commit# Commits

README

taxizedb

Build Status codecov rstudio mirror downloads cran version DOI

taxizedb - Tools for Working with Taxonomic Databases on your Machine

taxize is a heavily used taxonomic toolbelt package in R - However, it makes web requests for nearly all methods. That is fine for most cases, but when the user has many, many names it is much more efficient to do requests to a local SQL database.

Not all taxonomic databases are publicly available, or possible to mash into a SQLized version. Taxonomic DB's supported thus far:

Get in touch in the issues with any ideas on new data sources.

This package for each data sources performs the following tasks:

install

cran version

all.packages("taxizedb")

dev version

ools::install_github("ropensci/taxizedb")

ary("taxizedb")
ary("dplyr")
start your SQL DBs

Remember to start your PostgreSQL database for ITIS and ThePlantList and your MySQL database for COL

Download and load DBs

ITIS

 db_download_itis()
oad_itis(x)

The Plant List (TPL)

 db_download_tpl()
oad_tpl(x)

Catalogue of Life (COL)

 db_download_col()
oad_col(x)
connect to the DBs

ITIS

<- src_itis(user = "<user name>", password = "<password>")

TPL

<- src_tpl()

COL

<- src_col()
query with SQL syntax
collect(src, "select * from hierarchy limit 5")
 A tibble: 5 x 5
                   hierarchy_string    tsn parent_tsn level childrencount
                              <chr>  <int>      <int> <int>         <int>
                             202422 202422          0     0        154282
                      202422-846491 846491     202422     1          2666
               202422-846491-660046 660046     846491     2          2654
        202422-846491-660046-846497 846497     660046     3             7
 202422-846491-660046-846497-846508 846508     846497     4             6

 pipe the src to sql_collect
%>% sql_collect("select * from hierarchy limit 5")
 A tibble: 5 x 5
                   hierarchy_string    tsn parent_tsn level childrencount
                              <chr>  <int>      <int> <int>         <int>
                             202422 202422          0     0        154282
                      202422-846491 846491     202422     1          2666
               202422-846491-660046 660046     846491     2          2654
        202422-846491-660046-846497 846497     660046     3             7
 202422-846491-660046-846497-846508 846508     846497     4             6
use dplyr verbs

get a tbl

s <- src %>% tbl("hierarchy")
 Source:   table<hierarchy> [?? x 5]
 Database: postgres 9.6.0 [sacmac@localhost:5432/ITIS]
                                            hierarchy_string    tsn parent_tsn level childrencount
                                                       <chr>  <int>      <int> <int>         <int>
1                                                     202422 202422          0     0        154282
2                                              202422-846491 846491     202422     1          2666
3                                       202422-846491-660046 660046     846491     2          2654
4                                202422-846491-660046-846497 846497     660046     3             7
5                         202422-846491-660046-846497-846508 846508     846497     4             6
6                  202422-846491-660046-846497-846508-846553 846553     846508     5             5
7           202422-846491-660046-846497-846508-846553-954935 954935     846553     6             3
8      202422-846491-660046-846497-846508-846553-954935-5549   5549     954935     7             2
9 202422-846491-660046-846497-846508-846553-954935-5549-5550   5550       5549     8             0
0           202422-846491-660046-846497-846508-846553-954936 954936     846553     6             0
 ... with more rows

select certain fields

s %>% select(tsn, level)
 Source:   lazy query [?? x 2]
 Database: postgres 9.6.0 [sacmac@localhost:5432/ITIS]
     tsn level
   <int> <int>
1 202422     0
2 846491     1
3 660046     2
4 846497     3
5 846508     4
6 846553     5
7 954935     6
8   5549     7
9   5550     8
0 954936     6
 ... with more rows
Contributors
Meta

ropensci


This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.