Name: taxize
Owner: rOpenSci
Description: A taxonomic toolbelt for R (https://ropensci.github.io/taxize/)
Created: 2011-05-19 15:05:33.0
Updated: 2017-12-22 11:49:09.0
Pushed: 2018-01-05 19:54:54.0
Homepage: https://ropensci.org/tutorials/taxize.html
Size: 26880
Language: R
`taxize` allows users to search over many taxonomic data sources for species names (scientific and common) and download upstream and downstream taxonomic hierarchical information, among other things.
The `taxize` tutorial can be found at https://ropensci.org/tutorials/taxize.html
The functions in the package that hit a specific API have a prefix and suffix separated by an underscore. They follow the format `service_whatitdoes`. For example, `gnr_resolve` uses the Global Names Resolver API to resolve species names. General functions in the package that don't hit a specific API don't have two words separated by an underscore, e.g., `classification`.
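The naming convention can be illustrated with a small base R sketch. The helper `parse_fn_name()` is hypothetical and not part of `taxize`; it only splits a function name on its first underscore to show how the `service_whatitdoes` pattern reads.

```r
# Hypothetical helper (not part of taxize): split a function name on the
# first underscore into a service prefix and an action suffix.
parse_fn_name <- function(fn) {
  pos <- as.integer(regexpr("_", fn, fixed = TRUE))
  if (pos == -1L) {
    # No underscore: a general function that is not tied to one service
    return(list(service = NA_character_, action = fn))
  }
  list(
    service = substr(fn, 1, pos - 1),
    action  = substr(fn, pos + 1, nchar(fn))
  )
}

parse_fn_name("gnr_resolve")     # service "gnr", action "resolve"
parse_fn_name("classification")  # service NA, action "classification"
```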
You need API keys for Encyclopedia of Life (EOL), Tropicos, IUCN, and NatureServe.
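One common way to keep keys out of scripts is to store them in environment variables and read them at run time. The sketch below uses only base R; the helper `get_api_key()` and the variable name `MY_TROPICOS_KEY` are placeholders for illustration, so check the `taxize` documentation for the names the package actually reads.

```r
# Hypothetical helper (not part of taxize): read an API key from an
# environment variable and fail with a clear message if it is missing.
get_api_key <- function(envvar) {
  key <- Sys.getenv(envvar, unset = "")
  if (!nzchar(key)) {
    stop("No API key found in environment variable '", envvar, "'",
         call. = FALSE)
  }
  key
}

# Placeholder variable name and value, for illustration only
Sys.setenv(MY_TROPICOS_KEY = "not-a-real-key")
get_api_key("MY_TROPICOS_KEY")
```

In practice the variable would be set outside the script, for example in `~/.Renviron`, rather than with `Sys.setenv()`.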
Note that a few data sources require SOAP web services, which are difficult to support in R across all operating systems. These include the Pan-European Species directories Infrastructure and Mycobank. Data sources that use SOAP web services have been moved to `taxizesoap` at https://github.com/ropensci/taxizesoap.
Data sources currently implemented in `taxize`:
Source | Function prefix | API Docs | API key |
---|---|---|---|
Encyclopedia of Life | eol | link | link |
Taxonomic Name Resolution Service | tnrs | "api.phylotastic.org/tnrs" | none |
Integrated Taxonomic Information Service | itis | link | none |
Global Names Resolver | gnr | link | none |
Global Names Index | gni | link | none |
IUCN Red List | iucn | link | link |
Tropicos | tp | link | link |
Theplantlist dot org | tpl | ** | none |
Catalogue of Life | col | link | none |
National Center for Biotechnology Information | ncbi | none | none |
CANADENSYS Vascan name search API | vascan | link | none |
International Plant Names Index (IPNI) | ipni | link | none |
Barcode of Life Data Systems (BOLD) | bold | link | none |
National Biodiversity Network (UK) | nbn | link | none |
Index Fungorum | fg | link | none |
EU BON | eubon | link | none |
Index of Names (ION) | ion | link | none |
Open Tree of Life (TOL) | tol | link | none |
World Register of Marine Species (WoRMS) | worms | link | none |
NatureServe | natserv | link | link |
Wikipedia | wiki | link | none |
**: There are none! We suggest using the `TPL` and `TPLck` functions in the Taxonstand package. We provide two functions to get bulk data: `tpl_families` and `tpl_get`.
***: There are none! The function scrapes the web directly.
See the newdatasource tag in the issue tracker
For more examples see the tutorial
Stable version from CRAN:
install.packages("taxize")
Development version from GitHub (Windows users: install Rtools first):
install.packages("devtools")
devtools::install_github("ropensci/taxize")
Load the package:
library('taxize')
A lot of `taxize` revolves around taxonomic identifiers. Because names can be a mess (misspellings, synonyms, etc.), it's better to get an identifier that a particular data source knows about; then we can move on to acquiring more fun taxonomic data.
uids <- get_uid(c("Chironomus riparius", "Chaetopteryx"))
Classifications - think of a species, then all the taxonomic ranks up from that species, like genus, family, order, class, kingdom.
out <- classification(uids)
lapply(out, head)
#> $`315576`
#>                 name         rank     id
#> 1 cellular organisms      no rank 131567
#> 2          Eukaryota superkingdom   2759
#> 3       Opisthokonta      no rank  33154
#> 4            Metazoa      kingdom  33208
#> 5          Eumetazoa      no rank   6072
#> 6          Bilateria      no rank  33213
#> $`492549`
#>                 name         rank     id
#> 1 cellular organisms      no rank 131567
#> 2          Eukaryota superkingdom   2759
#> 3       Opisthokonta      no rank  33154
#> 4            Metazoa      kingdom  33208
#> 5          Eumetazoa      no rank   6072
#> 6          Bilateria      no rank  33213
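Each classification is a table of `name`, `rank`, and `id`, so a single rank can be pulled out with ordinary subsetting. The sketch below does not call any web service: it rebuilds the six rows printed above by hand, and the helper `name_at_rank()` is hypothetical rather than part of `taxize`.

```r
# Rows copied by hand from the classification output printed above
cls <- data.frame(
  name = c("cellular organisms", "Eukaryota", "Opisthokonta",
           "Metazoa", "Eumetazoa", "Bilateria"),
  rank = c("no rank", "superkingdom", "no rank",
           "kingdom", "no rank", "no rank"),
  id   = c("131567", "2759", "33154", "33208", "6072", "33213"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the first name at the requested rank,
# or NA if that rank is not present in the table.
name_at_rank <- function(cls, rank) {
  hit <- cls$name[cls$rank == rank]
  if (length(hit) == 0) NA_character_ else hit[1]
}

name_at_rank(cls, "kingdom")
name_at_rank(cls, "genus")  # not among these six rows, so NA
```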
Get immediate children of Salmo. In this case, Salmo is a genus, so this gives species within the genus.
children("Salmo", db = 'ncbi')
#> $Salmo
#>    childtaxa_id childtaxa_name                    childtaxa_rank
#> 1       1509524 Salmo marmoratus x Salmo trutta   species
#> 2       1484545 Salmo cf. cenerinus BOLD:AAB3872  species
#> 3       1483130 Salmo zrmanjaensis                species
#> 4       1483129 Salmo visovacensis                species
#> 5       1483128 Salmo rhodanensis                 species
#> 6       1483127 Salmo pellegrini                  species
#> 7       1483126 Salmo opimus                      species
#> 8       1483125 Salmo macedonicus                 species
#> 9       1483124 Salmo lourosensis                 species
#> 10      1483123 Salmo labecula                    species
#> 11      1483122 Salmo farioides                   species
#> 12      1483121 Salmo chilo                       species
#> 13      1483120 Salmo cettii                      species
#> 14      1483119 Salmo cenerinus                   species
#> 15      1483118 Salmo aphelios                    species
#> 16      1483117 Salmo akairos                     species
#> 17      1201173 Salmo peristericus                species
#> 18      1035833 Salmo ischchan                    species
#> 19       700588 Salmo labrax                      species
#> 20       237411 Salmo obtusirostris               species
#> 21       235141 Salmo platycephalus               species
#> 22       234793 Salmo letnica                     species
#> 23        62065 Salmo ohridanus                   species
#> 24        33518 Salmo marmoratus                  species
#> 25        33516 Salmo fibreni                     species
#> 26        33515 Salmo carpio                      species
#> 27         8032 Salmo trutta                      species
#> 28         8030 Salmo salar                       species
#> attr(,"class")
#> [1] "children"
#> attr(,"db")
#> [1] "ncbi"
Get all species in the genus Apis
downstream(as.tsn(154395), db = 'itis', downto = 'species', verbose = FALSE)
#> $`154395`
#>      tsn parentname parenttsn taxonname          rankid rankname
#> 1 154396 Apis       154395    Apis mellifera     220    species
#> 2 763550 Apis       154395    Apis andreniformis 220    species
#> 3 763551 Apis       154395    Apis cerana        220    species
#> 4 763552 Apis       154395    Apis dorsata       220    species
#> 5 763553 Apis       154395    Apis florea        220    species
#> 6 763554 Apis       154395    Apis koschevnikovi 220    species
#> 7 763555 Apis       154395    Apis nigrocincta   220    species
#> attr(,"class")
#> [1] "downstream"
#> attr(,"db")
#> [1] "itis"
Get all genera up from the species Pinus contorta (this includes the genus of the species, and its co-genera within the same family).
upstream("Pinus contorta", db = 'itis', upto = 'Genus', verbose=FALSE)
#>      tsn target
#> 1 183327 Pinus contorta
#> 2 183332 Pinus contorta ssp. bolanderi
#> 3 822698 Pinus contorta ssp. contorta
#> 4 183329 Pinus contorta ssp. latifolia
#> 5 183330 Pinus contorta ssp. murrayana
#> 6 529672 Pinus contorta var. bolanderi
#> 7 183328 Pinus contorta var. contorta
#> 8 529673 Pinus contorta var. latifolia
#> 9 529674 Pinus contorta var. murrayana
#>   commonNames
#> 1 scrub pine,shore pine,tamarack pine,lodgepole pine
#> 2 Bolander's beach pine
#> 3 NA
#> 4 black pine,Rocky Mountain lodgepole pine
#> 5 tamarack pine,Sierra lodgepole pine
#> 6 Bolander beach pine
#> 7 coast pine,lodgepole pine,beach pine,shore pine
#> 8 tall lodgepole pine,lodgepole pine,Rocky Mountain lodgepole pine
#> 9 Murray's lodgepole pine,Sierra lodgepole pine,tamarack pine
#>   nameUsage
#> 1 accepted
#> 2 not accepted
#> 3 not accepted
#> 4 not accepted
#> 5 not accepted
#> 6 accepted
#> 7 accepted
#> 8 accepted
#> 9 accepted
#> Pinus contorta
#>             NA
#> attr(,"class")
#> [1] "upstream"
#> attr(,"db")
#> [1] "itis"
synonyms("Acer drummondii", db="itis")
#>      tsn target            commonNames nameUsage
#> 1 183671 Acer drummondii    NA          not accepted
#> 2 183672 Rufacer drummondii NA          not accepted
#> $`Acer drummondii`
#> [1] NA
#> attr(,"class")
#> [1] "synonyms"
#> attr(,"db")
#> [1] "itis"
get_ids(names="Salvelinus fontinalis", db = c('itis', 'ncbi'), verbose=FALSE)
#> $itis
#> Salvelinus fontinalis
#>              "162003"
#> attr(,"match")
#> [1] "found"
#> attr(,"multiple_matches")
#> [1] FALSE
#> attr(,"pattern_match")
#> [1] FALSE
#> attr(,"uri")
#> [1] "http://www.itis.gov/servlet/SingleRpt/SingleRpt?search_topic=TSN&search_value=162003"
#> attr(,"class")
#> [1] "tsn"
#> $ncbi
#> Salvelinus fontinalis
#>                "8038"
#> attr(,"class")
#> [1] "uid"
#> attr(,"match")
#> [1] "found"
#> attr(,"multiple_matches")
#> [1] FALSE
#> attr(,"pattern_match")
#> [1] FALSE
#> attr(,"uri")
#> [1] "https://www.ncbi.nlm.nih.gov/taxonomy/8038"
#> attr(,"class")
#> [1] "ids"
You can limit results to certain rows when getting ids in any of the `get_*()` functions
get_ids(names="Poa annua", db = "gbif", rows=1)
#> $gbif
#> Poa annua
#> "2704179"
#> attr(,"class")
#> [1] "gbifid"
#> attr(,"match")
#> [1] "found"
#> attr(,"multiple_matches")
#> [1] TRUE
#> attr(,"pattern_match")
#> [1] FALSE
#> attr(,"uri")
#> [1] "http://www.gbif.org/species/2704179"
#> attr(,"class")
#> [1] "ids"
Furthermore, you can get back all ids, if that's your jam, with the `get_*_()` functions (all `get_*()` functions with an additional `_` underscore at the end of the function name)
get_ids_(c("Chironomus riparius", "Pinus contorta"), db = 'nbn', rows=1:3)
#> $nbn
#> $nbn$`Chironomus riparius`
#>               guid scientificName              rank    taxonomicStatus
#> 1 NBNSYS0000027573 Chironomus riparius         species accepted
#> 2 NHMSYS0000864966 Damaeus (Damaeus) riparius  species accepted
#> 3 NHMSYS0021059238 Rhizoclonium riparium       species accepted
#> $nbn$`Pinus contorta`
#>               guid scientificName                rank    taxonomicStatus
#> 1 NBNSYS0000004786 Pinus contorta                species accepted
#> 2 NHMSYS0000494858 Pinus contorta var. murrayana variety accepted
#> 3 NHMSYS0000494848 Pinus contorta var. contorta  variety accepted
#> attr(,"class")
#> [1] "ids"
sci2comm('Helianthus annuus', db = 'itis')
#>      tsn target
#> 1  36616 Helianthus annuus
#> 2 525928 Helianthus annuus ssp. jaegeri
#> 3 525929 Helianthus annuus ssp. lenticularis
#> 4 525930 Helianthus annuus ssp. texanus
#> 5 536095 Helianthus annuus var. lenticularis
#> 6 536096 Helianthus annuus var. macrocarpus
#> 7 536097 Helianthus annuus var. texanus
#>   commonNames                                                nameUsage
#> 1 annual sunflower,sunflower,wild sunflower,common sunflower accepted
#> 2 NA                                                         not accepted
#> 3 NA                                                         not accepted
#> 4 NA                                                         not accepted
#> 5 NA                                                         not accepted
#> 6 NA                                                         not accepted
#> 7 NA                                                         not accepted
#> $`Helianthus annuus`
#> [1] NA
comm2sci("black bear", db = "itis")
#> $`black bear`
#> [1] "Chiropotes satanas" "Ursus americanus luteolus"
#> [3] "Ursus americanus" "Ursus americanus"
#> [5] "Ursus americanus americanus" "Ursus thibetanus"
#> [7] "Ursus thibetanus"
spp <- c("Sus scrofa", "Homo sapiens", "Nycticebus coucang")
lowest_common(spp, db = "ncbi")
#> name rank id
#> 1 Boreoeutheria below-class 1437010
`numeric` to `uid`
as.uid(315567)
#> [1] "315567"
#> attr(,"class")
#> [1] "uid"
#> attr(,"match")
#> [1] "found"
#> attr(,"multiple_matches")
#> [1] FALSE
#> attr(,"pattern_match")
#> [1] FALSE
#> attr(,"uri")
#> [1] "https://www.ncbi.nlm.nih.gov/taxonomy/315567"
`list` to `uid`
as.uid(list("315567", "3339", "9696"))
#> [1] "315567" "3339" "9696"
#> attr(,"class")
#> [1] "uid"
#> attr(,"match")
#> [1] "found" "found" "found"
#> attr(,"multiple_matches")
#> [1] FALSE FALSE FALSE
#> attr(,"pattern_match")
#> [1] FALSE FALSE FALSE
#> attr(,"uri")
#> [1] "https://www.ncbi.nlm.nih.gov/taxonomy/315567"
#> [2] "https://www.ncbi.nlm.nih.gov/taxonomy/3339"
#> [3] "https://www.ncbi.nlm.nih.gov/taxonomy/9696"
out <- as.uid(c(315567, 3339, 9696))
(res <- data.frame(out))
#>      ids class match multiple_matches pattern_match
#> 1 315567   uid found            FALSE         FALSE
#> 2   3339   uid found            FALSE         FALSE
#> 3   9696   uid found            FALSE         FALSE
#>   uri
#> 1 https://www.ncbi.nlm.nih.gov/taxonomy/315567
#> 2 https://www.ncbi.nlm.nih.gov/taxonomy/3339
#> 3 https://www.ncbi.nlm.nih.gov/taxonomy/9696
See our CONTRIBUTING document.
Contributors, in alphabetical order:
ahhurlbert - Alectoria - andzandz11 - antagomir - arendsee - ashenkin - ashiklom - bomeara - bw4sz - cboettig - cdeterman - ChrKoenig - chuckrp - clarson2191 - claudenozeres - cmzambranat - daattali - DanielGMead - davharris - davidvilanova - diogoprov - dlebauer - dlenz1 - dschlaep - EDiLD - emhart - fdschneider - fgabriel1891 - fmichonneau - gedankenstuecke - GISKid - glaroc - gustavobio - ibartomeus - jangorecki - jarioksa - jebyrnes - johnbaums - jonmcalder - JoStaerk - jsgosnell - kamapu - karthik - KevCaz - kgturner - kmeverson - Koalha - ljvillanueva - Markus2015 - mcsiple - MikkoVihtakari - millerjef - miriamgrace - mpnelsen - MUSEZOOLVERT - nate-d-olson - nmatzke - npch - philippi - pmarchand1 - RodgerG - rossmounce - sariya - scelmendorf - sckott - SimonGoring - snsheth - snubian - tdjames1 - tmkurobe - tpaulson1 - tpoisot - vijaybarve - wcornwell - wpetry - zachary-foster
Check out our milestones to see what we plan to get done for each version.
To get citation information for `taxize`, run `citation(package = 'taxize')` in R.