ropensci/paleobioDB

Name: paleobioDB

Owner: rOpenSci

Description: R interface to the Paleobiology Database

Created: 2014-01-24 17:09:05.0

Updated: 2017-10-06 15:54:54.0

Pushed: 2018-01-13 17:58:00.0

Homepage:

Size: 18355

Language: R

README


paleobioDB

About

paleobioDB is an R package for downloading, visualizing and processing data from the Paleobiology Database.

Quick start

Install

Install paleobioDB from CRAN

install.packages("paleobioDB")
library(paleobioDB)

Install the development version of paleobioDB from GitHub

install.packages("devtools")
library(devtools)
install_github("ropensci/paleobioDB")
library(paleobioDB)

General overview

paleobioDB version 0.5 has 19 functions that wrap the endpoints of the PaleobioDB API, plus 8 functions to visualize and process the fossil data. The API itself is documented on the Paleobiology Database website.

Download fossil occurrences from the PaleobioDB

pbdb_occurrences

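For example, to download all the fossil occurrences that belong to the family Canidae, set base_name = "Canidae". The call below also restricts the records to the Quaternary through the interval argument.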

canidae <- pbdb_occurrences(limit = "all",
                            base_name = "canidae", vocab = "pbdb",
                            interval = "Quaternary",
                            show = c("coords", "phylo", "ident"))
head(canidae)
## occurrence_no record_type collection_no            taxon_name taxon_rank taxon_no
##         50070  occurrence         13293              Cuon sp.      genus    41204
##         86572  occurrence         18320         Canis cf. sp.      genus    41198
##         86573  occurrence         18320            Vulpes sp.      genus    41248
##         86574  occurrence         18320        Borophagus sp.      genus    41196
##         92926  occurrence         19617        Canis edwardii    species    44838
##         92927  occurrence         19617 Canis armbrusteri cf.    species    44827
## matched_rank     early_interval    late_interval early_age late_age reference_no
##              Middle Pleistocene Late Pleistocene     0.781   0.0117         4412
##                Late Hemphillian          Blancan    10.300   1.8000         6086
##                Late Hemphillian          Blancan    10.300   1.8000         6086
##                Late Hemphillian          Blancan    10.300   1.8000         6086
##                         Blancan     Irvingtonian     4.900   0.3000         2673
##                         Blancan     Irvingtonian     4.900   0.3000         2673
##       lng      lat  family family_no     order order_no    class class_no   phylum
##  11.56667 22.76667 Canidae     41189 Carnivora    36905 Mammalia    36651 Chordata
##  85.79195 40.45444 Canidae     41189 Carnivora    36905 Mammalia    36651 Chordata
##  85.79195 40.45444 Canidae     41189 Carnivora    36905 Mammalia    36651 Chordata
##  85.79195 40.45444 Canidae     41189 Carnivora    36905 Mammalia    36651 Chordata
## 112.40000 35.70000 Canidae     41189 Carnivora    36905 Mammalia    36651 Chordata
## 112.40000 35.70000 Canidae     41189 Carnivora    36905 Mammalia    36651 Chordata
## phylum_no genus_name species_name genus_reso reid_no species_reso matched_name
##      3815       Cuon          sp.       <NA>      NA         <NA>         <NA>
##      3815      Canis          sp.        cf.      NA         <NA>         <NA>
##      3815     Vulpes          sp.       <NA>      NA         <NA>         <NA>
##      3815 Borophagus          sp.       <NA>      NA         <NA>         <NA>
##      3815      Canis     edwardii       <NA>    8376         <NA>         <NA>
##      3815      Canis  armbrusteri       <NA>    8377          cf.         <NA>
## subgenus_name subgenus_reso
##          <NA>          <NA>
##          <NA>          <NA>
##          <NA>          <NA>
##          <NA>          <NA>
##          <NA>          <NA>
##          <NA>          <NA>

CAUTION WITH THE RAW DATA

Beware of synonyms and errors: they can distort your estimates of species richness, evolutionary and extinction rates, etc. paleobioDB users should be critical of the raw data downloaded from the database and filter it before analysis.

For instance, when downloading data with pbdb_occurrences and base_name, check for the synonyms and errors that can appear in taxon_name, genus_name, etc. In our example, canidae$genus_name contains errors: "Canidae" and "Caninae" appear as genus names. If they are not removed, they inflate the genus richness.
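The filtering step described above can be sketched in base R. This is only an illustration: the vector below is a small hypothetical stand-in for canidae$genus_name, and the exclusion list holds just the two erroneous names mentioned in this section. A real download may need a longer, manually checked list.

```r
# Hypothetical stand-in for canidae$genus_name (illustrative values only).
genus_name <- c("Canis", "Vulpes", "Canidae", "Caninae", "Cuon", "Canis")

# Names that are higher taxa rather than genera, as noted in the caution above.
not_genera <- c("Canidae", "Caninae")

# Keep only the records whose genus name is not in the exclusion list.
genus_clean <- genus_name[!genus_name %in% not_genera]

# Genus richness before and after filtering.
raw_richness   <- length(unique(genus_name))
clean_richness <- length(unique(genus_clean))
```

With a full data frame, the same logical index can be used to subset the rows, e.g. canidae[!canidae$genus_name %in% not_genera, ].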

Map the fossil records

pbdb_map

Returns a map with the species occurrences.

pbdb_map(canidae)

pbdb_map_occur

Returns a map and a raster object with the sampling effort (number of fossil records per cell).

pbdb_map_occur(canidae, res = 5)
## class       : RasterLayer
## dimensions  : 34, 74, 2516  (nrow, ncol, ncell)
## resolution  : 5, 5  (x, y)
## extent      : -179.9572, 190.0428, -86.42609, 83.57391  (xmin, xmax, ymin, ymax)
## coord. ref. : NA
## data source : in memory
## names       : layer
## values      : 1, 40  (min, max)
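To make "number of fossil records per cell" concrete, here is a minimal base R sketch that counts records on a regular grid. It is not the package's implementation (pbdb_map_occur returns a RasterLayer, and its grid origin may differ); the coordinates are hypothetical, and the names occ, cells and effort are made up for this example.

```r
# Hypothetical coordinates in decimal degrees (not real database records).
occ <- data.frame(lng = c(-112.4, -112.9, -85.8, 111.6, 113.2),
                  lat = c(35.7, 36.1, 40.5, 22.8, 23.9))

res <- 5  # cell size in degrees, as in res = 5 above

# Assign every record to the lower-left corner of its grid cell.
cells <- data.frame(cell_lng = floor(occ$lng / res) * res,
                    cell_lat = floor(occ$lat / res) * res,
                    n = 1)

# Sampling effort: number of records that fall in each occupied cell.
effort <- aggregate(n ~ cell_lng + cell_lat, data = cells, FUN = sum)
```

Only occupied cells appear in effort; a raster additionally stores the empty cells.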

pbdb_map_richness

Returns a map and a raster object with the number of different species, genera, families, etc. per cell. The user can change the resolution of the cells.

pbdb_map_richness(canidae, res = 5, rank = "species")
## class       : RasterLayer
## dimensions  : 34, 74, 2516  (nrow, ncol, ncell)
## resolution  : 5, 5  (x, y)
## extent      : -179.9572, 190.0428, -86.42609, 83.57391  (xmin, xmax, ymin, ymax)
## coord. ref. : NA
## data source : in memory
## names       : layer
## values      : 1, 12  (min, max)

Explore your fossil data

pbdb_temp_range

Returns a dataframe and a plot with the time span of the species, genera, families, etc. in your query.

pbdb_temp_range(canidae, rank = "species")
##                              max    min
## Canis brevirostris        5.3330 0.0000
## Canis mesomelas           5.3330 0.
## Alopex praeglacialis      5.3330 0.0117
## Nyctereutes megamastoides 5.3330 0.0117
## Vulpes atlantica          5.3330 0.0117
## Canis latrans             4.9000 0.0000
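One plausible way to derive such a time span, sketched in base R: take the oldest early_age and the youngest late_age recorded for each taxon. This is an assumption about the idea behind the function, not its actual code, and the small data frame below uses illustrative ages rather than a real download.

```r
# Hypothetical occurrence records (ages in Ma; values are illustrative).
occ <- data.frame(
  taxon_name = c("Canis edwardii", "Canis edwardii", "Canis latrans"),
  early_age  = c(4.9, 2.6, 4.9),
  late_age   = c(0.3, 1.8, 0.0)
)

# Oldest and youngest bound per taxon.
oldest   <- tapply(occ$early_age, occ$taxon_name, max)
youngest <- tapply(occ$late_age,  occ$taxon_name, min)

# One row per taxon, with the same max/min layout as the output above.
temporal_range <- data.frame(max = as.vector(oldest),
                             min = as.vector(youngest),
                             row.names = names(oldest))
```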

pbdb_richness

Returns a dataframe and a plot with the number of species (or genera, families, etc.) across time. You should set the temporal extent and the temporal resolution for the steps.

pbdb_richness(canidae, rank = "species", temporal_extent = c(0, 10), res = 1)
## labels2 richness
##     <=1       23
##     1-2       56
##     2-3       53
##     3-4       19
##     4-5       18
##     5-6        5
##     6-7        0
##     7-8        0
##     8-9        0
##    9-10        0
##     >10        0
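The role of temporal_extent and res can be illustrated with a short base R sketch that counts, for each time bin, the taxa whose temporal range overlaps it. The overlap rule and the bin labels are assumptions made for this example, and pbdb_richness may treat bin boundaries differently. The ranges are hypothetical.

```r
# Hypothetical temporal ranges in Ma (max = oldest, min = youngest).
ranges <- data.frame(
  taxon = c("sp_a", "sp_b", "sp_c"),
  max   = c(5.3, 2.6, 0.8),
  min   = c(0.0, 1.8, 0.0)
)

temporal_extent <- c(0, 4)  # youngest and oldest limit of the analysis
res <- 1                    # width of each time bin in Ma

# Lower and upper edge of every bin.
starts <- seq(temporal_extent[1], temporal_extent[2] - res, by = res)
ends   <- starts + res

# A taxon is counted in a bin when its range overlaps the bin's interior.
richness <- vapply(seq_along(starts), function(i) {
  sum(ranges$max > starts[i] & ranges$min < ends[i])
}, integer(1))
names(richness) <- paste(starts, ends, sep = "-")
```

A smaller res gives finer bins but fewer taxa per bin, so the choice depends on the temporal resolution of the records.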

pbdb_orig_ext

Returns a data frame and a plot with the number of first appearances and last appearances of species, genera, families, etc. in your query through time. You should set the temporal extent and the resolution of the steps.

# evolutionary rates: orig_ext = 1
pbdb_orig_ext(canidae, rank = "species", orig_ext = 1, temporal_extent = c(0, 10), res = 1)
##             new ext
## 1-2 to 0-1    0  28
## 2-3 to 1-2   34   6
## 3-4 to 2-3    1   0
## 4-5 to 3-4   13   0
## 5-6 to 4-5    5   0
## 6-7 to 5-6    0   0
## 7-8 to 6-7    0   0
## 8-9 to 7-8    0   0
## 9-10 to 8-9   0   0

# extinction rates: orig_ext = 2
pbdb_orig_ext(canidae, rank = "species", orig_ext = 2, temporal_extent = c(0, 10), res = 1)
##             new ext
## 1-2 to 0-1    0  28
## 2-3 to 1-2   34   6
## 3-4 to 2-3    1   0
## 4-5 to 3-4   13   0
## 5-6 to 4-5    5   0
## 6-7 to 5-6    0   0
## 7-8 to 6-7    0   0
## 8-9 to 7-8    0   0
## 9-10 to 8-9   0   0
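As a rough illustration of counting first and last appearances, the base R sketch below assigns each taxon's oldest and youngest age to a time bin and tallies them. It is a simplification under stated assumptions: the package reports transitions between adjacent bins (e.g. "2-3 to 1-2"), whereas this sketch tallies per bin, and the ranges are hypothetical.

```r
# Hypothetical temporal ranges in Ma (max = first appearance, min = last).
ranges <- data.frame(
  taxon = c("sp_a", "sp_b", "sp_c"),
  max   = c(5.3, 2.6, 0.8),
  min   = c(0.0, 1.8, 0.0)
)

breaks <- seq(0, 6, by = 1)   # bin edges in Ma
n_bins <- length(breaks) - 1

# Index of the bin that contains each first and last appearance.
first_bin <- findInterval(ranges$max, breaks, rightmost.closed = TRUE)
last_bin  <- findInterval(ranges$min, breaks, rightmost.closed = TRUE)

# Tally appearances and disappearances per bin.
orig_ext <- data.frame(
  bin = paste(breaks[-length(breaks)], breaks[-1], sep = "-"),
  new = tabulate(first_bin, nbins = n_bins),
  ext = tabulate(last_bin,  nbins = n_bins)
)
```

In this sketch a taxon whose range reaches 0 Ma is tallied as a last appearance in the youngest bin even if it is still living, so that bin should be read with care.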

pbdb_subtaxa

Returns a plot and a dataframe with the number of species, genera, families, etc. in your dataset.

pbdb_subtaxa(canidae, do.plot = TRUE)
## species genera families orders classes phyla
##       5     24        1      1       1     1
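The counts above amount to the number of distinct names at each taxonomic rank. A minimal base R sketch of that idea, assuming one column per rank: the column names and values below are hypothetical and do not match the exact layout of a pbdb_occurrences result.

```r
# Hypothetical table with one column per taxonomic rank.
occ <- data.frame(
  species = c("Canis edwardii", "Canis armbrusteri", "Canis edwardii", NA),
  genus   = c("Canis", "Canis", "Canis", "Vulpes"),
  family  = "Canidae"
)

# Number of distinct, non-missing names in each rank column.
counts <- vapply(occ, function(col) length(unique(col[!is.na(col)])),
                 integer(1))
```

Records identified only to genus (NA in the species column) still contribute to the genus and family counts.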

pbdb_temporal_resolution

Returns a plot and a data frame with a summary of the temporal resolution of the fossil records, i.e. the difference between the early and late age of each record, in Ma.

pbdb_temporal_resolution(canidae)
## $summary
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.0117  0.1143  1.5000  1.5360  2.5760 23.0200 
##
## $temporal_resolution
##  [1]  0.7693  8.5000  8.5000  8.5000  4.6000
##  [6]  4.6000  4.6000  3.1000  3.1000  3.1000
## [11]  3.1000  4.6000  3.1000  3.1000  3.1000
## [16]  3.1000  3.1000  3.1000  3.1000  3.1000
## [21]  3.1000  3.1000  3.1000  3.1000  3.1000
## [26]  3.1000  3.1000  3.1000  3.1000  3.1000
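The values in temporal_resolution agree with early_age minus late_age for the records shown by head(canidae) earlier (for example 10.3 - 1.8 = 8.5 and 4.9 - 0.3 = 4.6). The base R sketch below reproduces that arithmetic on three of those rows. It illustrates the calculation only and is not the package's code; the data frame name occ is made up for this example.

```r
# early_age and late_age (in Ma) of three records from the head(canidae) output.
occ <- data.frame(early_age = c(10.3, 10.3, 4.9),
                  late_age  = c(1.8, 1.8, 0.3))

# Temporal resolution of each record: width of its age interval in Ma.
temporal_resolution <- occ$early_age - occ$late_age

# Same kind of six-number summary as in the $summary element above.
resolution_summary <- summary(temporal_resolution)
```

Records with a large value are poorly dated, so they can be excluded before binning the data into narrow time steps.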

Meta

Please report any issues or bugs.

License: GPL-2

To cite package paleobioDB in publications use:

Sara Varela, Javier Gonzalez-Hernandez and Luciano Fabris Sgarbi (2016). paleobioDB: an R-package for downloading, visualizing and processing data from the Paleobiology Database. R package version 0.5. https://github.com/ropensci/paleobioDB

A BibTeX entry for LaTeX users is

@Manual{,
  title = {paleobioDB: an R-package for downloading, visualizing and processing data from the Paleobiology Database},
  author = {{Sara Varela} and {Javier Gonzalez-Hernandez} and {Luciano Fabris Sgarbi}},
  year = {2014},
  note = {R package version 0.5},
  url = {https://github.com/ropensci/paleobioDB},
}


This package is part of the rOpenSci project.


This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.