Name: rdpla
Owner: rOpenSci
Description: DPLA R client
Created: 2014-10-28 19:43:39.0
Updated: 2018-01-02 20:19:46.0
Pushed: 2018-01-15 17:56:56.0
Size: 1178
Language: R
rdpla: R client for Digital Public Library of America
Digital Public Library of America brings together metadata from libraries, archives, and museums in the US, and makes it freely available via their web portal as well as an API. DPLA's portal and API don't provide the items themselves from contributing institutions, but they provide links to make it easy to find things. The kinds of things DPLA holds metadata for include images of works held in museums, photographs from various photographic collections, texts, sounds, and moving images.
DPLA has a great API with good documentation - a rare thing in this world. Further documentation on their API covers search fields and examples of queries, and metadata schema information is also available.
DPLA data can be used for a variety of use cases in various academic and non-academic fields. Here are some examples (vignettes to come soon showing examples):
DPLA API has two main services (quoting from their API docs):
rdpla also has an interface (dpla_bulk) to download bulk and compressed JSON data.
Note that you can only run examples/vignette/tests if you have an API key. See ?dpla_get_key to get an API key.
There are two vignettes. After installation check them out. If installing from GitHub, do devtools::install_github("ropensci/rdpla", build_vignettes = TRUE)
rdpla
rdpla use case: visualize churches across DPLA holdings
Stable version from CRAN
install.packages("rdpla")
Dev version from GitHub:
install.packages("devtools")
devtools::install_github("ropensci/rdpla")
library('rdpla')
You need an API key to use the DPLA API. Use dpla_get_key() to request a key, which will then be emailed to you. Pass the key to the key parameter of functions in this package, or store it in your .Renviron file as DPLA_API_KEY or in your .Rprofile file under the name dpla_api_key.
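As a minimal sketch of the lookup order described above: the helper resolve_dpla_key() below is hypothetical and not part of rdpla, the key value is a placeholder, and storing the .Rprofile value through options() is an assumption about what "under the name dpla_api_key" means.

```r
# Hypothetical helper (not part of rdpla): look for a key in the
# DPLA_API_KEY environment variable first, then in the dpla_api_key option.
resolve_dpla_key <- function() {
  key <- Sys.getenv("DPLA_API_KEY", unset = "")
  if (!nzchar(key)) {
    key <- getOption("dpla_api_key", default = "")
  }
  if (!nzchar(key)) {
    stop("No DPLA API key found; see ?dpla_get_key", call. = FALSE)
  }
  key
}

# In .Renviron this would be a line such as: DPLA_API_KEY=your-key-here
# In .Rprofile (assumption: stored as an option):
#   options(dpla_api_key = "your-key-here")
# For the current session only, with a placeholder value:
Sys.setenv(DPLA_API_KEY = "your-key-here")
key <- resolve_dpla_key()
```

The resolved value could then be passed as the key parameter of the functions shown below.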
Note: limiting fields returned for readme brevity.
Basic search
dpla_items(q="fruit", page_size=5, fields=c("provider","creator"))
meta
A tibble: 1 x 3
found start returned
<int> <int> <int>
40007 0 5
data
A tibble: 5 x 2
provider creator
<chr> <chr>
Mountain West Digital Library no content
Mountain West Digital Library no content
Mountain West Digital Library no content
Mountain West Digital Library no content
The New York Public Library Anderson, Alexander (1775-1870)
facets
list()
Limit fields returned
dpla_items(q="fruit", page_size = 10, fields=c("publisher","format"))
meta
A tibble: 1 x 3
found start returned
<int> <int> <int>
40007 0 10
data
A tibble: 10 x 2
format
<chr>
1 no content
2 no content
3 no content
4 no content
5 no content
6 no content
7 Gum bichromate on vinyl
8 1 b 10 x 12.5 cm.
9 Woodblock print;Ink and color on paper
10 no content
... with 1 more variables: publisher <chr>
facets
list()
Limit records returned
dpla_items(q="fruit", page_size=2, fields=c("provider","title"))
meta
A tibble: 1 x 3
found start returned
<int> <int> <int>
40007 0 2
data
A tibble: 2 x 2
title provider
<chr> <chr>
Fruit Mountain West Digital Library
Fruit Mountain West Digital Library
facets
list()
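The meta element reports how many records were found and how many were returned, which is enough to estimate how many requests a full download would take. The sketch below runs offline; pages_needed() is a hypothetical helper, and the page argument mentioned in the comment is an assumption to check against ?dpla_items.

```r
# found is copied from the meta output above; page_size is chosen for illustration
found <- 40007
page_size <- 100

# Hypothetical helper (not part of rdpla): requests needed to see every record
pages_needed <- function(found, page_size) {
  stopifnot(page_size > 0)
  as.integer(ceiling(found / page_size))
}

n_pages <- pages_needed(found, page_size)

# Assumption: dpla_items() accepts a page argument, so a loop could look like
# for (p in seq_len(n_pages)) dpla_items(q = "fruit", page_size = page_size, page = p)
```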
Search by date
dpla_items(q="science", date_before=1900, page_size=10, fields=c("id","date"))
meta
A tibble: 1 x 3
found start returned
<int> <int> <int>
57622 0 10
data
A tibble: 10 x 2
id date
<chr> <chr>
1 9cfe90e850b13bc1854f3e40223529c8 1881-1882
2 9d008b592ad35eaa1e4dbff8aa976318 1884
3 268fb8978bbab523ec1ad48ee72e7464 1892
4 7f25fff59b55bd99df3a864e514c3d1d 1893
5 0457c88ca237cec73ce2876f91d56572 1893
6 19bdb84f833b28cb36207d02c38cfc69 1883
7 e93faad718b9d63c2c8dd8725edadb93 1891
8 9f79e6f53dfd2f31a17d756a90f22e0b 1883
9 e3f11047a57f18f8a21baf5d6ff3c4dd 1886
10 e8f0ed10dbdcd0ffd6f504e1892515da 1885
facets
list()
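The date column comes back as character and can hold a range such as 1881-1882, so comparing against a year needs a small parsing step. A minimal sketch, using a few values copied from the output above and taking the first four-digit run as the starting year (that rule is an assumption about how you want to treat ranges):

```r
# Dates copied from the output above; the column is character, not Date
dates <- c("1881-1882", "1884", "1892", "1893")

# Capture the first four-digit run in each string as the starting year
start_year <- as.integer(sub("^.*?(\\d{4}).*$", "\\1", dates, perl = TRUE))

# Check that every starting year falls before the date_before value used above
all_before_1900 <- all(start_year < 1900)
```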
Search on specific fields
dpla_items(description="obituaries", page_size=2, fields="description")
meta
A tibble: 1 x 3
found start returned
<int> <int> <int>
50777 0 2
data
A tibble: 2 x 1
description
<chr>
Obituaries of members
Pages from the complied obituaries
facets
list()
dpla_items(subject="yodeling", page_size=2, fields="subject")
meta
A tibble: 1 x 3
found start returned
<int> <int> <int>
54 0 2
data
A tibble: 2 x 1
subject
<chr>
Yodel & yodeling;Humorous songs;Musicals;Sheet music
Yodel & yodeling;Humorous songs;Musicals;Sheet music
facets
list()
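In the output above, several subject terms appear in one string separated by semicolons. Assuming that separator holds for other multi-valued fields too (an assumption based only on this output), the individual terms can be recovered in base R:

```r
# Subject string copied from the output above
subject <- "Yodel & yodeling;Humorous songs;Musicals;Sheet music"

# Split on the literal semicolon and trim any surrounding whitespace
terms <- trimws(strsplit(subject, ";", fixed = TRUE)[[1]])

# Case-insensitive check for the search word among the terms
has_yodel <- any(grepl("yodel", terms, ignore.case = TRUE))
```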
dpla_items(provider="HathiTrust", page_size=2, fields="provider")
meta
A tibble: 1 x 3
found start returned
<int> <int> <int>
2647621 0 2
data
A tibble: 2 x 1
provider
<chr>
HathiTrust
HathiTrust
facets
list()
Spatial search, across all spatial fields
dpla_items(sp='Boston', page_size=2, fields=c("id","provider"))
meta
A tibble: 1 x 3
found start returned
<int> <int> <int>
97974 0 2
data
A tibble: 2 x 2
id provider
<chr> <chr>
337556aaa3096bd77e462d898b70c9d7 Smithsonian Institution
41aa36a38d69f5247529505a55528b5d Smithsonian Institution
facets
list()
Spatial search, by states
dpla_items(sp_state='Massachusetts OR Hawaii', page_size=2, fields=c("id","provider"))
meta
A tibble: 1 x 3
found start returned
<int> <int> <int>
235411 0 2
data
A tibble: 2 x 2
id
<chr>
3d3fba16636ab5211a10ff0b0bf44ae6
0c0b0cc05188d33b63fc6adc14774250
... with 1 more variables: provider <chr>
facets
list()
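The sp_state value above joins two state names with OR. When the states come from a vector, the same string can be built with paste(); this sketch only constructs the query string and makes no API call.

```r
# State names to combine into a single boolean query
states <- c("Massachusetts", "Hawaii")

# Join with " OR " to match the query string used above
sp_state_query <- paste(states, collapse = " OR ")

# With an API key, the string could then be passed as
# dpla_items(sp_state = sp_state_query, page_size = 2, fields = c("id", "provider"))
```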
Faceted search
dpla_items(facets=c("sourceResource.spatial.state","sourceResource.spatial.country"),
page_size=0, facet_size=5)
meta
A tibble: 1 x 3
found start returned
<int> <int> <int>
17104849 0 0
data
A tibble: 0 x 0
facets
facets$sourceResource.spatial.state
facets$sourceResource.spatial.state$meta
A tibble: 1 x 4
type total missing other
<chr> <int> <int> <int>
terms 6249159 11599925 3632477
facets$sourceResource.spatial.state$data
A tibble: 5 x 2
term count
<chr> <int>
Texas 882954
California 636851
Georgia 472738
New York 397295
Massachusetts 226844
facets$sourceResource.spatial.country
facets$sourceResource.spatial.country$meta
A tibble: 1 x 4
type total missing other
<chr> <int> <int> <int>
terms 7786409 10212531 1818325
facets$sourceResource.spatial.country$data
A tibble: 5 x 2
term count
<chr> <int>
United States 5327273
Russia 172146
United Kingdom 169379
Mexico 167957
France 131329
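To make the facet numbers easier to read, the sketch below rebuilds the state facet from the output above as plain data frames and works out each term's share. In these numbers, total equals the listed counts plus other; whether the API defines other that way in general is an assumption.

```r
# Values copied from facets$sourceResource.spatial.state above
state_counts <- data.frame(
  term = c("Texas", "California", "Georgia", "New York", "Massachusetts"),
  count = c(882954L, 636851L, 472738L, 397295L, 226844L),
  stringsAsFactors = FALSE
)
state_meta <- data.frame(total = 6249159L, missing = 11599925L, other = 3632477L)

# For this output, the listed counts plus other add up to total
adds_up <- sum(state_counts$count) + state_meta$other == state_meta$total

# Share of the facet total held by each listed term
state_counts$share <- state_counts$count / state_meta$total
```

With a live result, the same columns would come from out$facets$sourceResource.spatial.state$data and $meta rather than hand-built data frames.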
Search for collections with the words university of texas
dpla_collections(q="university of texas", page_size=2)
meta
A tibble: 1 x 2
found returned
<int> <int>
20 2
data
A tibble: 2 x 14
`_rev` ingestDate
<chr> <chr>
14-bccf34a900456b064086f20da68b0f89 2017-08-08T02:55:37.637978Z
13-e91ba552cf695a88c3f285266a272ca8 2017-08-08T02:55:47.403457Z
... with 12 more variables: `@context` <chr>, id <chr>, title <chr>,
`_id` <chr>, description <chr>, `@type` <chr>, ingestType <chr>,
`@id` <chr>, ingestionSequence <int>, score <dbl>,
validation_message <lgl>, valid_after_enrich <lgl>
You can also search in the title and description fields
dpla_collections(description="east")
meta
A tibble: 1 x 2
found returned
<int> <int>
3 10
data
A tibble: 3 x 14
`_rev` ingestDate
<chr> <chr>
8-6b723068e71b40c6d9b64b0c14f80e20 2017-05-23T02:22:47.507183Z
3-388428340432e8ff676cd8d10f9d02b0 2017-07-31T17:06:05.782685Z
3-0318d8a1af2907653ac3a11fb9a5bd5b 2017-07-31T17:05:59.746631Z
... with 12 more variables: `@context` <chr>, id <chr>, title <chr>,
`_id` <chr>, description <chr>, `@type` <chr>, ingestType <chr>,
`@id` <chr>, ingestionSequence <int>, score <dbl>,
validation_message <lgl>, valid_after_enrich <lgl>
Visualize metadata from the DPLA - histogram of number of records per state (includes states outside the US)
out <- dpla_items(facets="sourceResource.spatial.state", page_size=0, facet_size=25)
library("ggplot2")
library("scales")
ggplot(out$facets$sourceResource.spatial.state$data, aes(reorder(term, count), count)) +
  geom_bar(stat="identity") +
  coord_flip() +
  theme_grey(base_size = 16) +
  scale_y_continuous(labels = comma) +
  labs(x="State", y="Records")
Get citation information for rdpla in R doing citation(package = 'rdpla')