prestodb/RPresto

Name: RPresto

Owner: Presto

Description: DBI-based adapter for Presto for the statistical programming language R.

Created: 2015-03-18 22:11:28.0

Updated: 2018-04-16 11:18:00.0

Pushed: 2018-04-16 11:18:04.0

Homepage: null

Size: 244

Language: R

README

RPresto

RPresto is a DBI-based adapter for Presto, the open source distributed SQL query engine for running interactive analytic queries.

Installation

RPresto is available on both CRAN and GitHub. To install the CRAN version, use

install.packages('RPresto')

You can install the GitHub development version via

devtools::install_github('prestodb/RPresto')

Examples

The standard DBI approach works with RPresto:

library('DBI')

con <- dbConnect(
  RPresto::Presto(),
  host='http://localhost',
  port=7777,
  user=Sys.getenv('USER'),
  schema='<schema>',
  catalog='<catalog>',
  source='<source>'
)

res <- dbSendQuery(con, 'SELECT 1')
# dbFetch without arguments only returns the current chunk, so we need to
# loop until the query completes.
while (!dbHasCompleted(res)) {
  chunk <- dbFetch(res)
  print(chunk)
}

res <- dbSendQuery(con, 'SELECT CAST(NULL AS VARCHAR)')
# Due to the unpredictability of chunk sizes with presto, we do not support
# custom number of rows
testthat::expect_error(dbFetch(res, 5))

# To get all rows using dbFetch, pass in a -1 argument
print(dbFetch(res, -1))

# An alternative is to use dbGetQuery directly

# `source` for iris.sql()
source(system.file('tests', 'testthat', 'utilities.R', package='RPresto'))

iris <- dbGetQuery(con, paste("SELECT * FROM", iris.sql()))

dbDisconnect(con)
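
The chunk-by-chunk loop above only prints each chunk. If the rows need to be kept, the chunks can be accumulated and bound together at the end. The following is a minimal sketch of that pattern, not part of RPresto: `collect_chunks`, `fake_res`, `fake_fetch` and `fake_completed` are hypothetical names introduced here, and the fake result object exists only so the loop can run without a Presto server.

```r
# Hypothetical helper (not an RPresto function): drain a result set chunk by
# chunk. `fetch` and `completed` default to the DBI generics; they are
# parameters only so the loop can be exercised without a server.
collect_chunks <- function(res,
                           fetch = DBI::dbFetch,
                           completed = DBI::dbHasCompleted) {
  chunks <- list()
  while (!completed(res)) {
    chunks[[length(chunks) + 1]] <- fetch(res)
  }
  # Yields NULL if the loop body never ran.
  do.call(rbind, chunks)
}

# Stand-in for a server: an environment holding three pre-made chunks.
fake_res <- new.env()
fake_res$pending <- list(
  data.frame(x = 1:2),
  data.frame(x = 3L),
  data.frame(x = 4:5)
)
fake_fetch <- function(res) {
  chunk <- res$pending[[1]]
  res$pending <- res$pending[-1]
  chunk
}
fake_completed <- function(res) length(res$pending) == 0

all_rows <- collect_chunks(fake_res, fake_fetch, fake_completed)
print(all_rows)
```

Against a real connection the call would presumably be `collect_chunks(res)` with the defaults, although `dbFetch(res, -1)` or `dbGetQuery` as shown above achieve the same result more directly.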

We also include dplyr integration.

library(dplyr)

db <- src_presto(
  host='http://localhost',
  port=7777,
  user=Sys.getenv('USER'),
  schema='<schema>',
  catalog='<catalog>',
  source='<source>'
)

# Assuming you have a table like iris in the database
iris <- tbl(db, 'iris')

iris %>%
  group_by(species) %>%
  summarise(mean_sepal_length = mean(as(sepal_length, 0.0))) %>%
  arrange(species) %>%
  collect()

How RPresto works

Presto exposes its interface via a REST-based API [1]. RPresto uses the httr package to make the API calls and jsonlite to reshape the returned data into a data.frame. Note that as of now, only read operations are supported.
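
To illustrate the reshaping step, here is a minimal sketch of turning one page of a Presto-style JSON response into a data.frame with jsonlite. It is not RPresto's actual implementation. The payload is hand-written for illustration and assumes only the `columns` and `data` fields of the protocol; a real response carries more fields, and the HTTP call itself is left out.

```r
library(jsonlite)

# Hypothetical payload shaped like one page of a Presto HTTP response.
payload <- '{
  "columns": [
    {"name": "species", "type": "varchar"},
    {"name": "n", "type": "bigint"}
  ],
  "data": [["setosa", 50], ["virginica", 50]]
}'

# Keep the nested list structure so rows can be walked explicitly.
page <- fromJSON(payload, simplifyVector = FALSE)

# Column names come from the "columns" metadata.
column_names <- vapply(page$columns, function(col) col$name, character(1))

# "data" is row-oriented; pull out the i-th field of every row to build
# one vector per column. This simple version assumes no NULL values.
columns <- lapply(seq_along(column_names), function(i) {
  unlist(lapply(page$data, function(row) row[[i]]))
})
names(columns) <- column_names

chunk <- as.data.frame(columns, stringsAsFactors = FALSE)
print(chunk)
```

A fuller version would also need to map Presto types from the `type` field to R types and handle NULLs, which this sketch deliberately ignores.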

RPresto has been tested on Presto 0.100.

License

RPresto is BSD-licensed. We also provide an additional patent grant.

[1] See https://github.com/prestodb/presto/wiki/HTTP-Protocol for a description of the API.


This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.