airoldilab/sgd

Name: sgd

Owner: Airoldi Lab

Description: An R package for large scale estimation with stochastic gradient descent

Created: 2014-12-05 23:10:59.0

Updated: 2017-10-20 04:19:09.0

Pushed: 2017-11-12 04:23:47.0

Homepage:

Size: 2126

Language: C++

README

sgd

sgd is an R package for large scale estimation. It features many stochastic gradient methods, built-in models, visualization tools, automated hyperparameter tuning, model checking, interval estimation, and convergence diagnostics.

Features

At the core of the package is the function

sgd(formula, data, model, model.control, sgd.control)

It estimates parameters for a given data set and model using stochastic gradient descent. The optional arguments model.control and sgd.control specify attributes of the model and of the stochastic gradient method, respectively. Taking advantage of the bigmemory package, sgd also operates on data sets that are too large to fit in RAM, as well as on streaming data.

Example of large-scale linear regression:

library(sgd)

# Dimensions
N <- 1e5  # number of data points
d <- 1e2  # number of features

# Generate data.
X <- matrix(rnorm(N*d), ncol=d)
theta <- rep(5, d+1)
eps <- rnorm(N)
y <- cbind(1, X) %*% theta + eps
dat <- data.frame(y=y, x=X)

sgd.theta <- sgd(y ~ ., data=dat, model="lm")
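To give a feel for what a stochastic gradient method does on a problem like the one above, here is a minimal sketch in base R of plain SGD for least squares. It is not the package's implementation: the names theta_true and theta_hat, the smaller problem size, and the decaying learning rate 1 / (10 + i) are all assumptions chosen for illustration.

```r
# Illustrative sketch only: plain SGD for least squares in base R.
# This is NOT how the sgd package is implemented internally.
set.seed(42)

N <- 1e4  # number of data points (kept small so the R loop runs quickly)
d <- 5    # number of features

# Simulate data from a linear model with an intercept.
X <- cbind(1, matrix(rnorm(N * d), ncol = d))
theta_true <- rep(5, d + 1)
y <- as.vector(X %*% theta_true + rnorm(N))

# One pass over the data, updating the estimate one observation at a time.
theta_hat <- rep(0, d + 1)  # assumed starting value
for (i in seq_len(N)) {
  x_i <- X[i, ]
  residual <- y[i] - sum(x_i * theta_hat)
  lr <- 1 / (10 + i)  # assumed decaying learning rate
  # Step along the negative gradient of 0.5 * residual^2 for this observation.
  theta_hat <- theta_hat + lr * residual * x_i
}

# Largest absolute deviation from the true parameters.
max(abs(theta_hat - theta_true))
```

Each update touches a single row of the data, which is why this style of estimation scales to data sets that cannot be processed all at once.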

Any loss function may be specified. For convenience the following are built-in:

The following stochastic gradient methods exist:

Check out the vignette in vignettes/ or examples in demo/. In R, the equivalent commands are vignette(package="sgd") and demo(package="sgd").

Installation

To install the latest version from CRAN:

install.packages("sgd")

To install the latest development version from GitHub:

# install.packages("devtools")
devtools::install_github("airoldilab/sgd")

Authors

sgd is written by Dustin Tran and Panos Toulis, and is under active development. Please feel free to contribute by submitting any issues or requests, or by solving any current issues!

We thank all other members of the Airoldi Lab (led by Prof. Edo Airoldi) for their feedback and contributions.

Citation

@article{tran2015stochastic,
  author = {Tran, Dustin and Toulis, Panos and Airoldi, Edoardo M},
  title = {Stochastic gradient descent methods for estimation with large data sets},
  journal = {arXiv preprint arXiv:1509.06459},
  year = {2015}
}

This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.