h2oai/glmnet_python

Name: glmnet_python

Owner: H2O.ai

Forked from: bbalasub1/glmnet_python

Created: 2017-03-22 01:33:29.0

Updated: 2017-03-31 10:01:51.0

Pushed: 2017-04-13 01:47:00.0

Size: 2777

Language: Jupyter Notebook

README

Glmnet for python

Introduction

This is a Python version of the popular glmnet library (beta release). Glmnet fits the entire lasso or elastic-net regularization path for linear regression, logistic and multinomial regression, Poisson regression and the Cox model.

The underlying Fortran code is the same as in the R version, and uses a cyclical path-wise coordinate descent algorithm as described in the papers linked below.
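To make the idea of cyclical coordinate descent along a regularization path concrete, here is a minimal pure-Python sketch. It is not the Fortran implementation and not part of this package's API: the function names are made up for illustration, and the sketch assumes the Gaussian elastic-net objective (1/2n)*||y - Xb||^2 + lam*(alpha*||b||_1 + (1-alpha)/2*||b||^2) with no intercept and no standardization.

```python
def soft_threshold(z, gamma):
    """Soft-thresholding operator: sign(z) * max(|z| - gamma, 0)."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def elastic_net_cd(X, y, lam, alpha=1.0, beta_init=None, n_sweeps=200, tol=1e-10):
    """Cyclical coordinate descent for one value of lam (illustrative only).

    X is a list of rows, y a list of responses. No intercept is fitted.
    """
    n, p = len(X), len(X[0])
    beta = list(beta_init) if beta_init is not None else [0.0] * p
    # Residual y - X beta for the starting coefficients.
    resid = [y[i] - sum(X[i][j] * beta[j] for j in range(p)) for i in range(n)]
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        max_change = 0.0
        for j in range(p):  # cycle through the coordinates in order
            if col_sq[j] == 0.0:
                continue
            # Inner product of column j with the partial residual
            # (the residual with coordinate j's own contribution added back).
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam * alpha) / (col_sq[j] + lam * (1.0 - alpha))
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_bj
                max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return beta


def regularization_path(X, y, lambdas, alpha=1.0):
    """Solve for a decreasing sequence of lam values, warm-starting each fit."""
    path, beta = [], None
    for lam in lambdas:
        beta = elastic_net_cd(X, y, lam, alpha, beta_init=beta)
        path.append((lam, list(beta)))
    return path
```

Passing the lambdas from largest to smallest lets each solution serve as the starting point for the next one, which is the "path-wise" part of the algorithm; the real library adds many refinements that this sketch omits.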

Currently, glmnet library methods for Gaussian, multivariate Gaussian, binomial, multinomial, Poisson and Cox models are implemented for both dense and sparse matrices.

Cross-validation (CV) is also implemented for Gaussian, multivariate Gaussian, binomial, multinomial and Poisson models. CV for Cox models is yet to be implemented.

CV can be run either serially or in parallel. Parallelization is done using the multiprocessing and joblib libraries.
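The serial and parallel modes differ only in how the independent folds are scheduled. The sketch below illustrates that idea with the standard-library multiprocessing module; it is not this package's CV code. The toy one-predictor lasso, the function names and the n_jobs parameter are all assumptions made for illustration.

```python
from multiprocessing import Pool


def fit_1d_lasso(x, y, lam):
    """Closed-form lasso for a single predictor without intercept (toy model)."""
    n = len(x)
    rho = sum(a * b for a, b in zip(x, y)) / n
    sq = sum(a * a for a in x) / n
    shrunk = max(abs(rho) - lam, 0.0)
    return (shrunk if rho > 0 else -shrunk) / sq


def fold_error(args):
    """Fit on the training part of one fold; return held-out mean squared error."""
    x, y, lam, fold, n_folds = args
    train = [i for i in range(len(x)) if i % n_folds != fold]
    held_out = [i for i in range(len(x)) if i % n_folds == fold]
    beta = fit_1d_lasso([x[i] for i in train], [y[i] for i in train], lam)
    return sum((y[i] - beta * x[i]) ** 2 for i in held_out) / len(held_out)


def cv_error(x, y, lam, n_folds=5, n_jobs=1):
    """Average held-out error over folds, computed serially or in parallel."""
    tasks = [(x, y, lam, k, n_folds) for k in range(n_folds)]
    if n_jobs == 1:
        errors = [fold_error(t) for t in tasks]   # serial: one fold after another
    else:
        with Pool(processes=n_jobs) as pool:      # parallel: folds spread over processes
            errors = pool.map(fold_error, tasks)
    return sum(errors) / n_folds
```

Calling cv_error(x, y, lam, n_jobs=1) and cv_error(x, y, lam, n_jobs=4) should give the same number, since only the scheduling of folds changes. When a script uses the parallel branch, the call belongs under an `if __name__ == "__main__":` guard so that worker processes can import the module safely.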

During installation, the Fortran code is compiled on the local machine using gfortran, and is then called by the Python code.

Getting started:

The best place to start is the Jupyter notebook in the test directory (glmnet_examples.ipynb). It provides detailed explanations of function calls and parameter values, along with plenty of examples.

Installation

Unzip the package into a suitable location.

Recompile the GLMnet.so shared library (located in ./lib) using:

gfortran GLMnet.f -fPIC -fdefault-real-8 -shared -o GLMnet.so

Currently, the checked-in version of GLMnet.so is compiled for the following config:

Linux: Linux version 2.6.32-573.26.1.el6.x86_64 (gcc version 4.4.7 20120313 (Red Hat 4.4.7-16) (GCC) )

OS: CentOS 6.7 (Final)

Hardware: 8-core Intel® Core™ i7-2630QM

gfortran: version 4.4.7 20120313 (Red Hat 4.4.7-17) (GCC)
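If the checked-in binary does not match your system, the recompile step above can be scripted. The following is a minimal sketch, assuming gfortran is on the PATH and that GLMnet.f sits in ./lib next to the output file. The helper names are hypothetical and are not part of this package; the compiler flags are the ones given above, and ctypes is used here only to show how a compiled shared library can be opened from Python.

```python
import ctypes
import os
import subprocess


def build_command(src="GLMnet.f", out="GLMnet.so"):
    """Return the gfortran command line from the installation notes."""
    return ["gfortran", src, "-fPIC", "-fdefault-real-8", "-shared", "-o", out]


def compile_glmnet(lib_dir="./lib"):
    """Run gfortran inside lib_dir; raises CalledProcessError if compilation fails."""
    subprocess.run(build_command(), cwd=lib_dir, check=True)


def load_glmnet(lib_dir="./lib", name="GLMnet.so"):
    """Open the compiled shared library with ctypes and return the handle."""
    path = os.path.abspath(os.path.join(lib_dir, name))
    if not os.path.isfile(path):
        raise FileNotFoundError("shared library not found: " + path)
    return ctypes.cdll.LoadLibrary(path)
```

A typical sequence would be compile_glmnet() followed by load_glmnet(). A load error at that point usually means the library was built for a different platform or compiler version than the one running Python.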

Authors:

Algorithm was designed by Jerome Friedman, Trevor Hastie and Rob Tibshirani. Fortran code was written by Jerome Friedman. R wrapper (from which the MATLAB wrapper was adapted) was written by Trevor Hastie.

The original MATLAB wrapper was written by Hui Jiang (14 Jul 2009), and was updated and is maintained by Junyang Qian (30 Aug 2013).

This Python wrapper (which was adapted from the MATLAB and R wrappers) was written by B. J. Balakumar, bbalasub@stanford.edu (5 Sep 2016).

Department of Statistics, Stanford University, Stanford, California, USA.

REFERENCES:


This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.