OHDSI/BigKnn

Name: BigKnn

Owner: Observational Health Data Sciences and Informatics

Description: An R package implementing a large scale k-nearest neighbor classifier using the Lucene search engine

Created: 2016-02-04 13:56:34.0

Updated: 2017-03-14 17:22:56.0

Pushed: 2017-05-19 17:28:55.0

Homepage: null

Size: 4022

Language: R

README

BigKnn

Introduction

An R package implementing a large scale k-nearest neighbor (KNN) classifier using the Lucene search engine.

Features

Examples

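library(BigKnn)

# Sparse covariates in long format: one row per (rowId, covariateId) pair
covariates <- data.frame(rowIds = c(1,1,1,2,2,3),
                         covariateIds = c(10,11,12,10,11,12),
                         covariateValues = c(1,1,1,1,1,1))

# One outcome label per rowId
outcomes <- data.frame(rowIds = c(1,2,3),
                       y = c(1,0,0))

# Folder on the local file system where the Lucene index is stored
indexFolder <- "s:/temp/lucene"

# Build the KNN index from the labeled data
buildKnn(outcomes = ff::as.ffdf(outcomes),
         covariates = ff::as.ffdf(covariates),
         indexFolder = indexFolder)

# Predict outcomes using the k nearest neighbors found in the index
prediction <- predictKnn(covariates = ff::as.ffdf(covariates),
                         indexFolder = indexFolder,
                         k = 10,
                         weighted = TRUE)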
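
To make the idea behind a weighted KNN prediction concrete, here is a minimal base-R sketch on the same toy data. It is not BigKnn's implementation: BigKnn delegates the neighbor search to Lucene, whose scoring is not described here. The helpers `jaccard` and `toyPredictKnn` are hypothetical names introduced only for illustration, and Jaccard similarity over covariate IDs is an assumed stand-in for the real similarity measure.

```r
# Toy data in the same long format as the example above.
covariates <- data.frame(rowIds = c(1, 1, 1, 2, 2, 3),
                         covariateIds = c(10, 11, 12, 10, 11, 12))
outcomes <- data.frame(rowIds = c(1, 2, 3),
                       y = c(1, 0, 0))

# Hypothetical helper (not part of BigKnn): Jaccard similarity of two ID sets.
jaccard <- function(a, b) {
  length(intersect(a, b)) / length(union(a, b))
}

# Hypothetical helper (not part of BigKnn): similarity-weighted vote of the
# k most similar training rows for a single query row.
toyPredictKnn <- function(queryIds, covariates, outcomes, k = 2) {
  sets <- split(covariates$covariateIds, covariates$rowIds)
  sims <- vapply(sets, jaccard, numeric(1), b = queryIds)
  top <- order(sims, decreasing = TRUE)[seq_len(min(k, length(sims)))]
  neighborIds <- as.numeric(names(sims)[top])
  y <- outcomes$y[match(neighborIds, outcomes$rowIds)]
  w <- sims[top]
  if (sum(w) == 0) {
    return(mean(y))  # no overlap with any neighbor: fall back to the plain mean
  }
  sum(w * y) / sum(w)
}

toyPredictKnn(queryIds = c(10, 11), covariates, outcomes, k = 2)
```

For the query `c(10, 11)`, row 2 has similarity 1 and row 1 has similarity 2/3, so the weighted vote is (2/3) / (5/3) = 0.4. An unweighted vote over the same two neighbors would give 0.5, which is the kind of difference the `weighted` argument is meant to control.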

Technology

BigKnn is an R package using the Java based Lucene search engine. The data for the KNN is stored in a folder on the local file system.
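
Because the index lives in an ordinary folder, the Windows-style path `s:/temp/lucene` used in the example can be swapped for any writable location. The sketch below sets up a portable folder under R's session temporary directory. Whether BigKnn expects the folder to exist beforehand, or to be empty, is not stated here, so creating it up front is an assumption.

```r
# Assumption: any writable local directory can serve as the index folder.
indexFolder <- file.path(tempdir(), "lucene")

# Create the folder if it does not exist yet (no warning if it already does).
dir.create(indexFolder, recursive = TRUE, showWarnings = FALSE)

# Normalize to an absolute path with forward slashes on every platform.
indexFolder <- normalizePath(indexFolder, winslash = "/")
```

Note that `tempdir()` is removed when the R session ends, so a permanent location is the better choice for an index that should be reused across sessions.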

System Requirements

Requires R and Java 1.7 or higher (Oracle Java is recommended).
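
A quick way to check the Java requirement from within R is sketched below. It assumes the `java` executable is on the `PATH` and that the first line of `java -version` follows the usual `version "..."` pattern. The helper `parseJavaMajor` is a hypothetical name introduced for illustration and is not part of BigKnn.

```r
# Hypothetical helper: extract the major Java version from a line such as
# 'java version "1.8.0_201"' or 'openjdk version "11.0.2" 2019-01-15'.
parseJavaMajor <- function(versionLine) {
  v <- sub('.*version "([^"]+)".*', "\\1", versionLine)
  parts <- suppressWarnings(as.integer(strsplit(v, "[._-]")[[1]]))
  if (length(parts) == 0 || is.na(parts[1])) {
    return(NA_integer_)
  }
  # Legacy scheme: "1.7" means Java 7; newer releases lead with the major version.
  if (parts[1] == 1L && length(parts) > 1) parts[2] else parts[1]
}

if (nzchar(Sys.which("java"))) {
  # 'java -version' writes to stderr, so capture both streams.
  versionOutput <- system2("java", "-version", stdout = TRUE, stderr = TRUE)
  message("Found Java major version: ", parseJavaMajor(versionOutput[1]))
} else {
  message("No 'java' executable found on the PATH")
}
```

A result of 7 or higher satisfies the requirement above. A Java installation that is not on the `PATH` will not be detected by this check.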

Dependencies

Please note that this package requires Java to be installed. If Java is not already installed on your computer (on most computers it is), go to java.com to get the latest version.

BigKnn also depends on the OHDSI Cyclops and OhdsiRTools packages.

Getting Started

Use the following commands in R to install the BigKnn package:

install.packages("drat")
drat::addRepo("OHDSI")
install.packages("BigKnn")

Getting Involved

License

BigKnn is licensed under Apache License 2.0. Lucene falls under its own Apache License 2.0.

Development

BigKnn is being developed in RStudio and Eclipse.

Development status

Under development. Use at your own risk.


This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.