h2oai/data-science-examples

Name: data-science-examples

Owner: H2O.ai

Description: A collection of data science examples implemented across a variety of languages and libraries.

Created: 2015-12-14 03:46:44.0

Updated: 2018-04-23 13:46:29.0

Pushed: 2016-01-14 19:54:13.0

Homepage: null

Size: 432

Language: CSS

README

data-science-examples

View this site in GitHub Pages: http://h2oai.github.io/data-science-examples/



1. Goals

Goal: To provide a side-by-side framework for adding code examples in many different environments

Goal: It should be easy to add an example

Goal: Encourage lots of different people to add an example

Goal: It should be easy to add a new kind of example

Goal: Examples should be testable

Goal: Provide a library of runnable and easy-to-access answers to common questions

Goal: It should be possible to cut-and-paste a “stable” link for a given example

Goal: Provide support for tags

Non-goals



2. Adding a new example

Conventions
The generation process

The gen.py tool generates the examples.html file. (See the trivial Makefile.)

Tools required to run the generator

I installed the markdown tool on my MacBook Pro with the following command:

npm install markdown-to-html -g

Commands to run

On MacBook Pro:

git add examples.html
git commit
git push

Top-level directory layout

README.md
This file.

Makefile
Very simple helper for running the generation process.

./gen.py
Tool to generate examples.html.

examples
The example code. New files generally want to go somewhere in here.

examples.html
Generated from files in the examples directory.

data
Data used by examples.

index.html
What gh-pages points to.

packages
Helper packages used by the examples (for instance, the ex package for R).

static
Static resources (jQuery, Bootstrap, highlight.js).

Adding a new case for an existing example

Usually this only requires dropping in one more file with a name that gen.py knows to look for. Add that file to the existing directory for the example in question. No metadata files need to be updated.

Unless you want to add a totally new kind of example, in which case read on…

Adding a new kind of example (i.e. language type)

gen.py has the following three arrays. (The names look odd because they are padded with underscores to satisfy PEP 8 while still lining up visually.)

_lang__________ = ["lang-r", "lang-r"]
_tabs_to_check_ = ["R",      "h2o-R"]
_files_to_check = ["ex-R.R", "ex-h2o.R"]

Adding a new kind of example means adding an element to each of these arrays.

Adding a new example

The names of the code example files must match exactly what gen.py expects.
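Because the match is exact, a file whose name differs only in capitalization would presumably not be picked up. The following is a minimal sketch, not code from gen.py, of a check that could be run against one example directory before committing. The function name classify_files and the throwaway directory are hypothetical; the expected names are the two listed in _files_to_check above.

```python
import os
import tempfile

# File names taken from the _files_to_check list shown in this README.
EXPECTED_FILES = ["ex-R.R", "ex-h2o.R"]


def classify_files(example_dir, expected=EXPECTED_FILES):
    """Split a directory's files into recognized names and near misses.

    A near miss is a file whose name equals an expected name only when
    case is ignored, which suggests a typo in the file name.
    """
    expected_lower = {name.lower() for name in expected}
    recognized, near_misses = [], []
    for name in sorted(os.listdir(example_dir)):
        if name in expected:
            recognized.append(name)
        elif name.lower() in expected_lower:
            near_misses.append(name)
    return recognized, near_misses


# Usage: a temporary directory with one correct, one misnamed, one unrelated file.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["ex-R.R", "EX-h2o.r", "notes.txt"]:
        open(os.path.join(tmp, name), "w").close()
    recognized, near_misses = classify_files(tmp)
    print("recognized:", recognized)
    print("near misses:", near_misses)
```

Unrelated files such as notes.txt are ignored by this check; only names that nearly match an expected name are reported.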

Finding data files

The ex R package has a locate function which you may find helpful.

Adding a new category (or subcategory)



3. Testing

Testing will be driven by a Jenkins job that makes some assumptions.

Other dos and don'ts:

TODO:

How to locally build and install the ex R package

I did this with RStudio… TODO: Need better instructions here.

