data-8/datascience

Name: datascience

Owner: Data Science 8

Description: A Python library for introductory data science

Created: 2015-07-17 18:17:39.0

Updated: 2017-12-24 21:26:02.0

Pushed: 2018-01-10 05:32:20.0

Homepage: null

Size: 16151

Language: Jupyter Notebook

README

datascience

A Berkeley library for introductory data science.

Written by Professor John DeNero, Professor David Culler, Sam Lau, and Alvin Wan.

For an example of usage, see the Berkeley Data 8 class.

Installation

Use pip:

pip install datascience

Changelog

This project adheres to Semantic Versioning.

v0.10.4
v0.10.3
v0.10.2
v0.10.1
v0.10.0
v0.9.5
v0.9.4
v0.9.3
v0.9.2
v0.9.1
v0.9.0
v0.8.2
v0.8.0

Breaking changes

Additions

v0.7.1
v0.7.0
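
Because the tags above follow Semantic Versioning, they should be compared as numbers rather than as text: v0.10.0 is newer than v0.9.5, even though it sorts earlier as a string. The sketch below shows one way to do that. `parse_version` is a hypothetical helper written for this example, not part of the library.

```python
def parse_version(tag):
    """Turn a tag such as 'v0.10.4' into a tuple of ints, here (0, 10, 4)."""
    return tuple(int(part) for part in tag.lstrip('v').split('.'))

# A few of the tags from the changelog above.
tags = ['v0.9.5', 'v0.10.4', 'v0.8.2', 'v0.10.0']

by_string = sorted(tags)                      # plain text order (misleading)
by_version = sorted(tags, key=parse_version)  # numeric order, oldest first

print(by_string)   # ['v0.10.0', 'v0.10.4', 'v0.8.2', 'v0.9.5']
print(by_version)  # ['v0.8.2', 'v0.9.5', 'v0.10.0', 'v0.10.4']
```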

Documentation

API reference is at http://data8.org/datascience/ .

Developing

The required environment for installation and tests is the Anaconda Python 3 distribution.

If you encounter an Image not found error on Mac OSX, you may need an XQuartz upgrade.

Start by cloning this repository:

git clone https://github.com/data-8/datascience

Install the dependencies into a Conda environment with:

# For macOS, use
conda env create -f osx_environment.yml -n datascience
# For Linux, use
conda env create -f linux_environment.yml -n datascience

Source the environment to use the correct packages while developing:

source activate datascience
# `source deactivate` will unload the environment

The above command must be run each time you develop in the package. You can also install direnv to auto-load/unload the environment.

Install datascience locally with:

make install

Then, run the tests:

make test

After that, go ahead and start hacking!

Documentation is generated from the docstrings in the methods and is pushed online at http://data8.org/datascience/ automatically. If you want to preview the docs locally, use these commands:

make docs       # Generates docs inside doc/ folder
make serve_docs # Starts a local server to view docs
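
Since the published docs come from docstrings, a new or changed method needs a docstring to show up well in the API reference. The sketch below shows the general shape, with a short description and a worked example that `doctest` can check. `percent_change` is a hypothetical function, and the exact docstring conventions this project expects are an assumption here, so follow the existing methods in the source when in doubt.

```python
import doctest

def percent_change(old, new):
    """Return the percent change from ``old`` to ``new``.

    >>> percent_change(50, 75)
    50.0
    """
    return (new - old) / old * 100

if __name__ == '__main__':
    # Run the examples embedded in the docstrings; no output means they passed.
    doctest.testmod()
```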

Using Zenhub

We use Zenhub to organize development on this library. To get started, go ahead and install the Zenhub Chrome Extension.

Then navigate to the issue board or press b. You'll see a screen that looks something like this:

[Screenshot of the Zenhub issue board]

Example Workflow
  1. John creates an issue called “Everything is breaking”. It goes into the New Issues pipeline at first.
  2. This issue is important, so John immediately moves it into the To Do pipeline. Since he has to go lecture for 61A, he doesn't assign it to himself right away.
  3. Sam sees the issue, assigns himself to it, and moves it into the In Progress pipeline.
  4. After everything is fixed, Sam closes the issue.

Here's another example.

  1. Ani creates an issue asking for beautiful histograms. Like before, it goes into the New Issues pipeline.
  2. John decides that the issue is not as high priority right now because other things are breaking, so he moves it into the Backlog pipeline.
  3. When he has some more time, John assigns himself the issue and moves it into the In Progress pipeline.
  4. Once the issue is finished, he closes the issue.
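
The two workflows above can be read as moves between pipelines. The sketch below is a toy model of those moves only; it does not use or represent the Zenhub API. The `Issue` class and `ALLOWED_MOVES` table are hypothetical names, and "Closed" stands in for closing the issue.

```python
# Pipeline moves that appear in the two example workflows above.
ALLOWED_MOVES = {
    'New Issues': {'To Do', 'Backlog'},
    'To Do': {'In Progress'},
    'Backlog': {'In Progress'},
    'In Progress': {'Closed'},
    'Closed': set(),
}

class Issue:
    """A toy issue that starts in the New Issues pipeline, unassigned."""

    def __init__(self, title):
        self.title = title
        self.pipeline = 'New Issues'
        self.assignee = None

    def move_to(self, pipeline):
        """Move the issue, rejecting moves not seen in the examples."""
        if pipeline not in ALLOWED_MOVES[self.pipeline]:
            raise ValueError(f'cannot move from {self.pipeline} to {pipeline}')
        self.pipeline = pipeline

# First example: urgent issue, picked up by Sam.
first = Issue('Everything is breaking')
first.move_to('To Do')
first.assignee = 'Sam'
first.move_to('In Progress')
first.move_to('Closed')

# Second example: lower priority, parked in the Backlog until John has time.
second = Issue('Beautiful histograms')
second.move_to('Backlog')
second.assignee = 'John'
second.move_to('In Progress')
second.move_to('Closed')
```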

Publishing

python setup.py sdist upload -r pypi

This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.