Name: ACES-Training
Description: null
Created: 2016-03-18 09:09:01.0
Updated: 2016-03-18 10:07:20.0
Pushed: 2016-04-01 06:06:39.0
Homepage: null
Size: 1733
Language: Python
Please refer to https://github.com/sara-nl/ACES-Training for an up-to-date version of this repository
This tutorial teaches master's and PhD students how to coordinate so-called embarrassingly parallel computational tasks across different infrastructures. It shows how to create and process tokens, each of which encodes a single run. The pipeline uses CouchDB as a token pool server, together with Python and the picasclient.
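To make the token idea concrete, here is a minimal sketch of a token pool held in a plain Python dictionary instead of CouchDB. It is not the picasclient API. The field names (`classifier`, `dataset`, `lock`, `done`, `output`), the classifier labels and the dataset labels `A` and `B` are assumptions chosen for illustration.

```python
import time

def make_token(token_id, classifier, dataset):
    """Build one token: a small record that describes a single run."""
    return {
        "_id": token_id,
        "classifier": classifier,  # hypothetical field name
        "dataset": dataset,        # hypothetical field name
        "lock": 0,                 # 0 means no worker has claimed the token
        "done": 0,                 # 0 means the run has not finished
    }

def claim_next(pool):
    """Lock and return the first unclaimed token, or None if none is left."""
    for token in pool.values():
        if token["lock"] == 0 and token["done"] == 0:
            token["lock"] = int(time.time())
            return token
    return None

def mark_done(token, result):
    """Store the result and flag the token as finished."""
    token["output"] = result
    token["done"] = int(time.time())

# Fill the pool with three independent runs (illustrative values only).
pool = {}
runs = [("SingleGene", "A"), ("SingleGene", "B"), ("Lee", "A")]
for i, (clf, data) in enumerate(runs):
    tok = make_token("token_%d" % i, clf, data)
    pool[tok["_id"]] = tok

# A worker keeps claiming tokens until the pool is exhausted.
processed = []
while True:
    token = claim_next(pool)
    if token is None:
        break
    mark_done(token, "ran %s on %s" % (token["classifier"], token["dataset"]))
    processed.append(token["_id"])
```

Because every token is self-contained, any number of workers on any infrastructure could run the same loop against a shared pool, which is what makes the tasks embarrassingly parallel.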
To follow the tutorial you need a Python distribution and access to a CouchDB instance. On Lisa, execute:
easy_install --user couchdb
easy_install --user scikit-learn
If you want to use your own Python distribution, please install the following packages:
Module | Version
---|---
numpy | 1.6.1
scipy | 0.10.0
sklearn | 0.11
h5py | 2.0.0
xlrd | not known
couchDB | 0.9
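If you use your own distribution, a short script can report which of the packages listed above are importable and which versions were found. The sketch below makes two assumptions that the list does not state: that the versions are minimums, and that the CouchDB package is imported under the lowercase name `couchdb`.

```python
import importlib
import re

# Versions copied from the list above; xlrd has no known version there.
REQUIRED = {
    "numpy": "1.6.1",
    "scipy": "0.10.0",
    "sklearn": "0.11",
    "h5py": "2.0.0",
    "xlrd": None,
    "couchdb": "0.9",  # assumed import name of the CouchDB package
}

def version_tuple(version):
    """Turn '1.6.1' into (1, 6, 1); stop at the first non-numeric piece."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)

def check(module_name, minimum):
    """Return a short status string for one module."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return "missing"
    found = getattr(module, "__version__", None)
    if minimum is None or found is None:
        return "installed (version not compared)"
    if version_tuple(found) >= version_tuple(minimum):
        return "ok (%s)" % found
    return "too old (%s < %s)" % (found, minimum)

if __name__ == "__main__":
    for name, minimum in REQUIRED.items():
        print("%-8s %s" % (name, check(name, minimum)))
```

Run the script with the same interpreter you intend to use for the tutorial; a different interpreter may see a different set of installed packages.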
You will need the code provided in this repository. You can download it like this:
git clone https://github.com/chStaiger/ACES-Training.git
Change to ACES-Training/code and start Python there. All code must be run from this directory so that the imports work.
The training uses a double-loop cross-validation pipeline, which is described in detail in Staiger et al. We will create tokens for the Single gene classifier and the Lee classifiers. For didactic reasons, we will also create tokens that the pipeline will fail to process.
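For readers unfamiliar with double-loop (nested) cross-validation, the following sketch shows only the index bookkeeping. The outer loop holds out a test fold for the final performance estimate, and the inner loop splits the remaining samples again, typically to tune a parameter. This is a generic illustration, not the pipeline's implementation: the fold counts, the sample count and the function names `k_fold` and `double_loop` are assumptions, and details such as stratification or repetitions in Staiger et al. are not reproduced.

```python
def k_fold(indices, k):
    """Yield k (train, test) pairs; fold sizes differ by at most one."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

def double_loop(n_samples, outer_k, inner_k):
    """Yield (outer_train, outer_test, inner_splits) for every outer fold."""
    samples = list(range(n_samples))
    for outer_train, outer_test in k_fold(samples, outer_k):
        # The inner loop only ever sees the outer training samples.
        inner_splits = list(k_fold(outer_train, inner_k))
        yield outer_train, outer_test, inner_splits

# Illustrative sizes: 10 samples, 5 outer folds, 2 inner folds.
runs = list(double_loop(n_samples=10, outer_k=5, inner_k=2))

for outer_train, outer_test, inner_splits in runs:
    print("outer test:", outer_test, "inner folds:", len(inner_splits))
```

Each outer fold is independent of the others, so one natural way to parallelise such a scheme is to describe every outer fold (or every classifier and dataset combination) in its own token.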