Name: ML_Feature_Extraction_TRAINING
Owner: Hurwitz Lab
Description: Pipeline for feature extraction (k-mer frequency counts) to train a linear classifier
Forked from: aponsero/ML_Feature_Extraction_TRAINING
Created: 2017-11-02 21:29:58.0
Updated: 2017-11-02 21:30:00.0
Pushed: 2017-08-07 19:07:38.0
Homepage: null
Size: 11
Language: Shell
Pipeline for feature extraction (k-mer frequency counts) to train a linear classifier on an HPC cluster.
please modify the
You can also modify
Run
split.sh
This command removes contigs shorter than MIN_SIZE from the dataset and creates NUM_FILE files, each containing SPLIT_SIZE sequences randomly selected from DATASET. The split files are stored in RESULT_DIR/.
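The filtering and sampling described above might look roughly like the following. This is a minimal sketch, not the contents of split.sh: the function names filter_short_contigs and sample_records are hypothetical, the input is assumed to be FASTA, and the random selection assumes GNU shuf is available.

```shell
#!/usr/bin/env bash
# Hypothetical sketch only; split.sh itself may work differently.

# Print the FASTA records whose sequence length is >= the given minimum.
# Multi-line sequences are joined onto a single line in the output.
filter_short_contigs() {
    local min_size="$1" fasta="$2"
    awk -v min="$min_size" '
        function emit() {
            if (header != "" && length(seq) >= min + 0) {
                print header
                print seq
            }
        }
        /^>/ { emit(); header = $0; seq = ""; next }
        { seq = seq $0 }
        END { emit() }
    ' "$fasta"
}

# Read two-line FASTA records (header, sequence) on stdin and print
# a random subset of the requested size.
# Assumes headers contain no tab characters and that GNU shuf is installed.
sample_records() {
    local n="$1"
    paste - - | shuf -n "$n" | tr '\t' '\n'
}

# Example use with the variable names from the text (the file naming is an assumption):
#   for i in $(seq 1 "$NUM_FILE"); do
#       filter_short_contigs "$MIN_SIZE" "$DATASET" \
#           | sample_records "$SPLIT_SIZE" > "$RESULT_DIR/split_$i.fasta"
#   done
```

In this sketch each output file is sampled independently, so the same sequence may appear in more than one split file; whether the real pipeline allows that is not stated here.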
Once the job has completed successfully, the analysis can be run.
Run
submit.sh
This command queues an array job for the analysis. The final output is located in SAMPLE_DIR/kmers.
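To make the k-mer counting step concrete, here is a small sketch of what one task of the array job could compute for a single split file. It is an assumption, not the pipeline's actual code: the function name count_kmers is hypothetical, counts are pooled over all sequences in the file, and the real analysis may use a dedicated k-mer counting tool instead.

```shell
#!/usr/bin/env bash
# Hypothetical sketch only; the scripts queued by submit.sh may differ.

# Count every overlapping k-mer in a FASTA file and print
# "kmer<TAB>count" lines sorted by k-mer.
# Counts are pooled across all records in the file (an assumption).
count_kmers() {
    local k="$1" fasta="$2"
    awk -v k="$k" '
        # Add the k-mers of the current sequence to the running totals.
        function tally(   i, s) {
            s = toupper(seq)
            for (i = 1; i + k - 1 <= length(s); i++)
                counts[substr(s, i, k)]++
        }
        /^>/ { tally(); seq = ""; next }
        { seq = seq $0 }
        END {
            tally()
            for (kmer in counts)
                print kmer "\t" counts[kmer]
        }
    ' "$fasta" | LC_ALL=C sort
}

# Example use (the file names are assumptions):
#   count_kmers 4 "$RESULT_DIR/split_1.fasta" > "$SAMPLE_DIR/kmers/split_1.tsv"
```

Dividing each count by the total number of k-mers in the file would turn these raw counts into relative frequencies, if the classifier expects normalized features.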