Name: Indicator_contig_predictor
Owner: hackseq
Description: A two-way classifier to characterize metagenomes based on short and long read technologies
Created: 2016-08-31 22:58:49.0
Updated: 2017-08-04 12:20:49.0
Pushed: 2016-10-17 23:32:12.0
Homepage:
Size: 56
Language: Python
README
De novo metagenomic marker pipeline
Pipeline
- Human Infant Microbiome Dataset (“Babybiome”)
- Kudos to Molly K. Gibson for excellent data sharing
(Gibson MK, Wang B, Ahmadi S, et al. Developmental dynamics of the preterm infant gut microbiota and antibiotic resistome. Nature Microbiology. 2016;1:16024.)
- /src/query.py
- Based on magicBLAST, a new RNA-seq BLAST mapper
- /src/coverager.py & /scripts/test_coverager.sh
- Generating BAMs by mapping short reads to long reads with magicBLAST (short reads streamed directly from SRA)
- Building a histogram of short-read coverage along each long read
- Thresholding for uniform, deep, and broad short-read coverage of long reads (indicator contigs)
- Using a chi-squared test to check coverage for uniformity
- Estimating the probability that a long read occurs in the short-read set
- Gen. Classifier
- Separation by physiological features
- Male-Female
- Delivery mode
- Probability of gene co-occurrence?
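
The coverage steps listed for /src/coverager.py (histogram, thresholding, chi-squared uniformity check) can be sketched roughly as follows. This is a minimal illustration and not the code in the repository: the function names, the thresholds (`min_depth`, `min_breadth`, `alpha`), and the number of bins are all assumptions chosen for the example. Alignments are passed in as plain `(start, end)` intervals on a long read rather than parsed from a BAM, the histogram counts read midpoints per bin so that each read contributes one count, and the p-value uses the Wilson-Hilferty normal approximation to avoid a SciPy dependency.

```python
import math


def coverage_per_base(length, alignments):
    """Per-base depth of one long read from (start, end) short-read hits.

    Intervals are 0-based and half-open. In practice they would come from
    BAM records, which this sketch does not parse.
    """
    diff = [0] * (length + 1)
    for start, end in alignments:
        start, end = max(0, start), min(length, end)
        if start < end:
            diff[start] += 1
            diff[end] -= 1
    depth, running = [], 0
    for delta in diff[:length]:
        running += delta
        depth.append(running)
    return depth


def reads_per_bin(length, alignments, n_bins):
    """Histogram of read midpoints over n_bins near-equal bins."""
    counts = [0] * n_bins
    for start, end in alignments:
        mid = (start + end) // 2
        if 0 <= mid < length:
            counts[mid * n_bins // length] += 1
    return counts


def chi_squared_uniformity(counts):
    """Return (statistic, approximate p-value) for H0: equal counts per bin.

    Assumes the bins are equally wide, which holds approximately when the
    long read is much longer than n_bins.
    """
    total = sum(counts)
    dof = len(counts) - 1
    if total == 0 or dof < 1:
        return 0.0, 1.0
    expected = total / len(counts)
    stat = sum((c - expected) ** 2 / expected for c in counts)
    # Wilson-Hilferty: (stat / dof) ** (1/3) is approximately normal.
    var = 2.0 / (9.0 * dof)
    z = ((stat / dof) ** (1.0 / 3.0) - (1.0 - var)) / math.sqrt(var)
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))
    return stat, p_value


def is_indicator_contig(length, alignments, min_depth=5.0,
                        min_breadth=0.9, n_bins=10, alpha=0.05):
    """Hypothetical rule: deep, broad, and not significantly non-uniform."""
    if length <= 0:
        return False
    depth = coverage_per_base(length, alignments)
    mean_depth = sum(depth) / length
    breadth = sum(1 for d in depth if d > 0) / length
    if mean_depth < min_depth or breadth < min_breadth:
        return False
    _, p_value = chi_squared_uniformity(
        reads_per_bin(length, alignments, n_bins))
    return p_value >= alpha


# 100 bp reads tiled every 10 bp along a 1000 bp long read.
uniform = [(s, s + 100) for s in range(0, 901, 10)]
# Same tiling plus a pile-up of 200 reads on one region.
pileup = uniform + [(400, 500)] * 200

print(is_indicator_contig(1000, uniform))  # True
print(is_indicator_contig(1000, pileup))   # False
```

Counting midpoints slightly under-fills the first and last bins, because a read midpoint cannot fall within half a read length of either end. A production version might trim the ends or use expected counts that account for this edge effect.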