bcgsc/chromeqc

Name: chromeqc

Owner: BC Cancer Agency Canada's Michael Smith Genome Sciences Centre

Owner: hackseq

Description: ChromeQC: Summarize sequencing library quality of 10x Genomics Chromium linked reads

Created: 2017-10-10 22:45:13.0

Updated: 2018-03-06 18:14:29.0

Pushed: 2017-11-21 22:53:20.0

Homepage: https://bcgsc.github.io/chromeqc

Size: 2067

Language: HTML

README

ChromeQC: Summarize library quality of 10x Genomics Chromium linked reads

ChromeQC produces a quick report on the quality of a 10x Genomics Chromium linked-read library. The report summarizes the sizes of the molecules, the number of reads per molecule, the number of molecules per barcode, and the amount of DNA per barcode. The aim is a tool that is as fast as FastQC yet reports the information shown on the Summary page of the 10x Genomics Loupe software. ChromeQC is developed in Python 3, R, AWK, RMarkdown, and Flexdashboard, and uses BWA-MEM for read alignment.
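As a rough illustration of the four quantities the report summarizes, the sketch below computes them from a toy list of molecules. The record layout `(barcode, start, end, n_reads)` and the function name `summarize` are hypothetical, chosen only for this example; the project itself builds its report with R and RMarkdown, and how it derives molecules from alignments is not shown here.

```python
from collections import defaultdict
from statistics import median

# Hypothetical toy input: one tuple per molecule,
# (barcode, start position, end position, number of reads).
molecules = [
    ("AAAC", 1000, 51000, 40),
    ("AAAC", 200000, 230000, 25),
    ("CCGT", 5000, 105000, 90),
]


def summarize(molecules):
    """Return median values of the four per-library summary quantities."""
    sizes = [end - start for _, start, end, _ in molecules]
    reads = [n_reads for _, _, _, n_reads in molecules]

    molecules_per_barcode = defaultdict(int)
    dna_per_barcode = defaultdict(int)
    for barcode, start, end, _ in molecules:
        molecules_per_barcode[barcode] += 1
        dna_per_barcode[barcode] += end - start

    return {
        "median_molecule_size": median(sizes),
        "median_reads_per_molecule": median(reads),
        "median_molecules_per_barcode": median(molecules_per_barcode.values()),
        "median_dna_per_barcode": median(dna_per_barcode.values()),
    }


summary = summarize(molecules)
for name, value in summary.items():
    print(f"{name}\t{value}")
```

Medians are used here only because they are robust to a few very large molecules; the actual report may present full distributions instead.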

Usage

-whitelist     : default='whitelist_barcodes', type=str
-subsample_size: default=4000                , type=int
-in            : default='-'                 , type=str
-out           : default='stdout'            , type=str
-seed          : default=1334                , type=int
-max_read_pairs: default=-1                  , type=int  , note: -1 means all read pairs
-stats_out_path: default='.'                 , type=str  , note: the directory must already exist
-verbose       : default=False               , no value  , note: a flag; true if supplied, false otherwise
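The option table above could be expressed with Python's standard `argparse` module roughly as follows. This is a minimal sketch that mirrors the listed names, types, and defaults; it is not taken from the repository, and the function name `build_parser` and the `dest` names `in_path` and `out_path` are assumptions made for this example.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented options and defaults."""
    parser = argparse.ArgumentParser(
        description="Subsample linked reads by barcode (illustrative sketch)."
    )
    parser.add_argument("-whitelist", type=str, default="whitelist_barcodes")
    parser.add_argument("-subsample_size", type=int, default=4000)
    # "in" is a Python keyword, so store the value under a different name.
    parser.add_argument("-in", dest="in_path", type=str, default="-")
    parser.add_argument("-out", dest="out_path", type=str, default="stdout")
    parser.add_argument("-seed", type=int, default=1334)
    # -1 means "process all read pairs".
    parser.add_argument("-max_read_pairs", type=int, default=-1)
    # The directory is expected to exist already.
    parser.add_argument("-stats_out_path", type=str, default=".")
    # A flag: True if supplied, False otherwise.
    parser.add_argument("-verbose", action="store_true")
    return parser


defaults = build_parser().parse_args([])
custom = build_parser().parse_args(["-subsample_size", "100", "-verbose"])
print(defaults.subsample_size, defaults.verbose)
print(custom.subsample_size, custom.verbose)
```

Passing an explicit list to `parse_args` keeps the example independent of the real command line; a script would normally call `parse_args()` with no arguments.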

Examples

python3 random_sampling_from_whitelist.py -w ../data/whitelist_barcodes.txt.gz -i ../data/read-RA_si-GAGTTAGT_lane-001-chunk-0002.fastq.gz -v

The pipeline starts with raw FASTQ files of interleaved paired-end reads produced by the 10x Chromium platform.
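To make the subsampling step concrete, here is a minimal sketch of how a seeded random sample of whitelist barcodes might be used to filter interleaved read pairs. It is not the repository's script: the function names are hypothetical, and it assumes that the barcode is the first 16 bases of the first read in each pair and that each FASTQ record spans exactly four lines.

```python
import random
from itertools import islice

BARCODE_LENGTH = 16  # assumption: barcode is the first 16 bases of read 1


def sample_barcodes(whitelist, subsample_size, seed):
    """Pick a reproducible random subset of whitelist barcodes."""
    rng = random.Random(seed)
    barcodes = sorted(whitelist)  # sort so the result depends only on the seed
    return set(rng.sample(barcodes, min(subsample_size, len(barcodes))))


def read_pairs(lines):
    """Yield (read1, read2) from interleaved FASTQ lines, 4 lines per read."""
    iterator = iter(lines)
    while True:
        chunk = list(islice(iterator, 8))
        if len(chunk) < 8:
            return  # end of input (a trailing incomplete pair is dropped)
        yield chunk[:4], chunk[4:]


def filter_pairs(lines, keep, max_read_pairs=-1):
    """Yield pairs whose read-1 barcode is in `keep`; -1 means no limit."""
    for index, (read1, read2) in enumerate(read_pairs(lines)):
        if max_read_pairs >= 0 and index >= max_read_pairs:
            break
        if read1[1][:BARCODE_LENGTH] in keep:
            yield read1, read2


# Toy data: two read pairs with different barcodes.
bc_a, bc_c = "A" * 16, "C" * 16
fastq = [
    "@r1/1", bc_a + "GATTACA", "+", "I" * 23,
    "@r1/2", "TTTTGGGG", "+", "I" * 8,
    "@r2/1", bc_c + "GATTACA", "+", "I" * 23,
    "@r2/2", "CCCCAAAA", "+", "I" * 8,
]
kept = list(filter_pairs(fastq, keep={bc_a}))
sample = sample_barcodes([bc_a, bc_c, "G" * 16], subsample_size=2, seed=1334)
print(len(kept), len(sample))
```

Sampling barcodes rather than individual reads keeps all reads of a selected barcode together, which is what per-barcode and per-molecule statistics need.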

Dependencies

 pip install -r requirements.txt
 bundle

Prerequisites

The analysis and report will be created using R, the Tidyverse, RMarkdown, and Flexdashboard. Familiarity with some of these tools is useful, but not necessary to participate in this project. Non-technical participants are welcome to design the aesthetics of the report, prepare and deliver the presentation, and coordinate writing a brief paper about the tool.

Team Lead: Shaun Jackman | sjackman@gmail.com | @sjackman | Grad Student | BC Cancer Agency Genome Sciences Centre


This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.