cbcrg/hicmaptools

Name: hicmaptools

Owner: Notredame Lab

Description: null

Created: 2015-04-16 08:54:51.0

Updated: 2017-01-18 14:13:54.0

Pushed: 2017-12-23 12:29:55.0

Homepage: null

Size: 3375

Language: C++

README

hicmaptools

hicmaptools is a collection of tools for downstream analysis of Hi-C contact maps.

Prerequisites

Compiling hicmaptools requires the following tools to be installed on your system: make, gcc-c++ and R.

Compile/Installation

Clone the git repository on your computer with the following command:

git clone git@github.com:cbcrg/hicmaptools.git hicmaptools

Make sure you have installed the required dependencies listed above. Then change to the project root folder, named hicmaptools, and enter the following commands:

$ cd src
$ make

The binary will be automatically copied to the path hicmaptools/bin.

$ make install

The binary will be automatically copied to the path specified by the environment variable $USER_BIN (check that it exists before running the make command).

Usage
hicmaptools -in_map in.binmap -in_bin in.bins SELECT_ONE_QUERY_MODE query.bed -output out_file.tsv  

options:  
        -in_map      text .n_contact or binary .bimap generated by the genBiMap command
        -in_bin      the bin file for contact map, .bins

query modes: 
        -bat         a loci bat: chr    start   end
        -output      ave neighboring contact of the bat

        -couple      pair of sites: chr1    start1  end1    chr2    start2  end2
        -output      contacts between all pairs

        -local       an interval: chr    start   end
        -output      all contacts inside interval

        -loop        loci gene: chr start   end
        -output      contact between two ends, ie. 5' and 3' genes

        -TAD         loci interval: chr start   end
        -output      sum/ave contact of the TAD

        -sites       interesting sites: chr start   end
        -output      contact between those sites                        

        -submap      genome region to extract: chr  start   end
        -output      sub contact map, ie. 3R:10~15MB

other parameters:
        -ner_bin     check neighbouring bins for bat mode, default=10
        -random      assign random size, default=500

For instance:

hicmaptools -in_map nm_none_1000_reduced.bimap -in_bin nm_none_1000.bins -query_interval data/10000_40000_top5.epi_domains -output 10000_40000_top5-contact.tsv

Contact Input (essential)
-in_bin

Defines the chromosome, start position and end position of each bin. The format is as follows:

cbin    chr from.coord  to.coord
1   2L  6000        7000
2   2L  7000        8000
3   2L  8000        9000
4   2L  9000        10000
5   2L  12000       13000
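
To illustrate how a query interval relates to this bin table, here is a minimal Python sketch (not part of hicmaptools) that parses the whitespace-separated format shown above and lists the bins overlapping a region. The function names `read_bins` and `overlapping_bins` are hypothetical, and treating coordinates as half-open intervals is an assumption; the tool's own overlap rule may differ.

```python
import io

def read_bins(handle):
    """Parse a .bins table into a list of (cbin, chrom, start, end) tuples."""
    bins = []
    for line in handle:
        fields = line.split()
        # Skip blank lines and the header row ("cbin chr from.coord to.coord").
        if not fields or fields[0] == "cbin":
            continue
        cbin, chrom, start, end = fields[:4]
        bins.append((int(cbin), chrom, int(start), int(end)))
    return bins

def overlapping_bins(bins, chrom, start, end):
    """Return ids of bins on `chrom` overlapping [start, end).

    Assumption: both bins and queries are half-open intervals.
    """
    return [cbin for cbin, c, s, e in bins
            if c == chrom and s < end and start < e]

# The example table from this README.
example = """cbin    chr from.coord  to.coord
1   2L  6000        7000
2   2L  7000        8000
3   2L  8000        9000
4   2L  9000        10000
5   2L  12000       13000
"""

bins = read_bins(io.StringIO(example))
print(overlapping_bins(bins, "2L", 7500, 9500))  # prints [2, 3, 4]
```

Note that bins need not be contiguous (bin 5 starts at 12000), so a query can fall into a gap and match no bin at all.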
-in_map

Contact map indexed by bins. The format is as follows:

Query Modes

BED format: the first three required columns are enough.

-bat -couple -local -loop -TAD -sites -submap -output

Executing a hicmaptools command generates two output files:

Illustration for different query options

Example

Suppose you have the files below:

and you want to run a query such as -bat.

Use the command:

hicmaptools -in_map nm_none_30000.n_contact -in_bin 30000.cbins -bat BATtest.txt -output temp.txt

temp: the output name you assign
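
When many queries have to be run, the command line can be assembled programmatically. The following Python sketch only builds the argument list from the flags documented in the Usage section; the helper name `build_command` is hypothetical, and the argument order simply mirrors the example above.

```python
import shlex

# Query modes listed in the Usage section of this README.
QUERY_MODES = {"-bat", "-couple", "-local", "-loop", "-TAD", "-sites", "-submap"}

def build_command(in_map, in_bin, mode, query, output):
    """Return the hicmaptools argument list for one query (hypothetical helper)."""
    if mode not in QUERY_MODES:
        raise ValueError(f"unknown query mode: {mode}")
    return ["hicmaptools", "-in_map", in_map, "-in_bin", in_bin,
            mode, query, "-output", output]

cmd = build_command("nm_none_30000.n_contact", "30000.cbins",
                    "-bat", "BATtest.txt", "temp.txt")
print(shlex.join(cmd))  # shell-quoted form of the command, for inspection
```

Assuming the hicmaptools binary is on your PATH, the list can be passed to `subprocess.run(cmd, check=True)` to execute it.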

You will get two output files:

When you open temp.txt, you will see something like:

x   chrom   start   end ... rank_obs    rank_exp    rank_nor    
3R  100000  200000  ...     0.880       0.990       0.760

You may wonder whether the rank information is reliable. You can check this with the tool we provide.

Normal Distribution Test

If the random data follow a normal distribution, we can assume the rank information is reliable.

Our tool therefore includes a script that tests for normality. Run it with the following command:

Rscript tools/normality_test.R temp_random.txt outputname

You will get the test output message and a PDF file containing three plots.
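
For readers who want a quick normality check without R, the sketch below uses the Jarque-Bera statistic, which measures how far the sample skewness and kurtosis are from those of a normal distribution. This is a swapped-in technique for illustration only: it is not necessarily the test that tools/normality_test.R performs, the function name `jarque_bera` is hypothetical, and the asymptotic p-value is unreliable for small samples. The input is a plain list of numbers, since the layout of the random output file is not described here.

```python
import math
from statistics import NormalDist

def jarque_bera(values):
    """Return (statistic, asymptotic p-value) of the Jarque-Bera normality test."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three values")
    mean = sum(values) / n
    # Central moments of order 2, 3 and 4.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0:
        raise ValueError("all values are identical")
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    stat = n / 6.0 * (skewness ** 2 + excess_kurtosis ** 2 / 4.0)
    # Under normality the statistic is asymptotically chi-square with 2
    # degrees of freedom, whose survival function is exp(-x / 2).
    return stat, math.exp(-stat / 2.0)

# Deterministic demo data: quantiles of a normal and of an exponential law.
n = 500
normal_like = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
skewed = [-math.log(1.0 - (i + 0.5) / n) for i in range(n)]

for label, data in (("normal-like", normal_like), ("skewed", skewed)):
    stat, p = jarque_bera(data)
    print(f"{label}: JB={stat:.3f}, p={p:.3g}")
```

A small p-value suggests the data deviate from normality, in which case the rank information should be interpreted with caution.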

Illustration for PDF file


This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.