EnvGen/POGENOM

Name: POGENOM

Owner: Environmental Genomics Group SciLifeLab/KTH Stockholm

Description: Population genomics from metagenomes

Created: 2017-06-09 07:48:24.0

Updated: 2017-06-09 13:03:06.0

Pushed: 2017-08-17 14:52:24.0

Homepage: null

Size: 32

Language: Perl

GitHub Committers

User | Most Recent Commit | # Commits
Anders Andersson | 2018-01-15 12:53:21.0 | 21


README

POGENOM

Population genomics from metagenomes

Description

POGENOM takes as input a file in Variant Call Format (VCF). Such a file is generated by mapping one or several metagenome samples against a reference genome with a read aligner and then calling variants with a variant caller. POGENOM calculates the nucleotide diversity (π) within each sample. If a multi-sample VCF file is provided, the fixation index (FST) is also calculated between each pair of samples. If an annotation file in General Feature Format (GFF) is provided in addition to the VCF file, gene-wise π and FST are also calculated. If a genetic code file is provided as well, gene-wise π and FST are also calculated at the amino acid level, and non-synonymous to synonymous polymorphism rates (pN/pS) are calculated for each gene in each sample. POGENOM also calculates some of the above parameters for all samples collectively, treating them as a metasample.
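To make the two main statistics concrete, here is a minimal Python sketch of nucleotide diversity and the fixation index computed from per-locus allele read counts. It uses common read-count based definitions (the chance that two randomly drawn reads differ) and is not taken from the POGENOM source, so POGENOM's own estimators may differ in detail. The function names and the input layout (one tuple of allele counts per variant locus) are hypothetical, and dividing by genome size is an assumption about why a genome size is needed.

```python
def locus_pi(counts):
    """Chance that two reads drawn without replacement at a locus differ."""
    n = sum(counts)
    if n < 2:
        return 0.0
    same = sum(c * (c - 1) for c in counts)
    return 1.0 - same / (n * (n - 1))


def between_pi(counts_a, counts_b):
    """Chance that one read from sample A and one from sample B differ."""
    n_a, n_b = sum(counts_a), sum(counts_b)
    if n_a == 0 or n_b == 0:
        return 0.0
    same = sum(a * b for a, b in zip(counts_a, counts_b))
    return 1.0 - same / (n_a * n_b)


def genome_pi(loci, genome_size):
    """Sum of per-locus diversity divided by genome size (assumed normalisation)."""
    return sum(locus_pi(counts) for counts in loci) / genome_size


def fst(loci_a, loci_b):
    """1 - (mean within-sample diversity / between-sample diversity)."""
    within = sum((locus_pi(a) + locus_pi(b)) / 2 for a, b in zip(loci_a, loci_b))
    between = sum(between_pi(a, b) for a, b in zip(loci_a, loci_b))
    if between == 0:
        return float("nan")  # undefined when no differences are observed
    return 1.0 - within / between


# Hypothetical allele counts (reference, alternative) at two loci in two samples.
sample_a = [(8, 2), (5, 5)]
sample_b = [(2, 8), (5, 5)]
print(genome_pi(sample_a, genome_size=1000))
print(fst(sample_a, sample_b))
```

Under these definitions, two samples fixed for different alleles at every locus give an FST of 1, and samples with the same allele frequencies give a value close to 0.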


Installation

Download the latest POGENOM distribution from https://github.com/EnvGen/POGENOM/releases and extract the files. Perl must be installed on your computer to run POGENOM. When running POGENOM, either move to the directory containing the files or give the path to the script, e.g. perl path/to/pogenom.pl ...

Usage (minimum input)

Either:

perl pogenom.pl --vcf_file VCF_FILE --out OUTPUT_FILES_PREFIX --genome_size GENOME_SIZE [--help]

Or:

perl pogenom.pl --vcf_file VCF_FILE --out OUTPUT_FILES_PREFIX --gff_file GFF_FILE [--help]
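If POGENOM is called from a script, the two minimum-input forms above can be assembled programmatically. The following Python sketch only builds the argument list and checks that either a genome size or a GFF file is supplied. The helper name build_pogenom_command and the file names are hypothetical, and only flags listed in this README are used.

```python
def build_pogenom_command(vcf_file, out_prefix, genome_size=None,
                          gff_file=None, script="pogenom.pl"):
    """Return the argument list for one POGENOM run (hypothetical helper)."""
    if genome_size is None and gff_file is None:
        raise ValueError("give either genome_size or gff_file")
    cmd = ["perl", script, "--vcf_file", vcf_file, "--out", out_prefix]
    if genome_size is not None:
        # Genome size is given in bp as an integer.
        cmd += ["--genome_size", str(int(genome_size))]
    if gff_file is not None:
        cmd += ["--gff_file", gff_file]
    return cmd


# Hypothetical file names, for illustration only.
print(" ".join(build_pogenom_command("sample.vcf", "results", genome_size=2500000)))
print(" ".join(build_pogenom_command("sample.vcf", "results", gff_file="genes.gff")))
```

The returned list can be passed to subprocess.run(cmd, check=True) to start the run.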

Required arguments

--vcf_file VCF_FILE Specify vcf file with data from a single or multiple samples.

--out OUTPUT_FILES_PREFIX Specify the prefix of the output file name(s) (overwrites existing files with same names).

--genome_size GENOME_SIZE Specify genome size (in bp; integer). Not required if --gff_file is given.

Optional arguments

--gff_file GFF_FILE Specify gff file. Either this or --genome_size must be given.

--genetic_code_file GENETIC_CODE_FILE Specify genetic code file. E.g. standard_genetic_code.txt in the POGENOM distribution.

--loci_file LOCI_FILE Specify file with ids for loci to include.

--min_count MIN_COUNT Specify minimum coverage for a locus to be included for the sample.

--min_found MIN_FOUND_IN Specify the minimum number of samples in which a locus needs to be present to be included.

--subsample SUBSAMPLE Specify coverage level at which to subsample.

--keep_haplotypes If this is used, POGENOM will not split haplotypes into single-nucleotide variants, which is otherwise the default behaviour.

--vcf_version Specify VCF file format version. Can be set to 4.2 or 4.1 (default).

--help To print help message on screen.
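The coverage-related options can be pictured with a small Python sketch. It shows one plausible reading of --min_count, --min_found and --subsample, assuming that a locus counts as present in a sample when its coverage reaches MIN_COUNT, and that subsampling draws reads without replacement down to the given coverage. This is an illustration, not POGENOM's implementation, and all names and data are hypothetical.

```python
import random


def filter_loci(coverage_by_locus, min_count, min_found):
    """Keep loci that reach min_count coverage in at least min_found samples.

    coverage_by_locus maps a locus id to a list of per-sample coverages.
    Returns, for each kept locus, which samples pass the coverage threshold.
    """
    kept = {}
    for locus, coverages in coverage_by_locus.items():
        present = [cov >= min_count for cov in coverages]
        if sum(present) >= min_found:
            kept[locus] = present
    return kept


def subsample_counts(counts, depth, rng):
    """Draw `depth` reads without replacement from per-allele read counts."""
    pool = [allele for allele, c in enumerate(counts) for _ in range(c)]
    if len(pool) <= depth:
        # Assumption: loci already at or below the target depth are left unchanged.
        return list(counts)
    drawn = rng.sample(pool, depth)
    return [drawn.count(allele) for allele in range(len(counts))]


# Hypothetical coverages for two loci across three samples.
coverage = {"locus1": [12, 3, 15], "locus2": [2, 1, 30]}
print(filter_loci(coverage, min_count=10, min_found=2))
print(subsample_counts([30, 10], depth=20, rng=random.Random(0)))
```

Subsampling to a common coverage is a standard way to make diversity estimates comparable between samples sequenced to different depths.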


Citing POGENOM

POGENOM does not have a paper yet. In the meantime, please cite it as:

Andersson AF, Sjöqvist C (2017). POGENOM: population genomics from metagenomes. https://github.com/EnvGen/POGENOM.


This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.