SciLifeLab/standalone_scripts

Name: standalone_scripts

Owner: Science For Life Laboratory

Description: Repository to store standalone scripts that do not belong to any bigger package or repository

Created: 2014-09-23 12:50:27.0

Updated: 2017-03-16 14:16:54.0

Pushed: 2017-08-31 14:14:12.0

Homepage: null

Size: 176

Language: Python

README

Standalone scripts

Repository to store standalone scripts that do not belong to any bigger package or repository.

compute_undet_index_stats.py

Used to fetch stats about undetermined indexes. This script queries the statusdb x_flowcell_db database and fetches information about runs. The following operations are supported:

Usage

Examples:

runs_per_week.sh

Run on Irma; prints three columns:

Usage

Example: runs_per_week.sh

compute_production_stats.py

This script queries the statusdb x_flowcelldb and project databases and fetches information useful for plotting trends and aggregated data. It can be run in three modes:

     - production-stats: for each instrument type it prints number of FCs, number of lanes, etc. It then prints a summary of all stats
     - instrument-usage: for each instrument type and year it prints different run set-ups and samples run with that set-up
     - year-stats: cumulative data production by month
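
As an illustration of what "cumulative data production by month" means for the year-stats mode, here is a minimal Python 3 sketch. It is not taken from compute_production_stats.py: the function name cumulative_by_month and the (date, yield) record layout are assumptions made for the example, and the numbers are invented.

```python
from collections import defaultdict
from itertools import accumulate


def cumulative_by_month(runs):
    """Return [(YYYY-MM, running_total)] from (ISO date string, amount) pairs.

    Hypothetical helper: the real script reads its records from statusdb.
    """
    per_month = defaultdict(float)
    for run_date, amount in runs:
        per_month[run_date[:7]] += amount  # "2016-09-01" -> "2016-09"
    months = sorted(per_month)
    running_totals = accumulate(per_month[month] for month in months)
    return list(zip(months, running_totals))


# Invented run records: (run date, yield in Gbases)
runs = [("2016-01-12", 100.0), ("2016-01-30", 50.0), ("2016-03-02", 25.0)]
for month, total in cumulative_by_month(runs):
    print(month, total)
```

Months without any run are simply absent from the output in this sketch; whether the real script fills them in is not stated here.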

Usage

Example: compute_production_stats.py --config couchdb.yaml --mode year-stats

Usage: compute_production_stats.py --config couchdb.yaml

Options:
    --config CONFIG  configuration file

Configuration

Requires a config file to access statusdb

statusdb:
    url: path_to_tool
    username: Username
    password: *********
    port: port_number

backup_zendesk_tickets.py

Used to automatically back up tickets from Zendesk.

Usage

Example: backup_zendesk_tickets.py --config-file ~/config_files/backup_zendesk_tickets.yaml --days 30

Usage: backup_zendesk_tickets.py [OPTIONS]

Options:
    --config-file PATH  Path to the config file  [required]
    --days INTEGER      Since how many days ago to backup tickets
    --help              Show this message and exit.

Dependencies

Configuration

Requires a config file:

url: https://ngisweden.zendesk.com
username: mattias.ormestad@scilifelab.se
token: <ask Mattias to get token>
output_path: /Users/kate/Dropbox/dropbox_work/zendesk/output

backup_github.py

Performs a backup of all the repositories in the user's GitHub account.

Dependencies

couchdb_replication.py

Handles replication of the CouchDB instance.

Dependencies

data_to_ftp.py

Used to transfer data to a user's FTP server while maintaining the directory tree structure. The main intention is to deliver data to users outside Sweden.

db_sync.sh

Script used to completely mirror the Clarity LIMS database from the production server to the staging server.

get_sample_names.py

Prints a list of analyzed samples with user_id and ngi_id

Usage:
get_sample_names.py P1234

index_fixer.py

Takes in a SampleSheet.csv and generates a new one with swapped or reverse-complemented indexes.
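
To make the two operations concrete, here is a small Python 3 sketch of swapping and reverse-complementing a pair of index sequences. It is an illustration only, not the code of index_fixer.py: the function names, the keyword arguments, and the choice to reverse-complement before swapping are all assumptions.

```python
# Translation table for complementing DNA bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(index):
    """Return the reverse complement of an index sequence."""
    return index.translate(_COMPLEMENT)[::-1]


def fix_indexes(index1, index2, swap=False, rc1=False, rc2=False):
    """Hypothetical helper: optionally reverse-complement each index,
    then optionally swap the two (order of operations is an assumption)."""
    if rc1:
        index1 = reverse_complement(index1)
    if rc2:
        index2 = reverse_complement(index2)
    if swap:
        index1, index2 = index2, index1
    return index1, index2


print(reverse_complement("AACCGT"))  # ACGGTT
print(fix_indexes("AACCGT", "GGGTTT", swap=True, rc2=True))
```

Applying such a function to the index columns of every data row would give the rewritten samplesheet; which columns those are depends on the samplesheet layout.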

Dependencies

merge_and_rename_NGI_fastq_files.py

Merges all fastq_files from a sample into one file.
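
A minimal Python 3 sketch of the merging step follows. It is not the script's own code: merge_fastq is a hypothetical helper, and grouping files by sample and renaming them are left out. It relies on the fact that gzip files can be concatenated byte for byte into a valid multi-member gzip file, so the same function works for both plain and gzipped FASTQ as long as the inputs are not mixed.

```python
import shutil


def merge_fastq(input_paths, output_path):
    """Concatenate FASTQ files into output_path, in sorted filename order.

    Hypothetical helper; works on raw bytes, so .fastq and .fastq.gz
    inputs are both fine provided all inputs share the same format.
    """
    with open(output_path, "wb") as merged:
        for path in sorted(input_paths):
            with open(path, "rb") as part:
                shutil.copyfileobj(part, merged)
```

Sorting the input paths keeps the read order reproducible between runs; paired-end data would need R1 and R2 files merged separately and in the same order.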

merge_and_rename_NGI_fastq_files.py path/to/dir/with/inputfiles/ path/to/output/directory

project_status_extended.py

Collects information about the specified project from the filesystem of Irma. Without any arguments, it prints statistics for each sample, such as:

Usage

python project_status_extended.py P4601

To remove headers from the output, use option --skip-header

The script can take additional arguments:

    --sequenced           List of all the sequenced samples
    --resequenced         List of samples that have been sequenced more than
                          once, and flowcells
    --organized           List of all the organized flowcells
    --to-organize         List of all the not-organized flowcells
    --analyzed            List of all the analysed samples
    --to-analyze          List of samples that are ready to be analyzed
    --analysis-failed     List of all the samples with failed analysis
    --under-analysis      List of the samples under analysis
    --under-qc            List of samples under qc. Use for projects
                          without BP
    --incoherent          Project-status but only for samples which have
                          incoherent number of sequenced/organized/analyzed
    --low-coverage        List of analyzed samples with coverage below 28.5X
    --undetermined        List of the samples which use undetermined
    --low-mapping         List of all the samples with mapping below 97 percent
    --flowcells           List of flowcells where each sample has been sequenced

repooler.py

Calculates a decent way to re-pool samples when the number of clusters from each sample doesn't reach the required threshold because of mismeasured concentrations.

Dependencies

quota_log.py

DO NOT USE THIS SCRIPT!

Use taca server_status uppmax instead!

Returns a summary of quota usage in Uppmax

Dependencies

Samplesheet_converter.py

Converts an Illumina samplesheet that contains Chromium 10X indexes for demultiplexing. Headers and lines with ordinary indexes are passed through unchanged. Each line with a Chromium 10X index is expanded into 4 lines, with 1 index per line, and the suffix 'Sx' is added to the end of the sample name.
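
The expansion can be sketched in Python 3 as below. This is an illustration under stated assumptions, not the converter's actual code: the function name, the column names Sample_ID and index, and the "_S1" to "_S4" form of the suffix are guesses, and the four sequences in the example library are placeholders rather than real Chromium 10X index sequences.

```python
def expand_10x_rows(rows, index_library, sample_key="Sample_ID", index_key="index"):
    """Expand rows whose index names a 10X index set into one row per sequence.

    rows: list of dicts, one per samplesheet data line.
    index_library: dict mapping an index-set name to its list of sequences.
    Rows with an ordinary index (not in the library) pass through unchanged.
    """
    expanded = []
    for row in rows:
        sequences = index_library.get(row[index_key])
        if sequences is None:
            expanded.append(row)
            continue
        for number, sequence in enumerate(sequences, start=1):
            new_row = dict(row)
            # Assumed suffix format: <sample>_S<number>
            new_row[sample_key] = "{}_S{}".format(row[sample_key], number)
            new_row[index_key] = sequence
            expanded.append(new_row)
    return expanded


# Placeholder library: these sequences are NOT real 10X indexes.
library = {"SI-GA-A1": ["AAAAAAAA", "CCCCCCCC", "GGGGGGGG", "TTTTTTTT"]}
rows = [
    {"Sample_ID": "P1234_101", "index": "ACGTACGT"},
    {"Sample_ID": "P1234_102", "index": "SI-GA-A1"},
]
for row in expand_10x_rows(rows, library):
    print(row["Sample_ID"], row["index"])
```

In this example the first row is kept as is and the second becomes four rows, P1234_102_S1 through P1234_102_S4, each carrying one of the four sequences.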

Usage

python main.py -i <inputfile> -o <outputfile> -x <indexlibrary>

set_bioinforesponsible.py

Calls up the genologics LIMS directly in order to more quickly set a bioinformatics responsible.

Dependencies

use_undetermined.sh

Creates softlinks of the undetermined files for the specified flowcell and lane so that they can be used in the analysis. To be run on Irma.

Usage

Usage: use_undetermined.sh  <flowcell> <lane> <sample>
Example: use_undetermined.sh 160901_ST-E00214_0087_BH33GHALXX 1 P4601_273

Important

After running the script, don't forget to (re-)ORGANIZE the FLOWCELL. The analysis can then be started.

ZenDesk Attachments Backup

Takes a ZenDesk XML dump backup file and searches for attachment URLs that match specified filename patterns. These are then downloaded to a local directory.

This script should be run manually on tools when the manual ZenDesk backup zip files are saved.
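
The search step can be sketched in Python 3 roughly as follows. This is not the script's own code: the function names are hypothetical, the sketch assumes that tickets.xml sits at the top level of the zip and that the attachment filename is the last path segment of the URL, and the download step is left out.

```python
import fnmatch
import re
import zipfile

# Rough URL matcher: stops at whitespace, angle brackets, and quotes.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def read_tickets_xml(path):
    """Return the text of tickets.xml from a backup zip, or of the XML file itself."""
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as archive:
            return archive.read("tickets.xml").decode("utf-8")
    with open(path, encoding="utf-8") as handle:
        return handle.read()


def find_attachment_urls(xml_text, patterns):
    """Return URLs whose filename matches any shell-style pattern, e.g. "*.pdf"."""
    urls = []
    for url in URL_PATTERN.findall(xml_text):
        # Assumption: the filename is the last path segment, before any query string.
        filename = url.split("?")[0].rsplit("/", 1)[-1].lower()
        if any(fnmatch.fnmatch(filename, pattern) for pattern in patterns):
            urls.append(url)
    return urls
```

A call such as find_attachment_urls(read_tickets_xml("backup.zip"), ["*.pdf", "*.xlsx"]) would then give the list of URLs to download; the file name backup.zip and the patterns are only examples.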

Usage

Run with a typical ZenDesk backup zip file (will look for tickets.xml inside the zip file):

zendesk_attachment_backup.py -i xml-export-yyyy-mm-dd-tttt-xml.zip

Alternatively, run directly on tickets.xml:

zendesk_attachment_backup.py -i ngisweden-yyyymmdd/tickets.xml

Usage

If you're using this on tools for the first time, you'll need to set up conda. By default, tools only has Python 2.6 installed, which is too old to be compatible with this script.

These instructions get you a copy of Python 2.7. You only need to do this once:

  1. Download & install Miniconda
     wget https://repo.continuum.io/miniconda/Miniconda2-latest-Linux-x86_64.sh
     bash Miniconda2-latest-Linux-x86_64.sh
    
  2. Tell the installer to prepend itself to your .bashrc file
  3. Log out and log in again, check that conda is in your path
  4. Create an environment for Python 2.7
     conda create --name tools_py2.7 python pip
    
  5. Add it to your .bashrc file so it always loads
     echo source activate tools_py2.7 >> .bashrc
    

Now that Python 2.7 is installed, the Zendesk attachment backup script should work. You can run it by going to the Zendesk backup directory and running it on any new downloads:

zendesk_attachment_backup.py <latest_backup>.zip

Dependencies

SNIC API UTILS

snic_util.py is a Python wrapper for the SNIC API that covers the most common tasks here at NGI. The subcommands currently available are listed below.

Sub-commands

The script itself is largely self-explanatory; running python snic_util.py -h gives more information on usage.

