==========================================================
Scripts for munging output of NGS pipeline
==========================================================

This project provides a program with a command-line interface for
parsing Next Generation Sequencing data.

.. contents:: Table of Contents

dependencies
============

* Python 2.7.x
* A UNIX-like operating system (Linux, OS X). Not tested on Windows.
* BEDtools (for summarize_assay command only)
* PySam (for count_umi only)
* Numpy
* Natsort
* Pandas
* XlsxWriter
* suds
* xlwt

installation
============

Clone the project from the git repository::

    cd ~/src
    git clone [email protected]:sheenams/munge.git
    cd munge

Installation can then be performed using the install script provided.
It installs to ``/home/genetics`` unless an install path is given::

    sudo ./install_munge /install/path

This script does a clean install.

execution
=========

The ``munge`` script provides the different scripts used to process
data output from the pipeline. Note that for development, it is
convenient to run ``munge`` from within the project directory by
specifying the relative path to the script::

    % ./munge

Commands are constructed as follows. Every command starts with the
name of the script, followed by an "action", followed by a series of
required or optional "arguments". The name of the script, the action,
and the options and their arguments are entered on the command line
separated by spaces. Help text is available for both the ``munge``
script and individual actions using the ``-h`` or ``--help`` options::

    % munge -h
    usage: munge [-h] [-V] [-v] [-q]
                 {help,xlsmaker,rename_hiseq,sample_crawler,}...

    Utilities for the munge scripts

    positional arguments:
      {help,xlsmaker,rename_hiseq,control_parser,variant_crawler,
       freq_creator,rename_miseq,db_annotation,quality_metrics,
       getpfx,combined_cnv,combined_output,annovar_bed_parser,
       qc_variants,combined_pindel,summary}
        help                Detailed help for actions using `help <action>`
        xlsmaker            Create xls workbook from all output files
        rename_hiseq        Rename and compress HiSeq files.
        control_parser      Compare quality control variants to OPX-240 output
                            to check quality of run
        variant_crawler     Create annovar file from Clinical variants csv
        freq_creator        Calculate tallies of variants and write anovar
                            output
        rename_miseq        Rename MiSeq files for pipeline processing
        db_annotation       Create annotation of all variants in db (or only
                            from GATK)
        quality_metrics     Parse picard and CNV output to create quality
                            metrics file
        getpfx              Get prefixes files (PFX.[12].fastq.gz) for running
                            pipeline.
        combined_cnv        Crawl analysis files to create one analysis file
                            with all info
        combined_output     Crawl analysis files to create one analysis file
                            with all info
        annovar_bed_parser  Filter a file of genomic positions given ranges of
                            start positions
        qc_variants         Parse variant files from pipeline, 1000G, and
                            Complete Genomics to create QC Variant file
        combined_pindel     Crawl analysis files to create one analysis file
                            with all info
        summary             Summarize output from Annovar and EVS

    optional arguments:
      -h, --help            show this help message and exit
      -V, --version         Print the version number and exit
      -v, --verbose         Increase verbosity of screen output (eg, -v is
                            verbose, -vv more so)
      -q, --quiet           Suppress output

Help text for an individual action is available by including the name
of the action::

    % munge getpfx -h
    usage: munge getpfx [-h] [-s SEPARATOR] datadir

    Get prefixes files (PFX.[12].fastq.gz) for running pipeline.
    Usage: munge getpfx /path/to/data

    positional arguments:
      datadir               Path to directory containing fastq files.

    optional arguments:
      -h, --help            show this help message and exit
      -s SEPARATOR, --separator SEPARATOR
                            separator for list of prefixes

versions
========

We use abbreviated git SHA hashes to identify the software version::

    % ./munge -V
    0309.004ecac

unit tests
==========

Unit tests are implemented using the ``unittest`` module in the Python
standard library. The ``tests`` subdirectory is itself a Python package
that imports the local version (i.e., the version in the project
directory, not the version installed to the system) of the ``munge``
package. All unit tests can be run like this::

    munge % ./testall
    ........................
    ----------------------------------------------------------------------
    Ran 24 tests in 0.155s

    OK

A single unit test can be run by referring to a specific module, class,
or method within the ``tests`` package using dot notation::

    munge % ./testone tests.test_subcommands.TestQCVariants
    .
    ----------------------------------------------------------------------
    Ran 1 test in 0.004s

    OK
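
The dot notation used to select a single test corresponds to the dotted names that the standard library's ``unittest.TestLoader.loadTestsFromName`` accepts. The sketch below illustrates that resolution only; it is not the contents of ``testone``, and ``TestExample``, ``run_by_name``, and ``demo_tests`` are hypothetical names introduced for the illustration.

```python
import types
import unittest


class TestExample(unittest.TestCase):
    """Hypothetical test case standing in for a class in the tests package."""

    def test_upper(self):
        self.assertEqual("pfx".upper(), "PFX")

    def test_split(self):
        self.assertEqual("a,b".split(","), ["a", "b"])


def run_by_name(dotted_name, module):
    """Load and run whatever tests `dotted_name` resolves to inside `module`."""
    suite = unittest.TestLoader().loadTestsFromName(dotted_name, module)
    return unittest.TextTestRunner(verbosity=0).run(suite)


# Hypothetical stand-in for a module such as tests.test_subcommands.
demo_module = types.ModuleType("demo_tests")
demo_module.TestExample = TestExample

# A class name selects every test method in the class;
# appending a method name narrows the selection to that one test.
whole_class = run_by_name("TestExample", demo_module)
single_method = run_by_name("TestExample.test_upper", demo_module)

print(whole_class.testsRun, single_method.testsRun)  # prints "2 1"
```

Passing a class name runs all of its test methods, while appending a method name runs exactly one, which is consistent with the single test reported for ``tests.test_subcommands.TestQCVariants`` above if that class defines one test method.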