Merge pull request #10 from nf-core/dev
syncing
jfy133 authored Nov 8, 2019
2 parents 380ba73 + 80d0e6f commit 27eca27
Showing 37 changed files with 510 additions and 218 deletions.
7 changes: 3 additions & 4 deletions .github/CONTRIBUTING.md
@@ -8,9 +8,8 @@ However, don't be put off by this template - other more general issues and sugge

> If you need help using or modifying nf-core/eager then the best place to ask is on the pipeline channel on [Slack](https://nf-core-invite.herokuapp.com/).


## Contribution workflow

If you'd like to write some code for nf-core/eager, the standard workflow
is as follows:

@@ -24,9 +23,9 @@ is as follows:

If you're not used to this workflow with git, you can start with some [basic docs from GitHub](https://help.github.com/articles/fork-a-repo/) or even their [excellent interactive tutorial](https://try.github.io/).


## Tests
When you create a pull request with changes, [Travis CI](https://travis-ci.org/) will run automatic tests.

When you create a pull request with changes, [Travis CI](https://travis-ci.com/) will run automatic tests.
Typically, pull-requests are only fully reviewed when these tests are passing, though of course we can help out before then.

There are typically two types of tests that run:
4 changes: 0 additions & 4 deletions .github/markdownlint.yml
@@ -1,9 +1,5 @@
# Markdownlint configuration file
default: true,
line-length: false
no-multiple-blanks: 0
blanks-around-headers: false
blanks-around-lists: false
header-increment: false
no-duplicate-header:
siblings_only: true
96 changes: 51 additions & 45 deletions .travis.yml
@@ -40,48 +40,54 @@ env:
script:
# Lint the pipeline code
- nf-core lint ${TRAVIS_BUILD_DIR}
# Run the basic pipeline with the test profile
- nextflow run ${TRAVIS_BUILD_DIR} -profile test,docker --pairedEnd --saveReference
# Test using PMD tools
- nextflow run ${TRAVIS_BUILD_DIR} -profile test,docker --run_pmdtools --pairedEnd
# Run the basic pipeline with single end data (pretending its single end actually)
- nextflow run ${TRAVIS_BUILD_DIR} -profile test,docker --singleEnd --bwa_index results/reference_genome/bwa_index/BWAIndex/Mammoth_MT_Krause.fasta
# Run the basic pipeline with paired end data without collapsing
- nextflow run ${TRAVIS_BUILD_DIR} -profile test,docker --pairedEnd --skip_collapse --saveReference
# Run the basic pipeline with paired end data without trimming CURRENTLY DISABLED UNTIL CORRECT TEST DATA AVALIABLE
#- nextflow run ${TRAVIS_BUILD_DIR} -profile test,docker --pairedEnd --skip_trim --saveReference
# Run the basic pipeline with preserve5p end option
- nextflow run ${TRAVIS_BUILD_DIR} -profile test,docker --pairedEnd --preserve5p
# Run the basic pipeline with preserve5p end option
- nextflow run ${TRAVIS_BUILD_DIR} -profile test,docker --pairedEnd --mergedonly
# Run the basic pipeline with preserve5p end and merged reads only options
- nextflow run ${TRAVIS_BUILD_DIR} -profile test,docker --pairedEnd --preserve5p --mergedonly
# Run the basic pipeline with paired end data without adapterRemoval
- nextflow run ${TRAVIS_BUILD_DIR} -profile test,docker --pairedEnd --skip_adapterremoval --saveReference
# Run the basic pipeline with output unmapped reads as fastq
- nextflow run ${TRAVIS_BUILD_DIR} -profile test,docker --pairedEnd --strip_input_fastq
# Run the same pipeline testing optional step: fastp, complexity
- nextflow run ${TRAVIS_BUILD_DIR} -profile test,docker --pairedEnd --complexity_filter --bwa_index results/reference_genome/bwa_index/BWAIndex/Mammoth_MT_Krause.fasta
# Test BAM Trimming
- nextflow run ${TRAVIS_BUILD_DIR} -profile test,docker --pairedEnd --trim_bam --bwa_index results/reference_genome/bwa_index/BWAIndex/Mammoth_MT_Krause.fasta
# Test running with CircularMapper
- nextflow run ${TRAVIS_BUILD_DIR} -profile test,docker --pairedEnd --mapper 'circularmapper' --circulartarget 'NC_007596.2'
# Test running with BWA Mem
- nextflow run ${TRAVIS_BUILD_DIR} -profile test,docker --pairedEnd --mapper 'bwamem' --bwa_index results/reference_genome/bwa_index/BWAIndex/Mammoth_MT_Krause.fasta
# Test with zipped reference input
- nextflow run ${TRAVIS_BUILD_DIR} -profile test,docker --pairedEnd --fasta 'https://raw.githubusercontent.com/nf-core/test-datasets/eager2/reference/Test.fasta.gz'
# Run the basic pipeline with the bam input profile, skip AdapterRemoval as no --convertBam
- nextflow run ${TRAVIS_BUILD_DIR} -profile testbam,docker --bam --skip_adapterremoval
# Run the basic pipeline with the bam input profile, convert to FASTQC for adapterremoval test
- nextflow run ${TRAVIS_BUILD_DIR} -profile testbam,docker --bam --run_convertbam
# Run the basic pipeline with FastA reference with `fna` extension
- nextflow run ${TRAVIS_BUILD_DIR} -profile test_fna,docker --pairedEnd --saveReference
# Test using pre-computed indices from a separate run beforehand
- nextflow run ${TRAVIS_BUILD_DIR} -profile test_fna,docker --pairedEnd --bwa_index results/reference_genome/bwa_index/BWAIndex/Mammoth_MT_Krause.fna --fasta_index results/reference_genome/fasta_index/Mammoth_MT_Krause.fna.fai --seq_dict results/reference_genome/seq_dict/Mammoth_MT_Krause.dict
# Test running GATK unified genotyper
- nextflow run ${TRAVIS_BUILD_DIR} -profile test_fna,docker --pairedEnd --dedupper 'dedup' --trim_bam --run_pmdtools --run_genotyping --genotyping_tool 'ug' --gatk_out_mode 'EMIT_ALL_SITES' --gatk_ug_genotype_model 'SNP'
# Test running GATK HaplotypeCaller
- nextflow run ${TRAVIS_BUILD_DIR} -profile test_fna,docker --pairedEnd --dedupper 'dedup' --trim_bam --run_pmdtools --run_genotyping --genotyping_tool 'hc' --gatk_out_mode 'EMIT_ALL_SITES' --gatk_hc_emitrefconf 'BP_RESOLUTION'
- nextflow run ${TRAVIS_BUILD_DIR} -profile test_bam,docker --skip_fastqc --skip_adapterremoval --skip_mapping --skip_deduplication --skip_qualimap --singleEnd --run_genotyping --genotyping_tool 'ug' --genotyping_source 'raw'
# Test running GATK UnifiedGenotyper and MultiVCFAnalyzer
- nextflow run ${TRAVIS_BUILD_DIR} -profile test_fna,docker --pairedEnd --dedupper 'dedup' --trim_bam --run_pmdtools --run_genotyping --genotyping_tool 'ug' --gatk_out_mode 'EMIT_ALL_SITES' --gatk_ug_genotype_model 'SNP' --run_multivcfanalyzer
# REFERENCE: Run the basic pipeline with the test profile
- nextflow run ${TRAVIS_BUILD_DIR} -name "eager-basic" -profile test,docker --pairedEnd --saveReference
# REFERENCE: Run the basic pipeline with single end data (pretending it's single end actually) and all prepared index files
- nextflow run ${TRAVIS_BUILD_DIR} -name "eager-singleEnd" -profile test,docker --singleEnd
# REFERENCE: Run the basic pipeline with FastA reference with `fna` extension
- nextflow run ${TRAVIS_BUILD_DIR} -name "eager-fna_ref" -profile test_fna,docker --pairedEnd --saveReference
# REFERENCE: Test using pre-computed indices from a separate run beforehand
- nextflow run ${TRAVIS_BUILD_DIR} -name "eager-preindex_ref" -profile test_fna,docker --pairedEnd --bwa_index results/reference_genome/bwa_index/BWAIndex/Mammoth_MT_Krause.fna --fasta_index results/reference_genome/fasta_index/Mammoth_MT_Krause.fna.fai --seq_dict results/reference_genome/seq_dict/Mammoth_MT_Krause.dict
# REFERENCE: Test with zipped reference input
- nextflow run ${TRAVIS_BUILD_DIR} -name "eager-gz_ref" -profile test,docker --pairedEnd --fasta 'https://github.com/jfy133/test-datasets/raw/eager/reference/Mammoth/Mammoth_MT_Krause.fasta.gz'
# FASTP: Run the same pipeline testing optional step: fastp, complexity
- nextflow run ${TRAVIS_BUILD_DIR} -name "eager-fastp" -profile test,docker --pairedEnd --complexity_filter
# ADAPTERREMOVAL: Run the basic pipeline with paired end data without collapsing
- nextflow run ${TRAVIS_BUILD_DIR} -name "eager-skip_collapse" -profile test,docker --pairedEnd --skip_collapse
# ADAPTERREMOVAL: Run the basic pipeline with paired end data without trimming, but still merge
- nextflow run ${TRAVIS_BUILD_DIR} -name "eager-pretrim" -profile test_pretrim,docker --pairedEnd --skip_trim
# ADAPTERREMOVAL: Run the basic pipeline with paired end data without adapterRemoval
- nextflow run ${TRAVIS_BUILD_DIR} -name "eager-skip_adapterremoval" -profile test,docker --pairedEnd --skip_adapterremoval
# ADAPTERREMOVAL: Run the basic pipeline with preserve5p end option
- nextflow run ${TRAVIS_BUILD_DIR} -name "eager-preserve5p" -profile test,docker --pairedEnd --preserve5p
# ADAPTERREMOVAL: Run the basic pipeline with merged only option
- nextflow run ${TRAVIS_BUILD_DIR} -name "eager-mergedonly" -profile test,docker --pairedEnd --mergedonly
# ADAPTERREMOVAL: Run the basic pipeline with preserve5p end and merged reads only options
- nextflow run ${TRAVIS_BUILD_DIR} -name "eager-preserve5p_mergedonly" -profile test,docker --pairedEnd --preserve5p --mergedonly
# MAPPER_CIRCULARMAPPER: Test running with CircularMapper
- nextflow run ${TRAVIS_BUILD_DIR} -name "eager-circularmapper" -profile test,docker --pairedEnd --mapper 'circularmapper' --circulartarget 'NC_007596.2'
# MAPPER_BWAMEM: Test running with BWA Mem
- nextflow run ${TRAVIS_BUILD_DIR} -name "eager-bwa_mem" -profile test,docker --pairedEnd --mapper 'bwamem'
# STRIP_FASTQ: Run the basic pipeline with output unmapped reads as fastq
- nextflow run ${TRAVIS_BUILD_DIR} -name "eager-stripfastq" -profile test,docker --pairedEnd --strip_input_fastq
# BAM_FILTERING: Run basic mapping pipeline with mapping quality filtering, and unmapped export
- nextflow run ${TRAVIS_BUILD_DIR} -name "eager-unmapped_export" -profile test,docker --pairedEnd --run_bam_filtering --bam_mapping_quality_threshold 37 --bam_discard_umapped --bam_unmapped_type 'fastq'
# GENOTYPING_HC: Test running GATK HaplotypeCaller
- nextflow run ${TRAVIS_BUILD_DIR} -name "eager-haplotypercaller" -profile test_fna,docker --pairedEnd --dedupper 'dedup' --run_genotyping --genotyping_tool 'hc' --gatk_out_mode 'EMIT_ALL_SITES' --gatk_hc_emitrefconf 'BP_RESOLUTION'
# GENOTYPING_FB: Test running FreeBayes
- nextflow run ${TRAVIS_BUILD_DIR} -name "eager-freebayes" -profile test,docker --pairedEnd --dedupper 'dedup' --run_genotyping --genotyping_tool 'freebayes'
# SKIPPING: Test checking all skip steps work i.e. input bam, skipping straight to genotyping
- nextflow run ${TRAVIS_BUILD_DIR} -name "eager-skipping_logic" -profile test_bam,docker --bam --singleEnd --skip_fastqc --skip_adapterremoval --skip_mapping --skip_deduplication --skip_qualimap --skip_preseq --skip_damage_calculation --run_genotyping --genotyping_tool 'freebayes'
# TRIM_BAM/PMD/GENOTYPING_UG/MULTIVCFANALYZER: Test running PMDTools, TrimBam, GATK UnifiedGenotyper and MultiVCFAnalyzer
- nextflow run ${TRAVIS_BUILD_DIR} -name "eager-pmd_trimbam_unifiedgenotyper_multivcfanalyzer" -profile test,docker --pairedEnd --dedupper 'dedup' --run_trim_bam --run_pmdtools --run_genotyping --genotyping_source 'trimmed' --genotyping_tool 'ug' --gatk_out_mode 'EMIT_ALL_SITES' --gatk_ug_genotype_model 'SNP' --run_multivcfanalyzer
# GENOTYPING_UG/PMD/MULTIVCFANALYZER: Test running GATK UnifiedGenotyper and MultiVCFAnalyzer, additional VCFS
- nextflow run ${TRAVIS_BUILD_DIR} -name "eager-multivcfanalyzer_additionalvcfs" -profile test,docker --pairedEnd --dedupper 'dedup' --run_genotyping --genotyping_tool 'ug' --gatk_out_mode 'EMIT_ALL_SITES' --gatk_ug_genotype_model 'SNP' --run_multivcfanalyzer --additional_vcf_files 'https://raw.githubusercontent.com/jfy133/test-datasets/eager/testdata/Mammoth/vcf/JK2772_CATCAGTGAGTAGA_L008_R1_001.fastq.gz.tengrand.fq.combined.fq.mapped_rmdup.bam.unifiedgenotyper.vcf.gz' --write_allele_frequencies
# BAM_INPUT: Run the basic pipeline with the bam input profile, skip AdapterRemoval as no convertBam
- nextflow run ${TRAVIS_BUILD_DIR} -name "eager-baminput_noConvertBam" -profile test_bam,docker --bam --skip_adapterremoval --run_convertbam
# BAM_INPUT: Run the basic pipeline with the bam input profile, convert to FASTQ for adapterremoval test and downstream
- nextflow run ${TRAVIS_BUILD_DIR} -name "eager-baminput_convertbam_basic" -profile test_bam,docker --bam --run_convertbam
# [DISABLED UNTIL BED FILE AVAILABLE - 2h RUN TIME WITHOUT] - SEXDETERMINATION: Run the basic pipeline with the bam input profile, but don't convert BAM, skip everything but sex determination
#- nextflow run ${TRAVIS_BUILD_DIR} -profile test_humanbam,docker --bam --skip_fastqc --skip_adapterremoval --skip_mapping --skip_deduplication --skip_qualimap --singleEnd --run_sexdeterrmine
# [DISABLED UNTIL SMALL HUMAN REFERENCE AVAILABLE - REQUIRES HUMAN FASTA] - NUCLEAR INPUT
#- nextflow run ${TRAVIS_BUILD_DIR} -profile test_humanbam,docker --bam --skip_fastqc --skip_adapterremoval --skip_mapping --skip_deduplication --skip_qualimap --singleEnd --run_nuclearcontamination

6 changes: 5 additions & 1 deletion CHANGELOG.md
@@ -12,14 +12,17 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
* Added Support for automated tests using [GitHub Actions](https://github.com/features/actions)
* [#40](https://github.com/nf-core/eager/issues/40), [#231](https://github.com/nf-core/eager/issues/231) - Added genotyping capability through GATK UnifiedGenotyper (v3.5), GATK HaplotypeCaller (v4.1) and FreeBayes
* Added MultiVCFAnalyzer module
* [#240](https://github.com/nf-core/eager/issues/240) - Added human sex determination module.
* [#240](https://github.com/nf-core/eager/issues/240) - Added human sex determination module
* [#226](https://github.com/nf-core/eager/issues/226) - Added `--preserve5p` function for AdapterRemoval
* [#212](https://github.com/nf-core/eager/issues/212) - Added ability to use only merged reads downstream from AdapterRemoval
* [#265](https://github.com/nf-core/eager/issues/265) - Adjusted full markdown linting in Travis CI
* [#247](https://github.com/nf-core/eager/issues/247) - Added nuclear contamination with angsd

### `Fixed`

* [#227](https://github.com/nf-core/eager/issues/227) - Large re-write of input/output process logic to allow maximum flexibility. Originally to address [#227](https://github.com/nf-core/eager/issues/227), but further expanded
* Fixed Travis-Ci.org to Travis-Ci.com migration issues
* [#266](https://github.com/nf-core/eager/issues/266) - Added sanity checks for input filetypes (i.e. only BAM files can be supplied if `--bam`)

### `Dependencies`

@@ -116,6 +119,7 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
* [#122](https://github.com/nf-core/eager/pull/122) - Add pulling from Dockerhub again

### `Fixed`

* [#110](https://github.com/nf-core/eager/pull/110) - Fix for [MultiQC Missing Second FastQC report](https://github.com/nf-core/eager/issues/107)
* [#112](https://github.com/nf-core/eager/pull/112) - Remove [redundant UDG options](https://github.com/nf-core/eager/issues/89)

6 changes: 3 additions & 3 deletions README.md
@@ -61,7 +61,6 @@ Additional functionality contained by the pipeline currently includes:

nextflow clean -k


NB. You can see an overview of the run in the MultiQC report located at `<OUTPUT_DIR>/MultiQC/multiqc_report.html`

Modifications to the default pipeline are easily made using various options
@@ -96,6 +95,7 @@ This pipeline was written by Alexander Peltzer ([apeltzer](https://github.com/ap
* [Maxime Garcia](https://github.com/MaxUlysse)
* [Luc Venturini](https://github.com/lucventurini)
* [Hester van Schalkwyk](https://github.com/hesterjvs)
* [Thiseas C. Lamnidis](https://github.com/TCLamnidis)

If you've contributed and you're missing in here, please let me know and I'll add you in.

@@ -113,8 +113,8 @@ If you've contributed and you're missing in here, please let me know and I'll ad
* **MultiQC** Ewels, P., Magnusson, M., Lundin, S., & Käller, M. (2016). MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics , 32(19), 3047–3048. [https://doi.org/10.1093/bioinformatics/btw354](https://doi.org/10.1093/bioinformatics/btw354) Download: [https://multiqc.info/](https://multiqc.info/)
* **BamUtils** Jun, G., Wing, M. K., Abecasis, G. R., & Kang, H. M. (2015). An efficient and scalable analysis framework for variant extraction and refinement from population-scale DNA sequence data. Genome Research, 25(6), 918–925. [https://doi.org/10.1101/gr.176552.114](https://doi.org/10.1101/gr.176552.114) Download: [https://genome.sph.umich.edu/wiki/BamUtil](https://genome.sph.umich.edu/wiki/BamUtil)
* **FastP** Chen, S., Zhou, Y., Chen, Y., & Gu, J. (2018). fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics , 34(17), i884–i890. [https://doi.org/10.1093/bioinformatics/bty560](https://doi.org/10.1093/bioinformatics/bty560) Download: [https://github.com/OpenGene/fastp](https://github.com/OpenGene/fastp)
* **GATK 3.8** DePristo, M. A., Banks, E., Poplin, R., Garimella, K. V., Maguire, J. R., Hartl, C., … Daly, M. J. (2011). A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nature Genetics, 43(5), 491–498. [https://doi.org/10.1038/ng.806](https://doi.org/10.1038/ng.806.) [Download](https://software.broadinstitute.org/gatk/download/)
* **GATK 3.5** DePristo, M. A., Banks, E., Poplin, R., Garimella, K. V., Maguire, J. R., Hartl, C., … Daly, M. J. (2011). A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nature Genetics, 43(5), 491–498. [https://doi.org/10.1038/ng.806](https://doi.org/10.1038/ng.806) [Download](https://software.broadinstitute.org/gatk/download/)
* **GATK 4.X** - no citation available yet
* **MultiVCFAnalyzer** Bos, K.I. et al., (2014). Pre-Columbian mycobacterial genomes reveal seals as a source of New World human tuberculosis. Nature, 514(7523), pp.494–497. Available at: [http://dx.doi.org/10.1038/nature13591](http://dx.doi.org/10.1038/nature13591). Download: [https://github.com/alexherbig/MultiVCFAnalyzer](https://github.com/alexherbig/MultiVCFAnalyzer)
* **Sex.DetERRmine.py** Lamnidis, T.C. et al., 2018. Ancient Fennoscandian genomes reveal origin and spread of Siberian ancestry in Europe. Nature communications, 9(1), p.5018. Available at: [http://dx.doi.org/10.1038/s41467-018-07483-5](http://dx.doi.org/10.1038/s41467-018-07483-5). Download: [https://github.com/TCLamnidis/Sex.DetERRmine.git](https://github.com/TCLamnidis/Sex.DetERRmine.git).

* **ANGSD** Korneliussen, T.S., Albrechtsen, A. & Nielsen, R., 2014. ANGSD: Analysis of Next Generation Sequencing Data. BMC bioinformatics, 15, p.356. Available at: [http://dx.doi.org/10.1186/s12859-014-0356-4](http://dx.doi.org/10.1186/s12859-014-0356-4). Download: [https://github.com/ANGSD/angsd](https://github.com/ANGSD/angsd).
Binary file added assets/angsd_resources/HapMapALL.gz
Binary file not shown.
Binary file added assets/angsd_resources/HapMapChrX.gz
Binary file not shown.