Add optional merging and trimming #142

Merged
merged 56 commits into from
Mar 4, 2019
56 commits
eb59405
Extra clarifications for indices, FastP and general cleanup
jfy133 Feb 1, 2019
df6305f
Merge pull request #136 from nf-core/help_message_improvements
apeltzer Feb 1, 2019
f43f4b0
Further clarification on `--max-cpus`
jfy133 Feb 2, 2019
45ce547
Update docs/usage.md
apeltzer Feb 2, 2019
e039f28
Merge pull request #139 from nf-core/docs_update
apeltzer Feb 2, 2019
91d05cb
add noCollase option
maxibor Feb 11, 2019
758fb7f
fix nf-PE read index
maxibor Feb 11, 2019
e2aaba0
update doc with noCollapse
maxibor Feb 11, 2019
bdd56dc
merge master to dev
maxibor Feb 11, 2019
55743dd
main.nf to dev
maxibor Feb 11, 2019
8ec9d67
fix funky prefix
maxibor Feb 11, 2019
6d864e1
Update docs/usage.md
apeltzer Feb 11, 2019
f3a6ff7
Update docs/usage.md
apeltzer Feb 12, 2019
0e03a00
skip trimming and collapsing
maxibor Feb 12, 2019
757e930
update test
maxibor Feb 12, 2019
949fd28
Address issue with picard memory
apeltzer Feb 19, 2019
435812b
Merge pull request #145 from apeltzer/fix-picard
apeltzer Feb 19, 2019
b311d61
Use CSI indices wherever possible
apeltzer Feb 21, 2019
a1a69a7
Merge remote-tracking branch 'upstream/dev' into fix-samtools-idx
apeltzer Feb 21, 2019
9eb36c7
Add proper changelog
apeltzer Feb 21, 2019
8d26b33
Added a contributor section to README
apeltzer Feb 21, 2019
8e29eff
Update README.md with instructions for test data
evanfloden Feb 21, 2019
8f3c151
Merge pull request #148 from evanfloden/patch-1
apeltzer Feb 21, 2019
abf4642
Add docs on this
apeltzer Feb 21, 2019
8a6dccb
Update new parameter `large_ref`
apeltzer Feb 25, 2019
b6f65b1
Fixing indices hopefully
apeltzer Feb 25, 2019
f9ac1d4
Fixing indexing
apeltzer Feb 25, 2019
7405c8e
Fix for post-dup steps
apeltzer Feb 25, 2019
7e4035d
Nicer changelog [skip ci]
apeltzer Feb 25, 2019
76e0cbd
Its unpublished stuff [skip ci]
apeltzer Feb 25, 2019
a2364b4
Merge pull request #3 from nf-core/dev
jfy133 Feb 25, 2019
e971cac
Made polyG param clearer what it is
jfy133 Feb 25, 2019
0feeb19
Update CHANGELOG.md
jfy133 Feb 25, 2019
29306df
Updated ploy G trim flag
jfy133 Feb 25, 2019
028f7ea
Merge pull request #5 from jfy133/polyg-name-improvements
jfy133 Feb 25, 2019
bc02be2
Update CHANGELOG.md
jfy133 Feb 25, 2019
ffa0085
Merge pull request #4 from jfy133/polyg-name-improvement
jfy133 Feb 25, 2019
7836ab8
Update CHANGELOG.md
jfy133 Feb 25, 2019
6d13545
Added fastP position to MultiQC config
jfy133 Feb 25, 2019
ee3c44f
Merge pull request #147 from apeltzer/fix-samtools-idx
apeltzer Feb 26, 2019
0fb1481
Merge branch 'dev' into fix-post-dedup-steps
apeltzer Feb 26, 2019
fabd1a3
Should fix the remaining issues
apeltzer Feb 26, 2019
24ea4b7
Merge branch 'fix-post-dedup-steps' of https://github.com/apeltzer/ea…
apeltzer Feb 26, 2019
c69261d
Address issues with qualimap / multiqc / multiple samples and reporting
apeltzer Feb 26, 2019
e85b58f
Merge pull request #151 from apeltzer/fix-post-dedup-steps
apeltzer Feb 27, 2019
16fedae
Merge branch 'dev' into dev
apeltzer Feb 27, 2019
71f0329
Merge pull request #152 from jfy133/dev
apeltzer Feb 27, 2019
817b9e0
Adding in publishing dedup log files as well
apeltzer Mar 1, 2019
5a255fd
Merge pull request #157 from apeltzer/publish_dedup
apeltzer Mar 2, 2019
c0251cb
Merge branch 'dev' into dev
apeltzer Mar 4, 2019
3320464
move size out of the if clause
apeltzer Mar 4, 2019
def1b35
match skip_* pattern
maxibor Mar 4, 2019
2bef839
update travis test
maxibor Mar 4, 2019
e62b503
local executor for skipping AR process
maxibor Mar 4, 2019
e807c58
initialize skip_adapterremoval
maxibor Mar 4, 2019
01d056c
fix bam test
maxibor Mar 4, 2019
6 changes: 6 additions & 0 deletions .travis.yml
@@ -40,6 +40,12 @@ script:
- nextflow run ${TRAVIS_BUILD_DIR} -profile test,docker --pairedEnd --saveReference
# Run the basic pipeline with single end data (pretending its single end actually)
- nextflow run ${TRAVIS_BUILD_DIR} -profile test,docker --singleEnd --bwa_index results/reference_genome/bwa_index/bwa_index/
# Run the basic pipeline with paired end data without collapsing
- nextflow run ${TRAVIS_BUILD_DIR} -profile test,docker --pairedEnd --skip_collapse --saveReference
# Run the basic pipeline with paired end data without trimming
- nextflow run ${TRAVIS_BUILD_DIR} -profile test,docker --pairedEnd --skip_trim --saveReference
# Run the basic pipeline with paired end data without adapterRemoval
- nextflow run ${TRAVIS_BUILD_DIR} -profile test,docker --pairedEnd --skip_adapterremoval --saveReference
# Run the same pipeline testing optional step: fastp, complexity
- nextflow run ${TRAVIS_BUILD_DIR} -profile test,docker --pairedEnd --complexity_filter --bwa_index results/reference_genome/bwa_index/bwa_index/
# Test BAM Trimming
11 changes: 11 additions & 0 deletions CHANGELOG.md
@@ -6,6 +6,17 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.

## [Unpublished / Dev Branch]

### `Added`

* [#152](https://github.com/nf-core/eager/pull/152) - Clarified `--complexity_filter` flag to be specifically for poly G trimming.
* [#155](https://github.com/nf-core/eager/pull/155) - Added [Dedup log to output folders](https://github.com/nf-core/eager/issues/154)

### `Fixed`

* [#151](https://github.com/nf-core/eager/pull/151) - Fixed [post-deduplication step errors](https://github.com/nf-core/eager/issues/128)
* [#147](https://github.com/nf-core/eager/pull/147) - Fix Samtools Index for [large references](https://github.com/nf-core/eager/issues/146)
* [#145](https://github.com/nf-core/eager/pull/145) - Added Picard Memory Handling [fix](https://github.com/nf-core/eager/issues/144)

## [2.0.5] - 2019-01-28

### `Added`
26 changes: 23 additions & 3 deletions README.md
@@ -45,20 +45,28 @@ Additional functionality contained by the pipeline currently includes:
## Quick Start

1. Install [`nextflow`](docs/installation.md)

2. Install one of [`docker`](https://docs.docker.com/engine/installation/), [`singularity`](https://www.sylabs.io/guides/3.0/user-guide/) or [`conda`](https://conda.io/miniconda.html)

3. Download the EAGER pipeline

```bash
nextflow pull nf-core/eager
```

4. Set up your job with default parameters
4. Test the pipeline using the provided test data

```bash
nextflow run nf-core -profile <docker/singularity/conda> --reads'*_R{1,2}.fastq.gz' --fasta '<REFERENCE>.fasta'
nextflow run nf-core/eager -profile <docker/singularity/conda>,test --pairedEnd
```

5. See the overview of the run with under `<OUTPUT_DIR>/MultiQC/multiqc_report.html`
5. Start running your own ancient DNA analysis!

```bash
nextflow run nf-core/eager -profile <docker/singularity/conda> --reads '*_R{1,2}.fastq.gz' --fasta '<REFERENCE>.fasta'
```

NB. You can see an overview of the run in the MultiQC report located at `<OUTPUT_DIR>/MultiQC/multiqc_report.html`

Modifications to the default pipeline are easily made using various options
as described in the documentation.
@@ -84,6 +92,18 @@ James Fellows Yates, Raphael Eisenhofer and Judith Neukamm. If you want to
contribute, please open an issue and ask to be added to the project - happy to
do so and everyone is welcome to contribute here!

## Contributors

- [James A. Fellows-Yates](https://github.com/jfy133)
- [Stephen Clayton](https://github.com/sc13-bioinf)
- [Judith Neukamm](https://github.com/JudithNeukamm)
- [Raphael Eisenhofer](https://github.com/EisenRa)
- [Maxime Garcia](https://github.com/MaxUlysse)
- [Luc Venturini](https://github.com/lucventurini)
- [Hester van Schalkwyk](https://github.com/hesterjvs)

If you've contributed and you're missing in here, please let me know and I'll add you in.

## Tool References

* **EAGER v1**, CircularMapper, DeDup* Peltzer, A., Jäger, G., Herbig, A., Seitz, A., Kniep, C., Krause, J., & Nieselt, K. (2016). EAGER: efficient ancient genome reconstruction. Genome Biology, 17(1), 1–14. [https://doi.org/10.1186/s13059-016-0918-z](https://doi.org/10.1186/s13059-016-0918-z) Download: [https://github.com/apeltzer/EAGER-GUI](https://github.com/apeltzer/EAGER-GUI) and [https://github.com/apeltzer/EAGER-CLI](https://github.com/apeltzer/EAGER-CLI)
5 changes: 4 additions & 1 deletion conf/base.config
@@ -31,7 +31,10 @@ process {
withName:convertBam {
cpus = { check_max(8 * task.attempt, 'cpus') }
}

withName:makeSeqDict {
memory = { check_max( 16.GB * task.attempt, 'memory' ) }
}

withName:bwa {
memory = { check_max( 16.GB * task.attempt, 'memory' ) }
cpus = { check_max(8 * task.attempt, 'cpus') }
1 change: 1 addition & 0 deletions conf/multiqc_config.yaml
@@ -9,6 +9,7 @@ top_modules:
- '*_fastqc.zip'
path_filters_exclude:
- '*.combined.prefixed_fastqc.zip'
- 'fastp'
- 'adapterRemoval'
- 'fastqc':
name: 'FastQC (post-AdapterRemoval)'
33 changes: 30 additions & 3 deletions docs/usage.md
@@ -170,6 +170,10 @@ If you prefer, you can specify the full path to your reference genome when you r
```
> If you don't specify appropriate `--bwa_index`, `--fasta_index` parameters, the pipeline will create these indices for you automatically. Note, that saving these for later has to be turned on using `--saveReference`. You may also specify the path to a gzipped (`*.gz` file extension) FastA as reference genome - this will be uncompressed by the pipeline automatically for you. Note that other file extensions such as `.fna`, `.fa` are also supported but will be renamed to `.fasta` automatically by the pipeline.

### `--large_ref`

This parameter must be set for large reference genomes. If your reference genome is larger than 3.5GB, the `samtools index` calls in the pipeline need to generate `CSI` indices instead of `BAI` indices to accommodate the size of the reference genome. This parameter is not required for smaller references (including a human `hg19` or `grch37`/`grch38` reference), but `>4GB` genomes have been shown to need `CSI` indices.
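
As a rough illustration of the rule above, the following sketch decides whether to pass `--large_ref`. The helper names `needs_large_ref` and `large_ref_flag` are hypothetical and not part of the pipeline, and using the uncompressed FASTA file size in bytes as a stand-in for genome size is an assumption; the 3.5GB threshold is the one quoted in the text.

```bash
# Hypothetical helpers, not part of nf-core/eager.
# Threshold of 3.5 GB (decimal) taken from the paragraph above.
LARGE_REF_THRESHOLD_BYTES=3500000000

needs_large_ref() {
    # $1: size of the uncompressed reference FASTA in bytes (assumed proxy
    #     for genome size). Succeeds when the size exceeds the threshold.
    [ "$1" -gt "$LARGE_REF_THRESHOLD_BYTES" ]
}

large_ref_flag() {
    # Prints "--large_ref" for large references and nothing otherwise.
    if needs_large_ref "$1"; then
        echo "--large_ref"
    fi
}

# Usage (reference.fasta is a placeholder path):
#   size=$(wc -c < reference.fasta)
#   nextflow run nf-core/eager -profile docker --reads '*_R{1,2}.fastq.gz' \
#       --fasta reference.fasta $(large_ref_flag "$size")
```

Leaving the command substitution unquoted in the usage line is deliberate: when the helper prints nothing, no empty argument is passed to `nextflow`.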

### `--genome` (using iGenomes)

The pipeline config files come bundled with paths to the illumina iGenomes reference index files. If running with docker or AWS, the configuration is set up to use the [AWS-iGenomes](https://ewels.github.io/AWS-iGenomes/) resource.
@@ -237,7 +241,7 @@ Use to set a top-limit for the default time requirement for each process.
Should be a string in the format integer-unit. eg. `--max_time '2.h'`. If not specified, will be taken from the configuration in the `-profile` flag.

### `--max_cpus`
Use to set a top-limit for the default CPU requirement for each process.
Use to set a top-limit for the default CPU requirement for each **process**. This is not the maximum number of CPUs that can be used for the whole pipeline, but the maximum number of CPUs each program can use for each program submission (known as a process). Do not set this higher than what your workstation or computing node can provide. If you're unsure, ask your local IT administrator for details on compute node capabilities!
Should be an integer, e.g. `--max_cpus 1`. If not specified, will be taken from the configuration in the `-profile` flag.
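
One way to follow the advice above is to cap the requested value at the number of cores the machine reports. This is a minimal sketch: `cap_cpus` is a hypothetical helper, not part of the pipeline, and the usage line assumes GNU coreutils `nproc` is available.

```bash
# Hypothetical helper, not part of nf-core/eager.
cap_cpus() {
    # $1: CPUs you would like each process to be allowed to use
    # $2: CPUs available on the workstation or compute node
    # Prints the smaller of the two values.
    if [ "$1" -gt "$2" ]; then
        echo "$2"
    else
        echo "$1"
    fi
}

# Usage (assumes `nproc` exists on the machine):
#   nextflow run nf-core/eager -profile docker \
#       --reads '*_R{1,2}.fastq.gz' --fasta '<REFERENCE>.fasta' \
#       --max_cpus "$(cap_cpus 8 "$(nproc)")"
```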

### `--email`
@@ -279,12 +283,17 @@ This part of the documentation contains a list of user-adjustable parameters in

## Step skipping parameters

Some of the steps in the pipeline can be executed optionally. If you specify specific steps to be skipped, there won't be any output related to these modules.
Some of the steps in the pipeline can be executed optionally. If you specify specific steps to be skipped, there won't be any output related to these modules.

### `--skip_preseq`

Turns off the computation of library complexity estimation.

### `--skip_adapterremoval`

Turns off adaptor trimming and paired-end read merging.
Equivalent to setting both `--skip_collapse` and `--skip_trim`.

### `--skip_damage_calculation`

Turns off the DamageProfiler module to compute DNA damage profiles.
@@ -299,7 +308,7 @@ Turns off duplicate removal methods DeDup and MarkDuplicates respectively. No du

## Complexity Filtering Options

### `--complexity_filter`
### `--complexity_filter_poly_g`

Performs a poly-G tail removal step at the beginning of the pipeline, if turned on. This can be useful for trimming poly-G tails from short fragments sequenced on two-colour Illumina chemistry such as NextSeqs (where no-fluorescence is read as a G on two-colour chemistry), which can inflate reported GC content values.

@@ -329,6 +338,24 @@ Defines the minimum read quality per base that is required for a base to be kept
### `--clip_min_adap_overlap` 1
Sets the minimum overlap between two reads when read merging is performed. Default is set to `1` base overlap.

### `--skip_collapse`

Turns off the paired-end read merging.

For example
```bash
--pairedEnd --skip_collapse --reads '*.fastq'
```

### `--skip_trim`

Turns off the adaptor and quality trimming.

For example
```bash
--pairedEnd --skip_trim --reads '*.fastq'
```
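
To compare the skip options side by side, the sketch below prints (without running) one test command per flag. The profile and flags mirror the `.travis.yml` additions in this PR; `skip_test_cmd` is a hypothetical helper, and calling the pipeline as `nf-core/eager` rather than `${TRAVIS_BUILD_DIR}` is an assumption.

```bash
# Hypothetical helper, not part of nf-core/eager.
skip_test_cmd() {
    # $1: one of --skip_collapse, --skip_trim, --skip_adapterremoval
    # Prints the command line only; nothing is executed.
    echo "nextflow run nf-core/eager -profile test,docker --pairedEnd $1 --saveReference"
}

# Print one command per skip option for inspection.
for flag in --skip_collapse --skip_trim --skip_adapterremoval; do
    skip_test_cmd "$flag"
done
```

Printing the commands first makes it easy to check the flag spelling before launching a full run.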

## Read Mapping Parameters

## BWA (default)