Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update documentation to include db_type #535

Merged
merged 13 commits into from
Oct 1, 2024

Conversation

sofstam
Copy link
Collaborator

@sofstam sofstam commented Sep 26, 2024

This PR updates the documentation to describe the db_type column that has been added in database samplesheet.

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
  • If necessary, also make a PR on the nf-core/taxprofiler branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

@sofstam sofstam requested review from jfy133 and a team September 26, 2024 08:23
Copy link

github-actions bot commented Sep 26, 2024

nf-core lint overall result: Passed ✅ ⚠️

Posted for pipeline commit 80cdb0c

+| ✅ 267 tests passed       |+
!| ❗   2 tests had warnings |!

❗ Test warnings:

  • nextflow_config - Config manifest.version should end in dev: 1.2
  • pipeline_todos - TODO string in ci.yml: You can customise CI pipeline run tests as required

✅ Tests passed:

Run details

  • nf-core/tools version 2.14.1
  • Run at 2024-09-26 11:52:09

Copy link
Member

@jfy133 jfy133 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

usage.md also needs updating I think?

docs/usage/tutorials.md Outdated Show resolved Hide resolved
docs/usage/tutorials.md Show resolved Hide resolved
@sofstam sofstam requested a review from jfy133 September 26, 2024 09:15
Copy link
Member

@jfy133 jfy133 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For usage md missing four -> four (or five) on line 119, and explaining you can specify that this allows you to use specific database/parameters against speific data types

@jfy133
Copy link
Member

jfy133 commented Sep 26, 2024

Oh and in the table itself

@sofstam sofstam requested a review from jfy133 September 26, 2024 09:47
docs/usage.md Outdated Show resolved Hide resolved
motus,db_mOTU,,/<path>/<to>/motus/motus_database/
ganon,db1,,/<path>/<to>/ganon/test-db-ganon.tar.gz
kmcp,db1,;-I 20,/<path>/<to>/kmcp/test-db-kmcp.tar.gz
tool,db_name,db_params,db_type,db_path
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Myabe give a second example without the db_type column

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added this for malt.

docs/usage.md Outdated
@@ -157,6 +157,7 @@ Column specifications are as follows:
| `tool` | Taxonomic profiling tool (supported by nf-core/taxprofiler) that the database has been indexed for [required]. Please note that `bracken` also implies running `kraken2` on the same database. |
| `db_name` | A unique name per tool for the particular database [required]. Please note that names need to be unique across both `kraken2` and `bracken` as well, even if re-using the same database. |
| `db_params` | Any parameters of the given taxonomic classifier/profiler that you wish to specify that the taxonomic classifier/profiling tool should use when profiling against this specific database. Can be empty to use taxonomic classifier/profiler defaults. Must not be surrounded by quotes [required]. We generally do not recommend specifying parameters here that turn on/off saving of output files or specifying particular file extensions - this should be already addressed via pipeline parameters. For Bracken databases, must at a minimum contain a `;` separating Kraken2 from Bracken parameters. |
| `db_type` | A column to distinguish between short- and long-read databases. If the column is empty, the pipeline will assume all databases (and their settings specified in `db_params`!) will be applicable for both short and long read data [optional]. |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@LilyAnderssonLee what are the valid values ehre? short long and short;long?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. And the default is short;long

@sofstam sofstam requested a review from jfy133 September 26, 2024 11:00
Copy link
Member

@jfy133 jfy133 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost there! @LilyAnderssonLee can you also review for accuracy?

metaphlan,db1,,short,/<path>/<to>/metaphlan/metaphlan_database/
motus,db_mOTU,,long,/<path>/<to>/motus/motus_database/
ganon,db1,,short,/<path>/<to>/ganon/test-db-ganon.tar.gz
kmcp,db1,;-I 20,short,/<path>/<to>/kmcp/test-db-kmcp.tar.gz
```
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a second csv example block but without the db_type column (essentially the one from before you edited).

Sorry this is what I meant before about having an example without this

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the PR with two example blocks.

docs/usage.md Outdated
kmcp,db1,;-I 20,/<path>/<to>/kmcp/test-db-kmcp.tar.gz
tool,db_name,db_params,db_type,db_path
malt,malt85,-id 85,short,/<path>/<to>/malt/testdb-malt/
malt,malt95,-id 90,,/<path>/<to>/malt/testdb-malt.tar.gz
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if the column is in, it has to be filled (@LilyAnderssonLee do you remember). If you want both you need short;long as befote.

See my comment below about what I had actually meant

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, if the db_type column is included in the database.csv, it should be filled with one of the following values: short, long, or short;long. If the db_type column is missing from the database.csv, it will take the default short;long.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have updated the PR based on your comments.

docs/usage.md Outdated Show resolved Hide resolved
sofstam and others added 2 commits September 26, 2024 13:34
@jfy133 jfy133 merged commit 81f39dd into nf-core:dev Oct 1, 2024
24 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants