Update: Include sample_name IRIDA-Next input column #26
Tested it in IRIDA-Next locally. Looks good!
An issue that has arisen due to these changes is that when we modify the |
Just the one change in the CHANGELOG 😄
Great job Steven!
CHANGELOG.md
Outdated
- Modified the template for input csv file to include a `sample_name` column in addition to `sample` in-line with changes to [IRIDA-Next update] as seen with the [speciesabundance pipeline]
- `sample_name` special characters will be replaced with `"_"`
- If no `sample_name` is supplied in the column `sample` will be used
- To avoid repeat values for `sample_name` all `sample_name` values will be suffixed with the index of the `input` samplesheet.csv
Just checking on this - is the plan to use the index or append `sample_name`?
There is always one place in documentation where I forget to update things to the newest version! Thanks for catching this!
`sample` is a unique identifier, designed to be used internally or in IRIDA-Next, or when `sample_name` is not provided.

`sample_name` allows more flexibility in naming output files or identifying samples. Unlike `sample`, `sample_name` is not required to contain unique values. Nextflow requires unique sample names, and therefore, where `sample_name` values repeat, `sample` will be suffixed to any `sample_name`. Non-alphanumeric characters (excluding `_`, `-`, `.`) will be replaced with `"_"`.
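The naming rules described above can be sketched in a few lines. This is a minimal illustration in Python, not the pipeline's Groovy code: the function names `clean_name` and `resolve_ids` are hypothetical, the `_` joiner between `sample_name` and `sample` is an assumption, and so is the choice to suffix only the names that actually repeat.

```python
import re
from collections import Counter


def clean_name(name: str) -> str:
    """Replace every character other than ASCII letters, digits, '_', '-', '.' with '_'."""
    return re.sub(r"[^A-Za-z0-9_.\-]", "_", name)


def resolve_ids(rows):
    """rows: list of (sample, sample_name) pairs; returns one output ID per row."""
    # Fall back to `sample` when `sample_name` is empty or missing.
    names = [clean_name(name) if name else sample for sample, name in rows]
    counts = Counter(names)
    # Assumption: only repeated names get the `sample` suffix, joined with "_".
    return [
        f"{name}_{sample}" if counts[name] > 1 else name
        for (sample, _), name in zip(rows, names)
    ]


rows = [
    ("S1", "isolate A"),
    ("S2", "isolate A"),
    ("S3", ""),
    ("S4", "ref#1.fa"),
]
print(resolve_ids(rows))
# ['isolate_A_S1', 'isolate_A_S2', 'S3', 'ref_1.fa']
```

Because `sample` is unique, suffixing it always yields unique IDs, which a row index would also do; the sketch follows the wording of the usage text here rather than the CHANGELOG entry.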
I think this description was much needed!!!
Just one comment on how it slightly differs from the CHANGELOG.md, where index was suggested as the suffix should there be repeat `sample_names` 😄
@@ -243,7 +261,8 @@ def select_reference(refgenome, reference_sample_id, sample_assemblies) {
         log.debug "Selecting reference genome ${reference_genome} from '--refgenome'."
     }
     else if (reference_sample_id) {
-        reference_genome = sample_assemblies.filter { it[0] == reference_sample_id && it[1] != null}
+        // Check each meta category (meta.id, meta.id_alt, meta.irida_id) for a match to params.reference_sample_id
+        reference_genome = sample_assemblies.filter { (it[0].id == reference_sample_id || it[0].irida_id == reference_sample_id || it[0].id_alt == reference_sample_id) && it[1] != null}
             .ifEmpty { error("The provided reference sample ID (${reference_sample_id}) is either missing or has no associated reference assembly.") }
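For readers less familiar with Nextflow channels, the new matching rule can be mirrored on plain lists. The sketch below is Python rather than the pipeline's Groovy, and it only approximates the channel `filter`/`ifEmpty` behaviour; the sample metadata values are made up for illustration.

```python
def select_reference(reference_sample_id, sample_assemblies):
    """sample_assemblies: list of (meta, assembly) pairs, where meta is a dict.

    Keeps pairs whose meta matches the requested ID under any of the three
    keys and whose assembly is present; raises if nothing matches.
    """
    if not reference_sample_id:
        raise ValueError("No reference sample ID was provided.")
    keys = ("id", "irida_id", "id_alt")
    matches = [
        (meta, assembly)
        for meta, assembly in sample_assemblies
        if assembly is not None
        and any(meta.get(key) == reference_sample_id for key in keys)
    ]
    if not matches:
        raise ValueError(
            f"The provided reference sample ID ({reference_sample_id}) is either "
            "missing or has no associated reference assembly."
        )
    return matches


# Hypothetical metadata, for illustration only.
samples = [
    ({"id": "sampleA_S1", "irida_id": "S1", "id_alt": "sampleA"}, "S1.fasta"),
    ({"id": "sampleB_S2", "irida_id": "S2", "id_alt": "sampleB"}, None),
]
print(select_reference("sampleA", samples)[0][1])  # S1.fasta
```

Checking all three keys means the reference can be named by whichever identifier the user sees, while a sample with no assembly (such as `sampleB` here) still triggers the error.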
so good!!!
Thanks so much for implementing this Steven, everyone else for their comments 😄
I ran this in IRIDA Next and it all works for me. No other comments.
This looks great Steven!
Added in a change suggested by @emarinier to remove the necessity of adding in `meta.id_alt` to the schema.
Rather than troubleshooting why the change without |
Modified the template for the input `samplesheet.csv` file to include the `sample_name` column in addition to `sample`, in line with the IRIDA-Next update as seen with the speciesabundance pipeline and staramrnf. What this means is that the output files and the `sample` name will be changed to `sample_name` if a `sample_name` is provided. If `snvphylnfc` is being run locally then the `sample_name` can be left blank.

Made a few changes:
- `sample_name` special characters will be replaced with `"_"`
- If no `sample_name` is supplied in the column, `sample` will be used
- To avoid repeat values for `sample_name`, all `sample_name` values will be suffixed with `sample`
- Tests to check that the variety of different `sample_names` work with the

PR checklist
- `nf-core lint`
- `nextflow run . -profile test,docker --outdir <OUTDIR>`
- `docs/usage.md` is updated.
- `docs/output.md` is updated.
- `CHANGELOG.md` is updated.
- `README.md` is updated (including new tool citations and authors/contributors).