Added offline run flag and profile #29

Merged: 16 commits, May 23, 2024
15 changes: 15 additions & 0 deletions conf/offline.config
@@ -0,0 +1,15 @@
/*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Nextflow config file for offline run.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
*/

params {
config_profile_name = 'Offline'
config_profile_description = 'Settings for offline run'

// Other parameters
offline_run = true
local_databases = true
Member

Don't you also need to provide the paths to the local DBs here?

Collaborator Author

I don't think I can. I have no prior knowledge of the file structure on the user's machine.

Member

As discussed on Slack, I was referring to creating a small version of the databases for running a test in offline mode.

Collaborator Author

Done now, see test_offline.config

skip_downstream = true
}
2 changes: 1 addition & 1 deletion modules/local/fetch_eggnog_group_local.nf
@@ -28,7 +28,7 @@ process FETCH_EGGNOG_GROUP_LOCAL {
prefix = task.ext.prefix ?: meta.id
"""
uniprotid=\$(zcat $idmap | grep \$(cat $uniprot_id) | cut -f2)
- zcat $db | grep \$uniprotid | cut -f 5 | tr ',' '\n' | awk -F'.' '{ print \$2 }' > ${prefix}_eggnog_group_raw.txt
+ zcat $db | grep \$uniprotid | cut -f 5 | tr ',' '\\n' | awk -F'.' '{ print \$2 }' > ${prefix}_eggnog_group_raw.txt
uniprotize_oma_online.py ${prefix}_eggnog_group_raw.txt > ${prefix}_eggnog_group.txt
csv_adorn.py ${prefix}_eggnog_group.txt EggNOG > ${prefix}_eggnog_group.csv
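
A note on the one-character change in this hunk: the script body is a Groovy string, so a single-backslash `\n` is expanded into a literal line break before Bash runs, whereas `\\n` reaches the shell as the two-character sequence `\n` that `tr` interprets itself. The sketch below shows the tail of the pipeline as Bash is meant to receive it. The input value is made up for illustration (hypothetical `taxid.GROUP` entries, not real EggNOG data).

```shell
# Hypothetical content of column 5: comma-separated "taxid.group" entries.
groups='9606.ENOG0001,33208.ENOG0002,2759.ENOG0003'

# Same tail as the process script, after Nextflow has rendered it:
# split the list on commas, then keep the part after the dot.
result=$(printf '%s\n' "$groups" | tr ',' '\n' | awk -F'.' '{ print $2 }')

printf '%s\n' "$result"
```

Each group identifier ends up on its own line, which is the shape `${prefix}_eggnog_group_raw.txt` is expected to have before it is passed on.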

4 changes: 3 additions & 1 deletion modules/local/write_seqinfo.nf
@@ -9,6 +9,7 @@ process WRITE_SEQINFO {

input:
tuple val(meta), val(uniprot_id)
val offline_run

output:
tuple val(meta), path("*_id.txt"), path("*_taxid.txt"), path("*_exact.txt") , emit: seqinfo
@@ -19,10 +20,11 @@ process WRITE_SEQINFO {

script:
prefix = task.ext.prefix ?: meta.id
tax_command = offline_run ? "echo 'UNKNOWN' > ${prefix}_taxid.txt" : "fetch_oma_taxid_by_id.py $uniprot_id > ${prefix}_taxid.txt"
"""
echo "${uniprot_id}" > ${prefix}_id.txt
echo "true" > ${prefix}_exact.txt
- fetch_oma_taxid_by_id.py $uniprot_id > ${prefix}_taxid.txt
+ $tax_command

cat <<- END_VERSIONS > versions.yml
"${task.process}":
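
For reference, the `tax_command` ternary in this hunk is evaluated in Groovy before the script is rendered, so the shell only ever sees one of the two commands. A rough shell equivalent of the resulting behaviour is sketched below. The values of `prefix` and `uniprot_id` are hypothetical stand-ins for what Nextflow would interpolate, and the temporary directory is only there to keep the sketch self-contained.

```shell
# Hypothetical stand-ins for values Nextflow interpolates into the script.
prefix='sample1'
uniprot_id='P12345'
offline_run=true

# Work in a scratch directory so the sketch leaves no files behind elsewhere.
workdir=$(mktemp -d)
cd "$workdir"

echo "$uniprot_id" > "${prefix}_id.txt"
echo "true" > "${prefix}_exact.txt"

if [ "$offline_run" = true ]; then
    # Offline: no lookup is possible, so a placeholder taxid is written.
    echo 'UNKNOWN' > "${prefix}_taxid.txt"
else
    # Online: the pipeline's helper script queries OMA (needs network access).
    fetch_oma_taxid_by_id.py "$uniprot_id" > "${prefix}_taxid.txt"
fi

cat "${prefix}_taxid.txt"
```

Either way all three files exist afterwards, so the `*_id.txt`, `*_taxid.txt` and `*_exact.txt` output globs of the process still match in offline mode.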
2 changes: 2 additions & 0 deletions nextflow.config
@@ -21,6 +21,7 @@ params {

// Ortholog options
use_all = false
offline_run = false
local_databases = false
skip_oma = false
oma_path = null
@@ -202,6 +203,7 @@ profiles {
test { includeConfig 'conf/test.config' }
test_fasta { includeConfig 'conf/test_fasta.config' }
test_full { includeConfig 'conf/test_full.config' }
offline { includeConfig 'conf/offline.config' }
}

// Set default registry for Apptainer, Docker, Podman and Singularity independent of -profile
7 changes: 7 additions & 0 deletions nextflow_schema.json
@@ -63,6 +63,13 @@
"help_text": "If set to `true`, the pipeline will use local databases for the analysis.",
"fa_icon": "fas fa-database"
},
"offline_run": {
"type": "boolean",
"default": "false",
"description": "Run the pipeline in offline mode. Overrides all online database flags.",
"help_text": "If set to `true`, the pipeline will run in offline mode. `local_databases` must be set separately.",
"fa_icon": "fas fa-database"
},
"skip_oma": {
"type": "boolean",
"default": "false",
15 changes: 14 additions & 1 deletion subworkflows/local/get_orthologs.nf
@@ -28,6 +28,14 @@ workflow GET_ORTHOLOGS {
ch_versions = Channel.empty()
ch_orthogroups = Channel.empty()

fasta_input = false
ch_samplesheet_fasta.ifEmpty {
fasta_input = true
}
if (fasta_input && params.offline_run) {
Member

Do we want to throw a warning here instead, making clear that using FASTA input could affect the rate limit if too many files are provided?

error "Offline run is currently not supported with fasta files as input."
}

// Preprocessing - find the ID and taxid of the query sequences
ch_samplesheet_fasta
.map { it -> [it[0], file(it[1])] }
@@ -41,14 +49,19 @@ workflow GET_ORTHOLOGS {
ch_versions = ch_versions.mix(IDENTIFY_SEQ_ONLINE.out.versions)

WRITE_SEQINFO (
- ch_samplesheet_query
+ ch_samplesheet_query,
+ params.offline_run
)

ch_query = IDENTIFY_SEQ_ONLINE.out.seqinfo.mix(WRITE_SEQINFO.out.seqinfo)
ch_versions = ch_versions.mix(WRITE_SEQINFO.out.versions)

// Ortholog fetching

if(params.use_all && params.offline_run) {
itrujnara marked this conversation as resolved.
warning("Trying to use online databases in offline mode. Are you sure?") // TODO: make a warning
}

if(params.use_all) {
// OMA
if (params.local_databases) {