Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Params case translation between camelCase and kebab-case doesn't always work as expected #4033

Closed
mhoban opened this issue Jun 16, 2023 · 10 comments · Fixed by #4702
Closed

Comments

@mhoban
Copy link

mhoban commented Jun 16, 2023

Bug report

Expected behavior and actual behavior

Parameters are sometimes incorrectly translated between camelCase and kebab-case. Particularly, when default values of params are defined in nextflow.config, the translation to kebab-case doesn't work as expected, but when they are defined in the main script, they seem to work fine.

Steps to reproduce the problem

Here's a simple script:

#!/usr/bin/env nextflow

params.firstParam = false
params.secondParam = "default value"

workflow {
  println(params)
}

If we run it without any arguments, we can see the default values, with all parameters successfully translated into kebab-case:

$ ./test.nf
N E X T F L O W  ~  version 23.04.2
Launching `./test.nf` [grave_ptolemy] DSL2 - revision: 5375fb2bec
[firstParam:false, first-param:false, secondParam:default value, second-param:default value]

If we run it with options in camelCase or kebab-case, everything seems to work fine:

$ ./test.nf --firstParam --secondParam 'new value'
N E X T F L O W  ~  version 23.04.2
Launching `./test.nf` [pedantic_easley] DSL2 - revision: 5375fb2bec
[firstParam:true, first-param:true, secondParam:new value, second-param:new value]

$ ./test.nf --first-param --second-param 'new value'
N E X T F L O W  ~  version 23.04.2
Launching `./test.nf` [sharp_noyce] DSL2 - revision: 5375fb2bec
[first-param:true, firstParam:true, second-param:new value, secondParam:new value]

However, when the default values of the parameters are specified in nextflow.config, it behaves differently. In this example, we have the same script minus the definition of the params:

#!/usr/bin/env nextflow
nextflow.enable.dsl=2

workflow {
  println(params)
}

In addition, we have a nextflow.config file where the default parameter values are defined:

params {
  firstParam = false
  secondParam = "default value"
}

When we run it with no options, we see the default values:

$ ./test.nf
N E X T F L O W  ~  version 23.04.2
Launching `./test.nf` [sick_faggin] DSL2 - revision: 5d765fd59e
[firstParam:false, first-param:false, secondParam:default value, second-param:default value]

If we run it with camelCase options, it behaves as expected:

$ ./test.nf --firstParam --secondParam 'new value'
N E X T F L O W  ~  version 23.04.2
Launching `./test.nf` [lonely_mirzakhani] DSL2 - revision: 5d765fd59e
[firstParam:true, first-param:true, secondParam:new value, second-param:new value]

However, when we run it with kebab-case options, the values don't get picked up:

$ ./test.nf --first-param --second-param 'new value'
N E X T F L O W  ~  version 23.04.2
Launching `./test.nf` [fabulous_wescoff] DSL2 - revision: 5d765fd59e
[firstParam:false, first-param:false, secondParam:default value, second-param:default value]

There seems to be a disconnect somewhere in the logic that handles the command-line option case translation. If params are defined in the main script it works fine, but if they're defined in the config file, it doesn't.

Environment

  • Nextflow version: 23.04.2 build 5870
  • Java version: openjdk version "11.0.19"
  • Operating system: Ubuntu 20.04.6 LTS
  • Bash version: 5.0.17(1)-release (x86_64-pc-linux-gnu)

Additional context

nextflow.log here:
nextflow.log

Also, as an aside, the --option=value mode of parameter passing doesn't seem to be supported. This is may be as designed, but it's something I noticed while testing this:

# the following *don't* work, although that may be as designed:
$ ./test.nf --first-param --second-param='new value'
N E X T F L O W  ~  version 23.04.2
Launching `./test.nf` [infallible_yalow] DSL2 - revision: d14c7a6085
[first-param:--second-param=new value, firstParam:--second-param=new value, secondParam:default value, second-param:default value]

mykle at kinilau (screen *flowtest*) in ~/tmp
$ ./test.nf --firstParam --secondParam='new value'
N E X T F L O W  ~  version 23.04.2
Launching `./test.nf` [nauseous_agnesi] DSL2 - revision: d14c7a6085
[firstParam:--secondParam=new value, first-param:--secondParam=new value, secondParam:default value, second-param:default value]
@mhoban
Copy link
Author

mhoban commented Jun 21, 2023

After further investigation, this appears to have more to do with whether the params are initialized like this (where things don't seem to work the right way):

params {
  thing1 = false
  thing2 = "another thing"
}

or this (where they work as expected):

params.thing1 = false
params.thing2 = "another thing"

Copy link

stale bot commented Dec 15, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Dec 15, 2023
@theJasonFan
Copy link

I ran into this issue as well. The intended behavior should be for

params {
    fooBar = false
}

and

params.fooBar

to both create a foo-bar kebab case value right?

(Per https://www.nextflow.io/docs/latest/cli.html#pipeline-parameters).

@stale stale bot removed the stale label Jan 16, 2024
@theJasonFan
Copy link

@pditommaso @marcodelapierre would you be willing mark this issue as a bug? This issue might cause unexpected reproducibility issues. Thanks!

@marcodelapierre
Copy link
Member

@bentsherman on the top of your head, may any of your recent fixes have addressed this behaviour?

@bentsherman
Copy link
Member

Possibly. That fix was released in 23.11.0-edge, so someone should test the example with that version or newer

@mhoban
Copy link
Author

mhoban commented Jan 25, 2024

I just tested it in 23.12.0-edge, and the problem was still there.

@bentsherman
Copy link
Member

This is happening because of the read-only nature of the params map. When a new param is added to the map, if the param key is already present (whether in kebab-case or camelCase), the new param is not added. In your case, the params are added in the following order:

firstParam=false
secondParam=default value
first-param=true
second-param=new value

The first two params from the config are added and aliased. Then the CLI params are ignored because they are already defined. This is because the CLI params and config params are collected before the kebab/camel aliasing is applied.

@pditommaso I recall you were thinking we should just remove the params aliasing altogether. But I think, if we just separate the read-only behavior from the aliasing, so that we can build the params map internally and then make it read-only, that should solve all of these quirks once and for all.

@marcodelapierre
Copy link
Member

What was the original rationale on the params aliasing?

From what I know so far, I would consider removing the aliasing, as it is a source of confusion and subtle errors. This would need to be communicated clearly in the new NF version, as there may be people relying on that.

@bentsherman
Copy link
Member

bentsherman commented Jan 31, 2024

Well, the kebab-case aliasing is nice to have for CLI params, and it's actually pretty simple to implement in isolation, see the linked PR. The complexity comes from the current ParamsMap which tries to implement both aliasing and some read-only behavior, that makes the code very hard to understand and prone to bugs.

I would propose that we actually move the aliasing part to the CLI code, so that it is applied only when merging CLI params with the other params. That's the only place where it's needed as you can't use kebab-case in a config file or script, and it should also make the params map much easier to reason about.

If we remove anything it should be the definition of params in scripts, I think that is what adds a lot of complexity and it isn't even used very much. That is a separate effort, but certainly would be easier to think through if the aliasing is moved elsewhere.

pditommaso added a commit that referenced this issue Sep 30, 2024
…solve #4033)


Signed-off-by: Ben Sherman <[email protected]>
Co-authored-by: Paolo Di Tommaso <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants