Azure: Params File path cannot be a blob container 'az://' #4904

Open
Takadonet opened this issue Apr 11, 2024 · 5 comments
Comments

@Takadonet
Takadonet commented Apr 11, 2024

Bug report

Expected behavior and actual behavior

I expected to be able to reference a file in an Azure blob container with -params-file 'az://full/path/param.json'.

The actual behavior is a Nextflow error, Missing Nextflow session, which stops the application from running. If the -params-file file is on the local file system, it works as expected.

Steps to reproduce the problem

Program output

Top part of the stack trace. The full nextflow.log is attached.

Apr-11 10:33:09.445 [main] DEBUG nextflow.plugin.BasePlugin - Plugin started [email protected]
Apr-11 10:33:09.468 [main] DEBUG nextflow.file.FileHelper - > Added 'AzFileSystemProvider' to list of installed providers [az]
Apr-11 10:33:09.468 [main] DEBUG nextflow.file.FileHelper - Started plugin 'nf-azure' required to handle file: az://root/params.json
Apr-11 10:33:09.472 [main] DEBUG n.cloud.azure.file.AzPathFactory - Creating Azure path factory
Apr-11 10:33:09.473 [main] ERROR nextflow.cli.Launcher - @unknown
java.lang.IllegalStateException: Missing Nextflow session
        at nextflow.cloud.azure.config.AzConfig.getConfig(AzConfig.groovy:66)
        at nextflow.cloud.azure.config.AzConfig.getConfig(AzConfig.groovy:72)
        at nextflow.cloud.azure.file.AzPathFactory.parseUri(AzPathFactory.groovy:51)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:567)
        at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:107)
        at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:323)
        at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1254)
        at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1030)
        at org.codehaus.groovy.runtime.InvokerHelper.invokePogoMethod(InvokerHelper.java:1036)
        at org.codehaus.groovy.runtime.InvokerHelper.invokeMethod(InvokerHelper.java:1019)
        at org.codehaus.groovy.runtime.InvokerHelper.invokeMethodSafe(InvokerHelper.java:97)
        at nextflow.file.FileSystemPathFactory$_parse_closure1.doCall(FileSystemPathFactory.groovy:76)
        at nextflow.file.FileSystemPathFactory$_parse_closure1.call(FileSystemPathFactory.groovy)
        at nextflow.file.FileSystemPathFactory.lookup0(FileSystemPathFactory.groovy:104)
        at nextflow.file.FileSystemPathFactory.parse(FileSystemPathFactory.groovy:76)
        at nextflow.file.FileHelper.asPath0(FileHelper.groovy:309)
        at nextflow.file.FileHelper.asPath(FileHelper.groovy:297)
        at nextflow.cli.CmdRun.validateParamsFile(CmdRun.groovy:641)
        at nextflow.cli.CmdRun.memoizedMethodPriv$parsedParamsMap(CmdRun.groovy:574)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

nextflow.log

Environment

  • Nextflow version: 23.10.1.5891
  • Java version: [?]
  • Operating system: Ubuntu 20.04.6
  • Bash version: 5.0.17

Additional context

Command being executed: nextflow -log nextflow.log -c azure_batch.config run 'https://github.com/DarianHole/test-nextflow' -w 'az://root/workdir/' --outdir 'az://root/outputs/' -params-file 'az://root/params.json'

@bentsherman
Member

Related to #4494

Thanks for the triage. I was stumped by the previous ticket, but the session not being created yet could explain it.

@Takadonet
Author

Takadonet commented Apr 11, 2024

My co-workers have indicated that using -params-file with S3 buckets on AWS works, so perhaps the order in which the session is created or becomes available is the issue. Looking at the AWS plugin might give hints as to why it works there and not in the Azure plugin.

@bentsherman
Member

Indeed, the problem is that the config is loaded before the session is created, but loading the config also loads the params file in order to apply the params. The S3 and AZ filesystems in Nextflow depend on some config settings to resolve paths with the necessary credentials, so there is a circular dependency here.

The discussions in #2723 and #4669 are relevant here. Separating the params definition from the config file might help resolve this circular dependency. If the config file can be loaded first and the params resolved afterwards, the params file could be a remote file and rely on the config to retrieve remote paths. This works as long as the relevant config settings do not themselves depend on params, which I don't think is typically done.
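
To make the ordering problem concrete, here is a toy Python model of it. It is only a sketch: the `Runtime` class, the two `launch_*` functions, and the `MissingSession` exception are hypothetical names invented for illustration and do not correspond to Nextflow's actual classes. It assumes, as described above, that resolving an `az://` path needs settings that only exist once the session has been created.

```python
# Toy model only: every name here is hypothetical, not Nextflow's code.

class MissingSession(RuntimeError):
    """Raised when a remote path is resolved before the session exists."""


class Runtime:
    def __init__(self):
        # In this model the session is created only after config loading.
        self.session = None

    def resolve(self, path):
        # Local paths need no credentials; az:// paths need session settings.
        if path.startswith("az://"):
            if self.session is None:
                raise MissingSession("Missing Nextflow session")
            return f"remote:{path}"
        return f"local:{path}"


def launch_current_order(params_file):
    """Params file is resolved while the config loads, before the session."""
    rt = Runtime()
    resolved = rt.resolve(params_file)
    rt.session = {"config": "loaded"}
    return resolved


def launch_config_first(params_file):
    """Config (without params) is loaded first, then the params file."""
    rt = Runtime()
    rt.session = {"config": "loaded"}
    return rt.resolve(params_file)
```

In this model a local params file resolves in either order, while an `az://` params file raises `MissingSession` in the first order and resolves in the second, which mirrors the behavior reported in this issue.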

@Takadonet
Author

Based on those discussions, it appears that a quick fix is not available. We will make our own temporary workaround by writing the params file to the local file system or attaching a volume to the container instance.

Would it be safe to say that you are leaning towards supporting remote files for -params-file in the future?

@bentsherman
Member

It's an interesting question. Of course some files simply can't be remote, such as the Nextflow log and config files, because they are used before the config settings are available to authenticate with remote storage. The params file sits in a grey area where it might be possible if we can get the dependencies right.

To be honest, it's not a critical factor in the design of config / params. If we can accommodate it, or if it helps us narrow down some design choices, I'll try to support it. But I doubt we will hang the entire design on whether or not the params file can be remote. I think it's relatively easy to stage the params file locally beforehand.
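
For anyone landing here, a minimal sketch of staging the params file locally before launching might look like the following. It assumes the Azure CLI (`az`) is installed and already authenticated, and that the storage account name is known; the helper names `split_az_uri`, `download_command`, and `stage_params_file` and the account name `mystorageaccount` are hypothetical, not part of Nextflow or nf-azure.

```python
import subprocess
from urllib.parse import urlparse


def split_az_uri(uri):
    """Split 'az://container/path/to/blob' into (container, blob_name)."""
    parsed = urlparse(uri)
    blob = parsed.path.lstrip("/")
    if parsed.scheme != "az" or not parsed.netloc or not blob:
        raise ValueError(f"not an az://container/blob URI: {uri}")
    return parsed.netloc, blob


def download_command(uri, local_path, account_name):
    """Build the Azure CLI command that copies the blob to local_path."""
    container, blob = split_az_uri(uri)
    return [
        "az", "storage", "blob", "download",
        "--account-name", account_name,  # assumption: supplied by the caller
        "--container-name", container,
        "--name", blob,
        "--file", local_path,
    ]


def stage_params_file(uri, local_path, account_name):
    """Download the params file; raises CalledProcessError on failure."""
    subprocess.run(download_command(uri, local_path, account_name), check=True)
    return local_path


if __name__ == "__main__":
    # 'mystorageaccount' is a placeholder account name.
    print(stage_params_file("az://root/params.json", "params.json",
                            "mystorageaccount"))
```

The staged file can then be passed as a local path, for example -params-file params.json, while -w and --outdir keep pointing at az:// locations as in the original command.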
