Module config #4510

Closed
wants to merge 13 commits into from
24 changes: 12 additions & 12 deletions docs/config.md
@@ -4,20 +4,20 @@

## Configuration file

-When a pipeline script is launched, Nextflow looks for configuration files in multiple locations. Since each configuration file can contain conflicting settings, the sources are ranked to determine which settings are applied. Possible configuration sources, in order of priority:
+When a pipeline script is launched, Nextflow looks for configuration files in multiple locations. Since each configuration file may contain conflicting settings, they are resolved as follows (from highest to lowest priority):

1. Parameters specified on the command line (`--something value`)
2. Parameters provided using the `-params-file` option
-3. Config file specified using the `-c my_config` option
-4. The config file named `nextflow.config` in the current directory
-5. The config file named `nextflow.config` in the workflow project directory
+3. Config file specified using the `-c <config-file>` option
+4. The config file `nextflow.config` in the launch directory
+5. The config file `nextflow.config` in the project directory
6. The config file `$HOME/.nextflow/config`
-7. Values defined within the pipeline script itself (e.g. `main.nf`)
+7. Values defined in the pipeline script (e.g. `main.nf`)

When more than one of these options for specifying configurations are used, they are merged, so that the settings in the first override the same settings appearing in the second, and so on.

:::{tip}
-If you want to ignore any default configuration files and use only a custom one, use `-C <config file>`.
+You can use the `-C <config-file>` option to use a single configuration file and ignore all other files.
:::

### Config syntax
@@ -1343,12 +1343,12 @@ The above configuration snippet sets 2 cpus for the processes annotated with the

#### Selector priority

-When mixing generic process configuration and selectors the following priority rules are applied (from lower to higher):
+Process configuration settings are resolved as follows (from lowest to highest priority):

-1. Process generic configuration.
-2. Process specific directive defined in the workflow script.
-3. `withLabel` selector definition.
-4. `withName` selector definition.
+1. Process configuration settings (without a selector)
+2. Process directives in the process definition
+3. Process `withLabel` selectors
+4. Process `withName` selectors

For example:

@@ -1360,7 +1360,7 @@ process {
}
```

-Using the above configuration snippet, all workflow processes use 4 cpus if not otherwise specified in the workflow script. Moreover processes annotated with the `foo` label use 8 cpus. Finally the process named `bar` uses 32 cpus.
+With the above configuration, all processes will use 4 cpus if not otherwise specified in their process definition. Processes annotated with the `foo` label will use 8 cpus. Any process named `bar` (or imported as `bar`) will use 32 cpus.
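
The diff context elides most of the example that this paragraph describes. A configuration consistent with the description, using the `foo` label and `bar` process name from the text, might look like the following sketch (reconstructed from the prose, not copied from the diff):

```groovy
// nextflow.config
process {
    // generic setting: used by any process without a more specific match
    cpus = 4

    // overrides the generic setting for processes labelled `foo`
    withLabel: foo {
        cpus = 8
    }

    // overrides both of the above for the process named `bar`
    withName: bar {
        cpus = 32
    }
}
```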

(config-report)=

46 changes: 46 additions & 0 deletions docs/module.md
@@ -271,6 +271,52 @@ Those scripts will be made accessible like any other command in the task environ
This feature requires the use of a local or shared file system for the pipeline work directory, or {ref}`wave-page` when using cloud-based executors.
:::

## Module config

:::{versionadded} 23.11.0-edge
:::

Modules can define a config file called `module.config` in the module directory. This config file can be used to apply process configuration to processes that are invoked in the module. Unlike a regular Nextflow configuration file, the module config assumes the {ref}`process config scope <config-process>` and only supports process config settings.

Here is an example module with a module config:

```
<module-dir>
├── main.nf
└── module.config
```

```groovy
// main.nf
process BAR {
}

workflow FOO {
    BAR()
}
```

```groovy
// module.config
withName:'FOO:BAR' {
    ext.args = '--n-iters 1000'
    publishDir = "${params.outdir}/foo_bar"
}
```

In the above example, the module config defines process config settings using the same syntax as a Nextflow config file (including {ref}`process selectors <config-process-selectors>`) but implicitly within the process config scope. The selector `FOO:BAR` matches the process `BAR` invoked by workflow `FOO`.

Furthermore, the selector `BAR` would have also worked in this case, even if `BAR` is used elsewhere in the pipeline, because the module config is only applied to this module. This assurance is the advantage of defining process config in module config files instead of the global pipeline config.
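
As a sketch of the simpler form mentioned above, the module config could use the bare process name as its selector. This assumes the module config behaviour proposed in this PR; the settings are the ones from the earlier example.

```groovy
// module.config (proposed feature; applied only to processes invoked in this module)
withName:'BAR' {
    ext.args = '--n-iters 1000'
    publishDir = "${params.outdir}/foo_bar"
}
```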

Process configuration is resolved as follows (from highest to lowest priority):

1. Command line options (e.g. `-process.*`)
2. Pipeline config files
3. Module config files
4. Process {ref}`directives <process-directives>`

Similarly, if a "caller" module invokes a process in a "callee" module, the "caller" module config will take priority over the "callee" module config. In this way, a module can define a "default" configuration that can be overridden at higher and higher levels, where a process might be called in different contexts that require different config settings.
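
To illustrate the priority order, a pipeline config could override one of the module's defaults. In this sketch the `--n-iters 5000` value is made up for illustration. Under the rules listed above, the pipeline-level `ext.args` would be expected to replace the value from `module.config`, while settings that the pipeline config does not mention would keep their module-level values.

```groovy
// nextflow.config (pipeline config; higher priority than module.config)
process {
    withName:'FOO:BAR' {
        // hypothetical override of the module default '--n-iters 1000'
        ext.args = '--n-iters 5000'
    }
}
```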

## Sharing modules

Modules are designed to be easy to share and re-use across different pipelines, which helps eliminate duplicate work and spread improvements throughout the community. While Nextflow does not provide an explicit mechanism for sharing modules, there are several ways to do it:

unit tests for edits in ExecutionStack ?


@bentsherman for your consideration

@@ -70,6 +70,14 @@ class ExecutionStack {
         ctx instanceof WorkflowDef ? ctx : null
     }

+    static List<WorkflowDef> workflows() {
+        final result = [] as List<WorkflowDef>
+        for( def entry : stack )
+            if( entry instanceof WorkflowDef )
+                result << entry
+        return result
+    }
+
     static void push(ExecutionContext script) {
         stack.push(script)
     }
30 changes: 24 additions & 6 deletions modules/nextflow/src/main/groovy/nextflow/script/IncludeDef.groovy

new unit tests required for IncludeDef?

@@ -16,10 +16,6 @@

 package nextflow.script

-import nextflow.exception.ScriptCompilationException
-import nextflow.plugin.extension.PluginExtensionProvider
-import nextflow.plugin.Plugins
-
 import java.nio.file.NoSuchFileException
 import java.nio.file.Path

@@ -31,7 +27,11 @@ import groovy.transform.PackageScope
 import groovy.util.logging.Slf4j
 import nextflow.NF
 import nextflow.Session
+import nextflow.config.ConfigParser
 import nextflow.exception.IllegalModulePath
+import nextflow.exception.ScriptCompilationException
+import nextflow.plugin.extension.PluginExtensionProvider
+import nextflow.plugin.Plugins
 /**
  * Implements a script inclusion
  *
@@ -136,12 +136,30 @@ class IncludeDef {
     static BaseScript loadModule0(Path path, Map params, Session session) {
         final binding = new ScriptBinding() .setParams(params)

-        // the execution of a library file has as side effect the registration of declared processes
-        new ScriptParser(session)
+        // executing a module script also registers any component definitions
+        final script = new ScriptParser(session)
                 .setModule(true)
                 .setBinding(binding)
                 .runScript(path)
                 .getScript()
+
+        // load module config if it exists
+        final configPath = path.parent.resolve('module.config')
+        if( configPath.exists() ) {
+            final config = new ConfigParser()
+                    .setParams(params)
+                    .parse(configPath)
+                    .toMap()
+
+            // remove any params inserted by the config parser
+            for( def value : config.values() )
+                if( value instanceof Map )
+                    value.remove('params')
+
+            ScriptMeta.get(script).setConfig(config)
+        }
+
+        return script
     }

     @PackageScope
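
The cleanup loop in `loadModule0` can be tried in isolation on plain maps. In this sketch the config contents, including the nested `params` entry, are invented for illustration; the actual structure produced by `ConfigParser` is not shown in the diff.

```groovy
// Hypothetical parsed module config. The nested 'params' entry stands in for
// whatever the config parser may insert (its exact shape is an assumption).
def config = [
    'withName:BAR': [cpus: 2, params: [outdir: 'results']],
    cpus: 1
]

// Same loop as in loadModule0: only Map values are touched,
// scalar settings such as `cpus` are left alone
for( def value : config.values() )
    if( value instanceof Map )
        value.remove('params')

assert config == ['withName:BAR': [cpus: 2], cpus: 1]
```

Note that the loop only looks one level deep, so a `params` entry nested further down would be left in place.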
@@ -113,7 +113,20 @@ class ProcessDef extends BindableDef implements IterableDef, ChainableDef {
             throw new ScriptRuntimeException("Missing script in the specified process block -- make sure it terminates with the script string to be executed")

         // apply config settings to the process
-        processConfig.applyConfig((Map)session.config.process, baseName, simpleName, processName)
+        def configs = [] as List<Map>
+
+        // -- process module config
+        configs << ScriptMeta.get(owner).getConfig()
+
+        // -- workflow module configs
+        for( def workflow : ExecutionStack.workflows() )
+            configs << ScriptMeta.get(workflow.getOwner()).getConfig()
+
+        // -- session config
+        configs << (Map)session.config.process
+
+        for( def config : configs )
+            processConfig.applyConfig(config, baseName, simpleName, processName)
     }

     @Override
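
The effect of applying the configs in this order can be sketched with plain maps. Here `applyAll` is a hypothetical stand-in for `ProcessConfig.applyConfig`: it ignores selectors and only models the fact that a later config replaces values set by an earlier one. The config values are invented for illustration.

```groovy
// Hypothetical stand-in for ProcessConfig.applyConfig: it ignores selectors
// and simply copies each setting, so a later config replaces earlier values.
Map applyAll(List<Map> configs) {
    def resolved = [:]
    for( def config : configs )
        resolved.putAll(config)
    return resolved
}

// Illustrative values only; none of these come from the PR
def processModuleConfig = [cpus: 2, memory: '1 GB']   // module that defines the process
def callerModuleConfig  = [cpus: 4]                    // module of a calling workflow
def sessionConfig       = [memory: '8 GB']             // pipeline config (session.config.process)

// Same order as the loop in ProcessDef: module configs first, session config last
def resolved = applyAll([processModuleConfig, callerModuleConfig, sessionConfig])

assert resolved.cpus == 4          // the caller module overrides the process's own module
assert resolved.memory == '8 GB'   // the pipeline config overrides both
```

Because the session config is applied last, it wins over every module config, which matches the priority list in the `docs/module.md` changes.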
18 changes: 14 additions & 4 deletions modules/nextflow/src/main/groovy/nextflow/script/ScriptMeta.groovy

new unit tests required for ScriptMeta?


@bentsherman for your consideration

@@ -97,6 +97,9 @@ class ScriptMeta {
     /** The module components included in the script */
     private Map<String,ComponentDef> imports = new HashMap<>(10)

+    /** The module config associated with this script */
+    private Map config = new HashMap<>()
+
     /** Whenever it's a module script or the main script */
     private boolean module

@@ -110,6 +113,8 @@ class ScriptMeta {

     boolean isModule() { module }

+    Map getConfig() { config }
+
     ScriptMeta(BaseScript script) {
         this.clazz = script.class
         for( def entry : definedFunctions0(script) ) {
@@ -130,6 +135,11 @@ class ScriptMeta {
         this.module = val
     }

+    @PackageScope
+    void setConfig(Map config) {
+        this.config = config
+    }
+
     private void incFunctionCount(String name) {
         final count = functionsCount.getOrDefault(name, 0)
         functionsCount.put(name, count+1)
@@ -296,11 +306,11 @@ class ScriptMeta {
     void addModule(ScriptMeta script, String name, String alias) {
         assert script
         assert name
-        // include a specific
-        def item = script.getComponent(name)
-        if( !item )
+        // include a specific component
+        def component = script.getComponent(name)
+        if( !component )
             throw new MissingModuleComponentException(script, name)
-        addModule0(item, alias)
+        addModule0(component, alias)
     }

     protected void addModule0(ComponentDef component, String alias=null) {

shall we have unit tests about the composition of configs from multiple sources (ideally, process, module and workflow) ?


@bentsherman for your consideration

@@ -1,5 +1,6 @@
 package nextflow.script

+import nextflow.NF
 import nextflow.Session
 import spock.lang.Specification
 /**
@@ -8,6 +9,10 @@ import spock.lang.Specification
  */
 class ProcessDefTest extends Specification {

+    def setupSpec() {
+        NF.init()
+    }
+
     def 'should clone a process with a new name '() {

         given:
@@ -39,6 +44,7 @@ class ProcessDefTest extends Specification {
     def 'should apply process config' () {
         given:
         def OWNER = Mock(BaseScript)
+        def META = ScriptMeta.register(OWNER)
         def CONFIG = [
                 process:[
                         cpus:2, memory: '1GB',