The publishDir process directive (details here) is very useful when you want to organize a subset of your pipeline's output files in a different folder. Even though all the intermediate and final files are in the work directory, you may want to have some of them in a results folder, for example, and publishDir handles this well. If you want some files to go to one path and other files to go somewhere else, according to patterns, that is also easy to do with publishDir, as follows:
process FOO {
    publishDir path: 'results/texts', mode: 'copy', pattern: '*.txt'
    publishDir path: 'results/images', mode: 'copy', pattern: '*.svg'
    ...
}
...
In the snippet above, all the output files from the process FOO that end with .txt will go to the results/texts folder, and all the output files that end with .svg will go to the results/images folder. Great, isn't it? But what if I want text files to go to results/texts and everything else to results/rest? The pattern option only accepts default globbing (not extglob), so you can't negate a globbing expression. However, the saveAs option of the publishDir directive accepts closures (details here), so you can pass a closure that does exactly what we want. The snippet below is a solution to this problem.
process FOO {
    // Route published files by suffix: .txt files go to texts/, the rest to rest/
    publishDir path: 'results', mode: 'copy', saveAs: { filename ->
        filename.endsWith('.txt') ? "texts/$filename" : "rest/$filename" }

    input:
    path ifile

    output:
    path "new${ifile.name}"

    script:
    """
    echo "something else" > new${ifile.name}
    """
}

workflow {
    Channel
        .of(
            file('a.txt'),
            file('b.txt'),
            file('c.jpeg')
        )
        | FOO
}
The snippet above will save the files ending in .txt to results/texts, and everything else will be saved in results/rest. Using the tree command, we can see the final tree structure:
tree results
results/
├── rest
│   └── newc.jpeg
└── texts
    ├── newa.txt
    └── newb.txt

3 directories, 3 files
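
Because saveAs takes an ordinary closure, the same idea can be extended beyond a two-way split. The sketch below is a hypothetical variation of the FOO process, not part of the original example: it assumes a third input file, b.log, exists in the launch directory next to a.txt and c.jpeg, and it relies on the documented behavior that returning null from the saveAs closure means the file is not published at all.

```groovy
process FOO {
    // Three-way routing (illustrative assumption, not the original rule set):
    //   *.txt -> results/texts
    //   *.log -> not published (closure returns null)
    //   other -> results/rest
    publishDir path: 'results', mode: 'copy', saveAs: { filename ->
        if (filename.endsWith('.txt')) return "texts/$filename"
        if (filename.endsWith('.log')) return null
        return "rest/$filename"
    }

    input:
    path ifile

    output:
    path "new${ifile.name}"

    script:
    """
    echo "something else" > new${ifile.name}
    """
}

workflow {
    // b.log is a hypothetical extra input added for this sketch
    Channel.of(file('a.txt'), file('b.log'), file('c.jpeg')) | FOO
}
```

With this variation, the output derived from b.log should remain only in the work directory, while the other outputs are copied under results as before.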