Allow filestore bypass (#3652)
* Turn on 3 more CWL tests I can't get to reproducibly fail

* Make promises say when they are being made and fulfilled

* Fix promise ID reporting and add job generation reporting so we can catch stale reads more easily

* Fix multiline f-string

* Stop thinking the 0th local job also needs to run on Kubernetes

* Fix log levels

* Catch in advance if the worker is trying to use a file we never imported

* Get the workflow to run through by processing the embedded tool's tool

* Move all the imports back to the initial setup where they belong

* Quiet debugging

* Drop extra whitespace

* Make error for an escaped import more decisive

* Apply the longer conformance test timeout

* Catch the new Kubernetes missing config exception in decorator

* Say we have handled all CWL Process objects

* Simplify dispatch

* Recursively list directories in the input and in tools when setting up the workflow

* Make sure to map everything in the listing and the shadow listing exactly once

* Add a bunch of code to try and rewrite listings to stop CWL from looking at them

* Come up with a method to send directory contents around that at least seems to work

* Drop whitespace

* Widen typing to better match what's actually passed

* Remove some no longer useful debug prints

* Just use the base Process class

* Don't clear listings at the ends of jobs

* Make the ToilPathMapper make directories

* Document what visit() is supposed to do

* Don't stage children when we stage parents, and go back to clearing listings

* Steal cwltool's work of making the _: directories

* Make stage_listing actually work

* Remind the output file stager that it doesn't actually have the file store handy and skip exporting anything that's not a real file

* Fix CWL test 87 by making sure Directory listings are in final output

* Reorder job execution so we can rebuild the listing earlier

* Revert "Reorder job execution so we can rebuild the listing earlier"

This reverts commit a1fd2bd.

* Use a single complex pass everywhere to encode directory contents without disturbing listings

* Make sure listings' File and Directory objects get visited if they existed

* Remove whitespace

* Hackily propagate CWL unsupported feature detection back to cwltoil

* fail_exit_code → failure_exit_code

* Skip missing optional secondary files in workflow input

* Accept optional secondary files in tool definitions

* Allow bypassing the file store for in place update support

* Reenable file staging postprocessing even when not using the FileStore; is it correct?

* Test everything with file store bypass

* Stage output files into place from per-job output temp directories

* bump cwltool version

* Filter secondary files always but tolerate file:

* Try adding a toil imported flag for hiding illegitimate secondary files

* Stop running all jobs as top level

* Stop tagging files as imported

* Document --bypass-file-store

* Use the right name for the function

* Fix check sense and assert variable names

* Give a better message when using unsupported features.

* Spell type constraint

* Use toilfile: instead of toilfs:

* Call job generations versions instead

* Improve comments

* Use the One True path_to_loc

* Use more succinct human readable name finding

* Link to CWL issue about the listing specs

* Fix comment I stopped writing because it was wrong

* Drop TODO that doesn't seem to break any conformance tests

* Implement toildir: for output staging

* Note when we think later on is

* Break out CWL utilities and add tests for them

* Get new unit tests to pass

* Revert "Use the One True path_to_loc"

This reverts commit 1feae24.

* Fix lingering toilfs scheme name

* Add download_structure test

* Fix typo

* Pass MyPy type checking

* Drop trailing whitespace

* Import moved function

Co-authored-by: Michael R. Crusoe <[email protected]>
3 people authored Jul 16, 2021
1 parent 70359d1 commit f194cbf
Showing 14 changed files with 1,506 additions and 277 deletions.
12 changes: 12 additions & 0 deletions docs/running/cwl.rst
@@ -118,6 +118,18 @@ samples inputs, it could look something like:

.. literalinclude:: ../../src/toil/test/docs/scripts/tutorial_cwlexample.py

Running CWL workflows with InplaceUpdateRequirement
---------------------------------------------------

Some CWL workflows use the ``InplaceUpdateRequirement`` feature, which requires
that operations on files have visible side effects that Toil's file store
cannot support. If you need to run a workflow like this, you can make sure that
all of your worker nodes have a shared filesystem, and use the
``--bypass-file-store`` option to ``toil-cwl-runner``. This will make it leave
all CWL intermediate files on disk and share them between jobs using file
paths, instead of storing them in the file store and downloading them when jobs
need them.

Toil & CWL Tips
---------------

2 changes: 1 addition & 1 deletion setup.py
@@ -33,7 +33,7 @@ def run_setup():
gcs = 'google-cloud-storage==1.6.0'
gcs_oauth2_boto_plugin = 'gcs_oauth2_boto_plugin==1.14'
apacheLibcloud = 'apache-libcloud==2.2.1'
-cwltool = 'cwltool==3.0.20201203173111'
+cwltool = 'cwltool==3.1.20210616134059'
galaxyToolUtil = 'galaxy-tool-util'
htcondor = 'htcondor>=8.6.0'
kubernetes = 'kubernetes>=12.0.1, <13'
2 changes: 1 addition & 1 deletion src/toil/batchSystems/kubernetes.py
@@ -463,7 +463,7 @@ def issueBatchJob(self, jobDesc):

# Try the job as local
localID = self.handleLocalJob(jobDesc)
-if localID:
+if localID is not None:
# It is a local job
return localID
else:
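
The `kubernetes.py` change corresponds to the "Stop thinking the 0th local job also needs to run on Kubernetes" item in the commit message: a job ID of 0 is falsy in Python, so a plain truthiness check treats it as "not a local job". A minimal sketch of that pitfall follows. It assumes a hypothetical `handle_local_job` stand-in that returns an integer ID for local jobs and `None` otherwise; it is not Toil's actual `handleLocalJob`, and the cutoff of 2 is made up for illustration.

```python
from typing import Optional


def handle_local_job(job_index: int) -> Optional[int]:
    """Hypothetical stand-in: return a local job ID, or None if not local.

    Indices 0 and 1 are treated as local here purely for illustration.
    """
    return job_index if job_index < 2 else None


def is_local_truthy(job_index: int) -> bool:
    """Old-style check: misclassifies job ID 0 because 0 is falsy."""
    return bool(handle_local_job(job_index))


def is_local_not_none(job_index: int) -> bool:
    """New-style check: only None means the job is not local."""
    return handle_local_job(job_index) is not None


if __name__ == "__main__":
    for index in (0, 1, 5):
        print(index, is_local_truthy(index), is_local_not_none(index))
```

With the truthiness check, the first local job (ID 0) falls through to the non-local branch; comparing against `None` avoids that.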