[WIP] Adding Cloud Pipelines #90
Signed-off-by: Vanessa Sochat <[email protected]>
…eractive code bits to figure out where the entrypoint for an executor run is Signed-off-by: Vanessa Sochat <[email protected]>
…ke into add/google-cloud-pipelines
…alk through and test for basic example Signed-off-by: Vanessa Sochat <[email protected]>
…dir to storage cache, and then the instance can retrieve it. Signed-off-by: Vanessa Sochat <[email protected]>
…ribute Signed-off-by: Vanessa Sochat <[email protected]>
…rom pip and install from this branch Signed-off-by: Vanessa Sochat <[email protected]>
…achine type prefix, ready for draft PR Signed-off-by: Vanessa Sochat <[email protected]>
@johanneskoester this is a little confusing with black, because I installed the exact version that is run in the test suites, and there seems to be a difference in result. Here is what I'm doing locally:
$ black --check snakemake tests/*.py
All done! ✨ 🍰 ✨
51 files would be left unchanged.
$ black --version
black, version 19.3b0
Note that the version is the same as the one installed there. I also tried installing with conda in case it made a difference (it didn't). What am I missing? |
…oogle cloud project on command line (derived from storage client or optional environment to override that) Signed-off-by: Vanessa Sochat <[email protected]>
That's so weird! The black formatting passed this time. I don't think I made any changes to that file. Must be ghosties. |
@vsoch did you select a draft PR? I still have a merge button here and the icon is green instead of grey in the list. |
oops I just totally forgot :/ Merging is still blocked so is it okay? I'll add WIP to the heading so others know. |
Yeah, no problem ;-). |
xref #91 |
…ences Signed-off-by: Vanessa Sochat <[email protected]>
…uggestions for Signed-off-by: vsoch <[email protected]>
@johanneskoester I'll want to ask for your suggestions for how to reduce "cognitive complexity" of functions - I made them modular at a level I thought made sense, and most of the length for the ones flagged is due to really long WorkflowError messages. I'm happy to refactor in any way you think is better. |
Co-authored-by: Johannes Köster <[email protected]>
…ge to be uploaded once Signed-off-by: vsoch <[email protected]>
That issue is resolved now. |
I have looked at them again, and I think they are indeed ok to remain like this. Sorry for the noise. |
GCloud credentials are now in place. I think you should be able to activate your test case here. In line with the kubernetes and tibanna tests, please move your test case into a separate file |
@johanneskoester what is the name of the envar secret - did you just call it |
E.g., this is what I'd guess, let me know if I'm off:
env:
  AWS_AVAILABLE: ${{ secrets.AWS_ACCESS_KEY_ID }}
  GCP_AVAILABLE: ${{ secrets.GCP_SA_KEY }}
  GOOGLE_APPLICATION_CREDENTIALS: ${{ secrets.GOOGLE_APPLICATION_CREDENTIALS }} |
…rrect envar name Signed-off-by: vsoch <[email protected]>
Signed-off-by: vsoch <[email protected]>
Dearest sonarcloud - you've had a change of heart! Because before you said my code was smelly. Maybe it's good smelly now :) |
Reported to already be set without needing specified here. Co-authored-by: Johannes Köster <[email protected]>
Co-authored-by: Johannes Köster <[email protected]>
Kudos, SonarCloud Quality Gate passed! 0 Bugs No Coverage information |
@johanneskoester the test did not run still - I can't see settings/secrets so I'll need your guidance about what needs to be checked. |
You are on a fork. GitHub does not expose secrets to forks. |
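Because secrets are not passed to workflows triggered from a fork, a variable such as `GCP_AVAILABLE` would arrive empty there. A minimal sketch of how a test suite could detect that and skip cloud tests, assuming the variable name from the workflow snippet above; the helper names `gcp_available` and `skip_reason` are hypothetical and not taken from Snakemake:

```python
import os
from typing import Mapping, Optional


def gcp_available(environ: Optional[Mapping[str, str]] = None) -> bool:
    """Return True only when GCP_AVAILABLE holds a non-empty value.

    GCP_AVAILABLE is the name used in the workflow snippet in this thread;
    treating an empty string as "not available" is an assumption.
    """
    env = os.environ if environ is None else environ
    return bool(env.get("GCP_AVAILABLE", "").strip())


def skip_reason(environ: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return a human-readable skip reason, or None when the tests can run."""
    if gcp_available(environ):
        return None
    return "GCP_AVAILABLE is not set (for example, a pull request from a fork)"
```

With pytest, a check like this could feed `pytest.mark.skipif(not gcp_available(), reason=...)` so that the cloud tests are reported as skipped instead of failing.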
Let us go on in PR #384 |
* starting figuring out how to not remove wrappers Signed-off-by: Vanessa Sochat <[email protected]>
* early work to add gcloud executor Signed-off-by: Vanessa Sochat <[email protected]>
* getting basic bucket creation and test upload working, and adding interactive code bits to figure out where the entrypoint for an executor run is Signed-off-by: Vanessa Sochat <[email protected]>
* updating to show debugging Signed-off-by: Vanessa Sochat <[email protected]>
* moving change of working directory to after parsing config files Signed-off-by: Vanessa Sochat <[email protected]>
* removing old cleanup wrappers logic Signed-off-by: Vanessa Sochat <[email protected]>
* removing old cleanup wrappers logic Signed-off-by: Vanessa Sochat <[email protected]>
* work on base skeleton - pipeline requests are created and I need to walk through and test for basic example Signed-off-by: Vanessa Sochat <[email protected]>
* lots of work on executor for Google Life Sciences - we upload the workdir to storage cache, and then the instance can retrieve it. Signed-off-by: Vanessa Sochat <[email protected]>
* release 5.7.4 has a bug looking for a workflow to have a workflow attribute Signed-off-by: Vanessa Sochat <[email protected]>
* more bug fixes to google science pipeline, also want to use version from pip and install from this branch Signed-off-by: Vanessa Sochat <[email protected]>
* adding docker base example to google cloud tests Signed-off-by: Vanessa Sochat <[email protected]>
* omg successful run! ✨ Signed-off-by: Vanessa Sochat <[email protected]>
* restoring to original (working) execution with custom image, adding machine type prefix, ready for draft PR Signed-off-by: Vanessa Sochat <[email protected]>
* fixing formatting errors with black Signed-off-by: Vanessa Sochat <[email protected]>
* using same version of black as the testing Signed-off-by: Vanessa Sochat <[email protected]>
* testing again after reformatting, removing need for user to specify google cloud project on command line (derived from storage client or optional environment to override that) Signed-off-by: Vanessa Sochat <[email protected]>
* adding warning about environment not being secret for google life sciences Signed-off-by: Vanessa Sochat <[email protected]>
* updating Google Life Sciences executor to programmatically select machineType and add verbose comments about envars not being secrets and selection of machinetypes Signed-off-by: Vanessa Sochat <[email protected]>
* Update snakemake/executors.py Co-Authored-By: Johannes Köster <[email protected]>
* changing default arguments for regions to be included in arg parser Signed-off-by: Vanessa Sochat <[email protected]>
* adding hashlib function to snakemake/common.py, tweaks to default arguments and function names Signed-off-by: Vanessa Sochat <[email protected]>
* renaming google_life_sciences to google_lifesciences and using job resources disk_mb with padding of 10gb Signed-off-by: Vanessa Sochat <[email protected]>
* custom entrypoint / cmd is working for snakemake base! Signed-off-by: Vanessa Sochat <[email protected]>
* imports went away? Signed-off-by: Vanessa Sochat <[email protected]>
* updating common and executors to fix some sonarcube linting issues Signed-off-by: Vanessa Sochat <[email protected]>
* trivial use of exec_job to fix SonarCloud "bug" Signed-off-by: Vanessa Sochat <[email protected]>
* updating machine type selection to take first in list before filtering to smallest Signed-off-by: Vanessa Sochat <[email protected]>
* why was stylesheet changed? Signed-off-by: Vanessa Sochat <[email protected]>
* why was stylesheet changed also in utils? Signed-off-by: Vanessa Sochat <[email protected]>
* updating life science executor to only include source files and working directory .snakemake scripts, also accidentally black formatted python scripts in tests, should not be an issue Signed-off-by: Vanessa Sochat <[email protected]>
* not sure why WorkflowError isn't defined for a file I didn't edit, how did it pass other CI? Signed-off-by: Vanessa Sochat <[email protected]>
* fixing bug with merge with master - an extra few lines were kept with a function (and should not have been) Signed-off-by: Vanessa Sochat <[email protected]>
* refactoring to not require additional scripts (passing exec_job directly to entrypoint) and then uploading only one archive Signed-off-by: Vanessa Sochat <[email protected]>
* e2 prefix machines don't seem to work for google life sciences api Signed-off-by: Vanessa Sochat <[email protected]>
* gcp life sciences cannot currently support m1 or e2 instance types, adding m1 to be filtered out for machine selection Signed-off-by: Vanessa Sochat <[email protected]>
* import of time should then use time.sleep Signed-off-by: Vanessa Sochat <[email protected]>
* bug that archive files silently not being added, and adding more robust function to derive relative snakefile path Signed-off-by: Vanessa Sochat <[email protected]>
* bump current container image to v5.10.0 and add debug statements to show Signed-off-by: Vanessa Sochat <[email protected]>
* updating Google Life Sciences to use get_container_image for latest container version Signed-off-by: vsoch <[email protected]>
* adding default-resources setting for google-lifesciences and better message when resources exceeded. Signed-off-by: vsoch <[email protected]>
* be more specific about resource limits for LHS Signed-off-by: vsoch <[email protected]>
* locations api now needs to be used since there is more than one location, * no longer works Signed-off-by: vsoch <[email protected]>
* adding parameter for location, and default to region or prefix of region Signed-off-by: vsoch <[email protected]>
* google import should not be at top of file, won't work locally Signed-off-by: vsoch <[email protected]>
* import google for wrong function Signed-off-by: vsoch <[email protected]>
* must use lowercase and not camel case for variables Signed-off-by: vsoch <[email protected]>
* if resources are requested, update default resources Signed-off-by: vsoch <[email protected]>
* updating event to use debug instead of info Signed-off-by: vsoch <[email protected]>
* updating stderr event to use debug instead of error Signed-off-by: vsoch <[email protected]>
* missing adding argument to --skip-script-cleanup to base job, need to test Signed-off-by: vsoch <[email protected]>
* adding cleaner implementation for --skip-script-cleanup to be shared amongst executors Signed-off-by: vsoch <[email protected]>
* google_lifesciences needs to be treated as a non local exec Signed-off-by: vsoch <[email protected]>
* adding install of crc32c library for remote Signed-off-by: vsoch <[email protected]>
* adding support for google-lifesciences gpu
  To add support for GPU, the user is limited to n1- general machine types, and we do this by way of setting the self._machine_type_prefix to be n1 (if it does not already start with n1). We can then choose an accelerator type that is greater than or equal to what the user has requested, and then choose from that set. With the current implementation the P100 type will be the first chosen, and this makes sense as (I think) it is the cheapest GPU option, something likely wanted by researchers running these at scale. In the future (when someone asks for it) we can add some flag that will allow the user to further specify a GPU accelerator type, but for now this seems reasonable. Once the accelerator is added to the VM, we can confirm that it creates an instance with GPU support
  Signed-off-by: vsoch <[email protected]>
* snakefile in tests/common.py should switch between tempdir and original
  currently, some tests use the snakefile in the temporary working directory, and others use the original. There is a flag, no_tmpdir, that can distinguish this for the working directory specification, and the same needs to be done for the snakefile, as different tests have different expectations for its location.
  Signed-off-by: vsoch <[email protected]>
* fixing test to not expect file to be in same directory Signed-off-by: vsoch <[email protected]>
* Google Life Sciences executor must use relative paths for build package
  Currently, if the user puts a path for a snakefile that is not relative to the working directory where the build package is generated, the replace function (that replaces the workdir in each file path) will not work and the build package will be generated with a hard coded path on the user system. Since our options are to try and muck around with moving files, or hackily changing paths for the included files (not ideal), a much cleaner solution is to do similar to what a docker build does, and require the context to be at, or in some subfolder of, the working directory. Most runs that I've seen will be for a Snakefile in a GitHub repo, meaning the working directory includes the snakefile, so I think this will be a rare use case. I still want to catch it, in case it does occur!
  Signed-off-by: vsoch <[email protected]>
* remove unneeded line Signed-off-by: vsoch <[email protected]>
* adding machine and gpu_model job resources
  this will add two optional job resources to the Google Life Sciences executor, including a machine (to mirror --machine-type-prefix) and a gpu_model (for the user to select from the cloud or vendor specific namespace of gpu models, e.g., nvidia-tesla-p4). If the user specifies a gpu_model without gpu or nvidia_gpu, the count will default to 1. The --machine-type-prefix command line arg also overrides the job resource, since this could be desired. The docs are also updated with these changes.
  Signed-off-by: vsoch <[email protected]>
* typo model -> gpu_model Signed-off-by: vsoch <[email protected]>
* string resources should be in quotes
  Per #329, a string defined resource like machine or gpu_model should include quotes around the name
  Signed-off-by: vsoch <[email protected]>
* gcloud beta life sciences client change in behavior
  we used to be able to do gcloud beta lifesciences describe <id> but now a full project string is required for this to work.
  Signed-off-by: vsoch <[email protected]>
* updating GoogleLifeSciences Executor to look for machine_type instead of machine definition in the job resources Signed-off-by: vsoch <[email protected]>
* initial changes after review from Johannes Signed-off-by: vsoch <[email protected]>
* fixing reasonable code smells, the cognitive complexity ones I need suggestions for Signed-off-by: vsoch <[email protected]>
* Adding persistence.aux_path to be used for workflow tar.gz Co-authored-by: Johannes Köster <[email protected]>
* adding tests for google life sciences, taking shot that I know the correct envar name Signed-off-by: vsoch <[email protected]>
* Removing secrets.GOOGLE_APPLICATION_CREDENTIALS
  Reported to already be set without needing specified here.
  Co-authored-by: Johannes Köster <[email protected]>
* changing env.GOOGLE_ to just use GCP_AVAILABLE. Co-authored-by: Johannes Köster <[email protected]>
Co-authored-by: Johannes Köster <[email protected]>
Co-authored-by: Vanessasaurus <[email protected]>
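Several of the commits above describe how a machine type is chosen: e2 and m1 types are filtered out, an optional prefix narrows the candidates, GPU jobs are forced onto n1 types, and the smallest type that satisfies the request wins. A rough sketch of that selection logic follows, under stated assumptions: the `MachineType` record, the `select_machine_type` function, and the sample figures are hypothetical illustrations and are not the executor's actual code or the API's response format.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# Prefixes the commit notes report as unsupported by the Life Sciences API.
UNSUPPORTED_PREFIXES = ("e2", "m1")


@dataclass(frozen=True)
class MachineType:
    """Hypothetical, simplified description of a machine type."""

    name: str
    cpus: int
    memory_mb: int


def select_machine_type(
    candidates: Iterable[MachineType],
    cores: int,
    mem_mb: int,
    prefix: Optional[str] = None,
    needs_gpu: bool = False,
) -> Optional[MachineType]:
    """Pick the smallest candidate that satisfies the requested resources."""
    # GPU jobs are limited to n1 machine types, so override any other prefix.
    if needs_gpu and not (prefix or "").startswith("n1"):
        prefix = "n1"

    eligible = [
        m
        for m in candidates
        if not m.name.startswith(UNSUPPORTED_PREFIXES)
        and (prefix is None or m.name.startswith(prefix))
        and m.cpus >= cores
        and m.memory_mb >= mem_mb
    ]
    if not eligible:
        return None  # the caller would raise an error about exceeded resources
    # "Smallest" here means fewest CPUs, then least memory (an assumption).
    return min(eligible, key=lambda m: (m.cpus, m.memory_mb, m.name))


# Illustrative sample data only.
SAMPLE = [
    MachineType("e2-standard-2", 2, 8192),
    MachineType("n1-standard-1", 1, 3840),
    MachineType("n1-standard-4", 4, 15360),
    MachineType("n2-standard-2", 2, 8192),
    MachineType("m1-megamem-96", 96, 1468006),
]
```

For example, `select_machine_type(SAMPLE, cores=2, mem_mb=4096)` skips the e2 entry and settles on the n2 type, while the same request with `needs_gpu=True` is restricted to the n1 family.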
Do not merge: this is a draft pull request!
This PR will encompass early work to add Google Cloud Pipelines to Snakemake. Specifically this means that:
We have several technical questions that still need answering, which I'll summarize here:
A few notes for some choices I made for the implementation:
I suspect this will fail linting and I'll push again with fixes :)