-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[dagster-wandb] Integration with Weights & Biases #10470
Conversation
@chrishiste is attempting to deploy a commit to the Elementl Team on Vercel. A member of the Team first needs to authorize it. |
The latest updates on your projects. Learn more about Vercel for Git ↗︎
1 Ignored Deployment
|
python_modules/dagster/setup.py
Outdated
@@ -99,7 +99,7 @@ def get_version() -> str: | |||
"grpcio-tools>=1.32.0,<1.44.0", # related to above grpcio pins | |||
"mock==3.0.5", | |||
"objgraph", | |||
"protobuf==3.13.0", # without this, pip will install the most up-to-date protobuf | |||
"protobuf==3.20.3", # without this, pip will install the most up-to-date protobuf |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: this is updating the dagster module itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yuhan is this concerning? That is changing the protobuf version in python_modules/dagster/setup.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this blocking this PR? asking because i'd recommend splitting this out to a standalone PR so in case of anything reverting could be easier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without this change, tests won't run due to incompatible versions. I can move this out to another PR. Should I do this?
Then we merge that single line change. Before this one.
@yuhan
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah i see. lets include it then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok finally removed the pin here in a separate PR: #11974
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a ton for putting this together. Overall it looks great!
A few feedback around the example:
- could we add a tests dir
examples/with_wandb/with_wandb_tests
in the example so we can include the unit tests all together in our CI. example: https://github.com/dagster-io/dagster/blob/master/examples/development_to_production/development_to_production_tests - naming wise, i tend to use nouns for asset names (which usually represent some objects like artifacts, tables, models, etc) and verbs for op names (which often are operations therefore verbs)
i saw the tests are failing likely related to our set up for new module or example. i'll take a stab and try fixing those in this branch.
description="Resource for interacting with Weights & Biases", | ||
) | ||
def wandb_resource(context: InitResourceContext) -> Dict[str, Any]: | ||
"""Resource for interacting with Weights & Biases.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could maybe add an example for how to use this resource in the docstring. example: https://github.com/dagster-io/dagster/blob/master/python_modules/libraries/dagster-snowflake/dagster_snowflake/resources.py#L277-L314
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be okay to add a link to the documentation instead of something like the example you shared?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the nice thing about including the code snippets here is it'd show up in the api docs directly, for example: https://docs.dagster.io/_apidocs/libraries/dagster-airbyte so the reader doesn't need to navigate away
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see that makes sense I will create a commit including this, will let you know when it's ready
examples/with_wandb/with_wandb/assets/custom_serialization_example.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yuhan Thanks for the review. I've pushed another commit including your feedback. Please have a look 😄
python_modules/dagster/setup.py
Outdated
@@ -99,7 +99,7 @@ def get_version() -> str: | |||
"grpcio-tools>=1.32.0,<1.44.0", # related to above grpcio pins | |||
"mock==3.0.5", | |||
"objgraph", | |||
"protobuf==3.13.0", # without this, pip will install the most up-to-date protobuf | |||
"protobuf==3.20.3", # without this, pip will install the most up-to-date protobuf |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yuhan is this concerning? That is changing the protobuf version in python_modules/dagster/setup.py
examples/with_wandb/with_wandb/assets/custom_serialization_example.py
Outdated
Show resolved
Hide resolved
description="Resource for interacting with Weights & Biases", | ||
) | ||
def wandb_resource(context: InitResourceContext) -> Dict[str, Any]: | ||
"""Resource for interacting with Weights & Biases.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be okay to add a link to the documentation instead of something like the example you shared?
Co-authored-by: Dave La Chasse <[email protected]> Co-authored-by: Ashraf Shaik <[email protected]>
1e20d2e
to
2f559ac
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're using the local storage to store downloaded Artifacts. Does Dagster always run with a local filesystem that we can use? And what would happen when there is no more disk space?
dagster is always running with a local filesystem, but when it's deployed in a remote setup, for example, each run could be in its own container which means local filesystem will be ephemera in a run scope, i.e. files downloaded to run 1's local filesystem won't be available to the next run.
i think it's ok to default to local filesystem, but it'd be great to give an option for users to download to other filesystems, e.g. an option to hook to an s3 bucket. -- but this isn't blocking this PR IMO. we can add it in a later iteration.
examples/with_wandb/README.md
Outdated
|
||
This directory contains examples showcasing the integration with Weights & Biases (W&B). | ||
|
||
View this example in the Dagster docs at https://docs.dagster.io/integrations/wandb. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We linked to https://docs.dagster.io/integrations/wandb in the examples/with_wandb/README.md. Can you confirm it's where we documentation would eventually live?
is the idea to link to a guide/tutorial or an api doc? the integrations/
pages are usually hands-on guides whereas the PR will generate the api doc at https://docs.dagster.io/_apidocs/libraries/dagster-wandb
i saw you also linked the api doc to https://docs.wandb.ai/guides/integrations/dagster. so do you think this link can go to the guide on w&b's site or we can reword it to point to the api doc on dagster's site?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call, I changed all those links to point to the full docs on our side
from .ops.simple_job import simple_job_example | ||
|
||
|
||
@repository |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
heads up: as we're moving the top-level entry to using Definitions, i'll update this file to use the new apis.
context: #11167
python_modules/dagster/setup.py
Outdated
@@ -99,7 +99,7 @@ def get_version() -> str: | |||
"grpcio-tools>=1.32.0,<1.44.0", # related to above grpcio pins | |||
"mock==3.0.5", | |||
"objgraph", | |||
"protobuf==3.13.0", # without this, pip will install the most up-to-date protobuf | |||
"protobuf==3.20.3", # without this, pip will install the most up-to-date protobuf |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this blocking this PR? asking because i'd recommend splitting this out to a standalone PR so in case of anything reverting could be easier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you again for putting this together and bearing with my slow reviews. I can take it from here and aim to include in the next release : )
@yuhan Glad to hear it, thanks for the help :) |
Deployment failed with the following error:
|
cherry-picked commits from #11830 here to fix the tests + merge latest master into this branch to prep for merging :) |
the follow up is #12059 which converts the example to using the latest Definitions apis |
Deployment failed with the following error:
|
da39465
to
0c9ce71
Compare
Deployment failed with the following error:
|
Deployment failed with the following error:
|
Co-authored-by: Dave La Chasse <[email protected]> Co-authored-by: Ashraf Shaik <[email protected]> Note: W&B partners with the Dagster team on this. If you are eager to use this integration for your own projects, please hold while we get this polished. Don't hesitate to provide feedback or ask questions :) ### Summary & Motivation This PR contains the integration code with [Weights & Biases](https://docs.wandb.ai/). Making it easy within Dagster to: - use and create W&B [Artifacts](https://docs.wandb.ai/guides/artifacts) - use and create Registered Models in W&B [Model Registry](https://docs.wandb.ai/guides/models) - run training jobs on dedicated compute using W&B [Launch](https://docs.wandb.ai/guides/launch) - use the [wandb](https://github.com/wandb/wandb) client in ops and assets You will find code for the integration and examples on how to use the integration. For more in depth documentation check the [current draft](https://docs.google.com/document/d/10qOyhbJKnJR4kazGqMF22db6UhXpeHop3_QV90eaiW4/edit). **Useful references:** - https://docs.wandb.ai/ref/python/artifact - https://docs.wandb.ai/ref/python/run - https://docs.wandb.ai/guides/launch **Notes and questions for the Dagster team:** - We're using the local storage to store downloaded Artifacts. Does Dagster always run with a local filesystem that we can use? And what would happen when there is no more disk space? - We linked to `https://docs.dagster.io/integrations/wandb` in the `examples/with_wandb/README.md`. Can you confirm it's where we documentation would eventually live? - I'm expecting your review to be mostly on the form over the content but we're open to refactor the code to suit Dagster coding style better. Don't hesitate to be vocal if something could be improved. ### How I Tested These Changes This integration has been extensively tested manually and through unit tests (included in the code). **Known Issues** tox fails currently due to a protobuf bug. There is an open Github [issue](protocolbuffers/protobuf#10075). I was able to remove the error by pinning another version but I'm not sure of the implications. Probably better to wait for a fix until we release this. --------- Co-authored-by: Dave La Chasse <[email protected]> Co-authored-by: Ashraf Shaik <[email protected]> Co-authored-by: yuhan <[email protected]>
Co-authored-by: Dave La Chasse [email protected]
Co-authored-by: Ashraf Shaik [email protected]
Note: W&B partners with the Dagster team on this. If you are eager to use this integration for your own projects, please hold while we get this polished. Don't hesitate to provide feedback or ask questions :)
Summary & Motivation
This PR contains the integration code with Weights & Biases.
Making it easy within Dagster to:
You will find code for the integration and examples on how to use the integration.
For more in depth documentation check the current draft.
Useful references:
Notes and questions for the Dagster team:
https://docs.dagster.io/integrations/wandb
in theexamples/with_wandb/README.md
. Can you confirm it's where we documentation would eventually live?How I Tested These Changes
This integration has been extensively tested manually and through unit tests (included in the code).
Known Issues
tox fails currently due to a protobuf bug. There is an open Github issue. I was able to remove the error by pinning another version but I'm not sure of the implications. Probably better to wait for a fix until we release this.