Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[dagster-wandb] Integration with Weights & Biases #10470

Merged
merged 24 commits into from
Feb 8, 2023

Conversation

chrishiste
Copy link
Contributor

@chrishiste chrishiste commented Nov 10, 2022

Co-authored-by: Dave La Chasse [email protected]
Co-authored-by: Ashraf Shaik [email protected]

Note: W&B partners with the Dagster team on this. If you are eager to use this integration for your own projects, please hold while we get this polished. Don't hesitate to provide feedback or ask questions :)

Summary & Motivation

This PR contains the integration code with Weights & Biases.

Making it easy within Dagster to:

  • use and create W&B Artifacts
  • use and create Registered Models in W&B Model Registry
  • run training jobs on dedicated compute using W&B Launch
  • use the wandb client in ops and assets

You will find code for the integration and examples on how to use the integration.

For more in depth documentation check the current draft.

Useful references:

Notes and questions for the Dagster team:

  • We're using the local storage to store downloaded Artifacts. Does Dagster always run with a local filesystem that we can use? And what would happen when there is no more disk space?
  • We linked to https://docs.dagster.io/integrations/wandb in the examples/with_wandb/README.md. Can you confirm it's where we documentation would eventually live?
  • I'm expecting your review to be mostly on the form over the content but we're open to refactor the code to suit Dagster coding style better. Don't hesitate to be vocal if something could be improved.

How I Tested These Changes

This integration has been extensively tested manually and through unit tests (included in the code).

Known Issues
tox fails currently due to a protobuf bug. There is an open Github issue. I was able to remove the error by pinning another version but I'm not sure of the implications. Probably better to wait for a fix until we release this.

@vercel
Copy link

vercel bot commented Nov 10, 2022

@chrishiste is attempting to deploy a commit to the Elementl Team on Vercel.

A member of the Team first needs to authorize it.

@vercel
Copy link

vercel bot commented Nov 10, 2022

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated
dagster ✅ Ready (Inspect) Visit Preview 💬 Add your feedback Feb 8, 2023 at 7:32PM (UTC)
1 Ignored Deployment
Name Status Preview Comments Updated
dagit-storybook ⬜️ Ignored (Inspect) Feb 8, 2023 at 7:32PM (UTC)

@@ -99,7 +99,7 @@ def get_version() -> str:
"grpcio-tools>=1.32.0,<1.44.0", # related to above grpcio pins
"mock==3.0.5",
"objgraph",
"protobuf==3.13.0", # without this, pip will install the most up-to-date protobuf
"protobuf==3.20.3", # without this, pip will install the most up-to-date protobuf
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: this is updating the dagster module itself.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yuhan is this concerning? That is changing the protobuf version in python_modules/dagster/setup.py

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this blocking this PR? asking because i'd recommend splitting this out to a standalone PR so in case of anything reverting could be easier.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without this change, tests won't run due to incompatible versions. I can move this out to another PR. Should I do this?
Then we merge that single line change. Before this one.
@yuhan

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah i see. lets include it then.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok finally removed the pin here in a separate PR: #11974

@yuhan yuhan self-requested a review November 17, 2022 22:15
Copy link
Contributor

@yuhan yuhan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a ton for putting this together. Overall it looks great!

A few feedback around the example:

  1. could we add a tests dir examples/with_wandb/with_wandb_tests in the example so we can include the unit tests all together in our CI. example: https://github.com/dagster-io/dagster/blob/master/examples/development_to_production/development_to_production_tests
  2. naming wise, i tend to use nouns for asset names (which usually represent some objects like artifacts, tables, models, etc) and verbs for op names (which often are operations therefore verbs)

i saw the tests are failing likely related to our set up for new module or example. i'll take a stab and try fixing those in this branch.

python_modules/dagster-wandb/dagster_wandb/io_manager.py Outdated Show resolved Hide resolved
python_modules/dagster-wandb/dagster_wandb/io_manager.py Outdated Show resolved Hide resolved
python_modules/dagster-wandb/dagster_wandb/io_manager.py Outdated Show resolved Hide resolved
description="Resource for interacting with Weights & Biases",
)
def wandb_resource(context: InitResourceContext) -> Dict[str, Any]:
"""Resource for interacting with Weights & Biases."""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be okay to add a link to the documentation instead of something like the example you shared?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the nice thing about including the code snippets here is it'd show up in the api docs directly, for example: https://docs.dagster.io/_apidocs/libraries/dagster-airbyte so the reader doesn't need to navigate away

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that makes sense I will create a commit including this, will let you know when it's ready

examples/with_wandb/with_wandb/repository.py Outdated Show resolved Hide resolved
python_modules/dagster-wandb/dagster_wandb/io_manager.py Outdated Show resolved Hide resolved
python_modules/dagster-wandb/dagster_wandb/io_manager.py Outdated Show resolved Hide resolved
python_modules/dagster-wandb/dagster_wandb/io_manager.py Outdated Show resolved Hide resolved
Copy link
Contributor Author

@chrishiste chrishiste left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yuhan Thanks for the review. I've pushed another commit including your feedback. Please have a look 😄

examples/with_wandb/with_wandb/assets/advanced_example.py Outdated Show resolved Hide resolved
@@ -99,7 +99,7 @@ def get_version() -> str:
"grpcio-tools>=1.32.0,<1.44.0", # related to above grpcio pins
"mock==3.0.5",
"objgraph",
"protobuf==3.13.0", # without this, pip will install the most up-to-date protobuf
"protobuf==3.20.3", # without this, pip will install the most up-to-date protobuf
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yuhan is this concerning? That is changing the protobuf version in python_modules/dagster/setup.py

python_modules/dagster-wandb/dagster_wandb/version.py Outdated Show resolved Hide resolved
examples/with_wandb/with_wandb/repository.py Outdated Show resolved Hide resolved
description="Resource for interacting with Weights & Biases",
)
def wandb_resource(context: InitResourceContext) -> Dict[str, Any]:
"""Resource for interacting with Weights & Biases."""
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be okay to add a link to the documentation instead of something like the example you shared?

python_modules/dagster-wandb/dagster_wandb/io_manager.py Outdated Show resolved Hide resolved
python_modules/dagster-wandb/dagster_wandb/io_manager.py Outdated Show resolved Hide resolved
python_modules/dagster-wandb/dagster_wandb/io_manager.py Outdated Show resolved Hide resolved
python_modules/dagster-wandb/dagster_wandb/io_manager.py Outdated Show resolved Hide resolved
@chrishiste chrishiste requested a review from yuhan December 6, 2022 19:09
Copy link
Contributor

@yuhan yuhan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're using the local storage to store downloaded Artifacts. Does Dagster always run with a local filesystem that we can use? And what would happen when there is no more disk space?

dagster is always running with a local filesystem, but when it's deployed in a remote setup, for example, each run could be in its own container which means local filesystem will be ephemera in a run scope, i.e. files downloaded to run 1's local filesystem won't be available to the next run.

i think it's ok to default to local filesystem, but it'd be great to give an option for users to download to other filesystems, e.g. an option to hook to an s3 bucket. -- but this isn't blocking this PR IMO. we can add it in a later iteration.


This directory contains examples showcasing the integration with Weights & Biases (W&B).

View this example in the Dagster docs at https://docs.dagster.io/integrations/wandb.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We linked to https://docs.dagster.io/integrations/wandb in the examples/with_wandb/README.md. Can you confirm it's where we documentation would eventually live?

is the idea to link to a guide/tutorial or an api doc? the integrations/ pages are usually hands-on guides whereas the PR will generate the api doc at https://docs.dagster.io/_apidocs/libraries/dagster-wandb
i saw you also linked the api doc to https://docs.wandb.ai/guides/integrations/dagster. so do you think this link can go to the guide on w&b's site or we can reword it to point to the api doc on dagster's site?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call, I changed all those links to point to the full docs on our side

from .ops.simple_job import simple_job_example


@repository
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

heads up: as we're moving the top-level entry to using Definitions, i'll update this file to use the new apis.

context: #11167

@@ -99,7 +99,7 @@ def get_version() -> str:
"grpcio-tools>=1.32.0,<1.44.0", # related to above grpcio pins
"mock==3.0.5",
"objgraph",
"protobuf==3.13.0", # without this, pip will install the most up-to-date protobuf
"protobuf==3.20.3", # without this, pip will install the most up-to-date protobuf
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this blocking this PR? asking because i'd recommend splitting this out to a standalone PR so in case of anything reverting could be easier.

Copy link
Contributor

@yuhan yuhan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you again for putting this together and bearing with my slow reviews. I can take it from here and aim to include in the next release : )

@chrishiste
Copy link
Contributor Author

@yuhan Glad to hear it, thanks for the help :)

@vercel
Copy link

vercel bot commented Feb 7, 2023

Deployment failed with the following error:

Resource is limited - try again in 6 hours (more than 100, code: "api-deployments-free-per-day").

@yuhan
Copy link
Contributor

yuhan commented Feb 7, 2023

cherry-picked commits from #11830 here to fix the tests + merge latest master into this branch to prep for merging :)

@yuhan
Copy link
Contributor

yuhan commented Feb 7, 2023

the follow up is #12059 which converts the example to using the latest Definitions apis

@vercel
Copy link

vercel bot commented Feb 7, 2023

Deployment failed with the following error:

Resource is limited - try again in 5 hours (more than 100, code: "api-deployments-free-per-day").

@yuhan yuhan force-pushed the wandb-integration branch from da39465 to 0c9ce71 Compare February 7, 2023 09:57
@vercel
Copy link

vercel bot commented Feb 7, 2023

Deployment failed with the following error:

Resource is limited - try again in 4 hours (more than 100, code: "api-deployments-free-per-day").

@yuhan yuhan requested a review from erinkcochran87 as a code owner February 8, 2023 00:59
@vercel
Copy link

vercel bot commented Feb 8, 2023

Deployment failed with the following error:

Resource is limited - try again in 3 minutes (more than 100, code: "api-deployments-free-per-day").

@yuhan yuhan merged commit 7aecf4a into dagster-io:master Feb 8, 2023
yuhan added a commit that referenced this pull request Feb 8, 2023
### Summary & Motivation
depends on #10470

this PR converts the example from using repository to defs as the entry
point, per changes in 1.1.7
(#11167)

### How I Tested These Changes
- local dagit can load
- unit test
smackesey pushed a commit that referenced this pull request Feb 8, 2023
Co-authored-by: Dave La Chasse
<[email protected]>
Co-authored-by: Ashraf Shaik
<[email protected]>

Note: W&B partners with the Dagster team on this. If you are eager to
use this integration for your own projects, please hold while we get
this polished. Don't hesitate to provide feedback or ask questions :)

### Summary & Motivation
This PR contains the integration code with [Weights &
Biases](https://docs.wandb.ai/).

Making it easy within Dagster to:
- use and create W&B [Artifacts](https://docs.wandb.ai/guides/artifacts)
- use and create Registered Models in W&B [Model
Registry](https://docs.wandb.ai/guides/models)
- run training jobs on dedicated compute using W&B
[Launch](https://docs.wandb.ai/guides/launch)
- use the [wandb](https://github.com/wandb/wandb) client in ops and
assets

You will find code for the integration and examples on how to use the
integration.

For more in depth documentation check the [current
draft](https://docs.google.com/document/d/10qOyhbJKnJR4kazGqMF22db6UhXpeHop3_QV90eaiW4/edit).
 
**Useful references:** 
- https://docs.wandb.ai/ref/python/artifact
- https://docs.wandb.ai/ref/python/run
- https://docs.wandb.ai/guides/launch

**Notes and questions for the Dagster team:**
- We're using the local storage to store downloaded Artifacts. Does
Dagster always run with a local filesystem that we can use? And what
would happen when there is no more disk space?
- We linked to `https://docs.dagster.io/integrations/wandb` in the
`examples/with_wandb/README.md`. Can you confirm it's where we
documentation would eventually live?
- I'm expecting your review to be mostly on the form over the content
but we're open to refactor the code to suit Dagster coding style better.
Don't hesitate to be vocal if something could be improved.

### How I Tested These Changes
This integration has been extensively tested manually and through unit
tests (included in the code).

**Known Issues**
tox fails currently due to a protobuf bug. There is an open Github
[issue](protocolbuffers/protobuf#10075). I was
able to remove the error by pinning another version but I'm not sure of
the implications. Probably better to wait for a fix until we release
this.

---------

Co-authored-by: Dave La Chasse <[email protected]>
Co-authored-by: Ashraf Shaik <[email protected]>
Co-authored-by: yuhan <[email protected]>
smackesey pushed a commit that referenced this pull request Feb 8, 2023
### Summary & Motivation
depends on #10470

this PR converts the example from using repository to defs as the entry
point, per changes in 1.1.7
(#11167)

### How I Tested These Changes
- local dagit can load
- unit test
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants