Handle partitions natively in W&B IO Manager #15170
Merged
11 commits
f643978 Handle partitions natively in W&B IO Manager (chrishiste)
58b0565 Merge branch 'master' into wandb-integration-bugfix (chrishiste)
b8fe716 Lint code (chrishiste)
ce11ab4 make ruff (yuhan)
f329dcd Merge pull request #1 from dagster-io/15170-ruff (chrishiste)
2f2c24c Fix small bug (chrishiste)
49e7d80 Return value directly when there is only one partition (chrishiste)
6061efb make black (yuhan)
bc91165 Merge pull request #2 from dagster-io/15170-black (chrishiste)
b6638b8 Merge remote-tracking branch 'origin/master' into pr/15170 (yuhan)
a6dd514 Merge pull request #3 from dagster-io/15170-merge-master (chrishiste)
86 changes: 86 additions & 0 deletions
examples/with_wandb/with_wandb/assets/advanced_partitions_example.py
@@ -0,0 +1,86 @@
from dagster import AssetIn, StaticPartitionsDefinition, asset

import wandb

partitions_def = StaticPartitionsDefinition(["red", "orange", "yellow", "blue", "green"])

ARTIFACT_NAME = "my_advanced_configuration_partitioned_asset"


@asset(
    group_name="partitions",
    partitions_def=partitions_def,
    name=ARTIFACT_NAME,
    compute_kind="wandb",
    metadata={
        "wandb_artifact_configuration": {
            "aliases": ["special_alias"],
        }
    },
)
def write_advanced_artifact(context):
    """Example writing an Artifact with partitions and custom metadata."""
    artifact = wandb.Artifact(ARTIFACT_NAME, "dataset")
    partition_key = context.asset_partition_key_for_output()

    if partition_key == "red":
        return "red"
    elif partition_key == "orange":
        return wandb.Table(columns=["color"], data=[["orange"]])
    elif partition_key == "yellow":
        table = wandb.Table(columns=["color"], data=[["yellow"]])
        artifact.add(table, "custom_table_name")
    else:
        table = wandb.Table(columns=["color", "value"], data=[[partition_key, 1]])
        artifact.add(table, "default_table_name")
    return artifact


@asset(
    group_name="partitions",
    compute_kind="wandb",
    ins={
        "partitions": AssetIn(
            asset_key=ARTIFACT_NAME,
            metadata={
                "wandb_artifact_configuration": {
                    "partitions": {
                        # The wildcard "*" means "all non-configured partitions"
                        "*": {
                            "get": "default_table_name",
                        },
                        # You can override the wildcard for a specific partition using its key
                        "yellow": {
                            "get": "custom_table_name",
                        },
                        # You can collect a specific Artifact version
                        "orange": {
                            "version": "v0",
                        },
                        # You can collect a specific alias. Note that you must specify the 'get'
                        # value, because the wildcard is only applied to partitions that
                        # haven't been configured.
                        "blue": {
                            "alias": "special_alias",
                            "get": "default_table_name",
                        },
                    },
                },
            },
        )
    },
    output_required=False,
)
def read_objects_directly(context, partitions):
    """Example reading all Artifact partitions from the previous asset."""
    for partition, content in partitions.items():
        context.log.info(f"partition={partition}, type={type(content)}")
        if partition == "red":
            context.log.info(content)
        elif partition == "orange":
            # The orange partition was a raw W&B Table, so the IO Manager wrapped that Table in
            # an Artifact. The default name for the table is 'Table'. We could have also set
            # the partition 'get' config to receive the table directly instead of the Artifact.
            context.log.info(content.get("Table").get_column("color"))
        else:
            context.log.info(content.get_column("color"))
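The comments in the `partitions` config describe a lookup rule: an explicitly configured partition replaces the wildcard rather than being merged with it, which is why `blue` has to repeat its `get` value. A minimal sketch of that rule in plain Python follows. `resolve_partition_config` is a hypothetical helper written for illustration, not a function from the W&B IO Manager, and the real implementation may differ.

```python
from typing import Any, Dict


def resolve_partition_config(
    partitions_config: Dict[str, Dict[str, Any]], partition_key: str
) -> Dict[str, Any]:
    """Hypothetical helper: pick the config that applies to one partition."""
    if partition_key in partitions_config:
        # An explicit entry replaces the wildcard entirely; nothing is merged in.
        return dict(partitions_config[partition_key])
    # Fall back to the wildcard entry, or to an empty config if there is none.
    return dict(partitions_config.get("*", {}))


config = {
    "*": {"get": "default_table_name"},
    "yellow": {"get": "custom_table_name"},
    "orange": {"version": "v0"},
    "blue": {"alias": "special_alias", "get": "default_table_name"},
}

resolved = {
    key: resolve_partition_config(config, key)
    for key in ["red", "orange", "yellow", "blue", "green"]
}
```

Under this assumed rule, `orange` ends up with no `get` value at all, which matches the reader above calling `content.get("Table")` on the returned Artifact itself.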
62 changes: 62 additions & 0 deletions
examples/with_wandb/with_wandb/assets/multi_partitions_example.py
@@ -0,0 +1,62 @@
from dagster import (
    AssetIn,
    DailyPartitionsDefinition,
    MultiPartitionsDefinition,
    StaticPartitionsDefinition,
    asset,
)

import wandb

partitions_def = MultiPartitionsDefinition(
    {
        "date": DailyPartitionsDefinition(start_date="2023-01-01", end_date="2023-01-05"),
        "color": StaticPartitionsDefinition(["red", "yellow", "blue"]),
    }
)


@asset(
    group_name="partitions",
    partitions_def=partitions_def,
    name="my_multi_partitioned_asset",
    compute_kind="wandb",
    metadata={
        "wandb_artifact_configuration": {
            "type": "dataset",
        }
    },
)
def create_my_multi_partitioned_asset(context):
    """Example writing an Artifact with multi partitions and custom metadata."""
    partition_key = context.asset_partition_key_for_output()
    context.log.info(f"Creating partitioned asset for {partition_key}")
    if partition_key == "red|2023-01-02":
        artifact = wandb.Artifact("my_multi_partitioned_asset", "dataset")
        table = wandb.Table(columns=["color"], data=[[partition_key]])
        # Artifact.add returns a manifest entry, so return the Artifact itself.
        artifact.add(table, "default_table_name")
        return artifact
    return partition_key  # e.g. "blue|2023-01-04"


@asset(
    group_name="partitions",
    compute_kind="wandb",
    ins={
        "my_multi_partitioned_asset": AssetIn(
            metadata={
                "wandb_artifact_configuration": {
                    "partitions": {
                        "red|2023-01-02": {
                            # Must match the name passed to artifact.add in the upstream asset.
                            "get": "default_table_name",
                        },
                    },
                },
            },
        )
    },
    output_required=False,
)
def read_all_multi_partitions(context, my_multi_partitioned_asset):
    """Example reading all Artifact partitions from the previous asset."""
    for partition, content in my_multi_partitioned_asset.items():
        context.log.info(f"partition={partition}, content={content}")
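The multi-partition keys used above (`"red|2023-01-02"`, `"blue|2023-01-04"`) follow a `color|date` layout. The sketch below enumerates such keys in plain Python, assuming that dimension names are ordered alphabetically, that values are joined with `|`, and that `end_date` is exclusive. These assumptions are consistent with the keys shown in the example but are not taken from the PR itself. `daily_keys` and `multi_partition_keys` are hypothetical helpers, not Dagster APIs.

```python
from datetime import date, timedelta
from itertools import product
from typing import Dict, List


def daily_keys(start: date, end: date) -> List[str]:
    """Daily partition keys from start (inclusive) to end (assumed exclusive)."""
    days = (end - start).days
    return [(start + timedelta(days=i)).isoformat() for i in range(days)]


def multi_partition_keys(dimensions: Dict[str, List[str]]) -> List[str]:
    """Join one value per dimension with '|', dimensions sorted by name (assumption)."""
    names = sorted(dimensions)
    return ["|".join(combo) for combo in product(*(dimensions[name] for name in names))]


keys = multi_partition_keys(
    {
        "date": daily_keys(date(2023, 1, 1), date(2023, 1, 5)),
        "color": ["red", "yellow", "blue"],
    }
)
```

With three colors and four days this yields twelve keys, one materialization each if every partition is run.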
66 changes: 66 additions & 0 deletions
examples/with_wandb/with_wandb/assets/simple_partitions_example.py
@@ -0,0 +1,66 @@
import random

from dagster import (
    AssetIn,
    DailyPartitionsDefinition,
    TimeWindowPartitionMapping,
    asset,
)

partitions_def = DailyPartitionsDefinition(start_date="2023-01-01", end_date="2023-02-01")


@asset(
    group_name="partitions",
    partitions_def=partitions_def,
    name="my_daily_partitioned_asset",
    compute_kind="wandb",
    metadata={
        "wandb_artifact_configuration": {
            "type": "dataset",
        }
    },
)
def create_my_daily_partitioned_asset(context):
    """Example writing an Artifact with daily partitions and custom metadata."""
    # Happens when the asset is materialized in multiple runs (one per partition)
    if context.has_partition_key:
        partition_key = context.asset_partition_key_for_output()
        context.log.info(f"Creating partitioned asset for {partition_key}")
        return random.randint(0, 100)

    # Happens when the asset is materialized in a single run
    # Important: this will throw an error because we don't support materializing a partitioned
    # asset in a single run
    partition_key_range = context.asset_partition_key_range
    context.log.info(f"Creating partitioned assets for window {partition_key_range}")
    return random.randint(0, 100)


@asset(
    group_name="partitions",
    compute_kind="wandb",
    ins={"my_daily_partitioned_asset": AssetIn()},
    output_required=False,
)
def read_all_partitions(context, my_daily_partitioned_asset):
    """Example reading all Artifact partitions from the first asset."""
    for partition, content in my_daily_partitioned_asset.items():
        context.log.info(f"partition={partition}, content={content}")


@asset(
    group_name="partitions",
    partitions_def=partitions_def,
    compute_kind="wandb",
    ins={
        "my_daily_partitioned_asset": AssetIn(
            partition_mapping=TimeWindowPartitionMapping(start_offset=-1)
        )
    },
    output_required=False,
)
def read_specific_partitions(context, my_daily_partitioned_asset):
    """Example reading specific Artifact partitions from the first asset."""
    for partition, content in my_daily_partitioned_asset.items():
        context.log.info(f"partition={partition}, content={content}")
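`read_specific_partitions` relies on `TimeWindowPartitionMapping(start_offset=-1)` to select which upstream partitions it receives. A rough model of that mapping for daily partitions is sketched below in plain Python: each downstream day maps to the previous day and the same day upstream. `upstream_daily_keys` is a hypothetical helper, and dropping dates that fall outside the partition range is an assumption of this sketch; Dagster's own handling of out-of-range partitions may differ.

```python
from datetime import date, timedelta
from typing import List


def upstream_daily_keys(
    downstream_key: str,
    start_offset: int = 0,
    end_offset: int = 0,
    first: date = date(2023, 1, 1),
    end: date = date(2023, 2, 1),
) -> List[str]:
    """Hypothetical model: upstream daily keys for one downstream daily key."""
    day = date.fromisoformat(downstream_key)
    current = day + timedelta(days=start_offset)
    last = day + timedelta(days=end_offset)
    keys = []
    while current <= last:
        # Assumption: dates outside [first, end) are silently dropped in this sketch.
        if first <= current < end:
            keys.append(current.isoformat())
        current += timedelta(days=1)
    return keys


mapped = upstream_daily_keys("2023-01-05", start_offset=-1)
mapped_at_start = upstream_daily_keys("2023-01-01", start_offset=-1)
```

Here `mapped` holds the keys for 2023-01-04 and 2023-01-05, so the downstream asset would iterate over two entries in `my_daily_partitioned_asset.items()` for that run.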
@@ -0,0 +1,2 @@
load_from:
  - python_module: with_wandb
trying to wrap my head around here. where does the `partitions` come from?

From the `AssetIn`
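To make the answer concrete: in the advanced example the key `"partitions"` in the `ins` dict is what names the function parameter, and the `AssetIn` behind it points at the upstream asset. The plain-Python simulation below illustrates that name-based binding. `bind_inputs` is a hypothetical stand-in written for this thread, not Dagster's actual loading code, and the loaded values are placeholders.

```python
import inspect
from typing import Any, Callable, Dict


def bind_inputs(
    fn: Callable[..., Any], ins: Dict[str, str], loaded_values: Dict[str, Any]
) -> Dict[str, Any]:
    """Hypothetical: map each key of `ins` to the parameter of the same name."""
    params = inspect.signature(fn).parameters
    kwargs = {}
    for param_name, upstream_key in ins.items():
        if param_name not in params:
            raise TypeError(f"{fn.__name__} has no parameter named {param_name!r}")
        # The value loaded for the upstream asset is passed under the `ins` key.
        kwargs[param_name] = loaded_values[upstream_key]
    return kwargs


def read_objects_directly(context, partitions):
    # With the W&B IO Manager, the input is a dict keyed by partition.
    return sorted(partitions)


ins = {"partitions": "my_advanced_configuration_partitioned_asset"}
loaded = {
    "my_advanced_configuration_partitioned_asset": {"red": "red", "green": "<table>"}
}

kwargs = bind_inputs(read_objects_directly, ins, loaded)
result = read_objects_directly(context=None, **kwargs)
```

Renaming the `ins` key would require renaming the function parameter to match, which is why the two always appear with the same name in the examples.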