Migrate to KfpV2 #477

Merged
merged 34 commits into from
Oct 11, 2023
Changes from 10 commits
17451c5
Feature/vertex compiler (#411)
GeorgesLorre Sep 11, 2023
52102a5
Add vertex runner (#429)
GeorgesLorre Sep 15, 2023
a0ea8e4
Add hardware configs (#433)
PhilippeMoussalli Sep 15, 2023
65e4553
Fix v2 defaults (#436)
PhilippeMoussalli Sep 18, 2023
03b1c3b
Make ComponentSpec the base for arg building
GeorgesLorre Sep 19, 2023
6faf90c
Make ComponentSpec the base for arg building
GeorgesLorre Sep 19, 2023
99016d7
Feature/no artifacts (#444)
GeorgesLorre Sep 21, 2023
12b74ae
Add more default/optional argument logic
GeorgesLorre Sep 21, 2023
32a8c04
Add cluser_type to default args
GeorgesLorre Oct 2, 2023
e49867b
Fix ruff error
GeorgesLorre Oct 2, 2023
cf179f1
Fix isOptional and defaultValue conversion
RobbeSneyders Oct 9, 2023
55fc4fe
Update runner to use KfP v2 API
RobbeSneyders Oct 9, 2023
fa14ea0
Change input_partition_rows to accept -1 as default
RobbeSneyders Oct 9, 2023
cd82bae
Update load_from_hf_hub defaults
RobbeSneyders Oct 9, 2023
987054b
Update download_images defaults
RobbeSneyders Oct 9, 2023
44a7d4d
Merge branch 'main' into feature/kfp-v2
RobbeSneyders Oct 10, 2023
74f7099
Merge branch 'main' into feature/kfp-v2
PhilippeMoussalli Oct 10, 2023
347eec4
re-enable cache
PhilippeMoussalli Oct 10, 2023
c89998c
Fix tests
RobbeSneyders Oct 10, 2023
1815d02
Update datacomp pipeline
RobbeSneyders Oct 10, 2023
081422e
Merge remote-tracking branch 'origin/feature/kfp-v2' into feature/kfp-v2
RobbeSneyders Oct 10, 2023
931df56
Remove python version upper bound
RobbeSneyders Oct 10, 2023
4342aa8
Re-add test suite for Python 3.11
RobbeSneyders Oct 10, 2023
a789c03
Add Python 3.12 upper bound
RobbeSneyders Oct 10, 2023
2108894
Add gcp dependencies to vertex extra
RobbeSneyders Oct 10, 2023
b778f5e
Address PR comments
RobbeSneyders Oct 10, 2023
c7e3a9f
Add Python 3.12 trove classifier to pyproject.toml
RobbeSneyders Oct 10, 2023
c6601eb
Lower python upper bound to 3.11 again to prevent slow dependency res…
RobbeSneyders Oct 10, 2023
d32dffc
Remove 3.11 test suite
RobbeSneyders Oct 10, 2023
fef720e
Address PR comments
RobbeSneyders Oct 10, 2023
3cb3a27
Changes based on self-review
RobbeSneyders Oct 10, 2023
35dad13
Update component defaults for kfpv2
RobbeSneyders Oct 10, 2023
c7ab7d5
Fix tests for kfp 2.3.0
RobbeSneyders Oct 10, 2023
0c6d403
disable kfpv2 default caching
PhilippeMoussalli Oct 11, 2023
2 changes: 1 addition & 1 deletion .github/workflows/pipeline.yaml
@@ -11,7 +11,7 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
- python-version: ['3.8', '3.9', '3.10', '3.11']
+ python-version: ['3.8', '3.9', '3.10']
steps:
- uses: actions/checkout@v2
- name: Set up Python ${{ matrix.python-version }}
2 changes: 1 addition & 1 deletion components/download_images/fondant_component.yaml
@@ -39,7 +39,7 @@ args:
resize_only_if_bigger:
description: If True, resize only if image is bigger than image_size.
type: bool
- default: 'False'
+ default: False
min_image_size:
description: Minimum size of the images.
type: int
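
The switch from `'False'` to `False` matters because a quoted YAML scalar is loaded as a string rather than a boolean. The following is a minimal sketch of that pitfall, assuming a consumer that casts the default with Python's builtin `bool`; how Fondant itself handled the old quoted value is not shown in this diff.

```python
import ast

# A quoted YAML default such as 'False' arrives in Python as the string "False".
quoted_default = "False"

# Any non-empty string is truthy, so a naive cast flips the intended value.
naive = bool(quoted_default)

# Parsing the literal recovers the boolean the author meant.
parsed = ast.literal_eval(quoted_default)

print(naive, parsed)
```

With an unquoted default, the YAML loader already yields a boolean and no extra parsing step is needed.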
2 changes: 0 additions & 2 deletions docs/components/component_spec.md
@@ -124,8 +124,6 @@ The `args` section describes which arguments the component takes. Each argument
`description` and a `type`, which should be one of the builtin Python types. Additionally, you can
set an optional `default` value for each argument.

- _Note:_ default iterable arguments such as `dict` and `list` have to be passed as a string
- (e.g. `'{"foo":1, "bar":2}`, `'["foo","bar]'`)
```yaml
args:
custom_argument:
3 changes: 2 additions & 1 deletion docs/pipeline.md
@@ -30,7 +30,8 @@ def build_pipeline():
"batch_size": 2,
"max_new_tokens": 50,
},
- number_of_gpus=1,
+ number_of_accelerators=1,
+ accelerator_name="GPU",
node_pool_label="node_pool",
node_pool_name="model-inference-pool",
)
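
The rename applied throughout this PR (`number_of_gpus` becomes `number_of_accelerators` plus `accelerator_name`) can be sketched as a small keyword-argument rewrite. The helper `migrate_gpu_kwargs` below is hypothetical and not part of Fondant; it only illustrates the mapping visible in these diffs, assuming `"GPU"` is the accelerator name to fill in.

```python
from typing import Any, Dict


def migrate_gpu_kwargs(kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: rewrite number_of_gpus to the new argument pair."""
    migrated = dict(kwargs)  # copy so the caller's dict is left untouched
    if "number_of_gpus" in migrated:
        migrated["number_of_accelerators"] = migrated.pop("number_of_gpus")
        # Assumption: the old argument always meant GPUs.
        migrated.setdefault("accelerator_name", "GPU")
    return migrated


old_kwargs = {"node_pool_label": "node_pool", "number_of_gpus": 1}
new_kwargs = migrate_gpu_kwargs(old_kwargs)
print(new_kwargs)
```

The resulting dictionary could then be passed as keyword arguments to a component op in place of the old ones.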
6 changes: 4 additions & 2 deletions examples/pipelines/controlnet-interior-design/pipeline.py
@@ -45,15 +45,17 @@
"batch_size": 2,
"max_new_tokens": 50,
},
- number_of_gpus=1,
+ number_of_accelerators=1,
+ accelerator_name="GPU",
)
segment_images_op = ComponentOp.from_registry(
name="segment_images",
arguments={
"model_id": "openmmlab/upernet-convnext-small",
"batch_size": 2,
},
- number_of_gpus=1,
+ number_of_accelerators=1,
+ accelerator_name="GPU",
)

write_to_hub_controlnet = ComponentOp(
110 changes: 56 additions & 54 deletions examples/pipelines/datacomp/pipeline.py
@@ -41,64 +41,66 @@
"dataset_name": "nielsr/datacomp-small-with-text-embeddings",
"column_name_mapping": load_component_column_mapping,
"index_column": "uid",
# "n_rows_to_load": 1000,
"n_rows_to_load": 1000,
},
node_pool_label="node_pool",
node_pool_name="n2-standard-64-pool",
cache=False,
)
download_images_op = ComponentOp.from_registry(
name="download_images",
arguments={
"retries": 2,
"min_image_size": 0,
"max_aspect_ratio": float("inf"),
},
node_pool_label="node_pool",
node_pool_name="n2-standard-64-pool",
input_partition_rows=1000,
cache=False,
)
detect_text_op = ComponentOp(
component_dir="components/detect_text",
arguments={
"batch_size": 2,
},
node_pool_label="node_pool",
node_pool_name="model-inference-mega-pool",
number_of_accelerators=1,
accelerator_name="GPU",
cache=False,
)
mask_images_op = ComponentOp(
component_dir="components/mask_images",
node_pool_label="node_pool",
node_pool_name="n2-standard-64-pool",
cache=False,
)
embed_images_op = ComponentOp.from_registry(
name="image_embedding",
arguments={
"batch_size": 2,
},
node_pool_label="node_pool",
node_pool_name="model-inference-mega-pool",
number_of_accelerators=1,
accelerator_name="GPU",
cache=False,
)
add_clip_score_op = ComponentOp(
component_dir="components/add_clip_score",
node_pool_label="node_pool",
node_pool_name="n2-standard-64-pool",
cache=False,
)
filter_clip_score_op = ComponentOp(
component_dir="components/filter_clip_score",
arguments={
"pct_threshold": 0.5,
},
node_pool_label="node_pool",
node_pool_name="n2-standard-64-pool",
)
# download_images_op = ComponentOp.from_registry(
# name="download_images",
# arguments={
# "retries": 2,
# "min_image_size": 0,
# "max_aspect_ratio": float("inf"),
# },
# node_pool_label="node_pool",
# node_pool_name="n2-standard-64-pool",
# input_partition_rows=1000,
# cache=False,
# )
# detect_text_op = ComponentOp(
# component_dir="components/detect_text",
# arguments={
# "batch_size": 2,
# },
# node_pool_label="node_pool",
# node_pool_name="model-inference-mega-pool",
# number_of_gpus=1,
# cache=False,
# )
# mask_images_op = ComponentOp(
# component_dir="components/mask_images",
# node_pool_label="node_pool",
# node_pool_name="n2-standard-64-pool",
# cache=False,
# )
# embed_images_op = ComponentOp.from_registry(
# name="embed_images",
# arguments={
# "batch_size": 2,
# },
# node_pool_label="node_pool",
# node_pool_name="model-inference-mega-pool",
# number_of_gpus=1,
# cache=False,
# )
# add_clip_score_op = ComponentOp(
# component_dir="components/add_clip_score",
# node_pool_label="node_pool",
# node_pool_name="n2-standard-64-pool",
# cache=False,
# )
# filter_clip_score_op = ComponentOp(
# component_dir="components/filter_clip_score",
# arguments={
# "pct_threshold": 0.5,
# },
# node_pool_label="node_pool",
# node_pool_name="n2-standard-64-pool",
# )


# add ops to pipeline
pipeline.add_op(load_from_hub_op)
6 changes: 4 additions & 2 deletions examples/pipelines/finetune_stable_diffusion/pipeline.py
@@ -69,7 +69,8 @@
"batch_size": 2,
"max_new_tokens": 50,
},
- number_of_gpus=1,
+ number_of_accelerators=1,
+ accelerator_name="GPU",
)

write_to_hub = ComponentOp(
@@ -80,7 +81,8 @@
"hf_token": "hf_token",
"image_column_names": ["images_data"],
},
- number_of_gpus=1,
+ number_of_accelerators=1,
+ accelerator_name="GPU",
)

pipeline = Pipeline(
7 changes: 4 additions & 3 deletions pyproject.toml
@@ -33,15 +33,14 @@ classifiers = [
"Programming Language :: Python :: 3.8",
"Programming Language :: Python :: 3.9",
"Programming Language :: Python :: 3.10",
- "Programming Language :: Python :: 3.11",
"Topic :: Software Development",
"Topic :: Software Development :: Libraries",
"Topic :: Software Development :: Libraries :: Python Modules",
"Typing :: Typed",
]

[tool.poetry.dependencies]
- python = ">= 3.8"
+ python = ">= 3.8 < 3.11"
dask = {extras = ["dataframe", "distributed", "diagnostics"], version = ">= 2023.4.1"}
importlib-resources = { version = ">= 1.3", python = "<3.9" }
jsonschema = ">= 4.18"
@@ -51,14 +50,16 @@ fsspec = { version = ">= 2023.4.0", optional = true}
gcsfs = { version = ">= 2023.4.0", optional = true }
s3fs = { version = ">= 2023.4.0", optional = true }
adlfs = { version = ">= 2023.4.0", optional = true }
- kfp = { version = ">= 1.8.19, < 2", optional = true }
+ kfp = { version = "2.0.1", optional = true, extras =["kubernetes"] }
pandas = { version = ">= 1.3.5", optional = true }
+ google-cloud-aiplatform = { version = "1.32.0", optional = true}

[tool.poetry.extras]
aws = ["fsspec", "s3fs"]
azure = ["fsspec", "adlfs"]
gcp = ["fsspec", "gcsfs"]
kfp = ["kfp"]
+ vertex = ["kfp", "google-cloud-aiplatform"]

[tool.poetry.group.test.dependencies]
pre-commit = "^3.1.1"
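
The new interpreter constraint `>= 3.8 < 3.11` is a half-open range: 3.8 through 3.10 are accepted and 3.11 is excluded, which matches the 3.11 entry being dropped from the CI matrix and the classifiers. A minimal sketch of that range check follows; `python_supported` is a hypothetical helper for illustration, not something Poetry or Fondant provides.

```python
import sys
from typing import Tuple

# Bounds copied from the diff: python = ">= 3.8 < 3.11"
LOWER_BOUND = (3, 8)   # inclusive
UPPER_BOUND = (3, 11)  # exclusive


def python_supported(version: Tuple[int, int]) -> bool:
    """Hypothetical helper: True if (major, minor) lies in [3.8, 3.11)."""
    return LOWER_BOUND <= version < UPPER_BOUND


current = (sys.version_info.major, sys.version_info.minor)
print(python_supported(current))
```

Tuple comparison is lexicographic, so `(3, 10)` passes while `(3, 11)` and `(3, 7)` do not.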