Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds comet-ml plugin example #1702

Merged
merged 1 commit into from
Nov 13, 2024

Conversation

thomasjpfan
Copy link
Member

@thomasjpfan thomasjpfan commented Jul 3, 2024

This PR adds a comet-ml example.

To fully run the example, you need to use this ImageSpec, because the plugin is in a PR:

comet_ml_package = (
    "git+https://github.com/thomasjpfan/flytekit.git@"
    "d04d0129f99eb7e7edcffeae04d4f401f49be6d6#subdirectory=plugins/flytekit-comet-ml"
)

image = ImageSpec(
    builder="fast-builder",
    name="unionai",
    apt_packages=["git"],
    packages=[
        "torch==2.3.1",
        "comet-ml==3.43.2",
        "lightning==2.3.0",
        comet_ml_package,
        "unionai==0.1.43",
        "torchvision",
    ],
    registry="ghcr.io/thomasjpfan",
)

flyteorg/flytekit#2550

"comet-ml==3.43.2",
"lightning==2.3.0",
"flytekitplugins-comet-ml",
"unionai==0.1.43",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need the unionai dependency, right?
Should we add this in Union's doc instead of Open Source's doc?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a mistake. unionai is not needed.


REGISTRY = "localhost:30000"
image = ImageSpec(
name="unionai",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as below, we should use ghcr.io/flyteorg

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For ImageSpec, most users will not have push access to ghcr.io/flyteorg, so they will need to change the registry to run it.

Using localhost:30000 means I am optimizing for the sandbox use case. An alternative is to use os.getenv("REGISTRY", "localhost:30000"), and let it be an environment variable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem, I like your idea

Future-Outlier
Future-Outlier previously approved these changes Jul 17, 2024
Copy link
Member

@Future-Outlier Future-Outlier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM


REGISTRY = "localhost:30000"
image = ImageSpec(
name="unionai",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem, I like your idea

# dynamic-log-links:
# - comet-ml-execution-id:
# displayName: Comet
# templateUris: {{ .taskConfig.host }}/{{ .taskConfig.workspace }}/{{ .taskConfig.project_name }}/{{ .executionName }}{{ .nodeId }}{{ .taskRetryAttempt }}{{ .taskConfig.link_suffix }}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add double quote here.

# templateUris: {{ .taskConfig.host }}/{{ .taskConfig.workspace }}/{{ .taskConfig.project_name }}/{{ .executionName }}{{ .nodeId }}{{ .taskRetryAttempt }}{{ .taskConfig.link_suffix }}
# - comet-ml-custom-id:
# displayName: Comet
# templateUris: {{ .taskConfig.host }}/{{ .taskConfig.workspace }}/{{ .taskConfig.project_name }}/{{ .taskConfig.experiment_key }}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same thing

Future-Outlier
Future-Outlier previously approved these changes Jul 17, 2024
neverett
neverett previously approved these changes Jul 23, 2024
Copy link
Contributor

@neverett neverett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few suggestions, otherwise LGTM

.. tags:: Integration, Data, Metrics, Intermediate
```

Comet’s machine learning platform integrates with your existing infrastructure and tools so you can manage, visualize, and optimize models—from training runs to production monitoring. This plugin integrates Flyte with Comet by configuring links between the two platforms.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Comet’s machine learning platform integrates with your existing infrastructure and tools so you can manage, visualize, and optimize modelsfrom training runs to production monitoring. This plugin integrates Flyte with Comet by configuring links between the two platforms.
Comet’s machine learning platform integrates with your existing infrastructure and tools so you can manage, visualize, and optimize models from training runs to production monitoring. This plugin integrates Flyte with Comet by configuring links between the two platforms.

#
# # Comet Example
# Comet’s machine learning platform integrates with your existing infrastructure and
# tools so you can manage, visualize, and optimize models—from training runs to
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# tools so you can manage, visualize, and optimize modelsfrom training runs to
# tools so you can manage, visualize, and optimize models from training runs to


# %% [markdown]
# First, we specify the project and workspace that we will use with Comet's platform
# Please update `PROJECT_NAME` and `WORKSPACE` to the value associated with your account.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Please update `PROJECT_NAME` and `WORKSPACE` to the value associated with your account.
# Please update `PROJECT_NAME` and `WORKSPACE` to the values associated with your account.

@thomasjpfan
Copy link
Member Author

@neverett The comet-ml check is passing now: https://github.com/flyteorg/flytesnacks/actions/runs/10182511543/job/28165251450?pr=1702

There are a bunch other tests that are failing tho.

@neverett
Copy link
Contributor

neverett commented Aug 1, 2024

@thomasjpfan can you try merging master in and rerunning?

@fiedlerNr9
Copy link

fiedlerNr9 commented Aug 13, 2024

Running the example given in this PR was not working for me with ImageSpec.
MisconfigurationException: CometLogger requires either api_key or save_dir during initialization.
Execution on Dogfood

I had to add the following command to create a home dir & assign permissions for the flytekit user ->

image = ImageSpec(
    name="comet-ml",
    packages=[
        "torch==2.3.1",
        "comet-ml==3.43.2",
        "lightning==2.3.0",
        "flytekitplugins-comet-ml",
        "torchvision",
    ],
    commands=["mkdir -p /home/flytekit", "chown -R flytekit /home/flytekit"],
    registry=REGISTRY,
)

Anyone experiencing the same?

@thomasjpfan
Copy link
Member Author

For this example to work, it needs ImageSpec(builder="default", ...). The envd builder does not create a home directory for it's user.

ppiegaze
ppiegaze previously approved these changes Nov 11, 2024
Signed-off-by: Thomas J. Fan <[email protected]>
@thomasjpfan thomasjpfan force-pushed the thomasjpfan/comet-ml-init branch from 0c6983b to 8256865 Compare November 12, 2024 19:54
@thomasjpfan thomasjpfan enabled auto-merge (squash) November 13, 2024 15:30
@ppiegaze ppiegaze self-requested a review November 13, 2024 22:21
@thomasjpfan thomasjpfan merged commit 72aa5b2 into flyteorg:master Nov 13, 2024
61 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants