[Proposal]: Flyte System Tags and metadata #3320
Merged
11 commits
5742671 [Proposal]: Flyte System Tags and metadata (kumare3)
bc087ed added images (kumare3)
a5f6886 Adapt execution tags RFC with results from contributor's sync discuss… (fg91)
cfdba4f update doc (pingsutw)
1d049e3 Merge branch 'master' of github.com:flyteorg/flyte into flyte-tags (pingsutw)
dd27c18 update doc (pingsutw)
dbb14fd Update rfc/system/0001-flyte-execution-tags.md (pingsutw)
fca779e Update rfc/system/0001-flyte-execution-tags.md (pingsutw)
ba8efc9 Update rfc/system/0001-flyte-execution-tags.md (pingsutw)
b90586f Update rfc/system/0001-flyte-execution-tags.md (pingsutw)
9eb5e22 update doc (pingsutw)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,126 @@ | ||
# Flyte Execution Tags and Metadata | ||
|
||
**Authors:** | ||
|
||
- @kumare3 | ||
|
||
## 1 Executive Summary | ||
|
||
Flyte currently provides no way to visually group executions and other | |
entities other than by “project” and “domain”. A project is usually shared by a team of | |
engineers and/or researchers, who may be experimenting in | |
isolation or running multiple cohorts of experiments. In such cases, | |
grouping executions would make them easier to find and | |
to associate with one another. This document proposes how we could | |
improve the experience of discovering executions within Flyte. It also explains | |
how this feature could benefit other ecosystem projects. | |
|
||
## 2 Motivation | ||
|
||
As a user, I want to: | |
- Group a number of executions into an experiment group, for ease of debugging and discovery, when launching the workflows, while they are running, or after they have finished. | |
- Mark certain executions as “blessed” or “released”. This could be done by attaching a semantic version after the execution succeeds. | |
- Group all “sub launchplan” executions with the parent execution. | |
- Let external systems group executions based on some identifiers. | |
- Name executions without having to worry about character limits, uniqueness constraints, or a limited character set. | |
- Filter executions more simply. | |
- Remove tags from an execution after it has started or after it has finished. | |
|
||
## 3 Proposed Implementation | ||
|
||
### Support for tags | ||
|
||
We propose to solve the problem of discovery by supporting arbitrary metadata association with an entity. This is similar to the concept of “tags” in AWS. | |
Each tag is represented as a plain string. | |
We'll add tags to [ExecutionCreateRequest](https://docs.flyte.org/projects/flyteidl/en/latest/protos/docs/admin/admin.html#executioncreaterequest) -> [ExecutionSpec](https://docs.flyte.org/projects/flyteidl/en/latest/protos/docs/admin/admin.html#executionspec). | ||
|
||
The resulting tags will be persisted in the database rather than applied to the | |
execution in Kubernetes. We'll create two new tables in the flyteadmin database: | |
- ``execution_admin_tags`` is a join table that contains the admin_tags ID and the execution ID. | |
- ``admin_tags`` stores all the tag names. | |
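
The two-table layout described above could be sketched as follows. This is only an illustration under stated assumptions: the table names `admin_tags` and `execution_admin_tags` come from the RFC, but every column name, the `executions` stand-in table, and the helper functions are hypothetical, and SQLite is used here purely so the example runs on its own (it is not the flyteadmin database).

```python
import sqlite3

# Only the two tag table names come from the RFC. All column names, the
# "executions" stand-in table, and the helpers below are assumptions.
SCHEMA = """
CREATE TABLE executions (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE admin_tags (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE execution_admin_tags (
    execution_id INTEGER NOT NULL REFERENCES executions(id),
    admin_tag_id INTEGER NOT NULL REFERENCES admin_tags(id),
    PRIMARY KEY (execution_id, admin_tag_id)
);
"""


def tag_execution(conn, execution_id, tag_name):
    """Attach a tag to an execution, creating the tag row if it is new."""
    conn.execute("INSERT OR IGNORE INTO admin_tags (name) VALUES (?)", (tag_name,))
    (tag_id,) = conn.execute(
        "SELECT id FROM admin_tags WHERE name = ?", (tag_name,)
    ).fetchone()
    conn.execute(
        "INSERT OR IGNORE INTO execution_admin_tags (execution_id, admin_tag_id) "
        "VALUES (?, ?)",
        (execution_id, tag_id),
    )


def tags_for(conn, execution_id):
    """Return the tag names attached to one execution, sorted by name."""
    rows = conn.execute(
        "SELECT t.name FROM admin_tags t "
        "JOIN execution_admin_tags et ON et.admin_tag_id = t.id "
        "WHERE et.execution_id = ? ORDER BY t.name",
        (execution_id,),
    )
    return [name for (name,) in rows]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO executions (id, name) VALUES (1, 'exec-a')")
tag_execution(conn, 1, "hello")
tag_execution(conn, 1, "world")
tag_execution(conn, 1, "hello")  # attaching the same tag twice is a no-op
print(tags_for(conn, 1))
```

Because tags live in their own rows rather than on the Kubernetes object, removing or updating a tag later only touches the join table.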
|
||
As a first step, we recommend that these tags be | |
persisted in association with an execution, with the total number of tags per execution limited to 10-15. | |
The ListExecutions API will be updated to return: | |
- Executions filtered by tags, with support for `and` and `or` queries | |
- All tags associated with each execution | |
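
The `and` / `or` tag filtering mentioned above could behave roughly as in the sketch below. The function name, the `mode` parameter, and the in-memory data are hypothetical and exist only to illustrate the intended semantics; the RFC does not specify how ListExecutions expresses these queries.

```python
def filter_by_tags(executions, tags, mode="and"):
    """Return execution names whose tag sets match the requested tags.

    mode="and": the execution must carry every requested tag.
    mode="or":  the execution must carry at least one requested tag.
    (Hypothetical helper; not part of any Flyte API.)
    """
    wanted = set(tags)
    if mode == "and":
        return [name for name, have in executions.items() if wanted <= have]
    if mode == "or":
        return [name for name, have in executions.items() if wanted & have]
    raise ValueError(f"unsupported mode: {mode!r}")


# Illustrative data only: execution name -> set of tags.
executions = {
    "exec-a": {"hello", "world"},
    "exec-b": {"hello"},
    "exec-c": {"nightly"},
}

print(filter_by_tags(executions, ["hello", "world"], mode="and"))
print(filter_by_tags(executions, ["hello", "world"], mode="or"))
```

With `and`, only executions carrying both tags match; with `or`, any execution carrying either tag matches.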
|
||
For the second step, we will support attaching tags to projects, tasks, and workflows. In addition, | |
we will enable users to update the tags of an execution after it has been created. This will be done | |
through a flyteadmin API (a new endpoint for updating tags needs to be created). | |
|
||
Once this is implemented the UI and CLI can be updated to support these | ||
queries. | ||
|
||
### CLI Interface | ||
|
||
A workflow or task can be executed using | ||
|
||
```bash | ||
pyflyte run --remote --tags '["hello", "world"]' test.py wf --input1=10 | ||
``` | ||
(or equivalently in flytectl) | ||
|
||
flytectl and flyte remote can support filtering executions by tags. For example, | |
in flytectl: | |
```bash | ||
flytectl get execution -p flytesnacks -d development --filter.tags="hello,world" | ||
``` | ||
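
One way a command line could accept the `--tags '["hello", "world"]'` form shown above is sketched below with Python's standard `argparse`. This is not how pyflyte or flytectl parse arguments; the parser, the `collect_tags` helper, and the repeated `--tag` flag (an alternative to the JSON list) are all hypothetical and shown only to compare the two input styles.

```python
import argparse
import json


def parse_tags_json(value):
    """Parse a --tags value that is expected to be a JSON list of strings."""
    tags = json.loads(value)
    if not isinstance(tags, list) or not all(isinstance(t, str) for t in tags):
        raise argparse.ArgumentTypeError("--tags expects a JSON list of strings")
    return tags


# Hypothetical parser; it only models the two tag flags, nothing else.
parser = argparse.ArgumentParser(prog="run-sketch")
parser.add_argument("--tags", type=parse_tags_json, default=[],
                    help='JSON list, e.g. \'["hello", "world"]\'')
parser.add_argument("--tag", action="append", default=[],
                    help="single tag; may be repeated (assumed alternative)")


def collect_tags(argv):
    """Merge both flag styles into one ordered list without duplicates."""
    args = parser.parse_args(argv)
    merged = []
    for tag in args.tags + args.tag:
        if tag not in merged:
            merged.append(tag)
    return merged


print(collect_tags(["--tags", '["hello", "world"]']))
print(collect_tags(["--tag", "hello", "--tag", "world"]))
```

Both invocations yield the same tag list, so supporting the two styles side by side would mainly be a question of CLI ergonomics.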
|
||
### UI Interface | ||
All tags are treated the same way and allow search/filter and click-based grouping. | |
Users will get the regular executions view with all the tags available on each execution. | |
Users can filter executions simply by clicking on a tag, which | |
filters all executions by that tag. | |
|
||
![Grouping / Filtering UX](https://raw.githubusercontent.com/flyteorg/static-resources/main/flyte/rfc/tags/labels-filter.png) | ||
|
||
|
||
### Support for descriptions | ||
|
||
|
||
Users want to describe their executions, and once an experiment | |
succeeds they may want to add much more description and data about the | |
experiment. Thus, we propose to add descriptions to executions as well. | |
|
||
A description can be added when starting an execution: | |
```bash | ||
pyflyte run --remote --tags '["key1", "key2"]' --description "........" test.py wf --input1=10 | ||
``` | ||
|
||
It should be possible to provide the description as a Markdown file: | |
```bash | ||
pyflyte run --remote --tags '["key1", "key2"]' --description README.md test.py wf --input1=10 | ||
``` | ||
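
The two `--description` forms above (inline text and a Markdown file) imply some rule for telling them apart. The sketch below shows one possible rule and is purely an assumption: the RFC does not define it, and `resolve_description` is a hypothetical helper, not a pyflyte function.

```python
import tempfile
from pathlib import Path


def resolve_description(value: str) -> str:
    """Return the description text for a --description value (assumed rule)."""
    path = Path(value)
    # Assumption: a value naming an existing .md file is read as Markdown;
    # anything else is used verbatim as the description text.
    if path.suffix.lower() == ".md" and path.is_file():
        return path.read_text(encoding="utf-8")
    return value


# Illustrative usage with a throwaway Markdown file.
with tempfile.TemporaryDirectory() as tmp:
    readme = Path(tmp) / "README.md"
    readme.write_text("# Example experiment\nBaseline run.\n", encoding="utf-8")
    from_file = resolve_description(str(readme))
    literal = resolve_description("quick smoke test")

print(from_file)
print(literal)
```

A rule like this is ambiguous when a user wants the literal text `README.md` as the description, so a separate flag for file input would be a reasonable alternative design.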
|
||
It should be possible to add a description for an execution in the UI, after | ||
the execution has been created. | ||
![UI Descriptions](https://raw.githubusercontent.com/flyteorg/static-resources/main/flyte/rfc/tags/description-edit.png) | ||
|
||
## 4 Metrics & Dashboards | ||
NA | ||
|
||
## 5 Drawbacks | ||
It is important to understand that this may add a little more load to the | |
metadata database. A back-of-the-envelope calculation: | |
|
||
-> 1 million executions * 20 tags each. | |
-> Each tag is a plain string of 64 characters. | |
-> 1.28 * 10^9 bytes (assuming one byte per character) -> 1.28 GB | |
|
||
This is not significant, even though it will grow as the number of executions increases. | |
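
The back-of-the-envelope estimate above can be reproduced directly. The variable names are illustrative; the figures (1 million executions, 20 tags each, 64 one-byte characters per tag) are the ones stated in the estimate.

```python
# Figures taken from the estimate above; names are illustrative.
executions = 1_000_000
tags_per_execution = 20
bytes_per_tag = 64  # 64 characters at one byte per character

total_bytes = executions * tags_per_execution * bytes_per_tag
total_gb = total_bytes / 10**9  # decimal gigabytes

print(f"{total_bytes:,} bytes = {total_gb:.2f} GB")
```

This counts tag text once per execution-tag pair and ignores row and index overhead. If tag names are stored once in `admin_tags` and only referenced from the join table, the space taken by the names themselves would likely be smaller than this figure.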
|
||
## 6 Alternatives | ||
NA | ||
|
||
|
||
## 7 Potential Impact and Dependencies | ||
This is one of the most requested features in Flyte and will solve | ||
a lot of problems. | ||
|
||
|
||
## 8 Unresolved questions | ||
NA | ||
|
||
## 9 Conclusion | ||
With tags, users can easily discover their executions and other Flyte entities. | |
Storing tags in the database allows Flyte to easily search and filter executions, and also allows us to easily update or remove the tags of an execution after it has been created. |
Could we do `--tag hello --tag world` instead of providing a list of tags in string format?
maybe we can support both?
My opinion about this is not so strong that I'd say we need to support two options in case others prefer `--tags '["hello", "world"]'`. I personally find lists or JSON in string representation as CLI args a bit cumbersome.