-
Notifications
You must be signed in to change notification settings - Fork 680
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RFC] Config Override #3553
[RFC] Config Override #3553
Conversation
I've just read the linked issue (#475) and it seems like there is a lot of good conversation going on there. Should we finish the discussion on how to best achieve that in #475 and once things are ready copy the result over to this RFC? |
I've added my thoughts in #475 (comment) |
03-30-2023 Meeting notes. Byron Hsu: not sure how to implement this in the UI |
I agree with @bstadlbauer that when workflows become very nested/deep, this syntax will be very difficult to write for the user: wf_overrides = WFOverride(
n0=TaskOverride(
container_image="repo/image:0.0.1"
limits=Resource(cpu="2")
),
n1=WFOverride( # subwf
n0=TaskOverride(
container_image="repo/image:0.0.3"
limits=Resource(cpu="3")
)
)
) In addition, it will also be very complicated to come up with a way to show these overrides in the UI. This is why I proposed in the contributors meeting to create "named overrides" in order to avoid replicating the DAG structure again in the overrides. This could look e.g. like this: @task
def train(dataset_uri: str) -> str:
pass
@task
def evaluate(model_uri: str):
...
@workflow
def wf(...):
model1 = train(...).with_runtime_overrides("model_1_resources")
model2 = train(...).with_runtime_overrides("model_2_resources")
evaluate(model_uri = model1)
evaluate(model_uri = model2) This effectively means that at registration time the author of the workflow can mark the configs of specific nodes overridable when starting an execution. By marking them this way in the workflow and not the respective task decorator, the same task, in this case By naming the overrides, we also circumvent the nesting problem which would make it very simple to configure overrides in FlyteConsole: In addition, the FlyteConsole would not be overloaded when pre-filling the overridable configs since this is only possible for selected tasks, not the entire (potentially giant) workflow. Considerations: @task
def train(dataset_uri: str) -> str:
pass
@dynamic
def subwf(model_uri: str):
model = train(...).with_runtime_overrides("model_resources")
...
@workflow
def wf(...):
subwf() Let us consider a workflow where the runtime override is configured in a dynamic subworkflow. I don't know whether it would be possible when creating an execution of However, I feel it would be absolutely acceptable if the user specifies such a named override manually in the FlyteConsole when launching the execution and then when the dynamic subworkflow registers a task with a named override that matches a name the user specified when creating the execution, the override is applied. @hamersaw raised another concern: Let's say team 1 maintains this workflow: @task
def train_task_team1():
....
@workflow
def wf_team1():
train_task_team1().with_runtime_overrides("train_resources") And let's say team 2 maintains this workflow:
An engineer in team 2 might start an execution in the FlyteConsole and override the resources for their train task but potentially unknowingly also for the task of team 1. I would definitely say that the named overrides should only be valid within an execution. I personally am not too worried about this scenario but this might be a bigger issue in large organisations. In theory (without knowing at this point whether this is feasible) I could imagine something like this:
However, this might be going too far. |
Hi @fg91, you approach looks pretty neat. Can you add more details on how |
@fg91 I like this, the UX is great! Trying to take a step back here, it seems like this is pretty much a special case of "tagging" + matching on a tag (option 2 from #475 (comment)), right? So in theory, if I.e. on the backed, we could represent an override as something like (pseudocode; names all TBD): class NodeMatchCriteria:
task_name: str
node_name: Optional[str] = None
tags: Dict[str, str] = {}
function_arguments: Dict[str, Any] = {}
...
class ConfigOverride:
match: NodeMatchCriteria
resources: Resources
... I think if we could keep the matching a bit more general we could set us up for future usecases without too much additional implementation effort. One thing this would also allow (on the backend) is to re-run a workflow with override based on a node-name (as the node name is known from the UI in the second run). What do you think? |
My concern with "named override" method is that users can only override the configs on UI/CLI, but cannot override them inside the code. There are two folds of overriding in the code.
The second case would be extremely useful when reference workflows are shared across the team, but each team wants to tailor the config to their need.
|
To resolve to second case above, one possible way is to define a special input parameter (
|
I'm personally not a fan of passing task resources (or their overrides) as worklow/task arguments since this mixes task logic configuration with task resource config:
When being able to modify the code and not having to rely on overrides at runtime, what is the reason for not using the existing |
…/config-override
https://docs.google.com/document/d/1gaWU3lsa66APG2aD95_BeL9hFEsEsFiNGGj7KcQmt8c/edit?usp=sharing l listed down our usecase to better step back and think from the user side |
To resolve the concern that users cannot override them inside the code regarding @fg91's idea. Me and @pingsutw came up with a solution.
There are two ways they can override value:
|
That looks like a very nice UX to me @ByronHsu 👏 I think |
yeah that would be nice @fg91 |
@ByronHsu can we update the rfc itself with the proposal? |
@goyalankit What do you think ^? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @ByronHsu for being persistent here even though this took longer than anticipated! Let's make sure to get this across the finish line and start implementing soon 👍
I think the UX is great now! The part that is still unclear to me is how the TaskNodeConfigOverride
is tied to a particular task on the backend.
Two options would come to mind:
-
.with_runtime_override("task-yee")
would fill a new field (something likestring runtime_override_name
) in theTaskNode
proto. Then theWorkflowMetadata
, theLaunchPlanSpec
as well as theExecutionSpec
would have a new fieldmap<string, TaskNodeConfigOverride> runtime_overrides
. -
We add a new field
map<string, string> tags
toTaskNode
. Then.with_runtime_override("task-yee")
would set a"runtime_override_name": "task-yee"
tag on the node. TheTaskNodeConfigOverride
would then need to know what it applies to (similar toNodeMatchCriteria
in this comment).
While the UX would be the same, this would also allow for other matches (e.g. based on node name or like here) in the future, without us needing to change the flyteidl API.
Also, should we add a sentence about precedence of these? E.g. something like UI (=Execution) > LaunchPlan > Workflow?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I very much like the proposed UX now 👏
with_runtime_overrides
was the first name that came to my mind when I first had this idea. @kumare3 called it with_override_hook
(?) in the contributors meeting. Which name should we write into the RFC in the end?
2023-05-11: Agree on naming before merge, also address implementation concerns. Byron might need help to distribute implementation tasks. The suggestion is to work on backend first (propeller and flytekit first). TSC happy to help kickoff implementation. |
Here are all the different ways to override the config.
from flytekit import task, workflow, Resources
@task
def t1():
print("hello world")
@workflow
def sub_wf():
t1()
from flytekit import task, workflow, Resources
from wf2 import sub_wf
@task
def t0():
print("hello world")
@workflow()
def wf():
t0()
sub_wf()
# wf1.py
@workflow()
def wf():
t0().with_runtime_override("model_1_resources", runtime_override_config(cpu=1, mem="1Gi"))
...
# wf2.py
@workflow
def sub_wf():
t1().with_runtime_override("model_2_resources", runtime_override_config(cpu=1, mem="1Gi"))
# wf1.py
@workflow(override={"model_2_resources": runtime_override_config(cpu=2, mem="2Gi")})
def wf():
... FlyteIDL:
FlyteAdmin:
Flytekit:
@ByronHsu is going to update IDL. cc @fg91 @bstadlbauer would you like to help work on flytekit and admin? |
@pingsutw do we also want to override on |
@pingsutw @ByronHsu what do you think of this: # wf1.py
@workflow
def wf():
t0()
sub_wf().override={"model_2_resources": runtime_override_config(cpu=2, mem="2Gi")} instead of # wf1.py
@workflow(override={"model_2_resources": runtime_override_config(cpu=2, mem="2Gi")})
def wf(): ?
Yes, I think this would be very useful 👍 |
@fg91 I like that, too. It will be much easier to implement.
yup, we should support it as well |
what if users want to override a task under the subworkflow's subworkflow? and users want to do that inside
|
@ByronHsu You still do it at top-level workflow. old one: @workflow(override={"model_2_resources": runtime_override_config(cpu=2, mem="2Gi")})
def wf():
t0()
sub_wf() Fabio suggested: @workflow
def wf():
t0()
sub_wf().override={"model_2_resources": runtime_override_config(cpu=2, mem="2Gi")} |
got it. So you meant if
My idea for subworkflow is that they do
The args are |
sub_wf().with_runtime_override(...).with_runtime_override(...).with_runtime_override(...) |
We can provide 'with_runtime_override' to override a single task in a sub workflow and 'with_runtime_overrides' for multiple tasks in a sub workflow. |
I'm in favor of chaining (e.g. @pingsutw Happy to help out! |
Instead of chaining or two methods, one with sub_wf().with_runtime_override({"model_2_resources": runtime_override_config(cpu=1, mem="1Gi"), "model_3_resources": ...}) Also happy to take a ticket @pingsutw :) |
Also works of course 👍 |
To me the RFC looks ready to approve after the results of the latest discussions have been incorporated into the rfc doc itself 🚀 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is shaping up to be a great feature!
@ByronHsu Also ok with this after all changes discussed there are incorporated in the RFC |
Also, one question that came up during a first stab at implementing this - what would the UX for overriding |
@bstadlbauer how about passing a nested dict? |
Signed-off-by: byhsu <[email protected]>
5af675d
to
b62ec71
Compare
Tracking issue
#475
Describe your changes
Add RFC
Note to reviewers
Need discussion on the UI part!