-
Notifications
You must be signed in to change notification settings - Fork 300
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support mapping over pod tasks #510
Conversation
Signed-off-by: Katrina Rogan <[email protected]>
Signed-off-by: Katrina Rogan <[email protected]>
flytekit/core/map_task.py
Outdated
if self._run_task.task_type not in _K8S_POD_TARGET_TASK_TYPES: | ||
return None | ||
|
||
self._run_task.set_command_fn(self.get_command) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Struggled with the right way to do this. An alternative was serializing the run_task's k8s pod as is and updating the command with pyflyte-map-execute
after the fact but then that requires the map task to be familiar with the serialized structure of each non-container target type task, which didn't feel right.
Another option to avoid the _K8S_POD_TARGET_TASK_TYPES set is to call the corresponding method, (e.g. _run_task.get_k8s_pod(settings)
and then see if it's empty or not before returning in this method - but this seems (marginally) inefficient.
If you can think of a better way to do this, let me know
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is fair to call https://github.com/flyteorg/flytekit/blob/master/plugins/pod/flytekitplugins/pod/task.py#L103. This is because we now have K8sPodSpec as a target.
So ideally we should have one method per target. Or we could simply call it getTarget
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the only thing I would say is it should be at the base level. the base should have getContainer, getK8sPodSpec, getSQL, getHTTPAPI etc
and the semantics should be crisp and state that we will only invoke one of them.
Now the question is which one to invoke? and so maybe we need to invoke all, and all should return None, except the one that is needed by that plugin
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think for python proto serialization we can't just produce one getTarget method for the oneof field, unfortunately. So each task needs to implement each get_foo target option
re: invoking each core target and having each return None except for the applicable target - this sounds good. Refactored, PTAL @kumare3
@@ -278,3 +278,46 @@ def simple_pod_task(i: int): | |||
assert serialized.template.k8s_pod.metadata.labels == {"label": "foo"} | |||
assert serialized.template.k8s_pod.metadata.annotations == {"anno": "bar"} | |||
assert serialized.template.k8s_pod.pod_spec is not None | |||
|
|||
|
|||
def test_map_pod_task_serialization(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it doesn't make complete sense to add this test here as opposed to test_map_task in core/ - but then that introduces a plugin dependency in core tests and I'm not sure what the lesser evil is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i completely get it, the thing is K8sPodSpec now is a core plugin.
Maybe the core task should support taking a yaml / json and the k8spod plugin can continue using the python interface to improve it. This way the flytekit remains devoid of k8s dependency
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hm not sure if i'm understanding correctly, but you mean have serialize() formulate the k8sPod target from a PodSpec struct? I'm not really a fan of making code more confusing (in this case, splitting up how we populate the k8sPod target across methods) just to make testing easier
Codecov Report
@@ Coverage Diff @@
## master #510 +/- ##
==========================================
+ Coverage 85.22% 85.25% +0.02%
==========================================
Files 369 369
Lines 27871 27913 +42
Branches 2269 2269
==========================================
+ Hits 23752 23796 +44
- Misses 3497 3498 +1
+ Partials 622 619 -3
Continue to review full report at Codecov.
|
Signed-off-by: Katrina Rogan <[email protected]>
Signed-off-by: Katrina Rogan <[email protected]>
Signed-off-by: Katrina Rogan <[email protected]>
Signed-off-by: Katrina Rogan <[email protected]>
@contextmanager | ||
def prepare_target(self): | ||
""" | ||
Alters the underlying run_task command to modify it for map task execution and then resets it after. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
incorrect doc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be correct
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No I think this is fine... instead of mucking with the command at the map task layer (because Pod objects are too complicated to muck around with), the correct command to use is delegated to the underlying task to use as it wishes, and the properly formed Pod is returned.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed, a follow up issue to model tasktemplate.target
in flytekit.core
@kumare3 done: flyteorg/flyte#1122 |
Signed-off-by: Haytham Abuelfutuh <[email protected]>
Signed-off-by: wild-endeavor <[email protected]>
Signed-off-by: Haytham Abuelfutuh <[email protected]> Signed-off-by: wild-endeavor <[email protected]>
Signed-off-by: Katrina Rogan [email protected]
TL;DR
Updates map task serialization to support non-container target type tasks (e.g. pod tasks).
Type
Are all requirements met?
Complete description
How did you fix the bug, make the feature etc. Link to any design docs etc
Tracking Issue
flyteorg/flyte#1051
Follow-up issue
NA