Support mapping over pod tasks #510

Merged: 6 commits merged on Jun 10, 2021

Conversation

Contributor

@katrogan katrogan commented Jun 8, 2021

Signed-off-by: Katrina Rogan [email protected]

TL;DR

Updates map task serialization to support non-container target type tasks (e.g. pod tasks).

Type

  • Bug Fix
  • Feature
  • Plugin

Are all requirements met?

  • Code completed
  • Smoke tested
  • Unit tests added
  • Code documentation added
  • Any pending items have an associated Issue

Complete description

How did you fix the bug, make the feature etc. Link to any design docs etc

Tracking Issue

flyteorg/flyte#1051

Follow-up issue

NA

katrogan added 2 commits June 8, 2021 11:02
Signed-off-by: Katrina Rogan <[email protected]>
Signed-off-by: Katrina Rogan <[email protected]>
        if self._run_task.task_type not in _K8S_POD_TARGET_TASK_TYPES:
            return None

        self._run_task.set_command_fn(self.get_command)
Contributor Author

Struggled with the right way to do this. An alternative was serializing the run_task's k8s pod as is and updating the command with pyflyte-map-execute after the fact, but that requires the map task to be familiar with the serialized structure of each non-container target type task, which didn't feel right.

Another option to avoid the _K8S_POD_TARGET_TASK_TYPES set is to call the corresponding method (e.g. _run_task.get_k8s_pod(settings)) and then check whether the result is empty before returning in this method, but this seems (marginally) inefficient.

If you can think of a better way to do this, let me know.
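
For readers following the thread, here is a rough sketch of the approach in the snippet above: the map task checks the wrapped task's type against a set and, if it matches, hands over its own command function and lets the wrapped task build the pod. Only `_K8S_POD_TARGET_TASK_TYPES`, `set_command_fn`, `get_command`, `get_k8s_pod`, and `pyflyte-map-execute` come from this PR; the `ToyTask` and `ToyMapTask` classes, the `"toy-pod"` task type, the default command, and the dict-shaped pod are assumptions made purely for illustration.

```python
from typing import Callable, List, Optional

# Assumption: task-type strings whose serialized target is a k8s pod.
# "toy-pod" is a placeholder, not a real flytekit task type.
_K8S_POD_TARGET_TASK_TYPES = {"toy-pod"}


class ToyTask:
    """Hypothetical stand-in for the task wrapped by a map task."""

    def __init__(self, task_type: str):
        self.task_type = task_type
        # Placeholder default command used when nothing overrides it.
        self._command_fn: Callable[[], List[str]] = lambda: ["pyflyte-execute"]

    def set_command_fn(self, fn: Callable[[], List[str]]) -> None:
        self._command_fn = fn

    def get_k8s_pod(self) -> Optional[dict]:
        if self.task_type not in _K8S_POD_TARGET_TASK_TYPES:
            return None
        # The wrapped task owns the pod structure; it only asks for a command.
        return {"containers": [{"args": self._command_fn()}]}


class ToyMapTask:
    """Hypothetical stand-in for the map task."""

    def __init__(self, run_task: ToyTask):
        self._run_task = run_task

    def get_command(self) -> List[str]:
        return ["pyflyte-map-execute"]

    def get_k8s_pod(self) -> Optional[dict]:
        # Gate on the task type instead of probing the wrapped task's output.
        if self._run_task.task_type not in _K8S_POD_TARGET_TASK_TYPES:
            return None
        self._run_task.set_command_fn(self.get_command)
        return self._run_task.get_k8s_pod()
```

The point of the design is that the map task never needs to know where the command lives inside the serialized pod.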

Contributor

I think it is fair to call https://github.com/flyteorg/flytekit/blob/master/plugins/pod/flytekitplugins/pod/task.py#L103. This is because we now have K8sPodSpec as a target.
So ideally we should have one method per target. Or we could simply call it getTarget

Contributor

@kumare3 kumare3 Jun 10, 2021

The only thing I would say is that it should be at the base level. The base should have getContainer, getK8sPodSpec, getSQL, getHTTPAPI, etc.,
and the semantics should be crisp and state that we will only invoke one of them.

Now the question is which one to invoke. Maybe we need to invoke all of them, and all should return None except the one that is needed by that plugin.

Contributor Author

I think for python proto serialization we can't just produce one getTarget method for the oneof field, unfortunately. So each task needs to implement each get_foo target option

re: invoking each core target and having each return None except for the applicable target - this sounds good. Refactored, PTAL @kumare3
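
A minimal sketch of the "invoke every getter, expect exactly one non-None" semantics agreed on here, assuming a simplified base class. The class names, the `serialize_target` helper, and the dict payloads are hypothetical and are not flytekit's actual API; only the idea of one `get_*` method per target comes from this thread.

```python
from typing import Optional


class BaseTask:
    """Hypothetical base: one getter per target, each defaulting to None."""

    def get_container(self) -> Optional[dict]:
        return None

    def get_k8s_pod(self) -> Optional[dict]:
        return None

    def get_sql(self) -> Optional[dict]:
        return None


class ContainerTask(BaseTask):
    def get_container(self) -> Optional[dict]:
        return {"image": "example:latest"}


class PodTask(BaseTask):
    def get_k8s_pod(self) -> Optional[dict]:
        return {"pod_spec": {"containers": []}}


def serialize_target(task: BaseTask) -> dict:
    """Call every getter and keep the single target that is populated."""
    candidates = {
        "container": task.get_container(),
        "k8s_pod": task.get_k8s_pod(),
        "sql": task.get_sql(),
    }
    populated = {name: value for name, value in candidates.items() if value is not None}
    if len(populated) != 1:
        # Mirrors the oneof constraint: exactly one target may be set.
        raise ValueError(f"expected exactly one target, got {sorted(populated)}")
    return populated
```

With this shape the serializer does not need to know which plugin it is dealing with, at the cost of calling a few getters that return None.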

@@ -278,3 +278,46 @@ def simple_pod_task(i: int):
    assert serialized.template.k8s_pod.metadata.labels == {"label": "foo"}
    assert serialized.template.k8s_pod.metadata.annotations == {"anno": "bar"}
    assert serialized.template.k8s_pod.pod_spec is not None


def test_map_pod_task_serialization():
Contributor Author

it doesn't make complete sense to add this test here as opposed to test_map_task in core/ - but then that introduces a plugin dependency in core tests and I'm not sure what the lesser evil is.

Contributor

I completely get it; the thing is that K8sPodSpec is now a core plugin.
Maybe the core task should support taking a yaml / json, and the k8spod plugin can continue using the python interface to improve it. This way flytekit remains devoid of a k8s dependency.

Contributor Author

hm not sure if i'm understanding correctly, but you mean have serialize() formulate the k8sPod target from a PodSpec struct? I'm not really a fan of making code more confusing (in this case, splitting up how we populate the k8sPod target across methods) just to make testing easier

@katrogan katrogan changed the title from "wip map pod tasks" to "Support mapping over pod tasks" Jun 9, 2021

codecov bot commented Jun 9, 2021

Codecov Report

Merging #510 (fa4d655) into master (e348c1b) will increase coverage by 0.02%.
The diff coverage is 96.15%.

@@            Coverage Diff             @@
##           master     #510      +/-   ##
==========================================
+ Coverage   85.22%   85.25%   +0.02%     
==========================================
  Files         369      369              
  Lines       27871    27913      +42     
  Branches     2269     2269              
==========================================
+ Hits        23752    23796      +44     
- Misses       3497     3498       +1     
+ Partials      622      619       -3     
Impacted Files                                    Coverage Δ
plugins/tests/pod/test_pod.py                     95.20% <92.30%> (-0.38%) ⬇️
tests/flytekit/unit/core/test_serialization.py    92.10% <92.30%> (-0.03%) ⬇️
flytekit/core/map_task.py                         80.43% <100.00%> (+2.38%) ⬆️
flytekit/core/python_auto_container.py            83.33% <100.00%> (+1.41%) ⬆️
plugins/pod/flytekitplugins/pod/task.py           92.95% <0.00%> (+1.40%) ⬆️
flytekit/core/interface.py                        79.06% <0.00%> (+1.74%) ⬆️

Signed-off-by: Katrina Rogan <[email protected]>
@katrogan katrogan marked this pull request as ready for review June 9, 2021 21:55
katrogan added 3 commits June 9, 2021 15:00
Signed-off-by: Katrina Rogan <[email protected]>
Signed-off-by: Katrina Rogan <[email protected]>
Signed-off-by: Katrina Rogan <[email protected]>
    @contextmanager
    def prepare_target(self):
        """
        Alters the underlying run_task command to modify it for map task execution and then resets it after.
Contributor

incorrect doc?

Contributor Author

should be correct

Contributor

@wild-endeavor wild-endeavor Jun 10, 2021

No I think this is fine... instead of mucking with the command at the map task layer (because Pod objects are too complicated to muck around with), the correct command to use is delegated to the underlying task to use as it wishes, and the properly formed Pod is returned.
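
The delegation described here could look roughly like the sketch below: a context manager swaps the wrapped task's command function for the duration of serialization and restores it afterwards. `prepare_target`, `set_command_fn`, `get_command`, `get_k8s_pod`, and `pyflyte-map-execute` appear in this PR; `reset_command_fn`, the toy classes, the default command, and the dict-shaped pod are assumptions for illustration only.

```python
from contextlib import contextmanager
from typing import Callable, List


class ToyPodTask:
    """Hypothetical pod task that builds its own pod around a command."""

    def __init__(self) -> None:
        # Placeholder default command.
        self._default_command_fn: Callable[[], List[str]] = lambda: ["pyflyte-execute"]
        self._command_fn = self._default_command_fn

    def set_command_fn(self, fn: Callable[[], List[str]]) -> None:
        self._command_fn = fn

    def reset_command_fn(self) -> None:
        # Assumed helper: restore the command used outside of map execution.
        self._command_fn = self._default_command_fn

    def get_k8s_pod(self) -> dict:
        return {"containers": [{"args": self._command_fn()}]}


class ToyMapTask:
    def __init__(self, run_task: ToyPodTask) -> None:
        self._run_task = run_task

    def get_command(self) -> List[str]:
        return ["pyflyte-map-execute"]

    @contextmanager
    def prepare_target(self):
        # Temporarily point the wrapped task at the map task's command.
        self._run_task.set_command_fn(self.get_command)
        try:
            yield
        finally:
            # Always restore, even if building the pod raises.
            self._run_task.reset_command_fn()

    def get_k8s_pod(self) -> dict:
        with self.prepare_target():
            return self._run_task.get_k8s_pod()
```

Resetting in a `finally` block keeps the wrapped task usable on its own after the map task has been serialized.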

@kumare3 kumare3 self-requested a review June 10, 2021 21:35
Contributor

@kumare3 kumare3 left a comment

As discussed, a follow up issue to model tasktemplate.target in flytekit.core

Contributor Author

@kumare3 done: flyteorg/flyte#1122

@katrogan katrogan merged commit d52d4ff into master Jun 10, 2021
EngHabu pushed a commit that referenced this pull request Jun 25, 2021
Signed-off-by: Haytham Abuelfutuh <[email protected]>
wild-endeavor pushed a commit that referenced this pull request Jun 30, 2021
wild-endeavor pushed a commit that referenced this pull request Jun 30, 2021
Signed-off-by: Haytham Abuelfutuh <[email protected]>
Signed-off-by: wild-endeavor <[email protected]>