-
Notifications
You must be signed in to change notification settings - Fork 61
Add option to record whether a task ran on a spot instance. #82
Conversation
protos/flyteidl/event/event.proto
Outdated
// The instance type configured for interruptible tasks. | ||
INTERRUPTIBLE = 1; | ||
} | ||
InstanceType instance_type = 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can it ever run on both? if it gets interrupted three time and the fourth time we use a non-spot?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep it can, but that's why this is per task execution. if we retry on a non-spot node that retry attempt will be a distinct task execution and the instance type will change for its metadata
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
} | ||
|
||
// Holds metadata around how a task was executed. | ||
message TaskExecutionMetadata { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great, can we conform to - flyteorg/flyte#325
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hm I can add a freeform map as well? is there a specific reason this shouldn't have any structure?
protos/flyteidl/event/event.proto
Outdated
// Holds metadata around how a task was executed. | ||
message TaskExecutionMetadata { | ||
// Includes the type of machine used for this specific task execution. | ||
enum InstanceType { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this an enum? Is it because interruptible is a top level field in Task. I think that is fine,
I would like to add other things here like - TaskType, TaskPluginID
and then a map<string,string> of other data
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because i specifically wanted to define the set of possible values for instance type in particular :)
@kumare3 I'm not sure I understand why we want the task execution metadata to be specifically a free form map? in any case, if using an enum for instance type is acceptable can I keep this PR as is and let whoever takes on flyteorg/flyte#325 handle updating the proto message accordingly? I'm happy to take on that issue, but I'd rather make a separate PR to implement it :) |
protos/flyteidl/event/event.proto
Outdated
// The instance type configured for interruptible tasks. | ||
INTERRUPTIBLE = 1; | ||
} | ||
InstanceType instance_type = 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it called "Instance Type" does that not refer to the p316xlarge
stuff?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perhaps InstanceClass would be more appropriate (and less specific?) if it's preferable to use the p316xlarge stuff, then the freeform map may make more sense here too @EngHabu
TL;DR
Add option to record whether a task ran on a spot instance.
Type
Are all requirements met?
Complete description
https://docs.google.com/document/d/1nqAp5-DanklNvG6OeQ3uIacZYGvf0oXioz7CboP_CsU/edit
Tracking Issue
flyteorg/flyte#520
Follow-up issue
NA