Revising matchable resource documentation #257
Changes from 9 commits
3c4e619
1318c0c
7c113cd
cbe3568
e81c760
563fb43
70bdb44
c024505
db31dc4
16ead7c
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
apiVersion: v1 | ||
kind: ResourceQuota | ||
metadata: | ||
  name: project-quota | ||
  namespace: {{ namespace }} | ||
spec: | ||
  hard: | ||
    limits.cpu: {{ projectQuotaCpu }} | ||
    limits.memory: {{ projectQuotaMemory }} | ||
|
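To make the role of the ``{{ ... }}`` placeholders concrete, here is a minimal sketch of how such a template could be filled in. It is not Admin's actual implementation: ``fill_template`` is a hypothetical helper, and the sample values are the ones used in the ``curl`` example in the documentation diff.

```python
import re

# The template from this diff, with standard YAML indentation.
TEMPLATE = """\
apiVersion: v1
kind: ResourceQuota
metadata:
  name: project-quota
  namespace: {{ namespace }}
spec:
  hard:
    limits.cpu: {{ projectQuotaCpu }}
    limits.memory: {{ projectQuotaMemory }}
"""


def fill_template(template, values):
    """Replace every ``{{ name }}`` placeholder with values[name].

    Hypothetical helper for illustration only; raises KeyError when a
    placeholder has no matching value.
    """
    def substitute(match):
        key = match.group(1)
        if key not in values:
            raise KeyError(f"no value supplied for template variable {key!r}")
        return str(values[key])

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


rendered = fill_template(
    TEMPLATE,
    {
        "namespace": "projectname-staging",
        "projectQuotaCpu": "1000",
        "projectQuotaMemory": "5000Gi",
    },
)
print(rendered)
```

Failing loudly on a missing variable is a design choice of this sketch; whether Admin falls back to defaults or errors out in that case is not stated in this PR.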
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -4,36 +4,45 @@ | |
Configuring customizable resources | ||
################################## | ||
|
||
As the complexity of your user-base grows, you may find yourself tweaking resource assignments based on specific projects, domains and workflows. | ||
This document walks through how to use MatchableResource attributes to customize your workflow execution environment. | ||
As the complexity of your user base grows, you may find yourself tweaking resource assignments based on specific projects, domains and workflows. This document walks through how and in what ways you can configure your Flyte deployment. | ||
|
||
Flyteadmin allows for overrides of task resource request and limit defaults, kubernetes cluster resource configuration, | ||
dynamic task execution queues and specifying executions on specific kubernetes clusters. These can all be overriden for specific combinations | ||
of domain; domain and project; domain, project and workflow (name); and domain, project, workflow (name), and launch plan. | ||
|
||
*************************** | ||
Configurable Resource Types | ||
*************************** | ||
|
||
The proto definition is the definitive source of | ||
Flyte allows these custom settings along the following combination of dimensions | ||
|
||
- domain | ||
- project and domain | ||
- project, domain, and name (must be either a workflow name or a launch plan name) | ||
|
||
Please see the :ref:`concepts` document for more information on projects and domains. Along these dimensions, the following settings are configurable. Note that not all three of the combinations above are valid for each of these settings. | ||
|
||
- Defaults for task resource requests and limits (when not specified by the author of the task). | ||
- Settings for the cluster resource configuration that feeds into Admin's cluster resource manager. | ||
- Execution queues that are used for Dynamic Tasks. Read more about execution queues here, but effectively they're meant to be used with constructs like AWS Batch. | ||
- Determining how workflow executions get assigned to clusters in a multi-cluster Flyte deployment. | ||
|
||
The proto definition is the definitive source of which | ||
`matchable attributes <https://github.com/lyft/flyteidl/blob/master/protos/flyteidl/admin/matchable_resource.proto>`_ | ||
which can be customized. See below for a detailed explanation | ||
can be customized. | ||
|
||
Each of the four settings above is discussed below. Also, since the flyte-cli tool does not yet hit these endpoints, we are including some sample ``curl`` commands for administrators to reference. | ||
|
||
|
||
Task Resources | ||
============== | ||
|
||
This includes setting default value for task resource requests and limits for the following resources: | ||
|
||
- cpu | ||
|
||
- gpu | ||
|
||
- memory | ||
|
||
- storage | ||
|
||
In the absence of an override the global | ||
`default values <https://github.com/lyft/flyteadmin/blob/6a64f00315f8ffeb0472ae96cbc2031b338c5840/flyteadmin_config.yaml#L124,L134>`_ | ||
`default values <https://github.com/lyft/flyteadmin/blob/6a64f00315f8ffeb0472ae96cbc2031b338c5840/flyteadmin_config.yaml#L124,L134>`__ | ||
in the flyteadmin config are used. | ||
|
||
The override values from the database are assigned at execution time. | ||
|
@@ -42,11 +51,22 @@ The override values from the database are assigned at execution time. | |
Cluster Resources | ||
================= | ||
|
||
These are free-form key-value pairs which are used when creating project-domain based resources on Flyte kubernetes clusters. | ||
These are free-form key-value pairs which are used when filling in the templates that Admin feeds into its cluster manager. The keys represent templatized variables in `clusterresource template yaml <https://github.com/lyft/flyteadmin/tree/master/sampleresourcetemplates>`__ and the values are what you want to see filled in. | ||
|
||
The keys represent templatized variables in `clusterresource template yaml <https://github.com/lyft/flyteadmin/tree/master/sampleresourcetemplates>`_ | ||
In the absence of custom override values, templateData from the `flyteadmin config <https://github.com/lyft/flyteadmin/blob/6a64f00315f8ffeb0472ae96cbc2031b338c5840/flyteadmin_config.yaml#L154,L159>`__ is used as a default. | ||
|
||
In the absence of custom override values, templateData from the `flyteadmin config <https://github.com/lyft/flyteadmin/blob/6a64f00315f8ffeb0472ae96cbc2031b338c5840/flyteadmin_config.yaml#L154,L159>`_ is used as a default. | ||
Note that these settings can only take on domain, or a project and domain specificity. Project & domain together in Flyte form Kubernetes namespaces. Since Flyte has not tied in the notion of a workflow or a launch plan to any Kubernetes constructs, specifying a workflow or launch plan name doesn't make any sense. | ||
Review comment: this is not always true - this is now configurable (e.g. for the L5 use case namespaces are now domain only)
Reply: That is a good point, let me remove that. |
||
|
||
|
||
Command | ||
------- | ||
Running the following command makes it so that, when Admin fills in cluster resource templates, the K8s namespace ``projectname-staging`` has a resource quota of 1000 CPU cores and 5000Gi (roughly 5TB) of memory. | ||
|
||
.. code-block:: console | ||
|
||
curl --request PUT 'https://flyte.company.net/api/v1/project_domain_attributes/projectname/staging' --header 'Content-Type: application/json' --data-raw '{"attributes":{"matchingAttributes":{"clusterResourceAttributes":{"attributes":{"projectQuotaCpu": "1000", "projectQuotaMemory": "5000Gi"}}}}}' | ||
|
||
These values will in turn be used to fill in the template fields in the file at ``kustomize/overlays/sandbox/admindeployment/clusterresource-templates/ab_project-resource-quota.yaml`` from the base of this repository for the ``projectname-staging`` namespace and that namespace only. For other namespaces, the defaults at the bottom of `this file <https://github.com/lyft/flyteadmin/blob/6a64f00315f8ffeb0472ae96cbc2031b338c5840/flyteadmin_config.yaml#L152>`__ would still be applied. | ||
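For administrators who prefer scripting the override, the same request can be assembled programmatically. The sketch below mirrors the ``curl`` command using only the Python standard library; ``cluster_resource_override_request`` is a hypothetical helper name, and the host ``flyte.company.net`` is the placeholder from the command. Nothing is sent until the request is passed to ``urllib.request.urlopen``.

```python
import json
import urllib.request


def cluster_resource_override_request(base_url, project, domain, template_values):
    """Build (but do not send) a PUT request that sets cluster resource
    attributes for one project and domain.

    The URL path and JSON shape are copied from the curl command in this
    document; the function itself is an illustrative helper.
    """
    url = f"{base_url}/api/v1/project_domain_attributes/{project}/{domain}"
    body = {
        "attributes": {
            "matchingAttributes": {
                "clusterResourceAttributes": {
                    "attributes": dict(template_values),
                }
            }
        }
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


request = cluster_resource_override_request(
    "https://flyte.company.net",
    "projectname",
    "staging",
    {"projectQuotaCpu": "1000", "projectQuotaMemory": "5000Gi"},
)
print(request.get_method(), request.full_url)
# To actually apply the override: urllib.request.urlopen(request)
```

Keeping construction separate from sending makes it easy to inspect or log the payload before it reaches Admin.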
|
||
|
||
Execution Queues | ||
|
@@ -55,10 +75,17 @@ Execution Queues | |
Execution queues are used to determine where dynamic tasks run. | ||
|
||
Execution queues themselves are currently defined in the | ||
`flyteadmin config <https://github.com/lyft/flyteadmin/blob/6a64f00315f8ffeb0472ae96cbc2031b338c5840/flyteadmin_config.yaml#L97,L106>`_. | ||
`flyteadmin config <https://github.com/lyft/flyteadmin/blob/6a64f00315f8ffeb0472ae96cbc2031b338c5840/flyteadmin_config.yaml#L97,L106>`__. | ||
|
||
The **attributes** associated with an execution queue must match the **tags** for workflow executions. The tags are associated with configurable resources | ||
stored in the admin database. | ||
stored in the Admin database. | ||
|
||
Command | ||
------- | ||
|
||
.. code-block:: console | ||
|
||
curl --request PUT 'https://flyte.company.net/api/v1/workflow_attributes/project/domain/YourWorkflowName' --header 'Content-Type: application/json' --data-raw '{"attributes":{"matchingAttributes":{"executionQueueAttributes":{"tags":["my_queue"]}}}}' | ||
|
||
|
||
Execution Cluster Label | ||
|
@@ -99,19 +126,19 @@ Let's say that our database includes the following | |
+------------+--------------+----------+-------------+-----------+ | ||
|
||
Any inbound CreateExecution requests with **[Domain: Production, Project: widgetmodels, Workflow: Demand]** for any launch plan would have a tag value of "supply". | ||
Any inbound CreateExecution requests with **[Domain: Production, Project: widgetmodels]** for any workflow other than Deman and for any launch plan would have a tag value of "critical". | ||
Any inbound CreateExecution requests with **[Domain: Production, Project: widgetmodels]** for any workflow other than Demand and for any launch plan would have a tag value of "critical". | ||
|
||
All other inbound CreateExecution requests would use the default values specified in the flyteadmin config (if any). | ||
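The lookup described above can be read as "the most specific matching row wins". The sketch below illustrates that reading and is not Admin's actual code: ``resolve_tag`` is a hypothetical function, the two rows are only those implied by the sentences above (the full table is not reproduced in this diff), the lowercase key spelling is an assumption, and the launch plan dimension is left out for brevity.

```python
# Keys are (domain, project, workflow); an empty string means "any".
# Only the two rows implied by the surrounding text are included.
OVERRIDES = {
    ("production", "widgetmodels", "Demand"): "supply",
    ("production", "widgetmodels", ""): "critical",
}


def resolve_tag(domain, project, workflow, overrides=OVERRIDES, default=None):
    """Return the tag for an execution, trying the most specific key first.

    Assumed order: (domain, project, workflow), then (domain, project),
    then (domain), and finally the configured default.
    """
    candidates = (
        (domain, project, workflow),
        (domain, project, ""),
        (domain, "", ""),
    )
    for key in candidates:
        if key in overrides:
            return overrides[key]
    return default


print(resolve_tag("production", "widgetmodels", "Demand"))
print(resolve_tag("production", "widgetmodels", "Forecast"))
print(resolve_tag("staging", "otherproject", "Demand"))
```

Here ``Forecast`` and ``otherproject`` are made-up names standing in for "any other workflow" and "any other project"; the last call falls through to ``default``, which corresponds to the values from the flyteadmin config.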
|
||
********* | ||
Debugging | ||
********* | ||
|
||
Use the `get <https://github.com/lyft/flyteidl/blob/ba13965bcfbf7e7bfce40664800aaf1f2a1088a1/protos/flyteidl/service/admin.proto#L395>`_ endpoint | ||
Use the `get <https://github.com/lyft/flyteidl/blob/ba13965bcfbf7e7bfce40664800aaf1f2a1088a1/protos/flyteidl/service/admin.proto#L395>`__ endpoint | ||
Review comment: yikes, thanks for correcting all of these 🤦♀️ |
||
to see if overrides exist for a specific resource. | ||
|
||
E.g. `https://example.com/api/v1/project_domain_attributes/widgetmodels/production?resource_type=2 <https://example.com/api/v1/project_domain_attributes/widgetmodels/production?resource_type=2>`_ | ||
E.g. `https://example.com/api/v1/project_domain_attributes/widgetmodels/production?resource_type=2 <https://example.com/api/v1/project_domain_attributes/widgetmodels/production?resource_type=2>`__ | ||
|
||
To get the global state of the world, use the list all endpoint, e.g. `https://example.com/api/v1/matchable_attributes?resource_type=2 <https://example.com/api/v1/matchable_attributes?resource_type=2>`_. | ||
To get the global state of the world, use the list all endpoint, e.g. `https://example.com/api/v1/matchable_attributes?resource_type=2 <https://example.com/api/v1/matchable_attributes?resource_type=2>`__. | ||
|
||
The resource type enum (int) is defined in the `proto <https://github.com/lyft/flyteidl/blob/ba13965bcfbf7e7bfce40664800aaf1f2a1088a1/protos/flyteidl/admin/matchable_resource.proto#L8,L20>`_. | ||
The resource type enum (int) is defined in the `proto <https://github.com/lyft/flyteidl/blob/ba13965bcfbf7e7bfce40664800aaf1f2a1088a1/protos/flyteidl/admin/matchable_resource.proto#L8,L20>`__. |
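When debugging several projects, it can help to generate these URLs rather than type them. The following is a small sketch using the standard library; both function names are hypothetical, ``https://example.com`` is the placeholder host from the examples above, and ``resource_type`` is passed through as the integer enum value from the linked proto.

```python
from urllib.parse import quote, urlencode


def project_domain_attributes_url(base_url, project, domain, resource_type):
    """URL of the get endpoint for one project/domain and resource type."""
    query = urlencode({"resource_type": resource_type})
    path = "/api/v1/project_domain_attributes/{}/{}".format(
        quote(project, safe=""), quote(domain, safe="")
    )
    return f"{base_url}{path}?{query}"


def matchable_attributes_url(base_url, resource_type):
    """URL of the list-all endpoint for one resource type."""
    query = urlencode({"resource_type": resource_type})
    return f"{base_url}/api/v1/matchable_attributes?{query}"


print(project_domain_attributes_url("https://example.com", "widgetmodels", "production", 2))
print(matchable_attributes_url("https://example.com", 2))
```

The resulting URLs can be fetched with ``curl`` or a browser, exactly like the hand-written examples.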
Review comment: nit: doesn't need the ``ab_``; we just do that so that namespaces get created first (golang reads the files in alphabetically)
Reply: Hmm... when these are done as config maps then, what order are they run in?
Reply: the config maps are mounted as files here, no?