From 59455feed595397a46f4e39266bcdc18d586147f Mon Sep 17 00:00:00 2001 From: Jerop Date: Tue, 28 Jun 2022 12:02:04 -0400 Subject: [PATCH] TEP-0086: Larger Results via Sidecar Logs We propose that we provide the multiple solutions, all guarded behind a `larger-results` feature flag, so that we can experiment and figure out a way forward. These gated solutions will be alpha, and can be changed or removed at any time. In this change, we propose experimenting with Sidecar Logs as a solution for providing larger Results within the CRDs. This will be enabled by setting `larger-results`: `"sidecar-logs"`. This solution can be changed at any time, or removed completely. This will give us an opportunity to gather user feedback and find ways to address the concerns, as we figure out a way forward. /kind tep --- ...ng-the-way-result-parameters-are-stored.md | 81 +++++++++++++++---- teps/README.md | 2 +- 2 files changed, 66 insertions(+), 17 deletions(-) diff --git a/teps/0086-changing-the-way-result-parameters-are-stored.md b/teps/0086-changing-the-way-result-parameters-are-stored.md index 33c7f2a57..49d7ad806 100644 --- a/teps/0086-changing-the-way-result-parameters-are-stored.md +++ b/teps/0086-changing-the-way-result-parameters-are-stored.md @@ -2,7 +2,7 @@ status: proposed title: Changing the way result parameters are stored creation-date: '2021-09-27' -last-updated: '2022-06-09' +last-updated: '2022-08-01' authors: - '@tlawrie' - '@imjasonh' @@ -10,6 +10,8 @@ authors: - '@pritidesai' - '@tomcli' - '@ScrapCodes' +- '@chitrangpatel' +- '@jerop' --- # TEP-0086: Changing the way result parameters are stored @@ -21,7 +23,7 @@ authors: - [Non-Goals](#non-goals) - [Use Cases (optional)](#use-cases-optional) - [Requirements](#requirements) - - [Required](#required) + - [Proposal](#proposal) - [Alternatives](#alternatives) - [Result Sidecar - Upload results from sidecar](#result-sidecar---upload-results-from-sidecar) - [Option: Supporting multiple sidecars](#option-supporting-multiple-sidecars) @@ -44,7 +46,10 @@ authors: - [Store results on PVCs](#store-results-on-pvcs) - [No change. Use workspaces.](#no-change-use-workspaces) - [Repurpose Artifact Storage API](#repurpose-artifact-storage-api) - - [Using logs emitted by the Task](#using-logs-emitted-by-the-task) + - [Sidecar Logs](#sidecar-logs) + - [Feature Gates for Sidecar Logs](#feature-gates-for-sidecar-logs) + - [Design Details of Sidecar Logs](#design-details-of-sidecar-logs) + - [Caveats of Sidecar Logs](#caveats-of-sidecar-logs) - [Infrastructure Needed (optional)](#infrastructure-needed-optional) - [Upgrade & Migration Strategy (optional)](#upgrade--migration-strategy-optional) - [Implementation Pull request(s)](#implementation-pull-requests) @@ -115,6 +120,19 @@ Additionally, this will help projects that wrap/abstract Tekton where users unde * Define a clear upper limit on the expected maximum size of a result. * Support an environment where executing pods are not permitted to make network connections within the cluster. +## Proposal + +There's no solution that will meet all the [requirements](#requirements) described above. There are several efforts to +build proof of concepts for different solutions described in [alternatives](#alternatives). We propose that we provide +the multiple solutions, all guarded behind a `larger-results` feature flag, so that we can experiment and figure out a +way forward. These gated solutions will be alpha, and can be changed or removed at any time. + +These are the proof of concepts we'll provide and their feature flags: + +| Solution | Feature Gate | Description | +|-------------------------------|----------------------------------|------------------------------------------------------------------------| +| [Sidecar Logs](#sidecar-logs) | `larger-results`: `sidecar-logs` | Use stdout logs from a dedicated `Sidecar` to return a `Result` object | + ## Alternatives ### Result Sidecar - Upload results from sidecar @@ -374,21 +392,47 @@ Cons: - [Docs on setting up storage](https://github.com/tektoncd/pipeline/blob/main/docs/install.md#configuring-pipelineresource-storage) - [Interface](https://github.com/tektoncd/pipeline/blob/main/pkg/artifacts/artifacts_storage.go#L39-L47) -### Using logs emitted by the Task +### Sidecar Logs - - We are also exploring using **stdout logs from a dedicated sidecar to return a json result object** as a simpler way -to support larger TaskResults, but we need to explore this in a POC as we suspect we may not be able to rely on logs for this. - - The controller would wait for the sidecar to exit and then read the logs based on a particular query and append info to the TaskRun - - Potential to use a CloudEvent object to wrap result object +We propose using **stdout logs from a dedicated `Sidecar` to return a json `Result` object** to support larger +`Results`. The controller would wait for the `Sidecar` to exit and then read the logs based on a particular query and +append info to the `TaskRun`. -Cons: -- No guarantees on when logs will be available (would have to wait for this before considering a TaskRun complete) -- No guarantee you'll be able to access a log before it disappears (e.g. logs will not be available via the k8s API - once a pod is deleted) -- The storage of the result parameter may still be limited by a number of scenarios, including: - - [1.5 MB CRD size](https://github.com/kubernetes/kubernetes/issues/82292) - - The total size of the PipelineRun _if_ the TaskRun content is included, however - [TEP-100 is removing this](https://github.com/tektoncd/community/blob/main/teps/0100-embedded-taskruns-and-runs-status-in-pipelineruns.md) +We propose injecting a dedicated `Sidecar` alongside the `Steps` which will watch the `Results` path of the `Steps`. +This `Sidecar` will output the name of the `Result` and its contents to stdout in a parsable pattern. The `TaskRun` +controller will access the stdout logs of the `Sidecar`, extract the `Result` and its contents during reconciliation. +After the `Steps` have terminated, the `TaskRun` controller will attach the `Results` from the logs of the `Sidecar` +instead of using the termination message (hence bypassing the 4K limit) to update the `Results` fields in the CRD. +This method keeps the rest of the current functionality identical and does not require any external storage mechanism. +For further details, see the [demonstration][demo] of the proof of concept. + +#### Feature Gates for Sidecar Logs + +This solution will be gated using a `larger-results` feature flag which users can set to `"sidecar-logs"` to enable it. +This provides an opportunity to experiment with this solution to provide `Results` within the CRDs as we figure out +the solutions for external storage. + +In [TEP-0100][tep-0100] we proposed changes to `PipelineRun` status to reduce the amount of information stored about +the status of `TaskRuns` and `Runs` to improve performance, reduce memory bloat and improve extensibility. Now that +those changes have been implemented, the `PipelineRun` status is set up to handle larger `Results` without +exacerbating the performance and storage issues that were there before. For `ChildReferences` to be populated, the +`embedded-status` must be set to `"minimal"`. Thus, will require that minimal embedded status is enabled during the +migration until it becomes the only supported status. This requirement also makes the behavior clearer to users - +auto-adding the minimal status when users have not enabled it makes the user experience opaque and surprising. + +#### Design Details of Sidecar Logs + +In order to parse the `Results` volume and output the contents to stdout in the `Container`, the `Sidecar` will run +a script that: +- is auto-generated based on the required `Results` - identified from `taskSpec.results` field. +- periodically checks for files in the directory `/tekton/results`. +- when a `Result` is found, it prints it to stdout in a parsable pattern. +- When all the expected `Results` are found, it breaks out of the periodic loop and exits. + +#### Caveats of Sidecar Logs + +- The storage of the result parameter may still be limited by the maximum allowed [CRD size][crd-size]. Note that the + maximum size is shared among all the `Results` of the `TaskRun` and all its other contents (e.g. `Status`). ### Using embedded storage client to store result files in remote storage @@ -452,3 +496,8 @@ It will be a quick reference for those looking for implementation of this TEP. - [HackMD Result Collector Sidecar Design](https://hackmd.io/a6Kl4oS0SaOyBqBPTirzaQ) - [TEP-0086 Design Breakout Session Recording](https://drive.google.com/file/d/1lIqyy1RyZMYOrMCC2CLZD8eOf0NrVeDb/view?usp=sharing) - [TEP-0086 Design Breakout Session Notes](https://hackmd.io/YU_g27vRS2S5DwfBXDGpYA?view) + +[crd-size]: https://github.com/kubernetes/kubernetes/issues/82292 +[tep-0100]: ./0100-embedded-taskruns-and-runs-status-in-pipelineruns.md +[demo]: https://drive.google.com/file/d/14tDHNgpzOZ--5nMsOsTBhxsDgDDM_7iQ/view?t=1h01m41s +[tektoncd/pipeline#3497]: https://github.com/tektoncd/pipeline/issues/3497 diff --git a/teps/README.md b/teps/README.md index 1c476b0f8..6756c3264 100644 --- a/teps/README.md +++ b/teps/README.md @@ -238,7 +238,7 @@ This is the complete list of Tekton teps: |[TEP-0083](0083-scheduled-and-polling-runs-in-tekton.md) | Scheduled and Polling runs in Tekton | proposed | 2021-09-13 | |[TEP-0084](0084-endtoend-provenance-collection.md) | end-to-end provenance collection | proposed | 2021-09-16 | |[TEP-0085](0085-per-namespace-controller-configuration.md) | Per-Namespace Controller Configuration | proposed | 2021-10-14 | -|[TEP-0086](0086-changing-the-way-result-parameters-are-stored.md) | Changing the way result parameters are stored | proposed | 2022-06-09 | +|[TEP-0086](0086-changing-the-way-result-parameters-are-stored.md) | Changing the way result parameters are stored | proposed | 2022-08-01 | |[TEP-0088](0088-result-summaries.md) | Tekton Results - Record Summaries | proposed | 2021-10-01 | |[TEP-0089](0089-nonfalsifiable-provenance-support.md) | Non-falsifiable provenance support | implementable | 2022-01-18 | |[TEP-0090](0090-matrix.md) | Matrix | implementable | 2022-06-22 |