diff --git a/mkdocs/docs/concepts/buildingblocks.md b/mkdocs/docs/concepts/buildingblocks.md index e7f166423..40239426e 100644 --- a/mkdocs/docs/concepts/buildingblocks.md +++ b/mkdocs/docs/concepts/buildingblocks.md @@ -39,7 +39,7 @@ When two or more versions participate in an experiment, Iter8 **recommends a ver ## Metrics -Metric backends like Prometheus, New Relic, Sysdig and Elastic collect metrics for deployed versions and serve them through REST APIs. Iter8 defines a new Kubernetes resource called **Metric** that makes it easy to use metrics in experiments from any RESTful metrics backend. +Metric providers like **Prometheus**, **New Relic**, **Sysdig** and **Elastic** can collect metrics for versions and serve them through REST APIs. Iter8 defines a new Kubernetes resource called **Metric** that makes it easy to use metrics in experiments from these and any other RESTful metrics provider. ## Objectives diff --git a/mkdocs/docs/concepts/features.md b/mkdocs/docs/concepts/features.md index f9098e71c..5197b621f 100644 --- a/mkdocs/docs/concepts/features.md +++ b/mkdocs/docs/concepts/features.md @@ -14,7 +14,7 @@ Iter8 makes it easy to achieve the following goals. - **Progressive** traffic shifting. - **Dark launches**, **traffic mirroring** and **traffic segmentation**. - Use Helm, Kustomize, and plain YAML/JSON for defining your app manifests. -- Use out-of-the-box metrics or define custom metrics based on data in **Prometheus**. +- Use metrics from any RESTful provider including **Prometheus**, **New Relic**, **Sysdig**, and **Elastic**. - Statistically rigorous evaluation of versions, traffic splitting, and promotion/rollback decisions using **Bayesian learning** and **multi-armed bandit** algorithms. - Observe experiments in realtime. 
diff --git a/mkdocs/docs/getting-started/install.md b/mkdocs/docs/getting-started/install.md index f4ee16d5f..d2306e79c 100644 --- a/mkdocs/docs/getting-started/install.md +++ b/mkdocs/docs/getting-started/install.md @@ -5,43 +5,18 @@ title: Installation # Installation -## Step 1: Iter8 +## Iter8 -!!! example "Prerequisites" - - 1. A Kubernetes cluster - 2. [kubectl CLI](https://kubernetes.io/docs/tasks/tools/install-kubectl/) - -Install Iter8 in your Kubernetes cluster as follows. +Install Iter8 in your Kubernetes cluster as follows. This installation requires [`kubectl`](https://kubernetes.io/docs/tasks/tools/install-kubectl/). ```shell -export TAG=v0.3.2 +export TAG=v0.4.3 curl -s https://raw.githubusercontent.com/iter8-tools/iter8-install/main/install.sh | bash ``` -## (Optional) Step 2: Prometheus add-on - -Install Iter8's Prometheus add-on in your cluster as follows. This step assumes you have installed Iter8 following Step 1 above. - -```shell -export TAG=v0.3.2 -curl -s https://raw.githubusercontent.com/iter8-tools/iter8-install/main/install-prom-add-on.sh | bash -``` - -??? note "Running Iter8 tutorials without Iter8's Prometheus add-on" - When you installed Iter8 in the first step above, you also installed several *out-of-the-box* Iter8 metric resources. They are required for running the tutorials documented on this site. - - The out-of-the-box metric resources have a urlTemplate field. This field is configured as the URL of the Prometheus instance created in this step. - - You can skip this step and still run Iter8 tutorials using your own Prometheus instance. To do so, ensure that your Prometheus instance scrapes the end-points that would have been scraped by the Prometheus instance created in this step, and configure the urlTemplate fields of Iter8 metric resources to match the URL of your Prometheus instance. - -## (Optional) Step 3: iter8ctl -The iter8ctl CLI enables real-time observability of Iter8 experiments. - -!!! 
example "Prerequisites" - - Go 1.13+ +## iter8ctl (optional) +The `iter8ctl` client facilitates real-time observability of Iter8 experiments. Install `iter8ctl` on your local machine as follows. This installation requires Go 1.13+. ```shell -GO111MODULE=on GOBIN=/usr/local/bin go get github.com/iter8-tools/iter8ctl@v0.1.2 +GO111MODULE=on GOBIN=/usr/local/bin go get github.com/iter8-tools/iter8ctl@v0.1.3 ``` \ No newline at end of file diff --git a/mkdocs/docs/getting-started/quick-start/with-knative.md b/mkdocs/docs/getting-started/quick-start/with-knative.md index 6bfcb0ce9..b37701301 100644 --- a/mkdocs/docs/getting-started/quick-start/with-knative.md +++ b/mkdocs/docs/getting-started/quick-start/with-knative.md @@ -251,7 +251,7 @@ Observe the experiment in realtime. Paste commands from the tabs below in separa === "iter8ctl" Install `iter8ctl`. You can change the directory where `iter8ctl` binary is installed by changing `GOBIN` below. ```shell - GO111MODULE=on GOBIN=/usr/local/bin go get github.com/iter8-tools/iter8ctl@v0.1.2 + GO111MODULE=on GOBIN=/usr/local/bin go get github.com/iter8-tools/iter8ctl@v0.1.3 ``` Periodically describe the experiment. @@ -261,7 +261,7 @@ Observe the experiment in realtime. Paste commands from the tabs below in separa sleep 4 done ``` - ??? info "iter8ctl output" + ??? info "Look inside `iter8ctl` output" The `iter8ctl` output will be similar to the following. 
```shell ****** Overview ****** diff --git a/mkdocs/docs/images/illustration.png b/mkdocs/docs/images/illustration.png index 8b6728836..e28ff787a 100644 Binary files a/mkdocs/docs/images/illustration.png and b/mkdocs/docs/images/illustration.png differ diff --git a/mkdocs/docs/images/whatisiter8.png b/mkdocs/docs/images/whatisiter8.png index d468b5266..24f2f5e69 100644 Binary files a/mkdocs/docs/images/whatisiter8.png and b/mkdocs/docs/images/whatisiter8.png differ diff --git a/mkdocs/docs/metrics/defining-iter8-metrics.md b/mkdocs/docs/metrics/defining-iter8-metrics.md new file mode 100644 index 000000000..2bc3b459d --- /dev/null +++ b/mkdocs/docs/metrics/defining-iter8-metrics.md @@ -0,0 +1,645 @@ +--- +template: main.html +--- + +# Defining Iter8 Metrics + +This document describes how you can create Iter8 metrics and (optionally) supply authentication information that may be required by the metrics provider. + +Metric providers differ in the following aspects. + +* HTTP request authentication method: no authentication, basic auth, API keys, or bearer token +* HTTP request method: GET or POST +* Format of HTTP parameters and/or JSON body used while querying them +* Format of the JSON response returned by the provider +* The logic used by Iter8 to extract the metric value from the JSON response + +The examples in this document focus on Prometheus, New Relic, Sysdig, and Elastic. However, the principles illustrated here will enable you to use metrics from any provider in experiments. + +## Defining metrics + +> **Note:** Metrics are defined by you, the **Iter8 end-user**. + +=== "Prometheus" + + Prometheus does not support any authentication mechanism *out-of-the-box*. However, Prometheus can be set up in conjunction with a reverse proxy, which in turn can support HTTP request authentication, as described [here](https://prometheus.io/docs/guides/basic-auth/). + + === "No Authentication" + The following is an example of an Iter8 metric with Prometheus as the provider. 
This example assumes that Prometheus can be queried by Iter8 without any authentication. + + ```yaml linenums="1" + apiVersion: iter8.tools/v2alpha2 + kind: Metric + metadata: + name: request-count + spec: + description: A Prometheus example + provider: prometheus + params: + - name: query + value: >- + sum(increase(revision_app_request_latencies_count{service_name='${name}',${userfilter}}[${elapsedTime}s])) or on() vector(0) + type: Counter + jqExpression: ".data.result[0].value[1] | tonumber" + urlTemplate: http://myprometheusservice.com/api/v1 + ``` + + === "Basic auth" + Suppose Prometheus is set up to enforce basic auth with the following credentials: + + ```yaml + username: produser + password: t0p-secret + ``` + + You can enable Iter8 to query this Prometheus instance as follows. + + 1. **Create secret:** Create a Kubernetes secret that contains the authentication information. In particular, this secret needs to have the `username` and `password` fields in the `data` section with correct values. + ```shell + kubectl create secret generic promcredentials -n myns --from-literal=username=produser --from-literal=password=t0p-secret + ``` + + 2. **Create RBAC rule:** Provide the required permissions for Iter8 to read this secret. The service account `iter8-analytics` in the `iter8-system` namespace will have permissions to read secrets in the `myns` namespace. + ```shell + kubectl create rolebinding iter8-cred --clusterrole=iter8-secret-reader-analytics --serviceaccount=iter8-system:iter8-analytics --namespace=myns + ``` + + 3. **Define metric:** When defining the metric, ensure that the `authType` field is set to `Basic` and the appropriate `secret` is referenced. 
+ + ```yaml linenums="1" + apiVersion: iter8.tools/v2alpha2 + kind: Metric + metadata: + name: request-count + spec: + description: A Prometheus example + provider: prometheus + params: + - name: query + value: >- + sum(increase(revision_app_request_latencies_count{service_name='${name}',${userfilter}}[${elapsedTime}s])) or on() vector(0) + type: Counter + authType: Basic + secret: myns/promcredentials + jqExpression: ".data.result[0].value[1] | tonumber" + urlTemplate: https://my.secure.prometheus.service.com/api/v1 + ``` + + ??? hint "Brief explanation of the `request-count` metric" + 1. Prometheus enables metric queries using HTTP GET requests. `GET` is the default value for the `method` field of an Iter8 metric. This field is optional; it is omitted in the definition of `request-count`, and defaults to `GET`. + 2. Iter8 will query Prometheus during each iteration of the experiment. In each iteration, Iter8 will use `n` HTTP queries, one per version, to fetch metric values, where `n` is the number of versions in the experiment[^2]. + 3. The HTTP query used by Iter8 contains a single query parameter named `query` as [required by Prometheus](https://prometheus.io/docs/prometheus/latest/querying/api/). The value of this parameter is derived by [substituting the placeholders](#placeholder-substitution) in the value string. + 4. The `jqExpression` enables Iter8 to extract the metric value from the JSON response returned by Prometheus. + 5. The `urlTemplate` field provides the URL of the Prometheus service. + +=== "New Relic" + New Relic uses API keys to authenticate requests as documented [here](https://docs.newrelic.com/docs/apis/rest-api-v2/get-started/introduction-new-relic-rest-api-v2/). The API key may be directly embedded within the Iter8 metric, or supplied as part of a Kubernetes secret. + + === "API key embedded in metric" + The following is an example of an Iter8 metric with New Relic as the provider. 
In this example, `t0p-secret-api-key` is the New Relic API key. + + ```yaml linenums="1" + apiVersion: iter8.tools/v2alpha2 + kind: Metric + metadata: + name: name-count + spec: + description: A New Relic example + provider: newrelic + params: + - name: nrql + value: >- + SELECT count(appName) FROM PageView WHERE revisionName='${revision}' SINCE ${elapsedTime} seconds ago + type: Counter + headerTemplates: + - name: X-Query-Key + value: t0p-secret-api-key + jqExpression: ".results[0].count | tonumber" + urlTemplate: https://insights-api.newrelic.com/v1/accounts/my_account_id + ``` + + === "API key embedded in secret" + Suppose your New Relic API key is `t0p-secret-api-key`; you wish to store this API key in a Kubernetes secret, and reference this secret in an Iter8 metric. You can do so as follows. + + 1. **Create secret:** Create a Kubernetes secret containing the API key. + ```shell + kubectl create secret generic nrcredentials -n myns --from-literal=mykey=t0p-secret-api-key + ``` + The above secret contains a data field named `mykey` whose value is the API key. The data field name (which can be any string of your choice) will be used in Step 3 below as a placeholder. + + 2. **Create RBAC rule:** Provide the required permissions for Iter8 to read this secret. The service account `iter8-analytics` in the `iter8-system` namespace will have permissions to read secrets in the `myns` namespace. + ```shell + kubectl create rolebinding iter8-cred --clusterrole=iter8-secret-reader-analytics --serviceaccount=iter8-system:iter8-analytics --namespace=myns + ``` + + 3. **Define metric:** When defining the metric, ensure that the `authType` field is set to `APIKey` and the appropriate `secret` is referenced. In the `headerTemplates` field, include `X-Query-Key` as the name of a header field (as [required by New Relic](https://docs.newrelic.com/docs/insights/event-data-sources/insights-api/query-insights-event-data-api/#create-request)). 
The value for this header field is a templated string. Iter8 will substitute the placeholder `${mykey}` at query time, by looking up the referenced `secret` named `nrcredentials` in the `myns` namespace. + + ```yaml linenums="1" + apiVersion: iter8.tools/v2alpha2 + kind: Metric + metadata: + name: name-count + spec: + description: A New Relic example + provider: newrelic + params: + - name: nrql + value: >- + SELECT count(appName) FROM PageView WHERE revisionName='${revision}' SINCE ${elapsedTime} seconds ago + type: Counter + authType: APIKey + secret: myns/nrcredentials + headerTemplates: + - name: X-Query-Key + value: ${mykey} + jqExpression: ".results[0].count | tonumber" + urlTemplate: https://insights-api.newrelic.com/v1/accounts/my_account_id + ``` + + ???+ hint "Brief explanation of the `name-count` metric" + 1. New Relic enables metric queries using either HTTP GET or POST requests. `GET` is the default value for the `method` field of an Iter8 metric. This field is optional; it is omitted in the definition of `name-count`, and defaults to `GET`. + 2. Iter8 will query New Relic during each iteration of the experiment. In each iteration, Iter8 will use `n` HTTP queries, one per version, to fetch metric values, where `n` is the number of versions in the experiment[^2]. + 3. The HTTP query used by Iter8 contains a single query parameter named `nrql` as [required by New Relic](https://docs.newrelic.com/docs/insights/event-data-sources/insights-api/query-insights-event-data-api/). The value of this parameter is derived by [substituting the placeholders](#placeholder-substitution) in its value string. + 4. The `jqExpression` enables Iter8 to extract the metric value from the JSON response returned by New Relic. + 5. The `urlTemplate` field provides the URL of the New Relic service. + +=== "Sysdig" + The Sysdig data API accepts HTTP POST requests and uses a bearer token for authentication as documented [here](https://docs.sysdig.com/en/sysdig-rest-api-conventions.html). 
The bearer token may be directly embedded within the Iter8 metric, or supplied as part of a Kubernetes secret. + + === "Bearer token embedded in metric" + The following is an example of an Iter8 metric with Sysdig as the provider. In this example, `87654321-1234-1234-1234-123456789012` is the Sysdig bearer token (also referred to as access key by Sysdig). + + ```yaml linenums="1" + apiVersion: iter8.tools/v2alpha2 + kind: Metric + metadata: + name: cpu-utilization + spec: + description: A Sysdig example + provider: sysdig + body: >- + { + "last": ${elapsedTime}, + "sampling": 600, + "filter": "kubernetes.app.revision.name = '${revision}'", + "metrics": [ + { + "id": "cpu.cores.used", + "aggregations": { "time": "avg", "group": "sum" } + } + ], + "dataSourceType": "container", + "paging": { + "from": 0, + "to": 99 + } + } + method: POST + type: Gauge + headerTemplates: + - name: Accept + value: application/json + - name: Authorization + value: Bearer 87654321-1234-1234-1234-123456789012 + jqExpression: ".data[0].d[0] | tonumber" + urlTemplate: https://secure.sysdig.com/api/data + ``` + + === "Bearer token embedded in secret" + Suppose your Sysdig token is `87654321-1234-1234-1234-123456789012`; you wish to store this token in a Kubernetes secret, and reference this secret in an Iter8 metric. You can do so as follows. + + 1. **Create secret:** Create a Kubernetes secret containing the token. + ```shell + kubectl create secret generic sdcredentials -n myns --from-literal=token=87654321-1234-1234-1234-123456789012 + ``` + The above secret contains a data field named `token` whose value is the Sysdig token. The data field name (which can be any string of your choice) will be used in Step 3 below as a placeholder. + + 2. **Create RBAC rule:** Provide the required permissions for Iter8 to read this secret. The service account `iter8-analytics` in the `iter8-system` namespace will have permissions to read secrets in the `myns` namespace. 
+ ```shell + kubectl create rolebinding iter8-cred --clusterrole=iter8-secret-reader-analytics --serviceaccount=iter8-system:iter8-analytics --namespace=myns + ``` + + 3. **Define metric:** When defining the metric, ensure that the `authType` field is set to `Bearer` and the appropriate `secret` is referenced. In the `headerTemplates` field, include the `Authorization` header field (as [required by Sysdig](https://docs.sysdig.com/en/sysdig-rest-api-conventions.html)). The value for this header field is a templated string. Iter8 will substitute the placeholder `${token}` at query time, by looking up the referenced `secret` named `sdcredentials` in the `myns` namespace. + + ```yaml linenums="1" + apiVersion: iter8.tools/v2alpha2 + kind: Metric + metadata: + name: cpu-utilization + spec: + description: A Sysdig example + provider: sysdig + body: >- + { + "last": ${elapsedTime}, + "sampling": 600, + "filter": "kubernetes.app.revision.name = '${revision}'", + "metrics": [ + { + "id": "cpu.cores.used", + "aggregations": { "time": "avg", "group": "sum" } + } + ], + "dataSourceType": "container", + "paging": { + "from": 0, + "to": 99 + } + } + method: POST + authType: Bearer + secret: myns/sdcredentials + type: Gauge + headerTemplates: + - name: Accept + value: application/json + - name: Authorization + value: Bearer ${token} + jqExpression: ".data[0].d[0] | tonumber" + urlTemplate: https://secure.sysdig.com/api/data + ``` + + ???+ hint "Brief explanation of the `cpu-utilization` metric" + 1. Sysdig enables metric queries using POST requests; hence, the `method` field of the Iter8 metric is set to `POST`. + 2. Iter8 will query Sysdig during each iteration of the experiment. In each iteration, Iter8 will use `n` HTTP queries, one per version, to fetch metric values, where `n` is the number of versions in the experiment[^2]. + 3. The HTTP query used by Iter8 contains a JSON body as [required by Sysdig](https://docs.sysdig.com/en/working-with-the-data-api.html). 
This JSON body is derived by [substituting the placeholders](#placeholder-substitution) in the body template. + 4. The `jqExpression` enables Iter8 to extract the metric value from the JSON response returned by Sysdig. + 5. The `urlTemplate` field provides the URL of the Sysdig service. + +=== "Elastic" + + The Elasticsearch REST API accepts HTTP GET or POST requests and uses basic authentication as documented [here](https://www.elastic.co/guide/en/elasticsearch/reference/current/http-clients.html#http-clients). Suppose Elasticsearch is set up to enforce basic auth with the following credentials: + + ```yaml + username: produser + password: t0p-secret + ``` + + You can then enable Iter8 to query the Elasticsearch service as follows. + + 1. **Create secret:** Create a Kubernetes secret that contains the authentication information. In particular, this secret needs to have the `username` and `password` fields in the `data` section with correct values. + ```shell + kubectl create secret generic elasticcredentials -n myns --from-literal=username=produser --from-literal=password=t0p-secret + ``` + + 2. **Create RBAC rule:** Provide the required permissions for Iter8 to read this secret. The service account `iter8-analytics` in the `iter8-system` namespace will have permissions to read secrets in the `myns` namespace. + ```shell + kubectl create rolebinding iter8-cred --clusterrole=iter8-secret-reader-analytics --serviceaccount=iter8-system:iter8-analytics --namespace=myns + ``` + + 3. **Define metric:** When defining the metric, ensure that the `authType` field is set to `Basic` and the appropriate `secret` is referenced. 
+ + ```yaml linenums="1" + apiVersion: iter8.tools/v2alpha2 + kind: Metric + metadata: + name: average-sales + spec: + description: An elastic example + provider: elastic + body: >- + { + "aggs": { + "range": { + "date_range": { + "field": "date", + "ranges": [ + { "from": "now-${elapsedTime}s/s" } + ] + } + }, + "items_to_sell": { + "filter": { "term": { "version": "${revision}" } }, + "aggs": { + "avg_sales": { "avg": { "field": "sale_price" } } + } + } + } + } + method: POST + authType: Basic + secret: myns/elasticcredentials + type: Gauge + headerTemplates: + - name: Content-Type + value: application/json + jqExpression: ".aggregations.items_to_sell.avg_sales.value | tonumber" + urlTemplate: https://secure.elastic.com/my/sales + ``` + + ???+ hint "Brief explanation of the `average-sales` metric" + 1. Elastic enables metric queries using GET or POST requests. In this example, the `method` field of the Iter8 metric is set to `POST`. + 2. Iter8 will query Elastic during each iteration of the experiment. In each iteration, Iter8 will use `n` HTTP queries, one per version, to fetch metric values, where `n` is the number of versions in the experiment[^2]. + 3. The HTTP query used by Iter8 contains a JSON body as [required by Elastic](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-filter-aggregation.html). This JSON body is derived by [substituting the placeholders](#placeholder-substitution) in the body template. + 4. The `jqExpression` enables Iter8 to extract the metric value from the JSON response returned by Elastic. + 5. The `urlTemplate` field provides the URL of the Elastic service. + +## Placeholder substitution + +> **Note:** This step is automated by **Iter8**. + +Iter8 will substitute placeholders in the metric query based on the time elapsed since the start of the experiment, and information associated with each version in the experiment. 
+ +Suppose the [metrics defined above](#defining-metrics) are referenced within an experiment as follows. Further, suppose this experiment has started, Iter8 is about to do an iteration of this experiment, and the time elapsed since the start of the experiment is 600 seconds. + +??? abstract "Look inside sample experiment" + ```yaml linenums="1" + apiVersion: iter8.tools/v2alpha2 + kind: Experiment + metadata: + name: sample-exp + spec: + target: default/sample-app + strategy: + testingPattern: Canary + criteria: + # This experiment assumes that metrics have been created in the `myns` namespace + requestCount: myns/request-count + objectives: + - metric: myns/name-count + lowerLimit: 50 + - metric: myns/cpu-utilization + upperLimit: 90 + - metric: myns/average-sales + lowerLimit: "250.0" + duration: + intervalSeconds: 10 + iterationsPerLoop: 10 + versionInfo: + baseline: + name: current + variables: + - name: revision + value: sample-app-v1 + - name: userfilter + value: 'usergroup!~"wakanda"' + candidates: + - name: candidate + variables: + - name: revision + value: sample-app-v2 + - name: userfilter + value: 'usergroup=~"wakanda"' + ``` + +For the sample experiment above, Iter8 will use two HTTP(S) queries to fetch metric values, one for the baseline version, and another for the candidate version. + +=== "Prometheus" + + Consider the baseline version. Iter8 will send an HTTP(S) request with a single parameter named `query` whose value equals: + ``` + sum(increase(revision_app_request_latencies_count{service_name='current',usergroup!~"wakanda"}[600s])) or on() vector(0) + ``` + +=== "New Relic" + Consider the baseline version. Iter8 will send an HTTP(S) request with a single parameter named `nrql` whose value equals: + ``` + SELECT count(appName) FROM PageView WHERE revisionName='sample-app-v1' SINCE 600 seconds ago + ``` + +=== "Sysdig" + Consider the baseline version. 
Iter8 will send an HTTP(S) request with the following JSON body: + ```json linenums="1" + { + "last": 600, + "sampling": 600, + "filter": "kubernetes.app.revision.name = 'sample-app-v1'", + "metrics": [ + { + "id": "cpu.cores.used", + "aggregations": { "time": "avg", "group": "sum" } + } + ], + "dataSourceType": "container", + "paging": { + "from": 0, + "to": 99 + } + } + ``` + +=== "Elastic" + Consider the baseline version. Iter8 will send an HTTP(S) request with the following JSON body: + ```json linenums="1" + { + "aggs": { + "range": { + "date_range": { + "field": "date", + "ranges": [ + { "from": "now-600s/s" } + ] + } + }, + "items_to_sell": { + "filter": { "term": { "version": "sample-app-v1" } }, + "aggs": { + "avg_sales": { "avg": { "field": "sale_price" } } + } + } + } + } + ``` + +The placeholder `${elapsedTime}` has been substituted with 600, which is the time elapsed in seconds since the start of the experiment. The other placeholders have been substituted based on the *versionInfo* field of the baseline version in the experiment. Iter8 builds and sends an HTTP request in a similar manner for the candidate version as well. + +## JSON response + +> **Note:** This step is handled by the **metrics provider**. + +The metrics provider is expected to respond to Iter8's HTTP request with a JSON object. The format of this JSON object is defined by the provider. + +=== "Prometheus" + The format of the Prometheus JSON response is [defined here](https://prometheus.io/docs/prometheus/latest/querying/api/#format-overview). A sample Prometheus response is as follows. + ```json linenums="1" + { + "status": "success", + "data": { + "resultType": "vector", + "result": [ + { + "value": [1556823494.744, "21.7639"] + } + ] + } + } + ``` + +=== "New Relic" + The format of the New Relic JSON response is [discussed here](https://docs.newrelic.com/docs/insights/event-data-sources/insights-api/query-insights-event-data-api/#example). A sample New Relic response is as follows. 
+ ```json linenums="1" + { + "results": [ + { + "count": 80275388 + } + ], + "metadata": { + "eventTypes": [ + "PageView" + ], + "eventType": "PageView", + "openEnded": true, + "beginTime": "2014-08-03T19:00:00Z", + "endTime": "2017-01-18T23:18:41Z", + "beginTimeMillis=": 1407092400000, + "endTimeMillis": 1484781521198, + "rawSince": "'2014-08-04 00:00:00+0500'", + "rawUntil": "`now`", + "rawCompareWith": "", + "clippedTimeWindows": { + "Browser": { + "beginTimeMillis": 1483571921198, + "endTimeMillis": 1484781521198, + "retentionMillis": 1209600000 + } + }, + "messages": [], + "contents": [ + { + "function": "count", + "attribute": "appName", + "simple": true + } + ] + } + } + ``` + +=== "Sysdig" + The format of the Sysdig JSON response is [discussed here](https://docs.sysdig.com/en/working-with-the-data-api.html). A sample Sysdig response is as follows. + ```json linenums="1" + { + "data": [ + { + "t": 1582756200, + "d": [ + 6.481 + ] + } + ], + "start": 1582755600, + "end": 1582756200 + } + ``` + +=== "Elastic" + The format of the Elastic JSON response is [discussed here](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-filter-aggregation.html). A sample Elastic response is as follows. + ```json linenums="1" + { + "aggregations": { + "items_to_sell": { + "doc_count": 3, + "avg_sales": { "value": 128.33333333333334 } + } + } + } + ``` + +## Processing the JSON response + +> **Note:** This step is automated by **Iter8**. + +Iter8 uses [jq](https://stedolan.github.io/jq/) to extract the metric value from the JSON response of the provider. The `jqExpression` used by Iter8 is supplied as part of the metric definition. When the `jqExpression` is applied to the JSON response, it is expected to yield a number. + +=== "Prometheus" + Consider the `jqExpression` defined in the [sample Prometheus metric](#defining-metrics). Let us apply it to the [sample JSON response from Prometheus](#json-response). 
+ ```shell + echo '{ + "status": "success", + "data": { + "resultType": "vector", + "result": [ + { + "value": [1556823494.744, "21.7639"] + } + ] + } + }' | jq ".data.result[0].value[1] | tonumber" + ``` + Executing the above command yields `21.7639`, a number, as required by Iter8. + +=== "New Relic" + Consider the `jqExpression` defined in the [sample New Relic metric](#defining-metrics). Let us apply it to the [sample JSON response from New Relic](#json-response). + ```shell + echo '{ + "results": [ + { + "count": 80275388 + } + ], + "metadata": { + "eventTypes": [ + "PageView" + ], + "eventType": "PageView", + "openEnded": true, + "beginTime": "2014-08-03T19:00:00Z", + "endTime": "2017-01-18T23:18:41Z", + "beginTimeMillis=": 1407092400000, + "endTimeMillis": 1484781521198, + "rawSince": "'2014-08-04 00:00:00+0500'", + "rawUntil": "`now`", + "rawCompareWith": "", + "clippedTimeWindows": { + "Browser": { + "beginTimeMillis": 1483571921198, + "endTimeMillis": 1484781521198, + "retentionMillis": 1209600000 + } + }, + "messages": [], + "contents": [ + { + "function": "count", + "attribute": "appName", + "simple": true + } + ] + } + }' | jq ".results[0].count | tonumber" + ``` + Executing the above command yields `80275388`, a number, as required by Iter8. + +=== "Sysdig" + Consider the `jqExpression` defined in the [sample Sysdig metric](#defining-metrics). Let us apply it to the [sample JSON response from Sysdig](#json-response). + ```shell + echo '{ + "data": [ + { + "t": 1582756200, + "d": [ + 6.481 + ] + } + ], + "start": 1582755600, + "end": 1582756200 + }' | jq ".data[0].d[0] | tonumber" + ``` + Executing the above command yields `6.481`, a number, as required by Iter8. + +=== "Elastic" + Consider the `jqExpression` defined in the [sample Elastic metric](#defining-metrics). Let us apply it to the [sample JSON response from Elastic](#json-response). 
+ ```shell + echo '{ + "aggregations": { + "items_to_sell": { + "doc_count": 3, + "avg_sales": { "value": 128.33333333333334 } + } + } + }' | jq ".aggregations.items_to_sell.avg_sales.value | tonumber" + ``` + Executing the above command yields `128.33333333333334`, a number, as required by Iter8. + +> **Note:** The shell command above is for illustration only. Iter8 uses Python bindings for `jq` to evaluate the `jqExpression`. + +## Error handling + +> **Note:** This step is automated by **Iter8**. + +Errors may occur during Iter8's metric queries for a number of reasons (for example, an invalid `jqExpression` supplied within the metric). If Iter8 encounters errors during its attempt to retrieve metric values, Iter8 will mark the respective metric as unavailable. + +[^1]: Iter8 can be used with any provider that can receive an HTTP request and respond with a JSON object containing the metrics information. Documentation requests and contributions (PRs) are welcome for providers not listed here. +[^2]: In a conformance experiment, `n = 1`. In canary and A/B experiments, `n = 2`. In A/B/n experiments, `n > 2`. \ No newline at end of file diff --git a/mkdocs/docs/metrics/using-metrics.md b/mkdocs/docs/metrics/using-metrics.md index 1faade249..fd84e07d1 100644 --- a/mkdocs/docs/metrics/using-metrics.md +++ b/mkdocs/docs/metrics/using-metrics.md @@ -5,7 +5,7 @@ template: main.html # Using Metrics in Experiments !!! tip "Iter8 metrics API" - Iter8 defines a new Kubernetes resource called Metric that makes it easy to use metrics in experiments from RESTful metric backends like Prometheus, New Relic, Sysdig and Elastic. + Iter8 defines a new Kubernetes resource called Metric that makes it easy to use metrics in experiments from RESTful metric providers like Prometheus, New Relic, Sysdig and Elastic. List metrics available in your cluster using the `kubectl get metrics.iter8.tools` command. 
Use metrics in experiments by referencing them in experiment criteria. @@ -18,11 +18,11 @@ kubectl get metrics.iter8.tools --all-namespaces ```shell NAMESPACE NAME TYPE DESCRIPTION -iter8-knative 95th-percentile-tail-latency gauge 95th percentile tail latency -iter8-knative error-count counter Number of error responses -iter8-knative error-rate gauge Fraction of requests with error responses -iter8-knative mean-latency gauge Mean latency -iter8-knative request-count counter Number of requests +iter8-knative 95th-percentile-tail-latency Gauge 95th percentile tail latency +iter8-knative error-count Counter Number of error responses +iter8-knative error-rate Gauge Fraction of requests with error responses +iter8-knative mean-latency Gauge Mean latency +iter8-knative request-count Counter Number of requests ``` ## Referencing metrics diff --git a/mkdocs/docs/reference/apispec.md b/mkdocs/docs/reference/apispec.md index 092d30a1f..11f6d2149 100644 --- a/mkdocs/docs/reference/apispec.md +++ b/mkdocs/docs/reference/apispec.md @@ -8,7 +8,7 @@ template: main.html The Iter8 API provides two [Kubernetes custom resources](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/) to automate metrics and AI-driven experiments, progressive delivery, and rollout of Kubernetes and OpenShift apps. 1. The **Experiment** resource provides expressive controls required by application developers and service operators who wish to automate new releases of their apps in a robust, principled and metrics-driven manner. These controls encompass [testing, deployment, traffic shaping, and version promotion functions](../../../concepts/buildingblocks/) and can be flexibly composed to automate [diverse use-cases](../../../tutorials/knative/canary-progressive/). - 2. The **Metric** resource encapsulates the REST query that is used by Iter8 for retrieving a metric value from the metrics backend. Metrics are referenced in experiments. + 2. 
The **Metric** resource encapsulates the REST query that is used by Iter8 for retrieving a metric value from the metrics provider. Metrics are referenced in experiments. !!! note "API Version" @@ -132,16 +132,19 @@ Standard Kubernetes [meta.v1/ObjectMeta](https://kubernetes.io/docs/reference/ge #### Spec | Field name | Field type | Description | Required | | ----- | ------------ | ----------- | -------- | -| params | [][NamedValue](#namedvalue) | List of name/value pairs corresponding to the name and value of the HTTP query parameters used by Iter8 when querying the metrics backend. Each name represents a parameter name; the corresponding value is a string template with placeholders, which will be interpolated by Iter8 at query time. | No | -| description | string | Human readable description. | No | -| units | string | Units of measurement. Units are used only for display purposes. | No | -| type | string | Metric type. Valid values are `counter` and `gauge`. Default value = `gauge`. | No | -| sampleSize | string | Reference to a metric that represents the number of data points over which the metric value is computed. This field applies only to `gauge` metrics. References can be expressed in the form 'name' or 'namespace/name'. If just `name` is used, the implied namespace is the namespace of the referring metric. | No | -| provider | string | Type of the metrics database. Provider is used only for display purposes. | No | -| jqExpression | string | The [jq](https://stedolan.github.io/jq/) expression used by Iter8 to extract the metric value from the JSON response of the metrics backend to a metrics query. | Yes | -| secret | string | Reference to a secret that contains information used for authenticating with the metrics database. In particular, Iter8 uses data in this secret to interpolate the HTTP headers and URL while querying the database. References can be expressed in the form 'name' or 'namespace/name'. 
If just `name` is used, the implied namespace is the namespace where Iter8 is installed (which is `iter8-system` by default). | No | -| headerTemplates | [][NamedValue](#namedvalue) | List of templates for headers that should be added to metrics queries. Variable portions of the headers, expressed in the form `{.name}` will be replaced at runtime with the value of the `name` entry defined in the secret. If no value can be found in the secret, no replacement will be done. | No | -| urlTemplate | string | Template for URL of metrics server. Variable portions of the URL, expressed in the form `{.name}` will be replaced at runtimme with the value of the `name` entry defined in the secret. If no value can be found in the secret, no replacement will be done. | Yes | +| description | string | Human readable description. This field is meant for informational purposes. | No | +| units | string | Units of measurement. This field is meant for informational purposes. | No | +| provider | string | Type of the metrics provider. This field is meant for informational purposes. | No | +| params | [][NamedValue](#namedvalue) | List of name/value pairs corresponding to the name and value of the HTTP query parameters used by Iter8 when querying the metrics provider. Each name represents a parameter name; the corresponding value is a string template with placeholders; the placeholders will be dynamically substituted by Iter8 with values at query time. | No | +| body | string | String used to construct the JSON body of the HTTP request. Body may be templated, in which case Iter8 will attempt to substitute placeholders in the template at query time using version information. | No | +| type | string | Metric type. Valid values are `Counter` and `Gauge`. Default value = `Gauge`. A `Counter` metric is one whose value never decreases over time. A `Gauge` metric is one whose value may increase or decrease over time. | No | +| method | string | HTTP method (verb) used in the HTTP request. 
Valid values are `GET` and `POST`. Default value = `GET`. | No | +| authType | string | Identifies the type of authentication used in the HTTP request. Valid values are `Basic`, `Bearer` and `APIKey` which correspond to HTTP authentication with these respective methods. | No | +| sampleSize | string | Reference to a metric that represents the number of data points over which the value of this metric is computed. This field applies only to `Gauge` metrics. References can be expressed in the form 'name' or 'namespace/name'. If just `name` is used, the implied namespace is the namespace of the referring metric. | No | +| secret | string | Reference to a secret that contains information used for authenticating with the metrics provider. In particular, Iter8 uses data in this secret to substitute placeholders in the HTTP headers and URL while querying the provider. References can be expressed in the form 'name' or 'namespace/name'. If just `name` is used, the implied namespace is the namespace where Iter8 is installed (which is `iter8-system` by default). | No | +| headerTemplates | [][NamedValue](#namedvalue) | List of name/value pairs corresponding to the name and value of the HTTP request headers used by Iter8 when querying the metrics provider. Each name represents a header field name; the corresponding value is a string template with placeholders; the placeholders will be dynamically substituted by Iter8 with values at query time. Placeholder substitution is attempted only if `authType` and `secret` fields are present. | No | +| jqExpression | string | The [jq](https://stedolan.github.io/jq/) expression used by Iter8 to extract the metric value from the JSON response returned by the provider. | Yes | +| urlTemplate | string | Template for the metric provider's URL. Typically, urlTemplate is expected to be the actual URL without any placeholders. 
However, urlTemplate may be templated, in which case, Iter8 will attempt to substitute placeholders in the urlTemplate at query time using the `secret` referenced in the metric. Placeholder substitution will not be attempted if `secret` is not specified. | Yes | ## Experiment field types @@ -214,7 +217,7 @@ Standard Kubernetes [meta.v1/ObjectMeta](https://kubernetes.io/docs/reference/ge | Field name | Field type | Description | Required | | ----- | ---- | ----------- | -------- | | name | string | Name of the version. | Yes | -| variables | [][NamedValue](#namedvalue) | Variables are name-value pairs associated with a version. Metrics and tasks within experiment specs can contain strings with placeholders. Iter8 uses variables to interpolate these strings. | No | +| variables | [][NamedValue](#namedvalue) | Variables are name-value pairs associated with a version. Metrics and tasks within experiment specs can contain strings with placeholders. Iter8 uses variables to substitute placeholders in these strings. | No | | weightObjRef | [corev1.ObjectReference](https://pkg.go.dev/k8s.io/api@v0.20.0/core/v1#ObjectReference) | Reference to a Kubernetes resource and a field-path within the resource. Iter8 uses `weightObjRef` to get or set weight (traffic percentage) for the version. | No | @@ -534,7 +537,7 @@ The `common` task library provides the `exec` task. Use this task to execute she - "sample-app" # release name - "--namespace=iter8-system" # release namespace - "sample-app" # chart name - - "--values=https://raw.githubusercontent.com/iter8-tools/iter8/master/samples/knative/canaryprogressive/{{ .promote }}-values.yaml" # values URL dynamically interpolated + - "--values=https://raw.githubusercontent.com/iter8-tools/iter8/master/samples/knative/canaryprogressive/{{ .promote }}-values.yaml" # placeholder is substituted dynamically ``` === "Kustomize" @@ -562,7 +565,7 @@ The `common` task library provides the `exec` task. 
Use this task to execute she kustomize build github.com/iter8-tools/iter8/samples/knative/canaryfixedsplit/{{ .name }}?ref=master | kubectl apply -f - ``` -### Interpolation of task inputs +### Placeholder substitution in task inputs Inputs to tasks can contain placeholders, or template variables, which will be dynamically substituted when the task is executed by Iter8. For example, in the sample experiment above, one input is: @@ -570,13 +573,13 @@ Inputs to tasks can contain placeholders, or template variables, which will be d "https://raw.githubusercontent.com/iter8-tools/iter8/master/samples/knative/quickstart/{{ .promote }}.yaml" ``` -In this case, the placeholder is `{{ .promote }}`. Variable interpolation works as follows. +In this case, the placeholder is `{{ .promote }}`. Placeholder substitution in task inputs works as follows. 1. Iter8 will find the version recommended for promotion. This information is stored in the `status.versionRecommendedForPromotion` field of the experiment. The version recommended for promotion is the `winner`, if a `winner` has been found in the experiment. Otherwise, it is the baseline version supplied in the `spec.versionInfo` field of the experiment. 2. If the placeholder is `{{ .name }}`, Iter8 will substitute it with the name of the version recommended for promotion. Else, if it is any other variable, Iter8 will substitute it with the value of the corresponding variable for the version recommended for promotion. Variable values are specified in the `variables` field of the version detail. Note that variable values could have been supplied by the creator of the experiment, or by other tasks such as `init-experiment` that may already have been executed by Iter8 as part of the experiment. -??? example "Interpolation Example 1" +??? example "Placeholder substitution Example 1" Consider the sample experiment above. Suppose the `winner` of this experiment was `candidate`. 
Then: @@ -585,7 +588,7 @@ In this case, the placeholder is `{{ .promote }}`. Variable interpolation works 3. The value of the placeholder for the version recommended for promotion is `candid`. 4. The command executed by the `exec` task is then `kubectl apply -f https://raw.githubusercontent.com/iter8-tools/iter8/master/samples/knative/quickstart/candid.yaml`. -??? example "Interpolation Example 2" +??? example "Placeholder substitution Example 2" Consider the sample experiment above. Suppose the `winner` of this experiment was `current`. Then: @@ -594,7 +597,7 @@ In this case, the placeholder is `{{ .promote }}`. Variable interpolation works 3. The value of the placeholder for the version recommended for promotion is `base`. 4. The command executed by the `exec` task is then `kubectl apply -f https://raw.githubusercontent.com/iter8-tools/iter8/master/samples/knative/quickstart/base.yaml`. -??? example "Interpolation Example 3" +??? example "Placeholder substitution Example 3" Consider the sample experiment above. Suppose the experiment did not yield a `winner`. Then: @@ -604,7 +607,7 @@ In this case, the placeholder is `{{ .promote }}`. Variable interpolation works 4. The command executed by the `exec` task is then `kubectl apply -f https://raw.githubusercontent.com/iter8-tools/iter8/master/samples/knative/quickstart/base.yaml`. ### Disable Interpolation (always do this in a `start` action) -By default, the `common/exec` task will attempt to find the version recommended for promotion, and use its values to interpolate the inputs to the task. However, this behavior will lead to task failure since version recommended for promotion will be generally undefined at this stage of the experiment. To use the `common/exec` task as part of an experiment `start` action, set `disableInterpolation` to `true` as illustrated in the `kubectl/Helm/Kustomize` samples above. 
+By default, the `common/exec` task will attempt to find the version recommended for promotion, and use its values to substitute placeholders in the inputs to the task. However, in a `start` action, this behavior leads to task failure, because the version recommended for promotion is generally undefined at this stage of the experiment. To use the `common/exec` task as part of an experiment `start` action, set `disableInterpolation` to `true` as illustrated in the `kubectl/Helm/Kustomize` samples above. ### Error handling in tasks When a task exits with an error, it will result in the failure of the experiment to which it belongs. diff --git a/mkdocs/docs/roadmap.md b/mkdocs/docs/roadmap.md index 46d8a3567..587d4390b 100644 --- a/mkdocs/docs/roadmap.md +++ b/mkdocs/docs/roadmap.md @@ -12,19 +12,20 @@ hide: * Blue/green deployment pattern * Experiments with `support` and `confidence` 2. **Metrics** - * Support for NewRelic, DataDog, Elastic, and other RESTful metric databases + * Support for more metric providers like MySQL, PostgreSQL, CouchDB, MongoDB, Google Analytics and Fortio. 3. **Enhanced MLOps experiments** * Customized experiments/metrics for serving frameworks like TorchServe and TFServing 4. **GitOps** * Integration with ArgoCD, Flux and other GitOps operators -5. **Enhancing Kubernetes and OpenShift integration** +5. **Notifications** + * Integration with Slack, GitHub, and other RESTful services +6. **Enhancing Kubernetes and OpenShift integration** * Improved support for KFServing * Enhanced support for Istio using new Iter8 Experiment API * Support for OpenShift Serverless + * Enhanced Knative metrics in tutorials using OpenTelemetry collector * Support for Ambassador and Kong networking layers in KNative * Support for experimenting with configuration and routes in Knative -6. **Notifications** - * Integration with Slack, GitHub, and other RESTful services 7. **Git triggered workflows and CI/CD** * Integration with GitHub Actions and other pipeline providers 8. 
**Helm tests** diff --git a/mkdocs/docs/tutorials/knative/canary-progressive.md b/mkdocs/docs/tutorials/knative/canary-progressive.md index 4fc8a63dc..27e1700dd 100644 --- a/mkdocs/docs/tutorials/knative/canary-progressive.md +++ b/mkdocs/docs/tutorials/knative/canary-progressive.md @@ -144,7 +144,7 @@ kubectl apply -f $ITER8/samples/knative/canaryprogressive/experiment.yaml - "sample-app" # release name - "--namespace=default" # release namespace - "sample-app" # chart name - - "--values=https://raw.githubusercontent.com/iter8-tools/iter8/master/samples/knative/canaryprogressive/{{ .promote }}-values.yaml" # values URL dynamically interpolated + - "--values=https://raw.githubusercontent.com/iter8-tools/iter8/master/samples/knative/canaryprogressive/{{ .promote }}-values.yaml" # placeholder 'promote' is dynamically substituted criteria: # mean latency of version should be under 50 milliseconds # 95th percentile latency should be under 100 milliseconds diff --git a/mkdocs/mkdocs.yml b/mkdocs/mkdocs.yml index 10abd076c..6cd19f0e8 100644 --- a/mkdocs/mkdocs.yml +++ b/mkdocs/mkdocs.yml @@ -2,7 +2,7 @@ site_name: Iter8 site_url: https://iter8.tools/latest site_author: Srinivasan Parthasarathy site_description: >- - Iter8 makes it easy to maximize business value and guarantee SLOs during releases of your Kubernetes apps/ML models. Automate metrics-driven experiments, progressive delivery, validation, and promotion/rollback. Maximize release velocity while protecting end-user experience. Quick start in 5 mins. + Iter8 makes it easy to maximize business value and guarantee SLOs during releases of your Kubernetes apps/ML models. Automate metrics-driven experiments and progressive delivery. Maximize release velocity while protecting end-user experience. Quick start in 5 mins. 
# Repository repo_name: iter8-tools/iter8 @@ -132,6 +132,7 @@ nav: - Generating requests externally: tutorials/traffic.md - Metrics: - Using metrics in experiments: metrics/using-metrics.md + - Defining Iter8 metrics: metrics/defining-iter8-metrics.md - Reference: - Iter8 API specification: reference/apispec.md - Contributing: contributing.md diff --git a/mkdocs/overrides/main.html b/mkdocs/overrides/main.html index 516e183f7..6ed9c24b8 100644 --- a/mkdocs/overrides/main.html +++ b/mkdocs/overrides/main.html @@ -31,7 +31,7 @@ - + @@ -42,11 +42,13 @@ - + - + + +
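Reviewer note on the "Placeholder substitution in task inputs" hunk in `apispec.md`: the substitution described there (Example 1, where the `promote` variable of the version recommended for promotion is `candid`) can be sketched in shell as below. This is an illustration only, not Iter8's implementation; the variable names `template`, `promote` and `url` are hypothetical, and `sed` merely stands in for whatever templating Iter8 performs internally.

```shell
# Illustration only: Iter8 performs placeholder substitution itself.
# `template`, `promote` and `url` are hypothetical names for this sketch.

# Task input containing the placeholder, as quoted in the apispec.md hunk.
template='https://raw.githubusercontent.com/iter8-tools/iter8/master/samples/knative/quickstart/{{ .promote }}.yaml'

# Value of the `promote` variable for the version recommended for promotion
# (Example 1: the winner is `candidate`, whose `promote` variable is `candid`).
promote='candid'

# Replace the literal text "{{ .promote }}" with the variable's value.
url=$(printf '%s\n' "$template" | sed "s/{{ \.promote }}/$promote/")

# Print the command the `exec` task would then run.
echo "kubectl apply -f $url"
```

With `promote='base'` instead, the same sketch produces the URL from Examples 2 and 3.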