
Introduce versionInfo in metrics CRD #933

Closed
sriumcp opened this issue Aug 4, 2021 · 3 comments
Labels
area/analytics Metrics, statistical estimation, and bandit algorithms for traffic shifting kind/enhancement New feature or request

Comments

@sriumcp
Member

sriumcp commented Aug 4, 2021

Is your feature request related to a problem? Please describe the problem.
Currently, metric definitions are complicated for two reasons:

  1. The use of interpolation for version information, which makes metric definitions complex to begin with.
  2. The placeholders for interpolation live inside the metrics resource, while the variables used to substitute them live inside the experiment resource, which makes interpolation not just complex but genuinely complicated.

Describe the feature/solution you'd like
Alter the definition of custom metrics as follows. There will now be a single placeholder called ${elapsedTime}, which has nothing to do with a version. Conceptually, this is a far easier definition to follow than having to link two different resources and define complex interpolation logic.

```yaml
apiVersion: iter8.tools/v2alpha2
kind: Metric
metadata:
  name: request-count
spec:
  description: A Prometheus example
  provider: prometheus
  versionInfo: # the number of versions here needs to match the number of versions in the experiment object
  - params:
    - name: query
      value: >-
        sum(increase(revision_app_request_latencies_count{service_name='hello'}[${elapsedTime}s])) or on() vector(0)
    # alternatively, for POST queries, `params` above can be replaced with `body`
  - params:
    - name: query
      value: >-
        sum(increase(revision_app_request_latencies_count{service_name='hello-candidate'}[${elapsedTime}s])) or on() vector(0)
  type: Counter
  jqExpression: ".data.result[0].value[1] | tonumber"
  urlTemplate: http://myprometheusservice.com/api/v1
```
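To make the single-placeholder design concrete, here is a minimal Python sketch (illustrative only, not iter8 code) of the two steps a metrics backend would perform: substituting ${elapsedTime} into a per-version query, and applying jqExpression-style extraction to the provider's JSON response. The response payload shown is a hypothetical Prometheus reply.

```python
import json
from string import Template

# Step 1: substitute the single ${elapsedTime} placeholder (the only
# variable in the proposed design) into a per-version query string.
query_template = Template(
    "sum(increase(revision_app_request_latencies_count"
    "{service_name='hello'}[${elapsedTime}s])) or on() vector(0)"
)

def render_query(elapsed_seconds: int) -> str:
    return query_template.substitute(elapsedTime=elapsed_seconds)

# Step 2: extract the metric value from a (hypothetical) Prometheus
# response, mirroring the jqExpression ".data.result[0].value[1] | tonumber".
def extract_value(response_body: str) -> float:
    response = json.loads(response_body)
    return float(response["data"]["result"][0]["value"][1])

print(render_query(600))
print(extract_value('{"data": {"result": [{"value": [1628100000, "42"]}]}}'))
```

Because ${elapsedTime} is the only supported placeholder, no cross-resource interpolation is needed; the query is fully determined by the metric resource plus the experiment's elapsed time.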

Alter the definition of mock metrics as follows ...

```yaml
apiVersion: iter8.tools/v2alpha2
kind: Metric
metadata:
  name: request-count
spec:
  mock: true # this is now a boolean field
  versionInfo: # the number of versions here needs to match the number of versions in the experiment object
  - value: "20.0"
  - value: "15.0"
```

Note: no changes are needed for builtin metrics due to #912

Note: versionInfo for metrics will be a list of versionDetail objects. This object has three optional fields, namely,
params, body, and value. The first two are for custom metrics, and the last is for mock metrics.

Note on validation: the length of versionInfo must be the same wherever versionInfo lists appear.

Note on validation: if mock == true, every version must have a value.
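The two validation rules above can be sketched as a small check (a hypothetical helper, not actual iter8 controller code; the dict shape mirrors the spec fields shown above):

```python
# Hypothetical validator for the rules stated above:
#   (1) every versionInfo list must match the experiment's version count;
#   (2) if mock == true, every version must carry a value.
def validate_metrics(metric_specs, experiment_versions):
    errors = []
    for spec in metric_specs:
        version_info = spec.get("versionInfo", [])
        if len(version_info) != experiment_versions:
            errors.append(
                f"{spec['name']}: versionInfo has {len(version_info)} entries, "
                f"but the experiment has {experiment_versions} versions"
            )
        if spec.get("mock") and not all("value" in v for v in version_info):
            errors.append(f"{spec['name']}: mock metric lacks a value for some version")
    return errors

# A mock metric with a missing value for its second version:
specs = [{"name": "request-count", "mock": True,
          "versionInfo": [{"value": "20.0"}, {}]}]
print(validate_metrics(specs, 2))
```

Both rules are checkable from the metric and experiment resources alone, which is what makes the proposed decoupling possible.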

Will this feature/solution bring new benefits for Iter8 users? What are they?
The above change effectively decouples metrics and experiment resources, apart from the basic requirement that the number of versions in the metrics object should equal the number of versions in the experiment object.

Making metrics and experiment objects self-contained in the above ways simplifies definition of both.

Does this issue require a design doc/discussion? If there is a link to the design document/discussions, please provide it below.
Yes, please see above.

How will this feature be tested?

  • Metrics tests need to pass (i.e., there will be an equivalent of the tests we have right now for current metrics).

How will this feature be documented?

  • iter8.tools documentation.
@sriumcp sriumcp added kind/enhancement New feature or request area/analytics Metrics, statistical estimation, and bandit algorithms for traffic shifting area/controller labels Aug 4, 2021
@kalantar
Member

kalantar commented Aug 4, 2021

This seems like a good idea. But it means metrics are specific to versions. Where the version name is generated in a CI pipeline, it may change on every use. Won't I have to define new metrics for every experiment?

@sriumcp
Member Author

sriumcp commented Aug 4, 2021

Yes and no.


If metrics are collected under consistent labels (for example, hello and hello-candidate, the names of the baseline and candidate services in the example above), then no changes to the metric resources are needed in any experiment. This is the situation with metrics automatically generated by service meshes like Istio. Of course, it also means you are naming your baseline and candidate deployments/services consistently, not randomly.


Otherwise, yes: create new metrics every time you create a new experiment. Even in this situation, note that in HelmX and SimpleX experiments, metrics can be templated as part of the Helm templates alongside the experiment template, which means that, from the CI/CD perspective, the only change is to the values file.

SimpleX and HelmX, not raw experiments, are really the ways to simplify CI/CD with Iter8.
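To illustrate the templating point (the file names and value keys below are hypothetical, not actual iter8 chart conventions), the service labels used in the metric queries could be parameterized so that CI/CD only ever edits the values file:

```yaml
# values.yaml (hypothetical)
baselineService: hello
candidateService: hello-candidate

# templates/metric.yaml (hypothetical fragment)
#   versionInfo:
#   - params:
#     - name: query
#       value: >-
#         sum(increase(revision_app_request_latencies_count{service_name='{{ .Values.baselineService }}'}[${elapsedTime}s])) or on() vector(0)
```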


So, the additional complexity of creating metrics for each experiment arises only if:

  1. You don't use HelmX,
  2. You don't use SimpleX, and
  3. You change the version name under which metrics are collected for every new version (instead of using more standardized names like baseline and candidate). Using standardized names will clearly be the recommended best practice for metrics collection in the context of Iter8 experiments, since it avoids the need to create metrics alongside experiments.

Without the above change, CI/CD still needs to populate versionInfo variables in the experiment object for each metric...


@sriumcp
Member Author

sriumcp commented Aug 5, 2021

Note: This issue is now subsumed and superseded by #934

@sriumcp sriumcp added v2alpha3 and removed v2beta1 labels Aug 5, 2021
@sriumcp sriumcp closed this as completed Aug 8, 2021