Drop metrics CRD #934

Closed

sriumcp opened this issue Aug 5, 2021 · 32 comments
Labels: area/analytics (Metrics, statistical estimation, and bandit algorithms for traffic shifting), area/explainability (Describing results of an experiment), area/install (Iter8 installation and packaging), kind/enhancement (New feature or request)

Comments


sriumcp commented Aug 5, 2021

Is your feature request related to a problem? Please describe the problem.
Iter8 has two CRDs at the moment, experiment and metric. Here are the disadvantages of two CRDs (see below for their lone advantage).

  1. Two CRDs are inherently "more stuff" to explain and understand than one. It is nice to be able to say that everything in Iter8 revolves around a single new CRD, which also makes it a bit more inviting to try out and understand that one type.
  2. Having to manipulate two types of resources in a CI/CD pipeline introduces additional complexity. In particular, the user needs to think through how to name/structure domain resources in order to leave metrics unaffected. See Introduce versionInfo in metrics CRD #933
  3. A single CRD substantially simplifies simplex experiments. See Enable simplex experiments #948
  4. Problems like race conditions (metrics not yet created when the experiment is created) and invalid metric object references go away, since there are no metric objects anymore.

Describe the feature/solution you'd like
Note: The metric objects below will mirror the structure of the new metrics CRD as described in #933 (i.e., they will include versionInfo). They will not include sampleSize; it can be reintroduced at a later stage (non-breaking CRD changes are always OK).

kind: experiment
spec:
  metrics:
  # names must be unique
  - name: my-metric-1
    description: A Prometheus example
    provider: prometheus
    params:
    - name: query
      value: >-
        sum(increase(revision_app_request_latencies_count{service_name='${serviceName}'}[${elapsedTime}s])) or on() vector(0)
    versionInfo: # the number of versions here needs to match the number of versions in the experiment object
    - serviceName: hello # version info for each version will be a map
    - serviceName: hello-candidate # version info for each version will be a map
    type: Counter
    jqExpression: ".data.result[0].value[1] | tonumber"
    url: http://myprometheusservice.com/api/v1
  - name: my-metric-2
    mock: true
    versionInfo:
    - approximateValue: 20.5
    - approximateValue: 18.0
  
  # criteria are the same as before, except that metric names are now simply names, no longer namespaced names

Will this feature/solution bring new benefits for Iter8 users? What are they?
See the problems with two CRDs described above; a single CRD eliminates them.

Will this feature/solution bring new benefits for Iter8 developers? What are they?
Simplifies code in all Iter8 components. Metrics will no longer be in the status section of the experiment, but in the spec.

Does this issue require a design doc/discussion? If there is a link to the design document/discussions, please provide it below.
Yes, see above. There is one disadvantage of this approach, which is that the experiment spec gets bigger. Considering that the metrics are simply moving from status to spec, it is not really that big of a disadvantage.

How will this feature be tested?

  • Existing tests will be modified to accommodate the change.

How will this feature be documented?

  • iter8.tools documentation.

Additional context
Assuming this feature makes it into v2alpha3, we can close #912 without having to change anything. Iter8 install will automatically become simplified (one-liner) because of this change, because there won't be any builtin metrics objects to install anymore.

sriumcp added the kind/enhancement, area/analytics, area/controller, area/explainability, and area/install labels on Aug 5, 2021

kalantar commented Aug 5, 2021

One lesson from v1 was to separate the metrics from the experiment. This reverses that. What has changed? Or do we conclude that it just didn't make anything easier?


kalantar commented Aug 5, 2021

This change does not really address a problem with (non-builtin) metrics -- they are hard to define. In part, I think this is because there is no good way to test them. If each is its own object, I can imagine a tool that would let us test just a single metric -- an authoring tool of some sort. By folding them into the experiment, this seems harder to do. How do we assist users in defining them?


kalantar commented Aug 5, 2021

If we go down this route, it seems like we should rethink the metrics (sub-)CRD. There would be a lot of repetition in it. For example, can we reuse secret, url, headers, authType, method, the (proposed) versionInfo -- can we share it with the existing versionInfo? -- even jqExpression? Clearly these can't be shared across all metrics, but is there value in grouping metrics with common properties like these?


sriumcp commented Aug 5, 2021

Hmm... I agree with your latest point, @kalantar, which really is where some thought needs to go in terms of the design.

Re: authoring tool, we were discussing #865 in this context. This subcommand can be modified as follows...

iter8ctl debug -e <experiment-name> -m <metric-name> -a <Iter8 analytics URL>

So, I don't think much changes in that regard.


sriumcp commented Aug 5, 2021

I am ok with leaving this issue open until v2beta1 instead of trying to accommodate it in v2alpha3.

I would really prefer that we come to an agreement on this issue... nail the design for the reuse of secrets, url, headers, authType, method, etc., and absorb it in v2alpha3.

name, type, versionInfo are the fields that will not be re-used across metrics sub-stanzas. Everything else potentially can be.


sriumcp commented Aug 5, 2021

Or do we conclude that it just didn't make anything easier?

Yes, this is indeed my conclusion. I think the CRD fields for metrics are beautiful :) But I am not at all sure a separate CRD is helping... in fact, I don't think it is. Hence, the case for ditching the CRD and retaining the fields.


Separating them out this way also enabled us to evolve the metric fields much more quickly (with minimal changes to the controller), with much of the heavy lifting for metrics being done by the analytics service. But I think we now have an excellent handle on the actual fields needed to fetch metrics (with examples for 4 different metrics DBs, builtins, and mocks).


sriumcp commented Aug 5, 2021

Re: metrics specification... a better attempt to factor out common fields is below.

metricDefaults: # optional; default values for metrics
# <everything in metrics CRD, except name and type and versionInfo, can go here... all these fields are optional >
# <the idea is to default any field in metrics that can be defaulted>
metrics:
- name: ...
  <exactly the same set of fields as in metrics CRD> # if a field is repeated here, and in metricDefaults, this will override the default.
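To make this concrete, here is one possible (hypothetical) rendering of the above sketch; the metric names are invented, and the URL, jqExpression, and query are reused from the earlier example:

metricDefaults:
  provider: prometheus
  url: http://myprometheusservice.com/api/v1
  jqExpression: ".data.result[0].value[1] | tonumber"
  type: Counter
metrics:
- name: request-count # inherits provider, url, jqExpression, and type from metricDefaults
  params:
  - name: query
    value: >-
      sum(increase(revision_app_request_latencies_count{service_name='${serviceName}'}[${elapsedTime}s])) or on() vector(0)
- name: mean-latency
  type: Gauge # overrides the defaulted Counter type
  params:
  - name: query
    value: >-
      (some other Prometheus query)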


kalantar commented Aug 5, 2021

My impression is that we will need something like:

metricsGroup:
- url: ... # <defaults for everything in the metrics CRD, except name, type, and versionInfo, can go here>
  metrics:
  - name: ...
- # another metricsGroup

This is to support the use of two metrics backends... which, it is my impression, is often the case for A/B testing: New Relic (reward) and Prometheus (SLO validation). Is this not true?


sriumcp commented Aug 5, 2021

How about the following? Same idea as yours, with some name changes.

backends:
  # in the default stanza, we expect to see the same fields as in metrics CRD, except name, type, versionInfo
- default: <defaults specific to this backend> 
  # in each metric, we only allow name, type, and versionInfo -- so no concept of overriding anymore
  metrics: <metrics using this backend>

In this design, if you have different defaults for metrics with the same metrics URL (an extremely unlikely scenario), it is still supported; you simply list them as two different backends.
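As a concrete (hypothetical) sketch of this backends structure for the two-backend A/B case mentioned above -- reward from New Relic, SLO validation from Prometheus -- with invented names, URLs, and jq expressions:

backends:
- default: # hypothetical New Relic backend for the reward metric
    provider: newrelic
    url: https://my-newrelic-endpoint.example.com # invented
    jqExpression: ".results[0].average | tonumber" # invented
  metrics: # only name, type, and versionInfo per metric
  - name: user-engagement
    type: Gauge
    versionInfo:
    - appName: hello # hypothetical per-version map
    - appName: hello-candidate
- default: # Prometheus backend for SLO validation
    provider: prometheus
    url: http://myprometheusservice.com/api/v1
    jqExpression: ".data.result[0].value[1] | tonumber"
  metrics:
  - name: mean-latency
    type: Gauge
    versionInfo:
    - serviceName: hello
    - serviceName: hello-candidate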

sriumcp added the v2beta1 label on Aug 6, 2021

sriumcp commented Aug 6, 2021

Please see #942 (comment) for an example.


kalantar commented Aug 6, 2021

If we are saying this supersedes #930, we should indicate here that we still want to shorten the names of the fields.


sriumcp commented Aug 6, 2021

Note: This issue also needs to address all concerns from #930. In particular, co-locate versionInfo with metrics in order to avoid the spooky-action-at-a-distance anti-pattern.


sriumcp commented Aug 11, 2021

kind: experiment
spec:
  metrics: # required
    # anywhere versionInfo appears, it is a list whose length needs to match length of versionNames
    # name and namespace will be commonly used by Iter8's builtin metrics
    # interpretation of name and namespace is metric specific
    # conventions for the metrics / configmap yet to be sorted out
    versionInfo: # required
    - namespace:
      iter8-mock/user-engagement: "15.0"
      iter8-mock/mean-latency: 
    - namespace:
      mean-latency: 
  criteria:
    reward:
    - metric: iter8-mock/user-engagement
      preferredDirection: high # or low, for reward metrics
    objectives:
    - metric: iter8-openshift-route/mean-latency
      upperLimit: 200 # msec
  versionNames:
  - current
  - candidate

params, body

kind: configmap
data:
  backends:
  - name: iter8-openshift-route
    secret: namespace/name # of the secret
    url: # take out templating for url for now
    headers: # are templates
    jqExpression:
    provider:
    method: 
    authType:
    
    metrics:
    - name: mean-latency
      type: gauge
      params:
      - name: param
        value: | 
           some complicated prometheus query with things like $name, $namespace, $serviceName, $elapsedTime
    - name: mean-latency-other
      type: gauge
      params:
      - name: body
        value: | 
           some complicated prometheus POST query with things like $name, $namespace, $serviceName, $elapsedTime

kalantar mentioned this issue on Aug 12, 2021
kalantar commented

Today, Metric.SampleSize is a reference to a counter metric resource. Is it correct to say that this is a backend concept (that can be overridden by a metric)? This used to refer to a metric CR. I assume now that it would refer to another metric object. Is it reasonable to assume that it must be defined in the same backend as the referencing metric? Or do we need a way to refer to metrics from other backends?


sriumcp commented Aug 17, 2021

We can drop Metric.SampleSize for now (we can discuss how to bring it back later, if we need to -- big if).

It was a noble concept when conceived (intended for use alongside confidence / other measures); we simply haven't used it yet (and it is adding complexity without value at the moment).


kalantar commented Aug 17, 2021

When we discussed this issue, we indicated that fields of a backend could be overridden by fields in one of the metrics. Does this apply to all fields, or are some fields, by definition, backend-only -- i.e., not overridable? The fields are (currently):

  • AuthType
  • Method
  • Body
  • Provider
  • JQExpression
  • Secret
  • Headers
  • URL

Should VersionInfo be here as well?


sriumcp commented Aug 17, 2021

Except for Body, all the other fields above (including versionInfo) should be backend specific.

Name, Type, Body, and Params should be metric specific.

You should be able to override everything within the experiment (or define new backends / metrics that are not pre-defined).
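A rough (hypothetical) sketch of that split -- backend-specific fields at the backend level, metric-specific fields under each metric -- following the configmap schema above; the backend name and query are invented:

backends:
- name: my-backend # hypothetical
  # backend-specific fields: secret, url, headers, authType, method, provider, jqExpression, versionInfo
  secret: namespace/name # of the secret
  url: http://myprometheusservice.com/api/v1
  provider: prometheus
  jqExpression: ".data.result[0].value[1] | tonumber"
  versionInfo:
  - serviceName: hello
  - serviceName: hello-candidate
  metrics:
  # metric-specific fields: name, type, body, params
  - name: mean-latency
    type: gauge
    params:
    - name: query
      value: some Prometheus query templated with $serviceName and $elapsedTime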


sriumcp commented Aug 17, 2021

Given that we are not really going to be able to get rid of versionInfo for metrics, I feel we don't need two different params (one for each version); we can keep the params / body structure as it is right now (i.e., once per metric, with versionInfo-based templating).


sriumcp commented Aug 17, 2021

kind: configmap
data:
  backends:
  - name: iter8-openshift-route
    secret: namespace/name # of the secret
    url: # take out templating for url for now
    headers: # are templates
    jqExpression:
    provider:
    method: 
    authType:
    
    metrics:
    - name: mean-latency
      type: gauge
      params:
      - name: param
        value: some complicated prometheus query with things like $name, $namespace, $serviceName, $elapsedTime
        # stuff like $name, $namespace, and $serviceName will come from the `versionInfo` section of this metric in the experiment

kalantar commented

Now that I look at the definition of VersionInfo:

VersionInfo []VersionDetail `json:"versionInfo,omitempty" yaml:"versionInfo,omitempty"`
...
type VersionDetail map[string]string

I am unsure that I like the use of VersionInfo in Experiment.Spec as a list of version labels. I wonder if we really just want two different names: one for the list of version labels and one for the map of name/value pairs.

kalantar commented

Sorry... I am looking at your example (and I see others) with a configmap. Did we decide that all metrics are in configmaps, not in the experiment? Or what is the breakdown we have in mind? It makes sense to have backends in configmaps, I think.


sriumcp commented Aug 17, 2021

We do not make any assumptions in terms of breakdown between configmap and experiment metrics...

  1. Define whatever you want to define in the configmap.
  2. (Re)define whatever you want to define in the experiment. If anything in the experiment overlaps with anything in the configmap, the experiment will override the configmap.

As far as the schema goes... for both, we will use the same schema as above (a list of backends, each with a list of metrics).
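For example (a hypothetical fragment; the URLs are invented), a backend defined in the configmap could be partially redefined in the experiment, with the experiment winning wherever the two overlap:

# configmap
backends:
- name: iter8-openshift-route
  url: http://prometheus.example.com/api/v1 # invented
  jqExpression: ".data.result[0].value[1] | tonumber"
  metrics:
  - name: mean-latency
    type: gauge

# experiment: redefines only the url; everything else is taken from the configmap
backends:
- name: iter8-openshift-route
  url: http://my-other-prometheus.example.com/api/v1 # invented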

kalantar commented

Can there be more than one configmap? Do we use a fixed name for the configmap? Or is there a reference in the experiment? Or via an annotation on the configmap?


sriumcp commented Aug 17, 2021

Could we please start with a single configmap that the analytics service is bootstrapped with? The name of the configmap can be fixed in the analytics deployment spec.

Definitely no reference to the configmap in the experiment to start with (if we feel the need for it, we can annotate the experiment subsequently).

The idea is that Iter8 will come with a few pre-baked backends and metrics for many domains, which the user can start using (with almost no changes, except perhaps the URL). Of course, if the user wants to extend the backends or the metrics within a given backend, they can do so in individual experiments to begin with, and change the configmap subsequently (we can document this process).


Moved VersionInfo to backends (instead of Metrics). Please see above.

kalantar commented

Could we please start with a single configmap that the analytics service is bootstrapped with? The name of the configmap can be fixed in the analytics deployment spec.

It also needs to be configured in iter8ctl.

kalantar commented

Could we please start with a single configmap that the analytics service is bootstrapped with? The name of the configmap can be fixed in the analytics deployment spec.

We should also have it configured in etc3; it should be validated if users are able to modify it. The sooner a problem is found, the better.


sriumcp commented Aug 17, 2021

The validation can happen in analytics. I see no reason why every iter8 component needs to be exposed to this configmap (at least to begin with).

If the configmap is invalid, analytics will simply not start (and quit with an appropriate error message).


Analytics already does the heavy lifting when it comes to (fetching) metrics. It can do so when it comes to validation of metrics definitions (in the configmap) as well.


When v2alpha1 was designed, I recall the decision to have etc3 populate the metrics. It was partly due to our unfamiliarity with the Python client APIs for K8s. This is not the case anymore. We're already using the Python K8s APIs to fetch secrets in analytics... as far as this configmap is concerned, even the K8s client APIs will not be required (to begin with), because the analytics pod can start up with the configmap.

Validation in Python will use Pydantic models.



sriumcp commented Aug 17, 2021

We should also have it configured in etc3; it should be validated if users are able to modify it. The sooner a problem is found, the better.

At least initially, the configmap will be an install-time artifact. So, the problem will be found at the soonest possible instant -- as soon as Iter8 is installed. Eventually, we can decide how to let users edit the configmap more dynamically (if we find that this is what users actually want).


The user always has the option of ignoring / overriding what is in the configmap and directly specifying the metrics configuration in the experiment (which is typed and validated at the CRD level).

kalantar commented

Will we also defer to analytics the validation that any criterion's reference to a metric is valid?


sriumcp commented Aug 17, 2021

Yes, that's correct. Analytics can simply complain (in a manner and format friendly to iter8ctl debug) that it cannot find the corresponding metric, and will do nothing for that criterion.

In other words, when the user does iter8ctl describe, they will see certain metrics being unavailable, and objectives corresponding to those metrics not being satisfied (or rewards marked as unavailable). And when the user does iter8ctl debug, they will know why this happened because of the iter8logs emitted by analytics.

Anyway, this is one of the use-cases we identified for iter8logs. We can finesse the actual format / validation logic when we get to that issue.

kalantar commented

Moved VersionInfo to backends (instead of Metrics). Please see above.

This doesn't seem intuitive, especially if it is in a configmap. Perhaps that is just a poor choice, and it would be only in an experiment.

Consider the options from closest to furthest use. The closest would be in the reward/objective itself:

criteria:
  rewards:
  - metric: 
    versionInfo:
  objectives:
  - metric:
    versionInfo:

It is my observation that if there is more than one reward/criterion, it is likely that much of the versionInfo will be duplicated. Hence it may make sense to put it at the level of criteria:

criteria:
  versionInfo:
  rewards:
  objectives:

If we define metrics in the experiment, it may make sense to include versionInfo here. In particular, we would expect the variables to be used here, in the definition of the metric. When this is in the experiment, it makes sense. However, when it is in a configmap, less so, since we don't know what the versions are. A scenario that uses fixed names would work, though.

metrics:
  versionInfo:
  -  name: metric_name
     type: Gauge
     params:
  - name: another_metric
criteria:
  rewards:
  objectives:

Furthest is the backend. This is mostly metric independent so perhaps this is not the right place? A backend may include a list of metrics; it would make sense there as above.

Does this make sense? I am proposing not to include them in the reward/objective but to allow them in criteria and metrics.


sriumcp commented Aug 18, 2021

If you take a specific backend, for example the Prometheus add-on supplied by Istio, and look at the metrics exported by it, there are specific label values associated with different versions. And this association is the same across the metrics exported by Istio.

For example, you will see destination_service_name and source_service_name labels appearing consistently across metrics for this backend (like sum_latency, sum_requests_total, and so on).


The above situation is the same with Linkerd, Sysdig (used in Code Engine), KFServing (the Prometheus add-on I created for the KFServing project), and so on.


The expectation is that versionInfo, possibly URL, and possibly secret are the only three fields that the user might need to set in order to use builtin backends and builtin metrics.

  1. Of these three, URL should be configured as part of the configmap (so this should surface as an install-time option). Of course, if the user really wants, they can override it in the experiment.
  2. Secret will be surfaced both as an install-time option and can be overridden in the experiment (for single-tenant use-cases, install time is the better (meaning, easier) way to do this; all applications / experiments will use the same secret).
  3. VersionInfo will be some dummy (or null) value in the configmap, and should be supplied as part of the experiment, at a backend level. VersionInfo is fundamentally application (version) specific, while a backend is cluster specific (and intended to be common across multiple applications). Hence the need to override versionInfo for a backend in each experiment.

metrics:
  versionInfo:
  -  name: metric_name
     type: Gauge
     params:
  - name: another_metric

The above YAML seems to be mixing a list with a map key, which is invalid (unless the idea is to put the metric definitions under versionInfo, which is way too complicated).

I am assuming the intent is more like ...

metrics:
  versionInfo: <a map here>
  metricInfo:
  - name: metric_name
    type: Gauge
    params:
  - name: another_metric

This is conceptually equivalent to putting the versionInfo under backends, with the only difference being that the user has more typing to do, and the overall CR looks more complicated because of the needless extra field under metrics (which used to be simply a list of metrics).
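For comparison, a minimal sketch of the backend-level placement within the experiment, assuming the same backends schema as the configmap above (whether the backends list sits directly under spec or under a nested key is an open detail):

spec:
  backends:
  - name: iter8-openshift-route
    versionInfo: # one map per version, overriding the dummy value in the configmap
    - serviceName: hello
    - serviceName: hello-candidate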


IMO, versionInfo should have nothing to do with criteria (which is very clean at the moment, and can be left alone without complicating it further; also please see #973).


IMO, none of the above discussion changes whether we use a configmap or a CRD for describing backends. The discussion re: where we put versionInfo within the experiment remains the same.
