Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature] "MetricsContext#add" (reduce metrics duplication) #46

Open
simonespa opened this issue Jun 14, 2020 · 9 comments
Open

[Feature] "MetricsContext#add" (reduce metrics duplication) #46

simonespa opened this issue Jun 14, 2020 · 9 comments
Labels
enhancement New feature or request

Comments

@simonespa
Copy link

simonespa commented Jun 14, 2020

Describe the user story

Currently, all metrics are filtered against all defined dimensions, resulting in metrics associated to unrelated dimensions with a consequent duplication, unless you flush the metric logger multiple times, resulting in separate JSON logs in CloudWatch.

As a developer I'd like to filter a group of metrics against different dimensions within the same metric logger context (same JSON payload)

Use Case

I'm currently working on a project where the server sends events rather than individual metrics. This event is a single JSON object containing all relevant metrics, dimensions and properties measured during the course of it (the metric logger is flushed only once per event instance).

As an example, let's take a pretty common event for web applications and call it page-request, which is triggered any time a web page is requested by a user. Let's assume the collected metrics and dimensions are the followings:

RequestCount
Counts the number of HTTP requests the server receives from the user. This metric is used to calculate the RPS.

  • Dimensions: PageType
  • Unit: Count
  • Aggregations: Sum

ResponseTime
The response time in milliseconds.

  • Dimensions: PageType
  • Unit: Milliseconds
  • Aggregations: Avg, 50th percentile, 95th percentile, 99th percentile

UpstreamRequestCount
Counts the the number of HTTP request the app performs towards its upstream services.

  • Dimensions: Client
  • Unit: Count
  • Aggregations: Sum

Where:

  • PageType: is the type of page requested by the user (e.g. home, player, etc)
  • Client: the name of the upstream service

From the example, RequestCount and ResponseTime share the same dimension, whereas UpstreamRequestCount is applied to a different one.

Let's write an example of metric logger which is called once immediately after the HTTP response has been sent to the user

const namespace = config.get('namespace');

export const logPageRequest = metricScope(metrics => {
  return async pageRequestEvent => {
    const {
      requestCount,
      responseTime,
      upstreamRequestCount,
      pageType,
      client
    } = pageRequestEvent;
    metrics.setNamespace(namespace);
    metrics.putMetric('RequestCount', requestCount, Unit.Count);
    metrics.putMetric('ResponseTime', responseTime, Unit.Milliseconds);
    metrics.putMetric('UpstreamRequestCount', upstreamRequestCount, Unit.Count);
    metrics.setDimensions(
      { PageType: pageType },
      { Client: client }
    );
  };
});

This example generates the following JSON log

default

and the following metrics are extracted

Screenshot 2020-06-12 at 16 47 05

Screenshot 2020-06-12 at 16 47 22

Screenshot 2020-06-12 at 16 47 47

We can see that UpstreamRequestCount is also applied to the PageType dimension and RequestCount and ResponseTime to Client, effectively generating unnecessary new metrics (3 in this example).

Describe the outcome you'd like

According to the previous example, I'd like to filter the UpstreamRequestCount by Client only and RequestCount and ResponseTime by PageType, resulting in the following metrics

Screenshot 2020-06-12 at 16 49 41

Screenshot 2020-06-12 at 16 50 00

Screenshot 2020-06-12 at 16 50 22

Here, the PageType group contains only the metrics that we want to apply, same thing for Client.

According to the EMF specification it is possible to add multiple CloudWatchMetrics objects

"CloudWatchMetrics": [
  {
    ... ...
  },
  {
    ... ...
  }
]

in order to define different groups of metrics that we want to apply to different dimensions. If we consider the previous example once again, we need to generate a JSON payload like the following

new

where the two metrics sharing the same dimensions are defined within the same CloudWatchMetrics object.

Generally speaking, each CloudWatchMetrics object contains metric that are filtered by the same group of dimensions.

Describe the solution you are proposing

To do so, I'm proposing to add a new method to the MetricsContext interface called add. The method will accept only one parameter which is an object defined in the next section.

Syntax

{
    "Name": String,
    "Value": Number,
    "Unit": String,
    "Metrics": [ MetricItem, ... ],
    "Dimensions": Object
}

Properties

Name

The metric name.

Required: only if Metrics is undefined or an empty array, optional otherwise.
Type: String

Value

The metric value.

Required: only if Metrics is undefined or an empty array, optional otherwise.
Type: Number

Unit

The metric unit (e.g. Unit.Count, Unit.Milliseconds, etc.)

Required: only if Metrics is undefined or an empty array, optional otherwise.
Type: Number

Metrics

An array of objects (see MetricItem type).

Required: only if Name, Value and Unit are undefined, optional otherwise.
Type: Array

Dimensions

The dimensions to filter the defined metrics by. This objects is a map of key/value pairs that stores the name and value of the dimension. Each property value must be of type String.

Required: yes
Type: Object

Types

MetricItem

{
  "Name": String,
  "Value": Number
  "Unit": String
}

Considering the previous example, our metric logger will look like

const namespace = config.get('namespace');

export const logPageRequest = metricScope(metrics => {
  return async pageRequestEvent => {
    const {
      requestCount,
      responseTime,
      upstreamRequestCount,
      pageType,
      client
    } = pageRequestEvent;
    metrics.setNamespace(namespace);
    metrics.add({
      Metrics: [
        {
          Name: 'RequestCount',
          Value: requestCount,
          Unit: Unit.Count
        },
        {
          Name: 'ResponseTime',
          Value: responseTime,
          Unit: Unit.Milliseconds
        }
      ],
      Dimensions: { PageType: pageType }
    });
  };
});

When we have one metric, we can either do

    metrics.add({
      Metrics: [
        {
          Name: 'UpstreamRequestCount',
          Value: upstreamRequestCount,
          Unit: Unit.Count
        }
      ],
      Dimensions: { Client: client }
    });

or

    metrics.add({
      Name: 'UpstreamRequestCount',
      Value: upstreamRequestCount,
      Unit: Unit.Count,
      Dimensions: { Client: client }
    });

Any other considerations about the solution

We could have modified the LogSerializer to optionally generate the multiple CloudWatchMetrics objects by means of a flag, but currently there is no association between group of metrics sharing the same dimensions. To achieve that, we could have modified the internal data structure by adding a mapping between them, resulting in a new method anyway to allow the user to express this relationship via the public API.

By creating a brand new method we keep the current data structure as is and the API back compatible with the previous version. The new method will have a separate data structure to allow the LogSerializer to easily understand how to serialise it.

@simonespa simonespa changed the title [Feature] Apply metrics to different groups of dimensions - MetricsContext#add [Feature] MetricsContext#add (reduce metrics duplication) Jun 14, 2020
@simonespa simonespa changed the title [Feature] MetricsContext#add (reduce metrics duplication) [Feature] "MetricsContext#add" (reduce metrics duplication) Jun 14, 2020
@jaredcnance
Copy link
Member

jaredcnance commented Jun 14, 2020

Thanks for the detailed write-up. This is definitely do-able and is something that's been requested a few times. I really appreciate the time you took to propose a new API. In general, I like the proposal. The first concern I have is how add might treat the case where single and multiple metrics are both supplied. I think I would prefer to keep a simple interface for add and perhaps introduce a second method allowing a single metric. I don't think having top-level Name/Value/Unit adds enough to warrant the additional complexity in a single method, this seems fine to me:

metrics.add({
      Metrics: [
        { Name: 'RequestCount', Value: requestCount, Unit: Unit.Count }
      ],
      Dimensions: { PageType: pageType }
    });

That being said, another alternative that could handle single and multiple metrics without changing the signature might be:

addAggregation(dimensions: Set<String, String>, ...metrics: MetricItem[]);

which would be used like:

// single
metrics.addAggregation(
  { PageType: pageType }, 
  { Name: 'RequestCount', Value: requestCount, Unit: Unit.Count });

// multiple
metrics.addAggregation(
  { PageType: pageType }, 
  { Name: 'RequestCount', Value: requestCount, Unit: Unit.Count },
  { Name: 'ResponseTime', Value: responseTime, Unit: Unit.Milliseconds });

Another alternative that addresses the multi-dimension issue:

metrics.setProperty("PageType", pageType);
metrics.setProperty("Client", client);

metrics.addAggregation([ "PageType" ], 
  { Name: 'RequestCount', Value: requestCount, Unit: Unit.Count });

metrics.addAggregation([ "PageType", "Client" ],
  { Name: 'RequestCount', Value: requestCount, Unit: Unit.Count },
  { Name: 'ResponseTime', Value: responseTime, Unit: Unit.Milliseconds });

That still requires the author to define the same metric twice. So, another alternative that attempts to eliminate redundancy:

metrics.setProperty("PageType", pageType);
metrics.setProperty("Client", client);

metrics.add({ 
  name: "RequestCount",
  value: requestCount,
  unit:  Unit.Milliseconds,
  dimensions: [ [ "PageType" ], [ "PageType", "Client" ] ]
});

metrics.add({ 
  name: "ResponseTime",
  value: responseTime,
  unit:  Unit.Milliseconds,
  dimensions: [ [ "PageType", "Client" ] ]
});

@jaredcnance jaredcnance added the enhancement New feature or request label Jun 14, 2020
@simonespa
Copy link
Author

simonespa commented Jun 18, 2020

Hi @jaredcnance, thanks for your reply.

The first concern I have is how add might treat the case where single and multiple metrics are both supplied. I think I would prefer to keep a simple interface for add and perhaps introduce a second method allowing a single metric.

I share your same concern actually. I had that thought while I was writing the issue to be honest.

The simplest solution appears to be:

metrics.add({
  Metrics: [ MetricItem, ...],
  Dimensions: { Key: 'Value' }
});

This way we don't have to do any check on the optionality of the parameters and it simplify the implementation.

In case we want to provide two method signatures - one for multi metrics and one for single metric - we have to opt for the positional input parameters version, something like

metrics.add(metrics: Array<MetricItem>, dimensions: Object)
metrics.add(metrics: MetricItem, dimensions: Object)

This way we can overload the method and ensure that those 2 input parameters must be defined (no optionality), having something like:

// Multi metrics
metrics.add([
  {
    Name: 'RequestCount',
    Value: 1,
    Unit: Unit.Count
  },
  {
    Name: 'ResponseTime',
    Value: 59,
    Unit: Unit.Milliseconds
  }
], {
  PageType: 'player'
});

// Single metric
metrics.add({
  Name: 'UpstreamRequestCount',
  Value: 1,
  Unit: Unit.Count
}, {
  Client: 'upstream'
});

Whereas, if we still want to stick with the object parameter, we could think of:

metrics.add({
  Metric: MetricItem,
  Metrics: [ MetricItem, ... ],
  Dimensions: { Key: 'Value' }
});

where this time there is no top-level properties that complicates the situation. The simplest implementation that I could think of for this won't check for Metric and Metrics to be mutually exclusive. This could be used like:

metrics.add({
  Metric: {
    Name: 'UpstreamRequestCount',
    Value: 1,
    Unit: Unit.Count
  },
  Dimensions: { Client: 'upstream' }
});

or

metrics.add({
  Metrics: [
    {
      Name: 'RequestCount',
      Value: 1,
      Unit: Unit.Count
    },
    {
      Name: 'ResponseTime',
      Value: 1,
      Unit: Unit.Milliseconds
    }
  ],
  Dimensions: { Client: 'upstream' }
});

If someone for some reason decide to do

metrics.add({
  Metric: {
    Name: 'SomeExtraOne',
    Value: 1,
    Unit: Unit.Count
  },
  Metrics: [
    {
      Name: 'RequestCount',
      Value: 1,
      Unit: Unit.Count
    },
    {
      Name: 'ResponseTime',
      Value: 1,
      Unit: Unit.Milliseconds
    }
  ],
  Dimensions: { Client: 'upstream' }
});

The simplest solution is to allow it (no conditional checks) and merge Metric and Metrics together.

Some considerations

Now, let's say we decided which signature flavour to pick, there are a couple of important considerations worth noting.

Coexistence of old and new methods

The API still has the putMetric, putDimensions, setDimensions methods, and these could be potentially used together with the new method that we are going to introduce. I think that the simplest solution should be that if both approaches are used in the same codebase, the Serialiser should simply merge the metrics. For example, let's say we have the following situation:

metrics.add({
  Metrics: [
    {
      Name: 'RequestCount',
      Value: 1,
      Unit: Unit.Count
    },
    {
      Name: 'ResponseTime',
      Value: 100,
      Unit: Unit.Milliseconds
    }
  ],
  Dimensions: { PageType: 'player' }
});
metrics.putMetric('UpstreamRequestCount', 1, Unit.Count);
metrics.setDimensions(
      { Client: 'upstream' }
);

This should generate the following serialised JSON payload:

{
  "_aws": {
    "Timestamp": 1592319905021,
    "LogGroupName": "/awslogs/test",
    "CloudWatchMetrics": [
        {
            "Dimensions": [
                [
                    "PageType"
                ]
            ],
            "Metrics": [
                {
                    "Name": "RequestCount",
                    "Unit": "Count"
                },
                {
                    "Name": "ResponseTime",
                    "Unit": "Milliseconds"
                }
            ],
            "Namespace": "/awslogs/test"
        },
        {
            "Dimensions": [
                [
                    "Client"
                ]
            ],
            "Metrics": [
                {
                    "Name": "UpstreamRequestCount",
                    "Unit": "Count"
                }
            ],
            "Namespace": "/awslogs/test"
        }
    ]
},
"RequestCount": 1,
"ResponseTime": 100,
"PageType": "player",
"UpstreamRequestCount": 1,
"Client": "upstream"
}

Anything added via the old methods, goes within the same CloudWatchMetrics object, whereas everything added via the new add method will be added as a new CloudWatchMetrics object. This way we keep the API back-compatible and then in future we could potentially think of deprecating putMetric, putDimensions and setDimensions and keep the APi with one simple method.

Default dimensions

With the current API I can call setDimensions and reset them or use putDimensions and keep them by adding the custom dimensions.

With the new add method I assume we would like to keep the same functionality. One solution that comes to mind is doing so via a boolean flag where we modify the behaviour of add based on its value.

Example 1

metrics.add(input: Object, resetDimension: boolean);

metrics.add({
  Metric: MetricItem,
  Metrics: [ MetricItem, ... ],
  Dimensions: { Key: 'Value' }
}, true);

Example 2

metrics.add({
  Metric: MetricItem,
  Metrics: [ MetricItem, ... ],
  Dimensions: { Key: 'Value' },
  Reset: boolean
});

Example 3

metrics.add(metrics: Array<MetricItem>, dimensions: Object, resetDimensions: boolean)
metrics.add(metrics: MetricItem, dimensions: Object, resetDimensions: boolean)

otherwise we will need to create another method that does so by setting the flag, something like:

metrics.resetDefaultDimensions(true);

What to do when the add method is called with the same metric name

Let's say we call add with the same metric name but with different value, or unit, or even dimension association, what should we do? For example:

metrics.add({
  Metric: {
    Name: 'RequestCount',
    Value: 1,
    Unit: Unit.Count
  },
  Dimensions: { Client: 'upstream' }
});

metrics.add({
  Metric: {
    Name: 'RequestCount',
    Value: 50,
    Unit: Unit.Count
  },
  Dimensions: { Type: 'other' }
});

Do we treat the metric name as a key, and we override the Value, Unit and all dimensions with the last entry? At first it seems a good idea, but if you think of it, you can have multiple metrics, in that case which one is the key? What do you override? As you can see a lot of other complication and inconsistencies arise.

I think that we should just add them as a new CloudWatchMetrics object and leave CloudWatch deal with it. If the EMF spec allows it, we shouldn't try to validate the semantic, but just the schema. If you do something similar, you'll face the consequences by having the same metric potentially with different dimensions or even worse, different units.

Ok, I promise this is all 😄

Let me know your thoughts. I'm available to help out if needed.

@jaredcnance
Copy link
Member

jaredcnance commented Aug 18, 2020

I think I agree with the suggestions above. Conceptually, I think we need the idea of supporting multiple aggregations. The existing APIs operate on the “default” aggregation and the new method(s) allow the creation of new aggregations. This makes it fairly straightforward to implement.

Regarding default dimensions, I think having an optional Boolean parameter includeDefaultDimensions with a value of true might be fine. Reset isn’t very clear to me.

Regarding duplicate metrics, we could do this for now and later provide the ability to increase validation strictness. Today, we don’t throw on any validation errors and this was IMO a mistake. I think a good starting point would be to introduce a validationStrictness option that would control the behavior of the library when validation errors occur.

@davidtheclark
Copy link
Contributor

I'd also appreciate this feature.

The proposed interface mostly makes sense to me. I worry that the method name add is too generic, for two reasons: (1) The MetricLogger is used to add non-metric properties to logs, as well as metric data points, so the name add is ambiguous about whether you're adding a property or a metric. (2) Every user will have to ask "When do I use add and when do I use putMetric? They sound the same: the names don't indicate a difference." Is putMetricWithDimensions too verbose?

I think it would be useful to expose a way for the user to see the default dimensions — those set with putDimensions and setDimensions — since they might want to transform that list in some way for their metric-specific dimensions. For example, I could use a metrics.dimensions property, or metrics.getDimensions() method, to add one additional dimension for my special metric, while making sure it also uses all the default dimensions.

@jaredcnance
Copy link
Member

Thanks for the input @davidtheclark. I can get behind putMetricsWithDimensions or putAggregation(...). The proposed behavior aligns more with the semantics of put over add since adding new dimension or metric values would replace existing ones.

I think it would be useful to expose a way for the user to see the default dimensions

Is your idea that you would do something like this? Or have I misunderstood?

metrics.add({
  Metric: MetricItem,
  Dimensions: { ...metrics.getDefaultDimensions(), ...myDimensions }
});

@davidtheclark
Copy link
Contributor

@jaredcnance Yes, that's what I had in mind for using the exposed default dimensions.

Why is "aggregation" a potential term here? What is being aggregated?

@jaredcnance
Copy link
Member

jaredcnance commented Aug 18, 2020

Metric values are aggregated along dimensions. Each unique namespace and dimension combination results in an aggregation for a given metric name (sum, min, max, count, percentile distribution, etc.). The proposed method defines a new aggregation for a given metric (combination of dimensions and namespace). If that wasn't already clear, then it may not be a useful term in this context.

@jaredcnance
Copy link
Member

@davidtheclark @simonespa I've opened #54 to provide a concrete example of what this API might look like. Please add your comments there.

@simonespa
Copy link
Author

hi @jaredcnance thanks for that. I'll have a look and leave feedbacks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants