-
Notifications
You must be signed in to change notification settings - Fork 36
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Feature] "MetricsContext#add" (reduce metrics duplication) #46
Comments
MetricsContext#add
MetricsContext#add
(reduce metrics duplication)
MetricsContext#add
(reduce metrics duplication)
Thanks for the detailed write-up. This is definitely do-able and is something that's been requested a few times. I really appreciate the time you took to propose a new API. In general, I like the proposal. The first concern I have is how metrics.add({
Metrics: [
{ Name: 'RequestCount', Value: requestCount, Unit: Unit.Count }
],
Dimensions: { PageType: pageType }
}); That being said, another alternative that could handle single and multiple metrics without changing the signature might be: addAggregation(dimensions: Set<String, String>, ...metrics: MetricItem[]); which would be used like: // single
metrics.addAggregation(
{ PageType: pageType },
{ Name: 'RequestCount', Value: requestCount, Unit: Unit.Count });
// multiple
metrics.addAggregation(
{ PageType: pageType },
{ Name: 'RequestCount', Value: requestCount, Unit: Unit.Count },
{ Name: 'ResponseTime', Value: responseTime, Unit: Unit.Milliseconds }); Another alternative that addresses the multi-dimension issue: metrics.setProperty("PageType", pageType);
metrics.setProperty("Client", client);
metrics.addAggregation([ "PageType" ],
{ Name: 'RequestCount', Value: requestCount, Unit: Unit.Count });
metrics.addAggregation([ "PageType", "Client" ],
{ Name: 'RequestCount', Value: requestCount, Unit: Unit.Count },
{ Name: 'ResponseTime', Value: responseTime, Unit: Unit.Milliseconds }); That still requires the author to define the same metric twice. So, another alternative that attempts to eliminate redundancy: metrics.setProperty("PageType", pageType);
metrics.setProperty("Client", client);
metrics.add({
name: "RequestCount",
value: requestCount,
unit: Unit.Milliseconds,
dimensions: [ [ "PageType" ], [ "PageType", "Client" ] ]
});
metrics.add({
name: "ResponseTime",
value: responseTime,
unit: Unit.Milliseconds,
dimensions: [ [ "PageType", "Client" ] ]
}); |
Hi @jaredcnance, thanks for your reply.
I share your same concern actually. I had that thought while I was writing the issue to be honest. The simplest solution appears to be: metrics.add({
Metrics: [ MetricItem, ...],
Dimensions: { Key: 'Value' }
}); This way we don't have to do any check on the optionality of the parameters and it simplify the implementation. In case we want to provide two method signatures - one for multi metrics and one for single metric - we have to opt for the positional input parameters version, something like metrics.add(metrics: Array<MetricItem>, dimensions: Object)
metrics.add(metrics: MetricItem, dimensions: Object) This way we can overload the method and ensure that those 2 input parameters must be defined (no optionality), having something like: // Multi metrics
metrics.add([
{
Name: 'RequestCount',
Value: 1,
Unit: Unit.Count
},
{
Name: 'ResponseTime',
Value: 59,
Unit: Unit.Milliseconds
}
], {
PageType: 'player'
});
// Single metric
metrics.add({
Name: 'UpstreamRequestCount',
Value: 1,
Unit: Unit.Count
}, {
Client: 'upstream'
}); Whereas, if we still want to stick with the object parameter, we could think of: metrics.add({
Metric: MetricItem,
Metrics: [ MetricItem, ... ],
Dimensions: { Key: 'Value' }
}); where this time there is no top-level properties that complicates the situation. The simplest implementation that I could think of for this won't check for metrics.add({
Metric: {
Name: 'UpstreamRequestCount',
Value: 1,
Unit: Unit.Count
},
Dimensions: { Client: 'upstream' }
}); or metrics.add({
Metrics: [
{
Name: 'RequestCount',
Value: 1,
Unit: Unit.Count
},
{
Name: 'ResponseTime',
Value: 1,
Unit: Unit.Milliseconds
}
],
Dimensions: { Client: 'upstream' }
}); If someone for some reason decide to do metrics.add({
Metric: {
Name: 'SomeExtraOne',
Value: 1,
Unit: Unit.Count
},
Metrics: [
{
Name: 'RequestCount',
Value: 1,
Unit: Unit.Count
},
{
Name: 'ResponseTime',
Value: 1,
Unit: Unit.Milliseconds
}
],
Dimensions: { Client: 'upstream' }
}); The simplest solution is to allow it (no conditional checks) and merge Some considerationsNow, let's say we decided which signature flavour to pick, there are a couple of important considerations worth noting. Coexistence of old and new methodsThe API still has the metrics.add({
Metrics: [
{
Name: 'RequestCount',
Value: 1,
Unit: Unit.Count
},
{
Name: 'ResponseTime',
Value: 100,
Unit: Unit.Milliseconds
}
],
Dimensions: { PageType: 'player' }
});
metrics.putMetric('UpstreamRequestCount', 1, Unit.Count);
metrics.setDimensions(
{ Client: 'upstream' }
); This should generate the following serialised JSON payload: {
"_aws": {
"Timestamp": 1592319905021,
"LogGroupName": "/awslogs/test",
"CloudWatchMetrics": [
{
"Dimensions": [
[
"PageType"
]
],
"Metrics": [
{
"Name": "RequestCount",
"Unit": "Count"
},
{
"Name": "ResponseTime",
"Unit": "Milliseconds"
}
],
"Namespace": "/awslogs/test"
},
{
"Dimensions": [
[
"Client"
]
],
"Metrics": [
{
"Name": "UpstreamRequestCount",
"Unit": "Count"
}
],
"Namespace": "/awslogs/test"
}
]
},
"RequestCount": 1,
"ResponseTime": 100,
"PageType": "player",
"UpstreamRequestCount": 1,
"Client": "upstream"
} Anything added via the old methods, goes within the same Default dimensionsWith the current API I can call With the new Example 1 metrics.add(input: Object, resetDimension: boolean);
metrics.add({
Metric: MetricItem,
Metrics: [ MetricItem, ... ],
Dimensions: { Key: 'Value' }
}, true); Example 2 metrics.add({
Metric: MetricItem,
Metrics: [ MetricItem, ... ],
Dimensions: { Key: 'Value' },
Reset: boolean
}); Example 3 metrics.add(metrics: Array<MetricItem>, dimensions: Object, resetDimensions: boolean)
metrics.add(metrics: MetricItem, dimensions: Object, resetDimensions: boolean) otherwise we will need to create another method that does so by setting the flag, something like: metrics.resetDefaultDimensions(true); What to do when the add method is called with the same metric nameLet's say we call add with the same metric name but with different value, or unit, or even dimension association, what should we do? For example: metrics.add({
Metric: {
Name: 'RequestCount',
Value: 1,
Unit: Unit.Count
},
Dimensions: { Client: 'upstream' }
});
metrics.add({
Metric: {
Name: 'RequestCount',
Value: 50,
Unit: Unit.Count
},
Dimensions: { Type: 'other' }
}); Do we treat the metric name as a key, and we override the Value, Unit and all dimensions with the last entry? At first it seems a good idea, but if you think of it, you can have multiple metrics, in that case which one is the key? What do you override? As you can see a lot of other complication and inconsistencies arise. I think that we should just add them as a new Ok, I promise this is all 😄 Let me know your thoughts. I'm available to help out if needed. |
I think I agree with the suggestions above. Conceptually, I think we need the idea of supporting multiple aggregations. The existing APIs operate on the “default” aggregation and the new method(s) allow the creation of new aggregations. This makes it fairly straightforward to implement. Regarding default dimensions, I think having an optional Boolean parameter Regarding duplicate metrics, we could do this for now and later provide the ability to increase validation strictness. Today, we don’t throw on any validation errors and this was IMO a mistake. I think a good starting point would be to introduce a validationStrictness option that would control the behavior of the library when validation errors occur. |
I'd also appreciate this feature. The proposed interface mostly makes sense to me. I worry that the method name I think it would be useful to expose a way for the user to see the default dimensions — those set with |
Thanks for the input @davidtheclark. I can get behind
Is your idea that you would do something like this? Or have I misunderstood? metrics.add({
Metric: MetricItem,
Dimensions: { ...metrics.getDefaultDimensions(), ...myDimensions }
}); |
@jaredcnance Yes, that's what I had in mind for using the exposed default dimensions. Why is "aggregation" a potential term here? What is being aggregated? |
Metric values are aggregated along dimensions. Each unique namespace and dimension combination results in an aggregation for a given metric name (sum, min, max, count, percentile distribution, etc.). The proposed method defines a new aggregation for a given metric (combination of dimensions and namespace). If that wasn't already clear, then it may not be a useful term in this context. |
@davidtheclark @simonespa I've opened #54 to provide a concrete example of what this API might look like. Please add your comments there. |
hi @jaredcnance thanks for that. I'll have a look and leave feedbacks. |
Describe the user story
Currently, all metrics are filtered against all defined dimensions, resulting in metrics associated to unrelated dimensions with a consequent duplication, unless you flush the metric logger multiple times, resulting in separate JSON logs in CloudWatch.
As a developer I'd like to filter a group of metrics against different dimensions within the same metric logger context (same JSON payload)
Use Case
I'm currently working on a project where the server sends events rather than individual metrics. This event is a single JSON object containing all relevant metrics, dimensions and properties measured during the course of it (the metric logger is flushed only once per event instance).
As an example, let's take a pretty common event for web applications and call it
page-request
, which is triggered any time a web page is requested by a user. Let's assume the collected metrics and dimensions are the followings:RequestCount
Counts the number of HTTP requests the server receives from the user. This metric is used to calculate the RPS.
PageType
Count
Sum
ResponseTime
The response time in milliseconds.
PageType
Milliseconds
Avg
,50th percentile
,95th percentile
,99th percentile
UpstreamRequestCount
Counts the the number of HTTP request the app performs towards its upstream services.
Client
Count
Sum
Where:
PageType:
is the type of page requested by the user (e.g. home, player, etc)Client
: the name of the upstream serviceFrom the example,
RequestCount
andResponseTime
share the same dimension, whereasUpstreamRequestCount
is applied to a different one.Let's write an example of metric logger which is called once immediately after the HTTP response has been sent to the user
This example generates the following JSON log
and the following metrics are extracted
We can see that
UpstreamRequestCount
is also applied to thePageType
dimension andRequestCount
andResponseTime
toClient
, effectively generating unnecessary new metrics (3 in this example).Describe the outcome you'd like
According to the previous example, I'd like to filter the
UpstreamRequestCount
byClient
only andRequestCount
andResponseTime
byPageType
, resulting in the following metricsHere, the
PageType
group contains only the metrics that we want to apply, same thing forClient
.According to the EMF specification it is possible to add multiple
CloudWatchMetrics
objectsin order to define different groups of metrics that we want to apply to different dimensions. If we consider the previous example once again, we need to generate a JSON payload like the following
where the two metrics sharing the same dimensions are defined within the same
CloudWatchMetrics
object.Generally speaking, each
CloudWatchMetrics
object contains metric that are filtered by the same group of dimensions.Describe the solution you are proposing
To do so, I'm proposing to add a new method to the
MetricsContext
interface calledadd
. The method will accept only one parameter which is an object defined in the next section.Syntax
Properties
Name
The metric name.
Required: only if Metrics is undefined or an empty array, optional otherwise.
Type: String
Value
The metric value.
Required: only if Metrics is undefined or an empty array, optional otherwise.
Type: Number
Unit
The metric unit (e.g. Unit.Count, Unit.Milliseconds, etc.)
Required: only if Metrics is undefined or an empty array, optional otherwise.
Type: Number
Metrics
An array of objects (see MetricItem type).
Required: only if Name, Value and Unit are undefined, optional otherwise.
Type: Array
Dimensions
The dimensions to filter the defined metrics by. This objects is a map of key/value pairs that stores the name and value of the dimension. Each property value must be of type String.
Required: yes
Type: Object
Types
MetricItem
Considering the previous example, our metric logger will look like
When we have one metric, we can either do
or
Any other considerations about the solution
We could have modified the
LogSerializer
to optionally generate the multipleCloudWatchMetrics
objects by means of a flag, but currently there is no association between group of metrics sharing the same dimensions. To achieve that, we could have modified the internal data structure by adding a mapping between them, resulting in a new method anyway to allow the user to express this relationship via the public API.By creating a brand new method we keep the current data structure as is and the API back compatible with the previous version. The new method will have a separate data structure to allow the
LogSerializer
to easily understand how to serialise it.The text was updated successfully, but these errors were encountered: