
Timestamp doesn't represent correct metric time #53

Open
Dunedan opened this issue Aug 27, 2020 · 4 comments
Labels
enhancement New feature or request

Comments

@Dunedan
Contributor

Dunedan commented Aug 27, 2020

According to the EMF specification, each log record's _aws metadata object must have a single "Timestamp" attribute representing the time the metrics are associated with. However, when using aws-embedded-metrics, the timestamp in the metadata is set to the time the function decorated with @metric_scope got called, so it differs from the time MetricLogger().put_metric() is called for individual metrics. This leads to incorrect timestamps as soon as the decorated function emits metrics more than the minimum resolution of CloudWatch Metrics after being called, which I believe is one minute when using EMF.

A single function running longer than one minute is probably pretty common, so I believe this is a real problem. What I'd expect aws-embedded-metrics to do is store the current time together with the metric value when MetricLogger().put_metric() is called and create one log record per minute's worth of metric values when serializing the data. This should work fine, as metric timestamps submitted to CloudWatch Metrics can be up to two weeks in the past. It might also be possible to submit the log records for "finished" resolution intervals asynchronously in the background, to get the metric values into CloudWatch Metrics while the decorated function is still running.
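The per-minute bucketing described above could be sketched in plain Python; `record_metric`, `serialize`, and the bucket layout are hypothetical illustrations of the idea, not the library's actual internals:

```python
from collections import defaultdict

# Hypothetical sketch: store each value under its capture minute, then
# emit one EMF log record per minute bucket when serializing.
buckets = defaultdict(lambda: defaultdict(list))

def record_metric(name, value, now_epoch_seconds):
    # Truncate the capture time to the start of its minute (EMF's
    # effective resolution), expressed in epoch milliseconds.
    minute_ms = (int(now_epoch_seconds) // 60) * 60 * 1000
    buckets[minute_ms][name].append(value)

def serialize():
    # One log record per minute bucket, each with its own Timestamp.
    return [
        {"_aws": {"Timestamp": ts}, **dict(metrics)}
        for ts, metrics in sorted(buckets.items())
    ]

record_metric("Latency", 120, 1598529655)  # 2020-08-27 12:00:55 UTC
record_metric("Latency", 95, 1598529665)   # 2020-08-27 12:01:05 UTC
```

With the two calls above landing in different minutes, serialization would produce two records whose Timestamps are exactly one minute apart, instead of a single record stamped with the scope's entry time.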

While such a change would unfortunately create more log records, I believe it would more accurately represent the time metrics are associated with.

@jaredcnance
Member

jaredcnance commented Aug 27, 2020

Rather than using a single decorator for the entire function, can you decorate individual methods that need to be instrumented instead? You can also create the logger and flush manually instead if you prefer. This would give you complete control over how the events are emitted. I believe the feature you’re asking for can be built on top of what exists today.

Edit: on second thought, this might not provide you the behavior you’re looking for if you want to maintain a common set of key-value pairs with all events emitted by the function.

@jaredcnance jaredcnance added the question Further information is requested label Aug 27, 2020
@Dunedan
Contributor Author

Dunedan commented Aug 27, 2020

With function I meant a Python function, not an AWS Lambda function. Sorry for not being clearer about that. Being able to use multiple metric scopes inside a single Python function is why I suggested that aws-embedded-metrics should also provide a context manager for metric_scope in #50.

However, that's not why I opened this issue. I believe users of aws-embedded-metrics shouldn't need to jump through hoops to get a correct timestamp for their metrics. And currently it's neither straightforward to get a correct timestamp nor obvious that this is a problem.

While thinking further about the issue, I noticed that there doesn't even need to be a full minute between the call of the decorated function and the put_metric() call to cause an incorrect timestamp. Let's assume you call the function at 12:00:55 and call put_metric() inside the function at 12:01:05: the timestamp in the log record metadata will point to 12:00:55, and the metric will be displayed in CloudWatch Metrics as if it had happened at 12:00, while it actually happened at 12:01.

That's not as wrong as for long-running Python functions, but it's still a wrong timestamp.
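The boundary case above can be reproduced in a few lines of plain Python, with the two times hard-coded to the example:

```python
from datetime import datetime

# Hypothetical illustration of the minute-boundary case described above.
scope_entered = datetime(2020, 8, 27, 12, 0, 55)   # decorated function called
metric_emitted = datetime(2020, 8, 27, 12, 1, 5)   # put_metric() called 10s later

# CloudWatch displays the metric at the minute of the record's Timestamp,
# which today is captured at scope entry, not at put_metric() time:
displayed_minute = scope_entered.replace(second=0, microsecond=0)
actual_minute = metric_emitted.replace(second=0, microsecond=0)

print(displayed_minute)  # 2020-08-27 12:00:00
print(actual_minute)     # 2020-08-27 12:01:00
```

Only ten seconds elapse between the two calls, yet the metric lands one whole minute early in CloudWatch Metrics.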

@jaredcnance jaredcnance added enhancement New feature or request and removed question Further information is requested labels Aug 27, 2020
@jaredcnance
Member

I agree and see the value in this. I think it would be fairly straightforward to round the current time to the nearest minute and then bucket the metrics. I believe this could be done with changes just to the MetricsContext and to the LogSerializer.

In my mind, the MetricsContext would change from storing metrics as a Dict[str, Metric] to a Dict[int, Dict[str, Metric]], and then of course the LogSerializer would need to generate events for each timestamp.
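A minimal sketch of that shape change, including a feature flag that preserves today's single-event behavior; the class and method names are illustrative, not the library's actual implementation:

```python
import time

class MetricsContext:
    """Sketch of the proposed shape: metrics keyed first by a
    minute-rounded epoch-ms timestamp (Dict[int, Dict[str, list]])."""

    def __init__(self, split_by_timestamp=False):
        # Feature flag: when off, everything lands in the creation-time
        # bucket and serialization still yields a single event.
        self.split_by_timestamp = split_by_timestamp
        self.creation_ts = self._round(time.time())
        self.metrics = {}  # Dict[int, Dict[str, list]]

    @staticmethod
    def _round(now):
        # Truncate to the start of the minute, in epoch milliseconds.
        return (int(now) // 60) * 60 * 1000

    def put_metric(self, key, value, now=None):
        now_ts = self._round(time.time() if now is None else now)
        ts = now_ts if self.split_by_timestamp else self.creation_ts
        self.metrics.setdefault(ts, {}).setdefault(key, []).append(value)

def serialize(ctx):
    # LogSerializer sketch: one EMF event per timestamp bucket.
    return [{"_aws": {"Timestamp": ts}, **vals}
            for ts, vals in sorted(ctx.metrics.items())]
```

With the flag on, two put_metric() calls a minute apart produce two events; with it off, they collapse into one event exactly as today.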

Additionally, we might consider putting this behind a feature flag as some users may not want to risk splitting their metrics over multiple events and may be fine with the current behavior which ensures all metrics in a transaction are present in a single event.

@heitorlessa

@jaredcnance a backward compatible solution could be allowing a timestamp to be associated with a metric.

That would allow customers and libraries to opt in to a more accurate resolution.
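A sketch of what such an opt-in could look like; the optional `timestamp` parameter is hypothetical, not part of the library's current API:

```python
import time

class MetricsLogger:
    """Sketch of a backward-compatible API: put_metric grows an
    optional per-metric timestamp (hypothetical signature)."""

    def __init__(self):
        self.metrics = []

    def put_metric(self, key, value, unit="None", timestamp=None):
        # When omitted, the metric falls back to the record-level
        # Timestamp, exactly as today.
        self.metrics.append((key, value, unit, timestamp))

logger = MetricsLogger()
logger.put_metric("Latency", 120, "Milliseconds")  # existing call sites unchanged
logger.put_metric("Latency", 95, "Milliseconds", timestamp=time.time())  # opt-in
```

Existing callers keep the current behavior, while callers that pass a timestamp opt in to more accurate placement without forcing everyone onto multiple events.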

Logging separate objects will quickly cause CloudWatch Logs ingestion and storage costs to spike.
