Add exemplars feature #4094

Merged
merged 51 commits into from
Sep 13, 2024

Conversation

fcollonval
Contributor

@fcollonval fcollonval commented Jul 31, 2024

Description

Fixes #2407
Fixes open-telemetry/opentelemetry-python-contrib#2158

API changes

  • opentelemetry.metrics.instrument: Add optional Context argument to all record/add/set methods (see footnote 1)
  • opentelemetry.metrics.Observation: Add optional Context attribute
  • Add Exemplar, ExemplarFilters and ExemplarReservoirs following the spec and the implementation in JS and Java SDK
  • opentelemetry.sdk.metrics.MeterProvider: Add optional exemplar_filter attribute to the constructor
    The filter will be stored in the SdkConfiguration (via a new attribute exemplar_filter)
  • opentelemetry.sdk.metrics._internal.Measurement: Add mandatory time_unix_nano and Context attributes
  • opentelemetry.sdk.metrics._internal.*DataPoint: Add optional exemplars attribute
  • opentelemetry.sdk.metrics._internal._Aggregation:
    • Add an attribute _reservoir set from a reservoir_factory
    • Add method _collect_exemplar that is called by the child classes on collect
    • Make aggregate no longer abstract and add an optional should_sample_exemplar argument. If it is True (default), the Measurement will be offered to the exemplar reservoir so that it can store an exemplar. That method needs to be called via super in all child classes
  • opentelemetry.sdk.metrics.view.View: Add optional attribute exemplar_reservoir_factory defining the exemplar reservoir factory per _Aggregation type.
  • Add should_sample_exemplar argument on method consume_measurement from MeasurementConsumer, MetricReaderStorage and _ViewInstrumentMatch
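
The interplay between aggregate, should_sample_exemplar and the reservoir listed above can be pictured with a minimal, self-contained sketch. The class names (ToyReservoir, BaseAggregation, SumAggregation) are hypothetical stand-ins and not the SDK's actual classes; only the control flow follows the description: child classes call the base aggregate via super, and the base class offers the measurement to the reservoir when sampling is enabled.

```python
from typing import Callable, List


class ToyReservoir:
    """Hypothetical reservoir that simply remembers every offered value."""

    def __init__(self) -> None:
        self.offered: List[float] = []

    def offer(self, value: float) -> None:
        self.offered.append(value)


class BaseAggregation:
    """Hypothetical stand-in for the SDK's _Aggregation base class."""

    def __init__(self, reservoir_factory: Callable[[], ToyReservoir]) -> None:
        # The reservoir is built once per aggregation from the factory.
        self._reservoir = reservoir_factory()

    def aggregate(self, value: float, should_sample_exemplar: bool = True) -> None:
        # Not abstract: the base class owns the exemplar bookkeeping.
        if should_sample_exemplar:
            self._reservoir.offer(value)


class SumAggregation(BaseAggregation):
    def __init__(self, reservoir_factory: Callable[[], ToyReservoir]) -> None:
        super().__init__(reservoir_factory)
        self.total = 0.0

    def aggregate(self, value: float, should_sample_exemplar: bool = True) -> None:
        self.total += value
        # Child classes delegate to the base class so exemplars are recorded.
        super().aggregate(value, should_sample_exemplar)


agg = SumAggregation(ToyReservoir)
agg.aggregate(2.0)
agg.aggregate(3.0, should_sample_exemplar=False)
print(agg.total, agg._reservoir.offered)  # 5.0 [2.0]
```

Both values contribute to the sum, but only the first one reaches the reservoir, because the second call opts out of exemplar sampling.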

Notes

  • The filter must be configurable in MeterProvider, hence I went for propagating the should_sample value alongside the measurements. This is the approach chosen in the .NET SDK. Another possibility would be to mimic the Java SDK approach, which wraps the wanted exemplar reservoir in another exemplar reservoir that applies the filter first (although I don't know right now how to implement that variant).
  • The reservoir factory provided to View is based on the _Aggregation type. I find that suboptimal, since those classes are meant to be protected whereas the factory is meant to be public. But I don't see an alternative.
  • The reservoir factory per _Aggregation type returns a (**kwargs): ExemplarReservoir factory because some reservoirs need to be configured based on the aggregation parameters; e.g. the exponential histogram bucket size will impact the number of buckets in the exemplar reservoir.
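
As a rough illustration of the last note, a factory keyed on the aggregation type can return a callable that accepts keyword arguments, so each reservoir is sized from the aggregation's own parameters. Everything in this sketch is hypothetical: the class names, the max_size and boundaries keywords, and the cap of 20 are illustrative choices, not the SDK's actual API.

```python
from typing import Callable, Sequence


class SumAgg:
    """Hypothetical stand-in for a sum aggregation class."""


class ExplicitBucketHistogramAgg:
    """Hypothetical stand-in for an explicit bucket histogram aggregation."""


class ExponentialHistogramAgg:
    """Hypothetical stand-in for an exponential histogram aggregation."""


class FixedSizeReservoir:
    def __init__(self, size: int = 1, **kwargs) -> None:
        self.size = size


class BucketAlignedReservoir:
    def __init__(self, boundaries: Sequence[float], **kwargs) -> None:
        self.boundaries = list(boundaries)


def reservoir_factory_for(aggregation_type: type) -> Callable[..., object]:
    """Return a (**kwargs) -> reservoir callable for the given aggregation type."""
    if issubclass(aggregation_type, ExplicitBucketHistogramAgg):
        # Needs the bucket boundaries of the aggregation.
        return BucketAlignedReservoir
    if issubclass(aggregation_type, ExponentialHistogramAgg):
        # Reservoir size follows the aggregation's bucket count (illustrative cap).
        return lambda **kwargs: FixedSizeReservoir(
            size=min(20, kwargs.get("max_size", 20))
        )
    # Fallback for every other aggregation type.
    return lambda **kwargs: FixedSizeReservoir(size=1)


exp_reservoir = reservoir_factory_for(ExponentialHistogramAgg)(max_size=8)
hist_reservoir = reservoir_factory_for(ExplicitBucketHistogramAgg)(boundaries=[0, 5, 10])
print(exp_reservoir.size, hist_reservoir.boundaries)  # 8 [0, 5, 10]
```

Returning a callable rather than a reservoir instance keeps the choice of reservoir separate from its configuration, which is the reason given above for the (**kwargs) signature.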

Type of change

Please delete options that are not relevant.

  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration

Unit tests have been added for the new base classes as well as for instrumentation interactions, along with an integration test with the console exporter.

Does This PR Require a Contrib Repo Change?

  • Yes. - Link to PR:
  • No.

It does not require a change. But the metrics exporters in that repo will require an update to include the added exemplars; best as a follow-up of this PR.

Checklist:

  • Followed the style guidelines of this project
  • Changelogs have been updated
  • Unit tests have been added
  • Documentation has been updated

Footnotes

  1. This is similar to the Java and JavaScript SDKs.


linux-foundation-easycla bot commented Jul 31, 2024

CLA Signed

The committers listed above are authorized under a signed CLA.

@fcollonval
Contributor Author

Dear maintainers,

I would like to contribute to this project by adding the support for exemplars.

Currently, I have added the base classes (based on the JavaScript SDK implementation), and I am now facing challenges integrating them into the code base. From analyzing the C# and Java SDKs, it seems that the _Aggregation class is the best place to store the ExemplarReservoir.

But this raises 3 questions:

  • How best to provide the ExemplarFilter and reservoir factory to the constructor?
  • How to get the attributes in the aggregate method? This is a required variable that is missing compared to Java.
  • How to get the context in the collect method? This is a required variable that is missing compared to Java.

Any additional pointers would be appreciated.

@lzchen
Contributor

lzchen commented Aug 1, 2024

@fcollonval

Thanks so much for picking this up. I am not too familiar with exemplars but happy to open up a discussion about the architectural design. A couple of questions/changes I have in mind:

  1. The time the API call was made to record a Measurement.

This would probably entail adding a timestamp to Measurement to represent when the api call was made/when the Measurement was created.

  2. The associated trace id and span id of the active Span within Context of the Measurement at API call time.

Most likely another field in Measurement to store Context information. This would most likely get populated at time of recording/Measurement creation. Hopefully this addresses your question 3.
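
A minimal sketch of what the two points above could look like: the timestamp and the active span reference are captured when the measurement is created. SpanRef, Measurement and record are hypothetical names used only for illustration; the real SDK stores a Context object rather than a bare trace id and span id pair.

```python
import time
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class SpanRef:
    """Hypothetical stand-in for the active span found in a Context."""

    trace_id: int
    span_id: int


@dataclass(frozen=True)
class Measurement:
    value: float
    time_unix_nano: int  # when the API call was made
    attributes: Mapping[str, str]
    span: Optional[SpanRef]  # None when no span was active


def record(
    value: float,
    attributes: Mapping[str, str],
    active_span: Optional[SpanRef] = None,
) -> Measurement:
    # Both the timestamp and the span reference are captured at record time,
    # so later pipeline stages do not need access to the caller's context.
    return Measurement(
        value=value,
        time_unix_nano=time.time_ns(),
        attributes=dict(attributes),
        span=active_span,
    )


m = record(1.5, {"route": "/"}, SpanRef(trace_id=0xABC, span_id=0x1))
print(m.span.trace_id == 0xABC, m.time_unix_nano > 0)  # True True
```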

  3. Where in the metrics pipeline do we envision the ExemplarFilter and ExemplarReservoir to "hook" onto (i.e. when should offer and collect be called)?

offer can probably be called when a Measurement is being "processed", so I'm envisioning a separate method in Aggregation called process_exemplar(Measurement, attributes), where attributes are the post-view-filtered attributes that actually show up in the time series and measurement.attributes are the pre-view-filtered attributes that were actually used to record. This probably addresses your question 2.

collect is most likely called when the Aggregation calls collect as well.

The "collect" method MUST return accumulated Exemplars.

I am not exactly sure what "accumulated Exemplars" means but from my basic understanding it seems like we just need to return the set of Exemplars in memory that we have been accumulating.
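
To make the offer/collect lifecycle concrete, here is a small sketch of a fixed-size reservoir that uses classic reservoir sampling. SimpleReservoir is a hypothetical class, and clearing the slots on collect is an assumption of this sketch rather than something stated in the discussion.

```python
import random
from typing import List, Optional, Tuple

Exemplar = Tuple[float, int]  # (value, time_unix_nano), simplified


class SimpleReservoir:
    """Hypothetical fixed-size reservoir using uniform reservoir sampling."""

    def __init__(self, size: int, seed: Optional[int] = None) -> None:
        self._size = size
        self._slots: List[Optional[Exemplar]] = [None] * size
        self._seen = 0
        self._rng = random.Random(seed)

    def offer(self, value: float, time_unix_nano: int) -> None:
        # Fill empty slots first, then replace a slot with decreasing probability.
        if self._seen < self._size:
            index = self._seen
        else:
            index = self._rng.randint(0, self._seen)  # upper bound is inclusive
        if index < self._size:
            self._slots[index] = (value, time_unix_nano)
        self._seen += 1

    def collect(self) -> List[Exemplar]:
        # Return what was accumulated and start over (assumed behaviour).
        exemplars = [slot for slot in self._slots if slot is not None]
        self._slots = [None] * self._size
        self._seen = 0
        return exemplars


reservoir = SimpleReservoir(size=2, seed=0)
reservoir.offer(1.0, 10)
reservoir.offer(2.0, 20)
first = reservoir.collect()
print(first)  # [(1.0, 10), (2.0, 20)]
```

In this reading, "accumulated Exemplars" are simply whatever the slots hold at collection time.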

  4. For question 1, I am not too sure what it is you are asking but this is my preliminary understanding of how the components should be constructed:

A new ExemplarReservoir MUST be created for every known timeseries data point, as determined by aggregation and view configuration.

There will most likely be a single ExemplarReservoir instance per Aggregation since each represents a known timeseries which is created when the Aggregation is created.

The ExemplarFilter SHOULD be a configuration parameter of a MeterProvider for an SDK.

Most likely passed down as a reference to the Aggregation.
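
One way to picture a provider-level filter is the sketch below, in which a toy provider evaluates the filter once per measurement and hands the decision down the pipeline. The filter names mirror the three built-in filters named in the specification (AlwaysOn, AlwaysOff, TraceBased), but the classes, the should_sample signature and the default chosen here are assumptions of this sketch, not the SDK's API.

```python
from typing import Optional, Tuple


class AlwaysOnFilter:
    def should_sample(self, value: float, span_sampled: Optional[bool]) -> bool:
        return True


class AlwaysOffFilter:
    def should_sample(self, value: float, span_sampled: Optional[bool]) -> bool:
        return False


class TraceBasedFilter:
    def should_sample(self, value: float, span_sampled: Optional[bool]) -> bool:
        # Only measurements taken inside a sampled span are eligible.
        return bool(span_sampled)


class ToyMeterProvider:
    """Hypothetical provider holding one filter for all of its instruments."""

    def __init__(self, exemplar_filter=None) -> None:
        # Default picked for the sketch only.
        self.exemplar_filter = exemplar_filter or TraceBasedFilter()

    def consume(
        self, value: float, span_sampled: Optional[bool] = None
    ) -> Tuple[float, bool]:
        # The decision is computed once and travels with the measurement.
        should_sample = self.exemplar_filter.should_sample(value, span_sampled)
        return value, should_sample


print(ToyMeterProvider().consume(1.0, span_sampled=True))  # (1.0, True)
print(ToyMeterProvider(AlwaysOffFilter()).consume(1.0, span_sampled=True))  # (1.0, False)
```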

How best to provide the ExemplarFilter and reservoir factory to the constructor?

Is there a need for a factory? What are you envisioning here?

Feel free to add any insights or correct any misunderstandings I may have :)

@fcollonval
Contributor Author

Thanks a lot @lzchen for all the information.

Thanks for the pointer about Measurement; it clarifies how I can provide the missing information to the ExemplarReservoir.

The "collect" method MUST return accumulated Exemplars.

I am not exactly sure what "accumulated Exemplars" means but from my basic understanding it seems like we just need to return the set of Exemplars in memory that we have been accumulating.

This is also my understanding. The reservoir will provide what it collected.

A new ExemplarReservoir MUST be created for every known timeseries data point, as determined by aggregation and view configuration.

There will most likely be a single ExemplarReservoir instance per Aggregation since each represents a known timeseries which is created when the Aggregation is created.

👍

The ExemplarFilter SHOULD be a configuration parameter of a MeterProvider for an SDK.

Most likely passed down as a reference to the Aggregation.

👍 I missed that point in the spec

reservoir factory to the constructor?

Is there a need for a factory? What are you envisioning here?

From https://github.com/open-telemetry/opentelemetry-specification/blob/5381b55dd8e6adcbf99e153533e5ad8ea3dd6b38/specification/metrics/sdk.md?plain=1#L387

The SDK MUST accept the following stream configuration parameters:

  • exemplar_reservoir: A functional type that generates an exemplar reservoir a MeterProvider will
    use when storing exemplars. This functional type needs to be a factory or
    callback similar to aggregation selection functionality which allows
    different reservoirs to be chosen by the aggregation.

This is implemented in the Java SDK - and I guess it is required to support:

Custom ExemplarReservoir
The SDK MUST provide a mechanism for SDK users to provide their own ExemplarReservoir implementation.
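
As an illustration of why a pluggable factory is useful, the sketch below defines a user-supplied reservoir against a small abstract interface. ReservoirInterface and MaxValueReservoir are hypothetical names, and the two-method interface is a simplification chosen for this example, not the SDK's actual ExemplarReservoir signature.

```python
from abc import ABC, abstractmethod
from typing import List, Optional, Tuple

Exemplar = Tuple[float, int]  # (value, time_unix_nano), simplified


class ReservoirInterface(ABC):
    """Hypothetical minimal contract a custom reservoir would fulfil."""

    @abstractmethod
    def offer(self, value: float, time_unix_nano: int) -> None:
        """Consider one measurement for storage."""

    @abstractmethod
    def collect(self) -> List[Exemplar]:
        """Return the stored exemplars."""


class MaxValueReservoir(ReservoirInterface):
    """Custom policy: keep only the largest value seen since the last collect."""

    def __init__(self, **kwargs) -> None:
        # Accepts and ignores aggregation parameters passed by a factory.
        self._best: Optional[Exemplar] = None

    def offer(self, value: float, time_unix_nano: int) -> None:
        if self._best is None or value > self._best[0]:
            self._best = (value, time_unix_nano)

    def collect(self) -> List[Exemplar]:
        result = [] if self._best is None else [self._best]
        self._best = None
        return result


custom = MaxValueReservoir()
for value, ts in [(1.0, 1), (5.0, 2), (3.0, 3)]:
    custom.offer(value, ts)
collected = custom.collect()
print(collected)  # [(5.0, 2)]
```

A factory lets a view hand out one such instance per timeseries without the user constructing them up front.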

Thanks a lot for the pointers, I'll work towards a first working version.

@catherine-m-zhang

Hi there! I have also started looking into adding exemplars and would love to chat with you about it (I've reached out on LinkedIn).

@fcollonval
Contributor Author

fcollonval commented Aug 14, 2024

@catherine-m-zhang you should have received an invitation to my fork. Could you accept it so I can grant you enough rights to push to the branch directly?

I updated the PR description with the changes.

Here is a status of the work as far as I can tell:

  • CODE Implement base classes
  • CODE Connect the dots with the existing code
  • TEST Fix existing tests and lint the code (some tests may not be fixed yet as I only checked the ones in opentelemetry-sdk/tests/metrics)
  • TEST Add unit tests for the new classes. I started with the filters as an example in opentelemetry-sdk/tests/metrics/test_exemplarfilter.py. You can take inspiration from the JS SDK.
  • TEST Add more unit tests for the new API in particular for all _Aggregation classes, for the reservoir factory that is customizable in View.
  • CODE The exemplar attributes should be filtered to the ones not included in the metric data point (see spec).
  • CODE Update the exporters to export the exemplars - I did not look at all if this will work out of the box or not.
  • DOC Add documentation/examples for the feature
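
For the attribute filtering item in the list above, the idea can be sketched as a dictionary difference: an exemplar keeps only the measurement attributes that the view dropped from the data point. The function name and the plain dict inputs are assumptions made for illustration.

```python
from typing import Dict


def exemplar_filtered_attributes(
    measurement_attributes: Dict[str, str],
    point_attributes: Dict[str, str],
) -> Dict[str, str]:
    """Keep the attributes recorded on the measurement but absent from the data point."""
    return {
        key: value
        for key, value in measurement_attributes.items()
        if key not in point_attributes
    }


recorded = {"http.route": "/users", "user.id": "42"}
on_point = {"http.route": "/users"}  # what the view kept on the timeseries
filtered = exemplar_filtered_attributes(recorded, on_point)
print(filtered)  # {'user.id': '42'}
```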

I added an integration test that may be helpful in https://github.com/open-telemetry/opentelemetry-python/pull/4094/files#diff-46edae6a309b04be4e5143e01feb450ffc9f576c1b146bd288db31ed175c8982

@fcollonval
Contributor Author

@lzchen here is a first workish version - feel free to have a look especially at the API changes to see if they fit the current API logic.

I'm off for the next 15 days but Catherine offers to help on the subject.

@lzchen
Contributor

lzchen commented Aug 16, 2024

@catherine-m-zhang

Is this pr ready for review? If that's the case, feel free to mark as a PR instead of draft.

@czhang771

@lzchen I don't think I have the correct access to mark this as ready for review but this is ready to be reviewed!

@fcollonval fcollonval marked this pull request as ready for review August 25, 2024 06:13
@fcollonval fcollonval requested a review from a team August 25, 2024 06:13
@fcollonval
Contributor Author

Dear maintainers, this PR is finalized (CI should be green 🤞) and ready for final review.

Gentle ping to @lzchen @pmcollins @emdneto

@fcollonval fcollonval requested a review from lzchen September 6, 2024 06:54
@splusq

splusq commented Sep 12, 2024

@fcollonval what else is remaining to close on this PR?

Contributor

@lzchen lzchen left a comment

@fcollonval

Great job bringing this to fruition! If there are no more changes, I can merge this if you'd like.

@lzchen lzchen merged commit d5fb2c4 into open-telemetry:main Sep 13, 2024
376 checks passed
@fcollonval fcollonval deleted the ft/exemplars branch September 14, 2024 07:29
Successfully merging this pull request may close these issues.

Add exemplar support to all WSGI based instrumentations Metrics: Add support for exemplars
8 participants