Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[processor/logdeup] Add scope aggregator #34607

Merged

Conversation

MikeGoldsmith
Copy link
Member

@MikeGoldsmith MikeGoldsmith commented Aug 12, 2024

Description:

Adds a scope aggregator to separate log records by different scopes.

Also includes some refactor work:

  • Extend pdatautil.hash with configurable options of values to hash, includes ability of returning uint64
  • Add consistent hash functions for resource, scope and log records that use new hash functions
  • Update aggregators to keep a copy of the original resource, scope and log record instead of just it's attributes
    • Simplifies ConsumeLogs & Export functions
  • Update newLogCounter to set lastObservedTime
  • Update existing and add more tests

Link to tracking Issue:

Testing:

Updated existing and added more unit tests to verify behaviour.

processor/logdedupprocessor/processor.go Outdated Show resolved Hide resolved

hasher.Write(attrHash[:])
// getScopeKey creates a unique hash for the scope to use as a map key
func getScopeKey(scope pcommon.InstrumentationScope) uint64 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add this to pdatautil?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extended pdatautil to do more of the work here instead in a716f65

attrHash := pdatautil.MapHash(logRecord.Attributes())
// getResourceKey creates a unique hash for the resource to use as a map key
func getResourceKey(resource pcommon.Resource) uint64 {
hasher := xxhash.New()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the rationale for wrapping the pdatautil output? In general I think we should standardize where possible. If there's something deficient about the pdatautil implementation, can we fix or enhance it? Maybe just a pdatautil.ResourceHash which aliases MapHash so that there's an unambiguous standard?

}

// getLogKey creates a unique hash for the log record to use as a map key
func getLogKey(logRecord plog.LogRecord) uint64 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see why this one isn't standard, since we're only accounting for some fields. Maybe an options pattern accounts for it in the future, but I'm find with leaving it alone for now.

@MikeGoldsmith
Copy link
Member Author

@djaglowski Moved hash generation into pdatautil, let me know what you think 😄

@djaglowski djaglowski merged commit fd7e8b0 into open-telemetry:main Aug 12, 2024
154 checks passed
@github-actions github-actions bot added this to the next release milestone Aug 12, 2024
@MikeGoldsmith MikeGoldsmith deleted the mikegoldsmith/logdedup-scope branch August 12, 2024 15:36
f7o pushed a commit to f7o/opentelemetry-collector-contrib that referenced this pull request Sep 12, 2024
**Description:**

Adds a scope aggregator to separate log records by different scopes.

Also includes some refactor work:
- Extend pdatautil.hash with configurable options of values to hash,
includes ability of returning uint64
- Add consistent hash functions for resource, scope and log records that
use new hash functions
- Update aggregators to keep a copy of the original resource, scope and
log record instead of just it's attributes
  - Simplifies ConsumeLogs & Export functions
- Update newLogCounter to set lastObservedTime
- Update existing and add more tests

**Link to tracking Issue:**

- Closes open-telemetry#34606 34606

**Testing:**

Updated existing and added more unit tests to verify behaviour.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Maintain and aggregate per instrumentation scope
3 participants