Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support multiple spans (activities) within the same Trace for StackExchangeRedis #1153

Merged
merged 7 commits into from
Sep 3, 2020
Merged

Support multiple spans (activities) within the same Trace for StackExchangeRedis #1153

merged 7 commits into from
Sep 3, 2020

Conversation

kellybirr
Copy link
Contributor

@kellybirr kellybirr commented Aug 25, 2020

High Level

  • Add support for multiple spans (activities) within the same TraceId.
  • Use C# value tuples in cache keys and values to reduce heap allocations.
  • Value names in C# tuples to make code slightly easier to read.

Detailed Description
This fix enhances the StackExchangeRedis Instrumentation to support multiple child spans in the same trace so that systems using multiple child activities get proper nesting of Redis actions in their trace graphs.

The issue was the "cache" by "TraceId" that used the same profiler across and entire trace, ignoring the spans.

This update will also slightly reduce memory allocations resulting in an a tiny performance improvement.

This change will behave exactly like before, in code that only uses one span within the trace. It will also now behave as expected when used with multiple spans.

Before/After Screenshots (Jaeger)
BEFORE
AFTER

Full Repro Example
https://github.com/kellybirr/tracing-demo
The code and setup at the repro above reproduces the bug. It's even visible in the screenshot in the readme of that repo.

I've tested this fix with that code, it works as expected. That's how I got the "AFTER" screenshot included here.

- Use C# value tupples in cache keys and values to reduce heap allocations and ensure proper key comparison.
- Value names in C# tupples to make code slightly easier to read.
@kellybirr kellybirr requested a review from a team August 25, 2020 00:10
@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented Aug 25, 2020

CLA Check
The committers are authorized under a signed CLA.

@cijothomas
Copy link
Member

@kellybirr Thanks! Could you sign the CLA, add tests for this bug/fix?

@kellybirr
Copy link
Contributor Author

CLA Signed. I can come up with a test, but that will likely require exposing internals to the test as the cached values never get exposed outside the class. Any thoughts?

Fixing the code formatting issue being reported by code scanner.
@codecov
Copy link

codecov bot commented Aug 25, 2020

Codecov Report

Merging #1153 into master will increase coverage by 0.07%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1153      +/-   ##
==========================================
+ Coverage   78.85%   78.93%   +0.07%     
==========================================
  Files         219      219              
  Lines        6329     6332       +3     
==========================================
+ Hits         4991     4998       +7     
+ Misses       1338     1334       -4     
Impacted Files Coverage Δ
...ngeRedis/StackExchangeRedisCallsInstrumentation.cs 96.55% <100.00%> (+5.64%) ⬆️
...ZPages/Implementation/ZPagesExporterEventSource.cs 62.50% <0.00%> (+6.25%) ⬆️

@cijothomas
Copy link
Member

CLA Signed. I can come up with a test, but that will likely require exposing internals to the test as the cached values never get exposed outside the class. Any thoughts?

Can you validate that the activity received the Mock contains the expected values?

Also, internals of StackEnchange instrumentation should already be visible to its tests. (https://github.com/open-telemetry/opentelemetry-dotnet/blob/master/src/OpenTelemetry.Instrumentation.StackExchangeRedis/AssemblyInfo.cs#L19)

@CodeBlanch
Copy link
Member

@kellybirr A couple of questions:

  • Did you write a benchmark for this? If not, are you interested in doing one? The ValueTuple should avoid an allocation, but not sure about using it as key in a ConcurrentDictionary. ConcurrentDictionary could end up boxing it which might be a perf hit.
  • Do you happen to know how ValueTuple generates its HashCode? We need to make sure it uses the key & value in its composition. I suspect it does but if not that could also be a perf killer.

@kellybirr
Copy link
Contributor Author

kellybirr commented Aug 25, 2020

@CodeBlanch That's a really good point. I think we're going to be good with this as I've coded it.

I did consider using a String key with code like string cacheKey = $"{parent.TraceId}|{parent.SpanId}" since strings are easy and well known. I decided that the ValueTuple<ActivityTraceId,ActivitySpanId> would be a more "modern" solution. If a string would be preferred for simplicity, I don't have any argument against it.

As to your specific questions.

  1. The ConcurrentDictionary<K,T> has specific code in it to deal with value-type keys and seems pretty well optimized for them, based on the Source Code. This is consistent with all the work the .Net team has been putting into reducing heap allocations. The generic dictionaries appear to only box value-type keys when accessed via the original (non-generic) IDictionary interface.

  2. The ValueTuple<T1,T2> uses HashCode.Combine() in it's GetHashCode method, based on the Source Code. This ensures that both internal structs hash codes are taken into account and a new hash code is created.

Given these factors, the ValueTuple<T1,T2> seems to be an ideal key for a ConcurrentDictionary<K,T>. Again, if a string is preferred, I'm good with that.

As for writing a benchmark. I'm not opposed to it. I'm also not familiar with the benchmarking framework you're using. I probably would not get it done quickly.

@kellybirr
Copy link
Contributor Author

@cijothomas I just added a unit test

Copy link
Member

@CodeBlanch CodeBlanch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks for the contribution & for taking the time to respond @kellybirr. I think we're good to merge. If you want to mess with a benchmark, feel free, they are actually pretty easy to do. I might do it when I have some free time just out of curiosity.

Copy link
Member

@reyang reyang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@cijothomas cijothomas merged commit a3f1c98 into open-telemetry:master Sep 3, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants