Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move stackdriver trace exporter to new interface and pdata #486

Merged
merged 40 commits into from
Aug 6, 2020

Conversation

stevencl1013
Copy link
Contributor

Description: Changes stackdriver trace exporter to use pdata and implement the new exporter interface rather than the old, and use the OT Go stackdriver exporter rather than the OC one. The metric exporter is just converting from pdata back to consumerdata and still using the OC exporter.

Testing: Modified the tests for stackdriver, data conversion, and factory to reflect changes.

@stevencl1013 stevencl1013 requested a review from a team July 21, 2020 22:46
@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented Jul 21, 2020

CLA Check
The committers are authorized under a signed CLA.

@stevencl1013 stevencl1013 changed the title Stackdriver pdata Move stackdriver trace exporter to new interface and pdata Jul 21, 2020
@nilebox
Copy link
Member

nilebox commented Jul 27, 2020

@stevencl1013 I think I've found the root cause after debugging the test:

2020/07/27 16:11:20 Failed to export to Stackdriver: rpc error: code = Canceled desc = context canceled
    TestStackdriverExport: stackdriver_test.go:105: test timed out

Note the first error message. It comes from cloudtrace.go#L221 (SDK exporter you have imported in this PR).

So we fail to export, because the context has been cancelled. The question is what is that context and why has it been cancelled early? The answer is related to open-telemetry/opentelemetry-collector#1386 as you suspected.

Hopefully it's not too confusing?

The workaround is trivial: don't forward the context passed into collector exporter to SDK exporter, and create a new context instead.
In other words, in the pushTraces function do the following:

        // NOTE: SDK Trace exporter exports data asynchronously via bundler, 
        // so we can't use the parent (synchronous) context, which gets cancelled 
        // immediately after this function is executed.
        ctx = context.Background() // create a new context to avoid cancellation
	for _, span := range spans {
		te.texporter.ExportSpan(ctx, span)
		goodSpans++
	}

Alternatively, we may make the use of bundler configurable, i.e. support "synchronous mode" which would've made more sense here.

/cc @james-bebbington

@james-bebbington
Copy link
Member

james-bebbington commented Jul 27, 2020

The workaround is trivial: don't forward the context passed into collector exporter to SDK exporter, and create a new context instead.

Great work finding this issue.

I think it would make more sense to put the fix into operations-go instead though, as the same issue could occur for other clients of that library with its current implementation. To avoid this kind of issue entirely, we should ensure we always create a new context at the time we start asynchronous work.

i.e. do not pass context into bundler; instead you can extract the spans from context first, and pass those in or do something similar.

@codecov
Copy link

codecov bot commented Jul 27, 2020

Codecov Report

Merging #486 into master will increase coverage by 0.03%.
The diff coverage is 68.20%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #486      +/-   ##
==========================================
+ Coverage   86.26%   86.30%   +0.03%     
==========================================
  Files         195      195              
  Lines       10632    10579      -53     
==========================================
- Hits         9172     9130      -42     
+ Misses       1128     1123       -5     
+ Partials      332      326       -6     
Flag Coverage Δ
#integration 71.09% <ø> (ø)
#unit 86.13% <68.20%> (+0.03%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
exporter/stackdriverexporter/stackdriver.go 35.55% <40.00%> (-8.39%) ⬇️
exporter/stackdriverexporter/factory.go 68.42% <80.00%> (+25.56%) ⬆️
exporter/stackdriverexporter/spandata.go 82.90% <84.34%> (+12.62%) ⬆️
exporter/awsxrayexporter/translator/cause.go 94.73% <0.00%> (-5.27%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 32c9e6a...ca94625. Read the comment docs.

@stevencl1013
Copy link
Contributor Author

@nilebox Thanks for finding the cause of the issue! I committed the change and it seems to be working now, but I'll also look into modifying the operations-go repo as @james-bebbington mentioned.

@stevencl1013 stevencl1013 requested a review from nilebox July 30, 2020 15:34
Copy link
Member

@nilebox nilebox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just need a couple of final cleanups.

exporter/stackdriverexporter/go.mod Outdated Show resolved Hide resolved
exporter/stackdriverexporter/go.mod Outdated Show resolved Hide resolved
exporter/stackdriverexporter/spandata.go Show resolved Hide resolved
exporter/stackdriverexporter/stackdriver.go Outdated Show resolved Hide resolved
exporter/stackdriverexporter/stackdriver.go Outdated Show resolved Hide resolved
go.mod Outdated Show resolved Hide resolved
@nilebox
Copy link
Member

nilebox commented Aug 3, 2020

@stevencl1013 the build is failing, please ping me when it's ready for another round of reviews.

@stevencl1013 stevencl1013 requested a review from nilebox August 4, 2020 15:49
Copy link
Member

@nilebox nilebox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for patiently addressing issues and feedback :)

@nilebox
Copy link
Member

nilebox commented Aug 5, 2020

@bogdandrutu Just noticed that my approval is not "green", could you add me to approvers in the contrib repo as well?

@bogdandrutu bogdandrutu merged commit 80509d6 into open-telemetry:master Aug 6, 2020
ljmsc referenced this pull request in ljmsc/opentelemetry-collector-contrib Feb 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants