Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix OTLP receiver Shutdown() bug #2555

Conversation

tigrannajaryan
Copy link
Member

@tigrannajaryan tigrannajaryan commented Feb 25, 2021

OTLP receiver had a bug due to which it could generate data after Shutdown()
was returned.

It was visible in TestShutdown, with the following symptoms (it was a racy bug):

=== RUN   TestShutdown
    otlp_test.go:871:
        	Error Trace:	otlp_test.go:871
        	Error:      	Not equal:
        	            	expected: 4
        	            	actual  : 5
        	Test:       	TestShutdown
    otlp_test.go:871:
        	Error Trace:	otlp_test.go:871
        	Error:      	Not equal:
        	            	expected: 6
        	            	actual  : 7
        	Test:       	TestShutdown
--- FAIL: TestShutdown (0.49s)

I added a Helper which allows to wait until all in-flight operations are complete
before the Shutdown() function returns.

The Helper can be used by other components too in the future.

@tigrannajaryan tigrannajaryan changed the title Verify OTLP receiver Shutdown() DRAFT. DONT REVIEW: Verify OTLP receiver Shutdown() Feb 25, 2021
@tigrannajaryan tigrannajaryan force-pushed the feature/tigran/shutdown-otlp-receiver branch 2 times, most recently from 3163ee0 to 30ca26c Compare February 26, 2021 19:44
@tigrannajaryan tigrannajaryan changed the title DRAFT. DONT REVIEW: Verify OTLP receiver Shutdown() Fix OTLP receiver Shutdown() bug Feb 26, 2021
@tigrannajaryan tigrannajaryan force-pushed the feature/tigran/shutdown-otlp-receiver branch from 30ca26c to 96e9d25 Compare February 26, 2021 19:46
@codecov
Copy link

codecov bot commented Feb 26, 2021

Codecov Report

Merging #2555 (1442692) into main (e6319ac) will decrease coverage by 0.04%.
The diff coverage is 77.77%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #2555      +/-   ##
==========================================
- Coverage   91.69%   91.64%   -0.05%     
==========================================
  Files         267      268       +1     
  Lines       15107    15161      +54     
==========================================
+ Hits        13852    13894      +42     
- Misses        871      877       +6     
- Partials      384      390       +6     
Impacted Files Coverage Δ
receiver/otlpreceiver/otlp.go 87.50% <50.00%> (-4.50%) ⬇️
receiver/otlpreceiver/logs/otlp.go 91.66% <60.00%> (-8.34%) ⬇️
receiver/otlpreceiver/metrics/otlp.go 91.66% <60.00%> (-8.34%) ⬇️
receiver/otlpreceiver/trace/otlp.go 93.75% <60.00%> (-6.25%) ⬇️
receiver/otlpreceiver/shutdownhelper/helper.go 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e6319ac...1442692. Read the comment docs.

OTLP receiver had a bug due to which it could generate data after Shutdown()
was returned.

It was visible in TestShutdown, with the following symptoms (it was a racy bug):

=== RUN   TestShutdown
    otlp_test.go:871:
        	Error Trace:	otlp_test.go:871
        	Error:      	Not equal:
        	            	expected: 4
        	            	actual  : 5
        	Test:       	TestShutdown
    otlp_test.go:871:
        	Error Trace:	otlp_test.go:871
        	Error:      	Not equal:
        	            	expected: 6
        	            	actual  : 7
        	Test:       	TestShutdown
--- FAIL: TestShutdown (0.49s)

I added a Helper which allows to wait until all in-flight operations are complete
before the Shutdown() function returns.

The Helper can be used by other components too in the future.
@tigrannajaryan tigrannajaryan force-pushed the feature/tigran/shutdown-otlp-receiver branch from 96e9d25 to 1442692 Compare February 26, 2021 19:55
@tigrannajaryan tigrannajaryan marked this pull request as ready for review February 26, 2021 19:56
@tigrannajaryan tigrannajaryan requested a review from a team February 26, 2021 19:56
@tigrannajaryan
Copy link
Member Author

I need a few pairs of fresh eye to verify the Helper implementation.

Copy link
Member

@bogdandrutu bogdandrutu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a much simpler fix #2564

return errTimeout
case <-doneWaitingForProcessing:
}
return nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be moved in the select

@tigrannajaryan
Copy link
Member Author

I think this is a much simpler fix #2564

This is nice. I did not know there was already the graceful shutdown methods for servers. I will close this PR.

@tigrannajaryan tigrannajaryan deleted the feature/tigran/shutdown-otlp-receiver branch March 3, 2022 16:56
hughesjj pushed a commit to hughesjj/opentelemetry-collector that referenced this pull request Apr 27, 2023
Bumps [aquasecurity/trivy-action](https://github.com/aquasecurity/trivy-action) from 0.9.0 to 0.9.1.
- [Release notes](https://github.com/aquasecurity/trivy-action/releases)
- [Commits](aquasecurity/trivy-action@0.9.0...0.9.1)

---
updated-dependencies:
- dependency-name: aquasecurity/trivy-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants