sdk: shut down span processors automatically #280

Merged: 10 commits merged into open-telemetry:master from mauricio/shutdown_exporter on Nov 27, 2019

Conversation

@mauriciovasquezbernal (Member)

The BatchExportSpanProcessor is an asynchronous span processor that uses a
worker thread to call the different exporters. Before this commit, applications
had to shut down the span processor explicitly to guarantee that all spans
were submitted to the exporters, which was not very intuitive for users.

This commit removes that limitation by implementing the tracer's __del__
method and an atexit hook. According to __del__'s documentation [1], it is
possible that it is sometimes not called; for that reason, the atexit hook is
also used to guarantee that the processor is shut down in all cases.

[1] https://docs.python.org/3/reference/datamodel.html#object.__del__
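
For readers skimming the thread, a minimal sketch of the approach described above; this is illustrative only (the stub class and the method bodies are invented here), not the PR's actual diff:

```python
import atexit


class _StubSpanProcessor:
    """Stand-in for the SDK's span processor; only shutdown() matters here."""

    def shutdown(self):
        print("processor shut down, pending spans flushed to exporters")


class Tracer:
    def __init__(self):
        self._active_span_processor = _StubSpanProcessor()
        self._shutdown_done = False
        # The atexit hook guarantees the processor is shut down at interpreter
        # exit even when __del__ never runs.
        atexit.register(self.shutdown)

    def __del__(self):
        # Best effort only: CPython does not guarantee __del__ is called.
        self.shutdown()

    def shutdown(self):
        # Guard so the __del__ and atexit paths do not flush twice.
        if not self._shutdown_done:
            self._shutdown_done = True
            self._active_span_processor.shutdown()
```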
@@ -333,6 +334,13 @@ def __init__(
self._current_span_slot = Context.register_slot(slot_name)
self._active_span_processor = MultiSpanProcessor()
self.sampler = sampler
self._atexit_hanlder = atexit.register(

@carlosalberto (Contributor):
My personal preference would be to not do this by default, but to have it as something the user can plug in themselves, or at least make this optional.

@c24t (Member), Nov 11, 2019:
@carlosalberto why not do this by default? Because it would be surprising that calling __del__ results in (blocking!) network calls?

I agree about making this configurable, but I think this is much better than the current default: call shutdown or lose unexported spans.

@codecov-io commented Nov 8, 2019

Codecov Report

Merging #280 into master will increase coverage by 0.05%.
The diff coverage is 88.88%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #280      +/-   ##
==========================================
+ Coverage   85.56%   85.61%   +0.05%     
==========================================
  Files          33       33              
  Lines        1600     1592       -8     
  Branches      178      179       +1     
==========================================
- Hits         1369     1363       -6     
+ Misses        185      182       -3     
- Partials       46       47       +1
Impacted Files Coverage Δ
...emetry-sdk/src/opentelemetry/sdk/trace/__init__.py 89.94% <88.88%> (+1%) ⬆️
...ry-sdk/src/opentelemetry/sdk/resources/__init__.py 68.18% <0%> (-2.66%) ⬇️
...opentelemetry/sdk/context/propagation/b3_format.py 84% <0%> (-0.62%) ⬇️
opentelemetry-sdk/src/opentelemetry/sdk/util.py 85.54% <0%> (-0.35%) ⬇️
...ts/src/opentelemetry/ext/http_requests/__init__.py 88.88% <0%> (-0.31%) ⬇️
...ry-ext-wsgi/src/opentelemetry/ext/wsgi/__init__.py 90.27% <0%> (-0.27%) ⬇️
...xt-jaeger/src/opentelemetry/ext/jaeger/__init__.py 85% <0%> (-0.19%) ⬇️
...src/opentelemetry/ext/opentracing_shim/__init__.py 95.68% <0%> (-0.18%) ⬇️
...app/src/opentelemetry_example_app/flask_example.py 100% <0%> (ø) ⬆️
...sdk/src/opentelemetry/sdk/trace/export/__init__.py 86.79% <0%> (+0.94%) ⬆️
... and 1 more

Continue to review full report at Codecov.

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update eee7d84...53e032c. Read the comment docs.

@carlosalberto (Contributor) commented Nov 8, 2019

Btw, in Java we have a TracerSdk.shutdown() method that does clean up tasks, including shutting down the processor/exporters automatically. Maybe we should add it here as well?

@mauriciovasquezbernal (Member, Author):
Do you think it is a valid requirement to have a shutdown method that has to be called to guarantee that spans are submitted to the exporters?

@c24t (Member) left a comment:
I think this is a much-needed improvement to the API, and the right default behavior.

Heads up that __del__ can introduce some subtle bugs though; after this change, if we ever add a tracer attribute to a SpanProcessor we'll get a memory leak.
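
To make the concern concrete, here is a hypothetical (not actual SDK) shape of the cycle being described: a processor holding a back-reference to its tracer while the tracer defines __del__. Since PEP 442 such cycles are still collected, but they only go away when the cyclic garbage collector runs:

```python
import gc


class SpanProcessor:
    def __init__(self, tracer):
        self.tracer = tracer  # hypothetical back-reference: processor -> tracer


class Tracer:
    def __init__(self):
        self.processor = SpanProcessor(self)  # tracer -> processor: a cycle

    def __del__(self):
        print("Tracer.__del__ ran")


t = Tracer()
del t         # the cycle keeps both objects alive; __del__ does not run yet
gc.collect()  # only the cyclic collector reclaims them and runs __del__
```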

@@ -333,6 +334,13 @@ def __init__(
self._current_span_slot = Context.register_slot(slot_name)
self._active_span_processor = MultiSpanProcessor()
self.sampler = sampler
self._atexit_hanlder = atexit.register(
self._active_span_processor.shutdown
)

def __del__(self):

Member:
Why put this on Tracer instead of SpanProcessor?

@mauriciovasquezbernal (Member, Author):
I think that having it in Tracer makes it more general, so other possible SpanProcessor implementations would not have to take care of this.

@mauriciovasquezbernal (Member, Author):

This problem looks a little bit nasty to me and I am not sure which way to go.
The fact that __del__ is not guaranteed to be called, and that in some cases it can cause memory leaks, does not help much; I don't know if we should rely on it.

My proposal would be something like:

  1. Implement Tracer.shutdown() in the SDK and document it. (Maybe this should also be discussed at the specification level?)
  2. Have a parameter on Tracer.__init__(..., shutdown_on_exit=True); if the parameter is True, use atexit to register a call to Tracer.shutdown() (see the sketch after this comment).
  3. Don't use __del__, for two reasons: (a) I don't think people would create multiple tracer instances in an application and destroy them before the application ends; (b) it is not guaranteed that it will be called in all cases.

If you agree @carlosalberto @c24t I can update this PR with that approach.
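
A minimal sketch of what points (1) and (2) might look like; the shutdown_on_exit name comes from the proposal above, and everything else (including the simplified MultiSpanProcessor stub) is illustrative rather than the PR's final diff:

```python
import atexit


class MultiSpanProcessor:
    """Greatly simplified stand-in for the SDK class of the same name."""

    def shutdown(self):
        print("all registered span processors shut down")


class Tracer:
    def __init__(self, shutdown_on_exit: bool = True):
        self._active_span_processor = MultiSpanProcessor()
        self._atexit_handler = None
        if shutdown_on_exit:
            # Point (2): register the hook only when the user asked for it.
            self._atexit_handler = atexit.register(self.shutdown)

    def shutdown(self):
        """Point (1): flush and shut down all span processors."""
        self._active_span_processor.shutdown()
        if self._atexit_handler is not None:
            atexit.unregister(self._atexit_handler)
            self._atexit_handler = None
```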

@Oberon00 (Member):

I also don't like __del__ (even though I believe the memory leak issue has been fixed for quite some time, see https://www.python.org/dev/peps/pep-0442/ which deals with "cyclic isolates"). So I support the solution that you, @mauriciovasquezbernal, suggest.

However, even the shutdown_on_exit bit is a bit problematic because when this is set to true, we have a (strong) reference from the global atexit handler list to the tracer (you could atexit.unregister in Tracer.shutdown, but still). I would be in favor of handling any (semi-)automatic atexit registration in #123 (related #195). Logically, atexit-registration belongs more in the region of set_preferred_tracer_implementation than Tracer.__init__.
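
One way to avoid the strong reference Oberon00 points out (purely an illustration of the weakref idea, not something this PR implements) is to let the atexit table hold only a weak reference to the tracer:

```python
import atexit
import weakref


class Tracer:
    def __init__(self):
        weak_self = weakref.ref(self)

        def _shutdown_if_alive():
            tracer = weak_self()
            if tracer is not None:
                tracer.shutdown()

        # The atexit table holds only the closure and the weakref, so it does
        # not keep the tracer itself alive.
        atexit.register(_shutdown_if_alive)

    def shutdown(self):
        print("tracer shut down")


tracer = Tracer()  # shut down automatically at interpreter exit while alive
```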

@mauriciovasquezbernal (Member, Author):

I implemented the solution I proposed. I admit that I am not totally happy with it, but I don't know of a better solution right now.

@toumorokoshi (Member) left a comment:

Looks good! I think there is room for improvement with a global object, but no need to block this valuable PR for that.

Any unit tests that should be written or modified along with this change?

- test that the span processors (Simple and Batch) shut down exporters
- test that the tracer shuts down the span processor
- test the atexit hook in Tracer.shutdown()
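
A possible shape for the first checklist item, assuming the 2019-era SDK names SimpleExportSpanProcessor and SpanExporter; the tests actually added in the PR may look different:

```python
from unittest import mock

from opentelemetry.sdk.trace import export


def test_simple_processor_shuts_down_exporter():
    exporter = mock.Mock(spec=export.SpanExporter)
    processor = export.SimpleExportSpanProcessor(exporter)
    processor.shutdown()
    # The processor is expected to forward shutdown() to its exporter.
    exporter.shutdown.assert_called_once()
```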

@mauriciovasquezbernal (Member, Author) commented Nov 13, 2019

I added some shutdown tests in 981f2ba.
The atexit ones were a bit complicated; I had to use some imagination.
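
For context on the "imagination" part: an atexit hook only fires when the interpreter actually exits, so such tests typically spawn a subprocess and assert on its output. A minimal sketch of that pattern (not the PR's exact test, which templates a longer script, as shown in the snippet further below):

```python
import subprocess
import sys


def test_shutdown_registered_with_atexit():
    # Run a tiny script in a fresh interpreter so the atexit machinery fires,
    # then assert on what it printed from the parent process.
    script = (
        "import atexit\n"
        "atexit.register(lambda: print('shutdown called'))\n"
    )
    out = subprocess.check_output([sys.executable, "-c", script])
    assert b"shutdown called" in out
```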

not available in Python < 3.6
- avoid using hasattr and assign the member to None instead
- use shutil.which to overcome a problem on Windows
@Oberon00 added the PR:please merge label (This PR is ready to be merged by a Maintainer: has enough valid approvals, successful build, etc.) on Nov 20, 2019
Two resolved (outdated) review threads on opentelemetry-sdk/src/opentelemetry/sdk/trace/__init__.py.
tracer.add_span_processor(mock_processor)

{tracer_shutdown}
"""

Member:

Very weird, but it looks like it works!

@mauriciovasquezbernal (Member, Author):

Indeed. I didn't find another way to do it!

@c24t (Member) commented Nov 23, 2019

@mauriciovasquezbernal FYI if you check "allow edits from maintainers" other people can push to this branch, e.g. to fix merge conflicts.

My suggestions aren't blocking, but I'd appreciate it if you commit them when you fix the merge conflicts here. I can merge this once it's fixed.

@c24t removed the PR:please merge label on Nov 23, 2019

@mauriciovasquezbernal (Member, Author):

@c24t I committed them by hand and also fixed the conflicts with master.
Btw, I had "allow edits from maintainers" checked, but I have no idea why it didn't allow you to push to my branch.

[Screenshot: the "allow edits from maintainers" setting]

@c24t merged commit a42b063 into open-telemetry:master on Nov 27, 2019.
@mauriciovasquezbernal deleted the mauricio/shutdown_exporter branch on April 14, 2020.