Support concurrent exports #781
Conversation
Thanks for the feedback 👍, I agree with the approach in principle. We interpreted the spec as saying the BatchSpanProcessor must finish sending one batch of spans before starting another. My concern is that, as others pointed out in the spec issue, without a bound this can cause an OOM. We could add a parameter to configure the number of concurrent sending tasks; if the parameter is set to 1, concurrent sending is disabled. Another issue is that under the current model the Jaeger exporter is a little complex. I think the &mut borrow is caused by the generated thrift code.

PS. For some reason, your link to the spec issue is not working: open-telemetry/opentelemetry-specification#2434.
Thanks for the quick reply!
Makes sense; this would be fairly straightforward to add.
My 2c: in the non-concurrent model, it meant starting a request of up to the batch size. In the concurrent model it's still possible to have items pending in the queue up to the timeout; to me it still makes sense to treat the start of an export as a flush.
That's right. The generated thrift agent has some mutable methods for working with the I/O streams and some sort of frame counter. It's by no means untenable, but certainly more work than the other exporters (hence the draft for discussion).
Fixed, thank you. Would it be premature for me to continue with the suggested fixes?
@jwilm these all sound like good improvements to me 👍
I think we can implement the proposed changes. One thing: your branch seems stale, so I'd suggest rebasing on the latest to pick up some critical changes (like the recent split).
I rebased the existing work on main, and I'm starting on the proposed changes (I'll track them in the OP). In the meantime, I would appreciate some input on the Jaeger exporter. I noticed that the stackdriver exporter spawns a separate task for the actual export work.
I believe all the requested changes have been implemented, save for the Jaeger exporter. Considering the implementation path there, I think a task-queue system could be used where the Jaeger exporter only exports a single batch at a time (regardless of BSP settings). This seems like the path of least resistance to landing this PR, but it also means that the Jaeger exporter doesn't actually gain any concurrency. Perhaps this is OK for now and Jaeger concurrency can be a future enhancement? Separately, I took a closer look at the stackdriver exporter, and I think it could easily take advantage of the BSP concurrency management instead of rolling its own. Let me know if you'd like that done. Waiting on your feedback before proceeding. Thanks!
    Ok(builder.build())
}
// pub fn build_simple(mut self) -> Result<TracerProvider, TraceError> {
These build simple / sync methods are the last blocker for the jaeger implementation. In order to support non-pipelined I/O resources like the Agent uploaders, the Exporter and Uploaders were decoupled via a channel, and the uploaders now run on a dedicated task. However, this means that `Exporter::new` has to return a task to spawn as well, similar to the stackdriver exporter. That was problematic for the `build_simple`, etc. methods, since there's no runtime available on which to spawn that task.

I'll need feedback on whether we can simply thread the Runtime through here and whether this design is acceptable.
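For concreteness, here's a rough sketch of the shape this takes (`SpanBatch` and `Uploader` are illustrative stand-ins, not the actual jaeger types, and tokio's mpsc channel stands in for whichever channel the exporter ends up using):

```rust
use tokio::sync::mpsc;

// Hypothetical stand-ins for the real jaeger types.
struct SpanBatch;
struct Uploader;

impl Uploader {
    // The uploader owns a non-pipelined I/O resource, hence &mut self.
    async fn upload(&mut self, _batch: SpanBatch) { /* agent/collector I/O */ }
}

// An `Exporter::new`-style constructor: returns the sending half plus a task
// that the caller must spawn on some runtime to drive the uploader.
fn new_exporter(
    mut uploader: Uploader,
) -> (mpsc::Sender<SpanBatch>, impl std::future::Future<Output = ()>) {
    let (tx, mut rx) = mpsc::channel::<SpanBatch>(64);
    let task = async move {
        // Batches are handled one at a time, preserving upload order even
        // when the batch processor exports concurrently.
        while let Some(batch) = rx.recv().await {
            uploader.upload(batch).await;
        }
    };
    (tx, task)
}
```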
I updated the commit to thread the runtime through, but I don't have enough context in the project to know if this is the right approach.
Thanks for the hard work! Sorry, I didn't get much time yesterday to take a look; I'll try to look tomorrow or over the weekend. We do want to keep `install_simple` without a runtime parameter, just so that you can work with opentelemetry without a runtime. But I need a closer look to see whether it's possible to do that under the new exporter APIs.
Is it just that method in particular, or is the goal that the opentelemetry library can be used without a runtime? If it's just that method, we can probably figure something out using `cfg(feature)` attrs internally.
I think in general we want to allow people to use it without a runtime. As a utility library, we shouldn't assume a runtime and deny certain use cases.
Codecov Report
@@            Coverage Diff            @@
##            main    #781     +/-   ##
=========================================
- Coverage    70.2%   69.8%    -0.5%
=========================================
  Files         109     109
  Lines        8963    9045      +82
=========================================
+ Hits         6293    6314      +21
- Misses       2670    2731      +61
Continue to review full report at Codecov.
I think the easiest way is to wrap the uploader in a mutex? I think the ideal solution is to replace the
I think it makes sense. Curious what your thoughts are, @djc. Also, we can open a separate PR to address this.
Sounds good, are you actually going to be using stackdriver or are you just cleaning stuff up here? :)
I considered that, but it has some downsides including loss of ordering of uploads. The task queue seemed cleaner, but it did necessitate spawning a future somehow. I suppose another option is spawning a thread for the non-async/await uploader(s).
Would it be possible to expose two different exporters from Jaeger? This might open up the design space a bit more.
Just an offer to clean up :)
I think if we allow concurrent exporting, we won't be able to guarantee the strict ordering of spans, i.e. multiple requests in flight could finish in arbitrary order. I need to double-check how Jaeger and other backends handle this, but I don't think it is a no-go. I will report back with what I find.
Yeah, I think we can try this, although it is against best practice when working with an async runtime.
Would be great, thanks!
I tested with Jaeger and it seems to be OK with ingesting spans out of order. I am OK if we spawn a thread or use a mutex for now. As long as we don't make an API change, we can work on improving the internals in a follow-up PR.
I think it's possible, but it would probably be better to address in a different PR :) Thanks again for the wonderful work.
I think there's a path to keeping the task-based approach if the only blocker is the `install_simple` method not taking a runtime. Given that the core library requires some sort of async runtime, we could potentially detect that based on cargo features and insert the runtime internally rather than requiring it as an argument in the public API. What do you think?
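For example, something along these lines (a sketch: the `rt-tokio` / `rt-async-std` feature names mirror the crate's existing runtime features, but the function itself is illustrative):

```rust
// Pick a default runtime based on which cargo feature is enabled, so
// install_simple() need not take a runtime parameter. Illustrative only.
#[cfg(feature = "rt-tokio")]
fn default_runtime() -> opentelemetry::runtime::Tokio {
    opentelemetry::runtime::Tokio
}

#[cfg(all(feature = "rt-async-std", not(feature = "rt-tokio")))]
fn default_runtime() -> opentelemetry::runtime::AsyncStd {
    opentelemetry::runtime::AsyncStd
}
```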
opentelemetry-zipkin/Cargo.toml
@@ -38,6 +38,7 @@ http = "0.2"
reqwest = { version = "0.11", optional = true, default-features = false }
surf = { version = "2.0", optional = true, default-features = false }
thiserror = { version = "1.0" }
futures = "0.3"
Can we save on dependencies by using futures-util (or even futures-core) for most of these?
With the exception of the current Jaeger exporter implementation, we should be able to get away with using only futures-util. I'll make that change.
Ah, and maybe in some cases even futures-core (at least for Zipkin)
Done in the latest commit.
Applications generating significant span volume can end up dropping data due to the synchronous export step. According to the opentelemetry spec, "This function will never be called concurrently for the same exporter instance. It can be called again only after the current call returns." However, it does not place a restriction on concurrent I/O or anything of that nature. There is an [ongoing discussion] about tweaking the language to make this more clear.

With that in mind, this commit makes the exporters return a future that can be spawned concurrently. Unfortunately, this means that the `export()` method can no longer be async while taking &mut self. The latter is desirable to enforce the "no concurrent calls" line of the spec, so the choice is made here to return a future instead, with its lifetime decoupled from self. This resulted in a bit of additional verbosity, but for the most part the async code can still be shoved into an async fn for the ergonomics.

The main exception to this is the `jaeger` exporter, which internally requires a bunch of mutable references. I plan to discuss the overall goal of this PR with the opentelemetry team and get buy-in before making more invasive changes to support this in the jaeger exporter.

[ongoing discussion]: open-telemetry/opentelemetry-specification#2434
Prior to this, export tasks were run in "fire and forget" mode with runtime::spawn. The SpanProcessor now manages tasks directly using FuturesUnordered. This enables limiting overall concurrency (and thus memory footprint). Additionally, the flush and shutdown logic now spawns an additional task for any unexported spans and waits on _all_ outstanding tasks to complete before returning.
Users may desire to control the level of export concurrency in the batch span processor. There are two special values:

- max_concurrent_exports = 0: no bound on concurrency
- max_concurrent_exports = 1: no concurrency; makes everything synchronous on the messaging task
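A rough sketch of the pattern the two messages above describe (illustrative names only, assuming tokio and futures-util; the export body is elided):

```rust
use futures_util::stream::{FuturesUnordered, StreamExt};
use tokio::sync::mpsc;

async fn drive_exports(mut batches: mpsc::Receiver<Vec<u8>>, max_concurrent_exports: usize) {
    let mut in_flight = FuturesUnordered::new();
    while let Some(batch) = batches.recv().await {
        // 0 means unbounded; otherwise wait for a slot before starting another
        // export. A limit of 1 degenerates to the old synchronous behavior.
        while max_concurrent_exports > 0 && in_flight.len() >= max_concurrent_exports {
            in_flight.next().await;
        }
        in_flight.push(async move {
            // perform the actual export of `batch` here
            drop(batch);
        });
    }
    // flush/shutdown: wait for *all* outstanding exports before returning.
    while in_flight.next().await.is_some() {}
}
```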
Key points:

- decouple the exporter from the uploaders via a channel and a spawned task
- some uploaders are a shared I/O resource and cannot be multiplexed, which necessitates a task queue; e.g., HttpClient will spawn many I/O tasks internally, while AgentUploader is a single I/O resource (a different level of abstraction)
- the synchronous API is not supported without a Runtime argument; I updated the API to thread one through, but maybe this is undesirable. I'm also exploiting the fact that Actix uses Tokio under the hood to pass through the Tokio runtime token in the Actix examples
- tests pass, save for a couple of flaky environment ones, which is likely a race condition
This should be ready for another review. The Jaeger API is as it was before, and a thread is used internally to handle the task execution. As future work, I think it would go a long way to have two different exporters for Jaeger: one which does the exports directly, synchronously, and with no thread; and a more production-oriented asynchronous and concurrent exporter.
I think it looks really great and should be a great improvement 👍. We can look into Jaeger more and optimize the jaeger exporter in a follow-up PR.
@@ -2,7 +2,7 @@

set -eu

cargo test --all "$@"
cargo test --all "$@" -- --test-threads=1
Any reason why we need the test threads to be 1 here? Does running them in parallel cause some issues?
Ah yeah, I noticed there were some flaky tests due to modifying the environment and colliding with each other.
scripts/lint.sh
cargo_feature opentelemetry-jaeger "wasm_collector_client"
cargo_feature opentelemetry-jaeger "collector_client, wasm_collector_client"
cargo_feature opentelemetry-jaeger "default"
cargo_feature opentelemetry-jaeger "full"
It's nice to have the `full` feature, but we probably want to keep those separate tests to make sure each feature can also work on its own.
Done.
Nice work 👍 Just one nit; with that and the lint/test fixed, I think we are good to merge.
The minimal necessary futures library (core, util, or futures proper) is now used in all packages touched by the concurrent-exporters work.
To keep the API _actually_ simple, we now leverage a thread to run the jaeger exporter internals.
Per PR feedback, the default should match the previous behavior of 1 batch at a time.
This finishes the remaining TODOs on the concurrent-exports branch. The major change included here adds shutdown functionality to the jaeger exporter, which ensures the exporter has finished its tasks before exiting.
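A minimal sketch of that thread-plus-shutdown shape, combining this commit with the dedicated-thread change above (hypothetical names; a std mpsc channel stands in for the real queue, and the upload body is elided):

```rust
use std::sync::mpsc;
use std::thread::JoinHandle;

enum Message {
    Upload(Vec<u8>), // stand-in for a serialized span batch
    Shutdown,
}

struct UploaderHandle {
    tx: mpsc::Sender<Message>,
    join: Option<JoinHandle<()>>,
}

impl UploaderHandle {
    fn spawn() -> Self {
        let (tx, rx) = mpsc::channel();
        let join = std::thread::spawn(move || {
            // The dedicated thread owns the uploader; batches are handled
            // in order until shutdown is requested.
            for msg in rx {
                match msg {
                    Message::Upload(_batch) => { /* perform the upload */ }
                    Message::Shutdown => break,
                }
            }
        });
        UploaderHandle { tx, join: Some(join) }
    }

    // Because the channel is FIFO, all work queued before the Shutdown
    // message is finished before the thread exits and join() returns.
    fn shutdown(&mut self) {
        let _ = self.tx.send(Message::Shutdown);
        if let Some(join) = self.join.take() {
            let _ = join.join();
        }
    }
}
```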
This was erroneously committed.
OTEL_BSP_MAX_CONCURRENT_EXPORTS may now be specified in the environment to configure the maximum number of concurrent exports. This configurable now has parity with the other options of the span processor.
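Resolution presumably looks roughly like this (a sketch: only the variable name comes from this commit, and the fallback of 1 mirrors the default settled on earlier in the review):

```rust
use std::env;

// Read the max-concurrent-exports setting from the environment, falling
// back to 1 (the previous one-batch-at-a-time behavior).
fn max_concurrent_exports() -> usize {
    env::var("OTEL_BSP_MAX_CONCURRENT_EXPORTS")
        .ok()
        .and_then(|v| v.parse().ok())
        .unwrap_or(1)
}
```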
LGTM 🎉
I don't have bandwidth to review this in detail, sorry.
Anything I can do to help land this?
🎉 thanks for the merge! Are there plans to publish a new release to crates.io anytime soon? We would love to get off our git dependencies.
I hope we can release version 0.18 sometime soon. You can track the progress in #779.
Problem
Applications generating significant span volume can end up dropping data due to the synchronous export step. According to the opentelemetry spec:

> This function will never be called concurrently for the same exporter instance. It can be called again only after the current call returns.

However, it does not place a restriction on concurrent I/O or anything of that nature. There is an ongoing discussion about tweaking the language to make this more clear, and it seems there is consensus on concurrent transmissions being OK with the spec.
Approach
With that in mind, this commit makes the exporters return a future that can be spawned concurrently. Unfortunately, this means that the `export()` method can no longer be async while taking &mut self. The latter is desirable to enforce the "no concurrent calls" line of the spec, so the choice is made here to return a future instead, with its lifetime decoupled from self. This resulted in a bit of additional verbosity, but for the most part the async code can still be shoved into an async fn for the ergonomics.

The jaeger exporter is left untouched for now, as the changes there might take a couple of different approaches (decouple the exporter via a channel, or a bigger refactor to fix the &mut borrows deep in the exporter logic).
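Roughly, the signature change looks like this (a simplified sketch with stand-in types, not the exact trait as merged):

```rust
use futures_util::future::BoxFuture;

// Stand-ins for the real opentelemetry types, to keep the sketch self-contained.
struct SpanData;
struct TraceError;
type ExportResult = Result<(), TraceError>;

trait SpanExporter {
    // Before: `async fn export(&mut self, batch: Vec<SpanData>) -> ExportResult`
    // desugars to a future that borrows `self`, so a second export can never
    // start while one is in flight.
    //
    // After: the returned future's lifetime is decoupled from `self`, so the
    // batch processor may spawn several exports at once, while &mut self still
    // prevents concurrent *calls*, per the spec.
    fn export(&mut self, batch: Vec<SpanData>) -> BoxFuture<'static, ExportResult>;
}
```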
Discussion
I'm opening this PR as a draft to start some discussion with the OpenTelemetry-Rust maintainers. To kick this off, I have results to share from a particularly busy application of ours, which has been dropping tons of spans due to reaching queue capacity:

[Graph: dropped-span count over a 2-day period, before and after this change]

This is a 2-day period. To the left, we have millions of spans dropped because the queue was not being flushed fast enough. To the right, there are nearly zero occurrences of dropped spans due to this issue.
For us, this is a massive improvement.
A couple of questions for the maintainers:
Remaining Work
shutdown()
export_concurrency
export_concurrency
export_concurrency: 1
force_flush