Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable remaining HTTP clients in metrics integration tests #2441

Open
scottgerring opened this issue Dec 17, 2024 · 17 comments
Open

Enable remaining HTTP clients in metrics integration tests #2441

scottgerring opened this issue Dec 17, 2024 · 17 comments

Comments

@scottgerring
Copy link
Contributor

in #2432 and the associated PR, we added integration tests for metrics.

However, HTTP-client exporters have issues that cause test failures, so they are skipped (we are only testing gRPC currently):

///
/// TODO - the HTTP metrics exporters do not seem to flush at the moment.
/// TODO - fix this asynchronously.
///
#[cfg(test)]
#[cfg(not(feature = "hyper-client"))]
#[cfg(not(feature = "reqwest-client"))]
#[cfg(not(feature = "reqwest-blocking-client"))]
mod tests {

Remove this exclusion and fix the tests or the impl!

@scottgerring
Copy link
Contributor Author

I will look at this next

@scottgerring
Copy link
Contributor Author

here's the error that's breaking this:

thread 'OpenTelemetry.Metrics.PeriodicReader' panicked at opentelemetry-http/src/lib.rs:166:32:
there is no reactor running, must be called from the context of a Tokio 1.x runtime
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

This is being thrown specifically by this line:

let mut response = time::timeout(self.timeout, self.inner.request(request)).await??;

(note that that time::timeout is a tokio function)

And here's what's happening. We create a PeriodicReader with its own thread:

let result_thread_creation = thread::Builder::new()
.name("OpenTelemetry.Metrics.PeriodicReader".to_string())
.spawn(move || {
let mut interval_start = Instant::now();
let mut remaining_interval = interval;
otel_info!(
name: "PeriodReaderThreadStarted",
interval_in_millisecs = interval.as_millis(),
timeout_in_millisecs = timeout.as_millis()
);

And then we do async-things inside it:

let mut response = time::timeout(self.timeout, self.inner.request(request)).await??;

... which should be enabled by our wrapping of the whole thing up in futures_executor:

// Relying on futures executor to execute async call.
// TODO: Add timeout and pass it to exporter or consider alternative
// design to enforce timeout here.
let exporter_result = futures_executor::block_on(self.exporter.export(&mut rm));

@scottgerring
Copy link
Contributor Author

scottgerring commented Dec 18, 2024

The mystery deepens. I rolled my own futures-timeout that does not depend on tokio (yes, I know):

struct Timeout<F> {
future: F,
delay: Pin<Box<dyn Future<Output = ()> + Send>>,
}
impl<F: Future + std::marker::Unpin> Future for Timeout<F> where
F::Output: Send,
{
type Output = Result<F::Output, &'static str>;
fn poll(self: Pin<&mut Self>, cx: &mut std::task::Context<'_>) -> Poll<Self::Output> {
let this = self.get_mut();
// Poll the delay future
if this.delay.as_mut().poll(cx).is_ready() {
return Poll::Ready(Err("timeout"));
}
// Poll the main future
match Pin::new(&mut this.future).poll(cx) {
Poll::Ready(output) => Poll::Ready(Ok(output)),
Poll::Pending => Poll::Pending,
}
}
}
fn timeout<F: Future + Send>(duration: Duration, future: F) -> Timeout<F> {
Timeout {
future,
delay: Box::pin(futures_timer::Delay::new(duration)),
}
}

This gets us further - we no longer panic in the timeout, but we panic further into hyper, where again we are expecting a tokio runtime - screenshot shows the frame that kicks back into tokio, and eventually panics for lack of a tokio runtime:

CleanShot 2024-12-18 at 14 53 05@2x

I feel like I am missing something! It's not clear to me how all this rather-tokio-dependant-looking-stuff can run within a non-tokio thread context. I can see that the path for the logging integration ends up with its own tokio context for hyper to work with, rather than sticking it on a non-tokio thread:

pub fn new(connector: C, timeout: Duration, authorization: Option<HeaderValue>) -> Self {
// TODO - support custom executor
let inner = Client::builder(hyper_util::rt::TokioExecutor::new()).build(connector);
Self {
inner,
timeout,
authorization,
}
}

@cijothomas Is this expected to not be working? I know we spoke about this the other day e.g. #1437 , but I suppose I assumed that the code already in main was already good. Should we change the way we are creating HTTP exporters in the meantime in the integration tests?

@lalitb
Copy link
Member

lalitb commented Dec 18, 2024

Will it make sense to document/support ONLY below scenarios with dedicated-thread periodic reader:

  • OTLP/gRPC - Using tonic client
  • OTLP/HTTP - Using reqwest-blocking client

It seems that hyper client and reqwest client will only work with runtime-based periodic reader.

@scottgerring
Copy link
Contributor Author

scottgerring commented Dec 18, 2024

Will it make sense to document/support ONLY below scenarios with dedicated-thread periodic reader:

* OTLP/gRPC - Using `tonic` client

* OTLP/HTTP - Using `reqwest-blocking` client

It seems that hyper client and reqwest client will only work with runtime-based periodic reader.

That would certainly make this ticket easier :) More generally, what's the thinking with the various clients - save our users a dependency?

@cijothomas
Copy link
Member

For 1.0, our default is the dedicated thread and default gRPC client is Tonic, and default HTTP Client is reqwest::blocking, both of which are proven to be working with integration tests.

Reqwest/Hyper may have been designed to only work within tokio::runtimes, and we can just ship 1.0 without its support. (if anyone truly wants it, then the experimental-tokio-rt feature can be enabled to get it).

Post 1.0, we can see how to address supporting tonic/reqwest. I have some ideas on how to generally support other runtimes like tokio (and hence networking libs that depend on it like hyper), without exposing public APIs. Will post it into a new issue for discussion, but for 0.28 and 1.0 release, lets stick with @lalitb's suggestion here

@cijothomas
Copy link
Member

The mystery deepens. I rolled my own futures-timeout that does not depend on tokio (yes, I know):

@ThomsonTan also explored this to enforce timeouts within the dedicated thread. The general conclusion was that
its best to let the individual exporters handle/enforce timeouts. , rather than OTel trying to do the job of an executor!

We may revisit this in the future.

@scottgerring
Copy link
Contributor Author

@cijothomas @lalitb perfect. I'll close this issue as a no-op for the moment!

@cijothomas
Copy link
Member

Could you check and see if this can be added to the OTLP docs somehow, so users know what to expect. This can be done after we merge the Batchprocessor changes as well.

@scottgerring scottgerring reopened this Dec 18, 2024
@scottgerring
Copy link
Contributor Author

Sure - that's a good idea !

@scottgerring
Copy link
Contributor Author

What if we also used the feature to remove the threaded PeriodicReader when the user selects an async HTTP client? I haven't checked if it has flow on effects, but this would save others from going down this rabbit hole.

@cijothomas
Copy link
Member

Users can do it themselves by opting in to the feature flags. We should not do it ourselves, as that'd need using non-stable features.

I don't really think such issues will be common once Otel picks a default. Previously we didn't have a default, forcing user to pick one. Once we have a default, that we validate will work with one http and gRPC library, then users have less reason to change anything. If they need to change something, the doc can give advice about what combinations are invalid.

Defaults would be:
PeriodicReader, BatchProcessor that spawn own thread + HTTP + ReqwestBlockingClient.
Users can switch to gRPC+Tonic.

Anything else - requires opt-in to the experimental ff for custom-runtimes. Does this sound reasonable for 1.0 timeline?

@scottgerring
Copy link
Contributor Author

scottgerring commented Dec 19, 2024

I think opinionated defaults are good, and supporting either gRPC/tonic as well as sync/async environments covers everything. Sounds good for 1.0!

Thinking ahead, do we need 3 different async HTTP clients in this library at all - what value is it adding? As cargo/rustc supports different-versions-of-the-same-package-in-the-tree via name mangling, it's not such a drama if we carry our own dep of a single library we like, and let the "outside world" do what it wants. E.g. - if we choose to use reqwest, I reckon that's merely an impl detail of opentelemetry-rust that the outside world shouldn't know or care about.

I suppose you could argue that if the stars line up and the outside and inside world can both use the same version of the same library we shave some bytes at runtime, but that feels like premature optimization.

The upside of cutting down to a single library would be decreased maintenance burden.

@cijothomas @lalitb what do you think? I'm probably missing some important context here :D

@cijothomas
Copy link
Member

Thinking ahead, do we need 3 different async HTTP clients in this library at all

I think just one would have been sufficient (reqwest-blocking, which works well in default mode of own background thread). But in asynchronous runtimes, we need non-blocking one, so either reqwest or hyper is needed.

I am equally unsure of the need to support many clients! (Also ability to bring-your-own-http-client capability.. maybe there are use cases, I am happy to explore and write down the needs)

@scottgerring
Copy link
Contributor Author

scottgerring commented Dec 20, 2024

Thinking ahead, do we need 3 different async HTTP clients in this library at all

I think just one would have been sufficient (reqwest-blocking, which works well in default mode of own background thread). But in asynchronous runtimes, we need non-blocking one, so either reqwest or hyper is needed.

I am equally unsure of the need to support many clients! (Also ability to bring-your-own-http-client capability.. maybe there are use cases, I am happy to explore and write down the needs)

I suspect the use cases should focus on the customizations the user might need to provide - e.g. "I need {customHeaderX} for my use case, because of weird proxy stuff". Or, I need a callback hook to provide a custom header, because its a bit more complicated than that". Then the problem becomes - which library makes it easier for us to provide the flexibility we might need in the future? I'm struggling to imagine a situation where a user of the library should need to BYO HTTP client; I think that would smell a bit like a break down of abstraction on our part. But - perhaps again here there is something that's come up in the other otel SDKs that shines light here?
It would be cool to decrease the amount of code / feature switches in there down to the minimum sync/async.

@cijothomas - I feel like this is something that could use an RFC, and would be happy to start writing one if that fits?

@cijothomas
Copy link
Member

I'm struggling to imagine a situation where a user of the library should need to BYO HTTP client; I think that would smell a bit like a break down of abstraction on our part. But - perhaps again here there is something that's come up in the other otel SDKs that shines light here?

My own experience is mostly .NET, where HttpClient is part of standard library itself, so this was not even a discussion point! (gRPC/protobuf was challenging, but ultimately Otel .NET decided to handcode a protobuf serializer, and got rid of gRPC/protobuf dependencies completely)

@lalitb you mentioned Otel C++ provides some ability to bring-own-http-client?

@cijothomas - I feel like this is something that could use an RFC, and would be happy to start writing one if that fits?

Totally! One scenario I can think of is the security patches - where companies want to provide own http/gRPC clients, since Rust std library has none. (To be clear - even today, we don't support bring-you-own-client capability. we just allow user to pick between 3 implementations maintained by this repo itself.)

For a 1.0 priority, it is sufficient that we support tonic+reqwest-blocking , so we just need to make sure users have enough flexibility to control these libraries like adding extra headers or authentication etc. Post 1.0 is a good time to revisit if we need more flexibility to bring-own-http-client.

(A similar thing for async runtimes and bring-your-own-runtime ability. Good post 1.0 candidate)

@lalitb
Copy link
Member

lalitb commented Dec 20, 2024

@lalitb you mentioned Otel C++ provides some ability to bring-own-http-client?

Thats correct. C++ lacks a universally accepted standard HTTP client—libcurl is commonly used in POSIX environments, while Windows and macOS have their own native clients. Then there could be iOS and Android bindings on top of Otel C++ SDK which have their own http libraries - we can't impose the use of a specific HTTP client. Instead, need to provide flexible and pluggable setup to accommodate different implementations.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants