Clarification on exporter timeout config #2346
Comments
I also wonder how these OTLP timeouts relate to similar-sounding timeouts defined for batch processors: "Maximum allowed time (in milliseconds) to export data."
Triage comments: we assume that the answer is B. Do you know of any SDKs that are struggling with this definition? We've marked this as 'ready' so that someone can create a clarification PR for it.
The Go SIG has interpreted this as (A): the timeout applies to the entire export process, including any retries.
Same with Java.
I think it actually has to be A: a timeout for the entire export process. As discussed in #4138, there is no standard definition of the OTLP retry exponential backoff algorithm. If the timeout limited individual export requests, there would be no standard way to ensure that an exporter's export resolves within the limit of its associated BatchSpanProcessor, BatchLogRecordProcessor, or PeriodicMetricReader.
A counterargument might be that if the total time for all export requests is just 10s (by default), that doesn't leave much room for individual attempts to fail and retry. I think it does leave room: we just expect OTLP receivers to return retryable status codes quickly enough to allow for retries within the 10s limit for all requests. If 10s is not enough, the user can raise the timeout (provided they also extend the BatchSpanProcessor, BatchLogRecordProcessor, and PeriodicMetricReader limits beyond 30s).
However, as noted in the issue, the OTLP SDK exporter spec also underspecifies connection timeouts. In my experience, I've seen HTTP client libraries whose default connect timeout is 10s and others where it is unset. In both cases, an OTLP exporter can spend the entire 10s export budget waiting for the first attempt to connect. We need a way to configure OTLP exporters with a specific connection timeout so that there is some ability to retry when connection issues occur. Users need a way to say: set the connect timeout to 5s and the overall export timeout to 15s. This would allow the OTLP exporter to make up to 3 attempts when connect timeouts occur, potentially connecting to a different instance each time. In my experience, this greatly increases the chance of export success.
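To make the 5s connect / 15s overall example concrete, here is a minimal Python sketch of interpretation A with a separate per-attempt connect timeout. It is not taken from any SDK: `export_with_budget`, `RetryableError`, and the `send(connect_timeout, remaining)` callback are hypothetical names, and the fixed `backoff` stands in for a retry algorithm the spec does not define.

```python
import time
from typing import Callable


class RetryableError(Exception):
    """Hypothetical marker for a failure that is worth retrying."""


def export_with_budget(
    send: Callable[[float, float], None],
    total_timeout: float = 15.0,
    connect_timeout: float = 5.0,
    backoff: float = 0.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send(connect_timeout, remaining) until it succeeds or the budget is spent.

    Returns the number of attempts made on success; raises TimeoutError
    once the overall budget (interpretation A) is exhausted.
    """
    deadline = clock() + total_timeout
    attempts = 0
    while True:
        remaining = deadline - clock()
        if remaining <= 0:
            raise TimeoutError(f"export budget exhausted after {attempts} attempt(s)")
        attempts += 1
        try:
            # Each attempt may use at most the remaining budget; its connect
            # phase is capped separately so one slow connect cannot consume
            # the whole export budget.
            send(min(connect_timeout, remaining), remaining)
            return attempts
        except RetryableError:
            # Assumed fixed backoff, never sleeping past the deadline.
            pause = min(backoff, max(0.0, deadline - clock()))
            if pause > 0:
                sleep(pause)
```

With the defaults above, a `send` that always fails after the full connect timeout is attempted three times before the 15s budget runs out, which matches the arithmetic in the comment.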
What are you trying to achieve?
Need clarification on how timeout should be used for the export process.
Additional context.
The spec says that the timeout is the maximum time the OTLP exporter will wait for each batch export, but does this mean:
A. timeout for the entire export process (including retry requests)
B. timeout for each individual request
C. timeout for a certain phase in the request (socket, req, res).
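The practical difference between A and B shows up in the worst-case wall-clock time of one export. The sketch below is illustrative arithmetic only: `max_attempts` and `backoffs` are hypothetical parameters, since the retry algorithm is not standardized, and option C is left out because a per-phase timeout alone does not bound the total.

```python
from typing import Sequence


def worst_case_export_seconds(
    interpretation: str,
    timeout: float,
    max_attempts: int,
    backoffs: Sequence[float],
) -> float:
    """Upper bound on how long one batch export can take.

    interpretation "A": timeout covers the whole export, retries included.
    interpretation "B": timeout applies to each individual request.
    backoffs holds the assumed waits between consecutive attempts.
    """
    if interpretation == "A":
        # The whole export, including retries and waits, fits in the timeout.
        return timeout
    if interpretation == "B":
        # Every attempt may run for the full timeout, plus the waits between them.
        waits = sum(backoffs[: max_attempts - 1])
        return max_attempts * timeout + waits
    raise ValueError("only interpretations 'A' and 'B' are modeled here")


a = worst_case_export_seconds("A", 10.0, 3, [1.0, 2.0])
b = worst_case_export_seconds("B", 10.0, 3, [1.0, 2.0])
print(a, b)
```

Under these assumed numbers (10s timeout, 3 attempts, 1s and 2s waits), A is bounded by 10s while B can take up to 33s.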