Exception while exporting Span batch. #3808

Open
kuza55 opened this issue Mar 22, 2024 · 2 comments
Labels
bug Something isn't working

Comments

kuza55 commented Mar 22, 2024

Describe your environment
Ubuntu 22.04 in Docker
Python 3.11
opentelemetry-api/sdk 1.23.0

Steps to reproduce
I was running into an issue where spans created inside a process forked with multiprocessing.Process were being dropped. Specifically, I was using the wrapt_timeout_decorator library to add timeouts around some synchronous Python code.

After debugging, I found that adding code like the following:

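import signal

from opentelemetry import trace
from wrapt_timeout_decorator import timeout


def exit_gracefully(signum=None, frame=None):
    # Flush any spans still buffered in this (child) process.
    trace.get_tracer_provider().force_flush()

...

@timeout(1.8, use_signals=False)
def slow_func(self):
    # Also flush if the child process receives SIGTERM.
    signal.signal(signal.SIGTERM, exit_gracefully)
    try:
        do_stuff()
    finally:
        exit_gracefully()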

got the spans inside do_stuff to be exported successfully most of the time (though not all of the time).
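
The flush-on-SIGTERM-and-on-exit pattern described above can be packaged so it does not have to be repeated in every decorated function. The following is a minimal sketch, not part of the original report: the name flush_on_exit is hypothetical, and the flush callable is whatever the caller passes in (for example trace.get_tracer_provider().force_flush, as used above).

```python
import signal
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def flush_on_exit(flush: Callable[[], object]) -> Iterator[None]:
    """Call `flush` on SIGTERM and again when the block exits.

    `flush_on_exit` is a hypothetical helper for illustration only.
    """

    def handler(signum, frame):
        flush()

    installed = False
    previous = None
    try:
        # signal.signal raises ValueError when called outside the main thread.
        previous = signal.signal(signal.SIGTERM, handler)
        installed = True
    except ValueError:
        pass

    try:
        yield
    finally:
        # Runs on normal return and on exceptions alike.
        flush()
        # Restore the earlier handler if there was one we can reinstall.
        if installed and previous is not None:
            signal.signal(signal.SIGTERM, previous)
```

Inside the forked function this would be used as `with flush_on_exit(trace.get_tracer_provider().force_flush): do_stuff()`. It does not change the underlying behavior: if the flush itself times out against the collector, the export can still fail.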

What is the expected behavior?
No exceptions

What is the actual behavior?
However, after making this change I sporadically run into exceptions like this:

Traceback (most recent call last):
  File "/app/lib/python3.11/site-packages/urllib3/connectionpool.py", line 467, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "/app/lib/python3.11/site-packages/urllib3/connectionpool.py", line 462, in _make_request
    httplib_response = conn.getresponse()
                       ^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/http/client.py", line 1390, in getresponse
    response.begin()
  File "/usr/lib/python3.11/http/client.py", line 325, in begin
    version, status, reason = self._read_status()
                              ^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/http/client.py", line 286, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/socket.py", line 706, in readinto
    return self._sock.recv_into(b)
           ^^^^^^^^^^^^^^^^^^^^^^^
TimeoutError: timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/app/lib/python3.11/site-packages/requests/adapters.py", line 486, in send
    resp = conn.urlopen(
           ^^^^^^^^^^^^^
  File "/app/lib/python3.11/site-packages/urllib3/connectionpool.py", line 799, in urlopen
    retries = retries.increment(
              ^^^^^^^^^^^^^^^^^^
  File "/app/lib/python3.11/site-packages/urllib3/util/retry.py", line 550, in increment
    raise six.reraise(type(error), error, _stacktrace)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/lib/python3.11/site-packages/urllib3/packages/six.py", line 770, in reraise
    raise value
  File "/app/lib/python3.11/site-packages/urllib3/connectionpool.py", line 715, in urlopen
    httplib_response = self._make_request(
                       ^^^^^^^^^^^^^^^^^^^
  File "/app/lib/python3.11/site-packages/urllib3/connectionpool.py", line 469, in _make_request
    self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
  File "/app/lib/python3.11/site-packages/urllib3/connectionpool.py", line 358, in _raise_timeout
    raise ReadTimeoutError(
urllib3.exceptions.ReadTimeoutError: HTTPConnectionPool(host='0.0.0.0', port=4318): Read timed out. (read timeout=10)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/app/lib/python3.11/site-packages/opentelemetry/sdk/trace/export/__init__.py", line 367, in _export_batch
    self.span_exporter.export(self.spans_list[:idx])  # type: ignore
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/lib/python3.11/site-packages/opentelemetry/exporter/otlp/proto/http/trace_exporter/__init__.py", line 145, in export
    resp = self._export(serialized_data)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/lib/python3.11/site-packages/opentelemetry/exporter/otlp/proto/http/trace_exporter/__init__.py", line 114, in _export
    return self._session.post(
           ^^^^^^^^^^^^^^^^^^^
  File "/app/lib/python3.11/site-packages/requests/sessions.py", line 637, in post
    return self.request("POST", url, data=data, json=json, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/lib/python3.11/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/lib/python3.11/site-packages/opentelemetry/instrumentation/requests/__init__.py", line 150, in instrumented_send
    return wrapped_send(self, request, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/lib/python3.11/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/lib/python3.11/site-packages/requests/adapters.py", line 532, in send
    raise ReadTimeout(e, request=request)
requests.exceptions.ReadTimeout: HTTPConnectionPool(host='0.0.0.0', port=4318): Read timed out. (read timeout=10)
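
The last frames show the HTTP exporter timing out while waiting for a response from 0.0.0.0:4318 (read timeout=10). As a first debugging step, and only as a sketch that is not from this thread, one could check from inside the same container whether that address accepts TCP connections at all. The helper name collector_reachable and the 2-second default timeout are assumptions made for illustration.

```python
import socket


def collector_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    Hypothetical helper: it only checks that something is listening,
    not that the listener answers OTLP requests quickly.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and connect timeouts.
        return False
```

A True result does not rule out slow responses. It only separates "nothing is listening on that port" from "the collector is listening but slow to answer".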
kuza55 added the bug label Mar 22, 2024
brad-getpassport commented

+1, we have seen this sporadically in our system, though we are only using auto-injection of traces. Originally we thought it was due to an SDK/distro mismatch between our code and our system, but it seems to be happening again.

Any insight on how to debug this would be helpful.

LQss11 commented Jun 4, 2024

+1, we have seen this sporadically in our system, though we are only using auto-injection of traces. Originally we thought it was due to an SDK/distro mismatch between our code and our system, but it seems to be happening again.

Any insight on how to debug this would be helpful.

@brad-getpassport I ran into a similar issue; maybe this helps.

# From this
otlp_exporter = OTLPSpanExporter(endpoint="http://jaeger:4317")
# To this
otlp_exporter = OTLPSpanExporter(endpoint="http://jaeger:4318/v1/traces")

I fixed it by changing the endpoint to http://jaeger:4318/v1/traces instead of http://jaeger:4317.

You can check this which helped me
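
For context on the endpoint change above: 4317 is the default OTLP/gRPC port, while 4318 is the default OTLP/HTTP port, and OTLP/HTTP trace requests go to the /v1/traces path. A small sanity check along those lines might look like the sketch below. The function name check_http_trace_endpoint is hypothetical, and the path check assumes the HTTP exporter uses the endpoint argument as given rather than appending a path itself.

```python
from urllib.parse import urlparse

OTLP_GRPC_PORT = 4317  # default OTLP/gRPC port
OTLP_HTTP_PORT = 4318  # default OTLP/HTTP port
HTTP_TRACES_PATH = "/v1/traces"


def check_http_trace_endpoint(endpoint: str) -> list[str]:
    """Return warnings for an endpoint intended for the OTLP/HTTP span exporter.

    Hypothetical helper for illustration; an empty list means no
    obvious mismatch was found.
    """
    parsed = urlparse(endpoint)
    warnings = []
    if parsed.port == OTLP_GRPC_PORT:
        warnings.append(
            f"port {OTLP_GRPC_PORT} is the default OTLP/gRPC port; "
            f"OTLP/HTTP defaults to {OTLP_HTTP_PORT}"
        )
    if parsed.path.rstrip("/") != HTTP_TRACES_PATH:
        warnings.append(f"path is {parsed.path!r}, expected {HTTP_TRACES_PATH!r}")
    return warnings
```

With the two endpoints from this comment, the first (http://jaeger:4317) yields both warnings and the second (http://jaeger:4318/v1/traces) yields none. A collector can be configured with non-default ports, so a warning here is a hint to verify, not proof of a misconfiguration.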
