
OTLP Exporter blocking indefinitely #1192

Closed
jpkrohling opened this issue Jun 25, 2020 · 8 comments
Labels
bug Something isn't working

Comments

jpkrohling (Member) commented Jun 25, 2020

Describe the bug
When using the collector with an OTLP exporter, the exporter seems to block indefinitely during shutdown:

^C2020-06-25T16:18:29.765+0200	INFO	service/service.go:277	Received signal from OS	{"signal": "interrupt"}
2020-06-25T16:18:29.765+0200	INFO	service/service.go:448	Starting shutdown...
2020-06-25T16:18:29.766+0200	INFO	service/service.go:377	Stopping receivers...
2020-06-25T16:18:29.767+0200	INFO	service/service.go:383	Stopping processors...
2020-06-25T16:18:29.768+0200	INFO	builder/pipelines_builder.go:70	Pipeline is shutting down...	{"pipeline_name": "traces/custom-2", "pipeline_datatype": "traces"}
2020-06-25T16:18:29.768+0200	INFO	builder/pipelines_builder.go:76	Pipeline is shutdown.	{"pipeline_name": "traces/custom-2", "pipeline_datatype": "traces"}
2020-06-25T16:18:29.768+0200	INFO	builder/pipelines_builder.go:70	Pipeline is shutting down...	{"pipeline_name": "traces/custom-1", "pipeline_datatype": "traces"}
2020-06-25T16:18:29.768+0200	INFO	builder/pipelines_builder.go:76	Pipeline is shutdown.	{"pipeline_name": "traces/custom-1", "pipeline_datatype": "traces"}
2020-06-25T16:18:29.768+0200	INFO	service/service.go:389	Stopping exporters...

It stays like that until I manually force-kill the process (kill -9 PID).

Steps to reproduce

  1. Run make run on the OpenTelemetry Collector side
  2. Run SPAN_STORAGE_TYPE=memory go run ./cmd/collector/main.go --collector.grpc-server.host-port :14251 --log-level=debug on the Jaeger side
  3. Send a span to the OpenTelemetry Collector
  4. Stop the OpenTelemetry Collector

What did you expect to see?
The process should finish within a couple of seconds at most.

What did you see instead?
You can check out anytime you like, but you can never leave.

What version did you use?
Version: master as of now (4eca960a4eb02104694324cf161ad9ec944c44c9).

What config did you use?
Config:

receivers:
  otlp:
  jaeger:
    protocols:
      grpc:
      thrift_compact:

processors:
  batch:

exporters:
  otlp:
    endpoint: "localhost:55268"
  jaeger:
    endpoint: "localhost:14251"
    insecure: true
  logging:

service:
  pipelines:
    traces/custom-1:
      receivers: [jaeger]
      processors: [batch]
      exporters: [otlp]
    traces/custom-2:
      receivers: [otlp]
      processors: [batch]
      exporters: [jaeger, logging]

Environment
OS: Fedora 32 with the latest updates
Compiler (if manually compiled): go1.14.3

Additional context
n/a

jpkrohling added the bug (Something isn't working) label Jun 25, 2020
jpkrohling changed the title from "Exporter blocked indefinitely" to "OTLP Exporter blocked indefinitely" Jun 25, 2020
jpkrohling changed the title from "OTLP Exporter blocked indefinitely" to "OTLP Exporter blocking indefinitely" Jun 25, 2020
jpkrohling (Member, Author) commented:

While this specific instance of the problem might be solved by #1201, the collector needs a way to forcefully shut down a misbehaving exporter.
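
For illustration, here is a minimal sketch (hypothetical names, not the collector's actual component API) of what "forcefully shut down a misbehaving exporter" could look like: run the exporter's Shutdown in a goroutine and stop waiting once a deadline passes, even if the exporter never returns.

package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// exporter mirrors the shape of a component that can be shut down.
type exporter interface {
	Shutdown(ctx context.Context) error
}

// forceShutdown waits at most `timeout` for exp.Shutdown to return and then
// abandons it, so a single stuck exporter cannot hang the whole collector.
func forceShutdown(exp exporter, timeout time.Duration) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	done := make(chan error, 1)
	go func() { done <- exp.Shutdown(ctx) }()

	select {
	case err := <-done:
		return err
	case <-ctx.Done():
		return errors.New("exporter did not shut down in time; abandoning it")
	}
}

// stuckExporter simulates the reported behavior: Shutdown never returns.
type stuckExporter struct{}

func (stuckExporter) Shutdown(ctx context.Context) error {
	select {} // blocks forever, ignoring ctx
}

func main() {
	// Prints the timeout error after roughly two seconds instead of hanging.
	fmt.Println(forceShutdown(stuckExporter{}, 2*time.Second))
}

The abandoned goroutine leaks, but that is acceptable for a process that is about to exit; the point is that shutdown completes even when the exporter ignores its context.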

jmacd (Contributor) commented Jul 10, 2020

I wonder what we expect to happen when an OTLP request fails. IMO the correct thing to do would be to buffer the data and aggregate it with new data while waiting for a successful export. This would require new processor support, I guess.
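
For illustration, a minimal sketch of that buffering idea (hypothetical types, not the collector's processor API): batches that fail to export stay in a buffer and are merged with newly arriving data until an export succeeds.

package main

import (
	"errors"
	"fmt"
)

// span is a stand-in for whatever telemetry item the pipeline carries.
type span struct{ name string }

// retryBuffer keeps data that failed to export and merges it with new data.
type retryBuffer struct {
	pending []span
	export  func([]span) error // the real exporter call would go here
}

// consume merges the incoming batch with anything still pending and tries to
// export the combined set; on failure the combined set stays buffered.
// A real implementation would also need a size/age bound so bad data cannot
// live in the pipeline forever.
func (b *retryBuffer) consume(batch []span) error {
	b.pending = append(b.pending, batch...)
	if err := b.export(b.pending); err != nil {
		return fmt.Errorf("export failed, %d spans buffered: %w", len(b.pending), err)
	}
	b.pending = nil
	return nil
}

func main() {
	calls := 0
	buf := &retryBuffer{export: func(s []span) error {
		calls++
		if calls == 1 { // simulate a transient failure on the first attempt
			return errors.New("connection refused")
		}
		fmt.Printf("exported %d spans\n", len(s))
		return nil
	}}

	_ = buf.consume([]span{{"a"}})        // fails; "a" stays buffered
	_ = buf.consume([]span{{"b"}, {"c"}}) // succeeds; exports a, b and c
}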

jpkrohling (Member, Author) commented:

What are the current failure conditions? Is it only about networking? Can a request fail due to bad data? We don't want data to live in the pipeline forever because it can't move forward for some reason :-)

jpkrohling (Member, Author) commented Jul 16, 2020

I believe this has to be fixed for the next beta, or at the latest for GA.

cc @tigrannajaryan

jrcamp (Contributor) commented Jul 20, 2020

@bogdandrutu to resolve the blocked shutdown (and maybe startup as well), should we set https://golang.org/pkg/context/#WithTimeout on the context before calling Startup/Shutdown? Should this value be configurable globally or per-component? What should the default be? A sketch of the idea follows.
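
A minimal sketch of that WithTimeout idea, assuming a hypothetical service-level shutdownTimeout setting and a component interface with a context-aware Shutdown. Note this only helps when the component actually honors the context; a truly stuck exporter would still need the forceful abandon shown earlier.

package main

import (
	"context"
	"fmt"
	"time"
)

// component mirrors the shape of something with a context-aware Shutdown.
type component interface {
	Shutdown(ctx context.Context) error
}

// shutdownTimeout would come from service-level (or per-component) config.
const shutdownTimeout = 5 * time.Second

// shutdownAll bounds every Shutdown call with context.WithTimeout so the
// service cannot block indefinitely, provided components honor the context.
func shutdownAll(comps []component) {
	for _, c := range comps {
		ctx, cancel := context.WithTimeout(context.Background(), shutdownTimeout)
		if err := c.Shutdown(ctx); err != nil {
			fmt.Printf("component failed to shut down cleanly: %v\n", err)
		}
		cancel()
	}
}

// slowComponent honors the context and gives up when the deadline expires.
type slowComponent struct{}

func (slowComponent) Shutdown(ctx context.Context) error {
	select {
	case <-time.After(time.Minute): // pretend draining takes a minute
		return nil
	case <-ctx.Done():
		return ctx.Err() // context.DeadlineExceeded after shutdownTimeout
	}
}

func main() {
	shutdownAll([]component{slowComponent{}})
}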

bogdandrutu (Member) commented:

@jrcamp @jmacd for normal requests I added timeout/retry and queueing support for all exporters in #1386

@jrcamp for start and stop, I would recommend making that a config option on the service and using WithTimeout on the context passed to start and shutdown.

bogdandrutu (Member) commented:

I would say this issue should be closed and a separate issue opened for start/stop, which I think was not the main concern of this initial issue.

jrcamp (Contributor) commented Jul 20, 2020

New issue opened as a feature request for the start/shutdown timeouts.

jrcamp closed this as completed Jul 20, 2020