Failover does not respect priority-levels #36587

Closed
ioannispl41 opened this issue Nov 28, 2024 · 2 comments · Fixed by #36605

Labels
bug (Something isn't working), connector/failover

Comments

ioannispl41 commented Nov 28, 2024

Component(s)

connector/failover

What happened?

Hello,

I came across a weird behavior …

It seems I am unable to redirect traffic back to priority level one.

Here’s my current configuration:

config:
  connectors:
    failover:
      priority_levels:
        - [traces/first]
        - [traces/second]
      retry_interval: 1m
      retry_gap: 10s
      max_retries: 0 # Unlimited retries
  exporters:
    # Data sources: traces, metrics, logs
    otlp/1:
      endpoint: endpointA
      retry_on_failure:
        enabled: false
      sending_queue:
        enabled: false
      tls:
        insecure: true
    otlp/2:
      endpoint: endpointB
      retry_on_failure:
        enabled: false
      sending_queue:
        enabled: false
      tls:
        insecure: true
  service:
    pipelines:
      traces:
        receivers: [otlp]
        exporters: [failover]
      traces/first:
        receivers: [failover]
        exporters: [otlp/1]
      traces/second:
        receivers: [failover]
        exporters: [otlp/2]

When I changed the max_retries value to 1000, it worked as expected: when otlp/1 stopped, traffic flowed through otlp/2, and when otlp/1 was restarted, it went through otlp/1 again.
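
For reference, the workaround described above only touches the connector block. This is a minimal sketch, assuming every other setting stays exactly as in the configuration shown earlier; 1000 is simply the value reported to work here, not a recommended setting.

```yaml
config:
  connectors:
    failover:
      priority_levels:
        - [traces/first]
        - [traces/second]
      retry_interval: 1m
      retry_gap: 10s
      # Workaround: 1000 is the value reported to work in this issue.
      # With 0 (unlimited retries), traffic did not return to traces/first.
      max_retries: 1000
```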

I believe the issue is related to this line: it does not allow the connector to switch back to a higher priority level when max_retries == 0.

Collector version

v0.110.0

Environment information

No response

OpenTelemetry Collector configuration

No response

Log output

No response

Additional context

No response

ioannispl41 added the bug (Something isn't working) and needs triage (New item requiring triage) labels Nov 28, 2024

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

akats7 commented Nov 29, 2024

Hi @ioannispl41,

Thanks for bringing this up. I was able to recreate the issue and am pushing up a PR with a fix.

3 participants