Infinite ingestion retry when batches are too large and using GuaranteedSend #14350

Closed
simitt opened this issue Oct 31, 2019 · 10 comments · Fixed by #29368
Labels
bug, good first issue, libbeat, Team:Elastic-Agent-Data-Plane, v8.1.0

Comments

@simitt
Contributor

simitt commented Oct 31, 2019

Elasticsearch returns status code 413 when a bulk request exceeds the size limit. A user can either increase http.max_content_length in ES or decrease bulk_max_size in the Beat to overcome such failures.
However, when this error happens and the Beat is using a GuaranteedSend publisher method, the current implementation can lead to an infinite retry that keeps sending the same request to ES.
This might result in the Beat not being able to ingest any more events.

It might be worth exploring special handling for a batch whose request exceeds the size limit, e.g. splitting it in half.

@ph ph added the libbeat label Oct 31, 2019
@ph
Contributor

ph commented Oct 31, 2019

Related issue #3688

@ph
Contributor

ph commented Oct 31, 2019

Prior experience with this from LS: logstash-plugins/logstash-output-elasticsearch#497

@ph
Contributor

ph commented Oct 31, 2019

Linked to #6749

@urso urso assigned faec Nov 1, 2019
@simitt simitt added the bug label Nov 19, 2019
@faec
Contributor

faec commented Oct 6, 2021

This issue probably still exists, but seems rare, is fixable with proper configuration, and was never allocated time in a release cycle -- unassigning so it can be re-triaged.

@faec faec added the Team:Elastic-Agent-Data-Plane label Oct 6, 2021
@elasticmachine
Collaborator

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

@faec faec removed their assignment Oct 6, 2021
@jlind23 jlind23 added the 8.1-candidate and good first issue labels Nov 30, 2021
@jlind23
Collaborator

jlind23 commented Dec 17, 2021

Ping @mukeshelastic @nimarezainia, as you were both interested in this issue. It will be fixed in 8.1 thanks to @rdner.

@mukeshelastic

Thanks @jlind23

@dikshachauhan-qasource

Hi @simitt

Could you please help us validate this ticket by answering the points below:

  • How can we create bulk requests for Elasticsearch?
  • Can it be covered under manual testing?

Thanks
QAS

@simitt
Contributor Author

simitt commented Feb 15, 2022

@rdner given that you implemented the fix, can you please provide guidance for the testers?
It's been a long time since I created the issue, and I don't believe this is reproducible anymore with the latest apm-server version, as it no longer uses the libbeat output to ES.

@rdner
Member

rdner commented Feb 15, 2022

@dikshachauhan-qasource I described the testing process in my PR #29368

Let me know if it's missing something.

8 participants