HTTP 400: Message size too large #636

Open · ThomasHappyHotel opened this issue on Dec 2, 2024 · 9 comments
Labels: bug (Something isn't working)

Comments

@ThomasHappyHotel

Since November 26th, our agent has been shutting down with the following error:

ERROR logdna_agent::_main: bad request, check configuration: 400 Bad Request

In another issue I found that I can make the agent logs more verbose with RUST_LOG=info,mz_http::client=debug. Now we get:

DEBUG mz_http::client: failed request: 400 Bad Request {"status":"not ok","errors":["local:10::Broker: Message size too large"]}

How can I get the agent to run stably again?

P.S.: Can I make the agent restart automatically so that I do not lose logs? I have Restart=on-failure in the service file, but it does not seem to work (at least for this error).
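For reference, one common reason Restart=on-failure appears to do nothing is systemd's start rate limiting: after several failed starts in a short window, systemd gives up and stops retrying. Below is a minimal drop-in sketch covering both the debug logging above and the restart behavior; the unit name and path are assumptions for a typical install, so adjust them to match yours.

```ini
# /etc/systemd/system/logdna-agent.service.d/override.conf  (hypothetical path)
[Unit]
# Relax start rate limiting so repeated crashes don't disable restarts
StartLimitIntervalSec=60
StartLimitBurst=10

[Service]
# Verbose HTTP client logging, as mentioned above
Environment=RUST_LOG=info,mz_http::client=debug
# Restart on any exit, with a short delay between attempts
Restart=always
RestartSec=5
```

Apply it with systemctl daemon-reload && systemctl restart logdna-agent.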

@jakedipity (Contributor)

@ThomasHappyHotel Can you try reducing the max request size through LOGDNA_INGEST_BUFFER_SIZE (see the options table: https://github.com/logdna/logdna-agent-v2/tree/master?tab=readme-ov-file#options)?

The default is 2 MiB, and the value is specified in bytes, e.g. 2 * 1024 * 1024 = 2097152. Try reducing it to just 1 MiB or even further.
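For example, assuming the variable is set in the agent's environment (an env file, the service unit, or a Kubernetes manifest), a 1 MiB cap would look like this:

```sh
# 1 MiB expressed in bytes: 1 * 1024 * 1024
export LOGDNA_INGEST_BUFFER_SIZE=1048576
```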

The error indicates an unintended issue with our ingestion service, so this should at least stop the errors until we identify and fix the underlying problem.

@jakedipity added the bug label on Dec 5, 2024
@ThomasHappyHotel (Author)

@jakedipity Thank you for the feedback and the workaround. I just changed the setting to 1 MiB as recommended and restarted the agent.

@ThomasHappyHotel (Author)

@jakedipity I reduced the buffer size in steps, down to 131072 bytes, but the error still occurs. The workaround does not seem to work.

@jakedipity (Contributor)

@ThomasHappyHotel The agent has built-in logic to loosely cap the size of each batch, but that doesn't help when all or most of the size comes from a single line. In a Kubernetes environment, large lines are normally split into partial lines, which the agent consumes.

Can you share some information about your system and which version of the agent you're running? Do you also know whether there might be any very large lines in the files the agent is watching?

@jakedipity (Contributor)

Additionally, can you share the agent's initial output, specifically the config output section?

@ThomasHappyHotel (Author)

@jakedipity It was hard to diagnose because the exit usually happened in the middle of the night, and the old logfiles had already been rotated. Luckily, yesterday it also happened while I was working.
Your assumption that there is one big line in a logfile is correct: five lines from a single action total about 5 MiB (I assume most of that is in one line).

Is it possible to increase the ingestion buffer size to a larger value, e.g. 6 MiB, so that the agent can ingest these lines without exiting? Would that cause any performance issues?

@jakedipity (Contributor)

@ThomasHappyHotel Unfortunately there currently isn't a way to ingest such large chunks of data. Our ingestion service has technical constraints that cap each batch of lines at roughly 1 - 2 MiB. I'll open an internal ticket to see if we can better accommodate larger batch sizes in the future.

In the meantime, the best option is to exclude the particularly long lines with exclusion rules.

If they're coming from a particular file that doesn't require monitoring, you can use file-level exclusion rules: LOGDNA_EXCLUSION_RULES. A sketch of this follows below.
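The paths here are hypothetical, and per the README the variable takes a comma-separated list of glob patterns:

```sh
# Exclude the noisy files by glob (example paths)
export LOGDNA_EXCLUSION_RULES=/var/log/bulky-app/*.log,/var/log/dumps/*
```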

If the file also includes logs you want, then you can use a line-level exclusion rule instead: LOGDNA_LINE_EXCLUSION_REGEX. Off the top of my head, you could do something like (?:^.{1048577}) to exclude anything with more than 1024 * 1024 characters. See here for more information about using regex with our agent.

@happyhotel42

@jakedipity FYI: I just tried the suggested LOGDNA_LINE_EXCLUSION_REGEX. However, the logdna agent then refuses to start, with the error message: Compiled regex exceeds size limit of 10485760 bytes.

I found that the maximum length for that workaround is about 10k characters.
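That limit is plausibly because a bounded repetition like .{1048577} is unrolled when the pattern is compiled, so the compiled program grows with the repetition count and blows past the 10485760-byte cap. A bound near the ~10k figure reported above should compile, for example:

```sh
# ~10k repetitions stays under the compiled-size limit (per the report above)
export LOGDNA_LINE_EXCLUSION_REGEX='(?:^.{10000})'
```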

@jakedipity (Contributor)

@ThomasHappyHotel Yeah, it's a very crude, generic workaround. It sounds like you got it to stick with a slightly smaller length check.

It sounds like we need a feature to reject lines larger than a configurable length.
