Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add max bytes in line limit #19552

Merged
merged 6 commits into from
Jul 21, 2020
Merged

Add max bytes in line limit #19552

merged 6 commits into from
Jul 21, 2020

Conversation

aleksmaus
Copy link
Member

@aleksmaus aleksmaus commented Jul 1, 2020

What does this PR do?

Handles OOM on long lines gracefully, writes the error into log.

Why is it important?

Handle bad data gracefully without OOM.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have added tests that prove my fix is effective or that my feature works

Author's Checklist

  • I just recently started looking at the beats code so might not have enough knowledge of any possible side-effects of this change. Please review with that in mind.

How to test this PR locally

Refer to #19500 for test case.
The filebeat doesn't get killed due to OOM with this change and the error is logged (see the screenshot attached).

Related issues

Screenshots

Screen Shot 2020-07-01 at 9 16 06 AM

Logs

2020-07-01T13:15:17.630Z	INFO	[crawler]	beater/crawler.go:108	Loading and starting Inputs completed. Enabled inputs: 1
2020-07-01T13:15:17.633Z	INFO	log/harvester.go:297	Harvester started for file: /test/readme.log
2020-07-01T13:15:17.635Z	ERROR	log/harvester.go:330	Read line error: exceeded max bytes in line limit; File: /test/readme.log

@aleksmaus aleksmaus requested review from andresrc and kvch July 1, 2020 13:34
@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Jul 1, 2020
@botelastic
Copy link

botelastic bot commented Jul 1, 2020

This pull request doesn't have a Team:<team> label.

@aleksmaus aleksmaus added the Team:Services (Deprecated) Label for the former Integrations-Services team label Jul 1, 2020
@elasticmachine
Copy link
Collaborator

Pinging @elastic/integrations-services (Team:Services)

@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Jul 1, 2020
@aleksmaus
Copy link
Member Author

aleksmaus commented Jul 1, 2020

hmm, seems like a problem with CI build

[2020-07-01T13:40:28.732Z] + mage package

[2020-07-01T13:40:28.732Z] /var/lib/jenkins/workspace/Beats_packaging_PR-19552/src/github.com/elastic/beats/auditbeat@tmp/durable-fb3d78f4/script.sh: 1: /var/lib/jenkins/workspace/Beats_packaging_PR-19552/src/github.com/elastic/beats/auditbeat@tmp/durable-fb3d78f4/script.sh: mage: not found

script returned exit code 127

@elasticmachine
Copy link
Collaborator

elasticmachine commented Jul 1, 2020

💔 Build Failed

Pipeline View Test View Changes Artifacts preview

Expand to view the summary

Build stats

  • Build Cause: [Pull request #19552 updated]

  • Start Time: 2020-07-20T12:10:57.393+0000

  • Duration: 90 min 27 sec

Test stats 🧪

Test Results
Failed 0
Passed 15721
Skipped 1890
Total 17611

Steps errors

Expand to view the steps failures

  • Name: Report to Codecov

    • Description: curl -sSLo codecov https://codecov.io/bash for i in auditbeat filebeat heartbeat libbeat metricbeat packetbeat winlogbeat journalbeat do FILE="${i}/build/coverage/full.cov" if [ -f "${FILE}" ]; then bash codecov -f "${FILE}" fi done

    • Duration: 1 min 27 sec

    • Start Time: 2020-07-20T12:54:11.019+0000

    • log

  • Name: Mage build test

    • Description: mage build test

    • Duration: 7 min 13 sec

    • Start Time: 2020-07-20T12:44:17.854+0000

    • log

  • Name: Report to Codecov

    • Description: curl -sSLo codecov https://codecov.io/bash for i in auditbeat filebeat heartbeat libbeat metricbeat packetbeat winlogbeat journalbeat do FILE="${i}/build/coverage/full.cov" if [ -f "${FILE}" ]; then bash codecov -f "${FILE}" fi done

    • Duration: 2 min 22 sec

    • Start Time: 2020-07-20T12:45:48.055+0000

    • log

  • Name: Report to Codecov

    • Description: curl -sSLo codecov https://codecov.io/bash for i in auditbeat filebeat heartbeat libbeat metricbeat packetbeat winlogbeat journalbeat do FILE="${i}/build/coverage/full.cov" if [ -f "${FILE}" ]; then bash codecov -f "${FILE}" fi done

    • Duration: 2 min 22 sec

    • Start Time: 2020-07-20T12:49:47.502+0000

    • log

  • Name: Report to Codecov

    • Description: curl -sSLo codecov https://codecov.io/bash for i in auditbeat filebeat heartbeat libbeat metricbeat packetbeat winlogbeat journalbeat do FILE="${i}/build/coverage/full.cov" if [ -f "${FILE}" ]; then bash codecov -f "${FILE}" fi done

    • Duration: 2 min 22 sec

    • Start Time: 2020-07-20T12:48:03.100+0000

    • log

  • Name: Report to Codecov

    • Description: curl -sSLo codecov https://codecov.io/bash for i in auditbeat filebeat heartbeat libbeat metricbeat packetbeat winlogbeat journalbeat do FILE="${i}/build/coverage/full.cov" if [ -f "${FILE}" ]; then bash codecov -f "${FILE}" fi done

    • Duration: 2 min 22 sec

    • Start Time: 2020-07-20T13:19:53.442+0000

    • log

Log output

Expand to view the last 100 lines of log output

[2020-07-20T13:40:55.293Z] + FILE=winlogbeat/build/coverage/full.cov
[2020-07-20T13:40:55.293Z] + [ -f winlogbeat/build/coverage/full.cov ]
[2020-07-20T13:40:55.293Z] + FILE=journalbeat/build/coverage/full.cov
[2020-07-20T13:40:55.293Z] + [ -f journalbeat/build/coverage/full.cov ]
[2020-07-20T13:40:55.846Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats
[2020-07-20T13:40:56.162Z] + find . -type f -name TEST*.xml -path */build/* -delete
[2020-07-20T13:40:56.174Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Lint
[2020-07-20T13:40:56.265Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Elastic-Agent-x-pack
[2020-07-20T13:40:56.354Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Winlogbeat-oss
[2020-07-20T13:40:56.442Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Journalbeat-oss
[2020-07-20T13:40:56.532Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Dockerlogbeat
[2020-07-20T13:40:56.613Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Auditbeat-crosscompile
[2020-07-20T13:40:56.693Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Functionbeat-x-pack
[2020-07-20T13:40:56.844Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Generators-Metricbeat-Linux
[2020-07-20T13:40:56.949Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Packetbeat-Linux
[2020-07-20T13:40:57.070Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Elastic-Agent-Mac-OS-X
[2020-07-20T13:40:57.176Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Elastic-Agent-x-pack-Windows
[2020-07-20T13:40:57.291Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Metricbeat-OSS-Unit-tests
[2020-07-20T13:40:57.405Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Libbeat-oss
[2020-07-20T13:40:57.515Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Heartbeat-oss
[2020-07-20T13:40:57.606Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Auditbeat-x-pack
[2020-07-20T13:40:57.704Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Auditbeat-oss-Mac-OS-X
[2020-07-20T13:40:57.816Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Filebeat-x-pack-Mac-OS-X
[2020-07-20T13:40:57.909Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Winlogbeat-Windows-x-pack
[2020-07-20T13:40:57.990Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Auditbeat-x-pack-Mac-OS-X
[2020-07-20T13:40:58.086Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Auditbeat-x-pack-Windows
[2020-07-20T13:40:58.173Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Auditbeat-oss-Linux
[2020-07-20T13:40:58.262Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Libbeat-x-pack
[2020-07-20T13:40:58.351Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Auditbeat-oss-Windows
[2020-07-20T13:40:58.433Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Filebeat-Mac-OS-X
[2020-07-20T13:40:58.545Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Metricbeat-crosscompile
[2020-07-20T13:40:58.645Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Filebeat-x-pack-Windows
[2020-07-20T13:40:58.743Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Filebeat-Windows
[2020-07-20T13:40:58.844Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Winlogbeat-Windows
[2020-07-20T13:40:58.935Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Metricbeat-x-pack-Windows
[2020-07-20T13:40:59.035Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Metricbeat-x-pack-Mac-OS-X
[2020-07-20T13:40:59.139Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Metricbeat-Mac-OS-X
[2020-07-20T13:40:59.228Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Metricbeat-Windows
[2020-07-20T13:40:59.332Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Functionbeat-Mac-OS-X-x-pack
[2020-07-20T13:40:59.424Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Heartbeat-Mac-OS-X
[2020-07-20T13:40:59.509Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Generators-Beat-Linux
[2020-07-20T13:40:59.602Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Packetbeat-Mac-OS-X
[2020-07-20T13:40:59.708Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Filebeat-x-pack
[2020-07-20T13:40:59.807Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Filebeat-oss
[2020-07-20T13:40:59.904Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Functionbeat-Windows
[2020-07-20T13:41:00.011Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Heartbeat-Windows
[2020-07-20T13:41:00.104Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Metricbeat-OSS-Python-Integration-tests
[2020-07-20T13:41:00.215Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Metricbeat-OSS-Go-Integration-tests
[2020-07-20T13:41:00.322Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Packetbeat-Windows
[2020-07-20T13:41:00.412Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Generators-Metricbeat-Mac-OS-X
[2020-07-20T13:41:00.505Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Generators-Beat-Mac-OS-X
[2020-07-20T13:41:00.607Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Metricbeat-x-pack
[2020-07-20T13:41:01.146Z] + cat
[2020-07-20T13:41:01.146Z] + /usr/local/bin/runbld ./runbld-script
[2020-07-20T13:41:01.146Z] Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF8
[2020-07-20T13:41:07.740Z] runbld>>> runbld started
[2020-07-20T13:41:07.740Z] runbld>>> 1.6.12/f45d832f2ba0aa2722ab4ec1fda8ad140f027f8b
[2020-07-20T13:41:09.651Z] runbld>>> The following profiles matched the job 'Beats/beats/PR-19552' in order of occurrence in the config (last value wins).
[2020-07-20T13:41:10.591Z] runbld>>> Debug logging enabled.
[2020-07-20T13:41:10.591Z] runbld>>> Storing result
[2020-07-20T13:41:10.852Z] runbld>>> Store result: created {:total 2, :successful 2, :failed 0} 1
[2020-07-20T13:41:10.852Z] runbld>>> BUILD: https://c150076387b5421f9154dfbf536e5c60.us-west1.gcp.cloud.es.io:9243/build-1587637540455/t/20200720134110-FABA21D5
[2020-07-20T13:41:10.852Z] runbld>>> Adding system facts.
[2020-07-20T13:41:11.867Z] runbld>>> Adding vcs info for the latest commit:  73e861c5982716a5163be7f75aa216cf40a59167
[2020-07-20T13:41:11.867Z] runbld>>> >>>>>>>>>>>> SCRIPT EXECUTION BEGIN >>>>>>>>>>>>
[2020-07-20T13:41:11.867Z] runbld>>> Adding /usr/lib/jvm/java-8-openjdk-amd64/bin to the path.
[2020-07-20T13:41:11.867Z] Processing JUnit reports with runbld...
[2020-07-20T13:41:11.867Z] + echo 'Processing JUnit reports with runbld...'
[2020-07-20T13:41:12.438Z] runbld>>> <<<<<<<<<<<< SCRIPT EXECUTION END <<<<<<<<<<<<
[2020-07-20T13:41:12.438Z] runbld>>> DURATION: 30ms
[2020-07-20T13:41:12.438Z] runbld>>> STDOUT: 40 bytes
[2020-07-20T13:41:12.438Z] runbld>>> STDERR: 49 bytes
[2020-07-20T13:41:12.438Z] runbld>>> WRAPPED PROCESS: SUCCESS (0)
[2020-07-20T13:41:12.438Z] runbld>>> Searching for build metadata in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats
[2020-07-20T13:41:13.379Z] runbld>>> Storing build metadata: 
[2020-07-20T13:41:13.379Z] runbld>>> Adding test report.
[2020-07-20T13:41:13.379Z] runbld>>> Searching for junit test output files with the pattern: TEST-.*\.xml$ in: /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats
[2020-07-20T13:41:14.320Z] runbld>>> Found 134 test output files
[2020-07-20T13:41:15.704Z] runbld>>> No testsuite node found in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Metricbeat-x-pack/x-pack/metricbeat/build/TEST-go-integration-istio.xml
[2020-07-20T13:41:15.704Z] runbld>>> No testsuite node found in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Metricbeat-x-pack/x-pack/metricbeat/build/TEST-go-integration-openmetrics.xml
[2020-07-20T13:41:15.704Z] runbld>>> No testsuite node found in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Metricbeat-x-pack/x-pack/metricbeat/build/TEST-go-integration-tomcat.xml
[2020-07-20T13:41:15.704Z] runbld>>> No testsuite node found in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Metricbeat-x-pack/x-pack/metricbeat/build/TEST-go-integration-activemq.xml
[2020-07-20T13:41:15.704Z] runbld>>> No testsuite node found in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Metricbeat-x-pack/x-pack/metricbeat/build/TEST-go-integration-iis.xml
[2020-07-20T13:41:15.965Z] runbld>>> No testsuite node found in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Metricbeat-OSS-Go-Integration-tests/metricbeat/build/TEST-go-integration-graphite.xml
[2020-07-20T13:41:15.965Z] runbld>>> No testsuite node found in /var/lib/jenkins/workspace/Beats_beats_PR-19552/src/github.com/elastic/beats/Metricbeat-OSS-Go-Integration-tests/metricbeat/build/TEST-go-integration-windows.xml
[2020-07-20T13:41:17.346Z] runbld>>> Test output logs contained: Errors: 0 Failures: 0 Tests: 17461 Skipped: 1629
[2020-07-20T13:41:17.346Z] runbld>>> Storing result
[2020-07-20T13:41:17.346Z] runbld>>> FAILURES: 0
[2020-07-20T13:41:17.346Z] runbld>>> Store result: updated {:total 2, :successful 2, :failed 0} 2
[2020-07-20T13:41:17.346Z] runbld>>> BUILD: https://c150076387b5421f9154dfbf536e5c60.us-west1.gcp.cloud.es.io:9243/build-1587637540455/t/20200720134110-FABA21D5
[2020-07-20T13:41:17.606Z] runbld>>> Email notification disabled by environment variable.
[2020-07-20T13:41:17.606Z] runbld>>> Slack notification disabled by environment variable.
[2020-07-20T13:41:23.331Z] Running on Jenkins in /var/lib/jenkins/workspace/Beats_beats_PR-19552
[2020-07-20T13:41:23.498Z] [INFO] getVaultSecret: Getting secrets
[2020-07-20T13:41:23.583Z] Masking supported pattern matches of $VAULT_ADDR or $VAULT_ROLE_ID or $VAULT_SECRET_ID
[2020-07-20T13:41:24.322Z] + chmod 755 generate-build-data.sh
[2020-07-20T13:41:24.322Z] + ./generate-build-data.sh https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats/PR-19552/ https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats/PR-19552/runs/4 FAILURE 5426670
[2020-07-20T13:41:24.323Z] INFO: curl https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats/PR-19552/runs/4/steps/?limit=10000 -o steps-info.json
[2020-07-20T13:41:26.184Z] INFO: curl https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats/PR-19552/runs/4/tests/?status=FAILED -o tests-errors.json
[2020-07-20T13:41:26.735Z] INFO: curl https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats/PR-19552/runs/4/log/ -o pipeline-log.txt

@aleksmaus
Copy link
Member Author

The today's comment on the original issue was a suggestion to skip the long lines instead of returning the error. Will update PR later today.

@aleksmaus
Copy link
Member Author

Updated PR to skip the long lines, logging the warning and continuing processing.

New log screenshot, running on the 100M suggested in the test case with couple of lines appended.
Shows skipping the long line. Added more unit test coverage for skipping the long lines.

Using the same MaxSize from the config that is used for the limiter at the end of the readers pipeline, multiplied by 4 since the checks and skipping happen before decoding the data.

Screen Shot 2020-07-02 at 11 57 32 AM

Copy link

@benbuzbee benbuzbee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is awesome! Thanks so much for the quick and best fix. Looking forward to deploying this.


// If newLine is not found and exceeded max bytes limit
if idx == -1 && r.maxBytes != 0 && r.inBuffer.Len() > r.maxBytes {
return ErrExceededMaxBytesLimit

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this abort reading the file forever? If other words, if max_bytes is 1024 and line 50 is 3000 bytes, will line 51+ never be read?

If so, would be it possible to, in lieu of returning an error, continue reading until the new-line but do not add what we read to r.inBuffer, effectively just ignoring the line. So in my example 50 would be skipped, but 51 could be read

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you commented on the first commit. This code was changed by the second commit in this PR.

// for the worst case scenario where incoming UTF32 charchers are decoded to the single byte UTF-8 characters.
// This limit serves primarily to avoid memory bload or potential OOM with expectedly long lines in the file.
// The further size limiting is performed by LimitReader at the end of the readers pipeline as needed.
encReaderMaxBytes := h.config.MaxBytes * 4

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

may want to add a comment in docs for max_bytes that it can use up to 4x the number worth of memory for buffering

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure can do that. Which docs? Before this change it was taking all available memory :-)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-input-log.html

As a user I would want that information here

The maximum number of bytes that a single log message can have. All bytes after max_bytes are discarded and not sent., but up to 4x max_bytes will be buffered in-memory .This setting is especially useful for multiline log messages, which can get large. The default is 10MB (10485760).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the second thought I'm not sure that this part of internal implementation should be documented.
Before this change the code was allocating as much as it was available and cutting the final message to the max byte limit before the output from the pipeline.
This change adds a protection from unlimited allocation.

I'll let the regular beats contributors that are going to review this PR to decide if this needs to be documented.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough. I will make a quick pitch and leave it to you and reviewers to decide: without that kind of information, it is hard to plan. For example, how can I determine how much memory I will be safe giving filebeat? I may think that if I tell it to read only 1 file at a time, and tell it to use max_bytes of 25MB, that I am very safe giving it 50MB of memory, only to be sad when it OOMs because it actually used 100MB of memory for 1 line that I thought was limited to 25.

Personally I find it already very hard to figure out how much memory it needs because i have to do match like + max_events * num_harvesters * max_size - and im scared of more hidden values (like * 4)

@aleksmaus
Copy link
Member Author

Can reduce the memory footprint by not caching the line data before decoding and decoding every read. Need to measure how it affects the performance.
This would require a bit more changes in the exiting code.
The current change is less invasive.
Will leave up to the reviewers feedback.

Copy link
Contributor

@kvch kvch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you taken a look at the functions of streambuf? Is AppendWithCapLimits not usable here? (https://github.com/elastic/beats/blob/master/libbeat/common/streambuf/streambuf.go#L176)

@aleksmaus
Copy link
Member Author

aleksmaus commented Jul 6, 2020

@kvch if I'm reading the implementation correctly

func (b *Buffer) doAppend(data []byte, retainable bool, newCap int) error {

it caps the the number of byte in append operation, while potentially doing huge (10MB default) allocations for backing array:

b.data = make([]byte, len(data), newCap)

and not really limiting the buffer size growth

b.data = append(b.data, data...)

It looks like the purpose of AppendWithCapLimits implementation was to reduce the number of reallocations for appends when you know the final possible buffer size.

Please explain, if I misunderstood the way you are proposing to use this.

@@ -28,12 +28,15 @@ import (
"github.com/elastic/beats/v7/libbeat/logp"
)

const Unlimited = 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please do not export this variable if you are not using it outside the readfile package.

Copy link
Member Author

@aleksmaus aleksmaus Jul 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This constant potentially could be used for the public NewLineReader config param.
Let me know if still want to hide it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, I am not a huge fan of exposing something we would potentially use. In general, it is easier to export something unexported, then unexport something exported.

But I do not block this PR because of a philosophical question about a small constant.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, will update to not export.

@@ -31,10 +35,12 @@ import (
)

// Sample texts are from http://www.columbia.edu/~kermit/utf8.html
var tests = []struct {
type LineTestSpec struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please do not export this struct as you are not using it from outside the package. Also, rather name it listTestCase. The name you have given it is not idiomatic to this repo.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

}

func TestMaxBytesLimit(t *testing.T) {
const (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please extract the setup part of the test out into a separate function? The way the test is structured now is hard to read and understand.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extracted.

Copy link
Contributor

@kvch kvch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code itself look good, I have a few minor comments. Please refactor the test you have provided.

@kvch
Copy link
Contributor

kvch commented Jul 9, 2020

@aleksmaus You are right. I was just wondering if we could use it.

@aleksmaus
Copy link
Member Author

Addressed all of the comments. Let me know if anything else needs to be changed.

@kvch
Copy link
Contributor

kvch commented Jul 16, 2020

jenkins run tests

@kvch
Copy link
Contributor

kvch commented Jul 16, 2020

@aleksmaus Sorry for the long wait. The code in general looks good to me. However, when a line is truncated Filebeat flags that message by adding "truncated" to the list in log.flags. This current implementation hides this information from the user. Could you please address this problem?

@kvch
Copy link
Contributor

kvch commented Jul 20, 2020

@aleksmaus Indeed. Could you please rebase the PR and add a changelog entry to CHANGELOG.next.asciidoc?

@kvch kvch added the needs_backport PR is waiting to be backported to other branches. label Jul 20, 2020
@ppf2
Copy link
Member

ppf2 commented Jul 20, 2020

we completely drop the line that is too long (logging it)

Have we also considered truncating the line before logging? We have seen ~1Gb of junk causing filebeat to OOM. It might be helpful to truncate first before logging. Perhaps only log the first 1024 characters or something for the dropped line?

@aleksmaus
Copy link
Member Author

we completely drop the line that is too long (logging it)

Have we also considered truncating the line before logging? We have seen ~1Gb of junk causing filebeat to OOM. It might be helpful to truncate first before logging. Perhaps only log the first 1024 characters or something for the dropped line?

the line drop happens before decoding. would have to try to decode for that. the number of characters will be known only after decoding. also the junk will not be necessarily decodable.
thoughts?

@aleksmaus
Copy link
Member Author

we completely drop the line that is too long (logging it)

Have we also considered truncating the line before logging? We have seen ~1Gb of junk causing filebeat to OOM. It might be helpful to truncate first before logging. Perhaps only log the first 1024 characters or something for the dropped line?

Ahh, not logging the line content itself. Just the fact that the line is dropped and the length in bytes.

@kvch kvch merged commit 0e049f0 into elastic:master Jul 21, 2020
@kvch
Copy link
Contributor

kvch commented Jul 21, 2020

@aleksmaus Thank you for the contribution!

@kvch kvch added v7.10.0 and removed needs_backport PR is waiting to be backported to other branches. labels Jul 21, 2020
kvch pushed a commit to kvch/beats that referenced this pull request Jul 21, 2020
v1v added a commit to v1v/beats that referenced this pull request Jul 21, 2020
…-basis-when

* upstream/master:
  [CI] run everything for branches/tags (elastic#20057)
  Limit the number of bytes read by LineReader in Filebeat (elastic#19552)
  Prefer testify.Error/NoError over NotNil/Nil (elastic#20044)
@kvch kvch added the v6.8.11 label Jul 21, 2020
kvch added a commit that referenced this pull request Jul 21, 2020
kvch pushed a commit to kvch/beats that referenced this pull request Jul 22, 2020
kvch added a commit that referenced this pull request Jul 23, 2020
melchiormoulin pushed a commit to melchiormoulin/beats that referenced this pull request Oct 14, 2020
@zube zube bot removed the [zube]: Done label Oct 19, 2020
benero added a commit to benero/beats that referenced this pull request Jul 20, 2021
jacky15 pushed a commit to TencentBlueKing/beats that referenced this pull request Jul 21, 2021
* bugfix: Add max bytes in line limit elastic#19552
chery pick from elastic#19552

* feature: add file reuse harvester

* bugfix: harvester cleanup & state update
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Team:Services (Deprecated) Label for the former Integrations-Services team v6.8.11 v7.10.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants