Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libbeat/reader/syslog: relax timestamp parsing to allow leading zero #31254

Merged
merged 2 commits into from
Apr 15, 2022

Conversation

efd6
Copy link
Contributor

@efd6 efd6 commented Apr 12, 2022

What does this PR do?

This change relaxes the RFC3164 timestamp grammar to allow dates with a leading
zero to be parsed as valid syslog timestamps, bringing the parser's behaviour into
line with the parser in filebeat/input/syslog.

Why is it important?

There are non-standard syslog generating implementations that we were previously unable to consume. With this change we also gain the possibility of reducing the number of syslog parser implementations that we have.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Author's Checklist

  • Confirm that the equivalent test in filebeat/input/syslog matches (there is one difference in that the hostname in that package's test does not conform to the RFC).
  • This is arguably a bug fix even though it is a break from the strict definition in the RFC. Opinions on this are welcome.

How to test this PR locally

go test github.com/elastic/beats/v7/libbeat/reader/syslog

Related issues

Use cases

Screenshots

Logs

@efd6 efd6 requested a review from a team as a code owner April 12, 2022 03:57
@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Apr 12, 2022
@elasticmachine
Copy link
Collaborator

Pinging @elastic/security-external-integrations (Team:Security-External Integrations)

@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Apr 12, 2022
@mergify mergify bot assigned efd6 Apr 12, 2022
@efd6 efd6 requested review from taylor-swanson and a team April 12, 2022 04:01
@elasticmachine
Copy link
Collaborator

elasticmachine commented Apr 12, 2022

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview preview

Expand to view the summary

Build stats

  • Start Time: 2022-04-15T02:27:52.451+0000

  • Duration: 70 min 11 sec

Test stats 🧪

Test Results
Failed 0
Passed 22301
Skipped 1935
Total 24236

💚 Flaky test report

Tests succeeded.

🤖 GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

  • /package : Generate the packages and run the E2E tests.

  • /beats-tester : Run the installation tests with beats-tester.

  • run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

@mergify
Copy link
Contributor

mergify bot commented Apr 12, 2022

This pull request is now in conflicts. Could you fix it? 🙏
To fixup this pull request, you can check out it locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b syslogrelaxdate upstream/syslogrelaxdate
git merge upstream/main
git push upstream syslogrelaxdate

@cmacknz cmacknz removed the request for review from a team April 12, 2022 15:14
This change relaxes the RFC3164 timestamp grammar to allow dates with a leading
zero to be parsed as valid syslog timestamps, bringing the parser's behaviour into
line with the parser in filebeat/input/syslog.
Copy link
Contributor

@taylor-swanson taylor-swanson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@taylor-swanson
Copy link
Contributor

This is arguably a bug fix even though it is a break from the strict definition in the RFC

As much as I'd like to strictly adhere to RFC, it doesn't help us trying to ingest from as many sources as possible. The more I look at it, the more I realize it is not our job to be enforcers of RFC, there are better tools out there if a user really wants to do that. I think for us, leaning towards best effort rather than strict RFC adherence is the way to go. Another example would be my latest change that makes the priority field optional, which is a blatant violation of RFC (yet many syslog providers do just that).

I have been looking at making the parser a lot more tolerant and being able to recover from certain errors, but it's proving to be more difficult and time consuming than I'd like. @efd6 I know you recently did some work with the cef decoder and recovery. Feel free to look at it if you've got time. The problem I'm running into is getting the state machine to transition to the "next state" (easier said than done). Further discussion on that topic should probably be done on the main issue, #31246

@taylor-swanson
Copy link
Contributor

Should this be backported to 8.2.0?

@efd6
Copy link
Contributor Author

efd6 commented Apr 14, 2022

Happy to take a look @taylor-swanson — I'm not surprised you are having difficulties; starting a robust machine with throwing out your invariants is hard/foolish (coming from someone who has spent time doing that foolishness).

This should be back ported if it's considered a bug. @andrewkroh?

Copy link
Member

@andrewkroh andrewkroh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC we documented the time parsing behavior of the syslog processor/parser and deviations from the RFC. Please check the docs and update them accordingly.

As for backports, I think this should go into 8.2.

@efd6 efd6 added the backport-v8.2.0 Automated backport with mergify label Apr 15, 2022
@efd6
Copy link
Contributor Author

efd6 commented Apr 15, 2022

@andrewkroh Please take a look at the doc wording. Also, since the issue that this fixes has been around for two years, should I back port to 7.17, and 8.1(too late for 8.1)?

@andrewkroh
Copy link
Member

The syslog processor being changed is new and only exists in main/8.2. So users that want this fix will need to upgrade to take advantage of the new processor.

@efd6 efd6 merged commit e1a7f6d into elastic:main Apr 15, 2022
@efd6 efd6 deleted the syslogrelaxdate branch April 15, 2022 08:13
mergify bot pushed a commit that referenced this pull request Apr 15, 2022
…31254)

This change relaxes the RFC3164 timestamp grammar to allow dates with a leading
zero to be parsed as valid syslog timestamps, bringing the parser's behaviour into
line with the parser in filebeat/input/syslog.

(cherry picked from commit e1a7f6d)
efd6 added a commit that referenced this pull request Apr 15, 2022
…31254) (#31315)

This change relaxes the RFC3164 timestamp grammar to allow dates with a leading
zero to be parsed as valid syslog timestamps, bringing the parser's behaviour into
line with the parser in filebeat/input/syslog.

(cherry picked from commit e1a7f6d)

Co-authored-by: Dan Kortschak <[email protected]>
v1v added a commit to v1v/beats that referenced this pull request Apr 18, 2022
…er-tar-gz

* upstream/main: (139 commits)
  [Automation] Update elastic stack version to 8.3.0-c655cda8 for testing (elastic#31322)
  Define a queue metrics reporter interface  (elastic#31289)
  [Oracle Module] Change tablespace metricset collection period (elastic#31259)
  libbeat/reader/syslog: relax timestamp parsing to allow leading zero (elastic#31254)
  [Automation] Update elastic stack version to 8.3.0-55ba6f37 for testing (elastic#31311)
  [libbeat] Remove unused fields and functions in the memory queue (elastic#31302)
  [libbeat] Cleaning up some unneeded helper types (elastic#31290)
  Readme for kibana module (elastic#31276)
  [Automation] Update elastic stack version to 8.3.0-4be61f32 for testing (elastic#31296)
  x-pack/winlogbeat/module/routing/ingest: fix typo for channel name (elastic#31291)
  Small pipeline cleanup removing some unused data fields (elastic#31288)
  removing info log (elastic#30971)
  Simplify TLS config deserialization (elastic#31168)
  Detect new files under known paths in filestream input (elastic#31268)
  Add support for port mapping in docker hints (elastic#31243)
  Update qa-labels.yml (elastic#31260)
  libbeat: log debug for `proxy_url` and fixed docs (elastic#31130)
  [heartbeat][docs] Add note about ensuring correct index settings for uptime (elastic#31146)
  [Automation] Update elastic stack version to 8.3.0-2c8f9574 for testing (elastic#31256)
  [Filebeat] fix m365_defender pipeline bug (elastic#31227)
  ...
kush-elastic pushed a commit to kush-elastic/beats that referenced this pull request May 2, 2022
…lastic#31254)

This change relaxes the RFC3164 timestamp grammar to allow dates with a leading
zero to be parsed as valid syslog timestamps, bringing the parser's behaviour into
line with the parser in filebeat/input/syslog.
chrisberkhout pushed a commit that referenced this pull request Jun 1, 2023
…31254)

This change relaxes the RFC3164 timestamp grammar to allow dates with a leading
zero to be parsed as valid syslog timestamps, bringing the parser's behaviour into
line with the parser in filebeat/input/syslog.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
8.3-candidate backport-v8.2.0 Automated backport with mergify enhancement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants