filestream: validate input id on startup #41731

Open
wants to merge 1 commit into
base: main

@AndersonQ AndersonQ commented Nov 21, 2024

During startup, Filebeat now validates the filestream inputs and fails to start if any input has no ID or shares an ID with another input. Duplicated IDs can cause data duplication, so each filestream input must now have a unique, non-empty ID.

Proposed commit message

filestream: validate input id on startup

During startup, Filebeat now validates the filestream inputs and fails to start if any input has no ID or shares an ID with another input. Duplicated IDs can cause data duplication, so each filestream input must now have a unique, non-empty ID.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • [ ] I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Disruptive User Impact

Impact: Users who have not configured unique IDs for their filestream inputs will need to update their configurations to include unique IDs for each input. Failure to do so will prevent Filebeat from starting.

Previously: Filebeat only logged an error when filestream inputs had missing or duplicated IDs, which could lead to data duplication.

Now: Filebeat will fail to start if any filestream input lacks an ID or has a duplicated ID.
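
The rule described above can be sketched in Go roughly as follows. This is a minimal illustration, not the code from this PR: the `inputConfig` type and the `validateFilestreamIDs` function are hypothetical names, the error text only approximates the real message, and the sketch ignores the `enabled` setting.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
	"strings"
)

// inputConfig is a hypothetical, simplified stand-in for one parsed
// entry under filebeat.inputs. It is not a type from the Beats codebase.
type inputConfig struct {
	Type string
	ID   string
}

// validateFilestreamIDs returns an error if any filestream input has an
// empty ID or if two filestream inputs share the same ID.
// Inputs of other types are ignored.
func validateFilestreamIDs(inputs []inputConfig) error {
	counts := make(map[string]int)
	missing := false
	for _, in := range inputs {
		if in.Type != "filestream" {
			continue
		}
		if in.ID == "" {
			missing = true
			continue
		}
		counts[in.ID]++
	}

	// Collect every ID used more than once, sorted for a stable message.
	var dups []string
	for id, n := range counts {
		if n > 1 {
			dups = append(dups, id)
		}
	}
	sort.Strings(dups)

	var problems []string
	if missing {
		problems = append(problems, "input without ID")
	}
	if len(dups) > 0 {
		problems = append(problems, "duplicated IDs: "+strings.Join(dups, ", "))
	}
	if len(problems) == 0 {
		return nil
	}
	return errors.New("filestream inputs validation error: " + strings.Join(problems, ", "))
}

func main() {
	inputs := []inputConfig{
		{Type: "filestream"},
		{Type: "filestream", ID: "duplicated-id"},
		{Type: "filestream", ID: "duplicated-id"},
	}
	if err := validateFilestreamIDs(inputs); err != nil {
		fmt.Println("Exiting:", err)
	}
}
```

Reporting both problems in a single error, rather than stopping at the first one, lets a user fix the whole configuration in one pass.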

How to test this PR locally

  • build filebeat
cd filebeat
go build .
  • use the following as filebeat.yml
filebeat.inputs:
  - type: filestream
    enabled: true
    paths:
      - /tmp/noID-1/*.log
  - type: filestream
    enabled: true
    paths:
      - /tmp/noID-2/*.log
  - type: filestream
    id: duplicated-id
    enabled: true
    paths:
      - /tmp/duplicated-id-1/*.log
  - type: filestream
    id: duplicated-id
    enabled: true
    paths:
      - /tmp/duplicated-id-2/*.log

output.discard.enabled: true
logging:
  level: info
  metrics:
    enabled: false
  • run filebeat:
./filebeat -e 2>&1 | grep message

verify filebeat:

  • will exit with error
  • will produce a log listing all the invalid input configurations:
{"@timestamp":"2024-11-21T15:38:05.974+0100","log.level":"error","message":"filestream inputs with invalid IDs","log.logger":"filestream","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream.ValidateInputIDs","file.name":"filestream/config.go","file.line":187},"service.name":"filebeat","inputs":[{"enabled":true,"paths":["/tmp/noID-1/*.log"],"type":"filestream"},{"enabled":true,"paths":["/tmp/noID-2/*.log"],"type":"filestream"},{"enabled":true,"id":"duplicated-id","paths":["/tmp/duplicated-id-1/*.log"],"type":"filestream"},{"enabled":true,"id":"duplicated-id","paths":["/tmp/duplicated-id-2/*.log"],"type":"filestream"}],"ecs.version":"1.6.0"}
  • will produce a log pointing out the issues:
{"@timestamp":"2024-11-21T15:38:05.974+0100","log.level":"error","message":"invalid filestream configuration: filestream inputs validation error: input without ID, found filestream inputs with duplicated IDs: duplicated-id","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/beater.(*Filebeat).Run","file.name":"beater/filebeat.go","file.line":297},"service.name":"filebeat","ecs.version":"1.6.0"}

Related issues

Logs

{"@timestamp":"2024-11-21T15:38:05.974+0100","log.level":"info","message":"filebeat start running.","log.origin":{"function":"github.com/elastic/beats/v7/libbeat/cmd/instance.(*Beat).launch","file.name":"instance/beat.go","file.line":770},"service.name":"filebeat","ecs.version":"1.6.0"}
{"@timestamp":"2024-11-21T15:38:05.974+0100","log.level":"error","message":"filestream inputs with invalid IDs","log.logger":"filestream","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream.ValidateInputIDs","file.name":"filestream/config.go","file.line":187},"service.name":"filebeat","inputs":[{"enabled":true,"paths":["/tmp/noID-1/*.log"],"type":"filestream"},{"enabled":true,"paths":["/tmp/noID-2/*.log"],"type":"filestream"},{"enabled":true,"id":"duplicated-id","paths":["/tmp/duplicated-id-1/*.log"],"type":"filestream"},{"enabled":true,"id":"duplicated-id","paths":["/tmp/duplicated-id-2/*.log"],"type":"filestream"}],"ecs.version":"1.6.0"}
{"@timestamp":"2024-11-21T15:38:05.974+0100","log.level":"error","message":"invalid filestream configuration: filestream inputs validation error: input without ID, found filestream inputs with duplicated IDs: duplicated-id","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/beater.(*Filebeat).Run","file.name":"beater/filebeat.go","file.line":297},"service.name":"filebeat","ecs.version":"1.6.0"}
{"@timestamp":"2024-11-21T15:38:05.974+0100","log.level":"info","message":"filebeat stopped.","log.origin":{"function":"github.com/elastic/beats/v7/libbeat/cmd/instance.(*Beat).launch","file.name":"instance/beat.go","file.line":779},"service.name":"filebeat","ecs.version":"1.6.0"}
{"@timestamp":"2024-11-21T15:38:05.974+0100","log.level":"error","message":"Exiting: filestream inputs validation error: input without ID, found filestream inputs with duplicated IDs: duplicated-id","log.origin":{"function":"github.com/elastic/beats/v7/libbeat/cmd/instance.handleError","file.name":"instance/beat.go","file.line":1594},"service.name":"filebeat","ecs.version":"1.6.0"}
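
When checking output like the above, it can help to pull out only the error-level messages instead of grepping the raw JSON. The following is a small sketch of such a filter, assuming one JSON object per line with the `log.level` and `message` keys shown in these logs; the `logLine` type and the `errorMessages` function are hypothetical helpers, not part of Filebeat.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"os"
)

// logLine holds the two fields of a Filebeat JSON log entry that this
// sketch cares about. All other fields are ignored when decoding.
type logLine struct {
	Level   string `json:"log.level"`
	Message string `json:"message"`
}

// errorMessages returns the "message" field of every line whose
// "log.level" is "error". Lines that are not valid JSON are skipped.
func errorMessages(lines []string) []string {
	var msgs []string
	for _, line := range lines {
		var l logLine
		if err := json.Unmarshal([]byte(line), &l); err != nil {
			continue
		}
		if l.Level == "error" {
			msgs = append(msgs, l.Message)
		}
	}
	return msgs
}

func main() {
	// Read log lines from standard input, one entry per line.
	var lines []string
	sc := bufio.NewScanner(os.Stdin)
	for sc.Scan() {
		lines = append(lines, sc.Text())
	}
	for _, msg := range errorMessages(lines) {
		fmt.Println(msg)
	}
}
```

Saved under a hypothetical name such as `filter.go`, it could be used as `./filebeat -e 2>&1 | go run filter.go`.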

@AndersonQ AndersonQ added breaking change Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team labels Nov 21, 2024
@AndersonQ AndersonQ self-assigned this Nov 21, 2024
mergify bot commented Nov 21, 2024

This pull request does not have a backport label.
If this is a bug or security fix, could you label this PR @AndersonQ? 🙏.
For such, you'll need to label your PR with:

  • The upcoming major version of the Elastic Stack
  • The upcoming minor version of the Elastic Stack (if you're not pushing a breaking change)

To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-8./d is the label to automatically backport to the 8./d branch. /d is the digit

mergify bot commented Nov 21, 2024

backport-8.x has been added to help with the transition to the new branch 8.x.
If you don't need it please use backport-skip label and remove the backport-8.x label.

@mergify mergify bot added the backport-8.x Automated backport to the 8.x branch with mergify label Nov 21, 2024
During startup filebeat now validates the filestream inputs and fails to start if there are inputs without ID or with duplicated IDs
@AndersonQ AndersonQ force-pushed the 40540-filestream-require-unique-ids branch from 247727b to b6ecc4e Compare November 21, 2024 14:45
@AndersonQ AndersonQ marked this pull request as ready for review November 21, 2024 14:56
@AndersonQ AndersonQ requested a review from a team as a code owner November 21, 2024 14:56
@elasticmachine
Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

@AndersonQ AndersonQ added backport-skip Skip notification from the automated backport with mergify and removed backport-8.x Automated backport to the 8.x branch with mergify labels Nov 21, 2024
@pierrehilbert pierrehilbert requested review from belimawr and rdner and removed request for khushijain21 November 21, 2024 16:29
Labels
backport-skip Skip notification from the automated backport with mergify breaking change Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team
Development

Successfully merging this pull request may close these issues.

Filebeat should fail to start when multiple filestream inputs have the same input ID