Filebeat 6.0.0-rc1: logfiles are fully parsed on every start #5442

Closed
friesoft opened this issue Oct 25, 2017 · 1 comment

  • Version: 6.0.0-rc1, 5.6.3 (latest GA)
  • Operating System: Red Hat 7
  • Steps to Reproduce:
    Restarting Filebeat re-reads and parses the same JSON log lines every time. This is easy to see in Filebeat's console output: the first run correctly outputs the parsed log line, and the second, third, and every later run outputs it again.

Sample JSON to reproduce the issue:
{ "@timestamp": "2017-10-17T10:03:14.301Z", "request": "/" }

Sample Filebeat Config 5.0.1 (working) and 5.6.3 (not working):

filebeat.prospectors:
- input_type: log
  paths:
    - serverlogs/apache.json
  json.keys_under_root: true
  json.add_error_key: true
  json.overwrite_keys: true
  fields_under_root: true

output.console:
  pretty: true

Sample Filebeat Config 6.0.0-rc1 (not working):

filebeat.prospectors:
- prospector_type: log
  paths:
    - serverlogs/apache.json
  json.keys_under_root: true
  json.add_error_key: true
  json.overwrite_keys: true
  fields_under_root: true

output.console:
  pretty: true

Running with a 5.0.1 installation (download, untar, add serverlogs/apache.json file with sample provided above, add filebeat.json.yml with sample provided above, run):

[friedreb@pc64901 filebeat-5.0.1-linux-x86_64]$ ./filebeat -c filebeat.json.yml
{
  "@timestamp": "2017-10-17T10:03:14.301Z",
  "beat": {
    "hostname": "pc64901",
    "name": "pc64901",
    "version": "5.0.1"
  },
  "input_type": "log",
  "offset": 61,
  "request": "/",
  "source": "serverlogs/apache.json",
  "type": "log"
}

On 5.0.1, exiting Filebeat (Ctrl+C) and starting it again produces no output: there are no new log lines to parse, because the single sample line was already parsed during the previous run.

Running with a 5.6.3 or 6.0.0-rc1 installation (download, untar, add serverlogs/apache.json file with sample provided above, add filebeat.json.yml with sample provided above, run) produces the same log lines on every start:

[friedreb@pc64901 filebeat-6.0.0-rc1-linux-x86_64]$ ./filebeat -c filebeat.json.yml
{
  "@timestamp": "2017-10-25T07:54:57.673Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "doc",
    "version": "6.0.0-rc1"
  },
  "@timestamp": "2017-10-17T10:03:14.301Z",
  "beat": {
    "name": "pc64901",
    "hostname": "pc64901",
    "version": "6.0.0-rc1"
  },
  "source": "/products/filebeat-6.0.0-rc1-linux-x86_64/serverlogs/apache.json",
  "offset": 61,
  "request": "/"
}

Every time I exit Filebeat and start it again, the same lines are parsed and output, each time with a new parse timestamp (the duplicate timestamps are issue #5440).

In the Discuss thread on elastic.co, @kvch and I agreed that this is a bug:
https://discuss.elastic.co/t/6-0-0-rc1-json-overwrite-keys-not-working-with-timestamp/105189

friesoft changed the title from "Filebeat 6.0.0-rc1: logfile are fully parsed on every start" to "Filebeat 6.0.0-rc1: logfiles are fully parsed on every start" on Oct 25, 2017

tsg (Contributor) commented Oct 25, 2017

The issue seems to happen when relative paths are used in the paths configuration. A workaround is to use absolute paths. I'm working on a fix and will open a PR shortly.
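
As a sketch of that workaround, the 6.0.0-rc1 config from the report could be rewritten with an absolute path. The path below is taken from the `source` field in the reporter's 6.0.0-rc1 output; every other setting is copied unchanged from the reported config, and the path will differ on other installations.

```yaml
filebeat.prospectors:
- prospector_type: log
  paths:
    # Absolute path instead of the relative serverlogs/apache.json
    - /products/filebeat-6.0.0-rc1-linux-x86_64/serverlogs/apache.json
  json.keys_under_root: true
  json.add_error_key: true
  json.overwrite_keys: true
  fields_under_root: true

output.console:
  pretty: true
```

With an absolute path, the configured pattern and the `source` path that Filebeat reports for the file have the same form, which is presumably why the saved state is picked up again on restart.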

tsg added a commit to tsg/beats that referenced this issue Nov 3, 2017
If a relative path was used in a prospector definition, it could happen
that when loading the state from the registry, the path didn't match the
pattern, which made it ignore the saved state. Fixes elastic#5442.

Additionally, this PR cleans up some code related to the `recursive_glob`
setting. The docs used to claim that the feature is disabled by default, but
that wasn't the case and it was impossible to disable it. This change leaves
it enabled by default, but makes it possible to disable it.
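
For illustration, disabling the feature after this change might look like the fragment below. The commit message does not give the exact key, so this assumes the option is spelled `recursive_glob.enabled` and is set per prospector, as described in the Filebeat 6.x reference; check the documentation for your version before relying on it.

```yaml
filebeat.prospectors:
- prospector_type: log
  paths:
    - /products/filebeat-6.0.0-rc1-linux-x86_64/serverlogs/apache.json
  # Assumed option name: turns off expansion of ** in path patterns
  recursive_glob.enabled: false
```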
urso closed this as completed in #5443 on Nov 6, 2017
urso pushed a commit that referenced this issue Nov 6, 2017
* Fix relative paths in prospectors definitions

* addressed comments