You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
In some situations when a file is truncated Filebeat will detect the truncation, close the reader but it will not update the registry.
This happens in two situations:
If Filebeat is stopped, the file is truncated then when Filebeat starts up the offset is not reset to zero and it keeps increasing with the new data read. This causes Filebeat to always detect the file as truncated on every restart and to re-ingest all data.
If the file is truncated while Filebeat is running and Filebeat is stopped after the reader is closed but before a new harvester/prospector are started.
Look at the offset in the registry and TTL. The offset should match the file size and the TTL should be 1800000000000.
Wait until the whole file is ingested
Stop Filebeat
Truncate the file
Add some data, let's say 5 lines. It should result in a smaller file than the original
Restart Filebeat
Assert:
The new content is ingested
The offset in the registry is larger than the file size.
Logs
Logs from the second run where we can see the file truncation being detected
{"log.level":"info","@timestamp":"2024-02-20T16:59:02.093+0100","log.logger":"input.filestream","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/v2/compat.(*runner).Start.func1","file.name":"compat/compat.go","file.line":121},"message":"Input 'filestream' starting","service.name":"filebeat","id":"filestream-input-id","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2024-02-20T16:59:02.093+0100","log.logger":"metric_registry","log.origin":{"function":"github.com/elastic/beats/v7/libbeat/monitoring/inputmon.NewInputRegistry","file.name":"inputmon/input.go","file.line":63},"message":"registering","service.name":"filebeat","input_type":"filestream","id":"filestream-input-id","key":"filestream-input-id","uuid":"4175231d-a5c3-451b-b054-6b316ef1fba5","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2024-02-20T16:59:02.093+0100","log.logger":"input.filestream","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream.(*fileProspector).Run","file.name":"filestream/prospector.go","file.line":133},"message":"Starting prospector","service.name":"filebeat","id":"filestream-input-id","prospector":"file_prospector","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2024-02-20T16:59:02.093+0100","log.logger":"file_watcher","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream.(*fileWatcher).watch","file.name":"filestream/fswatch.go","file.line":120},"message":"Start next scan","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2024-02-20T16:59:02.093+0100","log.logger":"file_watcher","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream.(*fileWatcher).watch","file.name":"filestream/fswatch.go","file.line":224},"message":"File scan complete","service.name":"filebeat","total":1,"written":0,"truncated":0,"renamed":0,"removed":0,"created":1,"ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2024-02-20T16:59:02.093+0100","log.logger":"input.filestream","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream.(*fileProspector).onFSEvent","file.name":"filestream/prospector.go","file.line":179},"message":"A new file /tmp/test.log has been found","service.name":"filebeat","id":"filestream-input-id","prospector":"file_prospector","operation":"create","source_name":"native::65140-34","os_id":"65140-34","new_path":"/tmp/test.log","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2024-02-20T16:59:02.093+0100","log.logger":"input.filestream","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream/internal/input-logfile.(*defaultHarvesterGroup).Start","file.name":"input-logfile/harvester.go","file.line":139},"message":"Starting harvester for file","service.name":"filebeat","id":"filestream-input-id","source_file":"filestream::filestream-input-id::native::65140-34","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2024-02-20T16:59:02.093+0100","log.logger":"input.filestream","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream.(*filestream).openFile","file.name":"filestream/input.go","file.line":275},"message":"File was truncated. Reading file from offset 0. Path=/tmp/test.log","service.name":"filebeat","id":"filestream-input-id","source_file":"filestream::filestream-input-id::native::65140-34","path":"/tmp/test.log","state-id":"native::65140-34","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2024-02-20T16:59:02.093+0100","log.logger":"input.filestream","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream.(*filestream).open","file.name":"filestream/input.go","file.line":185},"message":"newLogFileReader with config.MaxBytes:10485760","service.name":"filebeat","id":"filestream-input-id","source_file":"filestream::filestream-input-id::native::65140-34","path":"/tmp/test.log","state-id":"native::65140-34","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2024-02-20T16:59:02.093+0100","log.logger":"input.filestream","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream.(*logFile).Read","file.name":"filestream/filestream.go","file.line":131},"message":"End of file reached: /tmp/test.log; Backoff now.","service.name":"filebeat","id":"filestream-input-id","source_file":"filestream::filestream-input-id::native::65140-34","path":"/tmp/test.log","state-id":"native::65140-34","ecs.version":"1.6.0"}
File size before truncation and adding new data
ls -li /tmp/test.log
65140 -rw-r--r-- 1 tiago tiago 1997 Feb 20 16:58 /tmp/test.log
Final file size
ls -li /tmp/test.log
65140 -rw-r--r-- 1 tiago tiago 331 Feb 20 16:58 /tmp/test.log
The registry is correctly updated to the actual file size/offset read once file truncation is detected.
Steps to reproduce case 2
This case can use the same configuration as case 1.
Create a log file with some content, keep adding content regularly, let's say every second.
Start Filebeat. You can set prospector.scanner.check_interval: 30s to have more time between truncating the file and Filebeat picking it up again
Truncate the file. truncate -s 0 /tmp/test.log on Linux
Wait for the following log messages
{"log.level":"debug","@timestamp":"2024-02-20T19:40:17.703+0100","log.logger":"input.filestream","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream.(*logFile).handleEOF","file.name":"filestream/filestream.go","file.line":259},"message":"File was truncated as offset (773) > size (34): /tmp/test.log","service.name":"filebeat","id":"filestream-input-id","source_file":"filestream::filestream-input-id::native::69162-34","path":"/tmp/test.log","state-id":"native::69162-34","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2024-02-20T19:40:17.703+0100","log.logger":"input.filestream","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream.(*filestream).readFromSource","file.name":"filestream/input.go","file.line":334},"message":"File was truncated, nothing to read. Path='/tmp/test.log'","service.name":"filebeat","id":"filestream-input-id","source_file":"filestream::filestream-input-id::native::69162-34","path":"/tmp/test.log","state-id":"native::69162-34","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2024-02-20T19:40:17.703+0100","log.logger":"input.filestream","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream/internal/input-logfile.startHarvester.func1","file.name":"input-logfile/harvester.go","file.line":247},"message":"Stopped harvester for file","service.name":"filebeat","id":"filestream-input-id","source_file":"filestream::filestream-input-id::native::69162-34","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2024-02-20T19:40:17.703+0100","log.logger":"input.filestream","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream.(*filestream).Run.func1","file.name":"filestream/input.go","file.line":151},"message":"Closing reader of filestream","service.name":"filebeat","id":"filestream-input-id","source_file":"filestream::filestream-input-id::native::69162-34","path":"/tmp/test.log","state-id":"native::69162-34","ecs.version":"1.6.0"}
Stop Filebeat
Assert that the registry was not updated with an offset of 0
Make sure some data has been added to the file since the truncation
Restart Filebeat
Assert:
Filebeat detects the truncation
The new content is read and sent to the output
The offset in the registry is bigger than the file size.
Adding more data to the file will make the offset in the registry increase.
Logs
After restarting Filebeat it detects the truncation
{"log.level":"debug","@timestamp":"2024-02-20T19:43:06.309+0100","log.logger":"input.filestream","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream.(*fileProspector).onFSEvent","file.name":"filestream/prospector.go","file.line":179},"message":"A new file /tmp/test.log has been found","service.name":"filebeat","id":"filestream-input-id","prospector":"file_prospector","operation":"create","source_name":"native::69162-34","os_id":"69162-34","new_path":"/tmp/test.log","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2024-02-20T19:43:06.309+0100","log.logger":"input.filestream","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream/internal/input-logfile.(*defaultHarvesterGroup).Start","file.name":"input-logfile/harvester.go","file.line":139},"message":"Starting harvester for file","service.name":"filebeat","id":"filestream-input-id","source_file":"filestream::filestream-input-id::native::69162-34","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2024-02-20T19:43:06.309+0100","log.logger":"input.filestream","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream.(*filestream).openFile","file.name":"filestream/input.go","file.line":275},"message":"File was truncated. Reading file from offset 0. Path=/tmp/test.log","service.name":"filebeat","id":"filestream-input-id","source_file":"filestream::filestream-input-id::native::69162-34","path":"/tmp/test.log","state-id":"native::69162-34","ecs.version":"1.6.0"}
ls -li shows the moment the file was truncated, you can see the size going down and then starting to grow again
69162 -rw-r--r-- 1 tiago tiago 637 Feb 20 19:40 /tmp/test.log
69162 -rw-r--r-- 1 tiago tiago 671 Feb 20 19:40 /tmp/test.log
69162 -rw-r--r-- 1 tiago tiago 705 Feb 20 19:40 /tmp/test.log
69162 -rw-r--r-- 1 tiago tiago 739 Feb 20 19:40 /tmp/test.log
69162 -rw-r--r-- 1 tiago tiago 773 Feb 20 19:40 /tmp/test.log
69162 -rw-r--r-- 1 tiago tiago 807 Feb 20 19:40 /tmp/test.log
69162 -rw-r--r-- 1 tiago tiago 34 Feb 20 19:40 /tmp/test.log
69162 -rw-r--r-- 1 tiago tiago 68 Feb 20 19:40 /tmp/test.log
69162 -rw-r--r-- 1 tiago tiago 102 Feb 20 19:40 /tmp/test.log
69162 -rw-r--r-- 1 tiago tiago 136 Feb 20 19:40 /tmp/test.log
69162 -rw-r--r-- 1 tiago tiago 170 Feb 20 19:40 /tmp/test.log
69162 -rw-r--r-- 1 tiago tiago 204 Feb 20 19:40 /tmp/test.log
The last few entries in the registry do not have an offset of 0 nor it ever decreases. The registry is read using tail -F data/registry/filebeat/log.json|jq -r '[.k, .v.cursor.offset, .v.ttl] | @tsv'
belimawr
changed the title
Filestream does not update registry offset when file is truncated while Filebeat is offline
Filestream does not update registry offset when file is truncated
Feb 20, 2024
belimawr
changed the title
Filestream does not update registry offset when file is truncated
Filestream sometimes does not update registry offset when file is truncated
Feb 20, 2024
Another workaround is to temporarily switch to deprecated (but otherwise fine) log input because it does not suffer from this problem as far as I can tell.
Description
In some situations when a file is truncated Filebeat will detect the truncation, close the reader but it will not update the registry.
This happens in two situations:
main
Steps to reproduce case 1
Create a log file with some content, let's say 100 lines
Start Filebeat with the following configuration
filebeat.yml
Look at the offset in the registry and TTL. The offset should match the file size and the TTL should be
1800000000000
.Wait until the whole file is ingested
Stop Filebeat
Truncate the file
Add some data, let's say 5 lines. It should result in a smaller file than the original
Restart Filebeat
Assert:
Logs
Logs from the second run where we can see the file truncation being detected
File size before truncation and adding new data
Final file size
Last 3 entries from the registry
Expected behaviour
The registry is correctly updated to the actual file size/offset read once file truncation is detected.
Steps to reproduce case 2
This case can use the same configuration as case 1.
prospector.scanner.check_interval: 30s
to have more time between truncating the file and Filebeat picking it up againtruncate -s 0 /tmp/test.log
on Linux0
Logs
After restarting Filebeat it detects the truncation
ls -li
shows the moment the file was truncated, you can see the size going down and then starting to grow againThe last few entries in the registry do not have an offset of
0
nor it ever decreases. The registry is read usingtail -F data/registry/filebeat/log.json|jq -r '[.k, .v.cursor.offset, .v.ttl] | @tsv'
Expected behaviour
The registry is correctly updated to the actual file size/offset read once file truncation is detected.
The text was updated successfully, but these errors were encountered: