buffer: backup broken file chunk #4025

Merged
merged 6 commits into from
Feb 16, 2023

@daipom (Contributor) commented Jan 26, 2023

Which issue(s) this PR fixes:
Partial fix for #3970

What this PR does / why we need it:
The backup feature was implemented in #1952, but it didn't handle broken file chunks found while resuming the buffer.
This extends the backup feature to support it.

When a file is corrupted due to power failure or other reasons, backing up the file allows us to guess which data was corrupted.

Example of a forwarder:

<system>
  root_dir /test/fluentd/forwarder
</system>

<source>
  @type sample
  tag test
</source>

<match test.**>
  @id test_id
  @type forward
  <buffer tag,time,file>
    @type file
    path /test/fluentd/forwarder/buffer
    timekey 24h
    flush_mode interval
    flush_interval 10s
    overflow_action drop_oldest_chunk
  </buffer>
  <server>
    host localhost
    port 24224
  </server>
</match>

We can reproduce file corruption due to power failure as follows.
(On Windows in particular, we have confirmed cases where the data being written is turned into a sequence of zeros.)

  • Stop Fluentd.
  • Fill in the second half of the chunk data with zeros.
  • Break the meta-data completely.
$ truncate -s 80 buffer.b5f32232e76a4d1bdfdbeed36c384b03b.log
$ truncate -s 160 buffer.b5f32232e76a4d1bdfdbeed36c384b03b.log
$ head -c 89 /dev/zero > buffer.b5f32232e76a4d1bdfdbeed36c384b03b.log.meta
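
The same corruption can be sketched in Ruby for use in a test script. This is only an illustration: `corrupt_chunk` and the file name are hypothetical, and it assumes a platform where growing a file with `File.truncate` pads it with zero bytes (as `truncate(2)` does on POSIX systems).

```ruby
require "tmpdir"

# Hypothetical helper (not part of Fluentd): zero-fill the tail of a chunk
# file and overwrite its .meta file with zeros, mirroring the commands above.
def corrupt_chunk(chunk_path, keep_bytes:, total_bytes:, meta_bytes:)
  File.truncate(chunk_path, keep_bytes)   # drop everything after keep_bytes
  File.truncate(chunk_path, total_bytes)  # grow back; the new tail reads as zeros
  File.binwrite("#{chunk_path}.meta", "\0" * meta_bytes)
end

# Demo on a throwaway file; the name below is made up for illustration.
Dir.mktmpdir do |dir|
  path = File.join(dir, "buffer.b0000.log")
  File.binwrite(path, "x" * 160)
  File.binwrite("#{path}.meta", "m" * 89)

  corrupt_chunk(path, keep_bytes: 80, total_bytes: 160, meta_bytes: 89)

  data = File.binread(path)
  puts "chunk size: #{data.bytesize}"
  puts "zero bytes in second half: #{data.byteslice(80, 80).count("\0")}"
end
```

The sizes 80, 160, and 89 are taken from the commands above and would need adjusting to the actual chunk and meta file sizes.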

Then restart Fluentd, and it backs up the chunk.
The logs of fluentd are as follows.

[info]: #0 fluent/log.rb:330:info: starting fluentd worker pid=891085 ppid=891065 worker=0
[debug]: #0 [test_id] restoring buffer file: path = /test/fluentd/forwarder/buffer/buffer.b5f32232e76a4d1bdfdbeed36c384b03b.log
[error]: #0 [test_id] found broken chunk file during resume. path="/test/fluentd/forwarder/buffer/buffer.b5f32232e76a4d1bdfdbeed36c384b03b.log" mode=:staged err_msg="staged meta file is broken. no implicit conversion of Symbol into Integer"
[warn]: #0 [test_id] bad chunk is moved to /test/fluentd/forwarder/backup/worker0/test_id/5f32232e76a4d1bdfdbeed36c384b03b.log
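
To locate the backup of a given chunk, the mapping visible in the logs above can be sketched as follows. This is not Fluentd's own code: `expected_backup_path` and the pattern are hypothetical and assume the layout seen in this example, where the file name carries a state letter (`b` in this staged example; `q` is assumed here to mark queued chunks) followed by a 32-character hex chunk ID.

```ruby
# Hypothetical helper, inferred from the log lines above rather than taken
# from Fluentd's source.
BUFFER_FILE_PATTERN = /\.(?<state>[bq])(?<id>[0-9a-f]{32})\.log\z/

# Returns the path where the backup is expected, or nil when the file name
# does not look like a file buffer chunk.
def expected_backup_path(chunk_path, root_dir:, worker_id:, plugin_id:)
  m = BUFFER_FILE_PATTERN.match(File.basename(chunk_path))
  return nil unless m

  File.join(root_dir, "backup", "worker#{worker_id}", plugin_id, "#{m[:id]}.log")
end

puts expected_backup_path(
  "/test/fluentd/forwarder/buffer/buffer.b5f32232e76a4d1bdfdbeed36c384b03b.log",
  root_dir: "/test/fluentd/forwarder",
  worker_id: 0,
  plugin_id: "test_id"
)
```

With the values from this example, the result matches the path reported in the `[warn]` line.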

We can check the backup file as follows.

require "fluent/event"
chunk_data = File.read("buffer.b5f32232e76a4d1bdfdbeed36c384b03b.log")
Fluent::MessagePackEventStream.new(chunk_data, nil, 0).each {|time, record| p time, record}

2023-01-26 12:18:12.017325082 +0900
{"message"=>"sample"}
2023-01-26 12:18:13.020290925 +0900
{"message"=>"sample"}
2023-01-26 12:18:14.022211281 +0900
{"message"=>"sampl\u0000"}
0
nil
0
nil
...
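
In the output above, the records turn into `0` / `nil` pairs once the reader reaches the zero-filled region. A rough way to estimate where that region starts, without decoding anything, is to find the trailing run of NUL bytes. This is a heuristic sketch only: `zero_tail_offset` is a hypothetical helper, and a healthy chunk can also legitimately end in a zero byte.

```ruby
# Hypothetical helper: returns the byte offset at which the trailing run of
# NUL bytes begins. Returns data.bytesize when the data does not end in NUL.
def zero_tail_offset(data)
  offset = data.bytesize
  # Walk backwards while the previous byte is NUL.
  offset -= 1 while offset > 0 && data.getbyte(offset - 1) == 0
  offset
end

# Demo with made-up bytes standing in for a chunk file's contents.
sample = ("record-bytes" + "\0" * 20).b
offset = zero_tail_offset(sample)
puts "zero-filled tail starts at byte #{offset} of #{sample.bytesize}"
```

Bytes before the returned offset are candidates for intact data; the record that straddles the offset (like `"sampl\u0000"` above) is likely the one that was being written when the corruption happened.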

Docs Changes:

Release Note:

Same as the title.

Backup feature was implemented in fluent#1952, but it didn't support
handling broken file chunks found in resuming buffer.

This extends the backup feature to support it.

Signed-off-by: Daijiro Fukuda <[email protected]>
@ashie self-requested a review January 26, 2023 03:50
@daipom (Contributor, Author) commented Jan 26, 2023

I'm fixing TestFluentPluginConfigFormatter...

@daipom (Contributor, Author) commented Jan 26, 2023

Hmm, the disable_chunk_backup option is still in the buffer section, but I moved it from the output plugin into the buffer plugin.

I thought this was OK since it doesn't change the config shape, but it turns out to be a problem.
It would break output plugins that use this option directly.
I will add a getter for compatibility.
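
The idea of a compatibility getter can be sketched as below. The classes `SketchBuffer` and `SketchOutput` are hypothetical stand-ins, not the actual `Buffer` and `Output` classes, and the real change may differ in detail.

```ruby
# Hypothetical stand-in for the buffer plugin, which now owns the option.
class SketchBuffer
  attr_reader :disable_chunk_backup

  def initialize(disable_chunk_backup: false)
    @disable_chunk_backup = disable_chunk_backup
  end
end

# Hypothetical stand-in for an output plugin. The getter keeps existing
# callers working by delegating to the buffer.
class SketchOutput
  def initialize(buffer)
    @buffer = buffer
  end

  def disable_chunk_backup
    @buffer.disable_chunk_backup
  end
end

output = SketchOutput.new(SketchBuffer.new(disable_chunk_backup: true))
puts output.disable_chunk_backup
```

Delegating keeps a single source of truth in the buffer, so plugins that read the option through the output do not see a stale copy.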

`disable_chunk_backup` option is still in `buffer` section, but
it is moved into `Buffer` from `Output`.

Signed-off-by: Daijiro Fukuda <[email protected]>
lib/fluent/plugin/buffer.rb (review thread: outdated, resolved)
Signed-off-by: Daijiro Fukuda <[email protected]>
test/plugin/test_buf_file.rb (review thread: outdated, resolved)
Signed-off-by: Daijiro Fukuda <[email protected]>
@ashie merged commit 8bb38b5 into fluent:master Feb 16, 2023
@daipom deleted the backup-broken-filechunk branch February 16, 2023 08:41
@daipom (Contributor, Author) commented Feb 16, 2023

Thanks for your review!
