in_tail: use the latest id when finding offset from the db #2960

Merged
merged 1 commit into from
Jan 26, 2021

Conversation

lee-byeoksan
Contributor

The DB may contain two or more records with the same inode value. The problem is that on restart, if a resumed file has the same inode as a DB record whose file has already been deleted, in_tail uses the offset stored for the deleted file. This leads to unexpected behavior.

When the DB contains two records with the same inode value, the one with the smaller id can safely be ignored, because a file system cannot have two files with the same inode if we ignore hard links. Thus, it is safe to query the record with the largest id.

Below is a part of DB from our server. I removed some sensitive information.

$ sqlite3 tail.db 'select * from in_tail_files where inode=6443043122'
9816|XXX_access_log.2.20200713|83636851|6443043122|1594479601|1
10582|XXX_access_log.3.20210121|3520016914|6443043122|1611068401|1
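
As a rough illustration of the proposed lookup, here is a minimal sketch that loads the two rows above into an in-memory SQLite database and selects the record with the largest id for a given inode. The column names (`file_offset`, `created`, `rotated`) and the helper `latest_record` are assumptions made for this example, inferred from the order of the fields in the output; this is not the plugin's actual schema or C code.

```python
import sqlite3

# Hypothetical schema: column names are guessed from the field order
# in the sqlite3 output above, not taken from the fluent-bit source.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE in_tail_files ("
    "id INTEGER PRIMARY KEY, name TEXT, file_offset INTEGER, "
    "inode INTEGER, created INTEGER, rotated INTEGER)"
)
conn.executemany(
    "INSERT INTO in_tail_files VALUES (?, ?, ?, ?, ?, ?)",
    [
        (9816, "XXX_access_log.2.20200713", 83636851, 6443043122, 1594479601, 1),
        (10582, "XXX_access_log.3.20210121", 3520016914, 6443043122, 1611068401, 1),
    ],
)


def latest_record(conn, inode):
    """Return (id, file_offset) of the newest record for this inode, or None."""
    return conn.execute(
        "SELECT id, file_offset FROM in_tail_files "
        "WHERE inode = ? ORDER BY id DESC LIMIT 1",
        (inode,),
    ).fetchone()


latest = latest_record(conn, 6443043122)
print(latest)
```

With the rows above, the lookup picks id 10582 and its offset rather than the stale record 9816 left behind by the deleted file.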

When I upgraded (v1.3.6 -> v1.6.9) and restarted fluent-bit, it started to process far too many logs. Note that, at that time, the file corresponding to id 10582 had not been rotated and was being resumed. In the old versions this was not a problem, because fluent-bit also compared the name of the file with the name stored in the record.

If we do not check the name, I think we should check the id to prevent the above situation.

Signed-off-by: Lee Byeoksan [email protected]


Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change; please submit the following in a comment:

  • [N/A] Example configuration file for the change
  • [N/A] Debug log output from testing the change
  • [N/A] Attached Valgrind output that shows no leaks or memory corruption was found

Documentation

  • [N/A] Documentation required for this feature

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

@edsiper edsiper merged commit af4b25c into fluent:master Jan 26, 2021
@edsiper
Member

edsiper commented Jan 26, 2021

thanks for your contribution

@edsiper
Member

edsiper commented Jan 26, 2021

(I will backport this for v1.6.11 release)

edsiper pushed a commit that referenced this pull request Jan 26, 2021
@lee-byeoksan
Contributor Author

@edsiper any update for v1.6.11?
