Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stop following symlinks. #1767

Merged
merged 5 commits into from
Jun 1, 2016
Merged

Conversation

ruflin
Copy link
Member

@ruflin ruflin commented May 30, 2016

Previously symlinks were followed. This had the consequence if a symlink and the file itself existed, the file was read twice. Now symlinks are not followed anymore.

This closes #1686

Previously symlinks were followed. This had the consequence if a symlink and the file itself existed, the file was read twice. Now symlinks are not followed anymore.

This closes elastic#1686
@@ -79,11 +79,15 @@ func (p *ProspectorLog) getFiles() map[string]os.FileInfo {
}

// Stat the file, following any symlinks.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment here is out of date, right? os.Lstat doesn't follow symlinks.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, will remove it.


if os.name == "nt":
import win32file
win32file.CreateSymbolicLink(symlink_file, testfile, 1)
Copy link
Member

@andrewkroh andrewkroh May 31, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The link points to a file so the last parameter should be 0. https://msdn.microsoft.com/en-us/library/windows/desktop/aa363866(v=vs.85).aspx

Hopefully that will fix the issue with 7z on AppVeyor.

@andrewkroh andrewkroh merged commit 41f904b into elastic:master Jun 1, 2016
@djschny
Copy link

djschny commented Jun 13, 2016

Can we add the option for a fail fast option so that if the path provided in the config is a symlink, that it fails to startup so that way a person knows right away of the issue?

@djschny
Copy link

djschny commented Jun 13, 2016

Alternatively why not default to it being disabled but allowing it to be enabled if one chooses.

@ruflin ruflin deleted the fb-alias-handling branch June 14, 2016 07:09
@ruflin
Copy link
Member Author

ruflin commented Jun 14, 2016

@djschny The plan was to introduce a config option as soon as we have a request for it as we were not sure if this is something even needed. See #1686 (comment) Is following symlink something you need?

About the "fail fast". This is a little bit more complex as the symlink could also appear when Filebeat is already running and there is not difference in "first scan" of file and all the follow up scans. I prefer here the config option as I think it is more predictable and is closer to the principle we have so far. If one file is not readable, all the other files are still read.

@djschny
Copy link

djschny commented Jun 14, 2016

Cool, thanks @ruflin, I missed that comment linked issue. Sounds good.

In regards to "fail fast" I don't think people would care much about the worry about if it changed "out from under" Filebeat, but if you don't have startup check code in place, I can understand how this would be a heavier implementation than what may be viewed externally. Thanks for fielding my comment.

@tsg tsg mentioned this pull request Aug 16, 2016
ruflin added a commit to ruflin/beats that referenced this pull request Sep 8, 2016
This implementation of symlinks takes a symlink and opens the original file. The only difference is that as Source the symlink path is reported. The advantage of this implementation is that everything related to file handling is identical with non symlink files as the original file is harvested. All close_* options work as expected. State information is stored for the original file, not the symlink itself. In case a symlink and original file are harvested, only one will be harvested as they share the same inode information.

* Add test to verify that symlinks are disabled by default
* Add test for symlink handling
* Improve error handling in harvester. Return proper error messages instead of logging it. Prevents too many log messages.
* Remove IsRegular file call as not needed anymore and leads to additional Stat calls
* Add documentation

See elastic#2277 and elastic#1767
tsg pushed a commit that referenced this pull request Sep 8, 2016
This implementation of symlinks takes a symlink and opens the original file. The only difference is that as Source the symlink path is reported. The advantage of this implementation is that everything related to file handling is identical with non symlink files as the original file is harvested. All close_* options work as expected. State information is stored for the original file, not the symlink itself. In case a symlink and original file are harvested, only one will be harvested as they share the same inode information.

* Add test to verify that symlinks are disabled by default
* Add test for symlink handling
* Improve error handling in harvester. Return proper error messages instead of logging it. Prevents too many log messages.
* Remove IsRegular file call as not needed anymore and leads to additional Stat calls
* Add documentation

See #2277 and #1767
ruflin added a commit to ruflin/beats that referenced this pull request Sep 9, 2016
This implementation of symlinks takes a symlink and opens the original file. The only difference is that as Source the symlink path is reported. The advantage of this implementation is that everything related to file handling is identical with non symlink files as the original file is harvested. All close_* options work as expected. State information is stored for the original file, not the symlink itself. In case a symlink and original file are harvested, only one will be harvested as they share the same inode information.

* Add test to verify that symlinks are disabled by default
* Add test for symlink handling
* Improve error handling in harvester. Return proper error messages instead of logging it. Prevents too many log messages.
* Remove IsRegular file call as not needed anymore and leads to additional Stat calls
* Add documentation

See elastic#2277 and elastic#1767
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve symlink handling in Filebeat
3 participants