Feature request: Use last modified files when log limit is hit. #2748

Closed
jannylund opened this issue Dec 6, 2018 · 7 comments

@jannylund

Describe what happened:

  • I configured DD-Agent on an Atlassian Bamboo server that runs a few hundred builds per day. The log structure for build output is similar to: /bamboo_home/xml-data/builds/*/download-data/build_logs/*.log
  • I get a warning similar to: 2018-12-06 19:48:26 UTC | WARN | (file_provider.go:98 in FilesToTail) | Reached the limit on the maximum number of files in use: 100
  • This matches what is on disk: I have 50,000+ files in different subfolders, and they need to be kept for archival reasons. Only a few hundred (or a few thousand) are modified each day.

Describe what you expected:

  • I expected DD-Agent to look at the most recently modified files instead of opening every file each time.

Suggestion:
Modify the method FilesToTail to order files by last-modified time instead of alphabetically.
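
A minimal sketch of the idea, written in Python for brevity rather than in the agent's Go code. The function name `newest_files` and its parameters are hypothetical and are not part of the agent; this only illustrates picking the most recently modified matches up to a limit.

```python
import glob
import heapq
import os


def newest_files(pattern: str, limit: int) -> list[str]:
    """Return up to `limit` paths matching `pattern`, newest mtime first.

    Hypothetical helper for illustration; not the agent's FilesToTail.
    """
    candidates = []
    for path in glob.iglob(pattern):
        try:
            candidates.append((os.stat(path).st_mtime, path))
        except OSError:
            # The file disappeared between globbing and stat; skip it.
            continue
    # nlargest avoids fully sorting every match when `limit` is small.
    return [path for _, path in heapq.nlargest(limit, candidates)]
```

With the layout above, a call such as `newest_files("/bamboo_home/xml-data/builds/*/download-data/build_logs/*.log", 100)` would return at most 100 paths. Note that it still issues one `stat` call per matching file.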

@tmichelet
Contributor

Hi @jannylund, thanks for reporting this. We are having a look and will get back to you in the next few days.

@jannylund
Author

@tmichelet, is my understanding correct that the files are temporarily opened, tailed, and then closed? Or are they kept open permanently? I'm thinking I could write a shell script to symlink them into a log folder as a quick workaround.

@achntrl
Member

achntrl commented Dec 7, 2018

@jannylund, they are kept open permanently (only closed on logrotate or when the tailer is stopped).

@jannylund
Author

I made a workaround in the meantime.

  1. Create a new folder for DD to watch.
  2. Write a shell script that uses find to collect the latest log files into a list.
  3. Symlink them into the new folder (and remove links to older files).
  4. Schedule the script from crontab to run every minute.

This causes the DD Agent to start tailing a file when its symlink is added and to stop when the symlink is deleted.
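
The steps above can be sketched roughly as follows. This is not the original script (which used a shell script with find); it is a Python approximation under stated assumptions: the function name `refresh_symlinks`, the age cutoff, and the scheme of flattening each path into a link name are all hypothetical choices made for illustration.

```python
import glob
import os
import time


def refresh_symlinks(pattern: str, link_dir: str, max_age_seconds: float) -> None:
    """Keep `link_dir` populated with symlinks to recently modified files only."""
    os.makedirs(link_dir, exist_ok=True)
    cutoff = time.time() - max_age_seconds

    # Map link name -> target for every file modified since the cutoff.
    wanted = {}
    for path in glob.iglob(pattern):
        try:
            if os.stat(path).st_mtime < cutoff:
                continue
        except OSError:
            continue  # file vanished while scanning
        target = os.path.abspath(path)
        # Assumption: flatten the full path so equal file names in
        # different build folders do not collide.
        wanted[target.strip(os.sep).replace(os.sep, "_")] = target

    # Remove links whose targets are no longer recent (or no longer exist).
    for name in os.listdir(link_dir):
        link = os.path.join(link_dir, name)
        if os.path.islink(link) and name not in wanted:
            os.remove(link)

    # Add links for newly recent files.
    for name, target in wanted.items():
        link = os.path.join(link_dir, name)
        if not os.path.islink(link):
            os.symlink(target, link)
```

Run once a minute from cron, with `link_dir` set to the folder the agent watches, this keeps the number of linked files bounded by how many logs changed within `max_age_seconds`.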

@tmichelet
Contributor

@jannylund, this workaround should indeed do the job! On our end, we're queuing some work to tackle this issue properly. Thanks again for reporting it.

@achntrl
Member

achntrl commented Dec 11, 2018

Hi @jannylund, what is the name format of your files? And do you have control over the names of those files?

Ordering 50k+ files by last-modified date is quite resource-intensive and might seriously increase the agent's footprint on its host. We're investigating a gentler way of doing it based on the filenames. Sharing a tree of your folder would be helpful.

Thank you!

@achntrl
Member

achntrl commented Jan 23, 2019

We just implemented this feature in #2796. It will be released in 6.10 at the end of February!

@achntrl achntrl closed this as completed Jan 23, 2019