-
Notifications
You must be signed in to change notification settings - Fork 77
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixes #7288: Policy does not get deleted when changing relays #2591
Fixes #7288: Policy does not get deleted when changing relays #2591
Conversation
def cleanPromiseFolderOfDeletedNode : Box[Seq[BetterFile]] = { | ||
import BetterFile._ | ||
def getNodeFolders(file: BetterFile): List[BetterFile] = { | ||
file.collectChildren(d => d.isDirectory && d.children.exists(_.name == "rules")).flatMap { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am really scared by the cost of this search with 10 000 nodes; doing it at each policy generation will slow it down even further
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Myabe it could be a batch that runs, every ... day ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(mean an independant system that runs outside policy generation)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The cost should be really small. It's just collecting a biles names/properties. But OK to add a dedicated timer on that.
It's not logical at all to make it run once in a while if the goal is to limit the timespan where non relevant files are present (if it's just space-consumption control, why not).
And if can be put out of the generation itself, asynchronuously triggered so that it does not delay it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(there is no reason file operation like that should be slow when nothing is deleted, and if they are, we need to understand why, because we have a problem)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does not use much resource, apart lots of fd in an older version of better-files but fixed by updating it to latest version
Commit modified |
7eeb37e
to
115eb8a
Compare
Commit modified |
115eb8a
to
5cc44f3
Compare
Nil | ||
} | ||
nodeFolder :: childrens | ||
}.toList |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this will result in "Too many open files" when system is large
5cc44f3
to
054b9c0
Compare
Commit modified |
val restExtractor = restExtractorService | ||
|
||
def process0(version: ApiVersion, path: ApiPath, req: Req, params: DefaultParams, authzToken: AuthzToken): LiftResponse = { | ||
apiv11service.reloadTechniques(req, params) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should not be reloadTechnique
val restExtractor = restExtractorService | ||
|
||
def process0(version: ApiVersion, path: ApiPath, req: Req, params: DefaultParams, authzToken: AuthzToken): LiftResponse = { | ||
apiv11service.reloadTechniques(req, params) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
apiv11service.reloadTechniques(req, params) | |
apiv11service.cleanPromises(req, params) |
Commit modified |
054b9c0
to
b6b4819
Compare
Commit modified |
b6b4819
to
9cbc03f
Compare
def cleanPromiseFolderOfDeletedNode(currentNodes : Map[NodeId,NodeInfo]) : Box[Seq[File]] = { | ||
import File._ | ||
def getNodeFolders(file: File): Iterator[File] = { | ||
file.children.filter{d => d.name !="rules" && d.isDirectory && d.children.exists{b => b.name == "rules"}}.flatMap { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
java.nio.file.FileSystemException: /var/rudder/share/546f908c-ef75-492d-bc30-5577fa44099a: Too many open files
at sun.nio.fs.UnixException.translateToIOException(UnixException.java:91)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
at sun.nio.fs.UnixFileSystemProvider.newDirectoryStream(UnixFileSystemProvider.java:427)
at java.nio.file.Files.newDirectoryStream(Files.java:457)
at java.nio.file.Files.list(Files.java:3451)
at better.files.File.list(File.scala:764)
at better.files.File.children(File.scala:766)
at com.normation.rudder.batch.CleanPromisesFolder.$anonfun$cleanPromiseFolderOfDeletedNode$1(CleanPromisesFolder.scala:99)
at com.normation.rudder.batch.CleanPromisesFolder.$anonfun$cleanPromiseFolderOfDeletedNode$1$adapted(CleanPromisesFolder.scala:99)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
on my test platform I have 4551 folders in /var/rudder/share.
It means file.children has 4551 open files at least, plus for each of them, we look for the children, so again 4551 files opened.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it should keep them open thought.
Well, that ticket need perhaps more work to minimize that number.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still have the error when there are folder to delete (if none are to be deleted, all is fine)
[2019-11-15 09:36:21] ERROR scheduledJob - Error when deleting unreferenced promise directory <- /var/rudder/share/be28c259-8e34-4447-a6fe-d661d9af975f: Too many open files
[2019-11-15 09:36:21] ERROR scheduledJob - Exception was:
java.nio.file.FileSystemException: /var/rudder/share/be28c259-8e34-4447-a6fe-d661d9af975f: Too many open files
at sun.nio.fs.UnixException.translateToIOException(UnixException.java:91)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
at sun.nio.fs.UnixFileSystemProvider.newDirectoryStream(UnixFileSystemProvider.java:427)
at java.nio.file.Files.newDirectoryStream(Files.java:457)
at java.nio.file.Files.list(Files.java:3451)
at better.files.File.list(File.scala:764)
at better.files.File.children(File.scala:766)
at com.normation.rudder.batch.CleanPromisesFolder.$anonfun$cleanPromiseFolderOfDeletedNode$1(CleanPromisesFolder.scala:100)
at com.normation.rudder.batch.CleanPromisesFolder.$anonfun$cleanPromiseFolderOfDeletedNode$1$adapted(CleanPromisesFolder.scala:100)
Commit modified |
c4ddea4
to
39d9523
Compare
Commit modified |
39d9523
to
5742956
Compare
Commit modified |
9 similar comments
Commit modified |
Commit modified |
Commit modified |
Commit modified |
Commit modified |
Commit modified |
Commit modified |
Commit modified |
Commit modified |
@@ -217,6 +219,7 @@ object RestTestSetUp { | |||
, fakeItemArchiveManager | |||
, fakePersonIndentService | |||
, fakeRepo | |||
, new CleanPoliciesFolder(???, 24.hours) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that may not be what you want to keep :)
Commit modified |
3 similar comments
Commit modified |
Commit modified |
Commit modified |
logger.debug(s"***** starting batch that purge unreferenced policies directory, every ${updateInterval.toString()} *****") | ||
scheduler.scheduleWithFixedDelay(1.hour, updateInterval) { | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need to complete the scheduler logic. Also, can you put the code that is also used by API elsewhere, in a "CleanPoliciesFolderService", and only keep the batch logic here ?
# This allows you to schedule the purge of unused promised folders (Deleted Nodes, or Nodes that has moved to another relay/policy server) | ||
# | ||
# | ||
# Defaults: check runs every 24 hours. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
garbage collection runs ...
|
||
# | ||
# Promise folder cleaning | ||
# This allows you to schedule the purge of unused promised folders (Deleted Nodes, or Nodes that has moved to another relay/policy server) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This allows you to schedule s/the// purge of unused s/promised/policies folders (s/D/d/eleted s/N/n/odes,
or s/N/n/odes that s/has// moved to another relay or policy server)
# | ||
|
||
# Interval in hours | ||
rudder.batch.clean.promises.interval=24 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/promises/policies/
@@ -334,6 +334,16 @@ object RudderConfig extends Loggable { | |||
} | |||
} | |||
|
|||
val RUDDER_BATCH_CLEAN_PROMISES_INTERVAL = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/PROMISES/POLICIES/
This one will go in 5.0.16 |
Commit modified |
Fixes #7288: Policy does not get deleted when changing relays
…relays Fixes #7288: Policy does not get deleted when changing relays
…anging relays Fixes #7288: Policy does not get deleted when changing relays
…when changing relays Fixes #7288: Policy does not get deleted when changing relays
…eleted when changing relays Fixes #7288: Policy does not get deleted when changing relays
…t get deleted when changing relays Fixes #7288: Policy does not get deleted when changing relays
…does not get deleted when changing relays Fixes #7288: Policy does not get deleted when changing relays
…Policy does not get deleted when changing relays Fixes #7288: Policy does not get deleted when changing relays
…#7288: Policy does not get deleted when changing relays Fixes #7288: Policy does not get deleted when changing relays
… Fixes #7288: Policy does not get deleted when changing relays Fixes #7288: Policy does not get deleted when changing relays
… fixup! Fixes #7288: Policy does not get deleted when changing relays Fixes #7288: Policy does not get deleted when changing relays
… fixup! fixup! Fixes #7288: Policy does not get deleted when changing relays Fixes #7288: Policy does not get deleted when changing relays
… fixup! fixup! fixup! Fixes #7288: Policy does not get deleted when changing relays Fixes #7288: Policy does not get deleted when changing relays
… fixup! fixup! fixup! fixup! Fixes #7288: Policy does not get deleted when changing relays Fixes #7288: Policy does not get deleted when changing relays
0fab3f1
to
3f97eff
Compare
… fixup! fixup! fixup! fixup! fixup! Fixes #7288: Policy does not get deleted when changing relays Fixes #7288: Policy does not get deleted when changing relays
Commit modified |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems ok to me. @ncharles do you want to test it a last time on the load server?
this still fails with {"action":"cleanPolicies","result":"error","errorDetails":"An error while cleaning policies folder through rest API <- Error when deleting unreferenced policies directory <- Too many files open (currently -1, but number may have dropped), maybe you can raise open file descriptor limit (ulimit -n)"} |
This PR is now several years old. I'm closing it, there's not need to keep it open with no changes for so long time. |
https://issues.rudder.io/issues/7288
We need to update better-files to 3.8.0 because of pathikrit/better-files#279
This was causing fd leak when deleting files