-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix cmdmod misleading error message #49490
Fix cmdmod misleading error message #49490
Conversation
Very interesting. Do you have a GH issue filed on the memory leak that we can associate this PR with? |
I didn't create a GH issue just yet, because I assumed I'm going to look into that, but I can create one and if you want me to, I will do it tomorrow (take snapshots of memory and everything). |
I did try a vanilla salt on NILinuxRT (using both zmq and the tcp transport with the tcp_keepalive options we use) and I cannot reproduce this. However, as we are having our own beacons and modules and grains, and a windows master, we are seeing and reproducing the leak very easy. I believe we should still fix the message because is not accurate the way it is. |
salt/modules/cmdmod.py
Outdated
cmd if output_loglevel is not None else 'REDACTED', | ||
new_kwargs | ||
new_kwargs, | ||
exc.strerror |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would personally replace this with exc
. This will force .format()
to stringify it, which will in addition to the error message give us the error code. In addition, it will insulate this except
block against potential future issues, since not all exception classes have a strerror
attribute.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@terminalmage - okay, I will do that tomorrow morning. I had the same thing in mind, but exc will print: 'OSError: [Errno 12] Cannot allocate memory' as the 'reason', and I saw in official python docs that both OSError and IOError do have the 'strerror' attribute (but it can be None obviously for certain exceptions). Basically both OSError and IOError are constructed from an int and a string, the int being the errno, and the string the explanations for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rares-pop I think it will just show [Errno 12] Cannot allocate memory
.
But my main concern here is that someone later adds some other exception class to that except
and then causes an AttributeError
.
I'm not saying you did anything terribly wrong, just trying to make it a bit more future-proof.
@rares-pop Yes, I'm all for fixing the leak of course. :) Let's just make sure we get that issue filed so that we can get more eyes on the problem. Thanks! |
@cachedout - as I was mentioning in a previous comment today, I cannot reproduce the memory leak with pure salt (without our custom beacons and grains, modules, and running against a Windows salt-master). Let me try some more to try to reproduce it, maybe the problem is in our own code (not salt related) |
Signed-off-by: Rares POP <[email protected]>
It looks like the lint check failed to run last time through, we need to get that sorted before we can merge. |
Is it related to this change? Or is a system fault. I did run pylint --rcfile=.pylintrc salt/modules/cmdmod.py and the things I saw were not coming from the area I touched. |
Some weirdness happened last night with some connections to GitHub it looks like. I'm going to update the branch so this runs on the most recent code. Some of the test builds are missing as well, so some hooks must have gotten dropped somewhere. |
Lint is happy :) |
The cmdmod.py _run function is leaking memory (I don't know exactly why, or where), but here is a reproducing scenario:
Update: this doesn't reproduce on Ubuntu with master/minion on the same machine, the minion registered to 'localhost'. Maybe it's a configuration thing?
cycle these serveral times:
The memory on salt-minion increases from cycle to cycle until occupies more than 200mb (on a 512MB target) and at a subsequent clone of the process when running cmd.run, it raises an OSError: [Errno 12] Cannot allocate memory.
That investigation will be done on a separate thread, but here we should uncover the root cause of the problem, and not the 'command not found' which makes no sense in this case.
Previous Behavior
Previously, if there was no 'filename' on the exception, it assumed it was a 'command not found' exception, but that's not true. I.e. I'm getting “OSError: [Errno 12] Cannot allocate memory” which has no filename attribute
New Behavior
Add the OSError/IOError strerror message, which for the above is 'Cannot allocate memory', in order not to mislead users about the root cause of what happens.
Tests written?
No
Commits signed with GPG?
Yes