-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CA-388587: Fix filtering the xapi-clusterd
db
and log any errors
#67
Conversation
Signed-off-by: Bernhard Kaindl <[email protected]>
Python3 coverage comment from https://github.com/marketplace/actions/pytest-coverage-comment
|
bad4e07
to
f9eb22c
Compare
7a082a9
to
f9eb22c
Compare
cfd0cdb
to
f9eb22c
Compare
xapi-clusterd
db
and log any errors
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR has refactoring of the code ,descriptive documentation and additional test case .Overall code looks sufficient and clean.
Signed-off-by: Bernhard Kaindl <[email protected]>
The 'token' is an API token that is used to authenticate all API requests to the xapi-clusterd daemon. - Apply the fix originally developed and tested by Edwin. - Update the tests: - Ensure that the fix works and keeps working across future changes. Co-authored-by: Bernhard Kaindl <[email protected]> Signed-off-by: Edwin Török <[email protected]>
… instead Handle the situations where configs doesn't have "pems" keys: Yangtze does not have REQ-403 and the REQ-403 API changed later to pass the PEMs in files, so the clusterd DB only has their path names: Ignore the pems keys if not there and sanitize the rest of the config. - Applies the fix originally developed and tested by Edwin. - Update the tests: - Ensure that the fix works and keeps working across future changes. Co-authored-by: Bernhard Kaindl <[email protected]> Signed-off-by: Edwin Török <[email protected]>
Signed-off-by: Bernhard Kaindl <[email protected]>
3ba1e16
to
f9eb22c
Compare
@lindig, @ashwin9390 : I have only done clean-ups, streamlined the test fixtures and their use. Maybe review (and approve) again? |
try: | ||
with open(XAPI_CLUSTERD, 'r') as f: | ||
clusterd_data = json.load(f) | ||
|
||
except Exception as e: # pylint: disable=broad-exception-caught |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why split the try/with here? AFAICT the end result is the same: we log the exception and return the empty string. And the traceback should tell us exactly which line raised the error, but maybe I'm missing something here, is there a reason to keep try/with split like this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, the reason was rather long-winded with at least two aspects:
The original reason for the try-split split was:
I had set .pylintrc
max-try-statements=11
and this was exceeded by the added lines.
I've set this limit with an intention:
- If it is exceeded, I want developers to think:
- Should any error in all of this big try-statement really result in the same error-handling?
The max-size of the try-statements was the lever that I found for this, to make the code at least not regress further.
So I want to give devs (including myself) a heads-up, when their try-statements grow beyond a certain size, aka "out of leaps and bounds".
So I do not want to raise that limit, and I also like to avoid t just add a comment to disable the warning:
Firstly, it gives code a bad look when disabling warnings.
Secondly, I think that pylint
implements this limit for a good reason (or I found):
max-try-statements=11
is what works for the status-report tool, and adding the additional code inside would have made this the largest try-statement in the status-report script.
Because these try-statements are combined with broad-except, (except Exception) they catch everything in the try-statement.
These large try-statements with broad-except have been a major pain point for me for finding Python3 bugs - the actual cause of Python3 bugs.
Errors in them have not been straight-forward to find because nothing tells you that there was even an error:
Usually they are even combined with output_data = ""
, and an empty file (in the happy case) is the only indication that something is wrong.
And this is also what was the case in this function, the broad Exception handler
threw away the KeyError
that the expected pems
was no longer in the JSON data.
So when testing the xapi-clusterdb
, the only error indication was that the output file was empty.
So what I'd like to go for is that when such an Exception happens, I want a more targeted handling of the exception.
But it's just preference that I'd like to have a more specific Exception handling when the sources are completely different causes.
And the size was my only handle, I had. I set it so that it would trigger when the existing try-statements grow even larger.
Here is where I wanted to aim for:
try:
with open(XAPI_CLUSTERD, 'r') as f:
clusterd_data = json.load(f)
except Exception as e: # pylint: disable=broad-exception-caught
return log_internal_error(e)
Any Exception raised here means either invalid JSON - so something that should be an alert that something is bad.
This Exception should actually go into a separate bugball file like "system-errors.out" or so.
In the other case any error its definitely a bug in bugtool itself that must be fixed.
Since the data has been red, just some fields are not there, it's a different kind of error for me.
that should automatically be directed to a developer (since that looks like an unexpected change in the JSON).
OTOH, a non-existing clusterd-db file is a normal case in the non-clustered case.
For the usual case where the file does not exist, we can just check if it is there.
But if the file is missing in the clusterd-db case, it would a problem that would prevent clustering from working.
So I did this with a look towards enabling more diagnostics later,
make it possible to direct the exception causes towards different channels.
As it has 3 approvals now, I'd merge and move forward. Smaller tweaks are of course always possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the real reason is to avoid broad try-excepts, can we do that here? There are only a limited number of exceptions that 'open' can raise, and KeyError is certainly not one we should catch, and I understand the desire to not mask these errors.
I think the reason for catching broad errors in status-report may be a good one though: if one of the status report gathering functions fail you don't want to give up on the entire script, there may still be valuable information you can gather from other files.
Perhaps a better approach (that could be addressed in a future PR) would be:
- have some kind of top-level try/catch handler whenever we invoke a function to collect something for capability "X", and if that fails we log the exception somewhere (general xen-bugtool.out or the file that was supposed to contain our output, etc.), but keep going and capture other things. Then the broad try/except would only wrap a very small number of lines (i.e. 1 or 2, just invoke the gathering function for that capability)
- you've already added a check on whether the file exists which avoids logging errors unnecessarily when they are not errors
- status-report could keep a count of how many of these broad try/except it has caught and logged, and exit non-zero at the end, which should allow XenRT to detect this and fail (we should whether it'd still store the already created bugtool in that case, as that is still very useful in debugging what went wrong)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very good proposals! Yes, they should be implemented!
I created #72 to implement them and added my answers!
I only added you as an additional assignee to show that we should do this, if it adds an entry into your GitHub notifications that you like to avoid, of course just remove yourself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It all looks reasonable but I'm not a good person to review this.
CA-358870: Backport fixes from #67 to filter the Xapi `clusterd/db` for XS8/Yangtze
Fix filtering the XAPI Clustder DB as found by CA-388587 (found using testing by @ashwin9390)
Commit 1 and 1 setup unit testing:
The Fix 1 and Fix 2 commits reuse original patches by @edwintorok for this issue.
Fix 3 applies the lesson learnt:
The tests that are created and fixed as described in the module docstring of the test case module.