-
Notifications
You must be signed in to change notification settings - Fork 55
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix/warnings_sort_QualityEngine.report #11
Conversation
@@ -68,15 +67,14 @@ def dtypes(self, dtypes: dict): | |||
|
|||
def store_warning(self, warning: QualityWarning): | |||
"Adds a new warning to the internal 'warnings' storage." | |||
self._warnings.add(warning) | |||
self._warnings.append(warning) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should maintain uniqueness of warnings, otherwise running the same failing test multiple times will generate duplicated warnings which are meaningless. this is solved for the .report method with the set but not for the .warnings property. wdyt of the strategy below?
def store_warning(self, warning: QualityWarning):
"Adds a new warning to the internal 'warnings' storage."
if warning not in self._warnings:
self._warnings.append(warning)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems right if this holds true, when we do:
warning not in self._warnings
we are running the warning __eq__
method against each content of _warnings, right? Doing a set of objects also resorts to __eq__
to filter uniques right?
The user might raise the same warning, in the same test but with different parameters. We would not filter in that case using your proposal, right?
Besides that I wonder if self.warnings should be an accessible property and not private. I think we probably should only give access to report and get_warnings methods. ATM get_warnings is not filtering unique warnings but should right on the beginning, just like report. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removing warnings as an accessible property would mean switching 3 occurrences of self.warnings to self._warnings and removing the warnings property definition in core engine.
Across the different tutorials the sample warning demo should be changed to engine.get_warnings()[x] too.
Wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- I'm not 100% sure of what is the python implementation but from local testing it seems that defining __eq__ is enough to test the presence in arrays (like we are doing in the
warning not in self._warnings
). From our implementation, we are comparing the {category, test, description, priority} attributes on the __eq__ , so if two warnings of the same test have different parameters, the {category, test} are the same but the at least the {description} should be different (we often add some success/failure metrics to the warning description). - Given the necessary sort (by priority) and the optional filtering, we can remove the
warnings
as a property and keep only theget_warnings
method
…ydataai/ydata-quality into fix/report_method_qualityengine
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like nothing odd remains.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work! 👍 🚀
Core changes
Minor changes