-
Notifications
You must be signed in to change notification settings - Fork 452
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove unused core dependencies #7534
Conversation
1208f12
to
b2523e9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to provide feedback on multiple aspects:
(first, TLDR version:)
- I believe that the
@synchronized
decorator isn't an optimal abstraction since it conceals lock handling, which, ideally, should be explicitly managed. - There is a newly introduced bug within the
WatchDog.run
method due to the current refactoring. - This bug doesn't surface since the
WatchDog.run
method isn't invoked. - The
WatchDog
class, originally designed when Tribler was based on threads and not asyncio, is now essentially obsolete except for one function:WatchDog.get_threads_info
. - The
WatchDog.get_threads_info
function is flawed as it doesn't account for potential thread context switches during its execution.
Therefore, I recommend the complete removal of the tribler.core.utilities.instrumentation
module. We could rework WatchDog.get_threads_info
as a standalone function after addressing its concurrency issue, while the rest of the tribler.core.utilities.instrumentation
module can be discarded.
Let's delve into specifics:
The @synchronized
decorator may not be necessary in our context.
Its purpose is to abstract away locks. However, such abstraction can inadvertently result in deadlocks, given that locks are non-composable and the order of their usage should be explicit. If we're employing the @synchronized
decorator on various functions, we need to ensure they don't call each other to avoid deadlock scenarios. This might be challenging when the lock usage is obscured by the decorator.
In our codebase, the @synchronized decorator
is used only on three functions within the same class:
class WatchDog(Thread):
def __init__(self):
...
@synchronized
def _reset_state(self):
...
@synchronized
def register_event(self, event, name, timeout=10):
...
@synchronized
def unregister_event(self, name):
...
def run(self):
events_to_unregister = []
while not self.should_stop:
sleep(0.2)
with self._synchronized_lock:
...
Note that the run method uses self._synchronized_lock
, which was injected by the previous version of the @synchronized
decorator. The new decorator version doesn't inject this lock, causing the run
method to encounter an uninitialized attribute. However, since the WatchDog
thread is not currently initiated in Tribler, this error in the run
method doesn't trigger an exception.
We could rewrite the class to function without the @synchronized
decorator. Here's an example that seems clearer:
class WatchDog(Thread):
def __init__(self):
...
self._lock = Lock()
def _reset_state(self):
with self._lock:
...
def register_event(self, event, name, timeout=10):
with self._lock:
...
def unregister_event(self, name):
with self._lock:
...
def run(self):
events_to_unregister = []
while not self.should_stop:
sleep(0.2)
with self._lock:
...
However, we're not using this class currently, aside from the single WatchDog.get_threads_info
method. Thus, we could discard the WatchDog
class entirely, barring this one function.
WatchDog.get_threads_info
fails to account for the possibility of a thread context switch during its execution. This can be remedied by temporarily adjusting sys.setswitchinterval
during the function's execution. However, caution is advised since we're already modifying sys.setswitchinterval
elsewhere in the codebase.
b2523e9
to
25d11fc
Compare
@kozlovsky, thank you for your review. I've taken your comments on board and made several adjustments. Specifically, I've extracted the |
25d11fc
to
1bd8ce9
Compare
header = f"{filename}:{frame.f_lineno} {frame.f_code.co_name}" | ||
local = '' | ||
for key, value in list(frame.f_locals.items()): | ||
value = shorten(repr(value), width=value_width) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Taking the repr
of an unknown value can trigger an exception; I think we should add a try/except block here. Something like:
try:
value = repr(value)
except Exception as e:
value = f'<exception when taking repr: {e.__class__.__name__}: {e}>'
value = shorten(value, width=value_width)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you provide an example of a dangerous call?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is expected for the __repr__
method of any object to return a string and don't raise any exception. With this, a __repr__
method is just a usual method and can raise an exception due to a bug in its code.
class AClassWithABadRepr1:
def __repr__(self):
1/0
In Tribler, we had a case of incorrect implementation of the alert.__str__()
method that raises UnicodeDecodeError
(#5740). In this case, the alert.__repr__()
implementation does not raise an error, but it is possible to discover a similar problem later with an incorrect __repr__
implementation of some class.
As another example, historically, some repr
methods of container classes could not handle situations when the container itself was added as its item (or as an item of another item, etc.) and raised RecursionError
. Now Python standard structures like lists and dicts handle this case correctly, but it is possible that some user-defined classes still have this problem.
Some blog posts describe the problems with the repr of arbitrary objects.
Python issue tracker has an issue for a case like ours: traceback.FrameSummary does not handle exceptions from repr()
. They decided not to catch these exceptions because Python follows the "fail fast" approach:
Bugs should be reported, not masked. I don't think that FrameSummary should be an exception to this.
But they agree that the case is real, and the exception is possible. We don't follow the "fail fast" approach in Tribler, so we don't want Tribler to crash because a user opened a debug window.
Also, unittest.util
has a safe_repr
function, and probably for a reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not entirely sure about your statement that 'we don't follow the "fail fast" approach in Tribler'. In the previous dev meeting, the majority of the developers voted to maintain the current approach, which is a form of "fail fast".
However, I do agree with your argument and appreciate the clear examples you've provided.
I have now implemented your suggestion.
with _switch_interval_lock: | ||
previous_value = sys.getswitchinterval() | ||
try: | ||
sys.setswitchinterval(value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: Usually the call like this is placed before the try
block:
change_something()
try:
...
finally:
change_something_back()
In the example above, if change_something()
behaves atomically and raises an exception, that means that "something" hasn't changed, and restoring it is unnecessary.
In our case, sys.setswitchinterval(value)
also behaves atomically. It can throw an exception if the value has an incorrect type, but in that case, the internal variable wasn't changed, and restoring anything is unnecessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will make the change, although this rule seems odd to me. The pattern:
try:
change_something()
finally:
change_something_back()
is just as effective as using atomic operations. Furthermore, it is more generic as it doesn't necessitate knowing which operations are atomic and which are not, for both the reader and the writer of the code.
2c7d773
to
b89d275
Compare
This PR cleans up the
requirements-core.txt
by removing unused dependencies (pyasn1
,service-identity
)Also, it contains refactoring of the
instrumentation.py
to make it possible to remove thedecorator
(library) dependency.