Remove unused core dependencies #7534

drew2a · 2023-07-10T13:01:19Z

This PR cleans up requirements-core.txt by removing unused dependencies (pyasn1, service-identity).

It also refactors instrumentation.py so that the decorator library dependency can be removed.

kozlovsky

I'd like to provide feedback on multiple aspects:

First, the TL;DR version:

I believe that the @synchronized decorator isn't an optimal abstraction since it conceals lock handling, which, ideally, should be explicitly managed.
There is a newly introduced bug within the WatchDog.run method due to the current refactoring.
This bug doesn't surface since the WatchDog.run method isn't invoked.
The WatchDog class, originally designed when Tribler was based on threads and not asyncio, is now essentially obsolete except for one function: WatchDog.get_threads_info.
The WatchDog.get_threads_info function is flawed as it doesn't account for potential thread context switches during its execution.

Therefore, I recommend the complete removal of the tribler.core.utilities.instrumentation module. We could rework WatchDog.get_threads_info as a standalone function after addressing its concurrency issue, while the rest of the tribler.core.utilities.instrumentation module can be discarded.

Let's delve into specifics:

The @synchronized decorator may not be necessary in our context.

Its purpose is to abstract away locks. However, such abstraction can inadvertently result in deadlocks, given that locks are non-composable and the order of their usage should be explicit. If we're employing the @synchronized decorator on various functions, we need to ensure they don't call each other to avoid deadlock scenarios. This might be challenging when the lock usage is obscured by the decorator.
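To make the deadlock risk concrete, here is a minimal sketch. It assumes a hypothetical @synchronized decorator backed by a per-instance, non-reentrant threading.Lock, and a hypothetical Registry class; neither is taken from Tribler. The acquire timeout exists only so that the demo reports the problem instead of hanging forever.

```python
import threading
from functools import wraps


def synchronized(method):
    """Hypothetical decorator: run the method under a per-instance, non-reentrant lock."""
    @wraps(method)
    def wrapper(self, *args, **kwargs):
        # The timeout is for demonstration only; a plain `with self._lock:` would block forever.
        if not self._lock.acquire(timeout=0.1):
            raise RuntimeError(f"deadlock: {method.__name__} could not acquire the lock")
        try:
            return method(self, *args, **kwargs)
        finally:
            self._lock.release()
    return wrapper


class Registry:
    """Hypothetical class, used only to illustrate the call pattern."""

    def __init__(self):
        self._lock = threading.Lock()
        self.events = {}

    @synchronized
    def register(self, name):
        self.events[name] = True

    @synchronized
    def unregister(self, name):
        self.events.pop(name, None)

    @synchronized
    def reset(self):
        for name in list(self.events):
            # Nested call: tries to take the lock that reset() already holds.
            self.unregister(name)


registry = Registry()
registry.register("a")
try:
    registry.reset()
except RuntimeError as exc:
    print(exc)
```

Nothing in the body of reset() hints that a lock is involved, which is why the nested acquisition is easy to miss in review. A reentrant lock would hide this particular case, but not ordering problems between different locks.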

In our codebase, the @synchronized decorator is used only on three functions within the same class:

class WatchDog(Thread):
    def __init__(self):
        ...

    @synchronized
    def _reset_state(self):
        ...
        
    @synchronized
    def register_event(self, event, name, timeout=10):
        ...

    @synchronized
    def unregister_event(self, name):
        ...

    def run(self):
        events_to_unregister = []
        while not self.should_stop:
            sleep(0.2)
            with self._synchronized_lock:
                ...

Note that the run method uses self._synchronized_lock, which was injected by the previous version of the @synchronized decorator. The new decorator version doesn't inject this lock, so run would raise an AttributeError as soon as it accessed the missing attribute. However, since the WatchDog thread is never started in Tribler, the error does not surface.
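The hidden coupling can be sketched as follows. Both decorators are hypothetical stand-ins, not the actual old and new Tribler implementations: one stores its lock on the instance, the other keeps it in a closure. The Old and New classes and their method names are likewise invented for illustration.

```python
import threading
from functools import wraps


def synchronized_injecting(method):
    """Hypothetical 'old-style' decorator: stores the lock on the instance on first use."""
    @wraps(method)
    def wrapper(self, *args, **kwargs):
        lock = self.__dict__.setdefault("_synchronized_lock", threading.RLock())
        with lock:
            return method(self, *args, **kwargs)
    return wrapper


def synchronized_private(method):
    """Hypothetical 'new-style' decorator: keeps the lock in a closure, invisible to the class."""
    lock = threading.RLock()

    @wraps(method)
    def wrapper(self, *args, **kwargs):
        with lock:
            return method(self, *args, **kwargs)
    return wrapper


class Old:
    @synchronized_injecting
    def touch(self):
        pass

    def run_once(self):
        # Works only because the decorator created this attribute as a side effect.
        with self._synchronized_lock:
            return "ok"


class New:
    @synchronized_private
    def touch(self):
        pass

    def run_once(self):
        # The attribute is never created, so this raises AttributeError.
        with self._synchronized_lock:
            return "ok"


old = Old()
old.touch()
print(old.run_once())

new = New()
new.touch()
try:
    new.run_once()
except AttributeError as exc:
    print(type(exc).__name__, exc)
```

Even the injecting variant is fragile in this sketch: run_once would fail if it were called before any decorated method had created the attribute.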

We could rewrite the class to function without the @synchronized decorator. Here's an example that seems clearer:

from threading import Lock, Thread
from time import sleep


class WatchDog(Thread):
    def __init__(self):
        ...
        self._lock = Lock()

    def _reset_state(self):
        with self._lock:
            ...
        
    def register_event(self, event, name, timeout=10):
        with self._lock:
            ...

    def unregister_event(self, name):
        with self._lock:
            ...

    def run(self):
        events_to_unregister = []
        while not self.should_stop:
            sleep(0.2)
            with self._lock:
                ...

However, we're not using this class currently, aside from the single WatchDog.get_threads_info method. Thus, we could discard the WatchDog class entirely, barring this one function.

WatchDog.get_threads_info fails to account for the possibility of a thread context switch during its execution. This can be mitigated by temporarily raising the thread switch interval with sys.setswitchinterval for the duration of the call. However, caution is advised: we already call sys.setswitchinterval elsewhere in the codebase, and the two changes could interfere with each other.
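A minimal sketch of that idea as a standalone function is shown below. The helper name switch_interval, the one-second value, and the shape of the returned dictionaries are assumptions made for illustration, not the refactored Tribler code. Raising the interval only makes a switch less likely: blocking calls still release the GIL, and restoring the saved value can clash with other code that sets the interval in the meantime.

```python
import sys
import threading
import traceback
from contextlib import contextmanager


@contextmanager
def switch_interval(seconds):
    """Hypothetical helper: temporarily set the interpreter's thread switch interval."""
    previous = sys.getswitchinterval()
    sys.setswitchinterval(seconds)
    try:
        yield
    finally:
        # Restores the value seen on entry; this is wrong if someone else changed it meanwhile.
        sys.setswitchinterval(previous)


def get_threads_info():
    """Return the name and formatted stack of every thread known to the interpreter."""
    with switch_interval(1.0):  # assumed value, long enough for this short section
        names = {thread.ident: thread.name for thread in threading.enumerate()}
        frames = sys._current_frames()  # CPython-specific: thread id -> topmost frame
        return [
            {
                "thread_id": thread_id,
                "thread_name": names.get(thread_id, "<unknown>"),
                "frames": traceback.format_stack(frame),
            }
            for thread_id, frame in frames.items()
        ]


for info in get_threads_info():
    print(info["thread_id"], info["thread_name"], len(info["frames"]))
```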

src/tribler/core/utilities/tests/test_instrumentation.py

drew2a · 2023-07-11T12:37:40Z

I've rewritten the formatting of thread frames:

drew2a · 2023-07-11T13:08:32Z

@kozlovsky, thank you for your review. I've taken your comments on board and made several adjustments. Specifically, I've extracted the get_threads_info method from the WatchDog class, completely removed the instrumentation.py file, refactored get_threads_info, and written tests to ensure it works properly.