Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Load checkpoints in background #7024

Merged

Conversation

kozlovsky
Copy link
Contributor

@kozlovsky kozlovsky commented Aug 31, 2022

This PR fixes #7032

@kozlovsky kozlovsky force-pushed the fix/load_checkpoints_in_background branch from ccad263 to 8437d84 Compare September 6, 2022 07:54
@kozlovsky kozlovsky marked this pull request as ready for review September 6, 2022 08:29
@kozlovsky kozlovsky requested review from a team and drew2a and removed request for a team September 6, 2022 08:29
def notify_shutdown_state(self, state):
self.notifier[notifications.tribler_shutdown_state](state)

async def shutdown(self, timeout=30):
self.cancel_pending_task("start")
Copy link
Contributor

@drew2a drew2a Sep 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The string constant "start" used at least twice in the same context, so it would be nice if it will be extracted to a const.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered this, but currently, Tribler does not have a tradition of defining constants for task names. Constants are not used in any invocation of register_task / cancel_pending_task. It looks strange to set it just for the "start" task but not for, say, the "download_states_lc" task and then for other tasks, and it is unclear where to stop. It looks like this change is slightly out of the scope of the current PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that the Tribler does not have a tradition of defining constants for task names.

You introduces the new task In this PR, called "start" and for this change (for this task) we can apply best practice and extract the string to a constant.

It looks strange to set it just for the "start" task but not for, say, the "download_states_lc" task and then for other tasks

For me it doesn't look strange.

A refactoring of the whole source base to apply "the new tradition of defining constants for task names" is indeed out of scope of the current PR.

But the actual changes are within the scope of the current PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a class-level constant for a task name

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🙏

Comment on lines 342 to 344
'total': self.download_manager.checkpoints_count,
'loaded': self.download_manager.checkpoints_loaded,
'all_loaded': self.download_manager.all_checkpoints_are_loaded,
Copy link
Contributor

@drew2a drew2a Sep 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to see all these string constants extracted as a consts to decrease a duplication and decrease a chance of misprinting while editing etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that static API is less error-prone, but on the other side, I think it is important to use a consistent approach to all data elements. It is strange to have constants just for some fields.

Like, if we have a dedicated constant for downloads['checkpoints']['total'] key name, why do we not have a similar constant for download["channel_download"] or download["speed_down"]? If we want to change our approach for working with keys of JSON elements, we need to apply it consistently to all keys.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that static API is less error-prone, but on the other side, I think it is important to use a consistent approach to all data elements. It is strange to have constants just for some fields.

Extracting fields to the const is extremely easy task (thank to PyCharm), and this action noticeably improves the codebase.
I can do it for you if you want :)

Copy link
Contributor Author

@kozlovsky kozlovsky Sep 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I correctly understand what you want, I'm afraid I have to disagree with this change.

This is how I see what you are suggesting:

  • For JSON keys, string constants should be used instead of plain string values in Swagger definitions and the endpoint's code.
  • The GUI code should also use the same string constants for JSON keys, or extracting the constants has little sense.
  • To be usable on the GUI side, string constants for JSON keys should be moved into a separate module (like part of the common package we had before) and imported both from Core and GUI.
  • This new module will contain hundreds of JSON keys from multiple endpoints. The downloads_endpoint alone has about 50 different JSON keys used on different levels of the response structure. In the original JSON structure, the nested keys are located together, and in the new module, they will be mixed with keys for other endpoints in a flat list.
  • The resulting code will have an additional level of indirection, which make all JSON-related code harder to read.
  • An intermediate state after introducing the partial changes will look even uglier when some JSON keys are listed in a separate thesaurus and other keys are hard coded.

In the end, we will get a half-backed attempt to make JSON code slightly more statically typed with code much harder to read than now and only slightly less error-prone (if at all, as using JSON keys from a vast list of constants listed in a separate module introduces new types of possible coding errors).

If we want to make code better statically typed, we should switch from raw JSON to something like Pydantic and use FastAPI to automatically generate Swagger definitions from Pydantic structures. It is much better than replacing hardcoded keys with constants in all JSON structures.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what I'm asking for:

diff --git a/src/tribler/core/components/libtorrent/restapi/downloads_endpoint.py b/src/tribler/core/components/libtorrent/restapi/downloads_endpoint.py
index c1170cd..acdb705 100644
--- a/src/tribler/core/components/libtorrent/restapi/downloads_endpoint.py
+++ b/src/tribler/core/components/libtorrent/restapi/downloads_endpoint.py
@@ -37,6 +37,12 @@ from tribler.core.utilities.simpledefs import (
 from tribler.core.utilities.unicode import ensure_unicode, hexlify
 from tribler.core.utilities.utilities import froze_it
 
+ALL_LOADED = 'all_loaded'
+
+LOADED = 'loaded'
+
+TOTAL = 'total'
+
 
 def _safe_extended_peer_info(ext_peer_info):
     """
@@ -221,9 +227,9 @@ class DownloadsEndpoint(RESTEndpoint):
                         'time_added': Integer
                     }),
                     'checkpoints': schema(Checkpoints={
-                        'total': Integer,
-                        'loaded': Integer,
-                        'all_loaded': Boolean,
+                        TOTAL: Integer,
+                        LOADED: Integer,
+                        ALL_LOADED: Boolean,
                     })
                 }),
             }
@@ -243,9 +249,9 @@ class DownloadsEndpoint(RESTEndpoint):
         get_files = params.get('get_files', '0') == '1'
 
         checkpoints = {
-            "total": self.download_manager.checkpoints_count,
-            "loaded": self.download_manager.checkpoints_loaded,
-            "all_loaded": self.download_manager.all_checkpoints_are_loaded,
+            TOTAL: self.download_manager.checkpoints_count,
+            LOADED: self.download_manager.checkpoints_loaded,
+            ALL_LOADED: self.download_manager.all_checkpoints_are_loaded,
         }
 
         if not self.download_manager.all_checkpoints_are_loaded:

Copy link
Contributor Author

@kozlovsky kozlovsky Sep 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand, and I'm questioning the benefit of it. If applying this approach consistently, we will have around 50 similar constants, 50 more lines of code, and then we need to use these constants on the GUI side to make them really useful. And to use them on the GUI side, we need to extract them to a separate module.

The usual "rule of three" of refactoring assumes that "two instances of similar code do not require refactoring, but when similar code is used three times, it should be extracted". If we only replace the constant in two places, the cost/benefit ratio is usually not favorable for refactoring.

I prefer later switch to Pydantic and FastAPI and avoid explicit Swagger definitions altogether.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And to use them on the GUI side, we need to extract them to a separate module.

No, we can use constants from CORE in the GUI, as we already doing now. But I don't encourage you to do that.

The usual "rule of three" of refactoring assumes that "two instances of similar code do not require refactoring, but when similar code is used three times, it should be extracted".

I disagree with this approach: to use the rule of three everywhere. It is useful when you do refactoring of a complex code and it is hard to say whether it is beneficial to extract some common logic or not (will you add unnecessary complexity or not?).

But this case is different. We talking about extracting the string that was used twice in the same file and in the same PR. Copy-paste is a straightforward and easy-to-apply approach, but it may lead to problems in the future.

As a reviewer, I should check whether you wrote it correctly or made a mistake. I believe that you copy the value and past a value, so in this case, the chance of doing a mistake is minimal (but not zero).

But in the future, during a hotfix or refactoring it may happen that in one place the value will be changed and in the other not. However, this risk is easily eliminated. If you do only one action, press alt+ctrl+c and extract the string to a constant, the case that I described above will be impossible.

So, if I choose from two options:

  1. To write a code with the probability of mistyping now or in the future
  2. To write a code without the probability of mistyping now or in the future
    then I would choose the second option.

That is why I suggest doing this in the current PR.

I prefer later switch to Pydantic and FastAPI and avoid explicit Swagger definitions altogether.

Agree, it would be nice. But it also may not happen.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the constants that you requested. I still think it makes the code slightly worse, but it is not an important enough issue to start bikeshedding about it :)

src/tribler/gui/tests/test_core_manager.py Show resolved Hide resolved
@kozlovsky kozlovsky force-pushed the fix/load_checkpoints_in_background branch from 5b627dd to f85e7ea Compare September 8, 2022 06:28
@kozlovsky kozlovsky force-pushed the fix/load_checkpoints_in_background branch from f85e7ea to d10f724 Compare September 8, 2022 07:17
@kozlovsky kozlovsky requested a review from drew2a September 8, 2022 08:09
Copy link
Contributor

@drew2a drew2a left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for addressing all the comments 🙏 .

@kozlovsky kozlovsky merged commit 10c7154 into Tribler:release/7.12 Sep 8, 2022
@kozlovsky kozlovsky deleted the fix/load_checkpoints_in_background branch September 8, 2022 16:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants