-
-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feature: cache downloaded wheel information #2276
Conversation
def add_wheel_to_update_log(wheel, for_py_version, app_data): | ||
embed_update_log = app_data.embed_update_log(wheel.distribution, for_py_version) | ||
logging.debug("adding %s information to %s", wheel.name, embed_update_log.file) | ||
u_log = UpdateLog.from_dict(embed_update_log.read()) | ||
if any(version.filename == wheel.name for version in u_log.versions): | ||
logging.warning("%s already present in %s", wheel.name, embed_update_log.file) | ||
return | ||
version = NewVersion(wheel.name, datetime.now(), release_date_for_wheel_path(wheel.path), "download") | ||
u_log.versions.append(version) # always write at the end for proper updates | ||
embed_update_log.write(u_log.to_dict()) | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has been added to periodic_update.py
as it is the only place dealing with the embed_update_log
(which is not used only for updates anymore).
@@ -250,16 +265,20 @@ def _run_do_update(app_data, distribution, embed_filename, for_py_version, perio | |||
now = datetime.now() | |||
if periodic: | |||
source = "periodic" | |||
# mark everything not updated manually as source "periodic" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All the modifications in _run_do_update
are what lead me to bump the JSON embed format version.
# we need to sort versions for the next update to work properly | ||
u_log.versions.sort(reverse=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The log needs to be sorted. It was sorted by design before. That's not the case anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain why is no longer the case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would remove the need to sort the list.
I don't mind sorting the log; why would you consider the other solution less risky?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain why is no longer the case?
because other sources logs are always appended rather than inserted.
I don't mind sorting the log; why would you consider the other solution less risky?
I'm feeling a bit hesitant to rely on the tuple ordering given:
def as_version_tuple(version):
result = []
for part in version.split(".")[0:3]:
try:
result.append(int(part))
except ValueError:
break
if not result:
raise ValueError(version)
return tuple(result)
I feel that the inner try/catch is hiding something I did not think about.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not think of pre-release versions... In this specific sample, it does not matter much but it's not sorted correctly.
{
"completed": "2022-01-01T21:48:50.557509Z",
"periodic": false,
"started": "2022-01-01T21:48:40.230835Z",
"versions": [
{
"filename": "pip-21.3.1-py3-none-any.whl",
"found_date": "2022-01-01T14:22:37.717326Z",
"release_date": "2021-10-22T15:57:09.000000Z",
"source": "manual"
},
{
"filename": "pip-20.3-py2.py3-none-any.whl",
"found_date": "2022-01-01T21:46:25.286240Z",
"release_date": "2020-11-30T12:47:49.000000Z",
"source": "manual"
},
{
"filename": "pip-20.0-py2.py3-none-any.whl",
"found_date": "2022-01-01T21:48:23.634459Z",
"release_date": "2020-01-21T11:42:56.000000Z",
"source": "manual"
},
{
"filename": "pip-20.3b1-py2.py3-none-any.whl",
"found_date": "2022-01-01T21:44:30.798536Z",
"release_date": "2020-10-31T19:25:12.000000Z",
"source": "manual"
}
]
}
Further-more, a manual/periodic update should never lead to defaulting to pre-release version. This promise is broken (at least for manual updates).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given an added test with a pre-release version downloaded as "pinned" version indeed broke manual updates, I went with the alternate implementation of _run_do_update which first splits the UpdateLog in 2 depending if it's coming from a previous update or another source.
Edit: this is used by the latest commit There's another way to do the update which might be a less bit risky for periodic updates:
This would remove the need to sort the list. |
Edit: this is no longer the case with the latest push. |
Writing things down makes me realize that there are still some things left to do so converting to draft. PS: while thinking about the time penalty I mentioned, I thought about caching PyPI requests on disk as well and use the |
Add downloaded wheel information in the relevant JSON embed file to prevent additional downloads of the same wheel. This requires to bump the JSON embed file version.
I rewrote the |
assert read_dict.call_count == 1 | ||
assert write.call_count == 1 | ||
|
||
|
||
def test_download_manual_stop_at_first_usable(tmp_path, mocker, freezer): | ||
def test_download_periodic_stop_at_first_usable2(tmp_path, mocker, freezer): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the 2 suffix?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there's already a test_download_periodic_stop_at_first_usable
test and I didn't see how to do this as a parametrize test easily so adding a suffix 2
on the second one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of magic numbers, I generally add to the test name what differentiates them 🤔 for future reference.
Signed-off-by: Bernát Gábor <[email protected]>
Add downloaded wheel information in the relevant JSON embed file to prevent additional downloads of the same wheel.
This requires to bump the JSON embed file version once again as I was lacking some tests cases in #2273.
I will add some inline comments for some choices that I made.
Fixes #2268
Thanks for contributing, make sure you address all the checklists (for details on how see
development documentation)!
tox -e fix_lint
)docs/changelog
folder