rewrite Cronjob build triggering logic in python; read from influxdb #275
Conversation
```python
if not version:
    return None

url = f"{repo_url}/releases/download/{crate}-{version['vers']}/{crate}-{version['vers']}-{target_arch}.tar.gz"
```
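A minimal sketch of how that URL might be probed with `requests` (using the session-based connection pooling mentioned later in this thread); `release_exists` is a hypothetical helper name, not the PR's actual function:

```python
import requests

# One shared session so repeated checks reuse the connection pool.
session = requests.Session()

def release_exists(url: str) -> bool:
    # GitHub serves release assets via redirects; a 404 after following
    # them means the artifact has not been built yet.
    response = session.head(url, allow_redirects=True)
    return response.ok
```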
I think using a Python library for this would be more readable: https://pypi.org/project/PyGithub/

We could use the GitHub API to get all artifacts and cache it.

For the rate-limit problem, I think we can provide a read-only GitHub token to the GitHub API via secrets.
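A sketch of that idea with PyGithub, assuming the `{crate}-{version}` tag naming scheme used in the URL above; the repo name is real, but the helper itself is hypothetical:

```python
from github import Auth, Github

def list_release_assets(token: str, crate: str, version: str) -> list[str]:
    # Fetch every artifact attached to the release tagged `{crate}-{version}`.
    gh = Github(auth=Auth.Token(token))
    repo = gh.get_repo("cargo-bins/cargo-quickinstall")
    release = repo.get_release(f"{crate}-{version}")
    return [asset.name for asset in release.get_assets()]
```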
While `secrets.GITHUB_TOKEN` would work, it:

- has a rate limit of 1000 requests per hour
- has too many permissions within the repo

If we instead create a read-only token using a robot account (which I've set up), we will have:

- 5000 requests per hour
- read-only permissions, with no access to any private info
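As a quick way to confirm which limit applies, a sketch assuming the robot account's token is exposed to the job as an environment variable (the variable name here is made up):

```python
import os

from github import Auth, Github

# Hypothetical env var name; the real one would be wired up in the workflow.
gh = Github(auth=Auth.Token(os.environ["ROBOT_GITHUB_TOKEN"]))
limit = gh.get_rate_limit().core
print(f"{limit.remaining} of {limit.limit} requests left until {limit.reset}")
```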
@alsuren if you think this is too much work for this PR, we could leave this to a follow-up PR.
My main goals in this PR are to stop using the old stats server and to delete 400 lines of bash, so I am going to leave it like this.

I also quite like the symmetry of using exactly the same URL that the quickinstall client uses. The old code even used the same HTTP client (`curl`), but I decided that the connection pooling from `requests` was too much of a win to pass up.
> We could use the GitHub API to get all artifacts and cache it.
I'm a bit wary of a "do a lot of work and rely on caches to save us" approach. So far I've been taking the approach of "do as little work as possible while still making forward progress". I realise there is a cutover point, though: as we check more crates every hour, we will want some kind of cached pre-filter to reduce the number of HTTP requests we need to make. We would also need to be wary of stale caches: if the cache says the package is there, we can trust that, but if the cache says it's not there, we need to re-check to make sure we didn't build it in the last hour.
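A sketch of that asymmetric cache logic, assuming an in-memory dict and a hypothetical `check_upstream` callback; none of these names come from the PR:

```python
import time

CACHE_TTL_SECS = 3600  # negative results go stale after an hour

# key -> (built, checked_at)
_cache: dict[str, tuple[bool, float]] = {}

def is_built(key: str, check_upstream) -> bool:
    entry = _cache.get(key)
    if entry is not None:
        built, checked_at = entry
        if built:
            # Positive results can be trusted: released artifacts don't disappear.
            return True
        if time.time() - checked_at < CACHE_TTL_SECS:
            # Recent negative result; assume it is still missing.
            return False
    # No entry, or a stale negative one: re-check, since the package
    # may have been built within the last hour.
    built = check_upstream(key)
    _cache[key] = (built, time.time())
    return built
```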
On another, unrelated note: I think it would be really interesting to gather stats around our build processes as well as our client requests. This would let us make better decisions when we tweak the scheduling algorithm. I'm quite annoyed that the new InfluxDB Cloud org doesn't have good notebooks/dashboards; I really don't want to have to plug it into Grafana just to share some graphs.
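For illustration, recording a build event with the official `influxdb-client` package might look something like this; the URL, org, bucket, and tag values are all placeholders, not values from this project:

```python
import os

from influxdb_client import InfluxDBClient, Point
from influxdb_client.client.write_api import SYNCHRONOUS

# Placeholder connection details; real values would come from secrets.
with InfluxDBClient(
    url="https://example-region.aws.cloud2.influxdata.com",
    token=os.environ["INFLUX_TOKEN"],
    org="cargo-quickinstall",
) as client:
    write_api = client.write_api(write_options=SYNCHRONOUS)
    point = (
        Point("build")
        .tag("crate", "ripgrep")
        .tag("target", "x86_64-unknown-linux-gnu")
        .field("duration_secs", 42.0)
    )
    write_api.write(bucket="build-stats", record=point)
```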
Thank you, it LGTM, let's merge this in and switch to influx! I will update cargo-binstall to use the influx API afterwards and cut a new release, and we also need to update cargo-quickinstall as well.

We also need to get #165 merged, since that's what's currently powering the influx stuff (note that the vercel stats server is already forwarding stats to the new server, so as long as we get this done in the next month, it shouldn't make much difference).
For output, we could use grouped output, which is more readable:

```python
from contextlib import contextmanager

@contextmanager
def group_output(title):
    # Fold everything printed inside the block into a collapsible
    # group in the GitHub Actions log.
    print(f"::group::{title}")
    yield
    print("::endgroup::")

with group_output("Collecting popular crates"):
    print(...)
```
This points the "supported-targets" link in the readme back to the `supported-targets` file, which it had originally linked to and which its text suggests it is still intended to link to. After #275, the readme link to the supported targets list pointed to the `trigger-package-build.py` script. This appears to have been based on the idea of removing the `supported-targets` file. As noted in https://github.com/cargo-bins/cargo-quickinstall/pull/275/files#r1747798959, the file was not removed, but the link target was not restored. As long as the file exists, it is easier for users reading the readme to consult it than to examine a script. In addition, the link has been fully broken since #300, when the script was rewritten and renamed `trigger_package_build.py` (with underscores).
`p1YOOpyPWwrcJ5wqeBqpsIVPGMiKDHg_PUkUxFdwl9lW6W5jHLdxL67Fsz4STm1yDd8pwH42YHuX7ZxjRkOu4w==` for now. I will think about revoking it later.