Get binary R packages from packagemanager.rstudio.com #1104
packagemanager.rstudio.com is a CRAN mirror provided by RStudio, with *binary packages* prebuilt for many Linux distributions! https://www.rstudio.com/blog/announcing-public-package-manager/ has more excellent detail. It cuts down install times for R packages by almost 90% in some cases!

Like MRAN (which we use now), they also provide a daily snapshot of CRAN (https://docs.rstudio.com/rspm/news/#rstudio-package-manager-2021090). The URL for the CRAN snapshot of a particular date can be fetched via an API call. We call that API, and if there is no snapshot for that date (anything before Oct 2017), we fall back to MRAN. This PR also adds a test for that fallback.

One possible issue with switching existing binder repos from source builds to binary builds is that the binary builds sometimes require an apt package to be installed, and will fail if it is not. We had to install the zmq library apt package, for example. Source installs compile zmq from source, and skipping that compilation is where the speedup comes from. But unlike Python wheels or conda packages, these binary builds are not self-contained: they are linked against apt packages from the specific distros. So some repos that worked before might fail now. We can choose a more recent cut-off date to prevent this from happening.
We were doing this from an old MRAN snapshot. I moved the pin a little ahead, so IRkernel can also be installed from CRAN instead of from GitHub. R <= 4.0 gets the old version, and anything newer gets a more recent version of devtools. This gives us fast installs for IRkernel with binary packages. Also adds R 4.0 and R 4.1 tests.
Is it guaranteed that CRAN will have a daily snapshot for each day after |
@manics I had initially read https://docs.rstudio.com/rspm/news/#rstudio-package-manager-2021090 and came to the conclusion they'll have a snapshot for all days, but on re-reading it I'm not sure. I'll make the MRAN / packagemanager.rstudio.com cutoff date-based, and retry with slightly older snapshots if it can't find one for that date.
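The retry idea described in this comment could look roughly like the sketch below. It is not the code from this PR: `find_snapshot_url`, `max_days_back`, and the `lookup` callable (standing in for the real snapshot API call) are hypothetical names, and the URLs are placeholders.

```python
from datetime import date, timedelta
from typing import Callable, Optional


def find_snapshot_url(
    wanted: date,
    lookup: Callable[[date], Optional[str]],
    max_days_back: int = 7,
) -> Optional[str]:
    """Return the URL of the newest snapshot on or before `wanted`.

    `lookup` is an assumption standing in for the real API call: it
    returns a snapshot URL for a date, or None if that day has none.
    """
    for days_back in range(max_days_back + 1):
        url = lookup(wanted - timedelta(days=days_back))
        if url is not None:
            return url
    # Nothing found in the window; the caller decides what to do next
    # (for example, fall back to another mirror).
    return None


def fake_lookup(day: date) -> Optional[str]:
    """Hypothetical data: pretend exactly one day has no snapshot."""
    if day == date(2017, 10, 12):
        return None
    return f"https://example.invalid/cran/{day.isoformat()}"
```

With this fake data, asking for the missing day returns the snapshot from the day before, and asking for any other day returns that day's own snapshot. Passing the lookup in as a callable keeps the retry logic testable without network access.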
Looking at https://packagemanager.rstudio.com/client/#/repos/1/overview it appears as if they've gone and snapshotted all of the days (with two exceptions in October 2017)? |
@RaoOfPhysics ah, glad you found the holes in October!!! Will help me test and make sure we cover those cases. |
Here you go, @yuvipanda. |
- Install a different version of RStudio for R < 4.1, as latest RStudio doesn't seem to support those. And newer RStudio isn't supported on these older R versions.
- Clean up how Shiny is installed: install it with the same apt invocation as RStudio (saves time), and install shiny-proxy from PyPI instead of GitHub. The release on PyPI is the same as our previous GitHub pin.
- Remove outdated comment about different behavior for R 3.6. I think now we get all our R versions from the same apt repo. Plus, the conditional was adding more scripts than just adding extra apt package repos.
- MRAN doesn't seem to have R 4.1 specific snapshots, so let's default to RSPM for anything 4.1+.
- Otherwise, snapshot dates in 2022 will result in using RSPM.
Ok, I've changed the logic for when RSPM is used as the default: it now applies when either the requested snapshot date is in 2022 or later, or R 4.1 is requested. I think with these two, we shouldn't break any old repos that were dependent on source builds.
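As a rough illustration of the rule described here (RSPM by default for R 4.1+ or for snapshot dates in 2022 or later, MRAN otherwise), a minimal sketch follows. The function names are hypothetical and this is not the code merged in this PR.

```python
from datetime import date
from typing import Tuple


def parse_r_version(version: str) -> Tuple[int, ...]:
    """Turn a version string such as "4.0.5" into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def use_rspm_by_default(r_version: str, snapshot_date: date) -> bool:
    """Decide whether binary packages from RSPM are the default.

    Older R versions with older snapshot dates stay on MRAN, so repos
    that relied on source builds keep working.
    """
    return parse_r_version(r_version) >= (4, 1) or snapshot_date.year >= 2022
```

Under this rule, R 4.0 with a 2021 snapshot date stays on MRAN, while the same R version with a 2022 snapshot date switches to RSPM.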
I haven’t thought too carefully about it, but I think the logic makes sense. Why R4.1+ and not R4.0+ though? |
@RaoOfPhysics The R4.1 decision is because R4.0 is the latest I see in MRAN (https://mran.microsoft.com/timemachine). I'm mostly trying to make sure we break as few existing repositories as possible...
Unfortunately it looks like R 4.1 is being installed even when we ask for R 4.0. Trying to figure out why. |
Otherwise latest version was being installed, giving us R 4.1 even when we ask for 4.0
Looking at |
And add another R test for R4.0 + rspm
Based on output of apt-cache madison r-base-core
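To illustrate why an exact pin is needed, the sketch below picks a full package version for a requested R release from `apt-cache madison`-style output (lines of the form `package | version | source`). The helper names and the sample listing are hypothetical; the version strings and URLs are made up for illustration and are not taken from this PR.

```python
from typing import List, Optional


def madison_versions(output: str) -> List[str]:
    """Extract the version column from `package | version | source` lines."""
    versions = []
    for line in output.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) == 3 and fields[1]:
            versions.append(fields[1])
    return versions


def pick_version(output: str, r_version: str) -> Optional[str]:
    """Return the first listed version belonging to the requested R release."""
    for version in madison_versions(output):
        if version.startswith(r_version + "."):
            return version
    return None


# Illustrative listing only; real output depends on the configured apt repos.
SAMPLE = """\
r-base-core | 4.1.2-1.2004.0 | https://example.invalid/ubuntu focal-cran40/ Packages
r-base-core | 4.0.5-1.2004.0 | https://example.invalid/ubuntu focal-cran40/ Packages
r-base-core | 4.0.4-1.2004.0 | https://example.invalid/ubuntu focal-cran40/ Packages
"""
```

The selected string can then be passed to apt using its `package=version` syntax, so that a request for R 4.0 is not silently satisfied by the newest available release.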
I've fixed the tests as well now! |
A few minor comments, once those are answered this looks good to merge!
Was accidentally included along with the 3.6.3-1bionic upgrade for 3.6
🥳
@manics awesome! THANK YOU SO MUCH! |
- Explains jupyterhub#1104
- Advertises that we get RStudio 'for free' when R is installed
packagemanager.rstudio.com is a CRAN mirror provided
by RStudio, with binary packages prebuilt for many Linux
distributions! https://www.rstudio.com/blog/announcing-public-package-manager/
has more excellent detail. It cuts down install times for R packages
by almost 90% in some cases!
Like MRAN (which we use now), they also provide a daily snapshot
of CRAN at that date
(https://docs.rstudio.com/rspm/news/#rstudio-package-manager-2021090).
The URL for CRAN for a particular date can be fetched via an API
call. We call that API, and we retry with earlier dates if we can't find
a snapshot for that date. However, note that RSPM seems to do server-side
magic to give us packages from the earlier date anyway, so we don't need
to do the MRAN backoff behavior yet.
One possible issue with switching existing binder repos from source
builds to binary builds is that the binary builds sometimes require
an apt package to be installed, and will fail if it is not. We had to
install the zmq library apt package, for example. Source installs
compile zmq from source, and skipping that compilation is where the
speedup comes from. But unlike Python wheels or conda packages, these
binary builds are not self-contained: they are linked against apt
packages from the specific distros. So some repos that worked before
might fail now.
Due to this, we default to RSPM only if one of the following conditions is true:
seem to support R 4.1?
We also bring in newer versions of RStudio based on what R version they support,
and a matching jupyter-rsession-proxy. Fixes #1041
A bug where asking for R 4.0 gave us R 4.1 is also fixed, and we add a separate test for
that as well. Fixes #1077
TODO:
before looking into MRAN
R version