Possible next steps for r-docker #101

Open
iaindillingham opened this issue Jun 22, 2022 · 1 comment · May be fixed by #167

@iaindillingham
Member

Here are some possible next steps for r-docker, following a meeting with @bloodearnest, @milanwiedemann, and @rebkwok. They're arranged from least to most effort, over the short-to-medium term.

Continue to update the existing image. Address:

Fix the image.

Create a new image, and re-version the existing one.

  • The existing image could become r:legacy; the new image could become r:latest
  • Automatically open PRs for all studies that use the image, s/r:latest/r:legacy/
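
The s/r:latest/r:legacy/ substitution could be sketched roughly as below. This is only an illustration: it assumes each study references the image as r:latest inside a project.yaml file, and the function name pin_to_legacy and the sample YAML are hypothetical. Opening the PRs themselves is out of scope here.

```python
import re

# Match "r:latest" only when it stands alone as an image reference, so that
# names such as "jupyter:latest" or "r:latest-dev" are left untouched.
_R_LATEST = re.compile(r"(?<![\w.\-/:])r:latest(?![\w.\-])")


def pin_to_legacy(project_yaml: str) -> tuple[str, int]:
    """Return the rewritten text and the number of replacements made."""
    return _R_LATEST.subn("r:legacy", project_yaml)


# Hypothetical project.yaml fragment, for illustration only.
example = """\
actions:
  fit_model:
    run: r:latest analysis/model.R
  make_notebook:
    run: jupyter:latest jupyter nbconvert notebook.ipynb
"""

rewritten, count = pin_to_legacy(example)
print(count)
print(rewritten)
```

A study with a count of zero would not need a PR at all.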

Handle dependencies better.

  • We could create a volume for dependencies
  • We could create a private, limited archive network
  • We could create tooling to help users bundle dependencies with their studies (e.g. opensafely add-dependency ...)
@remlapmot
Contributor

(sorry for long post)

Below is what I think would be the most stable approach, and the easiest for you to maintain long term. It's what rocker/r-ver, Microsoft, and RStudio Cloud do: the key to stable maintenance for you is simply that each image installs packages from CRAN as of a single day.

  • As you say, keep one of the Dockerfiles above at legacy
  • Then, copying rocker/r-ver, have a tag at each R version number (e.g., 4.0.2, 4.0.5, 4.1.3) that builds on the corresponding rocker/r-ver:tag
  • For stability for you
    • each tagged image only installs packages from CRAN from a single date
    • this is also the approach of Microsoft's checkpoint package and what RStudio do in RStudio Cloud
    • The CRAN dates rocker/r-ver uses are listed in the table here (I think the rocker/r-ver images already have these CRAN dates set via an RSPM snapshot; we just need to check whether that's for source or binary packages, and if source, obtain the URL for the binaries from here)
    • and the backup here is to use an MRAN snapshot if RSPM ever goes down
  • Then, as additional packages are added, they are added into each image at whatever version they were at on each specific CRAN date (no reliance on renv, pak, or packrat, which are unreliable really)
    • and the same tags are maintained e.g. 4.0.2, 4.0.5, 4.1.3
    • or you could add an increasing number in 4th position, e.g. 4.0.2.2
  • If a user requests a package at a particular version you can provide them with another image which uses any CRAN date when that package was at that version on CRAN (by changing the CRAN date), which you could tag as say the R version number with the date, e.g. 4.0.2.2020-10-01
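
To make the tagging scheme above concrete, here is a minimal sketch of how the three tag shapes (4.0.2, 4.0.2.2, 4.0.2.2020-10-01) and a dated snapshot URL could be derived. The base URL is a placeholder, not the real RSPM or MRAN address, and the function names are hypothetical; the actual snapshot URL layout would need to be taken from the package manager's documentation.

```python
from datetime import date
from typing import Optional

# Placeholder host: substitute the real snapshot repository base URL.
SNAPSHOT_BASE = "https://packagemanager.example.org/cran"


def snapshot_url(cran_date: date, base: str = SNAPSHOT_BASE) -> str:
    """Build a repository URL frozen at a single CRAN date (assumed layout)."""
    return f"{base}/{cran_date.isoformat()}"


def image_tag(
    r_version: str,
    revision: Optional[int] = None,
    cran_date: Optional[date] = None,
) -> str:
    """Build an image tag from an R version plus an optional suffix.

    revision  -> "4.0.2.2"           (packages added at the same CRAN date)
    cran_date -> "4.0.2.2020-10-01"  (same R version, different CRAN date)
    """
    if revision is not None and cran_date is not None:
        raise ValueError("use either a revision or a CRAN date, not both")
    if revision is not None:
        return f"{r_version}.{revision}"
    if cran_date is not None:
        return f"{r_version}.{cran_date.isoformat()}"
    return r_version


print(image_tag("4.0.2"))
print(image_tag("4.0.2", revision=2))
print(image_tag("4.0.2", cran_date=date(2020, 10, 1)))
print(snapshot_url(date(2020, 10, 1)))
```

Keeping the CRAN date in the tag means the tag alone says which snapshot the image was built from.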

A subtlety here is that, if you followed this approach, you wouldn't want users on latest, because with rocker/r-ver the latest tag corresponds to the latest version of R and CRAN packages from the day it's run.

  • this is currently 4.2.0 but this will change to 4.2.1 tomorrow as the latest version of R is released
  • Instead you would want users on one of the versioned numbers, e.g. probably previous release (4.0.0) or last patch release of previous sequence (4.1.3)
  • If a user insisted upon the latest version of R you could give them a new tag based on a date at say 4.2.0.2022-06-21
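
As an illustration of the "last patch release of previous sequence" rule, the sketch below picks that version from a list of available tags. The function name and the tag list are hypothetical, and it assumes plain three-part version tags.

```python
def parse_version(tag: str) -> tuple[int, ...]:
    """Turn "4.1.3" into (4, 1, 3) so versions compare numerically."""
    return tuple(int(part) for part in tag.split("."))


def last_patch_of_previous_series(tags: list[str]) -> str:
    """Return the highest patch release of the second-newest major.minor series."""
    versions = sorted({parse_version(tag) for tag in tags})
    series = sorted({version[:2] for version in versions})
    if len(series) < 2:
        raise ValueError("need at least two major.minor series to choose from")
    previous = series[-2]
    best = max(version for version in versions if version[:2] == previous)
    return ".".join(str(part) for part in best)


# Hypothetical set of available tags.
available = ["4.0.2", "4.0.5", "4.1.3", "4.2.0"]
print(last_patch_of_previous_series(available))
```

Comparing integer tuples rather than strings avoids mis-ordering versions such as 4.0.10 and 4.0.9.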

Also, I have made a second version of the reverse engineered Dockerfile which minimises the number of layers (apparently better Dockerfile practice).

  • original version is in reverse-engineer branch (too many layers really) Dockerfile here

  • minimised layers version reverse-engineer-min-layers branch Dockerfile here

I will try to keep both up to date if additional packages are added.

@remlapmot remlapmot self-assigned this Oct 29, 2024
@remlapmot remlapmot linked a pull request Nov 22, 2024 that will close this issue