Version 2 of the r image #167

Draft
wants to merge 73 commits into
base: main
from

Conversation

remlapmot
Contributor

@remlapmot remlapmot commented Nov 22, 2024

Closes #51
Because pandoc is updated in the v2 image.

Closes #74
The kableExtra package is included, but we have chosen not to include phantomjs, which we think has been unmaintained for a while.

Closes #75
As the v2 image installs binary R packages for Linux from the Posit Public Package Manager (formerly known as RStudio Package Manager).

Closes #88
Because survcomp is a Bioconductor package and we have currently chosen not to include Bioconductor packages.

Closes #101
As this is the next step for r-docker.

Closes #160
Adds sjPlot to the v2 image.

@remlapmot remlapmot force-pushed the update-to-v1-and-v2-with-dd4d-in-both branch 2 times, most recently from f7cd181 to b5b65dc Compare November 28, 2024 13:49
@remlapmot remlapmot changed the title Create version 1 and 2 in the build process and include dagitty and Will's dd4d R packages Version 2 of the r image Dec 4, 2024
@remlapmot remlapmot force-pushed the update-to-v1-and-v2-with-dd4d-in-both branch from 945f35a to 2f0068d Compare December 4, 2024 09:06
remlapmot and others added 30 commits December 12, 2024 15:39
Figure out the actual dependencies needed, rather than the -dev or -bin
versions. This removes 130 unneeded packages and saves 230 MB from the
image.

The only testing done here is that the package imports cleanly, which
definitely does fail if you remove the dependency entirely. It's possible
that we have removed some optional dependencies that are not loaded on
package import but are needed for certain functionality. However, it is
easy to add these back again if needed.

We only need a single cache for this, with subdirectories for each type.
More readable and simplifies cache management.

AFAICT, this has no effect, for two reasons.

 - ENV vars are only valid in the build stage that defined them, and we
   defined them in the `base-r` stage. In the `r` stage it will run with
   the default values, which should be empty anyway.

 - You cannot save space by deleting things with docker. The objects
   will still be in the previous layer. The only way to reduce space by
   deleting files is to remove them within the same RUN command that
   created them, so they are never persisted to a layer.
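
The two points above can be illustrated with a minimal Dockerfile sketch. This is not taken from the r-docker build: the stage names (`builder`, `final`), the variable `EXAMPLE_CACHE_DIR`, the base image tag, and the 50 MB dummy file are all hypothetical and chosen only for illustration. It assumes the second stage starts from a fresh base image rather than from the first stage.

```dockerfile
# Hypothetical two-stage build; all names here are illustrative assumptions.
FROM ubuntu:24.04 AS builder
# ENV set here applies to this stage (and to stages built FROM it) only.
ENV EXAMPLE_CACHE_DIR=/tmp/example-cache
# This RUN creates a layer containing a 50 MB file.
RUN mkdir -p "$EXAMPLE_CACHE_DIR" \
    && dd if=/dev/zero of="$EXAMPLE_CACHE_DIR/blob" bs=1M count=50
# Deleting in a separate RUN adds a new layer; the 50 MB stays in the layer above.
RUN rm -rf "$EXAMPLE_CACHE_DIR"

FROM ubuntu:24.04 AS final
# This stage starts from ubuntu, not from builder, so EXAMPLE_CACHE_DIR
# is unset and the shell expands it to an empty string.
RUN echo "cache dir is: '${EXAMPLE_CACHE_DIR}'"
# Creating and deleting within one RUN means the file never reaches a layer.
RUN dd if=/dev/zero of=/tmp/blob bs=1M count=50 && rm -f /tmp/blob
```

Building each stage separately (for example with `docker build --target builder .`) and comparing the output of `docker history` for the resulting images should show the 50 MB layer surviving in `builder` but no comparable layer in `final`.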

In addition, because we use a persistent build cache, we don't want to
clear it at all! That would wipe out the persistent cache on every build.

I manually checked that this made no difference to the size, and it did
not, and I poked around to check:

```
> library("pak")
> cache_summary()
$cachepath
[1] "/root/.cache/R/pkgcache/pkg"

$files
[1] 0

$size
[1] 0

> cache_list()
> A data frame: 0 × 6
> ℹ 6 variables: fullpath <chr>, path <chr>, package <chr>, url <chr>,
>   etag <chr>, sha256 <chr>
```
2 participants