Support for Artifactory #602

Closed
brianrepko opened this issue Dec 23, 2020 · 15 comments

Comments

@brianrepko

There is a similar issue logged for packrat as well - rstudio/packrat#583

Artifactory does not store archived versions in a CRAN-like way - see

The last issue from Artifactory looks like they have changed this in their code but I'm not clear on what version this fix is included in.

I thought about re-implementing this PR from remotes - r-lib/remotes#441 - thinking that will also fix renv but now I don't believe that that is true (since renv only Suggests remotes)

Am I correct in reading the code that install.R eventually calls r_cmd_install on packages?
And the code that works out the correct location of the package is in retrieve.R?

Not sure if it makes sense to change renv_retrieve_repos_archive to also try with
repo <- file.path(repo, "src/contrib/Archive", record$Package, record$Version)

or if renv_retrieve_repos_impl should do the test - this is only broken for Archive'd versions...
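To make the difference between the two layouts concrete, here is a minimal sketch of the candidate archive URLs. The helper name `archive_url_candidates` and the example record are hypothetical and not part of renv; only the path components come from the description above.

```r
# Hypothetical helper (not part of renv): build the two candidate archive
# URLs for a package record.
archive_url_candidates <- function(repo, record) {
  name <- sprintf("%s_%s.tar.gz", record$Package, record$Version)
  c(
    # CRAN-style: all archived versions share one directory per package
    cran = file.path(repo, "src/contrib/Archive", record$Package, name),
    # Artifactory-style: one extra directory level per version
    artifactory = file.path(repo, "src/contrib/Archive",
                            record$Package, record$Version, name)
  )
}

record <- list(Package = "PackageA", Version = "1.0.0")
urls <- archive_url_candidates("https://example.com/artifactory/CRAN", record)
print(urls)
```

Only the archive root differs between the two; the tarball name is the same in both cases.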

But am I correct in assuming that a fix to remotes won't do anything for renv?

@brianrepko
Author

Also not sure if it makes sense to be able to flag the Repository in renv.lock as a "versioned-archives" Repository - then we could build the URL properly based on the Repository settings

@kevinushey
Collaborator

But am I correct in assuming that a fix to remotes won't do anything for renv?

That's correct; renv is independent of remotes. I think the solution for renv would be to try alternate locations here:

renv/R/retrieve.R

Lines 553 to 565 in d10f636

renv_retrieve_repos_archive <- function(record) {
  name <- sprintf("%s_%s.tar.gz", record$Package, record$Version)
  for (repo in getOption("repos")) {
    repo <- file.path(repo, "src/contrib/Archive", record$Package)
    status <- catch(renv_retrieve_repos_impl(record, "source", name, repo))
    if (identical(status, TRUE))
      return(TRUE)
  }
  return(FALSE)
}

But we'd need to make this "cheap"; e.g. renv should have a way of quietly detecting whether a file is available at a particular URI. I can try exploring this later.
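As a rough illustration of what a "cheap" availability check could look like, the sketch below asks only for the response headers using base R's `curlGetHeaders()` and inspects the final status code. This is an assumption about one possible approach, not renv's implementation; the helper name `url_available` and its `fetch_headers` argument are hypothetical, and the argument exists only so the logic can be exercised without network access.

```r
# Hypothetical helper (not renv's implementation): report whether a URL
# appears to exist by looking at the status of a headers-only request.
# `fetch_headers` is injectable so the logic can be tested offline.
url_available <- function(url, fetch_headers = curlGetHeaders) {
  # connection failures are treated as "not available" rather than errors
  headers <- tryCatch(fetch_headers(url), error = function(e) NULL)
  if (is.null(headers))
    return(FALSE)

  # curlGetHeaders() attaches the last-received status code as an attribute
  status <- attr(headers, "status")
  !is.null(status) && status >= 200L && status < 300L
}
```

Because no package tarball is downloaded, a probe like this could be run against each candidate archive location before committing to one.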

@kevinushey
Collaborator

Is there a way to tell if a CRAN repository is an Artifactory repository (e.g. checking the headers or something from a web request)?

@brianrepko
Author

brianrepko commented Dec 24, 2020

I just hit both our base URL (the URL that is in the repos option) as well as src/contrib under that and there are three headers that tell you if it is an Artifactory repository.

X-Artifactory-Id
X-Artifactory-Node-Id
Server

Here are the values for ours (sort of)

X-Artifactory-Id: 601b67c93bea52ad53d8fbf7a2d69fed56e4dbe9
X-Artifactory-Node-Id: hostname
Server: Artifactory/6.20.1

The Node-Id header may only be present if Artifactory is set up as a clustered application.
But X-Artifactory-Id should always be there for an Artifactory CRAN repo.
And the Server header can be parsed for the version, should this handling only be needed below a given Artifactory version.

That makes this super cheap - you can then know that a Repository has CRAN structure or Artifactory structure
and based on that just build the repo variable above appropriately.
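A minimal sketch of that header check, assuming the header lines are available as a character vector (for example from base R's `curlGetHeaders()`). The helper name `detect_artifactory` is hypothetical, and the sample values are the ones listed above.

```r
# Hypothetical helper (not part of renv): inspect response header lines
# for the Artifactory markers described above.
detect_artifactory <- function(headers) {
  headers <- trimws(headers)

  # header names are case-insensitive, so ignore case when matching
  has_id <- any(grepl("^X-Artifactory-Id:", headers, ignore.case = TRUE))

  # pull the version out of a "Server: Artifactory/x.y.z" line, if present
  pattern <- "^Server:[[:space:]]*Artifactory/"
  server <- grep(pattern, headers, ignore.case = TRUE, value = TRUE)
  version <- NULL
  if (length(server) > 0) {
    raw <- sub(pattern, "", server[[1]], ignore.case = TRUE)
    version <- numeric_version(raw, strict = FALSE)
  }

  list(artifactory = has_id, version = version)
}

headers <- c(
  "HTTP/1.1 200 OK\r\n",
  "X-Artifactory-Id: 601b67c93bea52ad53d8fbf7a2d69fed56e4dbe9\r\n",
  "X-Artifactory-Node-Id: hostname\r\n",
  "Server: Artifactory/6.20.1\r\n"
)
info <- detect_artifactory(headers)
print(info$artifactory)
print(info$version)
```

Returning the version as a `numeric_version` makes a comparison such as `info$version < "7.0.0"` straightforward if the layout only applies to some releases.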

@kevinushey
Collaborator

Awesome!

For now, I've implemented something that should work as well -- renv now tries to figure out the structure used for archived packages in the repository, and selects + saves the appropriate one. Would you be able to give the development version of renv a try and let me know if it works for you?

@brianrepko
Author

I will but not until I return home after the first of the year

@brianrepko
Author

@kevinushey I'm having trouble getting this tested - some of my initial tests had some issues so I wanted to start completely clean - however I'm on a system that is R 3.6.1 and cannot use usethis 2.0.0 because of the change from git2r to gert (libgit2).

Here is what I'm doing

mkdir -p $ROOT/renvtest/lib/R/3.6
mkdir -p $ROOT/renvtest/cache

edit ~/.Renviron
# set R_USER_LIB to $ROOT/renvtest/lib/R/3.6
# set RENV_PATHS_CACHE to $ROOT/renvtest/cache

edit ~/.Rprofile
# set CRAN mirror and WORK_CRAN (Artifactory)

R
# verify
.libPaths() 
getOption("repos")

# setup library
install.packages('remotes')
remotes::install_github('rstudio/renv')
remotes::install_version(package='usethis',version='1.6.3',upgrade=T)

# create project
usethis::create_project("<$ROOT>/renvtest/project", rstudio=T, open=T)

# init renv
renv::init(bare=T)
* Project '$ROOT/renvtest/project' loaded. [renv 0.12.3-72]
* renv activated -- please restart the R session.
> q('no')

cd project
R
# Bootstrapping renv 0.12.3-72 -----------------------------------------------
* Downloading renv 0.12.3-72 from GitHub ... FAILED
Error in bootstrap(version, libpath) : failed to download renv 0.12.3-72

So I can't seem to init a clean project with a clean library.
I have to use bare=T because otherwise renv tries to upgrade/install usethis 2.0.0

@brianrepko
Author

My earlier test (yesterday) was with 0.12.3-71 - not sure if I should try to install that instead? renv::upgrade always gives me an rdb corrupt file issue

@brianrepko
Author

aha - looking at bootstrap.R I needed to set a GITHUB_PAT...

@brianrepko
Author

Nuts - I need the RENV_PATHS_ROOT changed - not just cache... redoing this whole thing

@brianrepko
Author

so all worked great on this - I'll upload the "script" that I used to test it tomorrow but no problems at all.
I'm not totally sure if / where you store the originating URL - my only concern on that is when the package version is current, the URL is different than when it is no longer current (src/contrib vs src/contrib/Archive) but I literally don't see where this is kept.

@brianrepko
Author

here is my "memory" of what I did - again it all looks good to me

mkdir -p $ROOT/renvtest/lib/R/3.6
mkdir -p $ROOT/renvtest/renv

edit ~/.Renviron
# set R_USER_LIB to $ROOT/renvtest/lib/R/3.6
# set RENV_PATHS_ROOT to $ROOT/renvtest/renv
# set GITHUB_PAT

edit ~/.Rprofile
# set CRAN mirror and WORK_CRAN (Artifactory)

cd $ROOT/renvtest
R
  # verify
  .libPaths() 
  getOption("repos")

  # setup library
  install.packages('remotes')
  remotes::install_github('rstudio/renv')
  remotes::install_version(package='usethis',version='1.6.3',upgrade=T)

  # create project
  usethis::create_project("<root>/renvtest/project", rstudio=T, open=T)

  # init renv
  renv::init(bare=T)
  * Project '$ROOT/renvtest/project' loaded. [renv 0.12.3-72]
  * renv activated -- please restart the R session.
  > q('no')

# reload project
cd $ROOT/renvtest/project
R
  # Bootstrapping renv 0.12.3-72 -----------------------------------------------
  * Downloading renv 0.12.3-72 from GitHub ... Done!
  * Installing renv 0.12.3-72 ... Done!
  * Successfully installed and loaded renv 0.12.3-72.
  * Project '$ROOT/renvtest/project' loaded. [renv 0.12.3-72]

  # add the artifactory repo
  local({
    r <- getOption("repos")
    r["WORK_CRAN"] <- "https://example.com/artifactory/CRAN"
    options(repos = r)
  })
  # PackageA - 1.0.0 is old, 1.1.0 is current
  # PackageB - 1.1.0 is old, 1.2.0 is current, depends on PackageA
  renv::install("PackageA@1.0.0")
  #Retrieving 'https://example.com/artifactory/CRAN/src/contrib/Archive/PackageA/1.0.0/PackageA_1.0.0.tar.gz' ...
  #... dependencies ...
  renv::install("PackageB@1.1.0")
  #Retrieving 'https://example.com/artifactory/CRAN/src/contrib/Archive/PackageB/1.1.0/PackageB_1.1.0.tar.gz' ...
  #... dependencies ...

  # create a project-dependencies.R file with
  # library(PackageA)
  # library(PackageB)

  # capture and work with A/B via renv
  renv::status()
  renv::snapshot()
  renv::update("PackageA")
  # 1.0.0 --> 1.1.0?
  # Retrieving 'https://example.com/artifactory/CRAN/src/contrib/PackageA/1.1.0/PackageA_1.1.0.tar.gz' ...
  renv::update("PackageB")
  # 1.1.0 --> 1.2.0?
  # Retrieving 'https://example.com/artifactory/CRAN/src/contrib/PackageB/1.2.0/PackageB_1.2.0.tar.gz' ...
  renv::snapshot()
  q('no')

@brianrepko
Author

I think we can close this issue - thank you @kevinushey

@kevinushey
Collaborator

Awesome. Thanks!

For reference, here's where the "magic" happens:

renv/R/retrieve.R

Lines 576 to 621 in 05028b9

renv_retrieve_repos_archive_path <- function(repo, record) {

  # allow users to provide a custom archive path for a record,
  # in case they're using a repository that happens to archive
  # packages with a different format than regular CRAN network
  # https://github.com/rstudio/renv/issues/602
  override <- getOption("renv.retrieve.repos.archive.path")
  if (is.function(override)) {
    result <- override(repo, record)
    if (!is.null(result))
      return(result)
  }

  # if we already know the format of the repository, use that
  if (exists(repo, envir = `_renv_repos_archive`)) {
    formatter <- get(repo, envir = `_renv_repos_archive`)
    root <- formatter(repo, record)
    return(root)
  }

  # otherwise, try determining the archive paths with a couple
  # custom locations, and cache the version that works for the
  # associated repository
  formatters <- list(
    function(repo, record) {
      with(record, file.path(repo, "src/contrib/Archive", Package))
    },
    function(repo, record) {
      with(record, file.path(repo, "src/contrib/Archive", Package, Version))
    }
  )

  name <- renv_retrieve_repos_archive_name(record, "source")
  for (formatter in formatters) {
    root <- formatter(repo, record)
    url <- file.path(root, name)
    if (renv_download_available(url)) {
      assign(repo, formatter, envir = `_renv_repos_archive`)
      return(root)
    }
  }

}

Basically, we try both archive paths; if one of those succeeds, then we "save" that as the preferred path to use when attempting to install packages from the archives.
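Going by the excerpt above, a repository with an unusual layout could presumably also be handled up front through the `renv.retrieve.repos.archive.path` option, which the code expects to be a function of `(repo, record)` that returns an archive root, or `NULL` to fall through to the automatic detection. A minimal sketch under that assumption; the repository URL and the function name `artifactory_archive_path` are illustrative only:

```r
# Illustrative override: use the per-version archive layout for one specific
# repository, and defer to renv's own detection for everything else.
artifactory_archive_path <- function(repo, record) {
  # returning NULL tells the calling code to continue with its defaults
  if (!identical(unname(repo), "https://example.com/artifactory/CRAN"))
    return(NULL)
  file.path(repo, "src/contrib/Archive", record$Package, record$Version)
}

options(renv.retrieve.repos.archive.path = artifactory_archive_path)
```

Setting this in a project `.Rprofile` would skip the probing step for that repository, at the cost of hard-coding its layout.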

@brianrepko
Author

@kevinushey thanks - I found it by looking through the git diff as well

One thing to be aware of with Artifactory. In it you can define virtual repositories that are the combination of remote repositories (basically a true CRAN mirror - but with the ability to accept-list / deny-list various packages) and a local repository (for non-public packages). The remote repo will have the structure of the repository that it is mirroring - the local repo will have the structure defined by Artifactory.

I don't think that this will be a problem as renv always looks at the CRAN repo first (that being mirrored) and thus should only use the Artifactory / local repo when the package is not found / not public (this is my case). There may be users that have to use both the remote and local parts of Artifactory but that is probably quite rare.

I don't think the code needs changing - just wanted to make sure you understood what might be there in the Artifactory world
