R in rocker/r-ver does not respect the container timezone setting. #151

Closed
eitsupi opened this issue Apr 22, 2021 · 15 comments

Comments

@eitsupi
Member

eitsupi commented Apr 22, 2021

Images built from this repository ignore the OS environment variable TZ.
That behavior differs from other rocker images.

For example, since I am in Japan, I sometimes set Japan time.

$ docker run --rm -it -e TZ=Asia/Tokyo rocker/r-base Rscript -e "Sys.time()"
[1] "2021-04-22 19:11:18 JST"
$ docker run --rm -it -e TZ=Asia/Tokyo rocker/r-ver:3.6.3 Rscript -e "Sys.time()"
[1] "2021-04-22 19:12:26 JST"
$ docker run --rm -it -e TZ=Asia/Tokyo rocker/r-ver:4.0.5 Rscript -e "Sys.time()"
[1] "2021-04-22 10:13:25 UTC"

This is due to the following line added by 6e988b3

echo "TZ=${TZ}" >> ${R_HOME}/etc/Renviron

Why is it set like this?
Is it related to this issue? rocker-org/rocker-versioned#89

I don't see why TZ needs to be written to the Renviron file, but how about writing TZ to Renviron when the container starts, by setting ENTRYPOINT?

@eitsupi eitsupi changed the title rocker/r-ver does not respect the container timezone setting. R in rocker/r-ver does not respect the container timezone setting. Apr 22, 2021
@cboettig
Member

cboettig commented Apr 22, 2021

@eitsupi Yes, it's related to rocker-org/rocker-versioned#89 plus the fact that RStudio doesn't inherit environmental variables from system, only from Renviron. To solve rocker-org/rocker-versioned#89 we need TZ in the system Renviron when RStudio starts.

RStudio's choice not to respect system environmental variables is a constant nuisance for rocker users and a source of innumerable issues in these threads, but it's still not clear to me what the most consistent strategy would be. In your last two examples above, if you switched from r-ver to the RStudio interface and ran the same script, can you predict what you would get for TZ in each case? Which one is what you would expect? (Current images will give the same TZ either way; the older 3.6.3 will give you a different TZ.)

Yes, we could restore the old behavior by insisting that env vars needed by RStudio are only set when RStudio is launched. The natural way to do this, I think, would be in one of the setup scripts run by init, i.e. as part of user_config.sh in the install_rstudio.sh configuration.

However, I'm not convinced that it's preferable for RStudio-based containers to have different behavior when a user is running in RStudio interface vs running Rscript. If a user builds and tests a script in rstudio using, say, rocker/rstudio, and then runs the same script using a docker run --rm -ti rocker/rstudio Rscript ... in production use, they would get different results under such a strategy. Is that really desirable?

For consistency across RStudio and Rscript, I think users are better off getting used to patterns which pass all env vars needed by R through an appropriate .Renviron file (e.g. via a -v link at runtime) rather than a -e flag. Does that make sense? Happy to be talked out of this. At the very least we ought to document this more clearly (where?) since it comes up all the time.
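As a rough illustration of the .Renviron pattern described above, the sketch below writes a minimal .Renviron and shows, as a comment, how it could be mounted into a container. The helper name write_renviron is made up for this example, and the mount target /home/rstudio/.Renviron assumes the default rstudio user.

```shell
#!/bin/bash
# Hypothetical helper: write a .Renviron that holds the desired timezone.
write_renviron() {
    local dir="$1" tz="$2"
    printf 'TZ=%s\n' "$tz" > "${dir}/.Renviron"
}

# Create the file in a scratch directory and show its content.
project_dir="$(mktemp -d)"
write_renviron "$project_dir" "Asia/Tokyo"
cat "${project_dir}/.Renviron"

# Mounting it at runtime (not executed here, since it needs Docker):
# docker run --rm -ti -v "${project_dir}/.Renviron:/home/rstudio/.Renviron" \
#     -e PASSWORD=<YOUR_PASS> -p 8787:8787 rocker/rstudio
```

Because the setting lives in a file that R reads at startup, the same file can also be used outside Docker.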

@eddelbuettel
Member

it comes up all the time

😿

@eitsupi
Member Author

eitsupi commented Apr 23, 2021

@cboettig Thank you for answering in detail. I understood the need to write TZ to Renviron.

However, it remains a question whether it is good to have this behavior on rocker/r-ver without RStudio Server installed.
For example, it seems more natural for me to do this in install_rstudio.sh instead of in install_R.sh.
On the other hand, I can understand the concern that doing so will cause confusion among users since there will be a difference in behavior between images with and without RStudio installed.

My suggestion is to remove this process from install_R.sh, and in rocker/rstudio, write TZ to Renviron with a script set to ENTRYPOINT.
Here is an example of entrypoint.sh and a Dockerfile that sets ENTRYPOINT in rocker/rstudio.

#!/bin/bash
# entrypoint.sh
set -e

# Drop any existing TZ entry, then record the container's current TZ.
sed -i -e '/^TZ=/d' "${R_HOME}/etc/Renviron"
echo "TZ=${TZ}" >> "${R_HOME}/etc/Renviron"

# Hand control to the container command (CMD).
exec "$@"

# Dockerfile
FROM rocker/rstudio:4.0.5

COPY scripts/entrypoint.sh /rocker_scripts/entrypoint.sh
RUN chmod +x /rocker_scripts/entrypoint.sh

ENTRYPOINT ["/rocker_scripts/entrypoint.sh"]
CMD ["/init"]

This image (which I have tagged rstudio-test) rewrites Renviron before executing CMD.
So, as you can see below, I can use the TZ I set both inside and outside of RStudio.

$ docker run --rm -it -e TZ=Asia/Tokyo rstudio-test Rscript -e "Sys.time()"
[1] "2021-04-23 20:17:52 JST"
$ docker run --rm -it -e TZ=Asia/Tokyo -p 8787:8787 rstudio-test

[screenshot of the RStudio session]

What do you think of this approach?

@eitsupi
Member Author

eitsupi commented Apr 23, 2021

By the way, I raised this issue because the Japanese R community asked me yesterday how to set the timezone in a Docker container.
Since I usually use RStudio in rocker/tidyverse and put a .Renviron file in the root directory of the project, I hadn't noticed that the TZ was fixed even outside RStudio...

@cboettig
Member

Thanks @eitsupi , I really appreciate your example and the chance to discuss this issue in detail. It's a common friction point and working through this with you helps me think about it more clearly myself as well.

Having a custom .Renviron file that sets your personal R environment at the project-specific or user-specific level, like you already do, definitely seems like an excellent approach. That technique is unlikely to give any surprises regardless of your choice of container and regardless of whether you run with Rscript or RStudio.

That's just not true of the other approaches:

  • I'm reluctant to start using ENTRYPOINT because to date we have not set ENTRYPOINT, which means that downstream users are free to make use of setting entrypoint dynamically at runtime without over-riding some default Rocker configuration. Doing this creates another way in which results in rocker become more dependent on docker-specific configurations.
  • Yes, we could just move the Renviron setting to the RStudio stage rather than the r-ver stage, but that means we have inconsistent behavior within the versioned2 stack depending on whether or not you installed rstudio, even if you are not actually using it, which I'd rather avoid.
  • The entrypoint solution is clever, but users might be surprised to find that setting -e TZ works on the command line to pass an env var to an RStudio instance, while not working for any other env vars. It might be tempting (but probably difficult?) to try and generalize this solution for all command line env vars (though arguably certain ones, like PASSWORD, you do not want visible to the RStudio environment in a multi-user system anyway). Meanwhile, I'm reluctant to create a class of "known env vars" that get "special treatment" of being able to pass with -e instead of using .Renviron.

When a user asks how to set TZ, I really think "use Renviron" is the most natural answer. It is the standard "R" way of setting environmental variables -- a user doesn't need to know that they are even in docker or to have any control over the docker runtime arguments to set it. The Renviron conventions give greater control to which user or project does or does not inherit the env var. It provides better persistence and portability by declaring these details in config file instead of dynamically at run time, and avoids the issues discussed above.
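To make the user-level versus project-level distinction concrete, here is a rough shell sketch of the lookup order that ?Startup documents for the user Renviron file: R_ENVIRON_USER if it is set, otherwise .Renviron in the working directory, otherwise ~/.Renviron. The function name resolve_user_renviron is made up for illustration. It only mimics R's documented behavior and is not how R implements it.

```shell
#!/bin/bash
# Hypothetical helper that mimics which user-level Renviron file R would read.
# Order assumed from ?Startup: R_ENVIRON_USER, then ./.Renviron, then ~/.Renviron.
resolve_user_renviron() {
    if [ -n "${R_ENVIRON_USER:-}" ]; then
        echo "${R_ENVIRON_USER}"
    elif [ -f "./.Renviron" ]; then
        echo "./.Renviron"
    elif [ -f "${HOME}/.Renviron" ]; then
        echo "${HOME}/.Renviron"
    fi
    return 0
}

# Print the file that would apply in the current directory
# (prints nothing if no candidate exists).
resolve_user_renviron
```

A project-level file therefore shadows the one in the home directory, which is what lets a single project pin its own TZ without affecting other work.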

@eitsupi
Member Author

eitsupi commented Apr 24, 2021

@cboettig Thank you for explaining your thoughts to me.

When a user asks how to set TZ, I really think "use Renviron" is the most natural answer. It is the standard "R" way of setting environmental variables -- a user doesn't need to know that they are even in docker or to have any control over the docker runtime arguments to set it. The Renviron conventions give greater control to which user or project does or does not inherit the env var. It provides better persistence and portability by declaring these details in config file instead of dynamically at run time, and avoids the issues discussed above.

Given the project's emphasis on reproducibility, I thought that the fixed TZ might be a desirable behavior. (Regardless of the fact that sessions on RStudio Server ignore the OS environment variables.)

Since I also use Docker for other programming languages (e.g. Python), I found the method of using .env files to set OS environment variables and referring to them from within the container to be generic and convenient.

$ docker run --rm -it --env-file .env python

On the other hand, I had a hard time running a script that is supposed to read OS environment variables without using Docker (when passing it to another person who is not using Docker).
As you said, unlike .env files, Renviron files are always loaded into R, so there is no need to worry about that.

However, users who use R outside of RStudio may expect the OS environment variables to be read.
https://github.com/rocker-org/rocker-versioned describes the differences between r-base and rocker/r-ver. In the same way, how about stating in README.md here that "rocker/r-ver R >= 4.0.0 does not read the OS timezone setting"?

Also, it would be easier for users to find their way to the configuration on their own if there is a description of Renviron files.
It would be best for users to run ?Startup on R, but a recent article on RStudio Support may be easy to understand too.
https://support.rstudio.com/hc/en-us/articles/360047157094-Managing-R-with-Rprofile-Renviron-Rprofile-site-Renviron-site-rsession-conf-and-repos-conf

It might be tempting (but probably difficult?) to try and generalize this solution for all command line env vars (though arguably certain ones, like PASSWORD, you do not want visible to the RStudio environment in a multi-user system anyway)

It would be easy to write all the environment variables with the env command as shown below, but even if PASSWORD were removed, I don't think it's a good idea to make it the default behavior because it would have a big impact on existing users.

$ env >> ${R_HOME}/etc/Renviron

It might be welcome to use a user-set environment variable (e.g. $SINGLE_USER_MODE?) to allow the above process to be controlled by userconf.sh.
Of course, it goes against the idea that we should use Renviron, but there seems to be user demand.
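A minimal sketch of the "write everything except PASSWORD" idea could look like the following. The helper name append_env_except and its regex argument are assumptions made for this example and are not part of the rocker scripts.

```shell
#!/bin/bash
# Hypothetical helper: append the current environment to a Renviron-style file,
# skipping any variable whose name matches the given extended regex.
append_env_except() {
    local target="$1" deny_regex="$2"
    # env prints NAME=value lines; grep -v drops the denied names.
    # Caveat: a value containing newlines would be split across several lines.
    env | grep -v -E "^(${deny_regex})=" >> "$target"
}

# Example call (commented out because it would modify the R installation):
# append_env_except "${R_HOME}/etc/Renviron" 'PASSWORD'
```

Passing a regex such as 'PASSWORD|ROOT' would exclude several names at once. This still copies every other variable, so the concern about changing behavior for existing users applies.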

For example, Docker reads the proxy configuration written in ~/.docker/config.json of the host OS by default and automatically creates environment variables such as http_proxy in the container, so users don't usually need to configure the proxy settings of the container every time.
Only RStudio Server ignores the environment variables of the container, so I often prepare a Renviron with only ALL_PROXY set to connect to the Internet from RStudio.
R outside the container uses the proxy settings of the OS to begin with (even on Windows!). So users usually don't need to be aware of the proxy settings.

@Pit-Storm
Contributor

Pit-Storm commented May 6, 2021

I stumbled upon this topic because I use the images in Germany.

As far as I understand the discussion, a flexible solution that serves two needs is required:

  1. Keep the flexibility to set TZ at runtime or build time, with or without a .Renviron file
  2. A concise and reproducible configuration for all users who don't want to deal with that.

In my opinion the easiest way to solve this problem would be to not hard-code the value of the TZ variable into the Renviron file. It would be better to write TZ=${TZ} to the Renviron.site file. As described on rdrr.io, this is possible.

Because the env var is set in the Dockerfile, the expression TZ=${TZ} in the Renviron.site file will always expand to the default value. And if you need to set it, either in a Dockerfile or at runtime (e.g. in a k8s deployment), you can set it through the CLI or a yml file.

Hope that this can help :-)

Edited: Because variable expansion is not possible in the Renviron file, we have to use the ${R_HOME}/etc/Renviron.site file.
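The difference between hard-coding the value and writing the placeholder comes down to shell quoting at the moment the line is written. The sketch below assumes TZ is Etc/UTC while the image is being built. write_tz_lines is a made-up name. Whether R or RStudio then expands the placeholder when reading the file is a separate question.

```shell
#!/bin/bash
# Sketch: build-time expansion versus a literal placeholder.
# Assumption: TZ is Etc/UTC at the time the line is written.
write_tz_lines() {
    local file="$1"
    local TZ="Etc/UTC"
    # Double quotes: the shell expands ${TZ} now, so the value is frozen in the file.
    echo "TZ=${TZ}" >> "$file"
    # Single quotes: the text ${TZ} is written as-is, to be resolved later.
    echo 'TZ=${TZ}' >> "$file"
}

demo_file="$(mktemp)"
write_tz_lines "$demo_file"
cat "$demo_file"
# First line:  TZ=Etc/UTC
# Second line: TZ=${TZ}
```

The first form is why a later -e TZ=... has no effect: the file already contains the literal build-time value.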

@cboettig
Member

cboettig commented May 6, 2021

@Pit-Storm Wow, that sounds great. Does this actually work?

If I create a local .Renviron with:

TZ=${TZ}

and I do:

docker run --rm -ti -v $(pwd)/.Renviron:/home/rstudio/.Renviron -p 8787:8787 -e PASSWORD=testing -e TZ=Asia/Tokyo rocker/rstudio

I still see Etc/UTC from inside RStudio, not Asia/Tokyo. Am I missing something?

@eitsupi
Member Author

eitsupi commented May 7, 2021

I tried it too, but it doesn't seem to work in RStudio sessions, because the user (rstudio by default) doesn't exist when docker run rocker/rstudio starts, and the environment variable TZ is only set for the root user.

In other words

  1. Put TZ=${TZ} on ${R_HOME}/etc/Renviron.site.
  2. Pass the root user's OS environment variable TZ on to the rstudio user when RStudio Server is running. (maybe in userconf.sh?)

I think these two changes will make it work, but for the second one, I did not know how to set the OS environment variable for rstudio user.

@Pit-Storm
Contributor

Pit-Storm commented May 7, 2021

First: variable expansion has to work, because ${R_HOME}/etc/Renviron itself uses variables on the right-hand side of the equals sign.

I think there must be another reason why this is not working.

I use this Dockerfile:

# Dockerfile
FROM rocker/rstudio:4.0.3
RUN echo "TZ=\${TZ}" >> ${R_HOME}/etc/Renviron.site

I then run docker build --tag rstudiotz:4.0.3 . followed by docker run --rm -it -e TZ=Europe/Berlin -p 8787:8787 rstudiotz:4.0.3. In the RStudio terminal, Sys.getenv()["TZ"] gives me the string "Etc/UTC".

Even if you start the container with -e ROOT=true, the env var is not expanded in RStudio.
The same holds when you exec into the container (then as the root user) and start an R session. This session is started as root, so there should not be a problem. When you print the env vars in bash as the root user inside the container, you can see that TZ=Europe/Berlin. This tells us that something else is going wrong here. But it is really hard to debug, because we don't get an error message.

To me it looks like RStudio, and even R itself, is not picking up Renviron.site and, for some strange reason, is not expanding TZ=${TZ} when you write it in ${R_HOME}/etc/Renviron. (BTW: several RStudio blog posts recommend against writing to that file.)

Additionally, R and RStudio do not process Renviron.site even when you write TZ=Europe/Berlin hard-coded into the file. They just skip it....

Could it have something to do with the install mechanism? Or do you modify the R startup mechanism in other scripts?

I am not really sure how to debug that 🙈

@Pit-Storm
Contributor

So, I have done some more investigation and have some hints that we should check.

There must be something wrong with the startup process, because the rsession of RStudio is not picking up ANY env var from any file. Even one set in /etc/R/Renviron.site is not inherited by the rsession of RStudio.

I really don't get it, but I am not that deep into the scripts in rocker_scripts. And the RStudio Server documentation and blog posts suggest that those files should be read by RStudio.

I am really not getting it...

@cboettig
Member

cboettig commented May 7, 2021

Thanks folks!

I think we want to edit:

https://github.com/rocker-org/rocker-versioned2/blob/master/scripts/install_R.sh#L144

to use single quotes and echo to $R_HOME/etc/Renviron.site and not $R_HOME/etc/Renviron. I wasn't previously aware of the distinction between $R_HOME/etc/Renviron.site and $R_HOME/etc/Renviron. The latter is created by the R install; the former does not exist by default (though R_HOME/etc/Rprofile.site does). The RStudio docs @Pit-Storm links to have this clear if cryptic warning:

The Renviron file located at R_HOME/etc is unique and different from Renviron.site

though I honestly wish I had a more operational understanding of what "unique and different" means (i.e. does this mean something other than "is loaded first"? Does it prevent env vars set in that file from being overwritten by Renviron files loaded later? Or does it just mean that R cannot start without having the environment variables that are pre-defined there available?).

I'm still not entirely convinced that RStudio will evaluate system env vars when parsing the Renviron file, even though R obviously supports that, since it uses that notation in the R_HOME/etc/Renviron file it creates (e.g. R_PAPERSIZE=${R_PAPERSIZE-'letter'}). But it does sound like the rocker-versioned2 scripts all need to switch over to using Renviron.site, and probably could be tidied up a bit (e.g. using defaults and perhaps avoiding duplicate additions if an install script is run multiple times).
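The tidy-up mentioned above (a default value plus no duplicate lines when an install script runs twice) might look roughly like this. set_renviron_default is a hypothetical helper, not something that exists in rocker_scripts, and the sketch assumes GNU sed for the -i flag. It uses the same ${NAME-default} notation as the R_PAPERSIZE line.

```shell
#!/bin/bash
# Hypothetical helper: write NAME=${NAME-default} to a Renviron-style file
# exactly once, however many times it is called.
set_renviron_default() {
    local file="$1" name="$2" default="$3"
    touch "$file"
    # Remove an earlier entry for this name so reruns do not pile up duplicates.
    sed -i -e "/^${name}=/d" "$file"
    # The single-quoted format keeps ${...} literal; nothing is expanded here.
    printf '%s=${%s-%s}\n' "$name" "$name" "$default" >> "$file"
}

demo_file="$(mktemp)"
set_renviron_default "$demo_file" TZ Etc/UTC
set_renviron_default "$demo_file" TZ Etc/UTC   # second run leaves a single line
cat "$demo_file"
# Prints: TZ=${TZ-Etc/UTC}
```

In an install script the target would presumably be ${R_HOME}/etc/Renviron.site rather than a temp file.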

@Pit-Storm
Contributor

I am preparing a Pull request for Renviron.site switch here.

And after that another one for the $TZ fix.

@cboettig I couldn't find a contributing.md file in the repo here. Can you give some hints please? :-)

@eddelbuettel
Member

eddelbuettel commented May 14, 2021

The standard reference for that is, I think, our wiki entry in the main repo: https://github.com/rocker-org/rocker/wiki/How-to-contribute -- our repos are sadly a little scattered across the org making things a little harder to find at times.

@eitsupi
Member Author

eitsupi commented Jun 24, 2021

#186 has solved this problem in and out of RStudio. Thanks for the great work @Pit-Storm !

$ docker run --rm -it -e TZ=Asia/Tokyo rocker/rstudio:4.1.0 Rscript -e "Sys.time()"
[1] "2021-06-24 19:38:48 JST"

$ docker run --rm -it -e TZ=Asia/Tokyo -e PASSWORD=foobar -p 8787:8787 rocker/rstudio:4.1.0

in RStudio

[screenshot of the RStudio session]

Note that if we run /init without setting the PASSWORD environment variable, the script will exit before copying the environment variables to ${R_HOME}/etc/Renviron.site, so we cannot use the environment variable in RStudio.

# (excerpt; the preceding branch of this if statement is not shown)
elif [ "$PASSWORD" == "rstudio" ]
then
    printf "\n\n"
    tput bold
    printf "\e[31mERROR\e[39m: You must set a unique PASSWORD (not 'rstudio') first! e.g. run with:\n"
    printf "docker run -e PASSWORD=\e[92m<YOUR_PASS>\e[39m -p 8787:8787 rocker/rstudio\n"
    tput sgr0
    printf "\n\n"
    exit 1
fi
