Run rstudio directly from repo2docker #533
If I run …
Related to this, is there a general pattern that could be documented for running arbitrary applications under a similar model, e.g. @betatim's …
I think the general pattern is that you need to know three bits of information: …

This means that you can use it to start OpenRefine directly, but you will have to know which port needs forwarding and what the OpenRefine command is.
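To make the "know the port and the command" pattern concrete, a minimal sketch of launching OpenRefine from a repo2docker-built image might look like the following. The image name is hypothetical, and 3333 is only OpenRefine's conventional default port; the actual launch command depends on how OpenRefine was installed in the image.

```shell
# Hypothetical sketch, not a command from this thread:
# publish the port OpenRefine listens on and override the image's
# default CMD with the OpenRefine launch command.
docker run -it --rm \
  -p 3333:3333 \
  my-r2d-openrefine-image \
  refine -i 0.0.0.0 -p 3333
```

The general shape is always the same: `docker run -p <host>:<container> <image> <command>`, which is exactly the two bits of information (port and command) discussed above.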
Has there been any discussion of putting the port and command into the generated Dockerfile instead of overriding values during …
If the port is aliased by a proxy path, you presumably don't need to know the port number? Some services allow the port on which they run to be specified, in which case it would make sense to have a standard recipe/convention for handling that, e.g. a standardised way of passing a port number in via a Docker environment variable via …

Something that struck me in the context of the OpenRefine service, which was started on a dynamically allocated port, was that it would be useful to have the port number available via introspection, e.g. within a notebook kernel, by writing it to a config file in a standard location.

By the by, one other thing I noticed running …
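The introspection idea above could be sketched as follows. Note that the file name, location, and schema here are invented purely for illustration; no such convention exists in repo2docker today.

```python
import json
import os
import tempfile

# Sketch of the idea above: a service writes its dynamically allocated
# port to a config file in an agreed location, and a notebook kernel
# reads it back later. The "<name>-service.json" naming is hypothetical.
def write_service_config(name, port, base_dir):
    path = os.path.join(base_dir, f"{name}-service.json")
    with open(path, "w") as f:
        json.dump({"name": name, "port": port}, f)
    return path

def read_service_port(name, base_dir):
    path = os.path.join(base_dir, f"{name}-service.json")
    with open(path) as f:
        return json.load(f)["port"]

base = tempfile.mkdtemp()
write_service_config("openrefine", 3333, base)
print(read_service_port("openrefine", base))  # → 3333
```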
Adding aliases is easy now with https://jupyter-server-proxy.readthedocs.io/
I think we are talking about two different things here. This issue is specifically about directly launching an executable (like RStudio or OpenRefine) without going via the notebook server and its proxy. I think it is worth creating separate issues for discussing these topics and things related to each.
Can you explain a bit what you mean? My understanding of how docker works is that even if you put a …
Does BinderHub always start with a CMD that launches jupyterhub? I.e., could MyBinder be used just to run RStudio? More generally, is there a set-up for MyBinder where I could autostart a service in addition to a notebook server, such as a postgres database?
(@betatim re: separate issues: yes, apologies, I was conflating/lumping together ideas about starting arbitrary services, e.g. ones that a user might start from a notebook menu and run via a proxy, ones that might autostart, etc.)
Currently the creator of a binder/repo can't control the command that BinderHub will run. You can hook yourself into the startup process, though, via https://repo2docker.readthedocs.io/en/latest/config_files.html#start-run-code-before-the-user-sessions-starts, e.g. to start up a DB. (I don't know what will happen if you don't add …)
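For readers following the `start` link above: a minimal sketch of such a script, placed as an executable file named `start` in the repo root, might look like this. The `pg_ctl` invocation and paths are assumptions about how postgres would be installed in the image, not something this thread specifies.

```shell
#!/bin/bash
# Hypothetical repo2docker `start` script: launch a service before
# handing control to the container's main command.
# The pg_ctl path and data directory are illustrative assumptions.
pg_ctl -D "$HOME/pgdata" -l "$HOME/pg.log" start

# Hand off to the original CMD (e.g. the notebook server).
# Without this exec line, the main command never runs.
exec "$@"
```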
BinderHub requires a notebook server to be running as CMD, since it does token authentication through that. One option is to use a simple binary that does proxying, supervision & token authentication only, which might be useful to explore. The alternative I've been exploring is, of course, jupyter-server-proxy, which can now spawn and supervise additional processes, including postgres. The ability to autostart services and have arbitrary readiness checks doesn't exist yet, but could be added easily!
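For reference, jupyter-server-proxy's process-spawning configuration takes roughly this shape in `jupyter_notebook_config.py`. The `refine` command and the use of the `{port}` placeholder follow the library's documented `servers` traitlet, but the specific command line here is an assumption about how OpenRefine would be invoked.

```python
# jupyter_notebook_config.py -- sketch of jupyter-server-proxy's
# `servers` configuration. jupyter-server-proxy substitutes {port}
# with a port it allocates, then proxies /openrefine/ to it.
# The "refine" command itself is an illustrative assumption.
c.ServerProxy.servers = {
    "openrefine": {
        "command": ["refine", "-p", "{port}"],
        "absolute_url": False,
    }
}
```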
This is in response to @betatim's question #533 (comment). Sorry to be so verbose, but since Whole Tale will be using repo2docker outside of Binder, I feel like I need to provide more context. I'm happy to move this to another issue if appropriate, since it's tangential to the main issue topic.
My question was more motivated by the idea of having the generated Dockerfile/image accurately reflect what was passed to repo2docker. For example, something like:

…

would produce a Dockerfile with:

…
This same information could be used regardless of whether …

The basic WT use case is as follows: a researcher comes to WT to create a Tale. They select from a set of supported interactive environments, e.g. Jupyter or RStudio. We run a vanilla base environment for them to start their work. They upload/create any necessary data/code/documentation, etc. and custom configuration via repo2docker config files. (During development, they can rebuild the running environment to apply any config changes.) Once completed, they can publish the Tale to an external repository (e.g. DataONE, Dataverse, etc.). Later, another user discovers and runs the Tale, either from the WT system or via the external repository, potentially downloading a zip archive to run locally. At this point a given Tale will be based on a single environment, RStudio or Jupyter.

Note also that we're currently considering publishing the generated repo2docker Dockerfile as an additional artifact (similar to what's been discussed for the Odum CoRe2 project) so that a user doesn't need to run repo2docker to read it. From the archival perspective, the Dockerfile may also have value in the long run regardless of whether Docker is around or the image can actually be built.

For now, WT will use repo2docker to build (but not run) the image. We will necessarily need to store information about the port and default command for each environment (e.g., Jupyter, RStudio). Looking at the typical RStudio case where the user is not a Jupyter user, using repo2docker as described above, the generated Dockerfile and image would have the wrong port (EXPOSE) and default command (CMD). We would need to include the RStudio-specific information as part of the published Tale and would be less likely to include the generated Dockerfile, since it would potentially cause confusion.
Using the current repo2docker implementation, I'd probably include a simple generated script or readme instructing the user how to regenerate and run locally using repo2docker with the full command (#533 (comment)). However, by enabling EXPOSE and CMD to be overridden at build time as well as at runtime, the Dockerfile (and resulting image, if inspected) would reflect the intent of the user: for anyone running it to use RStudio, not Jupyter. From my understanding, Binder achieves this through the "launch in" badges/links, which specify the environment a user should access.
@craig-willis Thank you for the well thought out comment. I've generated two specific issues from it: #545 and #546. Let's continue these discussions there. Currently, the Dockerfile generated by repo2docker can't be built by itself - see #202 for discussions there. It is still useful as documentation, though. |
Thanks, @yuvipanda -- I'll comment on the new tickets. I wasn't aware of #202 and that's very good to know. @jonc1438 -- you may also be interested in parts of this discussion. |
How are you doing this? I am running a similar command and I get what appears to be a success message; however, going to …
Results in:

```
---> Running in a448e223cb32
Removing intermediate container a448e223cb32
---> b75ea650e2ed
Step 57/61 : ENV PYTHONUNBUFFERED=1
---> Running in cc6fdfb1ea0f
Removing intermediate container cc6fdfb1ea0f
---> 68006a8975b1
Step 58/61 : COPY /python3-login /usr/local/bin/python3-login
---> 74e01a8e934f
Step 59/61 : COPY /repo2docker-entrypoint /usr/local/bin/repo2docker-entrypoint
---> 4f85a207d078
Step 60/61 : ENTRYPOINT ["/usr/local/bin/repo2docker-entrypoint"]
---> Running in b740cb1ff035
Removing intermediate container b740cb1ff035
---> 51855a58f677
Step 61/61 : CMD ["jupyter", "notebook", "--ip", "0.0.0.0"]
---> Running in 72db9f88b2c2
Removing intermediate container 72db9f88b2c2
---> 2fc93bf925c3
{"aux": {"ID": "sha256:2fc93bf925c377b0562d9e5560ff1f52f19643da3d676d13ad63521e22a36279"}}
Successfully built 2fc93bf925c3
Successfully tagged r2d-2fhome-2fnoah-2fprojects-2fh-2epylori-2dsiegel-5fet-5fal-5f20221654723301:latest
CONTAINER FINISHED RUNNING.
```
If you have an install.R file, we install RStudio into the image. By default, we use nbrsessionproxy to make this available from JupyterHub / MyBinder.org, but if you are just building images with repo2docker you don't have to use Jupyter at all - you can just launch directly into RStudio.
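As a hedged sketch of what "launch directly into RStudio" could look like, the following builds an image and overrides the default Jupyter CMD. The repo URL is a placeholder, and the `rserver` path and `--www-port` flag are assumptions about how RStudio Server is typically installed, not values documented by repo2docker.

```shell
# Hypothetical sketch: build with repo2docker, publish RStudio's
# conventional port (8787), and run rserver instead of Jupyter.
jupyter-repo2docker -p 8787:8787 \
  https://github.com/example/my-r-repo \
  /usr/lib/rstudio-server/bin/rserver --www-port=8787
```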
We should document this.
/cc @craig-willis