-
Notifications
You must be signed in to change notification settings - Fork 86
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Block insecure ways of pulling from private repositories by default #85
Comments
I would strongly encourage against this. I have set up a bot github account with read only access and I use that token, and I distribute links to my students through private channels (the course information system at my university). I cannot have my course materials publicly available for IP reasons, so I need to be able to pull from a private repo using just a link. I do not see any way in which you can get around this without distributing something just as confidential to students in order to allow for cloning the repos. I think that actively disallowing this completely normal and expected feature of github based workflows is both insulting to the users and, frankly, pointless. I think that recommending people not share access tokens is fine (and not documenting the method to do so except to warn people), but if I choose to share an access token, understanding the consequences and having taken precautions I feel comfortable with, then I would be quite disappointed if someone else told me I wasn't allowed to. I would strongly encourage you to delete this as an issue. |
First, I apologize if this was insulting, @albertmichaelj. I can see how the framing of this issue would feel like that, and I'll shortly reword it to feel less like that. There are primarily two different ways to safely support pulling private repos:
Both (1) and (2a) have the following advantages:
(2b) has disadvantages that almost mirror the advantages of (1) and (2a):
Software should be secure by default, and if you want to do something insecure, you have to explicitly opt into it. Accessing private repos is an extremely common use case, so we should figure out how to explicitly support that well (#53). This would mean we should document the various ways you can do it, and the ways you should not. Otherwise folks will find and document insecure ways of doing this - for example, see discussion in #53. So not documenting it only gives us security by obscurity. As software developers, we should make it easy to do the secure thing, hard to do things that are insecure but can not be accomplished in any other way, and practically impossible to do things that are insecure but easily doable in other ways. I'd that method (1) is much easier to do than creating a secure bot account, and is what most folks should do. It also gives you per-repo control. I'm interested in learning if options (1) or (2a) do not work for your use case. If you can expand on what your use case is, we can probably find another way of supporting that that doesn't involve spreading credentials in URLs. If not, we can figure out how to allow this with a big security warning and requiring explicit opt-in. I hope this makes a little more sense on the rationale behind this issue, and I again apologize if it felt insulting. |
I appreciate your detailed response, and I respect the desire to provide a secure piece of software. I also agree that encouraging people to use the personal access token in the url is poor practice. However, as best I can tell, your suggestions make the pretty strong assumption that people will only be using nbgitpuller for JupyterHub. I use it for local installations as well. I could definitely give all of my students netrc files, but I'm not sure this is actually safer (though your point about the url not being logged is a good one) since the students still have the credentials. However, I work with fairly low tech comfort level students, and I've done a lot of work to make the process as painless as possible for them. Trying to get them to correctly configure something through a hidden file would be a major pain point for my class. As far as JupyterHub goes, I'm sure there are better ways to access private git repos. However, I still see no reason to take away an option that works perfect for my use case (if someone snoops on my bot accounts read only access token, it's not a big deal, I just have to make a good faith effort to keep the files secure, but the students can always do arbitrarily bad things with the files once they have them). I think that there should be a more secure and easier to use option as well, but I do not think that just forbidding including a private access token (like I could with any |
Ah, very interesting. I had not considered the non-JupyterHub use case at all - thank you for that perspective. I agree that trying to get non-tech-savvy folks to set up a netrc or ssh key on their own isn't an acceptable solution. I also agree that including a token in the URL is probably the most realistic option there. Another way to approach this would be:
This makes it impossible to do the insecure thing when alternatives are available (on a JupyterHub), and possible when alternatives aren't realistically available. For (3), can you tell me how your users typically start a Jupyter Notebook process? I'd also generally love to hear a little more about your use case. Thank you for helping make nbgitpuller better and more useful! <3 |
What I've done is create an environment.yaml file that the students import through the anaconda app that they get from installing the standard anaconda distribution. The students then can select the conda environment in a drop down menu on the anaconda app and just click launch on Jupyter Lab. So, the users do not have the ability to specify any kind of command line switch for the notebook server. I have experimented with having a jupyter notebook that they have to run to finish installation that writes things like environmental variables in the conda environment that will be set when the environment is activated (though the conda activate.d directories), so there are ways that I could use this to set up some kind of key. However, running the installation notebook and maintaining it has been somewhat problematic. That is why I've been looking forward to pull request #42 being merged so that I can specify where the class files end up so they're not in the home directory. However, philosophically I don't think the problem is with nbgitpuller. Github allows for the cloning of repos using personal access tokens. nbgitpuller is just a wrapper for git that handles things like local file changes and recloning a little better (and it provides a url that can be clicked). I don't think that it's really nbgitpuller's job to limit git's capabilities. I understand the argument against using it, but it is something that GitHub allows, and it has its use cases. I would argue they shouldn't even be disallowed for JupyterHub. Instead, I think that there should be an easy alternative to the personal access token that is well documented, and then people will naturally choose that path. People aren't going to find an obscure way to use personal access tokens if they can do something much easier that is spelled out in the documentation. I don't see why this has to be explicitly forbidden when their may be unforeseen reasonable use cases for not disallowing built in GitHub functionality. |
EDIT: I figured out how to use SSH address in the nbgitpuller generated link, and seems to be working now. The documentation for JupyterLab does not seem accurate, as adding %3Fautodecode to a JupyterLab link does not prevent the workspace prompt.
I've seen you mention this a few times in a few issue threads around nbgitpuller and tljh, but I can't seem to figure out how to do it. I have a private repo in bitbucket, and I've generated an SSH key on my TLJH instance, and added the public key to bitbucket, but nbgitpuller always defaults to https. Can I force TLJH to use ssh globally? Or am I barking up the wrong tree with SSH, and this is better accomplished with OAuth (which I'm also unfamiliar with). |
I imagine following workflow that will not include Access Token in nbgitpuller URL:
I would also suggest another security improvement. |
See comment #85 (comment) for a lot more detail
Since we don't have good support for pulling from private repositories, folks often put their own personal access tokens in the nbgitpuller URL.
This is extremely dangerous, and the same as sharing your password. We should detect and block this, but only after making sure we have some easy way for folks to pull from private repositories.
The text was updated successfully, but these errors were encountered: