Support for private repositories e.g. Github Enterprise #174
Comments
At my company, we solve this exact problem using go's vanity imports. We have something like https://github.com/dominikh/go-vanity which serves up the appropriate protocol depending on various factors and we use the url of the vanity server instead of our ghe url in our import paths. |
This is part of our spec (https://docs.google.com/document/d/1qnmjwfMmvSCDaY4jxPmLAccaaUI5FfySNE90gB0pTKQ/edit#). Search for "alt location". Does that work, at least basically for your needs? With that said it doesn't exist atm. |
First - solving this with vanity imports via an external system, as @zevdg describes, is far easier. (I didn't know about go-vanity - that's awesome.) That said, vanity imports still have some of the same issues as what's described here.
We didn't talk about it much directly within the committee as it wasn't an immediate necessity, but I've thought about this quite a lot. It's in the top three nasty problems I've grappled with over the last year. To be clear - these problems are solvable, but they add a ton of new, potentially obnoxious failure modes, so I've been very cautious about adding support for them in gps. I've also been thinking about them for way too long, and have gotten somewhat lost in the forest - help with weighing tradeoffs is very welcome :) This is kind of a slapdash writeup, but hopefully it covers the basic bases.
The essential problem is portability: we have to be able to deduce the source/repo for all import paths we encounter. If that's only possible by relying on custom rules, then anyone trying to work with your project without those custom rules won't be able to resolve those import paths. The really nasty part, though, is that we're unable to provide users in that situation with useful error messages, because it's impossible to tell the difference between "invalid import path" and "invalid import path because you're missing some custom deduction rules." How could we know such custom rules exist?
If you're working on a private, internal company system, then this sort of portability isn't really a concern. The problem with a feature like this is that it wouldn't just be used for that - how many people would write their own little custom rule for, say, gitlab (for which we don't currently have a built-in rule)?
We could mitigate this by having the custom rules live in the manifest, alongside constraints/ignores/requires/overrides. composer does this, generally to good effect. But, if you're in a company with a bunch of different projects, then each would need to define the same rules in their manifests. (Pattern definition would have to be a root-manifest-only thing, for basically the same reasons as composer gives). If you have to make any changes in how those rules work, you have to make them in all your projects, and it could potentially make it impossible to use older versions of those projects, because your new rules don't work with the old imports.
But all this works well enough in composer, right? Sure. Things are strictly more difficult for Go, though, because this problem of deducing projects/repositories from import paths is one that composer simply doesn't have. Composer is just dealing with the name of a dep and a location from which to retrieve it and its metadata. Because Plus, composer constrains names to have two parts, but we have no such guarantee -
Also, as one final kick in the pants, because we're ultimately talking about a system to decide what names (import paths) mean, it would have to be incorporated into the input hashing system we use to determine when you need to regenerate your lock. (That's the
Again, I do think there are reasonable solutions here. But I have yet to figure out a good, balanced compromise that ratchets down control to avoid rendering the entire system unsafe; I've been enmeshed in this problem for a long time, and have lost a lot of the perspective necessary to weigh tradeoffs. Feedback is super welcome :) |
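To make the portability problem above concrete, here is a small sketch of rule-based deduction. It is not how gps works: the rule table (host mapped to the number of path segments that form the repository root), the function name, and the host `ghe.corp.example` are assumptions for illustration, and it ignores the `go-get` HTTP metadata fallback entirely. What it shows is that a path which needs a missing custom rule fails with exactly the same error as a path that is simply wrong.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// errCannotDeduce is the only failure this sketch can report. It cannot say
// whether the path is bogus or whether a custom rule is missing.
var errCannotDeduce = errors.New("cannot deduce repository root from import path")

// deduceRoot returns the repository root for importPath. rules maps a host
// to the number of leading path segments (host included) that make up the
// repository root, e.g. 3 for github.com/user/repo.
func deduceRoot(importPath string, rules map[string]int) (string, error) {
	parts := strings.Split(importPath, "/")
	n, ok := rules[parts[0]]
	if !ok || len(parts) < n {
		return "", errCannotDeduce
	}
	return strings.Join(parts[:n], "/"), nil
}

func main() {
	builtin := map[string]int{"github.com": 3}

	// A machine with the (hypothetical) custom rule for the internal host.
	withCustom := map[string]int{"github.com": 3, "ghe.corp.example": 3}

	for _, path := range []string{
		"github.com/pkg/errors/internal",
		"ghe.corp.example/myteam/lib/sub",
		"not-a-real-host/whatever",
	} {
		rootA, errA := deduceRoot(path, builtin)
		rootB, errB := deduceRoot(path, withCustom)
		fmt.Printf("%s\n  built-in only: %q %v\n  with custom:   %q %v\n", path, rootA, errA, rootB, errB)
	}
}
```

With only the built-in rules, the internal path and the nonsense path produce the same error, which is the "how could we know such custom rules exist?" problem in miniature.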
Oh! I forgot to mention... All of this ugliness is my own single strongest motivation for creating a registry (#175): we can impose meta-rules on a subset of import paths. For example, imagine this pattern: import "p/gopherhoard.com/user/project"
This solves so many problems.
|
Thanks for the background! Glad I'm not the only person thinking about this. The problem that vanity imports don't solve for us is having a centralized cache of 3rd party libraries. We use artifactory to do this for nearly every other language, but since Go uses the import path to uniquely identify both the package and its source url, the existing go tool simply cannot handle changing the source url without also changing the package's identifier. We could rewrite import paths to point to artifactory, but we'd also have to do that inside every 3rd party package we use, so that we don't end up referring to the same package by different names. The alternate urls described in dep ensure -h look like exactly what we need. I was assuming that those alternate urls would be stored in your manifest.json individually for every dependency. Something like
{
"dependencies": {
"github.com/golang/protobuf": {
"branch": "master",
"source": "artifactory.corp-intranet.com/github.com/golang/protobuf"
},
"github.com/pkg/errors": {
"branch": "master",
"source": "artifactory.corp-intranet.com/github.com/pkg/errors"
},
"ghe.corp-intranet.com/myteam/lib": {
"branch": "master"
}
}
}
So while that leaves us with a lot of seemingly redundant text in the manifest file, it keeps any complex deduction rules out of it. Anyone would be able to pull down this project's deps from the right place assuming they had access to the intranet. That alone seems "sufficient" to solve the problem, but it requires a lot of discipline for devs to add the correct alternate location to each and every 3rd party dependency. So on some level, I do want a "custom rule" of some sort, but more generally, I'm imagining some kind of onManifestEdit hook where I can run some arbitrary script AFTER dep updates the manifest but BEFORE dep uses the manifest to resolve dependencies and update the lockfile. This hook would also have the opportunity to add more updates to the manifest and, if it did, dep would re-read it and use the new info in generating the lock file. Conceptually, this would be similar to setting up your editor to run In particular, the hook I'd personally want would be along the lines of (in pseudocode)
So that on a properly configured machine, I imagine there would be other legit use cases for having an automated edit to the manifest here. I haven't thought too deeply about the implications of allowing a hook like this, but since the manifest is supposed to be editable by humans, never mind other tools, this doesn't seem too dangerous at first glance. Also, we may have gone very far off topic from the original issue. Maybe this should be separated into a separate issue or two? |
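As a rough illustration of the kind of script such a hook could run (this is not the commenter's pseudocode, and dep has no `onManifestEdit` hook), the sketch below fills in a `source` for every third-party dependency that lacks one. The struct layout mirrors the `manifest.json` example above; the function name and the choice to skip paths under the internal host are assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// dependency and manifest model only the fields shown in the manifest.json
// example above; any other fields would be dropped by this sketch.
type dependency struct {
	Branch string `json:"branch,omitempty"`
	Source string `json:"source,omitempty"`
}

type manifest struct {
	Dependencies map[string]dependency `json:"dependencies"`
}

// addMirrorSources sets Source to mirrorBase/<import path> for every
// dependency that has no source yet and does not live on internalHost.
func addMirrorSources(m *manifest, mirrorBase, internalHost string) {
	for path, dep := range m.Dependencies {
		if dep.Source != "" || strings.HasPrefix(path, internalHost+"/") {
			continue
		}
		dep.Source = mirrorBase + "/" + path
		m.Dependencies[path] = dep // updating an existing key while ranging is safe
	}
}

func main() {
	raw := []byte(`{
	  "dependencies": {
	    "github.com/golang/protobuf": {"branch": "master"},
	    "github.com/pkg/errors": {"branch": "master"},
	    "ghe.corp-intranet.com/myteam/lib": {"branch": "master"}
	  }
	}`)

	var m manifest
	if err := json.Unmarshal(raw, &m); err != nil {
		panic(err)
	}
	addMirrorSources(&m, "artifactory.corp-intranet.com", "ghe.corp-intranet.com")

	out, err := json.MarshalIndent(m, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Because the rewrite is mechanical, running it after every manifest edit would remove the need for each developer to remember the mirror URL, which is the discipline problem described above.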
+1 |
Yep yep!
Oh for sure - but I'll go two steps stronger. Step 1: it's not just that it requires discipline for devs. It's that you effectively leave to individual developers what may, to meet the workflow/legal/whatever reqs of your organization, need to be questions of policy. Step 2: when you explicitly record the source values in This creates a potentially nasty situation: let's say your company gets acquired, and the artifactory base URL has to change from (The
Yes, I've been imagining something like this since some folks described a similar need in Masterminds/glide#372. A script would probably work well; my plan was that it would take either a project root or a URL (so, If we went that route, it would be transparent to the manifest, and the onus would be on the implementor of the script to ensure that the mirror is properly reflecting the data from upstream. Failure to do that would compromise the portability of the lock outside of the alternate sourcing universe imposed by the script...which is maybe fine in some cases, but has its risks :)
Sure - my thinking on these topics is all pretty jumbled up together, so if you can see a way to tease out more specific issues, please do! |
Just bumping this to see if you've had any more thoughts on separating out some issues, @zevdg 😄 |
I'm glad I'm not the only one who has thought about these things :)
I had this thought maybe 8 months ago but went away from it for reasons I can't quite remember. I think I imagined it would be inconvenient to use a different import url. But looking at it now I see more benefits than trade-offs.
One minor thing that this doesn't solve as far as I can tell is when you want to vendor for the first time with
I can understand wanting to prevent users from misusing a feature like this, but if this would be the only way to get dependencies from gitlab, then it would be better than |
Ok, so my main concerns around rewrite rules for alternate sources seem to be much less pressing due to the paradigm shift that seems to be happening around #213 . If the manifests will not be edited by dep, then we can just make our own manifest generation tooling that writes to the manifest and not have to worry about conflicting with or hooking into dep. If implemented, #221 could help ensure this tooling is installed correctly. The UX could be improved by making some sort of source rewrite rules a first class citizen in dep, but it's a less pressing issue now and it seems like a contentious feature that could be added in a later version of dep. I'll open a ticket for reference and discussion around this issue. The other main issue is authentication. Even though github has the right go-import meta tags, they are not accessible for any repository (public or private) in my company's GHE necessitating our vanity import server. If I could configure dep with a github access token or with a git-like credential manager, that wouldn't be necessary. To be clear though, AFAIK, this isn't just an issue for GHE, but also for private repositories on regular old github.com We could continue to use this issue for that, but it's now pretty cluttered with other stuff so I'll open a separate issue for that as well. |
Giving dep v0.1.0 a try with an existing work project. Since the project is hosted on GitHub Enterprise, I knew it wasn't going to work perfectly given this issue and #286. When running
In this case myrepo is a reusable package I wrote. There's nothing tricky about it. This is the remote for that repository:
There is a 1:1 mapping here -- it's just Our previous tool was However, I was able to work around that issue by manually dumping files in |
Just noticed I wasn't completely honest in my last comment. There is something a bit tricky.
->
That could explain it. Of course there's no way of really knowing where the |
If I change my import paths to:
I saw the following message with
But then the next time I ran Not a big fan of having to put |
Looking at private mirroring of our dependencies currently. So we have the ability to fetch a dependency from a different location via This seems mostly sufficient, just mirror the repositories you want to host internally and I'm assuming everything will work out fine. I have a question about this workflow, though, regarding transitive dependencies. Let's say we want to add
Let's say we know we are going to pull this dependency in and we mirror it internally. So what should happen if we do
Any thoughts on what this workflow ought to look like? |
Good questions. Though I'm starting to wonder if we could split up this issue? Personally I'd just like a way to support GitHub Enterprise for dependencies written within the organization. Transitively mirroring third-party deps is absolutely an important feature, but not one I need where I'm at. I just don't want one to slow down the implementation of the other... and it becomes a bit difficult to follow the conversation when the issue covers too many different topics. |
@nathany I'm happy to move/split however maintainers see fit. The question seemed to fit under "private repository" usage patterns to me. As an update though on actually trying this out:
Appears I would need to setup a page to serve the correct metadata? Modifying [[constraint]]
branch = "master"
name = "golang.org/x/sys"
+ source = "http://git.internal.org/external/golang.org/x/sys.git"
As I suspected though transitive dependencies continue to be pulled from their original location which is not what we (or presumably other organizations) would want when trying to vendor all third party dependencies. @sdboyer I am inclined to agree that perhaps this part of the issue ought to be split but I'm not particularly sure how you want it organized (or rather what even the right question ought to be). |
@aajtodd Won't an [[override]]
branch = "master"
name = "golang.org/x/sys"
source = "http://git.internal.org/external/golang.org/x/sys.git" This should ensure that |
@wadelee1986 not relevant to this issue - please make a new issue. @justinfx an explanation for why adding |
@morganhein but isn't |
Not sure about git:// but SSH to GitHub is pretty secure. A comparison here: https://gist.github.com/grawity/4392747 |
If using github.myenterprise.com, it's not entirely clear to me how Git URL rewriting would satisfy dep, go get, or other tools in the Go ecosystem. Has anyone got this to work, and have an example to share? https://blog.devzero.com/2014/08/29/useful-git-commands-url-rewriting/ |
@nathany The problem is most likely that your github enterprise installation doesn't expose the url In this case It would be great if Github Enterprise could just expose the go metadata without protecting it behind an auth page since there are no secrets in the metadata. |
@nathany As for other tools, Glide works correctly in this situation. You just point your dependency at any git repository and it will do the right thing, no matter what package path you use in your source code. @mikkeloscar or if dep would just do the ssh dance like Glide does. |
It would be nice if dep did the ssh dance 💃 🕺 @dsvensson mentioned, or if GitHub implemented what @mikkeloscar suggests. I'd rather not move work projects from gvt to glide to dep -- and I don't imagine my co-workers would be super happy about switching tools multiple times either. Plus I'd like to get more experience using dep in "real world" scenarios. |
this is really just about #860 - if we were to trust the source declaration fully, it would obviate the need for the HTTP |
@nathany @mikkeloscar git URL rewriting should work fine with go get and dep to clone private repos. I've used the below for the past couple years with no issues. It seems like you're actually running into http://grokbase.com/t/gg/golang-nuts/13c5jx3g79/go-nuts-problem-with-go-get-with-github-enterprise-repo perhaps you can further rewrite the URL to append the
This works because |
It works for github.com because it's hardcoded in |
Ha, I just noticed this myself as you posted. Here's the link in the tools repo fwiw. Further discussion here https://groups.google.com/forum/m/#!msg/golang-nuts/AURCoVLjNyc/2Uw7A_-LRfQJ seems to imply that suffixing your imports with Pretty sure I use rewriting with GitLab instances which I'd expect to have the same problem. Beginning to doubt that now though. |
to clarify - while the |
Thanks for clarifying, I misunderstood your previous comment and did get that impression. Having now also read 830.
This seems like the way to do it, the "ping" seems to create more problems while just letting the vcs error on a bad url seems manageable. |
I've tried many workarounds but none of them worked.
Git clone and go get work as expected.
|
I believe I have a solution for private repos with ssh access, see PR #1717 |
#1759 - this is a quick sketch of a local registry implementation that doesn't require the editing of any import paths. Let me know if this is useful. |
This is the workaround in our team:
If one project is likely to be imported, clone it with the $ git clone [email protected]:chehsunliu/example.git $GOPATH/src/github.mycompany.com/chehsunliu/example.git Otherwise, it would be quite painful to rename every import statement in the project. |
A little sidenote for clarification / addition: meaning of .git and where it is mandatory
So the important point here is that you do not need .git in the package name; you need it in the source. Since @chehsunliu is already importing using .git, this constraint to "remap" name to source is no longer needed. That said, you could also import the package without .git and use the constraint above .. but this would need a constraint per internal project to remap you non global git url rewrite You can fix a lot more with git URL rewrites, even paths and path remapping: if the anchor of your repos is, e.g., /scm/whatever/project, you could remap that to https://mycompany.com/whatever/project, so that imports are sourced from the git URLs
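For anyone unsure what a git URL rewrite actually does, the sketch below mimics git's `url.<base>.insteadOf` setting in plain Go: a URL that starts with an `insteadOf` prefix has that prefix replaced by the base, and when several prefixes match, the longest one wins. The function name and the sample rule are illustrative assumptions. In git itself, a rule like the first one would be set with something like `git config --global url."git@github.mycompany.com:".insteadOf "https://github.mycompany.com/"`.

```go
package main

import (
	"fmt"
	"strings"
)

// rewriteURL applies insteadOf-style rules to url. The map key is the prefix
// to match (git's insteadOf value) and the map value is the replacement
// (git's url.<base>). The longest matching prefix wins, as in git.
func rewriteURL(url string, insteadOf map[string]string) string {
	best := ""
	for prefix := range insteadOf {
		if strings.HasPrefix(url, prefix) && len(prefix) > len(best) {
			best = prefix
		}
	}
	if best == "" {
		return url // no rule applies
	}
	return insteadOf[best] + strings.TrimPrefix(url, best)
}

func main() {
	rules := map[string]string{
		// Hypothetical rule: fetch over SSH instead of HTTPS.
		"https://github.mycompany.com/": "git@github.mycompany.com:",
	}
	fmt.Println(rewriteURL("https://github.mycompany.com/chehsunliu/example.git", rules))
	fmt.Println(rewriteURL("https://github.com/pkg/errors", rules))
}
```

Note that the rewrite only changes where git fetches from. It does nothing for the earlier step in which the tool has to work out the repository root from the import path, which is why the `.git` suffix and `source` discussion above still matters.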
With a combination of some of the above answers, I ended up using Docker to get dependencies and build. Hopefully this workaround will help someone:
|
I entirely gave up on gitconfig:
Where
Where it then uses your ssh On top of that, it is just less hassle, no |
I didn't see this addressed in any of the docs, so I'll raise the question here.
Have you considered support for private repositories?
Many of the existing vendor tools do not handle private repositories very well or at all. godep and govendor can sort of handle it because they get the dependencies from your $GOPATH. But existing tools that fetch directly from the remote do not work with private repos in my experience (if someone knows any tools that do work, please let me know :) ).
It would be great if dep could break this trend and work well with private repositories from the start.
I'm probably not covering all use cases, but these are the use cases I would like to see solved.
From my point of view this could be as simple as introducing an environment variable which maps your private repo URL to a certain protocol. E.g.:
Which means whenever dep encounters an import path which starts with my.ghe.com it will automatically choose git+ssh when fetching the repo. And since I have already configured an ssh key on my machine it should know how to authenticate.
Similarly when it encounters an import path with my.ghe2.com it will choose git+https and ask me for username+password or just work if I have configured what token to use in .gitconfig.
I'm sure someone could think of a better name for the environment variable and a better format, but I hope you get my point.
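One possible shape for such a mapping, purely as a sketch: the variable name `DEP_REPO_PROTOCOLS` and the `host=protocol,host=protocol` format below are invented for illustration and are neither the format from the original example nor anything dep implements. The code parses the value and looks up the protocol for an import path by its host.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// parseProtocolMap parses a hypothetical "host=protocol,host=protocol"
// value into a host -> protocol map.
func parseProtocolMap(value string) (map[string]string, error) {
	m := make(map[string]string)
	if strings.TrimSpace(value) == "" {
		return m, nil
	}
	for _, entry := range strings.Split(value, ",") {
		host, proto, ok := strings.Cut(strings.TrimSpace(entry), "=")
		if !ok || host == "" || proto == "" {
			return nil, fmt.Errorf("malformed entry %q, want host=protocol", entry)
		}
		m[host] = proto
	}
	return m, nil
}

// protocolFor returns the protocol configured for the host of importPath.
func protocolFor(importPath string, m map[string]string) (string, bool) {
	host, _, _ := strings.Cut(importPath, "/")
	proto, ok := m[host]
	return proto, ok
}

func main() {
	// DEP_REPO_PROTOCOLS is a made-up name; fall back to a sample value.
	value := os.Getenv("DEP_REPO_PROTOCOLS")
	if value == "" {
		value = "my.ghe.com=git+ssh,my.ghe2.com=git+https"
	}
	m, err := parseProtocolMap(value)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for _, path := range []string{"my.ghe.com/team/lib", "my.ghe2.com/team/tool", "github.com/pkg/errors"} {
		proto, ok := protocolFor(path, m)
		fmt.Printf("%s -> %q (configured: %v)\n", path, proto, ok)
	}
}
```

Paths whose host has no entry would fall through to whatever default deduction the tool already performs.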