Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Integrate docker reference changes #1778

Merged

Conversation

dmcgowan
Copy link
Collaborator

@dmcgowan dmcgowan commented Jun 9, 2016

Allows having other parsers which are capable of unambiguously keeping hostname and name separated in a Reference type.

This is intended to address comments made for #1777 and lay the foundation for upstreaming changes from docker/docker. Leaving this as WIP so some of those changes can be included.

@codecov-io
Copy link

codecov-io commented Jun 9, 2016

Current coverage is 51.41% (diff: 72.36%)

Merging #1778 into master will decrease coverage by 9.82%

@@             master      #1778   diff @@
==========================================
  Files           129        129           
  Lines         11416      11531    +115   
  Methods           0          0           
  Messages          0          0           
  Branches          0          0           
==========================================
- Hits           6991       5929   -1062   
- Misses         3534       4851   +1317   
+ Partials        891        751    -140   

Powered by Codecov. Last update 534b155...429c75f

@stevvooe
Copy link
Collaborator

stevvooe commented Jun 9, 2016

@dmcgowan This is looking okay. Are there are few demonstrative test cases? For example, does the case from #1777 now pass?

@dmcgowan
Copy link
Collaborator Author

dmcgowan commented Jun 9, 2016

@stevvooe adding the downstream functionality now. The test case from #1777 must always fail unless we change the grammar. However by adding the downstream parsers it should be supported and able to be parsed in an unambiguous way.

@dmcgowan
Copy link
Collaborator Author

dmcgowan commented Jun 9, 2016

Added commit with functionality needed to replace github.com/docker/docker/reference. Another pass on the interface and naming should get us close to what we want.

Also considering adding an interface which has methods for Hostname() and Path() (or whatever the correct name for "Path" is). This would allow some simplification of methods to split out the values and make the functionality less reliant on type asserting an unexported type.

// Normalizer removes ambiguity in values to ensure
// parsing is done correctly and without partial valus.
type Normalizer interface {
Canonicalize(string) (string, error)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use of the term canonical here and CanonicalName is confusing with the idea of the Canonical reference. Not sure what would be a more appropriate word.

Also would it make more sense for this interface just to take in a string and do the Named parsing. Originally I wanted to leave it as a package function so it could possible access unexported values without trying the docker implementation to the package anymore. However that is not currently the case and the implementation is already accessing unexported types.

@dmcgowan dmcgowan changed the title [WIP] Split apart repository reference into hostname and name Split apart repository reference into hostname and name Jun 13, 2016
@dmcgowan
Copy link
Collaborator Author

Ping @stevvooe @aaronlehmann for feedback

@dmcgowan dmcgowan changed the title Split apart repository reference into hostname and name Integrate docker reference changes Jun 14, 2016
@dmcgowan
Copy link
Collaborator Author

Discussed more with Stephen and came to a few decisions...

  • Get rid of the normalizer interface, only one implementation of normalize
  • Normalize takes any name and produces the longest form ie redis -> docker.io/library/redis
  • Introduce a FamiliarName function which always produces a shortened version for UI, called familiar because this is intended to be the name that is familiar to users, not always shortened.
  • Rename Hostname to Authority as the first component of the Name may be a DNS name, it does not necessarily represent a host or the hostname of a registry. Rather the first component of a name is intended to represent the authority for that name for trust, authentication, and registry host resolution.
  • Add support for fully resolving string references which may be an ID identifier. When just an identifier is found with no name, a digested type with no name will be returned. Also have a version which could take in a shortened identifier and a digest.Set and returned a reference.
  • Refer to RemoteName as Path. A Name is then always made up of an Authority and a Path.
  • Add all existing helper functions used from docker/docker and docker/engine-api to guarantee that the reference package does not get forked.

I am going to start implementing these changes, we can have a discussion of their individual merit while I get these changes implemented. Ping @aaronlehmann @RichardScothern

@aaronlehmann
Copy link
Contributor

I like most of these changes.

Rename Hostname to Authority as the first component of the Name may be a DNS name, it does not necessarily represent a host or the hostname of a registry. Rather the first component of a name is intended to represent the authority for that name for trust, authentication, and registry host resolution.

I'm not sure I like the name Authority, but I'm not sure what would be a better choice.

Add support for fully resolving string references which may be an ID identifier. When just an identifier is found with no name, a digested type with no name will be returned

How will the grammar distinguish between IDs and names that look like IDs?

@dmcgowan
Copy link
Collaborator Author

How will the grammar distinguish between IDs and names that look like IDs?

I am not sure we can have a top level grammar to represent the rules for ID parsing as well as normalization. I would prefer to keep the grammar as is along with the existing Parse function and just add functions which can parsing IDs and normalization.

@stevvooe
Copy link
Collaborator

@aaronlehmann Authority works here because it is non-committal to the actual role and differentiates the term from namespace, which is too overloaded.

How will the grammar distinguish between IDs and names that look like IDs?

The role of the grammar to be to identify best effort structure. If a name may be an ID, its really up to the caller to run that throw the ID filter before treating it as a name. The main general drawback to this approach is that a secure process requires knowing the entire ID set.

@dmcgowan
Copy link
Collaborator Author

I would also consider Domain as an alternative to Authority, has a similar meaning but slightly more specific to its use.

@stevvooe
Copy link
Collaborator

Domain is fine, but it still has that DNS flavor...

@dmcgowan dmcgowan force-pushed the reference-with-split-hostname branch 2 times, most recently from b79674e to 3a2a544 Compare June 15, 2016 23:00
@dmcgowan
Copy link
Collaborator Author

PTAL, this now has additions to the grammar so I want everyone looking at it. Function naming may still still does not feel right (such as NormalizedName is similar to ParseNamed). Also I added helpers from engine-api but may get rid of the SplitName helper, it really does not add anything useful nor save many lines of code. Plus I only see the original version used twice in engine-api.


// SplitName is a helper function which parses the
// input as a name and returns name plus tag or digest
func SplitName(s string) (string, string, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't used internally, and I don't think we want to encourage outside callers to conflate digests and tags.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, I got it from here https://github.com/docker/engine-api/blob/master/types/reference/image_reference.go#L10 but even its use looks trivial. It would be better to be more explicit about parsing into a Named. It is possible for engine-api to use the normalized versions now as well.

@dmcgowan dmcgowan force-pushed the reference-with-split-hostname branch from 3a2a544 to dfe593b Compare June 15, 2016 23:34
@dmcgowan
Copy link
Collaborator Author

Changed WithDefaultTag to EnsureTagged and dropped last commit with changes from engine-api. I agree those changes are too specific to the api and we don't want them widely used.

@aaronlehmann
Copy link
Contributor

LGTM

1 similar comment
@stevvooe
Copy link
Collaborator

LGTM

@aaronlehmann
Copy link
Contributor

I suggest we hold off on merging this until we go through the exercise of porting Docker Engine to use it, to make sure we got the details right.

@dmcgowan
Copy link
Collaborator Author

@aaronlehmann that is fair, I will link a branch of Docker with the integration soon, no rush on merging this

@dmcgowan
Copy link
Collaborator Author

Made a PR in my fork for reviewing the changes dmcgowan/docker#27.

Noticeable changes:

  • "convenience" methods adding to RepositoryInfo, just call corresponding reference function with the Named value
  • "WithDefaultTag" has different behavior than EnsureTagged. EnsureTagged always adds a tag if not tagged while WithDefaultTag would not add the tag is the type was canonical. I prefer the more explicit change to only call EnsureTagged on a named only reference, but if that is not the consensus we can update the behavior.
  • ParseAnyReference causes slightly different behavior than the function it was replacing since it does explicitly return a Named type. While the return type is always Named or Digested I am not sure that logic translates well, perhaps another helper function would be better here.

@dmcgowan
Copy link
Collaborator Author

Added NormalizedNamed interface to allow referencing without ambiguity. Note that all underlying Named types implement the interface since even non-normalized types may be familiarized. Was thinking of also adding an IsNormalized function which just needs to ensure the domain is not empty. The caller still has the potential to call ParseNamed and type assert to NormalizedNamed but I don't see that as a being a problematic pattern since ParseNamed should only be called on fully normalized values. The docker update will only use ParseNormalizedNamed.

@aaronlehmann
Copy link
Contributor

I like the addition of the NormalizedNamed intrerface. I think it's the right direction because without it, there's no way to distinguish a normalized reference by type. For example in the first attempt at porting the Docker Engine codebase to this unified version of the reference library, I believe push/pull functions expected normalized references, but since the types were reference.Named, it would be easy to accidentally pass another type of named reference.

Allows having other parsers which are capable of unambiguously keeping domain and path separated in a Reference type.

Signed-off-by: Derek McGowan <[email protected]> (github: dmcgowan)
Signed-off-by: Derek McGowan <[email protected]> (github: dmcgowan)
Add normalization functions and Docker specific domain splitting to
reference package.

Signed-off-by: Derek McGowan <[email protected]> (github: dmcgowan)
Add interface for for a normalized name and corresponding parser for that type.
New normalized versions of all interfaces are not added since all type information is preserved on calls to Familiar.

Signed-off-by: Derek McGowan <[email protected]> (github: dmcgowan)
@dmcgowan dmcgowan force-pushed the reference-with-split-hostname branch from 6741755 to 429c75f Compare January 6, 2017 00:59
@dmcgowan
Copy link
Collaborator Author

dmcgowan commented Jan 9, 2017

Rebased

Added 2 helper functions to make easy to integrate into Docker

@aaronlehmann
Copy link
Contributor

Whoah, nice

@stevvooe
Copy link
Collaborator

stevvooe commented Jan 9, 2017

LGTM

legacyDefaultDomain = "index.docker.io"
defaultDomain = "docker.io"
defaultRepoPrefix = "library/"
defaultTag = "latest"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any thoughts on exporting these? If we don't, downstream code will likely need to redefine them.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer not to, ideally they are only used through functions and hidden from the callers. The places the default tag are used today should rather be using EnsureTagged. Other use is for logging, but the "default" tag is irrelevant to the logs, the useful information is that a reference was altered along with before/after.

Only other types used are here https://github.com/docker/docker/blob/master/registry/config.go#L258. That function will likely get replaced to use the parsers in this package to validate.

@aaronlehmann
Copy link
Contributor

LGTM

I think it would be a nice simplification to merge Named and NamedRepository. Named doesn't ever appear to be backed by anything that doesn't implement NamedRepository. Formalizing the fact that Named exposes Domain and Path methods would simplify the code a bit, avoiding some type assertions that use these methods if available and otherwise do parsing.

However, this would change the Named interface, and I think it's better not to do that yet. This could be a followup for after the docker and distribution reference packages are merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants