image-layout: add an initial image layout spec #94

Merged 1 commit on Jun 9, 2016.

62 changes: 62 additions & 0 deletions image-layout.md
@@ -0,0 +1,62 @@
## Open Container Initiative Image Layout Specification

The OCI Image Layout is a slash-separated layout of OCI content-addressable blobs and [location-addressable](https://en.wikipedia.org/wiki/Content-addressable_storage#Content-addressed_vs._location-addressed) references (refs).
This layout MAY be used with a variety of transport mechanisms: archive formats (e.g. tar, zip), shared filesystem environments (e.g. nfs), or networked file fetching (e.g. http, ftp, rsync).
Given an image layout, a tool can convert a given ref into a runnable OCI Image Format by finding an appropriate manifest from the manifest list, unpacking the filesystem serializations in the correct order, and then converting the image configuration into an OCI Runtime config.json.

Contributor

A few requirements we may want to lay out:

  1. No symbolic linking. All linking is through file contents.
  2. File object names should be restricted to a simple character set that is supported across a wide variety of protocols.

Contributor Author

Why is symlinking bad?

And I think the set {[a-z], [0-9], ':', '/'} is OK for almost all things.

Contributor Author

added character limit

The image layout has two top level directories:

- "blobs" contains content-addressable blobs. A blob has no schema and should be considered opaque.
- "refs" contains descriptors pointing to an image manifest list.

It also contains a file that is used to identify the layout version:

- "oci-layout" MUST contain a JSON object with a version field `{"imageLayoutVersion": "1.0.0"}` and MAY include additional fields.
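
As a rough illustration of the version check described above, a consumer might read the `oci-layout` file before touching anything else. This is only a sketch: the function name `read_layout_version` is hypothetical, and the only field it relies on is `imageLayoutVersion` from the text above.

```python
import json
from pathlib import Path


def read_layout_version(layout_root):
    """Return the imageLayoutVersion string from <layout_root>/oci-layout.

    Hypothetical helper for illustration; not part of any OCI tooling.
    """
    marker = Path(layout_root) / "oci-layout"
    with marker.open(encoding="utf-8") as f:
        data = json.load(f)
    if not isinstance(data, dict) or "imageLayoutVersion" not in data:
        raise ValueError("oci-layout must be a JSON object with imageLayoutVersion")
    # Additional fields are permitted by the text above, so they are ignored here.
    return data["imageLayoutVersion"]
```

A caller could compare the returned string against the versions it understands and refuse to proceed otherwise.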

This is an example image layout:

```
$ cd example.com/app/
$ find .
.
./blobs
./blobs/sha256-afff3924849e458c5ef237db5f89539274d5e609db5db935ed3959c90f1f2d51
./blobs/sha256-5b0bcabd1ed22e9fb1310cf6c2dec7cdef19f0ad69efa1f392e94a4333501270
./blobs/sha256-e692418e4cbaf90ca69d05a66403747baa33ee08806650b51fab815ad7fc331f
Member @vbatts, May 31, 2016

I would break this down further. In the event of a lot of objects, this would be a mess.
./blobs/sha256/af/ff3924849e458c5ef237db5f89539274d5e609db5db935ed3959c90f1f2d51
or even just
./blobs/sha256/afff3924849e458c5ef237db5f89539274d5e609db5db935ed3959c90f1f2d51

Contributor

On Tue, May 31, 2016 at 02:26:40PM -0700, Vincent Batts wrote:

> I would break this down further. In the event of a lot of objects, this would be a mess.
> ./blobs/sha256/af/ff3924849e458c5ef237db5f89539274d5e609db5db935ed3959c90f1f2d51
> or even just
> ./blobs/sha256/afff3924849e458c5ef237db5f89539274d5e609db5db935ed3959c90f1f2d51

How many layers do you expect to put in a single lump? I think this
is premature optimization that should be punted to later versions ;).
And a performant file format containing lots of blobs is probably
going to have to abandon tar at some point for a binary format like
1, although then you lose the transport-agnosticism of the current
format.

Contributor Author

For others: the reason things like git and diskv (used in rkt) do this is because POSIX filesystems like ext4 just suck at having 32,000+ dentries. And some CLI tools like ls can get grumpy too. This generally isn't applicable to things like HTTP, s3, ceph, etc.

To fix this we can transform sha256-afff3924849e458c5ef237db5f89539274d5e609db5db935ed3959c90f1f2d51 to sha256/af/ff3924849e458c5ef237db5f89539274d5e609db5db935ed3959c90f1f2d51

I am OK with this change. @stevvooe @wking ?

Contributor

On Tue, May 31, 2016 at 02:43:32PM -0700, Brandon Philips wrote:

> To fix this we can transform
> sha256-afff3924849e458c5ef237db5f89539274d5e609db5db935ed3959c90f1f2d51
> to
> sha256/af/ff3924849e458c5ef237db5f89539274d5e609db5db935ed3959c90f1f2d51
>
> I am OK with this change. @stevvooe @wking ?

I'm not intimately familiar with tar, but tar(5) does not talk about
indexes or other super-record structures. So I don't think this
change provides a useful speedup for tarred layouts. This means out
of our available transport mechanisms, we're optimizing for only one
(local filesystem access), and then for a subset of filesystems.
Until someone says “oh man, image $NAME really drags without
filename-based blob sharding”, I think we should leave the blob
directory as it stands 1.

Contributor Author

Looking at the performance on ext4, I am OK punting on this until image layout version 2. cc @vbatts ?

Member

I don't think it's that clean, but we can punt.

Contributor

Sorry, I missed this comment. ext4 is okay but many file systems do still struggle without the prefixed hash directories. Partitioning can help with remote file systems, as well.

There are also cases where you want to partition based on algorithm, which is easier if you can just pick up the whole directory, rather than match a file prefix.

Contributor

On Tue, Jun 07, 2016 at 09:21:52PM -0700, Stephen Day wrote:

> ext4 is okay but many file systems do still struggle without the
> prefixed hash directories…

Under blob counts that you'd still consider stuffing into a single
tarball? I'd still like to see someone post a link to a real-life
tarball that uses this layout to motivate any optimizations like this,
since the tar-layer approach makes me think it unlikely that you have
more than a few dozen blobs in a given layout.

> There are also cases where you want to partition based on algorithm…

Can you point to docs on those cases?

Contributor Author

@stevvooe What operations on a remote filesystem? Do you have benchmarks?

100% agree this feels like a premature optimization @wking.

./oci-layout
./refs
./refs/v1.0
./refs/v1.1
```

Blobs are named by their contents:

```
$ shasum -a 256 ./blobs/sha256-afff3924849e458c5ef237db5f89539274d5e609db5db935ed3959c90f1f2d51
afff3924849e458c5ef237db5f89539274d5e609db5db935ed3959c90f1f2d51 ./blobs/sha256-afff3924849e458c5ef237db5f89539274d5e609db5db935ed3959c90f1f2d51
```
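
The same check can be sketched in code. The helper below is hypothetical (the name `blob_matches_name` is not from the spec) and assumes the flat `sha256-<hex>` file naming shown in the example layout.

```python
import hashlib
from pathlib import Path


def blob_matches_name(blob_path):
    """Return True if a blob file named 'sha256-<hex>' hashes to <hex>.

    Hypothetical helper; assumes the 'sha256-<hex>' naming from the example.
    """
    path = Path(blob_path)
    algorithm, _, expected_hex = path.name.partition("-")
    if algorithm != "sha256":
        raise ValueError(f"unsupported algorithm in blob name: {path.name}")
    digest = hashlib.sha256()
    with path.open("rb") as f:
        # Read in chunks so large layer blobs are not loaded into memory at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex
```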

Object names in the refs and blobs MUST NOT include characters outside of the set of "A" to "Z", "a" to "z", "0" to "9", the hyphen `-`, the dot `.`, and the underscore `_`.
In digests, the colon `:` that follows the hash algorithm identifier will be converted to the hyphen `-`.
For example, `sha256:5b` will map to the layout `blobs/sha256-5b`.
The blobs directory MAY contain blobs which are not referenced by any of the refs.
The blobs directory MAY be missing referenced blobs, in which case the missing blobs SHOULD be fulfilled by an external blob store.
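
A minimal sketch of the naming rules above, under one loudly stated assumption: digits are treated as allowed characters, because the example blob names and refs (such as `v1.0`) contain them. Both function names are hypothetical.

```python
import re

# Assumption: digits are permitted alongside letters, '-', '.', and '_',
# since the example blob names and refs in this document contain digits.
_ALLOWED_NAME = re.compile(r"[A-Za-z0-9._-]+")


def is_valid_object_name(name):
    """Return True if every character of name is in the allowed set."""
    return _ALLOWED_NAME.fullmatch(name) is not None


def blob_path_for_digest(digest):
    """Map a digest such as 'sha256:5b' to the layout path 'blobs/sha256-5b'."""
    algorithm, sep, hex_value = digest.partition(":")
    if not sep or not algorithm or not hex_value:
        raise ValueError(f"expected '<algorithm>:<hex>', got {digest!r}")
    name = f"{algorithm}-{hex_value}"
    if not is_valid_object_name(name):
        raise ValueError(f"object name has disallowed characters: {name!r}")
    return f"blobs/{name}"
```

Note that a name containing `:` or `/` is rejected, which is why the colon has to be rewritten before the digest can be used as a file name.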

Each object in the refs subdirectory MUST be of type `application/vnd.oci.descriptor.v1+json`.
In general the `mediatype` of this descriptor object will be either `application/vnd.oci.image.manifest.list.v1+json` or `application/vnd.oci.image.manifest.v1+json` although future versions of the spec MAY use a different mediatype.
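
A consumer might parse a ref and note whether its media type is one of the two named above. This is a sketch only: the field names `size`, `digest`, and `mediatype` are taken from the example ref in this document, and `parse_ref_descriptor` is a hypothetical name. Unknown media types are reported rather than rejected, since other values are anticipated.

```python
import json

KNOWN_MEDIA_TYPES = {
    "application/vnd.oci.image.manifest.list.v1+json",
    "application/vnd.oci.image.manifest.v1+json",
}


def parse_ref_descriptor(text):
    """Parse a ref's JSON and return (descriptor, is_known_media_type).

    Field names follow the example ref in this document (an assumption).
    """
    descriptor = json.loads(text)
    for field in ("size", "digest", "mediatype"):
        if field not in descriptor:
            raise ValueError(f"descriptor is missing required field: {field}")
    return descriptor, descriptor["mediatype"] in KNOWN_MEDIA_TYPES
```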
Contributor

I'd drop the “MAY” → “may”, since the spec is about imposing constraints on implementations and consumers/producers. We don't want a chicken-and-egg constraint on the spec itself.


This illustrates the expected contents of a given ref and the manifest list it points to.

```
Member

json

Member

well.. shell ... hrm

Contributor Author

see #92

$ cat ./refs/v1.0
{"size": 4096, "digest": "sha256:afff3924849e458c5ef237db5f89539274d5e609db5db935ed3959c90f1f2d51", "mediatype": "application/vnd.oci.image.manifest.list.v1+json"}
Member

will this doc have a struct and schema somewhere too?

Contributor Author

see #92

Member

So TODO plug the descriptor stuffs here.

```
```
$ cat ./blobs/sha256-afff3924849e458c5ef237db5f89539274d5e609db5db935ed3959c90f1f2d51
{
"schemaVersion": 2,
"mediaType": "application/vnd.oci.image.manifest.list.v1+json",
"manifests": [
{
...
```
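
The two `cat` steps above can be sketched as one lookup: read the ref, map its digest to a blob file, verify size and digest, then parse the blob. The function name `load_ref_target` is hypothetical, and the sketch assumes the flat `blobs/<algorithm>-<hex>` naming and the `size`/`digest` field names from the examples. It simply raises when a blob is missing, whereas the text above says missing blobs SHOULD be fulfilled by an external blob store.

```python
import hashlib
import json
from pathlib import Path


def load_ref_target(layout_root, ref_name):
    """Follow refs/<ref_name> to its blob and return the parsed JSON.

    Hypothetical helper; assumes the blob holds JSON (a manifest or manifest list).
    """
    root = Path(layout_root)
    descriptor = json.loads((root / "refs" / ref_name).read_text(encoding="utf-8"))
    digest = descriptor["digest"]
    # 'sha256:<hex>' is stored as 'blobs/sha256-<hex>' in the layout.
    blob_path = root / "blobs" / digest.replace(":", "-", 1)
    if not blob_path.exists():
        # An external blob store could be consulted here instead.
        raise FileNotFoundError(f"blob not present in layout: {digest}")
    data = blob_path.read_bytes()
    if len(data) != descriptor["size"]:
        raise ValueError("blob size does not match descriptor")
    algorithm, _, expected_hex = digest.partition(":")
    if hashlib.new(algorithm, data).hexdigest() != expected_hex:
        raise ValueError("blob content does not match descriptor digest")
    return json.loads(data)
```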