Working group for Reference Spec #114
We discussed this during the OCI meeting at KubeCon today and decided it makes more sense to add this to image-spec instead of a separate repo. I'm closing this and we can coordinate with PRs to image-spec.
Reopening to continue the discussion from the mailing list: https://groups.google.com/a/opencontainers.org/g/dev/c/PpzsQMOnda4
Thanks for reopening this @sudo-bmitch. I think clarifying the meaning of a reference in distribution and the OCI layout is valuable: how that regex gets parsed varies with context and language choice, and it would be easier with a clear definition in image-spec (on which distribution indirectly depends anyway). E.g. this write-up (which doesn't include the regex): https://oras.land/docs/concepts/reference
cc @lumjjb
There's not a lot in the minutes explaining the decision to include it in image-spec, can anyone remember the reasoning? Personally, I think defining references separately makes sense. The image spec as written is very cleanly encapsulated around what an image is and not where an image is. It also gives more flexibility for the future, if we need to refer to things other than images. Putting it in the image spec would also tie up the lifecycles, and I'm not sure that's a good thing.
I believe @imjasonh weighed in on that. The impression I got was that creating a new spec is more effort than extending an existing one, and they felt that image spec was the closest fit. Given that I originally opened this to create a new spec myself, I tend to agree with you that it doesn't feel like a great fit.
We've muddied the definition, particularly with defining the usage for artifacts, and the image layout. The spec also uses "image" for a lot of things that aren't necessarily an image, like the image index.
Linking my thoughts on what the spec could include here: https://github.com/Jamstah/reference-spec/blob/main/spec.md
I think such a codification of existing practice would make sense. (I’m a tiny bit worried about a separate spec working group finding opportunities to invent new features with little previous precedent.)
(As the maintainer of the linked Skopeo tool), at best as a warning. Strings are notoriously ambiguous. IMHO every string field / syntax should strictly define whether it is an “image reference” in the sense of the rest of that text, or a transport […]. E.g. consider […]. Accepting both syntaxes in the same field forces the consumer to resort to heuristics to disambiguate. And while it would certainly be more useful for every software to use the same heuristics, it seems to me to be very hard to design and standardize heuristics that will stand the test of time, as other registry features, and new c/image transports (or transports in other image codebases?), are added.
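To make the ambiguity concern concrete, here is a minimal Go sketch. It assumes a deliberately simplified short-reference pattern (not the full distribution/reference grammar) and a small, illustrative set of transport prefixes. `Readings`, `shortRef`, and `transports` are hypothetical names, and this is not how Skopeo or c/image decides anything.

```go
package main

import (
	"regexp"
	"strings"
)

// shortRef is a simplified pattern for "name[:tag]" references.
// Assumption: it only approximates the real grammar for illustration.
var shortRef = regexp.MustCompile(
	`^[a-z0-9]+(?:[._-][a-z0-9]+)*(?:/[a-z0-9]+(?:[._-][a-z0-9]+)*)*` +
		`(?::[A-Za-z0-9_][A-Za-z0-9_.-]{0,127})?$`)

// transports is an illustrative subset of "transport:" prefixes.
var transports = map[string]bool{"docker": true, "oci": true, "dir": true}

// Readings lists every interpretation under which s is acceptable.
// More than one entry means a consumer would need a heuristic to choose.
func Readings(s string) []string {
	var out []string
	if shortRef.MatchString(s) {
		out = append(out, "image reference")
	}
	if prefix, _, ok := strings.Cut(s, ":"); ok && transports[prefix] {
		out = append(out, "transport:"+prefix)
	}
	return out
}
```

Under these assumptions, `Readings("oci:latest")` returns both readings: an image named `oci` with tag `latest`, or an `oci` transport pointing at a path called `latest`. A field that accepts both syntaxes cannot tell them apart without an extra rule.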
Signed-off-by: Brandon Mitchell <[email protected]>
Kubernetes contributor here. We would benefit significantly from having this spec. We're currently dealing with a proposal to improve our validation support for image references (kubernetes/kubernetes#130834) and it's quite risky to offer guarantees on our API surface without an actual spec.
Note that this PR is currently held in draft state waiting on a sufficient number of volunteers to be proposed owners and stakeholders in the working group.
Do volunteers need any particular credential? I'm happy to be involved.
Mostly a willingness to roll up their sleeves and do the work. Submit a review with your name/project to add if you have the bandwidth. See the referrers WG for an example of what it took to push through a large change. The auth WG hit a wall when it came time to write the spec changes, and I still need to spend a lot of time doing the heavy lifting over there. I think we had more people interested in seeing the result than we had people with the time to lead the effort. So I'm holding off opening up new working groups where I'd end up being the only author.
The success of that WG is also not universally agreed upon (I'd say there's at least as much cautionary tale there as blueprint, if not more).
Signed-off-by: Brandon Mitchell <[email protected]>
* **additional volunteers needed**

## Stakeholders

OCI Projects, non-OCI projects, or organizations sponsoring the working group and participating in the implementation and use case validation of the work done by the group.

* distribution
* regclient
* **additional volunteers needed**
Can containerd, nerdctl, Moby, and BuildKit be added here?
cc @dmcgowan @tonistiigi @thaJeztah
Is that a question to your co-maintainers or a request for me to add it?
Both, but kinda the latter one
Signed-off-by: Brandon Mitchell <[email protected]>
The goal here is essentially to take the Grammar from https://pkg.go.dev/github.com/distribution/[email protected]#pkg-overview and write slightly more words around it, especially describing the `docker.io`, `docker.io/library/`, and `:latest` defaults, right?

(Perhaps to be even more clear, if this working group ends up designing something new, I think we've missed the mark.)

In other words, I'm not strictly opposed to this working group, but I don't think "extension" should be explicitly part of the initial charter. I also think this "spec" probably belongs in either the existing image-spec or distribution-spec. It's got a little bit of overlap between them, but that's the nature of those two specs already, and IMO this probably falls more on the distribution-spec than the image (as these "references" are describing objects as referenced via that API).
Adding a proper "review" so it's clear I'm -1 on this WG as written (but +1 on the high-level intent of having spec-language for "references"); I've included some more explicit review comments here, but they are not exhaustive (because I don't want to repeat myself too heavily).

References are a string that is used by runtimes and other OCI registry clients to retrieve a container image.
They are currently a convention, used by many clients, adopted from Docker and implemented in distribution/distribution.
This WG seeks to define the syntax and parsing of a reference as an OCI spec.
"define" is a loaded word for me here, given the pre-existing nature of the relevant format -- would prefer a word like "document"
Also, "in the OCI" instead of "as an OCI spec" (because as I've noted prior, I don't think this should be a new spec, and declaring that it will be a new spec right off the bat in the WG proposal is definitely premature).
- This WG seeks to define the syntax and parsing of a reference as an OCI spec.
+ This WG seeks to document the syntax and parsing of a reference within the OCI.

or

- This WG seeks to define the syntax and parsing of a reference as an OCI spec.
+ This WG seeks to spec the syntax and parsing of a reference within the OCI.
* Specify the syntax for references to an [OCI Image Layout](https://github.com/opencontainers/image-spec/blob/7b36cea86235157d78528944cb94c3323ee0905c/image-layout.md) manifest, including support for a tag, digest, or a full registry/repository name.
* Define how the syntax can be extended to support other use cases, including:
  * Alternate layer formats
  * Immutable tags
  * Selecting an artifact that refers to another manifest (e.g. signature or SBOM)
  * Querying content from a pull through cache serving multiple registries (see [distribution-spec PR #66](https://github.com/opencontainers/distribution-spec/pull/66))
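For context on the OCI Image Layout item in the list above: the image-layout document already lets descriptors in `index.json` carry an `org.opencontainers.image.ref.name` annotation. The following Go sketch shows how a tag-like or digest-like reference might be resolved against that file. The `ResolveInLayout` helper and its types are hypothetical, and matching by either annotation or digest is an assumption about what a layout reference could mean, not something the proposal defines.

```go
package main

import (
	"encoding/json"
	"errors"
)

// refNameAnnotation is the annotation key defined in the OCI image-spec.
const refNameAnnotation = "org.opencontainers.image.ref.name"

// descriptor holds the subset of descriptor fields used in this sketch.
type descriptor struct {
	MediaType   string            `json:"mediaType"`
	Digest      string            `json:"digest"`
	Size        int64             `json:"size"`
	Annotations map[string]string `json:"annotations,omitempty"`
}

// index holds the subset of index.json fields used in this sketch.
type index struct {
	SchemaVersion int          `json:"schemaVersion"`
	Manifests     []descriptor `json:"manifests"`
}

// ResolveInLayout returns the first descriptor whose ref.name annotation
// or digest equals ref. Assumption: a real spec may choose other rules.
func ResolveInLayout(indexJSON []byte, ref string) (descriptor, error) {
	if ref == "" {
		return descriptor{}, errors.New("empty reference")
	}
	var idx index
	if err := json.Unmarshal(indexJSON, &idx); err != nil {
		return descriptor{}, err
	}
	for _, d := range idx.Manifests {
		if d.Annotations[refNameAnnotation] == ref || d.Digest == ref {
			return d, nil
		}
	}
	return descriptor{}, errors.New("reference not found in layout: " + ref)
}
```

The sketch leaves open how a layout path and a full registry/repository name would be combined in one string, which is one of the questions the scope item raises.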
This is all what I'd describe as "Referrers 2.0" (the cautionary tales I referenced earlier) - stretch goals that look more like designing a new thing than documenting an existing thing, and I'm pretty strongly opposed to including these in the first pass of explicit spec language for "references" in the OCI. Let's start smaller and just have an explicit spec for the thing that currently exists and has seen wide adoption for ~10+ years, then we can talk about expansion possibilities with less fog.
I agree here. There are many places that could refer to a spec documenting the existing state but that would be, for reasons of backward compatibility with existing inputs, unable to adopt many plausible new syntaxes (there are only so many ASCII characters, and for readability, some are more attractive than others, so conflicts are likely).
The list above is also unclear about the target user — some seem to be relevant to running containers, some to building them, some for, I guess, managing a registry, some for… I don’t know, guessing, purely internal uses within a build system/registry?
Philosophically, there are many data transfer / RPC mechanisms that can be used to add such parameters to operations where the parameters make sense; and clear separation between data/metadata is essential. I strongly think that adding a field is almost always preferable to adding extra syntax within an existing string field.
The one exception is if the string field is so extremely widespread that adding a field everywhere we have the existing string is unfeasible. [E.g. we have ended up with the concept of a “slash-separated [0-terminated] file path” although that is, in many ways, flawed.] But, such a widespread use almost always implies that the specifics of the format are hard-coded in many places, and various extensions / other string formats assume that the format will not change.
I can imagine an effort to add features / drive a cross-vendor consensus for a single specific use case. E.g., maybe there is some feature that would be really valuable to have in K8s pod specs, and equivalent container image references in other “run a container from this image” orchestration systems. That is a reasonably bounded problem where clear benefits can be articulated.
I think adding features that could benefit various sort-of-related but rather different use cases (running / publishing / downloading / SBOMs / graph queries), where most of the proposed features would be irrelevant to any single user but the interoperability / backward compatibility concerns would affect many such users, seems unlikely to succeed.
> Let's start smaller and just have an explicit spec for the thing that currently exists and has seen wide adoption for ~10+ years, then we can talk about expansion possibilities with less fog.
If the thing specified cannot be extended, there will be no "then". This isn't defining the possible extensions, just making it possible for extensions to be created. The result is likely limiting the character set for various fields so that characters are available to mark future extensions.
> I strongly think that adding a field is almost always preferable to adding extra syntax within an existing string field.
I see a reference more as a URI than as a pointer to a tag on a registry. The tag pointer was already extended to allow for pointers to a digest. With referrers and people mounting OCI volumes, I can see a request for a reference to a referrer to an image. This is going to keep happening, e.g. with a pull through proxy that can proxy multiple registries, we will lose the ability to directly pull content from the proxy using the current reference syntax.
  * Immutable tags
  * Selecting an artifact that refers to another manifest (e.g. signature or SBOM)
  * Querying content from a pull through cache serving multiple registries (see [distribution-spec PR #66](https://github.com/opencontainers/distribution-spec/pull/66))
* Provide backwards compatibility by using the existing reference convention from [distribution/reference](https://github.com/distribution/reference) when feasible.
This increases my confusion -- we're trying to talk about "backwards compatibility" but we're explicitly trying to design a whole new thing?
We're defining something that can be an in-place upgrade for tooling that currently uses the distribution/reference strings. v1.0 may end up being the current state of distribution/reference, but v* should avoid changes that break users who depend on the current string syntax. New features should be additive in a way that can be distinguished from current use cases, without ambiguity.

## Out of Scope

* This WG will be limited to the creation of a new spec, and is not expected to impact other OCI specs.
As noted above, I strongly disagree with this conclusion, but also believe it's premature to decide that this should be a completely net-new spec.
The goal here is essentially to take the Grammar from https://pkg.go.dev/github.com/distribution/[email protected]#pkg-overview and write slightly more words around it, especially describing the `docker.io`, `docker.io/library/`, and `:latest` defaults, right?
This isn't very big, and I don't think we should encourage it to be.
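For readers unfamiliar with the defaults mentioned above, here is a minimal Go sketch of the convention as it is commonly described: a missing registry becomes `docker.io`, a single-component name on `docker.io` gains `library/`, and a name with neither tag nor digest gains `:latest`. This is a simplified illustration and not the distribution/reference implementation. `Normalize` and `splitDomain` are hypothetical names, and the sketch performs no grammar validation and ignores legacy aliases.

```go
package main

import "strings"

const (
	defaultDomain    = "docker.io"
	defaultNamespace = "library"
	defaultTag       = "latest"
)

// splitDomain separates an optional registry host from the repository path.
// The first component counts as a host only if it contains "." or ":" or
// is exactly "localhost"; otherwise the default registry is assumed.
func splitDomain(name string) (domain, path string) {
	i := strings.IndexRune(name, '/')
	if i == -1 || (!strings.ContainsAny(name[:i], ".:") && name[:i] != "localhost") {
		return defaultDomain, name
	}
	return name[:i], name[i+1:]
}

// Normalize expands a short reference into its fully qualified form.
// Assumption: the input is already syntactically valid.
func Normalize(ref string) string {
	// Split off "@digest" first, since a digest itself contains ":".
	name, digest := ref, ""
	if i := strings.IndexRune(ref, '@'); i != -1 {
		name, digest = ref[:i], ref[i:]
	}
	domain, path := splitDomain(name)
	// Any ":" left in the path introduces the tag (a port stays in domain).
	tag := ""
	if i := strings.LastIndex(path, ":"); i != -1 {
		path, tag = path[:i], path[i:]
	}
	if domain == defaultDomain && !strings.Contains(path, "/") {
		path = defaultNamespace + "/" + path
	}
	if tag == "" && digest == "" {
		tag = ":" + defaultTag
	}
	return domain + "/" + path + tag + digest
}
```

With these rules, `Normalize("alpine")` yields `docker.io/library/alpine:latest`, while a reference that already names a registry with a port and a digest passes through unchanged.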
I don't think the size of the work should determine whether it's a new spec, but rather whether it fits within the definition of an existing spec. Looking at image-spec:
> The OCI Image Format project creates and maintains the software shipping container image format spec (OCI Image Format).
I don't think we fit there. And for distribution-spec:
> The OCI Distribution Spec project defines an API protocol to facilitate and standardize the distribution of content.
We only fit there if we are deciding in advance to only allow references to content on a registry, and not to any other way of storing images, e.g. the OCI Image Layout.
I agree with this. It seems to me that a reasonable deliverable for the WG (if this really needs a WG) would be a PR against distribution-spec.
The community has gone back and forth on this a few times. It was originally proposed as a new spec, then there was a decision that it belonged as part of image-spec since it wasn't a registry API change, then it was back to making it a separate spec because image-spec should focus on the image JSON schemas and layer formats, and now it's over to distribution-spec, where I'd suggest it prematurely limits the scope so that a reference to anything outside of a registry is permanently excluded.
The goal with that section of the scope is to avoid defining a string that cannot be extended. There's a lot of external knowledge we can leverage for this, like the syntax of a URI, but that depends on ensuring character sets for various fields are limited from the start so that characters are available for future extensions. This doesn't mean those extensions need to be added as part of the working group, just that we acknowledge that groups will be doing this and avoid rushing into a quick decision that locks us into a syntax that can never be extended without breaking changes.
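As a rough illustration of the character-set point, the Go sketch below checks candidate marker characters against the tag grammar documented in distribution/reference (a word character followed by up to 127 word characters, dots, or dashes). The `FreeMarkers` helper is hypothetical. It only looks at the tag field; repository names and digests have their own alphabets, which a real analysis would also need to cover.

```go
package main

import "regexp"

// tagPattern mirrors the tag grammar documented by distribution/reference.
var tagPattern = regexp.MustCompile(`^[\w][\w.-]{0,127}$`)

// ValidTag reports whether tag is acceptable under the current convention.
func ValidTag(tag string) bool {
	return tagPattern.MatchString(tag)
}

// FreeMarkers returns the candidate characters that can never appear inside
// a valid tag today, so they could mark an extension without changing the
// meaning of any existing tag. Assumption: only the tag field is checked.
func FreeMarkers(candidates string) string {
	var free []rune
	for _, c := range candidates {
		// Test the candidate in a non-leading position of a tag.
		if !tagPattern.MatchString("a" + string(c)) {
			free = append(free, c)
		}
	}
	return string(free)
}
```

For example, `FreeMarkers("._-+?#!~")` returns `+?#!~`: the first three characters are already legal inside tags, so only the rest remain available in that field.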
I'd like to propose a new working group to create a `reference-spec` that would define a reference in OCI. At present, this is just a convention, originated by Docker, where `alpine` gets mapped to `docker.io/library/alpine:latest` (where `docker.io` isn't even the real registry name). There is currently an implementation for this in distribution/distribution that many have used. I believe it would be useful for OCI to formalize this, and even include a Go implementation.

Other use cases I can think of (or have seen):
This PR will remain in draft state until there are a sufficient number of owners and stakeholders that volunteer to work on the project.
Signed-off-by: Brandon Mitchell [email protected]