Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: introducing oci artifact registry impl #2989

Merged
merged 17 commits into from
Oct 11, 2024

Conversation

jonathanj-square
Copy link
Contributor

@jonathanj-square jonathanj-square commented Oct 4, 2024

fixes #2954

Introduces the OCI implementation of the module registry. The controller still uses the database module registry; so to exercise the change the ftl release commands have been introduced to demonstrate pushing module artifacts to the registry and discovering them (and metadata). The CLI implementations will change with the subsequent controller integration work.

- adding docker registry image
- adding docker registry image
- integration with CLI to demonstrate release publishing is WIP
@ftl-robot ftl-robot mentioned this pull request Oct 4, 2024
@stuartwdouglas stuartwdouglas added the run-all A PR with this label will run the full set of CI jobs in the PR rather than in the merge queue label Oct 4, 2024
docker-compose.yml Outdated Show resolved Hide resolved
@jonathanj-square jonathanj-square marked this pull request as ready for review October 9, 2024 20:04
@jonathanj-square jonathanj-square requested review from wesbillman and removed request for a team October 9, 2024 20:04
@jonathanj-square jonathanj-square changed the title feat: WIP introducing oci artifact registry impl feat: introducing oci artifact registry impl Oct 9, 2024
}

func (s *ContainerService) Upload(ctx context.Context, artefact Artefact) (sha256.SHA256, error) {
hash := sha256.Sum(artefact.Content)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't Artefact already have a sha256 sum?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right you are, using that instead

}
registry.PlainHTTP = true
result := make([]ArtefactRepository, 0)
err = registry.Repositories(ctx, "", func(repos []string) error {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this iterating over all repositories? That won't be scalable will it? eg. in Cash's cloud there will be thousands if not hundreds of thousands of repositories.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this operation is actually part of the OCI standard either. The underlying implementation queries "%s://%s/v2/_catalog" which is not part of the OCI distribution spec.

Either way there should be a single configured repository where we upload these artifacts, and we should not rely on the discovery APIs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created an issue to track fully spec'ing out the approach.

return fmt.Sprintf("%s/%s", host, createModuleRepositoryPathFromDigest(digest))
}

// getDigestFromModuleRepositoryPath extracts the digest from the module repository path; e.g. /ftl/modules/<digest>:latest
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be /ftl/modules/<modul>/<digest> no?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is this path actually defined BTW, I don't see it anywhere.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my comment above, but basically this should be a config option and not per module so we can re-use artifacts.

ModuleArtifactPrefix = "ftl/modules/"
)

type ContainerConfig struct {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI the typical pattern for config is to tag them with Kong tags, and embed them in the CLI so they can be automatically passed through from the CLI.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

}

type releaseListCmd struct {
Registry string `help:"Registry host:port" default:"127.0.0.1:5001"`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These all seem to be shared, just put them in releaseCmd and inject that into the Run() method as a parameter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

neat, updated

}
registry.PlainHTTP = true
result := make([]ArtefactRepository, 0)
err = registry.Repositories(ctx, "", func(repos []string) error {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this operation is actually part of the OCI standard either. The underlying implementation queries "%s://%s/v2/_catalog" which is not part of the OCI distribution spec.

Either way there should be a single configured repository where we upload these artifacts, and we should not rely on the discovery APIs.


func (s *ContainerService) Upload(ctx context.Context, artefact Artefact) (sha256.SHA256, error) {
hash := sha256.Sum(artefact.Content)
ref := fmt.Sprintf("ftl/modules/%s", hash)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be a config option. Basically the general location format should probably be something like <host>/<configured-repo>:ftl-artifact-<digest>.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The design doc should probably be updated to cover exactly how these artifacts are stored in the registry in terms of the naming scheme.

Copy link
Contributor Author

@jonathanj-square jonathanj-square Oct 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed, I need to be more explicit about how release and module compilation artifacts (binaries, jars, schema DSLs) get mapped to OCI format concepts and their layout:

Created a tracking item here: #3077

for _, d := range digests {
set[d] = true
}
modules, err := s.DiscoverModuleArtefacts(ctx)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be explicitly checking the registry for each digest, not listing the full registry then iterating over the list. Ideally Oras should be sending a HEAD request to /v2/<name>/blobs/<digest> for each artifact to check.

Even then I am not sure if this is the right approach, we can still store in the database which artifact we know about, as for JVM applications even the HEAD approach will likely be slower than we would want.

We could still potentially use the HEAD approach if we don't know about the artifact, as it would allow us to re-use jars from other clusters.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great call and thanks for catching it at this stage. Some rethinking is due here, this ticket will track that.
#3076

return fmt.Sprintf("%s/%s", host, createModuleRepositoryPathFromDigest(digest))
}

// getDigestFromModuleRepositoryPath extracts the digest from the module repository path; e.g. /ftl/modules/<digest>:latest
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my comment above, but basically this should be a config option and not per module so we can re-use artifacts.

MaxIdleDBConnections int `help:"Maximum number of idle database connections." default:"20" env:"FTL_MAX_IDLE_DB_CONNECTIONS"`
}

func (d *releasePublishCmd) Run() error {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are going to do 'releases' to a container registry we will need to push some kind of manifest with the contents of the release, as releases are multiple files.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed, the transition to the OCI registries involves multiple steps, the first is to simply dump the payloads there instead of the database. from there it will be refined to take advantage of the container formats capabilities

if err != nil {
return fmt.Errorf("failed to create container service: %w", err)
}
modules, err := svc.DiscoverModuleArtefacts(context.Background())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would basically translate to the OCI list operation, which is often really slow and not scalable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good, captured in #3076

@jonathanj-square jonathanj-square merged commit b4606de into main Oct 11, 2024
95 checks passed
@jonathanj-square jonathanj-square deleted the jonathanj/oci-service-impl branch October 11, 2024 18:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
run-all A PR with this label will run the full set of CI jobs in the PR rather than in the merge queue
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Introduce an OCI based module artifact registry service implementation
3 participants