Splitting channel metadata into Channel and Manifest files #109

Merged
merged 2 commits into from
Jan 25, 2023

Conversation

Contributor

@spyrkob spyrkob commented Oct 12, 2022

Splitting the Channel definition into Channel and Manifest files. Adding blockList and advanced resolution strategies.

The Channel file contains static information (repositories, vendor, etc.), while the Manifest file contains the stream definitions.

Example Channel file:

schemaVersion: "2.0.0"
name: My Channel
description: |-
  This is my channel
  with my stuff
vendor:
  name: My Vendor
  support: community
repositories:
- id: test
  url: test-repository
manifest:
  gav: org.test:test-manifest

Example Manifest file:

schemaVersion: "1.0.0"
name: My Manifest
description: This is My Manifest v1.0.0
streams:
  - groupId: org.wildfly
    artifactId: wildfly-ee-galleon-pack
    version: 26.0.0.Final
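
A minimal sketch of how the two example files could map onto an object model, assuming plain Java records. The type names (`RepositorySketch`, `StreamSketch`, `ManifestSketch`, `ChannelSketch`) are hypothetical and are not the library's classes; the only point is that the channel carries the static data plus a reference to the manifest, while the manifest carries the streams.

```java
import java.util.List;

// Hypothetical stand-ins for illustration, not the wildfly-channel API.
record RepositorySketch(String id, String url) {}

record StreamSketch(String groupId, String artifactId, String version) {}

// Manifest: the stream definitions only.
record ManifestSketch(String schemaVersion, String name, List<StreamSketch> streams) {}

// Channel: static information plus a reference (GA or GAV) to the manifest.
record ChannelSketch(String schemaVersion, String name,
                     List<RepositorySketch> repositories, String manifestGav) {}

class ChannelModelSketch {
    // Mirrors the example Channel file.
    static ChannelSketch exampleChannel() {
        return new ChannelSketch("2.0.0", "My Channel",
                List.of(new RepositorySketch("test", "test-repository")),
                "org.test:test-manifest");
    }

    // Mirrors the example Manifest file.
    static ManifestSketch exampleManifest() {
        return new ManifestSketch("1.0.0", "My Manifest",
                List.of(new StreamSketch("org.wildfly", "wildfly-ee-galleon-pack", "26.0.0.Final")));
    }

    public static void main(String[] args) {
        System.out.println(exampleChannel());
        System.out.println(exampleManifest());
    }
}
```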

@spyrkob spyrkob changed the title Manifest [WiP] Splitting channel metadata into Channel and Manifest files Oct 12, 2022
@spyrkob
Contributor Author

spyrkob commented Oct 12, 2022

@jmesnil, @wolfc - the PR is not complete yet, but is this the split of channel definition you had in mind?

The changes from #104 would be added to the Channel file.

@jmesnil
Member

jmesnil commented Oct 12, 2022

I'm not sure.

When we specified the channel, we decided that the maven repositories were not a part of its model: https://github.com/wildfly-extras/wildfly-channel/blob/main/doc/spec.adoc#channel-model.

When the provisioning tool is Maven-based, the Maven repositories naturally come from the pom.xml settings.

When the provisioning tool is not based on Maven, such as Prospero, we would need another way to specify them. You have e.g. https://github.com/wildfly-extras/prospero/blob/main/prospero-cli/src/test/resources/prospero-known-combinations.yaml that would combine the channel GAV with the Maven repositories that host the artifact streams.

@wolfc
Contributor

wolfc commented Oct 12, 2022

Channel = 1+ repositories, 1 channel definition, 1 channel definition location (2022-04-04)

I have pointed out a couple of times that the implementation was using the wrong definitions. Now we're at the point where we need to clean this up and correctly state that 'channel definition' = 'manifest'. So far the implementation has been using Channel for 'channel definition'.

As for how provisioning works in a Maven environment this is inherently tricky. Having any form of latest strategy active means you can not blindly combine the repositories. Each repository is tied to a certain channel definition / manifest which in essence then forms the channel. You can combine channels, but not repositories. (This is why I said configuration is skewed at the moment.)

A Maven project can only use 1 channel with its repositories inherently defined. To be able to use multiple channels it needs to use different means (aka the same means as prospero).

@jmesnil
Member

jmesnil commented Oct 12, 2022

> So far the implementation has been using Channel for 'channel definition'.

The library is providing the Channel API to represent the model as defined in https://github.com/wildfly-extras/wildfly-channel/blob/main/doc/spec.adoc#channel-model.

> Having any form of latest strategy active means you can not blindly combine the repositories.

Definitely! That's why "stable" channels should use fixed versions and curate them instead of blindly relying on untrusted / unaligned versions coming from different Maven repositories.
In Maven land, the provisioning of the server is realized by the combination of a Channel GAV and maven repositories (directly configured in the pom.xml). In that context, I don't see why we would need to express this configuration with yet another YAML file.

For Prospero, we can definitely do it but that's a tool-specific configuration as I see it.

@wolfc
Contributor

wolfc commented Oct 12, 2022

> > So far the implementation has been using Channel for 'channel definition'.

> The library is providing the Channel API to represent the model as defined in https://github.com/wildfly-extras/wildfly-channel/blob/main/doc/spec.adoc#channel-model.

Right, and what it describes is a way to describe the contents of a channel. The actual contents live in the backing repositories. Without the repositories there is only the manifest.

> > Having any form of latest strategy active means you can not blindly combine the repositories.

> Definitely! That's why "stable" channels should use fixed versions and curate them instead of blindly relying on untrusted / unaligned versions coming from different Maven repositories. In Maven land, the provisioning of the server is realized by the combination of a Channel GAV and maven repositories (directly configured in the pom.xml). In that context, I don't see why we would need to express this configuration with yet another YAML file.
>
> For Prospero, we can definitely do it but that's a tool-specific configuration as I see it.

Irrespective of the tool being used (Prospero, Maven plugins or whatever) the provisioning result must be the same. Yes, you can use untrusted Maven repositories as a backing content repository for a channel, but that is only in Maven land. Our channel repositories are fully trusted and aligned to individual product streams.

To be able to provide the exact same result the input must be the same and thus any tool doing provisioning would need proper channel configuration. As I said, you can use a single channel in a Maven project but combining multiple and then mixing the repositories leads to trouble.

What would you call the combination of repositories + manifest(s)? And then what would you call a collection of those combinations?


 public VersionResolverFactory(RepositorySystem system,
-        RepositorySystemSession session,
-        List<RemoteRepository> repositories) {
+        RepositorySystemSession session) {
Member

Why did you change the signature?

Contributor Author

The list of repositories needs to be passed to the VersionResolverFactory when the Channel is initialised (via VersionResolverFactory#create). So I think there is no need for a shared list of repositories in VersionResolverFactory

Member

Contributor Author

Yes, but create is also called from the Channel#init (https://github.com/wildfly-extras/wildfly-channel/blob/main/core/src/main/java/org/wildfly/channel/Channel.java#L179)

This call would have to disregard the repository list from the Factory's constructor and pass a different list for each channel, right?

btw. is ChannelMavenArtifactRepositoryManager#resolver used anywhere in the wildfly-maven-plugin?

Member

> Yes, but create is also called from the Channel#init (https://github.com/wildfly-extras/wildfly-channel/blob/main/core/src/main/java/org/wildfly/channel/Channel.java#L179)

gotcha. I missed that usage.

> btw. is ChannelMavenArtifactRepositoryManager#resolver used anywhere in the wildfly-maven-plugin?

mmh, that seems to be old code that can be removed. thanks for spotting it

  * Streams of components that are provided by this channel.
  */
- private Set<Stream> streams;
+ private Manifest manifest;
Member

could you please add Javadoc

Contributor Author

Sure, I'll update the docs as well, just wanted to settle on the definition first

import static com.fasterxml.jackson.annotation.JsonInclude.Include.NON_EMPTY;
import static com.fasterxml.jackson.annotation.JsonInclude.Include.NON_NULL;

public class Manifest implements AutoCloseable {
Member

Maybe rename to ChannelManifest?

public static final String CLASSIFIER="manifest";
public static final String EXTENSION="yaml";

private final String schemaVersion;
Member

could you add javadoc please?


import static com.fasterxml.jackson.annotation.JsonInclude.Include.NON_NULL;

public class ManifestRef {
Member

Maybe rename it to ManifestCoordinate and create a superclass common to https://github.com/wildfly-extras/wildfly-channel/blob/main/maven-resolver/src/main/java/org/wildfly/channel/maven/ChannelCoordinate.java

We want to be able to locate a channel or a manifest using either a URL or a GA(V).

Contributor Author

I added ChannelMetadataCoordinate as superclass to both ChannelCoordinate and ChannelManifestCoordinate
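
As a rough illustration of a coordinate that locates a file either by URL or by GA(V), here is a minimal sketch. `CoordinateSketch` and its factory methods are hypothetical names, not the classes added in this PR, and it stores the URL as a `java.net.URI` purely to avoid checked exceptions in the example. The two-or-three-part check follows the GAV parsing that appears elsewhere in this PR.

```java
import java.net.URI;

// Hypothetical sketch; the names below are illustrative and are not the library's classes.
class CoordinateSketch {
    final URI location;       // set when the file is located by URL
    final String groupId;     // set when the file is located by GA(V)
    final String artifactId;
    final String version;     // null when only a GA was given

    private CoordinateSketch(URI location, String groupId, String artifactId, String version) {
        this.location = location;
        this.groupId = groupId;
        this.artifactId = artifactId;
        this.version = version;
    }

    // Locate the channel or manifest file by URL.
    static CoordinateSketch ofUrl(String url) {
        return new CoordinateSketch(URI.create(url), null, null, null);
    }

    // Locate the channel or manifest file by "groupId:artifactId" or "groupId:artifactId:version".
    static CoordinateSketch ofGav(String gav) {
        String[] parts = gav.split(":");
        if (parts.length < 2 || parts.length > 3) {
            throw new IllegalArgumentException("Illegal GAV expression: " + gav);
        }
        return new CoordinateSketch(null, parts[0], parts[1], parts.length == 3 ? parts[2] : null);
    }

    public static void main(String[] args) {
        System.out.println(ofGav("org.test:test-manifest").version);       // GA only
        System.out.println(ofGav("org.test:test-manifest:1.0.0").version); // full GAV
        System.out.println(ofUrl("https://example.com/manifest.yaml").location);
    }
}
```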

private String url;

@JsonCreator
public Repository(@JsonProperty(value = "id") String id, @JsonProperty(value = "url") String url) {
Member

is the id required?

Contributor Author

I'm not sure. I guess we could auto-generate the id if it's not provided, but I think that might cause Maven cache issues if the ids collide.

Contributor

In the Maven world, the repository id is used to track the state in the local cache. So if you allow resolution to go through a Maven cache, the ids must match.

@jfdenise
Collaborator

When I look at the Channel file I can see

repositories:
- id: test

That means that the channel library should only use the "test" maven repository to resolve any artifact that the manifest bound to this channel has a stream for.

@wolfc @spyrkob am I correct with this assumption?

Based on this, we can have the wildfly-maven-plugin expose a list of "restricted maven repositories" that would, in this case, contain a single entry: "test". The settings.xml or pom.xml (through the currently enabled profiles) would then have to contain a repository with the "test" id. This list is global: all configured channels receive the same list, and each channel uses only the subset of repositories that matches its own configuration, since each channel knows the ids it trusts. Two channel files referencing the id "mrrc" identify the same repository; "mrrc" cannot mean one thing for channel1 and another for channel2.

When the plugin runs, it conveys the global list of "restricted repositories" to all the configured channels. When a "restricted list" is provided, a channel uses only the repositories whose ids it knows (if any), picked from the Maven repository instances that the plugin has injected into the channel (these could include central, nexus, mrrc, repos, custom, test, ...). If the expected ids are not found, it aborts.
If no "restricted list" is provided, the channel uses all repositories, without any selection or check.

@spyrkob
Contributor Author

spyrkob commented Oct 18, 2022

@jfdenise I think that makes sense. What would be the expectation if a channel defines two repositories, but only one of them is in the restricted list? Aborted run?

@jfdenise
Collaborator

@spyrkob , I would say that, when a restricted list is provided, all repositories known by the channel must be present. Aborting if one is missing.
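
The selection rule discussed above could be sketched roughly as follows. This is only an illustration under stated assumptions: `RestrictedRepositories.select` is a hypothetical helper, repositories are reduced to an id-to-URL map, and a `null` restricted list stands for "no restricted list provided". It is not how the library or the plugin implements it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper illustrating the rule from this thread; not part of wildfly-channel.
class RestrictedRepositories {
    /**
     * @param channelRepoIds ids of the repositories declared by the channel
     * @param availableById  repositories injected by the tool, keyed by id (value: URL)
     * @param restrictedIds  the global restricted list, or null when none is provided
     * @return the URLs the channel is allowed to resolve from
     */
    static List<String> select(List<String> channelRepoIds,
                               Map<String, String> availableById,
                               Set<String> restrictedIds) {
        if (restrictedIds == null) {
            // No restricted list: use every injected repository, without selection or check.
            return new ArrayList<>(availableById.values());
        }
        List<String> selected = new ArrayList<>();
        for (String id : channelRepoIds) {
            // Every repository known by the channel must be present, otherwise abort.
            if (!restrictedIds.contains(id) || !availableById.containsKey(id)) {
                throw new IllegalStateException("Repository '" + id + "' required by the channel is missing");
            }
            selected.add(availableById.get(id));
        }
        return selected;
    }

    public static void main(String[] args) {
        Map<String, String> injected = Map.of(
                "test", "https://example.com/test-repository",
                "central", "https://repo1.maven.org/maven2");
        System.out.println(select(List.of("test"), injected, Set.of("test")));
    }
}
```

With a channel that declares both "test" and "mrrc" but a restricted list containing only "test", the sketch throws, which corresponds to the "abort if one is missing" answer.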

@spyrkob
Contributor Author

spyrkob commented Oct 20, 2022

added the manifest information to the spec

@spyrkob spyrkob changed the title [WiP] Splitting channel metadata into Channel and Manifest files Splitting channel metadata into Channel and Manifest files Oct 20, 2022
@spyrkob spyrkob force-pushed the manifest branch 3 times, most recently from 9048f3d to 9590bf7 Compare October 20, 2022 11:16
@jmesnil
Member

jmesnil commented Oct 21, 2022

repositories:
- id: test
  url: https://example.com/test-repository

There are scenarios in which users configure a Maven proxy to pull Maven dependencies. The library must respect their settings.
In that sense, the id field is a unique identifier and the url would be a "default" URL that can be overridden by the users in the tooling.

For example, with the wildfly-maven-plugin, it should be possible for the user to configure a <repository> with a test id to use instead of the default repositories from the channel definition.

Given that we can not necessarily assert the sources of the components provisioned by a Channel, it becomes more important to verify their integrity (as tracked by #112).
It then does not matter if the component was pulled from Maven Central or a proxy, as long as its checksum corresponds to the expected content from the channel.
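
To make the integrity idea concrete, here is a minimal sketch of a checksum comparison. It assumes SHA-256 and a lowercase hex encoding, neither of which is specified in this thread, and `ChecksumSketch` is a hypothetical name rather than anything planned for the library.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch; the algorithm (SHA-256) and encoding (hex) are assumptions.
class ChecksumSketch {
    // Returns the lowercase hex SHA-256 digest of the given bytes.
    static String sha256Hex(byte[] content) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(content);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    // True when the artifact bytes match the checksum expected for the component,
    // regardless of which repository or proxy served them.
    static boolean matches(byte[] content, String expectedSha256Hex) {
        return sha256Hex(content).equalsIgnoreCase(expectedSha256Hex);
    }

    public static void main(String[] args) {
        byte[] artifact = "example artifact bytes".getBytes(StandardCharsets.UTF_8);
        String recorded = sha256Hex(artifact); // would normally come from trusted metadata
        System.out.println(matches(artifact, recorded));
    }
}
```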

@spyrkob spyrkob mentioned this pull request Oct 21, 2022
@spyrkob
Contributor Author

spyrkob commented Oct 21, 2022

@jmesnil how about adding the following constructor to the VersionResolverFactory:

public VersionResolverFactory(RepositorySystem system,
                                  RepositorySystemSession session,
                                  Function<Repository, RemoteRepository> repositoryFactory)

Users can then provide their own mapping from the channel's repository to a Maven repository.
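
A rough sketch of what such a mapping function might look like when it honours user overrides by repository id. To stay self-contained it uses stand-in records: `ChannelRepo` plays the role of the channel's Repository and `ResolvedRepo` the role of Maven resolver's RemoteRepository. Both records and `withOverrides` are hypothetical names for illustration only.

```java
import java.util.Map;
import java.util.function.Function;

// Stand-ins for illustration only; the real types would be the channel's Repository
// and the Maven resolver's RemoteRepository.
record ChannelRepo(String id, String url) {}

record ResolvedRepo(String id, String url) {}

class RepositoryFactorySketch {
    // Keeps the repository id and replaces the default URL when the user configured one for that id.
    static Function<ChannelRepo, ResolvedRepo> withOverrides(Map<String, String> userUrlsById) {
        return repo -> new ResolvedRepo(repo.id(), userUrlsById.getOrDefault(repo.id(), repo.url()));
    }

    public static void main(String[] args) {
        Function<ChannelRepo, ResolvedRepo> factory =
                withOverrides(Map.of("test", "https://proxy.example.org/maven"));
        // Overridden by the user's configuration for id "test".
        System.out.println(factory.apply(new ChannelRepo("test", "https://example.com/test-repository")));
        // No override: falls back to the default URL from the channel.
        System.out.println(factory.apply(new ChannelRepo("other", "https://example.com/other")));
    }
}
```

Passing a function rather than a fixed list lets each tool decide how channel repositories translate to its own configuration, for example a Maven proxy.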

@spyrkob
Contributor Author

spyrkob commented Nov 2, 2022

@jmesnil should I add the constructor above? Will it solve the use case you outlined?

@spyrkob
Contributor Author

spyrkob commented Nov 7, 2022

@jmesnil do you know when this change might be included?

@spyrkob
Contributor Author

spyrkob commented Nov 9, 2022

Should the requires relation be expressed between Channels or ChannelManifests?

The use case I'm thinking about is, for example, WildFly feature packs where the wildfly-galleon-pack depends on wildfly-ee-galleon-pack.

Currently that's represented by two separate manifests - wildfly-galleon-pack-channel.yaml and wildfly-ee-galleon-pack-channel.yaml where the former requires the latter.
After the Channel/Manifest split, should those be treated as two separate channels with the same repositories? Or should the Channel allow combining multiple Manifests into one (either via Manifest requires or by listing multiple manifests in the Channel)?

@jmesnil
Member

jmesnil commented Nov 14, 2022

@spyrkob I've created a dev branch. Could you please rebase this PR against it?

@spyrkob spyrkob changed the base branch from main to dev November 14, 2022 13:41
@spyrkob
Contributor Author

spyrkob commented Nov 14, 2022

@jmesnil done


public String getClassifier() {
return classifier;
};
Contributor

unnecessary semicolon.

if (gav != null) {
final String[] parsedGav = gav.split(":");
if (parsedGav.length < 2 || parsedGav.length > 3) {
throw new IllegalArgumentException("Illegal GAV experession: " + gav);
Contributor

spelling expression

}

void init(MavenVersionsResolver factory) {
resolver = resolver;
Contributor

should be resolver = factory;

@@ -18,10 +18,15 @@

import java.io.Closeable;
import java.io.File;
import java.net.MalformedURLException;
Contributor

unused import

@spyrkob spyrkob force-pushed the manifest branch 2 times, most recently from 7ac35f9 to 76babb5 Compare January 12, 2023 12:58
@jfdenise
Collaborator

@spyrkob it seems that the fix done in #131 should be applied to the new schemas.

@spyrkob spyrkob force-pushed the manifest branch 3 times, most recently from 94eb18b to c1e18e6 Compare January 16, 2023 11:36
@spyrkob
Contributor Author

spyrkob commented Jan 16, 2023

@jfdenise updated

@spyrkob spyrkob force-pushed the manifest branch 3 times, most recently from 018821c to a3003f8 Compare January 20, 2023 11:01
@jfdenise jfdenise self-requested a review January 20, 2023 15:13
@spyrkob spyrkob changed the base branch from dev to main January 20, 2023 15:50
@jfdenise
Collaborator

@spyrkob , I integrated these changes with wildfly-maven-plugin, and all is fine. I am approving this PR.

… spec changes - channel/manifest split,

blocklist and resolve strategies
@jfdenise jfdenise merged commit 4751d1a into wildfly-extras:main Jan 25, 2023
@jmesnil jmesnil added this to the 1.0.0.Beta4 milestone Feb 10, 2023
Development

Successfully merging this pull request may close these issues.

Splitting channel metadata into Channel and Manifest files
7 participants