Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make Legacy test cluster infrastructure configuration cache compatible #101903

Merged
merged 22 commits into from
Nov 28, 2023

Conversation

breskeby
Copy link
Contributor

@breskeby breskeby commented Nov 8, 2023

  • Made some minor optimisations on the way
  • Do not allow node specific module or plugin configuration in test cluster setup

@breskeby breskeby self-assigned this Nov 8, 2023
@breskeby breskeby added >enhancement :Delivery/Build Build or test infrastructure Team:Delivery Meta label for Delivery team labels Nov 8, 2023
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-delivery (Team:Delivery)

@breskeby breskeby changed the title Move TestCluster plugin and module setup from Node to Cluster Rework LegacyTestCluster infrastructure to be configuration cache compatible Nov 8, 2023
@breskeby breskeby changed the title Rework LegacyTestCluster infrastructure to be configuration cache compatible Rework LegacyTestCluster to be configuration cache compatible Nov 8, 2023
@breskeby breskeby marked this pull request as draft November 8, 2023 10:50
@@ -85,7 +85,7 @@ public void apply(Project project) {
NamedDomainObjectContainer<ElasticsearchCluster> clusters = (NamedDomainObjectContainer<ElasticsearchCluster>) project
.getExtensions()
.getByName(TestClustersPlugin.EXTENSION_NAME);
clusters.all(c -> {
clusters.configureEach(c -> {
Copy link
Contributor Author

@breskeby breskeby Nov 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We accidentally materialised all TestCluster here; fixing since we've been here

@@ -302,84 +287,34 @@ private void setDistributionType(ElasticsearchDistribution distribution, TestDis
}
}

// package protected so only TestClustersAware can access
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we never have node specific plugin or module requirements I moved this one level up to ElasticsearchCluster. That reduces the API surface and makes this handling more efficient as we only do this whole configuration setup once per cluster.

this.plugins.add(plugin.map(RegularFile::getAsFile));
}

@Override
public void plugin(String pluginProjectPath) {
plugin(maybeCreatePluginOrModuleDependency(pluginProjectPath, "zip"));
throw new IllegalStateException("Not Supported API");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Never used. We should not start now while migrating away from this Build logic

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this makes more sense to be an UnsupportedOperationException.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@breskeby breskeby changed the title Rework LegacyTestCluster to be configuration cache compatible Make Legacy test cluster infrastructure configuration cache compatible Nov 16, 2023
@@ -55,7 +55,7 @@ public String toString() {
private final Property<Boolean> failIfUnavailable;
private final Property<Boolean> preferArchive;
private final ConfigurableFileCollection extracted;
private Action<ElasticsearchDistribution> distributionFinalizer;
private transient Action<ElasticsearchDistribution> distributionFinalizer;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is only used when configuring the distribution. Once this is configured (and cached by the configuration cache) this is no longer required.

project.getRootProject().getPluginManager().apply(TestClustersHookPlugin.class);
configureArtifactTransforms(project);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doing this only once per project here and not multiple times for each cluster node

@@ -74,7 +74,7 @@ subprojects {
// Compatibility testing for JDBC driver started with version 7.9.0
BuildParams.bwcVersions.allIndexCompatible.findAll({ it.onOrAfter(Version.fromString("7.9.0")) && it != VersionProperties.elasticsearchVersion }).each { bwcVersion ->
def baseName = "v${bwcVersion}"
def cluster = testClusters.maybeCreate(baseName)
def cluster = testClusters.register(baseName)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

creating cluster lazy while been here

@@ -43,8 +41,6 @@ class LegacyYamlRestTestPluginFuncTest extends AbstractRestResourcesFuncTest {

def "yamlRestTest executes and copies api and tests to correct source set"() {
given:
// RestIntegTestTask not cc compatible due to
configurationCacheCompatible = false
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Run tests with configuration cache enabled.

getClusters().add(cluster);
}

default void useCluster(Provider<ElasticsearchCluster> cluster) {
useCluster(cluster.get());
}

default void beforeStart() {}
default void beforeStart() {
getTasksService().get().register(this);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to do that here as we cannot apply custom logic within task graph calculation as this is skipped when configuration cache is enabled

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But shouldn't the result of the task graph listener get cached? That makes it seem like the state of build services isn't cached.

int claim = claimsInventory.getOrDefault(cluster, 0) + 1;
claimsInventory.put(cluster, claim);
if (claim > 1) {
cluster.setShared(true);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We keep that information in the serialised cluster as the BuildServiceParameters cannot be keeping state across builds

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so I'm correct in my previous comment that build service state isn't cached. Any mutations they make have to be to cached data structures, which presumably means something that's an input to a task? Or a ValueSource?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a Task itself is basically cached. So (nested) beans or fields of a task need to be cacheable

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Therefore keep the information in the cluster

@breskeby breskeby marked this pull request as ready for review November 16, 2023 22:10
@@ -113,7 +113,7 @@ public void apply(Project project) {
configureArtifactTransforms(project);

// Create configuration for aggregating historical feature metadata
Configuration featureMetadataConfig = project.getConfigurations().create(FEATURES_METADATA_CONFIGURATION, c -> {
FileCollection featureMetadataConfig = project.getConfigurations().create(FEATURES_METADATA_CONFIGURATION, c -> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

referenced from SystemPropertyCommandLineArgumentProvider during execution time. Therefore needs to be configuration cache compatible.

@breskeby breskeby added the auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) label Nov 21, 2023
@breskeby breskeby requested a review from a team November 21, 2023 19:37
@@ -27,22 +26,27 @@ public interface TestClustersAware extends Task {
@ServiceReference(REGISTRY_SERVICE_NAME)
Property<TestClustersRegistry> getRegistery();

@ServiceReference(TEST_CLUSTER_TASKS_SERVICE)
Property<TestClustersPlugin.TaskEventsService> getTasksService();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should probably be Provider not Property, no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -88,6 +88,7 @@ public WorkResult delete(Object... objects) {

@Override
public void beforeStart() {
getTasksService().get().register(this);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe instead call super() here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what you mean

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess you mean that logic should move into the TestClusterAware?

getClusters().add(cluster);
}

default void useCluster(Provider<ElasticsearchCluster> cluster) {
useCluster(cluster.get());
}

default void beforeStart() {}
default void beforeStart() {
getTasksService().get().register(this);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But shouldn't the result of the task graph listener get cached? That makes it seem like the state of build services isn't cached.

.getSharedServices()
.registerIfAbsent(
THROTTLE_SERVICE_NAME,
TestClustersThrottle.class,
spec -> spec.getMaxParallelUsages().set(Math.max(1, project.getGradle().getStartParameter().getMaxWorkerCount() / 2))
);

// register cluster hooks
project.getTasks().withType(TestClustersAware.class).configureEach(task -> { task.usesService(testClustersThrottleProvider); });
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this necessary? This should be accomplished by us overriding getSharedServices

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is to make our getSharedService override work with configuration cache enabled. Otherwise that service won’t be accessible when cc is enabled. We probably want to adjust the count though now within that override

int claim = claimsInventory.getOrDefault(cluster, 0) + 1;
claimsInventory.put(cluster, claim);
if (claim > 1) {
cluster.setShared(true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so I'm correct in my previous comment that build service state isn't cached. Any mutations they make have to be to cached data structures, which presumably means something that's an input to a task? Or a ValueSource?

// Add an extra licenses directory to the combined notices
tasks.named('generateNotice').configure {
dependsOn "extractNativeLicenses"
inputs.dir("${project.buildDir}/extractedNativeLicenses/platform/licenses")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably just use extractNativeLicenses.destinationDir.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

made that more gradle idiomatic reusing extractNativeLicenses.destinationDir

@breskeby breskeby merged commit 4a9b4bc into elastic:main Nov 28, 2023
14 checks passed
@breskeby breskeby deleted the rework-cc-test-cluster-infra branch November 28, 2023 11:15
timgrein pushed a commit to timgrein/elasticsearch that referenced this pull request Nov 30, 2023
elastic#101903)

* Move TestCluster plugin and module setup from Node to Cluster
* Do not materialize unused TestCluster
* Fix lazy test cluster evaluation
* Register artifact transforms ones per project
* Make task caching for TestClusterAware tasks CC compatible
* Move stateful logic out of taskgraph.whenready
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Delivery/Build Build or test infrastructure >non-issue Team:Delivery Meta label for Delivery team v8.12.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants