[CI] RemoteIndexAuditTrailStartingTests failed authn [test_user] [cluster:monitor/health] #32685

Closed
albertzaharovits opened this issue Aug 7, 2018 · 4 comments
Assignees
Labels
:Security/Audit X-Pack Audit logging >test-failure Triaged test failures from CI v6.4.1

Comments

@albertzaharovits
Contributor

Build failure: https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.4+matrix-java-periodic/ES_BUILD_JAVA=java10,ES_RUNTIME_JAVA=java8,nodes=virtual&&linux/21/console

The failure does not reproduce locally:

REPRODUCE WITH: ./gradlew :x-pack:plugin:security:test \
  -Dtests.seed=4B1F8200799F1127 \
  -Dtests.class=org.elasticsearch.xpack.security.audit.index.RemoteIndexAuditTrailStartingTests \
  -Dtests.method="testThatRemoteAuditInstancesAreStarted" \
  -Dtests.security.manager=true \
  -Dtests.locale=hu-HU \
  -Dtests.timezone=Africa/Khartoum

Stacktrace for the root failure:

ElasticsearchSecurityException[unable to authenticate user [test_user] for action [cluster:monitor/health]]
	at __randomizedtesting.SeedInfo.seed([4B1F8200799F1127:64B1E0821634A508]:0)
	at org.elasticsearch.xpack.core.security.support.Exceptions.authenticationError(Exceptions.java:18)
	at org.elasticsearch.xpack.core.security.authc.DefaultAuthenticationFailureHandler.createAuthenticationError(DefaultAuthenticationFailureHandler.java:129)
	at org.elasticsearch.xpack.core.security.authc.DefaultAuthenticationFailureHandler.failedAuthentication(DefaultAuthenticationFailureHandler.java:63)
	at org.elasticsearch.xpack.security.authc.AuthenticationService$AuditableTransportRequest.authenticationFailed(AuthenticationService.java:520)
	at org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.consumeUser(AuthenticationService.java:358)
	at org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.lambda$consumeToken$14(AuthenticationService.java:296)
	at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:60)
	at org.elasticsearch.xpack.core.common.IteratingActionListener.onResponse(IteratingActionListener.java:96)
	at org.elasticsearch.xpack.core.common.IteratingActionListener.run(IteratingActionListener.java:76)
	at org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.consumeToken(AuthenticationService.java:300)
	at org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.lambda$extractToken$9(AuthenticationService.java:234)
	at org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.extractToken(AuthenticationService.java:244)
	at org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.lambda$authenticateAsync$0(AuthenticationService.java:178)
	at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:60)
	at org.elasticsearch.xpack.security.authc.TokenService.getAndValidateToken(TokenService.java:284)
	at org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.lambda$authenticateAsync$2(AuthenticationService.java:174)
	at org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.lambda$lookForExistingAuthentication$4(AuthenticationService.java:205)
	at org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.lookForExistingAuthentication(AuthenticationService.java:216)
	at org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.authenticateAsync(AuthenticationService.java:170)
	at org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.access$000(AuthenticationService.java:131)
	at org.elasticsearch.xpack.security.authc.AuthenticationService.authenticate(AuthenticationService.java:101)
	at org.elasticsearch.xpack.security.action.filter.SecurityActionFilter.applyInternal(SecurityActionFilter.java:160)
	at org.elasticsearch.xpack.security.action.filter.SecurityActionFilter.apply(SecurityActionFilter.java:113)
	at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:165)
	at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:139)
	at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:81)
	at org.elasticsearch.client.node.NodeClient.executeLocally(NodeClient.java:87)
	at org.elasticsearch.client.node.NodeClient.doExecute(NodeClient.java:76)
	at org.elasticsearch.client.support.AbstractClient.execute(AbstractClient.java:407)
	at org.elasticsearch.client.FilterClient.doExecute(FilterClient.java:67)
	at org.elasticsearch.client.support.AbstractClient$1.doExecute(AbstractClient.java:1788)
	at org.elasticsearch.client.support.AbstractClient.execute(AbstractClient.java:407)
	at org.elasticsearch.client.support.AbstractClient.execute(AbstractClient.java:396)
	at org.elasticsearch.client.support.AbstractClient$ClusterAdmin.execute(AbstractClient.java:708)
	at org.elasticsearch.action.ActionRequestBuilder.execute(ActionRequestBuilder.java:46)
	at org.elasticsearch.action.ActionRequestBuilder.get(ActionRequestBuilder.java:53)
	at org.elasticsearch.xpack.security.audit.index.RemoteIndexAuditTrailStartingTests.startRemoteCluster(RemoteIndexAuditTrailStartingTests.java:122)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:941)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:957)
	at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
	at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
	at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:916)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:802)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:852)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
	at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
	at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
	at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
	at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
	at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
	at java.lang.Thread.run(Thread.java:748)

I reckon there is a race when starting the second cluster, and the user that the client authenticates as is somehow not available. The failing call is marked with `>>>>>>` in the test setup below:

    @Before
    public void startRemoteCluster() throws IOException, InterruptedException {
        final List<String> addresses = new ArrayList<>();
        // get addresses for current cluster
        final NodesInfoResponse response = client().admin().cluster().prepareNodesInfo().execute().actionGet();
        final String clusterName = response.getClusterName().value();
        for (final NodeInfo nodeInfo : response.getNodes()) {
            final TransportAddress address = nodeInfo.getTransport().address().publishAddress();
            addresses.add(address.address().getHostString() + ":" + address.address().getPort());
        }

        // create another cluster
        final String cluster2Name = clusterName(Scope.TEST.name(), randomLong());

        // Setup a second test cluster with a single node, security enabled, and SSL
        final int numNodes = 1;
        final SecuritySettingsSource cluster2SettingsSource =
                new SecuritySettingsSource(numNodes, sslEnabled, createTempDir(), Scope.TEST) {
            @Override
            public Settings nodeSettings(int nodeOrdinal) {
                final Settings.Builder builder = Settings.builder()
                        .put(super.nodeSettings(nodeOrdinal))
                        // Disable native ML autodetect_process as the c++ controller won't be available
//                        .put(MachineLearningField.AUTODETECT_PROCESS.getKey(), false)
                        .put("xpack.security.audit.enabled", true)
                        .put("xpack.security.audit.outputs", randomFrom("index", "index,logfile"))
                        .putList("xpack.security.audit.index.client.hosts", addresses.toArray(new String[addresses.size()]))
                        .put("xpack.security.audit.index.client.cluster.name", clusterName)
                        .put("xpack.security.audit.index.client.xpack.security.user",
                             TEST_USER_NAME + ":" + SecuritySettingsSourceField.TEST_PASSWORD)
                        .put("xpack.security.audit.index.settings.index.number_of_shards", 1)
                        .put("xpack.security.audit.index.settings.index.number_of_replicas", 0);

                addClientSSLSettings(builder, "xpack.security.audit.index.client.");
                builder.put("xpack.security.audit.index.client.xpack.security.transport.ssl.enabled", sslEnabled);
                return builder.build();
            }
        };
        remoteCluster = new InternalTestCluster(randomLong(), createTempDir(), false, true, numNodes, numNodes,
                cluster2Name, cluster2SettingsSource, 0, SECOND_CLUSTER_NODE_PREFIX, getMockPlugins(), getClientWrapper());
        remoteCluster.beforeTest(random(), 0.0);
>>>>>>  assertNoTimeout(remoteCluster.client().admin().cluster().prepareHealth().setWaitForGreenStatus().get());
    }
@albertzaharovits albertzaharovits added >test-failure Triaged test failures from CI :Security/Audit X-Pack Audit logging v6.4.1 labels Aug 7, 2018
@albertzaharovits albertzaharovits self-assigned this Aug 7, 2018
@elasticmachine
Collaborator

Pinging @elastic/es-security

@droberts195
Contributor

The same problem occurred in
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.4+multijob-unix-compatibility/os=opensuse/24/console

The repro command is this:

./gradlew :x-pack:plugin:security:test \
  -Dtests.seed=E1BB81267C3E504 \
  -Dtests.class=org.elasticsearch.xpack.security.audit.index.RemoteIndexAuditTrailStartingTests \
  -Dtests.method="testThatRemoteAuditInstancesAreStarted" \
  -Dtests.security.manager=true \
  -Dtests.locale=sr-Latn-BA \
  -Dtests.timezone=Europe/Kirov

Once again, it doesn't reproduce locally.

This time the action being performed when the authentication failure occurs is different, but the stack trace is very similar to the original description on this issue.

    > Throwable #1: ElasticsearchSecurityException[unable to authenticate user [test_user] for action [indices:admin/delete]]

albertzaharovits added a commit that referenced this issue Feb 5, 2019
Authn is enabled only if `license_type` is not `basic`, but `basic` is
what the `LicenseService` generates implicitly. This commit explicitly sets
the license type to `trial`, which allows authn, in the `SecuritySettingsSource`,
which is the settings configuration parameter for `InternalTestCluster`s.

The real problem, which caused test failures like #31028 and #32685, is
that the result of the `licenseState.isAuthAllowed()` check can change sporadically.
If it returned `true` or `false` consistently for the whole test, there would be no problem.
The problem manifests when it turns from `true` to `false` right before `Realms.asList()`.
There are other license checks before this one (request filter, token service, etc.)
that would not cause a problem if they suddenly saw the check as `false`.
But switching to `false` before `Realms.asList()` makes it appear that no installed
realm could have handled the authn token, which is an authentication error, as can
be seen in the failing tests.

Closes #31028 #32685
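
The race described in the commit message can be illustrated with a small, deterministic sketch. This is not the x-pack code: `LicenseState`, `Realms`, and `authenticate` here are hypothetical stand-ins, and the sketch assumes that the realm list comes back empty whenever the license check returns `false`. The `betweenChecks` hook is an invented device that stands in for a concurrent license update landing between the early check and the call to `asList()`.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

public class LicenseRaceSketch {

    /** Hypothetical stand-in for the license state; not the real x-pack class. */
    static final class LicenseState {
        private final AtomicBoolean authAllowed;

        LicenseState(boolean initial) {
            this.authAllowed = new AtomicBoolean(initial);
        }

        boolean isAuthAllowed() {
            return authAllowed.get();
        }

        void setAuthAllowed(boolean allowed) {
            authAllowed.set(allowed);
        }
    }

    /** Hypothetical realm registry; ASSUMED to expose no realms when auth is not allowed. */
    static final class Realms {
        private final LicenseState license;
        private final List<String> configured;

        Realms(LicenseState license, List<String> configured) {
            this.license = license;
            this.configured = configured;
        }

        List<String> asList() {
            return license.isAuthAllowed() ? configured : Collections.<String>emptyList();
        }
    }

    /** Simplified request path: an early license check, then realm lookup. */
    static String authenticate(LicenseState license, Realms realms, String user, Runnable betweenChecks) {
        // Early check (think of the request filter): with auth off, the request passes through.
        if (!license.isAuthAllowed()) {
            return "passed through without authentication";
        }
        // Window in which a concurrent license update could land.
        betweenChecks.run();
        // Later check, hidden inside asList(): an empty list means no realm can handle the token.
        List<String> active = realms.asList();
        if (active.isEmpty()) {
            return "unable to authenticate user [" + user + "]";
        }
        return "authenticated by realm [" + active.get(0) + "]";
    }

    public static void main(String[] args) {
        List<String> configured = Arrays.asList("file");

        // The check stays true for the whole request: no problem.
        LicenseState on = new LicenseState(true);
        System.out.println(authenticate(on, new Realms(on, configured), "test_user", () -> {}));

        // The check stays false for the whole request: no problem either.
        LicenseState off = new LicenseState(false);
        System.out.println(authenticate(off, new Realms(off, configured), "test_user", () -> {}));

        // The check flips from true to false between the two reads: authentication error.
        LicenseState flipping = new LicenseState(true);
        System.out.println(authenticate(flipping, new Realms(flipping, configured), "test_user",
                () -> flipping.setAuthAllowed(false)));
    }
}
```

Only the third call reports an authentication failure, which matches the shape of the error in the failing tests. Keeping the flag constant for the lifetime of the cluster, which is roughly what pinning the license type to `trial` achieves in the commit, removes the window rather than narrowing it.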
@albertzaharovits
Contributor Author

Resolved in #38397
