unexpected log messages during alerting stress testing #54508

Closed
pmuellr opened this issue Jan 10, 2020 · 4 comments
Labels
Feature:Task Manager Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) v7.6.0

pmuellr commented Jan 10, 2020

Kibana version: master, a few days before 7.6 feature freeze

Elasticsearch version: snapshot from yarn es snapshot, from Kibana version ^^^

Describe the bug:

During stress testing of alerting, when 100 alert deletions are happening, a few odd messages appeared in the Kibana and ES console outputs.

Kibana

Steps to reproduce:

  1. Run the alerting stress test from the gist whole-lotta-alerts.sh.
  2. After the alerts have been running a while, delete them all with the command invocation indicated in the script.

Expected behavior:

Nothing unusual in the ES or Kibana logs.

Provide logs and/or server output (if relevant):

The following message was repeated ~50 times in the ES console:

info [o.e.x.s.a.AuthenticationService] [pmuellr.muellerware.org] \
   Authentication using apikey failed \
   - api key has been invalidated

The following message occurred once each time I deleted all 100 alerts:

[error][task_manager] Failed to mark Task alerting:example.always-firing \
   "a4606f31-33d7-11ea-9271-cf13bcf871f1" as running: \
   Task has been claimed by another Kibana service

The following message was repeated ~25 times in the Kibana console. Note that much of the message was a JSON-encoded string, which I've decoded here:

missing authentication credentials for REST request
[error][task_manager] Task actions:.server-log "96445751-33e5-11ea-9271-cf13bcf871f1" failed: [security_exception] missing authentication credentials for REST request [/_security/user/_has_privileges], with { header={ WWW-Authenticate={ 0="Bearer realm=\"security\"" & 1="ApiKey" & 2="Basic realm=\"security\" charset=\"UTF-8\"" } } } ::  { 
    "path": "/_security/user/_has_privileges",
    "query": {},
    "body": {
        "applications": [
            {
                "application": "kibana-.kibana",
                "resources": [
                    "space:default"
                ],
                "privileges": [
                    "version:8.0.0",
                    "login:",
                    "saved_object:8.0.0:action/get"
                ]
            }
        ]
    },
    "statusCode": 401,
    "response": {
        "error": {
            "root_cause": [
                {
                    "type": "security_exception",
                    "reason": "missing authentication credentials for REST request [/_security/user/_has_privileges]",
                    "header": {
                        "WWW-Authenticate": [
                            "Bearer realm=\"security\"",
                            "ApiKey",
                            "Basic realm=\"security\" charset=\"UTF-8\""
                        ]
                    }
                }
            ],
            "type": "security_exception",
            "reason": "missing authentication credentials for REST request [/_security/user/_has_privileges]",
            "header": {
                "WWW-Authenticate": [
                    "Bearer realm=\"security\"",
                    "ApiKey",
                    "Basic realm=\"security\" charset=\"UTF-8\""
                ]
            }
        },
        "status": 401
    },
    "wwwAuthenticateDirective": "Bearer realm=\"security\", ApiKey, Basic realm=\"security\" charset=\"UTF-8\""
}
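
For anyone who wants to do the same decoding on their own logs, here is a minimal sketch. It assumes the raw line has the shape `<message> :: <JSON>`, where the JSON part may itself be a JSON-encoded string, as described above. `decodeLogPayload`, `DecodedLog` and the sample line are hypothetical names made up for illustration, not anything from Kibana.

```typescript
// Hypothetical helper: split a log line of the assumed form
// "<message> :: <JSON>" and decode the JSON part for readability.
interface DecodedLog {
  message: string;
  payload: unknown;
}

function decodeLogPayload(line: string): DecodedLog {
  const separator = ' :: ';
  const at = line.indexOf(separator);
  // No separator: treat the whole line as the message.
  if (at === -1) return { message: line, payload: undefined };

  let payload: unknown;
  try {
    payload = JSON.parse(line.slice(at + separator.length));
    // A JSON-encoded *string* that contains JSON needs a second pass.
    if (typeof payload === 'string') payload = JSON.parse(payload);
  } catch (err) {
    // Anything that does not decode cleanly is returned untouched.
    return { message: line, payload: undefined };
  }
  return { message: line.slice(0, at), payload };
}

// Made-up sample line with a doubly encoded payload.
const sampleLine =
  'Task failed: [security_exception] :: ' +
  JSON.stringify(JSON.stringify({ statusCode: 401 }));

// Pretty-print with 4-space indentation, like the decoded output above.
console.log(JSON.stringify(decodeLogPayload(sampleLine).payload, null, 4));
```

Lines that do not match the assumed shape come back unchanged, so the helper can be run over a whole console dump.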
@elasticmachine
Contributor

Pinging @elastic/kibana-alerting-services (Team:Alerting Services)

pmuellr commented Jan 13, 2020

The API key and missing auth creds messages are likely the same as reported here: #54125 - no need to do more work on them in this issue.

I haven't seen "Task has been claimed by another Kibana service" again in other testing since, but it's worth looking into, so let's focus on that.

I will note that I often stop and start Kibana during stress testing. It just picks up from where it left off, usually with no problems, but perhaps the message was caused by a restart. Why does it think another Kibana service claimed the task?

@gmmorris
Contributor

It's worth noting that the "Task has been claimed by another Kibana service" message appears whenever there's a version conflict. We're just assuming that another Kibana claimed the task, but the version conflict might be due to something else 🤔

pmuellr commented Jan 15, 2020

Ah, I didn't realize it was an optimistic locking version thing, but tracing it back, it looks like it comes from here:

public async markTaskAsRunning(): Promise<boolean> {
  performance.mark('markTaskAsRunning_start');
  const VERSION_CONFLICT_STATUS = 409;
  const now = new Date();

Since this could have been from a Kibana restart, it makes sense that the task could have been left in a funky state when Kibana shut down, and then hit the 409 when it started back up.

That's not great - you'd like to think TM could deal with a restart cleanly, but given its complexity, it feels understandable.

I'm going to close this, but will keep an eye out for more of these now that I know what it is.
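
The optimistic locking behavior discussed in this thread can be sketched roughly as follows. This is not the Task Manager implementation: `InMemoryTaskStore`, `TaskDoc`, `VersionConflictError` and this standalone `markTaskAsRunning` are hypothetical stand-ins, and the "stale copy" scenario is only one guess at how a restart could produce the 409. The point is just that any write made from an out-of-date copy of a task is rejected with a version conflict, and the caller reports that as the task having been claimed elsewhere.

```typescript
// Hypothetical in-memory stand-in for a versioned document store.
const VERSION_CONFLICT_STATUS = 409;

interface TaskDoc {
  id: string;
  status: 'idle' | 'running';
  ownerId: string | null;
  version: number;
}

class VersionConflictError extends Error {
  readonly statusCode = VERSION_CONFLICT_STATUS;
}

class InMemoryTaskStore {
  private docs: Record<string, TaskDoc> = {};

  create(id: string): TaskDoc {
    this.docs[id] = { id, status: 'idle', ownerId: null, version: 1 };
    return { ...this.docs[id] };
  }

  get(id: string): TaskDoc {
    const doc = this.docs[id];
    if (!doc) throw new Error(`no such task ${id}`);
    return { ...doc }; // callers get a copy, never the stored object
  }

  // The write succeeds only if the caller's copy carries the stored version.
  update(doc: TaskDoc): TaskDoc {
    const stored = this.docs[doc.id];
    if (!stored) throw new Error(`no such task ${doc.id}`);
    if (stored.version !== doc.version) {
      throw new VersionConflictError(`version conflict on task ${doc.id}`);
    }
    this.docs[doc.id] = { ...doc, version: doc.version + 1 };
    return { ...this.docs[doc.id] };
  }
}

// Returns true if the claim was written, false on a version conflict.
function markTaskAsRunning(
  store: InMemoryTaskStore,
  snapshot: TaskDoc,
  ownerId: string,
  log: (msg: string) => void
): boolean {
  try {
    store.update({ ...snapshot, status: 'running', ownerId });
    return true;
  } catch (err) {
    if ((err as { statusCode?: number }).statusCode === VERSION_CONFLICT_STATUS) {
      log(
        `Failed to mark Task "${snapshot.id}" as running: ` +
          'Task has been claimed by another Kibana service'
      );
      return false;
    }
    throw err; // anything other than a conflict is a real error
  }
}

// Two copies of the same task read at version 1, e.g. one held from
// before a restart and one read afterwards (an assumed scenario).
const demoStore = new InMemoryTaskStore();
demoStore.create('task-1');
const staleCopy = demoStore.get('task-1');
const freshCopy = demoStore.get('task-1');

console.log(markTaskAsRunning(demoStore, freshCopy, 'kibana-b', console.log)); // true
console.log(markTaskAsRunning(demoStore, staleCopy, 'kibana-a', console.log)); // logs the failure, then false
```

Note that the second call fails even though only one process is involved, which matches the point above that the message only proves a version conflict, not that another Kibana was actually running.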

pmuellr closed this as completed Jan 15, 2020