[Alerting] Add a tie breaker field to alerts #62002
Comments
Pinging @elastic/kibana-alerting-services (Team:Alerting Services) |
@elastic/kibana-platform Are there any plans to have an alternative to sorting by |
FWIW, I created a |
Not that I'm aware of. @rudolf? |
I wasn't aware of this duplicating behaviour, but I think we should treat it as a bug in Core. @FrankHassanabad suggestion for using @FrankHassanabad What is the impact on users, could showing a duplicate mislead a user to take the wrong actions or is this mostly an annoyance? What's the priority for getting this fixed? It doesn't sound like a lot of work, but our roadmap is quite full.
|
I read too quickly and didn't see that data could be either missing or duplicated. We can probably live with duplicates sometimes, but missing data wouldn't be acceptable. |
There's no immediate requirement to implement this in Core/Saved Objects, we could only implement this in task manager for now since Saved Objects only uses |
@rudolf, at the moment I generate a UUID client side for a tie breaker field before pushing the record into the index within Kibana/NodeJS
For large lists, since they can exceed 10k, we had to switch from SO to another data index so I could use search after for our SIEM large list support, and also for a few other Elasticsearch abilities such as delete by query. If SO adds support for more than 10k and a few other features, and everyone feels comfortable adding large volumes of data to it, I would be 👍 for moving back to it. I started with it, but quickly ran into issues with the 10k limit and missing features such as delete by query, so I had to move to a data index. |
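A minimal sketch of the client-side approach described in this comment: stamp each record with a random UUID before it is indexed, then sort on the requested field plus that UUID. The helper names and the value field are hypothetical, and using a version 4 UUID is an assumption; only the idea of a client-generated UUID tie breaker comes from the comment.

```python
import uuid


def with_tie_breaker(record: dict) -> dict:
    """Return a copy of the record with a random UUID tie breaker attached."""
    # tie_breaker_id is the field name proposed in this issue; uuid4 is an assumption.
    return {**record, "tie_breaker_id": str(uuid.uuid4())}


def stable_sort(records: list[dict], field: str) -> list[dict]:
    """Sort by the requested field, falling back to the tie breaker on equal values."""
    return sorted(records, key=lambda r: (r[field], r["tie_breaker_id"]))


# Hypothetical list items that all share the same sort value.
items = [with_tie_breaker({"value": "10.0.0.1"}) for _ in range(5)]
first = [r["tie_breaker_id"] for r in stable_sort(items, "value")]
second = [r["tie_breaker_id"] for r in stable_sort(list(reversed(items)), "value")]
print(first == second)  # the result no longer depends on the input order
```

Because the tie breaker is written once at index time, the relative order of tied records stays the same across requests, which is what paging and batched export need.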
"Reviewed by Frank Hassanabad on 7/29/2020, still valid as of this date" |
Just wanted to double back on this ticket. We're adding auto refresh to our all rules table and the inconsistencies in sorting without a |
Is there any chance you could use an existing alerting SO field like It sounds like the only other option is for us to add a new field to the alert SO, which would presumably be a UUID, as the tie breaker. I'm worried about the cardinality explosion here, as the internal index on that new field will literally be as large as the # of alerts, so reusing an existing field(s) would help out there (I think!). We could potentially make this field optional, and probably not available for editing in the UI, so that alertsClient users could specify a value if they want, but the default would be that we don't use the field. Also wondering if this will bite us in the end, for things like the alerts list UI. No matter how we end up sorting the alerts in that list (eg, status, name, etc), seems like we may also be in the same situation - with an unstable sort, so missing/dup entries while paging - but perhaps that's a bit more survivable for customers than Frank's use case. |
@FrankHassanabad , any thoughts on my comment ^^^ |
Thanks, I did a sampling of data from the prepackaged rules install and the 2020-11-05T19:46:41.552Z
2020-11-05T19:46:41.406Z
2020-11-05T19:46:41.753Z
2020-11-05T19:46:41.716Z
2020-11-05T19:46:42.533Z
Uniqueness is not guaranteed: as you can see, some of those are very close to colliding, and we might end up with a few that are not unique. Still, as a workaround I think we can use them to keep the sort stable, or as close to stable as we can. |
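To put a number on how close the sampled values are, here is a small check over the five timestamps quoted in the comment. It only measures this sample and says nothing about uniqueness across a full install.

```python
from datetime import datetime, timedelta

# The five values sampled in the comment above.
samples = [
    "2020-11-05T19:46:41.552Z",
    "2020-11-05T19:46:41.406Z",
    "2020-11-05T19:46:41.753Z",
    "2020-11-05T19:46:41.716Z",
    "2020-11-05T19:46:42.533Z",
]

parsed = sorted(datetime.strptime(s, "%Y-%m-%dT%H:%M:%S.%fZ") for s in samples)
all_unique = len(set(parsed)) == len(parsed)
# Smallest gap between neighbouring timestamps, in whole milliseconds.
min_gap_ms = min((b - a) // timedelta(milliseconds=1) for a, b in zip(parsed, parsed[1:]))

print(all_unique, min_gap_ms)  # True 37
```

The sample is unique, but the closest pair is only 37 ms apart, so a collision at millisecond resolution is plausible when many rules are created in one batch.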
@FrankHassanabad thanks for checking that! I think we do still need a way to pass multiple fields to sort by though right? So for our rules example, when the user selects to sort by |
Yeah, we need to allow comma separated values to sort by within the |
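As a rough illustration of the comma separated idea, the sketch below splits a sort parameter into field names and sorts on them in order, so later fields only break ties. The function names, the sample rule names, and the shape of the parameter are all hypothetical; this is not the Kibana API.

```python
def parse_sort_fields(sort_param: str) -> list[str]:
    """Split a comma separated sort parameter into field names, dropping blanks."""
    return [f.strip() for f in sort_param.split(",") if f.strip()]


def sort_by_fields(docs: list[dict], sort_param: str) -> list[dict]:
    """Sort by each listed field in turn; later fields only break ties."""
    fields = parse_sort_fields(sort_param)
    return sorted(docs, key=lambda d: tuple(d[f] for f in fields))


# Made-up sample data: two rules share a name, so the second field decides their order.
rules = [
    {"name": "Suspicious login", "tie_breaker_id": "b"},
    {"name": "Suspicious login", "tie_breaker_id": "a"},
    {"name": "DNS tunneling", "tie_breaker_id": "c"},
]
ordered = sort_by_fields(rules, "name, tie_breaker_id")
print([r["tie_breaker_id"] for r in ordered])  # ['c', 'a', 'b']
```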
Adding another field is the only way to guarantee no duplicates (ES team might add a virtual tie breaker field at some point in the future elastic/elasticsearch#56828). Although every field adds overhead, I don't think the "cost" of this is prohibitively high, instead of N it will now be 2N. We might have to add a tie breaker field to all saved objects for #77961 |
Ah, adding a tie breaker to SO's themselves seems like the way to do this, as this is a generic SO issue IIRC, not specific to alerts. I'd certainly hate to add one to alerts in vX only to have it added to SO's in v>X, since then we'd have two tie breakers (and more wasted index space). |
With the changes in elastic/elasticsearch#68833, a tiebreaker no longer needs to be provided to Elasticsearch (as of 7.12) as long as you are searching using a point-in-time... ES now automatically applies an internally-created tiebreaker for you. In #89915, we introduced the ability to use I think this means this issue can be closed, but I'll let @FrankHassanabad confirm. |
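For reference, a point-in-time search page could be built roughly as follows. This only constructs request bodies and sends nothing; the helper name, the one minute keep-alive, the sort field, and the placeholder values are assumptions for illustration, while the pit, sort, size, and search_after keys follow the Elasticsearch search API.

```python
from typing import Optional


def build_pit_search(pit_id: str, sort: list[dict], size: int,
                     search_after: Optional[list] = None) -> dict:
    """Build the body for one page of a point-in-time search."""
    body = {
        "size": size,
        # No explicit tie breaker here: per the comment above, ES adds one under a PIT.
        "sort": sort,
        "pit": {"id": pit_id, "keep_alive": "1m"},
    }
    if search_after is not None:
        # Sort values copied from the last hit of the previous page.
        body["search_after"] = search_after
    return body


# Placeholder PIT id and sort values, not real ones.
pit = "<id returned when the PIT was opened>"
first_page = build_pit_search(pit, [{"name": "asc"}], size=10)
second_page = build_pit_search(pit, [{"name": "asc"}], size=10,
                               search_after=["<last name>", 42])
print(sorted(second_page))
```

The caller would pass the sort array of the final hit on each page as search_after for the next request, and that array includes the implicit tiebreaker value Elasticsearch appends.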
Not entirely clear to me if the tiebreaker needs to be consistent over time, or just over the "initial load" of the list being generated. If it needs to be consistent over time, then the per-PIT tiebreaker won't help. We also recently made the alert params a For a case like this, you'd like the process of alert creation to call into the alert type to explicitly set such a field to some random value. But we don't really have a way of doing that now. We've talked about "hooks", where an alert type could be involved in processes like this, which seems like it would be a good way to handle it. E.g., there would be a "pre-save" hook that an alert type could implement, which would be passed the alert data just before it's created/updated, with a chance to modify it. For now though, setting such a field in the alert type UI or via the alerts client is probably good enough. |
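The "pre-save" hook idea could look something like the sketch below. Everything here is hypothetical: the registry, the function names, the example.rule type id, and the choice to store the value under params are illustrations of the concept, not an existing alerting framework API.

```python
import uuid
from typing import Callable

PreSaveHook = Callable[[dict], dict]

# Hypothetical registry: alert type id -> hook run just before create/update.
pre_save_hooks: dict[str, PreSaveHook] = {}


def register_pre_save(alert_type_id: str, hook: PreSaveHook) -> None:
    pre_save_hooks[alert_type_id] = hook


def save_alert(alert: dict) -> dict:
    """Run the alert type's pre-save hook, if any, and return what would be persisted."""
    hook = pre_save_hooks.get(alert["alertTypeId"])
    return hook(alert) if hook else alert


def add_tie_breaker(alert: dict) -> dict:
    params = dict(alert.get("params", {}))
    # Keep an existing value on update so the tie breaker stays consistent over time.
    params.setdefault("tie_breaker_id", str(uuid.uuid4()))
    return {**alert, "params": params}


register_pre_save("example.rule", add_tie_breaker)
created = save_alert({"alertTypeId": "example.rule", "params": {}})
updated = save_alert(created)
print(created["params"]["tie_breaker_id"] == updated["params"]["tie_breaker_id"])
```

Setting the value only when it is absent is what keeps it stable across updates, which addresses the "consistent over time" concern that a per-PIT tiebreaker does not.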
@FrankHassanabad @yctercero Is a tie breaker field still needed given the changes to the core's |
I don't believe so. We're working on moving over to PIT, though don't have details on timeline. |
Closing as not needed. |
Describe the feature:
Provide a tie_breaker_id by copying the _id field so we can have stable sort/export in batches.
Describe a specific use case for the feature:
For Saved Objects and for SIEM rules we ran into issues with sorting and paging where we ended up with duplicates and, in some cases, missing data. We worked around this for export by loading all records into memory at once and then exporting. For our UI tables we let the user view up to 300 records at a time to minimize the chance of duplicates or misses while paging, but both are still possible once there are more than 300 records.
We have replicated this issue by paging through rules 10 at a time, inspecting each rule, and finding duplicates. We have also replicated it using curl, which rules out a UI issue.
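The failure mode can be reproduced in memory with a small simulation. It assumes, purely for the demo, that alternating requests are answered by copies of the data that hold tied documents in opposite orders; the enabled field and the rule ids are hypothetical.

```python
def search(docs, sort_key, page, per_page, request_no):
    """Simulate one page request; ties come back in whatever order this 'copy' holds."""
    # Assumption for the demo: odd-numbered requests hit a copy that stores docs reversed.
    seen = docs if request_no % 2 == 0 else list(reversed(docs))
    ordered = sorted(seen, key=sort_key)  # stable sort: tied docs keep the copy's order
    return ordered[page * per_page:(page + 1) * per_page]


def page_through(docs, sort_key, per_page=10):
    ids = []
    pages = -(-len(docs) // per_page)  # ceiling division
    for page in range(pages):
        hits = search(docs, sort_key, page, per_page, request_no=page)
        ids.extend(d["id"] for d in hits)
    return ids


# 25 hypothetical rules that all share one value for the sorted field.
rules = [{"id": f"rule-{i:02d}", "enabled": True} for i in range(25)]
all_ids = {r["id"] for r in rules}

unstable = page_through(rules, sort_key=lambda r: r["enabled"])
stable = page_through(rules, sort_key=lambda r: (r["enabled"], r["id"]))

print(len(unstable) - len(set(unstable)), "duplicates,", len(all_ids - set(unstable)), "missing")
print(len(stable) - len(set(stable)), "duplicates,", len(all_ids - set(stable)), "missing")
```

Sorting on the shared field alone yields 5 duplicates and 5 missing rules across the three pages, while adding a unique second sort key yields none of either.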
There are docs such as this one which highlight the problem and a solution (in the Important section):
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-body.html#request-body-search-search-after
If the alerting framework provided a copy of _id as a tiebreaker field called tie_breaker_id, as suggested there, for us to sort on, export on, etc., that would be a small mapping change utilizing copy_to:
https://www.elastic.co/guide/en/elasticsearch/reference/current/copy-to.html
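However the field ends up being populated, batched export with a search-after style cursor could then be sketched as below. This is an in-memory stand-in rather than an Elasticsearch client call; the function names and the name sort field are hypothetical.

```python
def next_page(docs, after, size):
    """Return up to `size` docs whose (name, tie_breaker_id) sorts after the cursor."""
    def key(d):
        return (d["name"], d["tie_breaker_id"])

    ordered = sorted(docs, key=key)
    if after is not None:
        ordered = [d for d in ordered if key(d) > after]
    return ordered[:size]


def export_in_batches(docs, size):
    """Walk the whole set one batch at a time, carrying the last sort values forward."""
    exported, after = [], None
    while True:
        batch = next_page(docs, after, size)
        if not batch:
            return exported
        exported.extend(batch)
        last = batch[-1]
        after = (last["name"], last["tie_breaker_id"])


# Seven hypothetical rules with identical names, so only the tie breaker orders them.
docs = [{"name": "same name", "tie_breaker_id": f"id-{i}"} for i in range(7)]
exported = export_in_batches(docs, size=3)
print(len(exported), len({d["tie_breaker_id"] for d in exported}))  # 7 7
```

Because the cursor includes a unique value, every document sorts strictly before or after it, so no batch can repeat or skip a record even when the primary sort field is identical.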