Reduce performance impact of deleting legacy url aliases #141136
Pinging @elastic/kibana-core (Team:Core)
Pinging @elastic/kibana-security (Team:Security)
We should also consider making
Because the majority of saved object types are of
Some use cases that might attempt to do large bulk deletes where legacy URL aliases might cause performance problems:
I had an idea I wanted to suggest for these saved objects. They don't contain very many fields (see the example below). What if, instead of creating a saved object for every one of them, we had a single saved object containing a list of all the sourceId, targetId, etc. from each?
This would reduce the number of SO docs Kibana has to read and write during migration. There's obviously some risk to this idea; it's putting more eggs in one basket, and if this one doc were accidentally deleted or corrupted it would be bad. It might even be necessary to keep a few documents as old versions as updates are made. But how large might this one doc get? And how much memory in Kibana would it potentially consume while cached? Maybe the new file service could chunk it if it was too large. Example doc:
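For reference, here is a rough sketch of what a single alias document contains and what the proposed aggregated document might look like. The field names follow the `legacy-url-alias` type, but the exact shapes, IDs, and the aggregate type name are illustrative assumptions rather than verbatim Kibana structures.

```ts
// Illustrative shape of one legacy-url-alias saved object (field names approximate).
const singleAlias = {
  _id: 'legacy-url-alias:marketing:dashboard:123',
  _source: {
    type: 'legacy-url-alias',
    'legacy-url-alias': {
      sourceId: '123',
      targetNamespace: 'marketing',
      targetType: 'dashboard',
      targetId: '8eabdb62-e0cc-4537-9c0c-335a63c36362',
      purpose: 'savedObjectConversion',
      resolveCounter: 3,
      disabled: false,
    },
  },
};

// The idea above: one (hypothetical) document holding an entry per alias.
const aggregatedAliases = {
  type: 'legacy-url-alias-index', // hypothetical aggregate type name
  aliases: [
    {
      sourceId: '123',
      targetNamespace: 'marketing',
      targetType: 'dashboard',
      targetId: '8eabdb62-e0cc-4537-9c0c-335a63c36362',
    },
    // ...one entry for every alias in the deployment
  ],
};
```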
@elastic/kibana-security Would the suggestion in #141136 (comment) be feasible?
@rudolf Considering we've seen real instances of many
Some of the history escapes me at the moment, but one of the reasons we create an alias-document-per-object is to prevent multiple aliases from referencing the same object. Document IDs are the only uniqueness guarantee that we have in ES today, and so leveraging this ID for uniqueness prevents another class of unresolvable conflict errors.
Based on what I've been able to reconstruct, I don't think it would be infeasible, but a change like this would require careful planning, design, and consideration before moving forward with implementation.
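To make the uniqueness point concrete: as I understand it, the alias's document ID is built deterministically from the target space, type, and source ID, so Elasticsearch itself rejects a second alias for the same source object. The exact ID format below is an assumption for illustration, not verbatim Kibana code.

```ts
// Assumed deterministic ID scheme: two aliases for the same
// (targetNamespace, targetType, sourceId) tuple would map to the same document ID
// and therefore cannot coexist, which is what enforces uniqueness today.
const rawAliasId = (targetNamespace: string, targetType: string, sourceId: string) =>
  `legacy-url-alias:${targetNamespace}:${targetType}:${sourceId}`;

rawAliasId('marketing', 'dashboard', '123'); // 'legacy-url-alias:marketing:dashboard:123'
```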
I agree with @legrego that this would be a risky change to make. If we put all the aliases in one document, we could enforce the uniqueness constraint in Kibana itself, but I would be concerned about the potential size of this document. We could probably test this easily by seeing how big the JSON of an object with 2m legacy URL aliases inside an array would get. Another option could be to disable legacy URL aliases for some SO types. I don't know the details, but I could imagine that we might have URLs to a case but not to a case-user-action. This would not solve the problem completely, but it could reduce the impact somewhat.
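A quick back-of-the-envelope estimate of that size concern, assuming each alias entry serializes to only its core fields (real entries with extra fields like resolveCounter would be somewhat larger):

```ts
// Rough size estimate for the single-document idea (illustrative field values).
const entry = {
  sourceId: 'cases:4a702b54-7048-4b1c-96d8-8f83a6a0b6c8',
  targetNamespace: 'marketing',
  targetType: 'cases',
  targetId: '8a6a0b6c-96d8-4b1c-7048-4a702b54f83a',
};

const entryBytes = new TextEncoder().encode(JSON.stringify(entry)).length; // ~160 bytes
const totalMb = (entryBytes * 2_000_000) / (1024 * 1024);
console.log(`~${Math.round(totalMb)} MB of JSON for 2M aliases`); // on the order of 300 MB
```

So a 2m-alias array would likely land in the hundreds of megabytes, well beyond what we would want to read, parse, and re-index on every update.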
Yes to @legrego and @rudolf - definitely something to approach carefully, and worth spending some time to test design feasibility. One other idea to mitigate having one large alias doc would be to create one alias doc per type and/or per space. We could also create a secondary lookup system if this still proves to be too large in some deployments (have a max doc size, multiple alias docs, and a lookup doc), but I think that may be more complicated than justified by the benefits. I'm not sure what our most dense deployments look like IRL.
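A sketch of the per-type/per-space variant, with hypothetical names (none of this is an existing Kibana type or API):

```ts
// Hypothetical bucketed layout: one alias document per (space, type) pair, so no
// single document grows with the whole deployment, only with one type in one space.
interface AliasEntry {
  sourceId: string;
  targetId: string;
}

interface AliasBucketDoc {
  targetNamespace: string; // one bucket per space...
  targetType: string; // ...and per saved object type
  aliases: AliasEntry[]; // uniqueness within the array would be enforced by Kibana
}

// A deterministic bucket ID keeps reads cheap: one GET per (space, type) pair.
const bucketId = ({ targetNamespace, targetType }: Pick<AliasBucketDoc, 'targetNamespace' | 'targetType'>) =>
  `legacy-url-alias-bucket:${targetNamespace}:${targetType}`;

bucketId({ targetNamespace: 'marketing', targetType: 'dashboard' });
// 'legacy-url-alias-bucket:marketing:dashboard'
```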
Unfortunately, we have URLs that point to user actions and comments. Each comment and user action has a "Copy reference link" button (see screenshot) with which the user can get a URL pointing to this particular user action or comment (the UI will scroll to this position). The format of the URL is
With the addition of a bulk delete API, we run the risk of performance degradation when objects being deleted have many legacy URL aliases.
The API that handles legacy URL alias cleanup after a write operation uses an `updateByQuery` with a simple script to delete one alias at a time. For the API to handle many objects (a pseudo-bulk operation), the script would need to be adapted to handle different namespaces and delete behavior for each saved object that no longer exists.
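For context, the single-object cleanup described above looks roughly like the following. This is a hedged sketch using the Elasticsearch JS client directly, with an assumed `.kibana` index and `legacy-url-alias` field names; it is not the actual repository code.

```ts
import { Client } from '@elastic/elasticsearch';

const client = new Client({ node: 'http://localhost:9200' });

// Sketch: delete the aliases that point at one object, but only in the affected spaces.
// A pseudo-bulk version would need per-object namespace lists and delete behavior,
// which is where the current script becomes awkward.
async function deleteAliasesForObject(type: string, id: string, namespaces: string[]) {
  await client.updateByQuery({
    index: '.kibana', // assumed index name
    conflicts: 'proceed',
    query: {
      bool: {
        filter: [
          { term: { type: 'legacy-url-alias' } },
          { term: { 'legacy-url-alias.targetType': type } },
          { term: { 'legacy-url-alias.targetId': id } },
        ],
      },
    },
    script: {
      lang: 'painless',
      source: `
        if (params.namespaces.contains(ctx._source['legacy-url-alias'].targetNamespace)) {
          ctx.op = 'delete';
        } else {
          ctx.op = 'noop';
        }
      `,
      params: { namespaces },
    },
  });
}
```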
At the moment, there are two bulk operations that include cleaning up legacy URL aliases:
- `updateObjectSpaces`
- `bulkDelete`
Both use a `pMap` to limit the number of concurrent operations to try and mitigate blocking the event loop. To avoid potentially serious performance-related issues, the saved objects repository needs a performant API to delete legacy URL aliases in bulk.
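The concurrency cap referred to above is along these lines: a sketch using the `p-map` package, where `deleteAliasesForObject` is the hypothetical single-object helper from the earlier sketch. Each object still costs its own `updateByQuery` round trip, which is why a true bulk API would help.

```ts
import pMap from 'p-map';

// Hypothetical single-object helper from the sketch above.
declare function deleteAliasesForObject(type: string, id: string, namespaces: string[]): Promise<void>;

interface ObjectToClean {
  type: string;
  id: string;
  namespaces: string[];
}

// Cap concurrent cleanup calls so a large bulk delete doesn't monopolize the event loop,
// at the cost of issuing one updateByQuery per object.
async function cleanupAliases(objects: ObjectToClean[]) {
  await pMap(
    objects,
    ({ type, id, namespaces }) => deleteAliasesForObject(type, id, namespaces),
    { concurrency: 10 }
  );
}
```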
Originally posted by @TinaHeiligers in #139680 (comment)