-
Notifications
You must be signed in to change notification settings - Fork 8.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Saved object id migration issues #59598
Comments
Pinging @elastic/kibana-platform (Team:Platform) |
Pinging @elastic/kibana-app-arch (Team:AppArch) |
When considering the technical implementations for sharing saved-objects in multiple spaces, we explored every option that we could think of that didn't require us to rewrite IDs. Unfortunately, all of these alternative implementations had issues which made them untenable.
I'm not aware of any other plans which would require IDs to be rewritten. However, using saved-object references properly is critical to key functionality in Kibana. It seems like a fairly simple concept to teach developers. Perhaps we should investigate ways to make their use more apparent with the improvements which are being made in the Kibana Platform? |
If we require expressions to be something like Consider a Lens Embeddable that is "by value". It takes as input an expression string. We developers anyway to be able to embed Lens visualizations, so they can do something like:
And the visualization that powers that expression would be rendered inline. That expression gets run through the interpreter. Esaggs function uses the id to get the index pattern saved object via If we instead did
So maybe the lens embeddable would pass it in here, but this lens embeddable is "by value" - it's hard coded by a developer in their app. It isn't backed by a saved object. What happens when we let the user edit the expression and they are typing in a value for Right now all information needed to render is encapsulated in the expression string itself. If we replace ids with reference names, I don't see how we can do that anymore. When I was originally talking about these registry items needing to declare their references, I was thinking simply an array of What purpose does Hope that helps explain my concerns a bit more. |
The main problem references tries to solve is: how can we identify and extract the saved object id's from a saved object. This enables us to:
The current references array allows a saved object to decouple internal references from the actual id. By using the I agree that the developer experience of this scheme isn't ideal when you're trying to share a referenced id across saved objects because it requires also passing along all the references. If we ignore (3) and decide that ID's will never change, we could embed ID's in strings like we do for expressions as long as the embedded ID's were also added to the references array. It does mean that any object that stores an expression needs to be able to parse that expression to extract the id and type. |
Having a consistent way to look at saved object references gives us a generic way to:
I feel like we've mostly been focused on child references here, but the ability to see parent references (i.e., "what saved objects are using this saved object") is another important benefit, and is a stepping stone to the referential integrity that @rudolf mentioned. We even surface this in the saved objects management UI today:
This is an interesting use case. We have saved objects which are referenced / in-use, but we have no way of tracking because their consumers are ephemeral in a certain sense. So we end up giving users a false sense of security when managing saved objects: They could get the impression that a saved object isn't being used anywhere, and decide to delete it. Then this hard-coded embeddable will fail because it has a "hidden" dependency that we couldn't account for. This almost breaks an assumption saved objects can only be referenced by other saved objects. Can you give an example of when we would want to hard-code a visualization like this in code? I'm sure there are valid use cases here, but it would help me to have a concrete example or two to think through. |
There are two cases: Embeddables backed by a saved objects, and embeddables not backed by a saved object. The use case I list above is the latter.
If someone did want to embed a visualization backed off a saved object, they need to ensure it exists or create it. There is actually a use case for this. SIEM would like to use a dashboard in their overview page and wants it to be via a saved object so users can click "edit dashboard", and customize their overview page: #52680. In order to support that we'd need to implement something like this #33496 But I think the non saved object backed embeddable is still an example that shows why this would be complex. @timroes did mention to me this morning an idea where the expressions that exist at run time could still have these id references, as above, if we implemented |
It seems to me that finding a way to make all objects use references would allow us to solve the overall problem here. One assumption regarding expressions that I'd like to challenge: is using explicit ids in an expression something users actually want? I think if I knew nothing about how Kibana works, I'd just find this confusing. What could be useful is there was some way to get a list of available options in the editor's autocomplete, and once selected, that chosen object would be represented by a human-readable name, rather than a opaque UUID. In other words, a named reference. IMO, this is a better experience than what we have today. In addition it allows us to easily find all expressions that use a particular index pattern or saved search. It seems the main driver for using explicit, global ids in expressions has more to do with the "portability" feature rather than the ease-of-use of writing expressions. For one, I suspect that ease-of-use should take priority over portability. Second, I think we could come up with a clever solution around the copy/paste mechanics to smooth over the portability issue. For example, when a user copies an expression, we could append the references data to the string that is copied to the clipboard. Then when this string is pasted, we could read out the references, and strip it from the UI. Of course this wouldn't solve portability with older versions of Kibana, but I'm not sure how important that use case is. |
@joshdover - Can you explain a bit more how we could allow the user to use a human readable name? I don't see how this would work since names are not unique, ids are. Users only hand edit the expression as a last resort. We have UI controls which make going from UX -> expression string easy so I am not concerned about the id being less user friendly than a name. At least it's obvious what it should be, I wouldn't know what to put if it was Like, this is the flow for Canvas embeddables A UX component to select which saved object to add: How would this work with references? Would we have to ask the user to enter in a unique reference name? Would we just arbitrarily create and use another guid for the ref name just to make the saved object guid modifiable? We support editing an object via import, if ids match something already existing in saved objects. Do we not want to officially support that because we might change those ids between the time the object was exported and then imported back in? We have ids stored in URLs, not reference names, and we want to support the BWC of them. We need this "alias" feature to support that - if the alias feature supports ids in urls, why not ids coming from other storage mechanisms?
What if I copied the expression, pasted it into a google doc, what would it look like? Could I copy what is in the google doc and paste it back into the expression editor? Would this show up in Kibana editor:
but the copied string, pasted into google docs would actually be:
But even then, they couldn't paste that back into the editor reliably because that has the old id in the map, which might have changed, and now the reference is broken. Supporting this references name indirection greatly complicates a lot of code/implementations and DX. All to make sure we can migrate ids, which we can't actually do without breaking all our URLs, so we have to implement alias's, so, why require this names indirection at all and just rely on id alias's? |
I was thinking we could generate a default name based on the type. It just has to be unique within the SO, not globally unique. So things like These names would be the reference name, which would map to the real object.
I agree it would break in this case if the ID changed, unless we have aliases. We talked a bit offline and it seems like the alias approach eliminates the problems this issue brings up. To be more specific, with the alias support in the SO client itself (via the proposed |
Aliases were intentionally only scoped to identifiers in the URL. When using an alias, there's a potential that both an alias and a saved-object ID match. When there are multiple saved-objects which match we have a decision we need to make: choose one or throw an error. Throwing an error is safest, but we don't want to throw these errors from within operations like exporting a saved-object with references. This will be frustrating for end users, and trying to surface these errors in an easy to understand manner is difficult. We were hoping to isolate the situations where this occurs, hence only using them in URLs. |
I wanted to leave some more context on the expression part of that discussion that I talked with Stacey about. Can we use "names" instead of "ids"? I think we're just building a secondary id at that point, since it anyway needs to be unique. So yes while I agree it might be nicer having something more human readable in the expression, we're running in all the same issues then with Also there was a suggestion around, that we could keep for index-pattern the index pattern string instead, which would be a valid suggestion for index-pattern, but doesn't help us with the general issue, e.g. for embedability functions. The current suggestion that @flash1293 @ppisljar and I so far have worked out, would be having something like a reference function directly inside the expressions plugin, that could be used as follows (I am using index patterns here in the example, but of course the same problem transfers to any id):
Now the
This can be used by anyone needing to store expressions to extract references from it. This works fine as long as ids are unchangeable. As soon as we must assume that
This function would now return the references and a new expression, that's ready to save and would look like in the above example like:
I really don't like that approach, since it means we're storing a completely different expression in the saved object, than the user actually sees (and is able to edit) during runtime, and I fear that might will cause at some point confusions in the best case, or have people breaking their saved object during import/export. My personal preferences given the issues listed here, and the discussions we had, is also to keep |
And talked another round with Joe, and I am not so sure if we can actually keep |
If aliases are only created when there's an id regeneration, why would both the alias and the saved object match? The only reason I can think of is that we have a unique id "collision" which causes a new saved object to be created with the same ID as a previously aliased object. If that's the only case, the following proposal might give us BWC ID's while preventing that an object is ever created with an id matching that of an alias:
|
Dang, good call on the "copy references to space" use case Joe and Tim. I can't think of a way around that, so I guess we do have to support this reference replacement functionality in any persisted implementations that may contain references (at least expressions, embeddables) after all. |
In the original proposal, end-users can create aliases themselves, so there are any number of ways this could happen. However, the system itself can end up causing a conflict in the following situation, which @jportner recently discovered when implementing the first phase of sharing saved-objects in multiple spaces:
If I understand correctly, you're proposing not allowing end-users to specify aliases themselves with the intent that this would prevent multiple saved-objects from ever matching. If my understanding is correct, we'd still end up with a conflict with the previous situation. When working through the various technical approaches to allow saved-objects to be shared in multiple spaces, we spent an incredibly long time trying to find a perfect solution which didn't have any trade-offs or any conflicts like the above. Unfortunately, we were unable to find one which didn't have more issues than the one we decided upon. The design we decided upon attempts to minimize the situations where these conflicts do occur, and ensure that end-users are able to rectify the conflicts themselves. |
current plan with expressions is to have very limited access to any of our services, including index pattern, meaning that we will have just one function that can read index patterns this means that extracting references at save time (and restoring at load time) should be possible, as we know exactly what to look for: walk the AST tree looking for any indexPattern function and read the id parameter. Every saved object storing expression would need to call a utility function to extract references from expression and then push them and their references array. |
How would that work with third party developers who add an expression fn that is using a reference? |
@stacey-gammon every function can provide relevant issue: #61769 |
Also, reference extraction and injection is supported on expressions already and |
Pinging @elastic/kibana-presentation (Team:Presentation) |
Given @stacey-gammon's comment in #59598 (comment) it sounds like adopting references is a necessary pain we have to live with but it will solve the problems in this issue. I'm closing this as I don't see anything actionable in this issue, but feel free to re-open. |
The Situation
In order to support shared saved objects in multiple spaces, the @elastic/kibana-security team is planning to migrate saved object ids. This feature is written up here: #27004.
We were already aware of potential issues with needing to migrate registry items that get serialized inside saved objects. This issue is written up here: #56166
If saved object ids are going to be migrated, we need to migrate some of our registry items to support that feature. Specifically:
Embeddables is not an issue because saved object ids are currently a primary concept and dashboard uses the correct mechanism of storing references in the dashboard saved object structure, not the actual saved object id. (e.g. in code, dashboard input has a
savedObjectId
field but insidesaved_dashboard_references
there is anextractReferences
andinjectReferences
that gets used inbetween pulling the saved object out and converting it into a saved object.The problems
We've identified several issues we are going to run into, or already have:
Persistable registry items have no formal concept of migration, nor do they have a formal concept of saved object id references.
We need this if we want this to work with "export references" and "copy to space".
For example, currently I'm almost positive Canvas workpad exports/copy to space will be missing references because there is no formal extraction of saved object ids into saved object references prior to saving the saved object. Lens also has this problem now. @timroes attempting to fix it is how this came to our attention when he realized the larger problem with the saved object id migration.
This in and of itself raises a slew of questions. Before we assumed we could do these migrations on the fly, in the code, but that is no longer the case. If the reference output changes, we need to persist those changes in saved objects to preserve export/copy to space functionality. There are performance implications.
Assuming we work through this feature, there are even more questions/issues for how will this work when the saved object ids are also being migrated.
Our migration system currently runs migrations by type, then by version.
This means that index patterns could be migrated from 7.0 to 8.0 before lens visualizations even start the migration for 7.0, or vice versa. This could be a problem if ids can change, as well as references. For example:
index-pattern: { id: "1234", .... }
=>index-pattern: { id: "5678", .... }
(The id has the potential to change to account for collisions, so let's assume a collision in this case forced the id to change)
lens: { id: "..", references: [{ type: index-pattern: id: "1234"}]}
=>lens: { id: "..", references: [{ type: index-pattern: id: "4567"}]}
esaggs index-pattern="1234"
=>esaggs index-pattern="1234", references: { type: index-pattern, id: "1234"}
Problems:
We can solve the first problem by having our migration system run version -> all types, instead of type -> all versions. This doesn't solve the second problem.
Saved object id migrations creates a lot of complications
One question I think we should ask ourselves is whether we can avoid this altogether, or continue to support old ids.
If in this specific situation, we can't, we can create a one time solution because we don't officially support third party developers. We have knowledge of all the registry implementations that need to change: esaggs, maps embeddable, we'll search for any more. When we do eventually officially support third party developers, we can hopefully then make the commitment to never change ids.
Can we now, or starting in 8.0, decide saved object ids cannot be mutated?
If we don't, I believe that requiring every developer to know to use a reference model instead of persisting ids, will be very difficult to enforce.
Summary
There is a lot to discuss here. Main topics boil down to:
Solving Discuss: How to migrate a saved object that contains state from another plugin #56166 is important, and it needs to have a first class concept of saved object references.
How can we solve the spaces feature without breaking visualize, lens and canvas workpad saved objects?
cc @peterschretlen @kobelb @crob611 @joshdover @rudolf @lukeelmers
The text was updated successfully, but these errors were encountered: