[RAC] Alert document updates should not spread results of a fields API query #113003
Comments
On the security solution side we've implemented "best-effort" merging of This strategy tries to get the best of both worlds, so we can retain any runtime/const keyword fields that only show up in |
cc @FrankHassanabad implemented it and may have more thoughts. |
Sorry, I just noticed this was already answered in another comment: we're using runtime fields inside alerting somehow. I'll change my question, then, to: When you read from |
@timroes my guess is that we want to also read fields not present in the source, like runtime fields and such. |
No, this is not desired. That's the bug we've identified with this approach here in this ticket. |
I've just brought this up at the Elasticsearch "Fix It" bi-weekly meeting and had a discussion about options. I think there are a few things here:
I think we should focus on looking into (i) or (ii) and avoid (iii) if possible. @marshallmain are there reasons I'm not aware of that make (iii) more preferable (the full document update)? |
Pinging @elastic/logs-metrics-ui (Team:logs-metrics-ui) |
I'd like to first understand all/any updates we are currently making to the alert document, so we can decide which of these options is feasible for us. We can change how this works later if the updates become more complicated, but for now, if we can simply use a painless script to have this happen on the ES node, I think it would have a lot of potential advantages. |
This sounds fine to me.
For the security solution where we're creating a new document and copying over the fields from a source index that we don't control, yes, this is desired. We want to ensure that runtime/const_keyword/aliased fields from the source index can be queried in the alerts as well. But it seems that the desired behavior is different here. |
If I understand correctly, what happens there is that we get the runtime fields from the source indices (i.e. In our scenario, we fetch the |
Looks like using the |
@weltenwort I think I agree on both points here, unless any of us knows of a reason to do it differently. |
It's safer for data that already exists in the source document, since ES doesn't do any transformations in the source itself. This next bit is a bit stream of consciousness, so forgive me if it feels disconnected. I wonder if we need to use the Right now we get all fields using the fields API and transform the result into something we can consume. If we query the I understand that we want to read possible runtime fields, but even then, do we expect to need them in the executor, or is that more for the UI queries (e.g. to perform searches, to show the values in the table, etc.)? Do we expect the possible future runtime fields to be arbitrary, or would we know which fields exist? If we would know, would it make sense to query the source and use the fields API only for those specific fields? Is there a way to request only a certain type of field using the fields API? (e.g. |
I'm just digging into this, so please bear with me... It looks like the field spread is also problematic for the creation of new documents, correct? I'm assuming we really only need to update these fields: kibana/x-pack/plugins/rule_registry/server/utils/create_lifecycle_executor.ts Lines 261 to 271 in 5c73c0c
Update: I guess we have to get the |
@simianhacker A bit of a history regarding this ticket. @afgomez wanted to store the rule params that triggered the alert and he explored storing it as an object but that caused lots of headaches when querying the field using the ES fields API. In the end he used an alternative for the problem he was trying to solve. He stored the params as a So I am wondering do we actually need to fix this issue, as we used an alternative solution in the end? @jasonrhodes |
@simianhacker for new documents there won't be predecessor documents whose fields will be flattened incorrectly due to the usage of the kibana/x-pack/plugins/rule_registry/server/utils/create_lifecycle_executor.ts Lines 219 to 220 in 5c73c0c
|
Summary
Right now, to update an alert document, we first read the document from the index using the fields API, make the changes we need, and write everything back.
The fields API doesn't necessarily return what was indexed via the _source.
Consider this simplified example:
As you can see, the fields object bears no resemblance to the original object in the _source. This is how ES works and it's expected. The potential problem arises when we write back the fields as _source in the document. What happens is that the document is updated with what's in the fields, and not with what the original _source looked like.
Another problem is that if the mapping contains any runtime fields, field aliases, etc. that are not in the _source, they will be returned by the fields API and later saved in _source. This is not something we explicitly intend to do, as far as I'm aware.
Plan for fixing this bug
We can't keep doing this the way we currently do, where we spread the results of the fields API query back into the document. This produces unintended results in the documents for aliased fields, runtime fields, and certain field types like object, which come back from the fields API in a different format than the one in which they are meant to be indexed.
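The mismatch described above can be sketched in TypeScript without calling Elasticsearch. This is only an illustration: toFieldsResponse is a hypothetical helper that approximates the shape the fields API returns (flattened dotted paths, every value wrapped in an array), and the document contents are made up.

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// Hypothetical helper: approximates the fields API response shape.
// Nested objects become dotted paths and every value is wrapped in an array.
// Simplification: arrays are treated as leaf values, nulls are skipped.
function toFieldsResponse(
  source: { [key: string]: Json },
  prefix = ''
): Record<string, Json[]> {
  const out: Record<string, Json[]> = {};
  for (const [key, value] of Object.entries(source)) {
    if (value === null) continue;
    const path = prefix ? `${prefix}.${key}` : key;
    if (typeof value === 'object' && !Array.isArray(value)) {
      Object.assign(out, toFieldsResponse(value, path));
    } else {
      out[path] = Array.isArray(value) ? value : [value];
    }
  }
  return out;
}

// Made-up alert document as it was originally indexed.
const originalSource = { rule: { params: { threshold: 5 } }, status: 'active' };

// What a fields-style read gives back for that document.
const fieldsView = toFieldsResponse(originalSource);

// The problematic pattern: spread the fields result and write it back as the document.
const rewritten = { ...fieldsView, status: 'recovered' };

console.log(JSON.stringify(originalSource));
console.log(JSON.stringify(rewritten));
```

The rewritten document no longer contains the nested rule object. It carries a flat "rule.params.threshold" key holding an array, which is the kind of unintended change this issue is about.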
We should read from _source for the unmapped fields that we want to read, as well as any object field. "include_unmapped" will likely go away at some point, it sounds like.
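A minimal sketch of that idea, assuming a search hit that carries both _source and fields. The SearchHit shape is simplified, buildUpdatedAlert is a hypothetical helper rather than existing rule registry code, and the field values are invented for illustration.

```typescript
type AlertSource = Record<string, unknown>;

// Simplified shape of a search hit that requested both _source and fields.
interface SearchHit {
  _id: string;
  _source: AlertSource;
  fields?: Record<string, unknown[]>;
}

// Hypothetical helper: build the next version of the document from _source
// plus the explicit changes only. The fields response is deliberately ignored,
// so runtime fields and aliases never leak into the stored document.
function buildUpdatedAlert(hit: SearchHit, changes: AlertSource): AlertSource {
  return { ...hit._source, ...changes };
}

const hit: SearchHit = {
  _id: 'alert-1',
  _source: {
    'kibana.alert.status': 'active',
    rule: { params: { threshold: 5 } },
  },
  fields: {
    'kibana.alert.status': ['active'],
    'rule.params.threshold': [5],
    day_of_week: ['Monday'], // made-up runtime field: in fields, not in _source
  },
};

const updatedAlert = buildUpdatedAlert(hit, { 'kibana.alert.status': 'recovered' });
console.log(JSON.stringify(updatedAlert));
```

Here the nested rule object survives unchanged and the made-up runtime field day_of_week is not persisted, because only _source and the explicit changes feed the write.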
We have three options to solve the update issue, listed here from (imo) best to worst:
I think we should focus on looking into (i) or (ii) and avoid (iii) if possible.
AC:
Note: The security solution appears to have logic in place (I'm not sure exactly where it's used at the moment) that attempts to merge the Fields API response with the _source response and then write the Fields version back to _source on purpose. I don't see an advantage to that for our use case right now but we should make sure we understand the purpose of doing that before making a final decision on which approach to use here.