[RAC] Alert document updates should not spread results of a fields API query #113003

afgomez · 2021-09-23T16:23:56Z

Summary

Right now to update an alert document, we first read the document from the index using the fields API, make the changes we need, and write everything back.

The fields API doesn't necessarily return what was indexed via the _source.

Consider this simplified example:

# Create the index
PUT /test
{
  "mappings": {
    "properties": {    
      "foo": { "type": "object", "enabled": "false" }
    }
  }
}

# Save a document
POST /test/_doc/1
{
  "foo": {
    "nodeType": "host",
    "criteria": [{ "comparator": "GT", "threshold": 0.1 }, { "comparator": "LT", "threshold": 0.5 }]
  }
}


# Retrieve the document using fields
GET /test/_search
{
  "_source": false,
  # Note the `include_unmapped` here
  "fields": [{ "field": "*", "include_unmapped": "true"}]
}


# ...
{
  ...,
  "hits" : {
    ...,
    "hits" : [
      {
        ...,
        "fields" : {
          "foo.criteria.threshold" : [
            0.1,
            0.5
          ],
          "foo.nodeType" : [
            "host"
          ],
          "foo.criteria.comparator" : [
            "GT",
            "LT"
          ]
        }
      }
    ]
  }
}

As you can see the fields object has no resemblance with the original object in the _source. This is how ES works and it's expected. The potential problem arises when we write back the fields as _source in the document. What happens is that the document is updated with what's in the fields, and not with what the original "_source" looked like.

POST /test/_doc/1
{
  "foo.criteria.threshold" : [
    0.1,
    0.5
  ],
  "foo.nodeType" : [
    "host"
  ],
  "foo.criteria.comparator" : [
    "GT",
    "LT"
  ]
}

Another problem is that if the mapping contains any runtime fields, field aliases, etc. that are not in the _source, they will be returned by the fields API and later saved in _source. This is not something we explicitly intend to do, as far as I'm aware.

Plan for fixing this bug

We can't do this the way we are currently doing it where we spread the results of the Fields API-driven query back into the document. This will produce unintended results in the documents for aliased fields, runtime fields, and certain field types like object which come back from the fields API in a different format than they are intended to be indexed.
We should read from _source for the unmapped fields that we want to read, as well as any object field. "include_unmapped" will likely go away at some point, it sounds like.
We have three options to solve the update issue, listed here from (imo) best to worst:
1. The Elasticsearch update API would allow us to let the GET + PUT logic be handled on the ES nodes. We would likely need to write the update script in Painless, though.
2. Create a new object where updates are applied, and only apply those updates back to the document. Don't attempt to spread the entire document back into the update object and do a full document update/replace.
3. Query from _source and from the fields API, then spread the _source result back onto the full document to do a full document update/replace.

I think we should focus on looking into (i) or (ii) and avoid (iii) if possible.

AC:

List all updates we're currently doing to alert documents (even if some of them only happen conditionally)
Remove "include_unmapped" and use _source for any object/unmapped fields we need to access
Look into options (i) and (ii) above and decide on what's best (with the possibility that we are missing context that makes (iii) desirable, in which case we'll need to read in from _source and the fields API.

Note: The security solution appears to have logic in place (I'm not sure exactly where it's used at the moment) that attempts to merge the Fields API response with the _source response and then write the Fields version back to _source on purpose. I don't see an advantage to that for our use case right now but we should make sure we understand the purpose of doing that before making a final decision on which approach to use here.

The text was updated successfully, but these errors were encountered:

marshallmain · 2021-09-23T17:12:22Z

On the security solution side we've implemented "best-effort" merging of fields into _source which may be useful here. See implementation and readme.

This strategy tries to get the best of both worlds, so we can retain any runtime/const keyword fields that only show up in fields but also retain the original structure of _source.

marshallmain · 2021-09-23T17:14:26Z

cc @FrankHassanabad implemented it and may have more thoughts.

timroes · 2021-09-27T15:30:56Z

~~I am curious: Why are we reading the document in general via fields API, when we want to use it as _source to index again? Wouldn't it make more sense reading the _source here directly?~~

Sorry just noticed this was already answered in another comment. That we're using runtime fields inside alerting somehow. I change my question then to: When you read from fields you'll get the runtime fields back and will now index them. Is that a desired behavior for those fields?

afgomez · 2021-09-27T15:31:40Z

@timroes my guess is that we want to also read fields not present in the source, like runtime fields and such.

jasonrhodes · 2021-09-27T15:37:08Z

When you read from fields you'll get the runtime fields back and will now index them. Is that a desired behavior for those fields?

No, this is not desired. That's the bug we've identified with this approach here in this ticket.

jasonrhodes · 2021-09-27T15:41:28Z

I've just brought this up with the Elasticsearch "Fix It" bi-weekly meeting and had a discussion about options. I think there are a few things here:

We can't do this the way we are currently doing it where we spread the results of the Fields API-driven query back into the document. This will produce unintended results in the documents for aliased fields, runtime fields, and certain field types like object which come back from the fields API in a different format than they are intended to be indexed.
We should read from _source for the unmapped fields that we want to read, as well as any object field. "include_unmapped" will likely go away at some point, it sounds like.
We have three options to solve the update issue, listed here from best to worst:
1. The Elasticsearch update API would allow us to let the GET + PUT logic be handled on the ES nodes. We would likely need to write the update script in Painless, though.
2. Create a new object where updates are applied, and only apply those updates back to the document. Don't attempt to spread the entire document back into the update object and do a full document update/replace.
3. Query from _source and from the fields API, then spread the _source result back onto the full document to do a full document update/replace.

I think we should focus on looking into (i) or (ii) and avoid (iii) if possible.

@marshallmain are there reasons I'm not aware of that make (iii) more preferable (the full document update)?

elasticmachine · 2021-09-27T15:55:11Z

Pinging @elastic/logs-metrics-ui (Team:logs-metrics-ui)

jasonrhodes · 2021-09-27T15:59:30Z

I'd like to first understand all/any updates we are currently making to the alert document, so we can decide which of these options is feasible for us. We can change how this works later if the updates become more complicated, but for now, if we can simply use a painless script to have this happen on the ES node, I think it would have a lot of potential advantages.

marshallmain · 2021-09-27T16:15:45Z

The Elasticsearch update API would allow us to let the GET + PUT logic be handled on the ES nodes. We would likely need to write the update script in Painless, though.

This sounds fine to me.

When you read from fields you'll get the runtime fields back and will now index them. Is that a desired behavior for those fields?

For the security solution where we're creating a new document and copying over the fields from a source index that we don't control, yes, this is desired. We want to ensure that runtime/const_keyword/aliased fields from the source index can be queried in the alerts as well. But it seems that the desired behavior is different here.

afgomez · 2021-09-29T12:30:45Z

For the security solution where we're creating a new document and copying over the fields from a source index that we don't control, yes, this is desired.

If I understand correctly, what happens there is that we get the runtime fields from the source indices (i.e. foobar*) and store them as regular fields in the .alert* indices, correct?

In our scenario, we fetch the fields from the .alert* indices and store them back as regular fields in the same document. This might cause problems if we ever add a runtime field or other virtual fields to the .alert* indices (which is what we want to prevent).

weltenwort · 2021-10-04T14:49:16Z

Looks like using the _update API with partial updates would be safest and doesn't require painless. It's not apparent to me whether reading from _source might still be safer than the fields API in this case or not. Until we're actually making use of runtime fields it might be the choice least likely to mess up values while writing back previously read values. 🤔

jasonrhodes · 2021-10-06T02:18:53Z

@weltenwort I think I agree on both points here unless anyone of us knows of any reasons to do it differently.

afgomez · 2021-10-11T13:40:47Z

It's not apparent to me whether reading from _source might still be safer than the fields API in this case or not

It's safer for data that exists already in the source document, since ES doesn't do any transformations in the source itself.

This next bit is a bit stream of consciousness, so forgive me if it feels disconnected

I wonder if we need to use the fields API in the executor at all. Is there a reason to read from the fields API right now? Does it do something for us that reading from the source doesn't or should we apply YAGNI?

Right now we get all fields using the fields API and transform the result into something we can consume. If we query the _source it would already be in a consumable form.

I understand that we want to read possible runtime fields, but even then do we expect to need them in the executor or is that more for the UI queries (e.g to perform searches, to show the values on the table, etc.)

Do we expect the possible future runtime fields to be arbitrary or would we know which fields exist? If that's the case would it make sense to query the source and use the fields API for those specific fields? Is there a way to only a certain type of field using the fields API? (e.g. fields: [{ field: '*', type: 'runtime' }]).

simianhacker · 2021-11-02T19:08:27Z

I'm just digging into this so please bear with me...

It looks like the field spread is also problematic for the creation of the new documents as well, correct?

I'm assuming we really only need to update these fields:

kibana/x-pack/plugins/rule_registry/server/utils/create_lifecycle_executor.ts

Lines 261 to 271 in 5c73c0c

    
           [ALERT_DURATION]: (options.startedAt.getTime() - new Date(started).getTime()) * 1000, 
        
           [ALERT_INSTANCE_ID]: alertId, 
        
           [ALERT_START]: started, 
        
           [ALERT_UUID]: alertUuid, 
        
           [ALERT_STATUS]: isRecovered ? ALERT_STATUS_RECOVERED : ALERT_STATUS_ACTIVE, 
        
           [ALERT_WORKFLOW_STATUS]: alertData?.fields[ALERT_WORKFLOW_STATUS] ?? 'open', 
        
           [EVENT_KIND]: 'signal', 
        
           [EVENT_ACTION]: isNew ? 'open' : isActive ? 'active' : 'close', 
        
           [VERSION]: ruleDataClient.kibanaVersion, 
        
           ...(isRecovered ? { [ALERT_END]: commonRuleFields[TIMESTAMP] } : {}),

Update: I guess we have to get the _source of the alert too in order to keep them in sync. For example, if someone updates an the name of the alert, we would need to update that too since it changed.

mgiota · 2021-11-03T08:57:51Z

@simianhacker A bit of a history regarding this ticket. @afgomez wanted to store the rule params that triggered the alert and he explored storing it as an object but that caused lots of headaches when querying the field using the ES fields API.

In the end he used an alternative for the problem he was trying to solve. He stored the params as a keyword with index: false that contains the serialized params object. More information #113429.

So I am wondering do we actually need to fix this issue, as we used an alternative solution in the end? @jasonrhodes

weltenwort · 2021-11-03T12:02:36Z

@simianhacker for new documents there won't be predecessor documents whose fields will be flattened incorrectly due to the usage of the fields API. But in the case of updating existing documents the fields API will cause problems, so I'd agree that we should probably use _source instead:

kibana/x-pack/plugins/rule_registry/server/utils/create_lifecycle_executor.ts

Lines 219 to 220 in 5c73c0c

    
           _source: false, 
        
           fields: [{ field: '*', include_unmapped: true }],

afgomez added the Feature:RAC label obsolete label Sep 23, 2021

botelastic bot added the needs-team Issues missing a team label label Sep 23, 2021

afgomez added bug Fixes for quality problems that affect the customer experience and removed needs-team Issues missing a team label labels Sep 23, 2021

botelastic bot added the needs-team Issues missing a team label label Sep 23, 2021

mgiota mentioned this issue Sep 23, 2021

[Logs UI] Register log threshold rule as lifecycle rule #104341

Merged

jasonrhodes added Team:Infra Monitoring UI - DEPRECATED DEPRECATED - Label for the Infra Monitoring UI team. Use Team:obs-ux-infra_services Theme: rac label obsolete labels Sep 27, 2021

botelastic bot removed the needs-team Issues missing a team label label Sep 27, 2021

jasonrhodes added needs-team Issues missing a team label v7.16.0 labels Sep 27, 2021

botelastic bot removed the needs-team Issues missing a team label label Sep 27, 2021

jasonrhodes changed the title ~~[RAC] Potential problem when updating an alerting document~~ [RAC] Alert document updates should not spread results of a fields API query Sep 27, 2021

simianhacker self-assigned this Nov 2, 2021

simianhacker removed their assignment Nov 3, 2021

paulb-elastic assigned simianhacker Nov 8, 2021

simianhacker mentioned this issue Nov 10, 2021

[Rule Registry] Switch to _source for updating documents instead of Fields API #118245

Merged

1 task

simianhacker closed this as completed in #118245 Dec 9, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[RAC] Alert document updates should not spread results of a fields API query #113003

[RAC] Alert document updates should not spread results of a fields API query #113003

afgomez commented Sep 23, 2021 •

edited by jasonrhodes

Loading

marshallmain commented Sep 23, 2021

marshallmain commented Sep 23, 2021

timroes commented Sep 27, 2021 •

edited

Loading

afgomez commented Sep 27, 2021

jasonrhodes commented Sep 27, 2021

jasonrhodes commented Sep 27, 2021 •

edited

Loading

elasticmachine commented Sep 27, 2021

jasonrhodes commented Sep 27, 2021

marshallmain commented Sep 27, 2021

afgomez commented Sep 29, 2021

weltenwort commented Oct 4, 2021 •

edited

Loading

jasonrhodes commented Oct 6, 2021

afgomez commented Oct 11, 2021

simianhacker commented Nov 2, 2021 •

edited

Loading

mgiota commented Nov 3, 2021 •

edited

Loading

weltenwort commented Nov 3, 2021

[RAC] Alert document updates should not spread results of a fields API query #113003

[RAC] Alert document updates should not spread results of a fields API query #113003

Comments

afgomez commented Sep 23, 2021 • edited by jasonrhodes Loading

Summary

Plan for fixing this bug

AC:

marshallmain commented Sep 23, 2021

marshallmain commented Sep 23, 2021

timroes commented Sep 27, 2021 • edited Loading

afgomez commented Sep 27, 2021

jasonrhodes commented Sep 27, 2021

jasonrhodes commented Sep 27, 2021 • edited Loading

elasticmachine commented Sep 27, 2021

jasonrhodes commented Sep 27, 2021

marshallmain commented Sep 27, 2021

afgomez commented Sep 29, 2021

weltenwort commented Oct 4, 2021 • edited Loading

jasonrhodes commented Oct 6, 2021

afgomez commented Oct 11, 2021

simianhacker commented Nov 2, 2021 • edited Loading

mgiota commented Nov 3, 2021 • edited Loading

weltenwort commented Nov 3, 2021

afgomez commented Sep 23, 2021 •

edited by jasonrhodes

Loading

timroes commented Sep 27, 2021 •

edited

Loading

jasonrhodes commented Sep 27, 2021 •

edited

Loading

weltenwort commented Oct 4, 2021 •

edited

Loading

simianhacker commented Nov 2, 2021 •

edited

Loading

mgiota commented Nov 3, 2021 •

edited

Loading