Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Files plugin - swap out "get by ID" calls to make it compatible with index aliases #153334

Closed
hop-dev opened this issue Mar 20, 2023 · 8 comments · Fixed by #153342
Closed

Files plugin - swap out "get by ID" calls to make it compatible with index aliases #153334

hop-dev opened this issue Mar 20, 2023 · 8 comments · Fixed by #153342
Assignees
Labels
Team:Fleet Team label for Observability Data Collection Fleet team Team:SharedUX Team label for AppEx-SharedUX (formerly Global Experience)

Comments

@hop-dev
Copy link
Contributor

hop-dev commented Mar 20, 2023

Over on Fleet we have used the file plugin for upload and download of agent diagnostic files, we use ILM to manage the file data and metadata indices, and therefore have to use an index alias infront of the backing indices. The problem we have encountered is that after a rollover, the index alias points to multiple indices, which means get doc by ID calls fail with:

Root causes:
		illegal_argument_exception: alias [.fleet-files-endpoint] has more than one index associated with it [.fleet-files-endpoint-000001, .fleet-files-endpoint-000002], can't execute a single index op

This is because get by ID is a single index op.

We want to understand how big a change and how much risk there would be in moving to using _search to find by ID instead of the get by ID API. This is currently blocking 8.7 so we need to know if there is a realistic way forward.

Here is the fleet bug we raised #153322

@hop-dev hop-dev added Team:Fleet Team label for Observability Data Collection Fleet team Team:SharedUX Team label for AppEx-SharedUX (formerly Global Experience) labels Mar 20, 2023
@elasticmachine
Copy link
Contributor

Pinging @elastic/fleet (Team:Fleet)

@elasticmachine
Copy link
Contributor

Pinging @elastic/appex-sharedux (Team:SharedUX)

@vadimkibana
Copy link
Contributor

vadimkibana commented Mar 21, 2023

From what I understand, this line of code is reading file chunks by get ID, but since you are not pointing the read stream to an index, but an alias, which points to multiple indices, it fails.

It seems like a quick and dirty fix could be to replace that line with _search by ID.

Retrieving using get by ID is more efficient. So, maybe a better solution would be to leave the get by ID logic by default, and add the ability to fetch by _search if somebody opts-in. For example, by specifying a flag to the ContentStream constructor:

new ContentStream({
  indexIsAlias: true, // or fetchBySearch: true
});

@hop-dev
Copy link
Contributor Author

hop-dev commented Mar 21, 2023

I think as _search will work for both index and alias, and the performance impact shouldn't be noticeable, I would lean towards having one code path instead of the flag to save some complexity. But of course I am happy to take your direction.

I am on call at the minute so need to tend to a couple of SDHs, I'm then happy to look at getting a PR together.

@vadimkibana
Copy link
Contributor

vadimkibana commented Mar 21, 2023

[..] the performance impact shouldn't be noticeable [..]

It might be noticeable. Found this answer on the Wiki:

image

From there:

  • get by ID is automatically routed to the correct shard, whereas search request is fanned out to all shards.
  • The _search query needs to be rewritten as a bool query and search mechanics need to be used, instead of just fetching a document by its ID.
  • Using get the latest document is returned, whereas, when using _search, it needs to go through the Lucene search index, and where the document will only be available after a "refresh", which is 1s by default. And I believe, Elasticsearch Serverless was aiming for 5s refresh interval. (If that is true, files downloaded with get will be immediately available, but if "refresh" interval is 5s and _search is used—the already uploaded files will only be possible to download after 5s).

@kpollich
Copy link
Member

I'm +1 for implementing this via an opt-in flag to use the _search API and leaving the default behavior as fetching via get. This seems like the least breaking way to handle this, and allows us to introduce the new behavior in only the code paths experiencing issues (Fleet + Security Solution UI file uploads).

Let's make sure we include a comment regarding the new option that explains a consumer may want to use the _search behavior if they're reading from an alias.

@paul-tavares
Copy link
Contributor

Hi all,
I'll start to look at implementing the suggested approach mentioned above by adding a new argument to the exposed service that will make the internals of the service use _search instead. I'm still not clear on the extent of the changes, but I'll dig in now

@paul-tavares paul-tavares self-assigned this Mar 21, 2023
@paul-tavares
Copy link
Contributor

Make the initial set of changes here:

#153342

It gets the Security Solution use case working again and I think it also addresses Fleet's use case - still testing. If someone can take a look at the changes thus far, it would be much appreciated

paul-tavares added a commit that referenced this issue Mar 21, 2023
…dle index alias and fix Endpoint/Fleet usage to set new option to true (#153342)

## Summary

- Adds `indexIsAlias` to `Files` plugin client. Used when provided
indexes are Aliases (changes how the documents are retrieved internally)
- Changes security solution (endpoint) file service to use `.search()`
instead of `.get()` when retrieving a file metadata via `id`
- Changed Security Solution call to `createEsFileClient()` (`Files`
plugin service) to set `indexIsAlias` to `true`
- Changed Fleet call to `createEsFileClient()` (`Files` plugin service)
to set `indexIsAlias` to `true`


Addresses the following Issues that were raised for 8.7:

- Fixes #153322 
- FIxes #153334
paul-tavares added a commit to paul-tavares/kibana that referenced this issue Mar 21, 2023
…dle index alias and fix Endpoint/Fleet usage to set new option to true (elastic#153342)

## Summary

- Adds `indexIsAlias` to `Files` plugin client. Used when provided
indexes are Aliases (changes how the documents are retrieved internally)
- Changes security solution (endpoint) file service to use `.search()`
instead of `.get()` when retrieving a file metadata via `id`
- Changed Security Solution call to `createEsFileClient()` (`Files`
plugin service) to set `indexIsAlias` to `true`
- Changed Fleet call to `createEsFileClient()` (`Files` plugin service)
to set `indexIsAlias` to `true`

Addresses the following Issues that were raised for 8.7:

- Fixes elastic#153322
- FIxes elastic#153334

(cherry picked from commit 50cc574)

# Conflicts:
#	src/plugins/files/server/blob_storage_service/adapters/es/content_stream/content_stream.ts
#	src/plugins/files/server/file_client/create_es_file_client.ts
paul-tavares added a commit that referenced this issue Mar 22, 2023
…to handle index alias and fix Endpoint/Fleet usage to set new option to true (#153342) (#153403)

# Backport

This will backport the following commits from `main` to `8.7`:
- [[Security Solution + Files + Fleet] Add option to Files client to
handle index alias and fix Endpoint/Fleet usage to set new option to
true (#153342)](#153342)

<!--- Backport version: 8.9.7 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)

<!--BACKPORT [{"author":{"name":"Paul
Tavares","email":"[email protected]"},"sourceCommit":{"committedDate":"2023-03-21T20:40:39Z","message":"[Security
Solution + Files + Fleet] Add option to Files client to handle index
alias and fix Endpoint/Fleet usage to set new option to true
(#153342)\n\n## Summary\r\n\r\n- Adds `indexIsAlias` to `Files` plugin
client. Used when provided\r\nindexes are Aliases (changes how the
documents are retrieved internally)\r\n- Changes security solution
(endpoint) file service to use `.search()`\r\ninstead of `.get()` when
retrieving a file metadata via `id`\r\n- Changed Security Solution call
to `createEsFileClient()` (`Files`\r\nplugin service) to set
`indexIsAlias` to `true`\r\n- Changed Fleet call to
`createEsFileClient()` (`Files` plugin service)\r\nto set `indexIsAlias`
to `true`\r\n\r\n\r\nAddresses the following Issues that were raised for
8.7:\r\n\r\n- Fixes #153322 \r\n- FIxes
#153334","sha":"50cc574c63a3837ce255878ee483b9cbc75d8277","branchLabelMapping":{"^v8.8.0$":"main","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["release_note:skip","Team:Fleet","Team:Defend
Workflows","v8.7.0","v8.8.0"],"number":153342,"url":"https://github.com/elastic/kibana/pull/153342","mergeCommit":{"message":"[Security
Solution + Files + Fleet] Add option to Files client to handle index
alias and fix Endpoint/Fleet usage to set new option to true
(#153342)\n\n## Summary\r\n\r\n- Adds `indexIsAlias` to `Files` plugin
client. Used when provided\r\nindexes are Aliases (changes how the
documents are retrieved internally)\r\n- Changes security solution
(endpoint) file service to use `.search()`\r\ninstead of `.get()` when
retrieving a file metadata via `id`\r\n- Changed Security Solution call
to `createEsFileClient()` (`Files`\r\nplugin service) to set
`indexIsAlias` to `true`\r\n- Changed Fleet call to
`createEsFileClient()` (`Files` plugin service)\r\nto set `indexIsAlias`
to `true`\r\n\r\n\r\nAddresses the following Issues that were raised for
8.7:\r\n\r\n- Fixes #153322 \r\n- FIxes
#153334","sha":"50cc574c63a3837ce255878ee483b9cbc75d8277"}},"sourceBranch":"main","suggestedTargetBranches":["8.7"],"targetPullRequestStates":[{"branch":"8.7","label":"v8.7.0","labelRegex":"^v(\\d+).(\\d+).\\d+$","isSourceBranch":false,"state":"NOT_CREATED"},{"branch":"main","label":"v8.8.0","labelRegex":"^v8.8.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/153342","number":153342,"mergeCommit":{"message":"[Security
Solution + Files + Fleet] Add option to Files client to handle index
alias and fix Endpoint/Fleet usage to set new option to true
(#153342)\n\n## Summary\r\n\r\n- Adds `indexIsAlias` to `Files` plugin
client. Used when provided\r\nindexes are Aliases (changes how the
documents are retrieved internally)\r\n- Changes security solution
(endpoint) file service to use `.search()`\r\ninstead of `.get()` when
retrieving a file metadata via `id`\r\n- Changed Security Solution call
to `createEsFileClient()` (`Files`\r\nplugin service) to set
`indexIsAlias` to `true`\r\n- Changed Fleet call to
`createEsFileClient()` (`Files` plugin service)\r\nto set `indexIsAlias`
to `true`\r\n\r\n\r\nAddresses the following Issues that were raised for
8.7:\r\n\r\n- Fixes #153322 \r\n- FIxes
#153334","sha":"50cc574c63a3837ce255878ee483b9cbc75d8277"}}]}]
BACKPORT-->
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Team:Fleet Team label for Observability Data Collection Fleet team Team:SharedUX Team label for AppEx-SharedUX (formerly Global Experience)
Projects
None yet
5 participants