-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Mango queries should not use other index when use_index is specified #4511
Comments
The in-progress jobs query for the export feature relies on the presence of a very specific index to be fast on cozies with a lot of `io.cozy.jobs` documents. To make sure the index presence detection goes well, we forced a sort on the `worker` attribute. However, since `cozy-stack` uses an index on `worker` and `state` only, CouchDB fallbacks to this index instead of detecting the index we want does not exist and runs the slow `$or` query in-memory. See apache/couchdb#4511 for more details on this issue. To avoid this fallback we will sort on a second attribute not present in the existing `cozy-stack` index: `message.slug`.
The in-progress jobs query for the export feature relies on the presence of a very specific index to be fast on cozies with a lot of `io.cozy.jobs` documents. To make sure the index presence detection goes well, we forced a sort on the `worker` attribute. However, since `cozy-stack` uses an index on `worker` and `state` only, CouchDB fallbacks to this index instead of detecting the index we want does not exist and runs the slow `$or` query in-memory. See apache/couchdb#4511 for more details on this issue. To prevent this fallback we will sort on a second attribute not present in the existing `cozy-stack` index: `message.slug`.
I would like to work on this ticket. Committers, please assign it to me. |
The in-progress jobs query for the export feature relies on the presence of a very specific index to be fast on cozies with a lot of `io.cozy.jobs` documents. To make sure the index presence detection goes well, we forced a sort on the `worker` attribute. However, since `cozy-stack` uses an index on `worker` and `state` only, CouchDB fallbacks to this index instead of detecting the index we want does not exist and runs the slow `$or` query in-memory. See apache/couchdb#4511 for more details on this issue. To prevent this fallback we will sort on a second attribute not present in the existing `cozy-stack` index: `message.slug`.
The in-progress jobs query for the export feature relies on the presence of a very specific index to be fast on cozies with a lot of `io.cozy.jobs` documents. To make sure the index presence detection goes well, we forced a sort on the `worker` attribute. However, since `cozy-stack` uses an index on `worker` and `state` only, CouchDB fallbacks to this index instead of detecting the index we want does not exist and runs the slow `$or` query in-memory. See apache/couchdb#4511 for more details on this issue. To prevent this fallback we will sort on a second attribute not present in the existing `cozy-stack` index: `message.slug`. ``` ### 🐛 Bug Fixes * Prevent fallback on sub-optimal index for in-progress jobs query. ```
The in-progress jobs query for the export feature relies on the presence of a very specific index to be fast on cozies with a lot of `io.cozy.jobs` documents. To make sure the index presence detection goes well, we forced a sort on the `worker` attribute. However, since `cozy-stack` uses an index on `worker` and `state` only, CouchDB fallbacks to this index instead of detecting the index we want does not exist and runs the slow `$or` query in-memory. See apache/couchdb#4511 for more details on this issue. To prevent this fallback we will sort on a second attribute not present in the existing `cozy-stack` index: `message.slug`.
Backport of #2644 The in-progress jobs query for the export feature relies on the presence of a very specific index to be fast on cozies with a lot of `io.cozy.jobs` documents. To make sure the index presence detection goes well, we forced a sort on the `worker` attribute. However, since `cozy-stack` uses an index on `worker` and `state` only, CouchDB fallbacks to this index instead of detecting the index we want does not exist and runs the slow `$or` query in-memory. See apache/couchdb#4511 for more details on this issue. To prevent this fallback we will sort on a second attribute not present in the existing `cozy-stack` index: `message.slug`. ``` ### 🐛 Bug Fixes * Prevent fallback on sub-optimal index for in-progress jobs query. ```
The in-progress jobs query for the export feature relies on the presence of a very specific index to be fast on cozies with a lot of `io.cozy.jobs` documents. To make sure the index presence detection goes well, we forced a sort on the `worker` attribute. However, since `cozy-stack` uses an index on `worker` and `state` only, CouchDB fallbacks to this index instead of detecting the index we want does not exist and runs the slow `$or` query in-memory. See apache/couchdb#4511 for more details on this issue. To prevent this fallback we will sort on a second attribute not present in the existing `cozy-stack` index: `message.slug`.
Backport of #2644 The in-progress jobs query for the export feature relies on the presence of a very specific index to be fast on cozies with a lot of `io.cozy.jobs` documents. To make sure the index presence detection goes well, we forced a sort on the `worker` attribute. However, since `cozy-stack` uses an index on `worker` and `state` only, CouchDB fallbacks to this index instead of detecting the index we want does not exist and runs the slow `$or` query in-memory. See apache/couchdb#4511 for more details on this issue. To prevent this fallback we will sort on a second attribute not present in the existing `cozy-stack` index: `message.slug`. ``` ### 🐛 Bug Fixes * Prevent fallback on sub-optimal index for in-progress jobs query. ```
Hi @paultranvan, I have studied the source code and it seems this behavior is intended (see #962). However, I see what you want to achieve here and why this behavior causes a problem. On the side of CouchDB, a possible solution would be to adjust the API so that the desired semantics could be requested. That is, by passing, say But I think the whole issue could be managed on the client side. Why not to store the list of known indexes there and use it to decide on each query if an index has been in use? This might be even faster as you would not have to look out for errors constantly but get the desired result by a single query. When needed, the list of known indexes could be initialized / synced from a CouchDB |
Hi @pgj, thank you for your answer. Yes it could be handled by the client, but it is quite cumbersome because you have to deal with this list state and update it when there are add/edit/remove on indexes. Typically, if an index is not in the list, you don't know if it's because it does not exist yet or simply because the list is not up-to-date, which could happen if there are several clients. That's why I prefer the idea of adjusting the API e.g. with a |
I like the idea of an optional behaviour to require strict index usage, and probably should have a toggle for the default behaviour in |
I proposed the API option because that can be controlled per query not per deployment. My impression is that users are mostly fine with the original behavior but there are use cases (like the one featured in this issue) where "more strictness" would be preferred. Nevertheless I can be fine with controlling the default from I fully accept the criticism about overloading |
It is not always beneficial for the performance if the Mango query planner tries to assign an index to the selector. User-specified indexes may save the day, but since they are only hints for the planner, automated overrides can still happen. Introduce the concept of "strict index selection" which lets the user to request the exclusive use of a specific index. When it is not possible, give up on planning and return an HTTP 400 response right away. This way the user has the chance to learn about the missing index, request its creation and try again later. The feature comes with a configuration toggle. By default, the feature is disabled to maintain backward compatibility, but the user may ask for this behavior via the new `use_index_strict` query parameter for a specific query. Note that, similarly, `use_index_strict` could be used to disable strict index selection temporarily when it is enabled on the server. Fixes apache#4511
It is not always beneficial for the performance if the Mango query planner tries to assign an index to the selector. User-specified indexes may save the day, but since they are only hints for the planner, automated overrides can still happen. Introduce the concept of "strict index selection" which lets the user to request the exclusive use of a specific index. When it is not possible, give up on planning and return an HTTP 400 response right away. This way the user has the chance to learn about the missing index, request its creation and try again later. The feature comes with a configuration toggle. By default, the feature is disabled to maintain backward compatibility, but the user may ask for this behavior via the new `use_index_strict` query parameter for a specific query. Note that, similarly, `use_index_strict` could be used to disable strict index selection temporarily when it is enabled on the server. Fixes apache#4511
It is not always beneficial for the performance if the Mango query planner tries to assign an index to the selector. User-specified indexes may save the day, but since they are only hints for the planner, automated overrides can still happen. Introduce the concept of "strict index selection" which lets the user to request the exclusive use of a specific index. When it is not possible, give up on planning and return an HTTP 400 response right away. This way the user has the chance to learn about the missing index, request its creation and try again later. The feature comes with a configuration toggle. By default, the feature is disabled to maintain backward compatibility, but the user may ask for this behavior via the new `use_index_strict` query parameter for a specific query. Note that, similarly, `use_index_strict` could be used to disable strict index selection temporarily when it is enabled on the server. Fixes apache#4511
It is not always beneficial for the performance if the Mango query planner tries to assign an index to the selector. User-specified indexes may save the day, but since they are only hints for the planner, fallbacks may still happen. Introduce the `allow_fallback` flag which can be used to tell if falling back to other indexes is acceptable when an index is explicitly specified by the user. When set to `false`, give up on planning and return an HTTP 400 response right away. This way the user has the chance to learn about the requested but missing index, optionally create it and try again. By default, fallbacks are allowed to maintain backwards compatibility. It is possible to set `allow_fallback` to `true` but currently it coincides with the default behavior hence becomes a no-op in practice. Fixes apache#4511
It is not always beneficial for the performance if the Mango query planner tries to assign an index to the selector. User-specified indexes may save the day, but since they are only hints for the planner, fallbacks may still happen. Introduce the `allow_fallback` flag which can be used to tell if falling back to other indexes is acceptable when an index is explicitly specified by the user. When set to `false`, give up on planning and return an HTTP 400 response right away. This way the user has the chance to learn about the requested but missing index, optionally create it and try again. By default, fallbacks are allowed to maintain backwards compatibility. It is possible to set `allow_fallback` to `true` but currently it coincides with the default behavior hence becomes a no-op in practice. Fixes apache#4511
It is not always beneficial for the performance if the Mango query planner tries to assign an index to the selector. User-specified indexes may save the day, but since they are only hints for the planner, fallbacks may still happen. Introduce the `allow_fallback` flag which can be used to tell if falling back to other indexes is acceptable when an index is explicitly specified by the user. When set to `false`, give up on planning and return an HTTP 400 response right away. This way the user has the chance to learn about the requested but missing index, optionally create it and try again. By default, fallbacks are allowed to maintain backwards compatibility. It is possible to set `allow_fallback` to `true` but currently it coincides with the default behavior hence becomes a no-op in practice. Fixes apache#4511
It is not always beneficial for the performance if the Mango query planner tries to assign an index to the selector. User-specified indexes may save the day, but since they are only hints for the planner, fallbacks may still happen. Introduce the `allow_fallback` flag which can be used to tell if falling back to other indexes is acceptable when an index is explicitly specified by the user. When set to `false`, give up on planning and return an HTTP 400 response right away. This way the user has the chance to learn about the requested but missing index, optionally create it and try again. By default, fallbacks are allowed to maintain backwards compatibility. It is possible to set `allow_fallback` to `true` but currently it coincides with the default behavior hence becomes a no-op in practice. Fixes apache#4511
Thanks a lot @pgj ! |
Description
When a mango query includes a
use_index
option and the specified index does not exist, CouchDB might fallback on an existing index, which can cause unexpected behavior.Steps to Reproduce
Consider the following mango query:
Now suppose the
_design/by-worker
index does not exist, but there is an existing one on the same fields:The
_design/by-worker-and-state
index will actually be used. However, this can be problematic if the database is quite huge: the$or
will be evaluated in-memory, which could result in timeouts as the whole database will be scanned.In our client logic, we first run the mango query and expect an error, if the index does not exist. Then, we create the index, and run the query again. This avoids to check if the index exists for every mango query, as it will be the case most of the time.
See the logic here.
In our example, if the index does not exist, we would create one with a partial filter:
This avoids to evaluate the
$or
at query time, and potential timeouts. But because of the existing index being used, we never have the opportunity to create the actual and efficient index.Expected Behaviour
We expect any mango query including a
use_index
to immediately fails if the index does not exist. This seems reasonable to force the use of a particular index without any fallback, as it might lead to sub-optimal queries.Your Environment
Additional Context
This issue can be convoluted by specifying a sort on fields that are not used on the existing index, but this is not really a solution for us.
The text was updated successfully, but these errors were encountered: