-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Mango: generate a warning instead of an error when an user-specified index cannot be used #962
Conversation
src/mango/src/mango_cursor.erl
Outdated
maybe_add_warning(UserFun, #idx{type = IndexType}, UserAcc) -> | ||
case IndexType of | ||
maybe_add_warning(UserFun, #cursor{index = Index, opts = Opts}, UserAcc) -> | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this whitespace intended? I don't think I've seen it right after a function declaration before. Might want to check styling guidelines. It's slightly hard to read.
src/mango/src/mango_cursor.erl
Outdated
{use_index, [DesignId]} -> | ||
case filter_indexes([Index], DesignId) of | ||
[] -> | ||
[fmt("_design/~s was not used because it does not contain a valid index for this query.", [ddoc_name(DesignId)])]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this line is too long
src/mango/src/mango_cursor.erl
Outdated
{use_index, [DesignId, ViewName]} -> | ||
case filter_indexes([Index], DesignId, ViewName) of | ||
[] -> | ||
[fmt("_design/~s, ~s was not used because it is not a valid index for this query.", [ddoc_name(DesignId), ViewName])]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
super long line
|
||
ddoc_name(Name) -> | ||
Name. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing endline
src/mango/src/mango_error.erl
Outdated
400, | ||
<<"no_usable_index">>, | ||
<<"There are no indexes defined in this database.">> | ||
}; | ||
info(mango_idx, {no_usable_index, no_index_matching_name}) -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need to remove this error as well?
end, | ||
GlobalIndexes = mango_cursor:remove_indexes_with_partial_filter_selector(ExistingIndexes), | ||
UserSpecifiedIndex = mango_cursor:maybe_filter_indexes_by_ddoc(ExistingIndexes, Opts), | ||
UsableIndexes0 = lists:usort(GlobalIndexes ++ UserSpecifiedIndex), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are we sorting the indexes here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is to remove duplicates - an index might belong to GlobalIndexes and be specified by the user explicitly.
@@ -77,34 +73,78 @@ def test_uses_index_when_no_range_or_equals(self): | |||
resp_explain = self.db.find(selector, explain=True) | |||
self.assertEqual(resp_explain["index"]["type"], "json") | |||
|
|||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
whitespace
I need to think about this some more. Falling back on _all_docs is one thing, but I'm a little concerned about switching between indexes. Some initial thoughts:
|
@tonysun83 Regarding your concerns about the fallback behaviour:
|
Wanted to clarify a bit: For scenario 1): I agree that fixing the difference between text and json indexes is outside the scope of the PR, but I was thinking about the specific backwards-compatibility situation where a user specifies For scenario 2): Similarly to scenario 1), the issue is whether or not the fallback leads to inadvertent changes for the user. In this case, he is modifying his indexes based on perceived notion of what is being used. The performance degradation for _all_docs is noticed and he complains about the issue. We notify him saying that an index was deleted, but he says the index wasn't being used. Now he says ok he wants to rebuild an index...and that may take a long time and negatively affect his cluster. I wonder if it's better to explicitly send back index names as part an failure error message. But could that be a security risk? |
@@ -46,7 +46,7 @@ def test_use_most_columns(self): | |||
"name.last": "Something or other", | |||
"age": {"$gt": 1} | |||
}, explain=True) | |||
self.assertNotEqual(resp["index"]["ddoc"], "_design/" + ddocid) | |||
self.assertNotEqual(resp["index"]["ddoc"], ddocid) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this looks a random testcase fix not related to the PR?
raise AssertionError("did not reject bad use_index") | ||
|
||
resp = self.db.find(selector, use_index=[ddocid,name], return_raw=True) | ||
self.assertEqual(resp["warning"], "{0}, {1} was not used because it is not a valid index for this query.".format(ddocid, name)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in the test case above, you're checking that a value is returned
# should still return a correct result
for d in r["docs"]:
self.assertEqual(d["company"], "Pharmex")
We should be consistent?
self.assertEqual(resp_explain["index"]["ddoc"], ddocid_valid) | ||
|
||
resp = self.db.find(selector, sort=["foo", "bar"], use_index=ddocid_invalid, return_raw=True) | ||
self.assertEqual(resp["warning"], '{0} was not used because it does not contain a valid index for this query.'.format(ddocid_invalid)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above, we should get a value back here right? It's going to fall back on the ddoc_valid and return a result, so just confirm a result?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agree that we should check the result in the cases above. In this test, however, I don't expect any matches (and can assert that).
src/mango/src/mango_cursor.erl
Outdated
case filter_indexes([Index], DesignId) of | ||
[] -> | ||
[fmt("_design/~s was not used because it does not contain a valid index for this query.", | ||
[ddoc_name(DesignId)])]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should make the warning message a little bit more explicit here and add "we used index " instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I figured the user can use _explain to see which index was used
src/mango/src/mango_cursor.erl
Outdated
case filter_indexes([Index], DesignId, ViewName) of | ||
[] -> | ||
[fmt("_design/~s, ~s was not used because it is not a valid index for this query.", | ||
[ddoc_name(DesignId), ViewName])]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as the above comment
src/mango/src/mango_cursor.erl
Outdated
end, | ||
|
||
maybe_add_warning_int(UseIndexInvalid ++ NoIndex, UserFun, UserAcc). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really need to append two messages and then take the head of that list later? I feel like it's either going to be a
- "we used a diff index" warning or
- "no indexes found, we used all docs" warning.
Instead of returning empty lists, perhaps you could just move the NoIndex value as the return value for the case clauses. That way no need to add and then take the head.
Ex:
{use_index, []} ->
NoIndex;
src/mango/src/mango_cursor.erl
Outdated
|
||
maybe_add_warning_int(Warnings, UserFun, UserAcc) -> | ||
% only include the first warning |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see comment about about the warning list
+1 |
Mango will always fall back to the primary index so this check / error message is redundant.
If a user specifies a value for use_index that is not valid for the selector - i.e. it does not meet the coverage requirements of the selector or sort fields - attempt to fall back to a valid index (or database scan) rather than returning a 400 error. When a fallback occurs, populate the "warning" field in the response (as we already do when a full database scan takes place) with details of the fallback. This change is partially as mitigation for #816, which may lead to some previously valid indexes being deemed invalid, and also to make use_index less brittle in general. If an index that is used explicitly by active queries is removed, Couch will now generate warnings and there may be a performance impact, but the client will still get correct results.
Overview
If a user specifies a value for use_index that is not valid for the current query - i.e. it does not meet the coverage requirements of the selector or sort fields - attempt to fall back to a valid index (or database scan) rather than returning a 400 error to the client.
When a fallback occurs, populate the "warning" field in the response (as we already do when a full database scan takes place) to feed back to the user that
use_index
was ignored.This change is partially as mitigation for #816, which may lead to some previously valid indexes being deemed invalid, and also to make
use_index
less brittle for production applications in general.Testing recommendations
Run the test suite. Run some mango queries with and without
use_index
. Observe that a warning is returned in the JSON and in Fauxton.Checklist