Change compound index to improve query performance #5568
Merged
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
We've seen increase in mongoDB memory usage whenever we try to fetch executions with filters.
Query:
db.action_execution_d_b.find({"action.ref": "sre.fleet_execution", "start_timestamp": {"$gt" : NumberLong("1638316000000000"), "$lt": NumberLong("1644192000000000")},"status" : "succeeded"}, {"context" : 1,"parameters" : 1,"action" : 1,"start_timestamp" : 1,"status" : 1,"runner.runner_parameters" : 1,"_id" : 1,"end_timestamp" : 1}).sort({"start_timestamp" : -1,"action.ref" : 1})
With above query it's expected that the compound index start_timestamp_-1_action.ref_1_status_1 should be used, but the query plan shows that action.ref_1 index is being used.
old-query-plan.txt
Execution stats from the old query plan
totalDocsExamined is how many documents were examined and we want this number to be low. Even more importantly, we want to look at the ratio of TotalDocsExamined and nReturned. These numbers together helps determine “how much work is MongoDB doing to return me useful data?”. We can see here that our “hit ratio” is 472 / 5074 or ~9.3%. If our cluster is examining a high number of docs with respect to those that it is returning, we're likely to see a few things happen:
Why didn't mongoDB use compound index? From the query plan we can see that the query which utilises compound index was rejected because it was very slow and didn't return any docs by the time the winning query finished.
MongoDB recommends to follow ESR rule when creating a compound index (Ref https://www.mongodb.com/blog/post/performance-best-practices-indexing)
For compound indexes, this rule of thumb is helpful in deciding the order of fields in the index:
Given the above rule a much more efficient compound index will be {"action.ref": 1,"status": 1, "start_timestamp": -1}. We want the first field to have the high cardinality (prefer action.ref over status) and start_timestamp is mostly used to access range of data.
After creating the above index and making the same query, we see that new compound index is being used and document hit ratio is 100% now also the query execution time is ↓ by 50%.
new-query-plan.txt