Change compound index to improve query performance #5568

Merged

khushboobhatia01
Contributor

We've seen an increase in MongoDB memory usage whenever we fetch executions with filters.

  • Query:
    db.action_execution_d_b.find({"action.ref": "sre.fleet_execution", "start_timestamp": {"$gt" : NumberLong("1638316000000000"), "$lt": NumberLong("1644192000000000")},"status" : "succeeded"}, {"context" : 1,"parameters" : 1,"action" : 1,"start_timestamp" : 1,"status" : 1,"runner.runner_parameters" : 1,"_id" : 1,"end_timestamp" : 1}).sort({"start_timestamp" : -1,"action.ref" : 1})

  • With the above query, we would expect the compound index start_timestamp_-1_action.ref_1_status_1 to be used, but the query plan shows that the action.ref_1 index is used instead.
    old-query-plan.txt

  • Execution stats from the old query plan

"executionSuccess" : true,
"nReturned" : 472,
"executionTimeMillis" : 51,
"totalKeysExamined" : 5074,
"totalDocsExamined" : 5074,

totalDocsExamined is the number of documents examined, and we want this number to be low. More importantly, we want to look at the ratio of nReturned to totalDocsExamined. Together these numbers help answer “how much work is MongoDB doing to return useful data?”. Here the “hit ratio” is 472 / 5074, or ~9.3%. If the cluster examines a high number of documents relative to those it returns, we're likely to see a few things happen:

  1. longer query times overall
  2. more utilisation of the cluster's CPU & memory resources
  3. choppier cache residency and eviction
  4. locked & blocking queries under load
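The hit ratio described above can be computed directly from the execution stats. This is a minimal sketch in plain JavaScript; the `hitRatio` helper name is hypothetical, and the input object carries only the fields quoted in this PR.

```javascript
// Hypothetical helper: fraction of examined documents that were actually returned.
function hitRatio(stats) {
  if (stats.totalDocsExamined === 0) {
    // No documents were examined, so the ratio is undefined.
    return null;
  }
  return stats.nReturned / stats.totalDocsExamined;
}

// Execution stats from the old query plan quoted above.
const oldStats = { nReturned: 472, totalKeysExamined: 5074, totalDocsExamined: 5074 };

console.log((hitRatio(oldStats) * 100).toFixed(1) + "%"); // 9.3%
```

In mongosh, these fields are found under `executionStats` in the result of `db.action_execution_d_b.find(...).explain("executionStats")`.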
  • Why didn't MongoDB use the compound index? The query plan shows that the plan using the compound index was rejected because it was very slow and hadn't returned any documents by the time the winning plan finished.
    MongoDB recommends following the ESR rule when creating a compound index (Ref https://www.mongodb.com/blog/post/performance-best-practices-indexing).
    For compound indexes, this rule of thumb is helpful in deciding the order of fields in the index:

    1. First, add those fields against which Equality queries are run.
    2. The next fields to be indexed should reflect the Sort order of the query.
    3. The last fields represent the Range of data to be accessed.
  • Given the above rule, a much more efficient compound index is {"action.ref": 1,"status": 1, "start_timestamp": -1}. We want the first field to have high cardinality (so action.ref is preferred over status), and start_timestamp is mostly used to access a range of data.
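To illustrate how the ESR rule leads to this key order, here is a rough sketch in plain JavaScript. The `buildEsrIndex` helper is hypothetical (it is not part of MongoDB or StackStorm); it simply places equality fields first, then sort fields, then range fields, and skips any field that was already placed. The order of the equality fields is left to the caller, who would put the higher-cardinality field first.

```javascript
// Hypothetical helper: order index keys as Equality, then Sort, then Range (ESR).
// A field that appears in more than one role is placed once, at its earliest position.
function buildEsrIndex({ equality, sort, range }) {
  const index = new Map();
  // Equality fields first; direction is irrelevant for equality matches, so use 1.
  for (const field of equality) {
    if (!index.has(field)) index.set(field, 1);
  }
  // Sort fields next, keeping the direction requested by the query.
  for (const [field, direction] of sort) {
    if (!index.has(field)) index.set(field, direction);
  }
  // Range fields last, unless an earlier role already placed them.
  for (const field of range) {
    if (!index.has(field)) index.set(field, 1);
  }
  return Object.fromEntries(index);
}

// Field roles taken from the query in this PR.
const keys = buildEsrIndex({
  equality: ["action.ref", "status"],
  sort: [["start_timestamp", -1], ["action.ref", 1]],
  range: ["start_timestamp"],
});

console.log(JSON.stringify(keys));
// {"action.ref":1,"status":1,"start_timestamp":-1}
```

When experimenting in mongosh, the resulting object could be passed to `db.action_execution_d_b.createIndex(keys)`.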

  • After creating the above index and running the same query, the new compound index is used, the document hit ratio is now 100%, and the query execution time drops by about 50% (51 ms to 25 ms).
    new-query-plan.txt

"executionSuccess" : true,
"nReturned" : 472,
"executionTimeMillis" : 25,
"totalKeysExamined" : 472,
"totalDocsExamined" : 472,
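To check which index the planner picked without reading the whole plan by hand, one option is to walk the plan tree and collect the index names of its IXSCAN stages. The sketch below is plain JavaScript; `indexesUsed` is a hypothetical helper, the sample plan is a trimmed illustration rather than a copy of the attached plan files, and the index name shown assumes MongoDB's default naming. The exact explain shape can vary between server versions.

```javascript
// Hypothetical helper: collect the index names used by IXSCAN stages in a plan tree.
function indexesUsed(plan) {
  if (!plan) return [];
  const names = plan.stage === "IXSCAN" && plan.indexName ? [plan.indexName] : [];
  // Child stages are nested under inputStage (single) or inputStages (array).
  const children = [plan.inputStage, ...(plan.inputStages || [])];
  for (const child of children) {
    names.push(...indexesUsed(child));
  }
  return names;
}

// Illustrative, trimmed plan shape (assumption, not taken from new-query-plan.txt).
const winningPlan = {
  stage: "FETCH",
  inputStage: {
    stage: "IXSCAN",
    indexName: "action.ref_1_status_1_start_timestamp_-1",
  },
};

console.log(indexesUsed(winningPlan).join(", "));
```

In an explain result, the tree to pass in is typically `queryPlanner.winningPlan`; the same helper can be applied to each entry of `queryPlanner.rejectedPlans` to see which indexes lost.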

  • This issue might not be prominent where the number of executions is low or the execution documents are small, because not much data has to be fetched.

@pull-request-size pull-request-size bot added the size/XS PR that changes 0-9 lines. Quick fix/merge. label Feb 8, 2022
@cognifloyd
Member

Can you add a changelog entry for this as well?

@pull-request-size pull-request-size bot added size/S PR that changes 10-29 lines. Very easy to review. and removed size/XS PR that changes 0-9 lines. Quick fix/merge. labels Feb 9, 2022
@cognifloyd
Member

Will this need any kind of migration when upgrading between ST2 versions? (eg to recreate the index?)

@cognifloyd cognifloyd added this to the 3.7.0 milestone Feb 9, 2022
@khushboobhatia01
Contributor Author

> Will this need any kind of migration when upgrading between ST2 versions? (eg to recreate the index?)

No, that won't be needed.
