[dbnode] Fix AggregateQuery limits #3112

Merged
merged 22 commits into master from arnikola/fix-limits
Jan 23, 2021

arnikola
Collaborator

No description provided.

Comment on lines 52 to 64
type AggregateResultsEntries []AggregateResultsEntry

// Size is the element size of the aggregated result entries.
func (e AggregateResultsEntries) Size() int {
// NB: add 1 to the entries length for each entry's field.
length := len(e)
for _, entry := range e {
length += len(entry.Terms)
}

return length
}

Collaborator Author

This'll be reworked to not be inside a generated file (also refactor will help getting around having to do the full length calculation per incoming field)
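For illustration, a minimal sketch of how the element size differs from the entry count. The `entry` and `entries` types are simplified stand-ins (plain strings rather than the real identifier type), which is an assumption made here and not the code in this PR:

```go
package main

import "fmt"

// entry is a hypothetical, simplified stand-in for AggregateResultsEntry.
type entry struct {
	Field string
	Terms []string
}

// entries mirrors AggregateResultsEntries for the purpose of this sketch.
type entries []entry

// Size counts one element per field plus one element per term.
func (e entries) Size() int {
	length := len(e)
	for _, en := range e {
		length += len(en.Terms)
	}
	return length
}

func main() {
	batch := entries{
		{Field: "city", Terms: []string{"nyc", "sf"}},
		{Field: "host", Terms: []string{"a", "b", "c"}},
	}
	// Two entries, but seven elements: 2 fields + 5 terms.
	fmt.Println(len(batch), batch.Size())
}
```

With these inputs `len(batch)` is 2 while `batch.Size()` is 7, which is why a limit based on `len(batch)` undercounts what is actually held.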

batch = AggregateResultsEntries(b.opts.AggregateResultsEntryArrayPool().Get())
maxBatch = cap(batch)
iterClosed = false // tracking whether we need to free the iterator at the end.
exhaustive = false
Collaborator Author

may not need this

Collaborator

yeah seems like opts.exhaustive(size, resultCount) accomplishes the same goal

// update recently queried docs to monitor memory.
if results.EnforceLimits() {
- if err := b.docsLimit.Inc(len(batch), source); err != nil {
+ if err := b.docsLimit.Inc(batch.Size(), source); err != nil {
Collaborator Author

Will have to make a decision on how docs limit is affected in aggregate results; this is not correct

@robskillington
Collaborator

Ah nice, this will make results fairly deterministic I assume? Can we add integration tests for this? I wonder if we also need limit client side for aggregating results across the nodes in m3/storage.go or not.

@@ -123,6 +123,7 @@ func (r *aggregatedResults) AggregateResultsOptions() AggregateResultsOptions {

func (r *aggregatedResults) AddFields(batch []AggregateResultsEntry) (int, int) {
r.Lock()
maxInsertions := r.aggregateOpts.SizeLimit - r.totalDocsCount
Collaborator

should we be subtracting totalDocsCount or r.resultsMap.size()?

Collaborator Author

Yeah going to do some refactoring/renaming around this to make it clearer what each limit is; totalDocsCount is not quite correctly calculated at the moment, so will need to make a few touchups around it

break
}

field, term := iter.Current()
batch = b.appendFieldAndTermToBatch(batch, field, term, iterateTerms)
- if len(batch) < batchSize {
+ if batch.Size() < maxBatch {
Collaborator

can we add a comment on this check?
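As a rough sketch of what the check is doing, the loop below keeps accumulating field/term pairs until the batch's element size (fields plus terms) reaches the cap, then flushes. The `appendPair` and `flushSizes` helpers and the simplified string-based types are hypothetical, introduced only for this example, and it assumes the iterator yields terms grouped by field:

```go
package main

import "fmt"

// entry and entries are hypothetical, simplified stand-ins for the real types.
type entry struct {
	Field string
	Terms []string
}

type entries []entry

// Size counts one element per field plus one element per term.
func (e entries) Size() int {
	length := len(e)
	for _, en := range e {
		length += len(en.Terms)
	}
	return length
}

// appendPair adds term under field, reusing the last entry when the field repeats.
func appendPair(batch entries, field, term string) entries {
	if n := len(batch); n > 0 && batch[n-1].Field == field {
		batch[n-1].Terms = append(batch[n-1].Terms, term)
		return batch
	}
	return append(batch, entry{Field: field, Terms: []string{term}})
}

// flushSizes returns the element size of every batch that would be flushed.
func flushSizes(pairs [][2]string, maxBatch int) []int {
	var (
		batch entries
		sizes []int
	)
	for _, p := range pairs {
		batch = appendPair(batch, p[0], p[1])
		// Keep accumulating while fields plus terms stay under the cap;
		// len(batch) alone would ignore how many terms each field carries.
		if batch.Size() < maxBatch {
			continue
		}
		sizes = append(sizes, batch.Size())
		batch = batch[:0]
	}
	// Flush whatever is left over after the iterator is exhausted.
	if len(batch) > 0 {
		sizes = append(sizes, batch.Size())
	}
	return sizes
}

func main() {
	pairs := [][2]string{
		{"city", "nyc"}, {"city", "sf"}, {"city", "la"},
		{"host", "a"}, {"host", "b"},
	}
	fmt.Println(flushSizes(pairs, 4))
}
```

Note that an entry-count check would see a single entry for all three `city` terms and never flush on them, which is the behaviour the size-based check avoids.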

@@ -714,14 +724,14 @@ func (b *block) aggregateWithSpan(
}

// Add last batch to results if remaining.
Collaborator

comment could be updated

})
valueInsertions++
} else {
// this value exceeds the limit, so should be released to the underling
Collaborator

underlying

docResults, ok := results.(index.DocumentResults)
if !ok { // should never happen
state.Lock()
err := fmt.Errorf("unknown results type [%T] received during wide query", results)
Collaborator

what makes this invocation a "wide query"?

Collaborator Author

👍 should be regular query

@@ -143,159 +140,32 @@ func (r *aggregatedResults) AddFields(batch []AggregateResultsEntry) (int, int)
valuesMap := aggValues.Map()
for _, t := range entry.Terms {
if !valuesMap.Contains(t) {
fmt.Println(maxInsertions, valueInsertions, t)
Collaborator

rm

@arnikola arnikola marked this pull request as ready for review January 22, 2021 22:07
Collaborator

@wesleyk wesleyk left a comment

LG with some questions!

valueInsertions := 0
defer r.Unlock()

maxDocs := int(math.MaxInt64)
Collaborator

is this more like, remaining docs we can load? remainingDocs?
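One way to read the variable as a remaining budget is sketched below. The `remainingDocs` helper, its parameters, and the convention that a limit of 0 means "no limit" are assumptions for illustration, not the code in this PR:

```go
package main

import (
	"fmt"
	"math"
)

// remainingDocs returns how many more docs may be counted before the limit
// is hit. A docsLimit of 0 is treated as "no limit" (an assumption here).
func remainingDocs(docsLimit, totalDocsCount int) int {
	if docsLimit == 0 {
		return int(math.MaxInt64)
	}
	return docsLimit - totalDocsCount
}

func main() {
	fmt.Println(remainingDocs(10, 4)) // budget still available
	fmt.Println(remainingDocs(5, 7))  // zero or negative: limit already hit
}
```

A result of zero or less corresponds to the `maxDocs <= 0` branch, where nothing further from the batch can be accepted.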


// NB: already hit doc limit.
if maxDocs <= 0 {
for idx := 0; idx < len(batch); idx++ {
Collaborator

so since we hit the limit, we essentially have to clean up this entire batch?

Collaborator Author

Yeah, exactly

limitTripped = true
}

if limitTripped {
Collaborator

can we in-line the check above and remove the limitTripped var?

Collaborator Author

Good call

return err
valuesMap := aggValues.Map()
for _, t := range entry.Terms {
if maxDocs > docs {
Collaborator

if the max is already hit should we break from the for? Or is it a trivial amount of iteration?

Collaborator Author

We still have to iterate through the remainder here to finalize them
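A minimal sketch of why the loop cannot break early: terms that are not retained (duplicates, or anything past the limit) still need to be released. The `term` type with its `Finalize` method and the `addTerms` helper are hypothetical stand-ins for pooled identifiers, assumed here for illustration only:

```go
package main

import "fmt"

// term is a hypothetical stand-in for a pooled identifier that must be
// released when the results do not take ownership of it.
type term struct {
	value     string
	finalized bool
}

// Finalize marks the term as released back to its (imaginary) pool.
func (t *term) Finalize() { t.finalized = true }

// addTerms inserts unseen terms until the remaining budget is used up and
// finalizes every term that is not retained. It returns the insert count.
func addTerms(seen map[string]struct{}, terms []*term, remaining int) int {
	inserted := 0
	for _, t := range terms {
		_, dup := seen[t.value]
		if !dup && inserted < remaining {
			seen[t.value] = struct{}{}
			inserted++
			continue
		}
		// Duplicate or over the limit: breaking here would leak the
		// leftover terms, so each one is released instead.
		t.Finalize()
	}
	return inserted
}

func main() {
	terms := []*term{{value: "a"}, {value: "b"}, {value: "a"}, {value: "c"}}
	n := addTerms(map[string]struct{}{}, terms, 2)
	fmt.Println("inserted:", n)
	for _, t := range terms {
		fmt.Println(t.value, "finalized:", t.finalized)
	}
}
```

With a budget of 2, the first two terms are retained, while the duplicate `a` and the over-limit `c` are both finalized.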

@arnikola arnikola merged commit e3f0282 into master Jan 23, 2021
@arnikola arnikola deleted the arnikola/fix-limits branch January 23, 2021 00:26
soundvibe pushed a commit that referenced this pull request Jan 25, 2021
* master:
  [dbnode] Handle empty slices in convert.FromSeriesIDAndEncodedTags (#3107)
  [dbnode] Fix AggregateQuery limits (#3112)
  [coordinator] Marshal return val of kv handler with jsonpb (#3116)
  [m3db] [m3coordinator] Properly pipe along require-exhaustive param for aggregate queries (#3115)