[dbnode] Aggregate() using only FSTs where possible #1545

prateek · 2019-04-11T01:10:00Z

When we're not filtering on a query, we can compose the results of Aggregation from the FSTs directly. This avoids the code path from postings lists -> documents, thereby saving a lot of CPU.
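As a rough illustration of the idea (not the code in this PR), the sketch below aggregates fields and terms straight from per-segment term dictionaries. The `segment` type is a hypothetical stand-in for the fields/terms FSTs, and `aggregateAll` is an assumed name; nothing here touches postings lists or documents.

```go
package main

import (
	"fmt"
	"sort"
)

// segment is a hypothetical stand-in for an index segment: each field maps to
// its list of terms, which is the information a fields/terms FST exposes.
type segment map[string][]string

// aggregateAll collects every field and its distinct terms across segments
// without materializing postings lists or documents.
func aggregateAll(segs []segment) map[string][]string {
	seen := make(map[string]map[string]struct{})
	for _, seg := range segs {
		for field, terms := range seg {
			if seen[field] == nil {
				seen[field] = make(map[string]struct{})
			}
			for _, t := range terms {
				seen[field][t] = struct{}{}
			}
		}
	}
	out := make(map[string][]string, len(seen))
	for field, set := range seen {
		terms := make([]string, 0, len(set))
		for t := range set {
			terms = append(terms, t)
		}
		sort.Strings(terms) // deterministic order for callers
		out[field] = terms
	}
	return out
}

func main() {
	segs := []segment{
		{"city": {"nyc", "sf"}, "env": {"prod"}},
		{"city": {"sf", "tokyo"}},
	}
	fmt.Println(aggregateAll(segs))
}
```

This shortcut only holds for an unfiltered query: once a filter applies, a term counts only if some matching document carries it, which is what the postings-list path determines.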

prateek · 2019-04-11T01:11:44Z

src/dbnode/storage/index/block.go

+	field, term []byte,
+	includeTerms bool,
+) []AggregateResultsEntry {
+	// NB(prateek): we make a copy of the (field, term) entries returned


Highlighting this bit as a potentially contentious choice. Could simplify to living with higher contention if people feel strongly.

I like this approach fwiw :)

Aye, me too.

prateek · 2019-04-11T01:12:58Z

src/dbnode/storage/index/aggregate_results.go

+			aggValues = r.valuesPool.Get()
+			// we can avoid the copy because we assume ownership of the passed ident.ID,
+			// but still need to finalize it.
+			r.resultsMap.set(f, aggValues, _AggregateResultsMapKeyOptions{


using an internal method of codegen'd type feels sketch but we don't have the equivalent exported so no way around

Can't this use

SetUnsafe(f, aggValues, AggregateResultsMapSetUnsafeOptions{
	NoCopyKey:     true,
	NoFinalizeKey: false,
})

to avoid the internal method?

Good call, will do.

codecov · 2019-04-11T01:22:39Z

Codecov Report

Merging #1545 into master will decrease coverage by 0.1%.
The diff coverage is 39.6%.

@@           Coverage Diff            @@
##           master   #1545     +/-   ##
========================================
- Coverage    71.7%   71.6%   -0.2%     
========================================
  Files         947     948      +1     
  Lines       77981   78230    +249     
========================================
+ Hits        55960   56026     +66     
- Misses      18381   18548    +167     
- Partials     3640    3656     +16

Flag	Coverage Δ
#aggregator	`82.3% <ø> (ø)`	⬆️
#cluster	`85.7% <ø> (ø)`	⬆️
#collector	`63.7% <ø> (ø)`	⬆️
#dbnode	`79.6% <38.7%> (-0.5%)`	⬇️
#m3em	`73.2% <ø> (ø)`	⬆️
#m3ninx	`73.9% <100%> (ø)`	⬆️
#m3nsch	`51.1% <ø> (ø)`	⬆️
#metrics	`17.5% <ø> (ø)`	⬆️
#msg	`74.9% <ø> (ø)`	⬆️
#query	`66.5% <ø> (ø)`	⬆️
#x	`85.7% <ø> (-0.2%)`	⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0a395df...313d21f. Read the comment docs.

arnikola · 2019-04-11T21:08:53Z

src/dbnode/storage/index/types.go

@@ -181,6 +181,16 @@ type AggregateResults interface {
 		aggregateQueryOpts AggregateResultsOptions,
 	)

+	// AggregateResultsOptions returns the set AggregateResultsOptions.


returns the options for this AggregateResult?

arnikola · 2019-04-11T21:10:59Z

src/dbnode/storage/index/types.go

+		opts QueryOptions,
+		results AggregateResults,
+	) (exhaustive bool, err error)
+
 	// AddResults adds bootstrap results to the block, if c.


I know it's not part of this pr but can you fix this comment?

arnikola · 2019-04-11T21:11:45Z

src/dbnode/storage/index/types.go

@@ -273,6 +290,16 @@ type Block interface {
 		results BaseResults,
 	) (exhaustive bool, err error)

+	// Aggregate aggregates known tag names/values.
+	// NB(prateek): this is different from Query, as we can


nit; different from AggregateQuery?

Nah, methods on this interface are Query() and Aggregate(). Unless you mean to say rename?

different from aggregating by means of Query maybe? Otherwise it kinda reads like this aggregates tag names/values which you don't really expect Query to do

arnikola · 2019-04-11T21:12:42Z

src/dbnode/storage/index/options.go

@@ -304,6 +322,16 @@ func (o *opts) DocumentArrayPool() doc.DocumentArrayPool {
 	return o.docArrayPool
 }

+func (o *opts) SetAggregateResultsEntryArrayPool(value AggregateResultsEntryArrayPool) Options {
+	opts := *o
+	opts.aggResultsEntryArrayPool = value


Should this pool be added to the validate function?

ah nvm looks like that's only for pools which need to be added rather than the ones created by NewOptions

meh, i'll add for sanity.

arnikola · 2019-04-12T17:23:10Z

src/dbnode/storage/index/block.go

+			return false, err
+		}
+
+		if err := iter.Close(); err != nil {


nit: I don't think this is necessary; you reset every loop and close the iterator explicitly below already

Hm it's necessary cause we want to ensure we're releasing any internal state (via Close()). Reset() just clears that state, doesn't release it.

Fair; in that case it's a little weird that we double close below

For this particular iterator it's not really a problem, but it's plausible for an iter implementation to panic if it's double closed here

Definitely a fair point. This implementation doesn't have that issue intentionally for this reason - so as to ensure we free resources. I'll make a note about this property in a comment.

Still feels a little sketch to me because I think some of our Close() methods return objects to pools

that's a fair point. I'll rework to not rely on this behaviour.

arnikola · 2019-04-12T17:24:18Z

src/dbnode/storage/index/block.go

+
+	// Add last batch to results if remaining.
+	if len(batch) > 0 {
+		batch, size, err = b.addAggregateResults(cancellable, results, batch)


Does this need to append aggregate results too?

nah, cause we've already done that for the elements in the batch in a previous loop iteration.

arnikola · 2019-04-12T17:28:23Z

src/dbnode/storage/index/block.go

+	var (
+		entry            AggregateResultsEntry
+		lastField        []byte
+		lastFieldIsValid bool


nit: I don't think this is necessary, bytes.Equal should handle the lastField == nil case

Hm not sure we can. Consider the case when we're trying to distinguish the first element of a batch (i.e. no prior elements in the batch), from a batch with the last element having a nil field. I don't think it'll happen in practice but this way makes fewer assumptions about the data so tend to prefer it.

I think the nil field will break this in general even if it's somewhere in the middle (since you'd be trying to call .Bytes() on it); otherwise is there any real difference between?

Could do something like

if len(batch) > 0 {
	last := batch[len(batch)-1]
	if last != nil {
		lastField = last.Bytes()
	}
}
if bytes.Equal(lastField, field) {
	entry = batch[len(batch)-1]
} else {
	entry.Field = b.pooledID(field)
}
...

we're guaranteed it'd only be the first element (cause it's the first lexicographic string). It wouldn't break anything cause we'd still allocate an ident.ID backed by an empty slice for it.
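To make the point of contention concrete, here is a minimal sketch of the grouping logic with an explicit validity flag. The `entry` type and `appendFieldAndTerm` are simplified, hypothetical stand-ins (plain byte slices rather than pooled `ident.ID`s). Note that `bytes.Equal(nil, []byte{})` is true in Go, so without the flag an empty first field would be indistinguishable from "no previous entry".

```go
package main

import (
	"bytes"
	"fmt"
)

// entry is a hypothetical, simplified stand-in for AggregateResultsEntry.
type entry struct {
	Field []byte
	Terms [][]byte
}

// appendFieldAndTerm adds (field, term) to batch. Input is assumed to arrive
// sorted by field, so a repeated field can only match the last batch entry.
func appendFieldAndTerm(batch []entry, field, term []byte) []entry {
	var (
		lastField        []byte
		lastFieldIsValid bool
	)
	if n := len(batch); n > 0 {
		lastField = batch[n-1].Field
		lastFieldIsValid = true
	}
	// Copy the inputs, assuming the iterator may reuse its backing buffers.
	termCopy := append([]byte(nil), term...)
	if lastFieldIsValid && bytes.Equal(lastField, field) {
		batch[len(batch)-1].Terms = append(batch[len(batch)-1].Terms, termCopy)
		return batch
	}
	return append(batch, entry{
		Field: append([]byte(nil), field...),
		Terms: [][]byte{termCopy},
	})
}

func main() {
	pairs := [][2]string{{"", "a"}, {"city", "nyc"}, {"city", "sf"}, {"env", "prod"}}
	var batch []entry
	for _, p := range pairs {
		batch = appendFieldAndTerm(batch, []byte(p[0]), []byte(p[1]))
	}
	for _, e := range batch {
		fmt.Printf("%q -> %q\n", e.Field, e.Terms)
	}
}
```

With the flag in place, an empty field on an empty batch takes the "new entry" branch instead of indexing into a batch that has no last element.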

arnikola · 2019-04-12T17:31:25Z

src/dbnode/storage/index/block.go

+	return exhaustive, nil
+}
+
+func (b *block) appendAggregateResults(


nit: the name was really confusing (as was addAggregateResults followed by addAggregateResults), maybe rename this to appendFieldAndTermToBatch

arnikola · 2019-04-15T14:07:29Z

src/dbnode/storage/index/fields_terms_iterator.go

+
+func (fti *fieldsAndTermsIter) setNext() bool {
+	// if only need to iterate fields
+	if !fti.opts.iterateTerms {


nit: I think this might flow a little better if the branch is moved to the Next() function instead of calling setNext then instantly calling setNextField

fair, done.

arnikola · 2019-04-15T14:12:49Z

src/dbnode/storage/index/fields_terms_iterator.go

+		return false
+	}
+	fti.termIter = termsIter
+


Hmm this may need to check if !fti.opts.allow(field) and move onto the next fieldIter if so

nope, cause setNextField() does that check already.

👍 misread that this was calling Next instead

arnikola · 2019-04-15T14:28:06Z

src/dbnode/storage/index/field_terms_iterator_test.go

+	slice := toSlice(t, iter)
+	requireSlicesEqual(t, []pair{}, slice)
+}
+


Can you add a few tests with allowFn set, maybe one which goes through a couple of segments

re: allowFn tests - does the existing TestFieldsTermsIteratorSimpleSkip not capture stuff you're worried about?

re: multiple segments, do you mean Reset()? if so, done

also, re allowFn - the prop tests using fieldsTermsIteratorPropInput are generating test situations with those cases too.

My bad yeah, was looking at a partial diff so the tests seemed lacking

richardartoul · 2019-04-15T21:00:33Z

src/dbnode/storage/index.go

-	exhaustive, err := i.query(ctx, query, results, opts.QueryOptions)
+	// use appropriate fn to query underlying blocks.
+	fn := i.execBlockQueryFn
+	if query.Equal(idx.NewAllQuery()) {


can we just compare against a global one? It's minuscule but technically this allocates because it creates an interface internally

lol it's not going to have an impact on perf but sure.

richardartoul · 2019-04-15T21:07:00Z

src/dbnode/storage/index.go

+		return
+	}
+
+	if blockExhaustive {


Might be simpler to just do:

state.exhaustive = blockExhaustive
return

Would make adding more logic later easier, although technically it might be slower cause you're forcing a memory write back 🤷‍♂️

setting to state.exhaustive = state.exhaustive && blockExhaustive always works so doing that instead
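A small sketch of why the AND form is the safe one, using a hypothetical `queryState` type (synchronization omitted; the real code may guard this state differently). Plain assignment would let a later exhaustive block overwrite an earlier non-exhaustive one.

```go
package main

import "fmt"

// queryState is a hypothetical stand-in for the shared per-query state.
type queryState struct {
	exhaustive bool
}

// record folds one block's result into the overall flag. Once any block
// reports non-exhaustive, the flag stays false regardless of later blocks.
func (s *queryState) record(blockExhaustive bool) {
	s.exhaustive = s.exhaustive && blockExhaustive
}

// allExhaustive reports whether every block result was exhaustive.
func allExhaustive(blocks []bool) bool {
	state := queryState{exhaustive: true}
	for _, b := range blocks {
		state.record(b)
	}
	return state.exhaustive
}

func main() {
	fmt.Println(allExhaustive([]bool{true, false, true}))
	fmt.Println(allExhaustive([]bool{true, true}))
}
```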

richardartoul · 2019-04-15T21:08:08Z

src/dbnode/storage/index.go

+		return
+	}
+
+	if blockExhaustive {


same comment

same as above.

richardartoul · 2019-04-15T21:15:52Z

src/dbnode/storage/index/aggregate_results.go

+		f := entry.Field
+		aggValues, ok := r.resultsMap.Get(f)
+		if !ok {
+			aggValues = r.valuesPool.Get()


How do these get reset? I don't see it after any Get() calls nor in the generated code on put. Are we just trusting that they've been properly reset here?

The way it's supposed to work: we register a Finalizer on the incoming context to ensure the Finalize() method on the object is called (which in turn calls Reset(nil, ...) which does the actual releasing).

That said, the current code doesn't look to be registering a Finalizer on the context in either Query/Aggregate. Will put up another PR for this.

#1567 as the follow up.

is it cheap to do a reset after pulling out the pool? I generally prefer that pattern over trust that every Put does a proper reset but I trust you understand this lifecycle better than I do

yeah Reset is O(items in the map); so should be free considering the Put is doing the right thing. Don't need to do it tho, we ensure to cleanup before the Put

I prefer this approach (assume object returned from pool is valid) cause it puts the cleanup penalty on the last callsite to use the object (as opposed to the new callsite to receive the object). Seems like a "fairer" way to tax users
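The convention described above (clean up before `Put`, trust what `Get` returns) could look roughly like the following. This is a sketch built on `sync.Pool` with hypothetical `values` and `valuesPool` types, not the generated pool used in the PR.

```go
package main

import (
	"fmt"
	"sync"
)

// values is a hypothetical pooled object holding a set of terms.
type values struct {
	m map[string]struct{}
}

// reset clears the map in place; cost is O(items in the map).
func (v *values) reset() {
	for k := range v.m {
		delete(v.m, k)
	}
}

// valuesPool resets objects on Put, so Get can hand them out as-is.
type valuesPool struct {
	p sync.Pool
}

func newValuesPool() *valuesPool {
	return &valuesPool{p: sync.Pool{
		New: func() interface{} { return &values{m: make(map[string]struct{})} },
	}}
}

// Get returns an object that is assumed to be clean already.
func (vp *valuesPool) Get() *values { return vp.p.Get().(*values) }

// Put makes the last user pay for cleanup before returning the object.
func (vp *valuesPool) Put(v *values) {
	v.reset()
	vp.p.Put(v)
}

func main() {
	pool := newValuesPool()
	v := pool.Get()
	v.m["nyc"] = struct{}{}
	pool.Put(v)
	fmt.Println(len(pool.Get().m))
}
```

The trade-off raised in the thread still applies: this is only safe if every caller returns objects through `Put` (or an equivalent finalizer) rather than bypassing the cleanup.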

richardartoul · 2019-04-15T21:22:19Z

src/dbnode/storage/index/aggregate_results.go

+	return r.aggregateOpts
+}
+
+func (r *aggregatedResults) AddFields(batch []AggregateResultsEntry) (int, error) {


is the error just future proofing the API? Doesn't look like it's currently possible to return one

fair, will simplify

richardartoul · 2019-04-15T22:07:53Z

src/dbnode/storage/index/block.go

+	data := b.bytesPool.Get(len(id))
+	data.IncRef()
+	data.AppendAll(id)
+	data.DecRef()


hmm, seems like this ID will have no refs for awhile until we add it to the results :/ Nothing we can do about that I assume? adding more accounting probably not worth it since these are really expensive

The ref is only the bytes backing this ident, which the ident will take a ref to right after this line when we transfer the bytes to it (https://github.com/m3db/m3/blob/master/src/x/ident/identifier_pool.go#L105)

oh duh I misunderstood this. 👍

richardartoul · 2019-04-15T22:08:51Z

src/dbnode/storage/index/block.go

+	results AggregateResults,
+	batch []AggregateResultsEntry,
+) ([]AggregateResultsEntry, int, error) {
+	// Checkout the lifetime of the query before adding results


nit: period at the end of all these comments

richardartoul · 2019-04-15T22:10:20Z

src/dbnode/storage/index/field_terms_iterator_prop_test.go

+	parameters.Rng = rand.New(rand.NewSource(seed))
+	properties := gopter.NewProperties(parameters)
+
+	properties.Property("Fields Terms Iteration doesn't blow up", prop.ForAll(


Can you add a comment about the motivation for why we need a separate test just to make sure there are no panics when we already have a correctness test (that presumably might also catch panics)

tl;dr - the correctness prop test ensures we behave correctly on the happy path. This prop test ensures we don't panic unless the underlying iterator itself panics.

will add a comment.

itself -> iterator

richardartoul · 2019-04-15T22:12:51Z

src/dbnode/storage/index/fields_terms_iterator.go

+	fieldsAndTermsIterZeroed fieldsAndTermsIter
+)
+
+var _ fieldsAndTermsIterator = &fieldsAndTermsIter{}


this only happens at package initialization anyways right? who cares haha

richardartoul · 2019-04-15T22:13:23Z

src/dbnode/storage/index/fields_terms_iterator.go

+	// Err returns any errors encountered during iteration.
+	Err() error
+
+	// Close releases any resources held by the iterator.


Maybe add a comment explicitly saying this will not return the iter to the pool and that anything implementing this interface should explicitly support double closes

as mentioned earlier, reworked to avoid this dependency.

richardartoul

:stamp: to unblock

richardartoul · 2019-04-18T15:05:04Z

src/dbnode/storage/index/block.go

@@ -823,30 +852,235 @@ func (b *block) addQueryResults(
 	results BaseResults,
 	batch []doc.Document,
 ) ([]doc.Document, int, error) {
-	// Checkout the lifetime of the query before adding results
+	// checkout the lifetime of the query before adding results.


Why are you lower casing these? I've been commenting on all PRs to ensure they're capital!

oh really? I've been trying to stay all lower case for a while. Don't feel strongly about which one but we should pick a convention and stick to it.

richardartoul · 2019-04-18T15:11:33Z

src/dbnode/storage/index/block.go

+		size       = results.Size()
+		batch      = b.opts.AggregateResultsEntryArrayPool().Get()
+		batchSize  = cap(batch)
+		iterClosed = false // tracking whether we need to free the iterator at the end.


Is this just a guard against future maintainers? Cause it seems like you always close it now. If so maybe just mention in the comment it's an extra precaution so people don't spend time chasing why it's needed (unless I missed something)

nah so it's like the standard cleanup pattern we use in a few places

object := ...
cleanedUp := false
defer func() {
	if !cleanedUp {
		object.Cleanup()
	}
}()
// do thing 1 for object
// if it fails, just early `return`
// so on
// finally,
cleanedUp = true
if err := object.Cleanup(); err != nil {
	return err
}

this way we can be sure all exit paths from the function cleanup the object; either because we manually do it; or the defer takes care of it.

without the defer (and the var), I'd have to interlace the cleanup calls at every exit point from the function, which seems brittle at best.
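For reference, a runnable version of that pattern might look like this. The `resource` type and `process` function are hypothetical; the counter only exists to show that each exit path closes the object exactly once.

```go
package main

import (
	"errors"
	"fmt"
)

// resource is a hypothetical object that must be closed exactly once.
type resource struct {
	closes int
}

func (r *resource) Close() error {
	r.closes++
	return nil
}

// process closes r on every exit path: explicitly on success (so the error
// can be returned), and via the deferred guard on early returns.
func process(r *resource, failEarly bool) error {
	closed := false
	defer func() {
		if !closed {
			_ = r.Close() // best-effort cleanup; the original error wins
		}
	}()

	if failEarly {
		return errors.New("step failed")
	}

	// Mark as closed before the explicit Close so the defer never double closes.
	closed = true
	return r.Close()
}

func main() {
	early, ok := &resource{}, &resource{}
	fmt.Println(process(early, true), early.closes)
	fmt.Println(process(ok, false), ok.closes)
}
```

Setting the flag before the explicit `Close` is what avoids the double close discussed earlier in the review, even when that `Close` itself returns an error.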

prateek requested review from robskillington, richardartoul and arnikola April 11, 2019 01:10

prateek force-pushed the prateek/dbnode/index-agg-fst branch from fc5db8f to 578baac Compare April 11, 2019 01:10

prateek force-pushed the prateek/dbnode/index-agg-fst branch 4 times, most recently from b143a0b to 501c20d Compare April 11, 2019 02:10

arnikola reviewed Apr 12, 2019

arnikola reviewed Apr 15, 2019

prateek force-pushed the prateek/dbnode/index-agg-fst branch from 313d21f to 39a94c1 Compare April 15, 2019 19:24

arnikola approved these changes Apr 15, 2019

richardartoul reviewed Apr 15, 2019

richardartoul approved these changes Apr 15, 2019

prateek force-pushed the prateek/dbnode/index-agg-fst branch 6 times, most recently from b8fb0fc to 2ce7b21 Compare April 18, 2019 04:56

prateek mentioned this pull request Apr 18, 2019

[dbnode] Register index.Query/Aggregate finalizers #1567

Merged

prateek force-pushed the prateek/dbnode/index-agg-fst branch from 22dd0c9 to c4fb8f4 Compare April 18, 2019 17:42

richardartoul approved these changes Apr 19, 2019

prateek force-pushed the prateek/dbnode/index-agg-fst branch from e13bfc0 to 40070bc Compare April 19, 2019 16:13

[dbnode] Aggregate() using only FSTs where possible

a42a09c

prateek force-pushed the prateek/dbnode/index-agg-fst branch from 40070bc to a42a09c Compare April 23, 2019 23:00

prateek merged commit 0185c0e into master Apr 23, 2019

prateek deleted the prateek/dbnode/index-agg-fst branch April 23, 2019 23:15
