Fail if FetchTagged partially retrieves results due to error #2610

Merged
merged 3 commits into from
Sep 10, 2020
2 changes: 2 additions & 0 deletions src/dbnode/generated/thrift/rpc.thrift
@@ -183,6 +183,8 @@ struct FetchTaggedIDResult {
2: required binary nameSpace
3: required binary encodedTags
4: optional list<Segments> segments
+
+// Deprecated -- do not use.
5: optional Error err
}

26 changes: 13 additions & 13 deletions src/dbnode/network/server/tchannelthrift/node/service.go
@@ -680,12 +680,16 @@ func (s *service) fetchTagged(ctx context.Context, db storage.Database, req *rpc
encodedDataResults = make([][][]xio.BlockReader, results.Size())
}
if err := s.fetchReadEncoded(ctx, db, response, results, nsID, nsIDBytes, callStart, opts, fetchData, encodedDataResults); err != nil {
+s.metrics.fetchTagged.ReportError(s.nowFn().Sub(callStart))
return nil, err
}

-// Step 2: If fetching data read the results of the asynchronuous block readers.
+// Step 2: If fetching data read the results of the asynchronous block readers.
if fetchData {
-s.fetchReadResults(ctx, response, nsID, encodedDataResults)
+if err := s.fetchReadResults(ctx, response, nsID, encodedDataResults); err != nil {
Collaborator

Nice catch... almost tempted to turn on the linter that forces error checking, but that will probably generate huge amounts of noise

Collaborator Author

Yeah, in an ideal world we'd have that on I think

Collaborator

It's fine to do "_ = foo" to get around the linter if you run into that @arnikola for cases where you do really want to ignore the error.
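
To illustrate the linter discussion above, here is a minimal sketch of the difference between propagating an error and explicitly discarding it. All names (`flush`, `process`, `bestEffort`) are hypothetical and do not come from the M3 codebase. The assumption is an error-checking linter that, like errcheck in its default configuration, flags a bare call whose error is dropped but accepts an assignment to the blank identifier.

```go
package main

import (
	"errors"
	"fmt"
)

// flush is a hypothetical operation that can fail.
func flush(fail bool) error {
	if fail {
		return errors.New("flush failed")
	}
	return nil
}

// process checks the error and propagates it to the caller.
func process(fail bool) error {
	if err := flush(fail); err != nil {
		return fmt.Errorf("process: %w", err)
	}
	return nil
}

// bestEffort deliberately discards the error. The blank assignment
// documents that ignoring it is intentional rather than an oversight.
func bestEffort(fail bool) {
	_ = flush(fail)
}

func main() {
	bestEffort(true)
	fmt.Println(process(false)) // prints "<nil>"
	fmt.Println(process(true))  // prints "process: flush failed"
}
```

The blank assignment keeps the intentional cases visible in review while still letting the linter catch calls where the error was dropped by accident.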

+s.metrics.fetchTagged.ReportError(s.nowFn().Sub(callStart))
+return nil, err
+}
}

s.metrics.fetchTagged.ReportSuccess(s.nowFn().Sub(callStart))
Collaborator

nit: maybe worth pulling out the s.metrics.fetchTagged.ReportError(...) that happens in fetchReadEncoded and fetchReadResults? That way, the metric reporting all happens at this same method layer in fetchTagged?
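
The suggestion above can be sketched as follows. This is a simplified illustration, not the code in this PR: `callMetrics`, `readEncoded`, `readResults`, and `fetch` are hypothetical stand-ins, and the deferred reporter is one possible way to keep success and error reporting in a single method.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// callMetrics is a hypothetical stand-in for a per-endpoint metrics reporter.
type callMetrics struct {
	successes, errs int
}

func (m *callMetrics) ReportSuccess(time.Duration) { m.successes++ }
func (m *callMetrics) ReportError(time.Duration)   { m.errs++ }

// readEncoded and readResults are hypothetical helpers. They only return
// errors and never touch metrics themselves.
func readEncoded(fail bool) error {
	if fail {
		return errors.New("read encoded failed")
	}
	return nil
}

func readResults(fail bool) error {
	if fail {
		return errors.New("read results failed")
	}
	return nil
}

// fetch is the only layer that reports metrics, so every call is counted
// exactly once, whichever helper fails.
func fetch(m *callMetrics, failEncoded, failResults bool) (err error) {
	start := time.Now()
	defer func() {
		if err != nil {
			m.ReportError(time.Since(start))
			return
		}
		m.ReportSuccess(time.Since(start))
	}()

	if err = readEncoded(failEncoded); err != nil {
		return err
	}
	return readResults(failResults)
}

func main() {
	m := &callMetrics{}
	_ = fetch(m, false, false)
	_ = fetch(m, true, false)
	_ = fetch(m, false, true)
	fmt.Printf("successes=%d errors=%d\n", m.successes, m.errs) // prints "successes=1 errors=2"
}
```

Keeping the reporting in one place makes it harder to double-count an error or to add a new return path that reports nothing.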

@@ -723,7 +727,6 @@ func (s *service) fetchReadEncoded(ctx context.Context,
ctx.RegisterFinalizer(enc)
encodedTags, err := s.encodeTags(enc, tags)
if err != nil { // This is an invariant, should never happen
-s.metrics.fetchTagged.ReportError(s.nowFn().Sub(callStart))
return tterrors.NewInternalError(err)
}

@@ -740,19 +743,20 @@ func (s *service) fetchReadEncoded(ctx context.Context,
encoded, err := db.ReadEncoded(ctx, nsID, tsID,
opts.StartInclusive, opts.EndExclusive)
if err != nil {
-elem.Err = convert.ToRPCError(err)
Collaborator

Are we aware of a reason this was originally written to not fail the batch on any ReadEncoded errors? Just wondering if this will move away from some specific behavior that was originally intended

Collaborator Author

Yeah, that's a good question. @robskillington alluded to the fact that it was modeled after what was done for fetching blocks, but it's actually not necessary in this case. I think that's also corroborated by the fact that no one is actually making use of the embedded error message. I stopped short of removing Err from the individual element, and this code path is only touched by the FetchTagged endpoint, so if folks wanted the ability to get and collect individual errors, that'd still be a possibility.

Collaborator

Got it, yeah, that makes sense then, how it ended up present but not actually required. Seems like there are both pros and cons to fully removing it. There's some niceness in keeping it consistent and having it around later if we decide to use it, but it also makes the current code a bit more confusing since we don't use it. I'd probably vote to either remove it or just leave a comment on the field for future awareness.

Collaborator

seems worth removing if nobody is using it. odd to have "multiple" error paths.

Collaborator Author

Forgot to mention here that Err is a field on a Thrift-generated object, so I left the field in place but added a comment indicating that the field is deprecated.

+return convert.ToRPCError(err)
} else {
encodedDataResults[idx] = encoded
}
}
return nil
}

-func (s *service) fetchReadResults(ctx context.Context,
+func (s *service) fetchReadResults(
+ctx context.Context,
response *rpc.FetchTaggedResult_,
nsID ident.ID,
encodedDataResults [][][]xio.BlockReader,
-) {
+) error {
ctx, sp, sampled := ctx.StartSampledTraceSpan(tracepoint.FetchReadResults)
if sampled {
sp.LogFields(
@@ -762,19 +766,15 @@ func (s *service) fetchReadResults(ctx context.Context,
}
defer sp.Finish()

-for idx, elem := range response.Elements {
-if elem.Err != nil {
-continue
-}
-
+for idx := range response.Elements {
segments, rpcErr := s.readEncodedResult(ctx, nsID, encodedDataResults[idx])
if rpcErr != nil {
-elem.Err = rpcErr
-continue
+return rpcErr
}
}

response.Elements[idx].Segments = segments
}
+return nil
}

func (s *service) Aggregate(tctx thrift.Context, req *rpc.AggregateQueryRequest) (*rpc.AggregateQueryResult_, error) {
90 changes: 90 additions & 0 deletions src/dbnode/network/server/tchannelthrift/node/service_test.go
@@ -1879,6 +1879,96 @@ func TestServiceFetchTaggedErrs(t *testing.T) {
require.Error(t, err)
}

func TestServiceFetchTaggedReturnOnFirstErr(t *testing.T) {
ctrl := xtest.NewController(t)
defer ctrl.Finish()

mockDB := storage.NewMockDatabase(ctrl)
mockDB.EXPECT().Options().Return(testStorageOpts).AnyTimes()
mockDB.EXPECT().IsOverloaded().Return(false)

service := NewService(mockDB, testTChannelThriftOptions).(*service)

tctx, _ := tchannelthrift.NewContext(time.Minute)
ctx := tchannelthrift.Context(tctx)
defer ctx.Close()

mtr := mocktracer.New()
sp := mtr.StartSpan("root")
ctx.SetGoContext(opentracing.ContextWithSpan(gocontext.Background(), sp))

start := time.Now().Add(-2 * time.Hour)
end := start.Add(2 * time.Hour)
start, end = start.Truncate(time.Second), end.Truncate(time.Second)

nsID := "metrics"

id := "foo"
s := []struct {
t time.Time
v float64
}{
{start.Add(10 * time.Second), 1.0},
{start.Add(20 * time.Second), 2.0},
}
enc := testStorageOpts.EncoderPool().Get()
enc.Reset(start, 0, nil)
for _, v := range s {
dp := ts.Datapoint{
Timestamp: v.t,
Value: v.v,
}
require.NoError(t, enc.Encode(dp, xtime.Second, nil))
}

stream, _ := enc.Stream(ctx)
mockDB.EXPECT().
ReadEncoded(gomock.Any(), ident.NewIDMatcher(nsID), ident.NewIDMatcher(id), start, end).
Return([][]xio.BlockReader{{
xio.BlockReader{
SegmentReader: stream,
},
}}, fmt.Errorf("random err")) // Return error that should trigger failure of the entire call

req, err := idx.NewRegexpQuery([]byte("foo"), []byte("b.*"))
require.NoError(t, err)
qry := index.Query{Query: req}

resMap := index.NewQueryResults(ident.StringID(nsID),
index.QueryResultsOptions{}, testIndexOptions)
resMap.Map().Set(ident.StringID("foo"), ident.NewTagsIterator(ident.NewTags(
ident.StringTag("foo", "bar"),
ident.StringTag("baz", "dxk"),
)))

mockDB.EXPECT().QueryIDs(
gomock.Any(),
ident.NewIDMatcher(nsID),
index.NewQueryMatcher(qry),
index.QueryOptions{
StartInclusive: start,
EndExclusive: end,
SeriesLimit: 10,
}).Return(index.QueryResult{Results: resMap, Exhaustive: true}, nil)

startNanos, err := convert.ToValue(start, rpc.TimeType_UNIX_NANOSECONDS)
require.NoError(t, err)
endNanos, err := convert.ToValue(end, rpc.TimeType_UNIX_NANOSECONDS)
require.NoError(t, err)
var limit int64 = 10
data, err := idx.Marshal(req)
require.NoError(t, err)
_, err = service.FetchTagged(tctx, &rpc.FetchTaggedRequest{
NameSpace: []byte(nsID),
Query: data,
RangeStart: startNanos,
RangeEnd: endNanos,
FetchData: true,
Limit: &limit,
})
require.Error(t, err)
}

func TestServiceAggregate(t *testing.T) {
ctrl := xtest.NewController(t)
defer ctrl.Finish()