[query] Add config for timeseries limit returned by single DB node #1644

Merged: 12 commits from r/add-config-limit into master on May 18, 2019

Conversation

robskillington (Collaborator) commented May 17, 2019

What this PR does / why we need it:

Adds a config option and a sane default to m3query for the maximum number of timeseries returned by a single DB node.

Special notes for your reviewer:

Does this PR introduce a user-facing and/or backwards incompatible change?:

NONE

Does this PR require updating code package or user-facing documentation?:

NONE

return handler.FetchOptionsBuilderOptions{
Limit: defaultStorageQueryLimit,
}
}
Collaborator:

nit: newline

MaxFetchedDatapoints int64 `yaml:"maxFetchedDatapoints"`

// MaxTimeseries limits the number of time series returned by a storage node.
MaxTimeseries int64 `yaml:"maxTimeseries"`
Collaborator:

maxFetchedTimeseries instead to be consistent with maxFetchedDatapoints?
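
e.g. a quick sketch of the renamed field, mirroring the existing snippet above:

// MaxFetchedTimeseries limits the number of time series returned by a storage node.
MaxFetchedTimeseries int64 `yaml:"maxFetchedTimeseries"`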

return nil, xhttp.NewParseError(err, http.StatusBadRequest)
}
fetchOpts.Limit = n
}
Collaborator:

nit: newline

) (*storage.FetchOptions, *xhttp.ParseError) {
fetchOpts := storage.NewFetchOptions()
fetchOpts.Limit = b.opts.Limit
if str := req.Header.Get(LimitMaxTimeseriesHeader); str != "" {
Collaborator:

Good call on making this part of the request. Although, does it make more sense to have this be part of the URL parameters? More of an educational question for me.

Collaborator Author:

It does make more sense as a URL parameter, but I wanted this to apply to any endpoint, so I didn't want one endpoint using "limit" for something other than what I'm specifying and then have them collide.

We can always go add a URL param later if we think we can avoid collisions, this was just a quick way to add it for now.
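
For example, overriding the limit for a single request would look roughly like this (a sketch; the endpoint path and port are illustrative, not part of this PR):

package main

import (
	"io"
	"net/http"
	"os"
)

func main() {
	// Cap the number of series any single storage node may return,
	// for this request only, via the header added in this PR.
	req, err := http.NewRequest(http.MethodGet,
		"http://localhost:7201/api/v1/query_range?query=up&start=0&end=60&step=15", nil)
	if err != nil {
		panic(err)
	}
	req.Header.Set("M3-Limit-Max-Timeseries", "10000")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	io.Copy(os.Stdout, resp.Body)
}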

richardartoul (Contributor) left a comment:

Stamp to unblock, but a few things in there worth cleaning up.

Also:

  1. I really think we should have a small doc on how to set these if we don't already
  2. Any thoughts on tests for this? It's not obvious to me how you would write good tests for this at the coordinator level, so happy to merge for now, but something to think about (rough sketch below)
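
On (2), one option is a plain unit test over the header handling with net/http/httptest; a rough sketch, where parseSeriesLimit is a hypothetical standalone helper that mirrors the logic in this diff rather than the real builder API:

package handler_test

import (
	"net/http"
	"net/http/httptest"
	"strconv"
	"testing"
)

// parseSeriesLimit is a hypothetical helper that mirrors the header handling
// in this diff, used only to keep the sketch self-contained.
func parseSeriesLimit(r *http.Request, defaultLimit int) (int, error) {
	if str := r.Header.Get("M3-Limit-Max-Timeseries"); str != "" {
		return strconv.Atoi(str)
	}
	return defaultLimit, nil
}

func TestSeriesLimitHeaderOverridesDefault(t *testing.T) {
	req := httptest.NewRequest(http.MethodGet, "/api/v1/query", nil)
	req.Header.Set("M3-Limit-Max-Timeseries", "100")

	limit, err := parseSeriesLimit(req, 10000)
	if err != nil {
		t.Fatalf("unexpected parse error: %v", err)
	}
	if limit != 100 {
		t.Fatalf("expected limit 100, got %d", limit)
	}
}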

) (*storage.FetchOptions, *xhttp.ParseError) {
fetchOpts := storage.NewFetchOptions()
fetchOpts.Limit = b.opts.Limit
if str := req.Header.Get(LimitMaxTimeseriesHeader); str != "" {
Contributor:

Do we have any documentation right now on setting query limits and such? If not, it would be nice to add a simple document that outlines the YAML configs you can set as well as this header.

Collaborator Author:

Yeah, let me look around for it. @benraskin92 do you know of any?
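
In the meantime, a rough sketch of how these limits would be set and parsed with gopkg.in/yaml.v2; the struct name here is an assumption, only the field names and tags come from this diff:

package main

import (
	"fmt"

	"gopkg.in/yaml.v2"
)

// limitsConfiguration is an assumed name for the struct shown in this diff.
type limitsConfiguration struct {
	MaxFetchedDatapoints int64 `yaml:"maxFetchedDatapoints"`
	MaxTimeseries        int64 `yaml:"maxTimeseries"`
}

func main() {
	raw := []byte(`
maxFetchedDatapoints: 1000000
maxTimeseries: 10000
`)
	var cfg limitsConfiguration
	if err := yaml.Unmarshal(raw, &cfg); err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", cfg)
}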

DeprecatedHeader = "M3-Deprecated"

// LimitMaxTimeseriesHeader is the M3 limit timeseries header that limits
// the number of time series returned by each storage node.
LimitMaxTimeseriesHeader = "M3-Limit-Max-Timeseries"
Contributor:

I think M3-Limit-Max-Series, or just M3-Max-Series, would be easier to remember.

@@ -48,13 +48,15 @@ var (

// ListTagsHandler represents a handler for list tags endpoint.
type ListTagsHandler struct {
storage storage.Storage
nowFn clock.NowFn
storage storage.Storage
Contributor:

This new limit won't affect aggregate endpoints, will it?

Collaborator Author:

It will affect them too, yeah.

Collaborator Author:

But it will correctly respect the number of aggregate values returned, not the number of documents scanned.

query, parseBodyErr := h.parseBody(r)
opts, parseURLParamsErr := h.parseURLParams(r)
// NB(r): Use a loop here to avoid two err handling code paths
for _, rErr := range []*xhttp.ParseError{parseBodyErr, parseURLParamsErr} {
Contributor:

Took me longer to digest this than two error handling paths would have, and it possibly allocates? I would probably just do the normal thing here personally, or at least do

for _, rErr := range [2]*xhttp.ParseError{parseBodyErr, parseURLParamsErr} {
...
}

to be sure it does not allocate

Collaborator Author:

Sure, I might actually just do a helper function like I have in xerrors.FirstError (the only reason I didn't do that is that it expects func FirstError(err ...error) error, but these are typed errors).
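
Roughly this, as a sketch (it would sit next to the handler where xhttp is already imported; the name is hypothetical):

// firstParseError returns the first non-nil parse error, if any; a typed
// counterpart to xerrors.FirstError for *xhttp.ParseError values.
func firstParseError(errs ...*xhttp.ParseError) *xhttp.ParseError {
	for _, err := range errs {
		if err != nil {
			return err
		}
	}
	return nil
}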

}

if str := r.URL.Query().Get("limit"); str != "" {
if limit, err := strconv.Atoi(str); err == nil {
Contributor:

Honestly, if they pass the correct URL query param name but we can't parse it into a number, it should just return an error.

Collaborator Author:

Yeah agreed, will change the existing behavior to match that.
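
i.e. something along these lines inside parseURLParams (a sketch; the error message and the field being set are assumptions):

if str := r.URL.Query().Get("limit"); str != "" {
	limit, err := strconv.Atoi(str)
	if err != nil {
		// Reject the request rather than silently ignoring a bad limit value.
		return nil, xhttp.NewParseError(
			fmt.Errorf("could not parse limit %q: %v", str, err),
			http.StatusBadRequest)
	}
	opts.Limit = limit
}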

Options QueryContextOptions
}

// QueryContextOptions is a set of optionally set options for the query
Contributor:

QueryContextOptions contains optional configuration for the query context

queryContextOptions, poolWrapper)

var (
waitForStart = make(chan struct{}, 1)
Contributor:

Can you just make this a chan error and get rid of the startErr atomic stuff? Seems confusing for no reason.

// StartNewGrpcServer starts the server on the given address, then notifies the channel
func StartNewGrpcServer(
	server *grpc.Server,
	address string,
	waitForStart chan<- error,
) error {
	lis, err := net.Listen("tcp", address)
	if err != nil {
		waitForStart <- err
		return err
	}

	waitForStart <- nil
	return server.Serve(lis)
}

Collaborator Author:

The whole thing is kinda messed up; I wanted to introduce the least amount of changes possible.

I'll just break it up into two steps and make the grpc server take the listener.
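
Roughly like this (a sketch; the function name and logging are placeholders):

// startGrpcServer creates the listener first so that Listen errors are
// returned synchronously, then serves on it in the background.
func startGrpcServer(server *grpc.Server, address string) (net.Listener, error) {
	lis, err := net.Listen("tcp", address)
	if err != nil {
		return nil, err
	}
	go func() {
		if err := server.Serve(lis); err != nil {
			log.Printf("grpc serve error: %v", err)
		}
	}()
	return lis, nil
}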

wg sync.WaitGroup
)
startErr.Store(nil)
wg.Add(1)
Contributor:

I don't think you actually use this.

Collaborator Author:

Ta, I'm going to rip all this stuff out thankfully; good catch though.

codecov bot commented May 18, 2019

Codecov Report

Merging #1644 into master will increase coverage by 0.9%.
The diff coverage is 69.2%.


@@           Coverage Diff            @@
##           master   #1644     +/-   ##
========================================
+ Coverage    71.3%   72.2%   +0.9%     
========================================
  Files         962     962             
  Lines       80908   80103    -805     
========================================
+ Hits        57747   57903    +156     
+ Misses      19393   18409    -984     
- Partials     3768    3791     +23
Flag         Coverage Δ
#aggregator  82.3% <ø>     (-0.1%) ⬇️
#cluster     85.7% <ø>     (ø) ⬆️
#collector   63.9% <ø>     (ø) ⬆️
#dbnode      80.1% <ø>     (ø) ⬆️
#m3em        73.2% <ø>     (ø) ⬆️
#m3ninx      74.1% <ø>     (ø) ⬆️
#m3nsch      51.1% <ø>     (ø) ⬆️
#metrics     17.6% <ø>     (ø) ⬆️
#msg         74.7% <ø>     (-0.2%) ⬇️
#query       67.9% <69.2%> (+4.1%) ⬆️
#x           86.4% <ø>     (-0.1%) ⬇️

Continue to review full report at Codecov.

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1ec3b0b...5887d97.

@robskillington robskillington merged commit 9ce4743 into master May 18, 2019
@robskillington robskillington deleted the r/add-config-limit branch May 18, 2019 19:37