docs: add initial design overview #2

Merged
fabxc merged 1 commit into fabxc-init from design
Nov 2, 2017


fabxc
Collaborator

@fabxc fabxc commented Nov 2, 2017

A first version of the overall design.

I also added a cost estimate at the bottom. Please don't quote me on it; the instance types in particular are a completely wild guess. But since they account for a relatively small percentage of the total, the estimate might still be close. Resources for the collecting Prometheus servers are not included.

@fabxc fabxc merged commit 763c651 into fabxc-init Nov 2, 2017
@bwplotka bwplotka deleted the design branch November 17, 2017 16:27
povilasv added a commit to povilasv/thanos that referenced this pull request Aug 28, 2018
GiedriusS referenced this pull request in GiedriusS/thanos Feb 9, 2019
GiedriusS added a commit that referenced this pull request Mar 23, 2019
…798)

* store: add ability to limit max samples / conc. queries

* store/bucket: account for the RawChunk case

Convert raw chunks into XOR encoded chunks and call the NumSamples()
method on them to calculate the number of samples. Rip out the samples
calculation into a different function because it is used in two
different places.

* store/bucket_e2e_test: adjust sample limit size

It should actually be 30 - I miscalculated this.

* store/bucket: add metric thanos_bucket_store_queries_limited_total

* store/bucket: register queriesLimited metric

* store: make changes according to the review comments

* docs/store: update

* store: gating naming changes, add span/extra metric

* store: improve error messages

* store/limiter: improve error messages

* store/gate: time -> seconds

* store/bucket_e2e_test: narrow down the first query

* store/bucket: check for negative maxConcurrent

* cmd/store: clarify help message

* pkg/store: hook thanos_bucket_store_queries_limited into Limiter

* store/bucket_test: fix NewBucketStore call

* docs: update again

* store/gate: spelling fix

* store/gate: spelling fix #2

* store/bucket: remove pointless newline

* store/gate: generalize gate timing

Make the metric show in general how much time it takes for queries to
wait at the gate.

* store/gate: convert the g.gateTiming metric into a histogram

* store/bucket: change comment wording

* store/bucket: remove type from maxSamplesPerChunk

Let Go decide by itself what kind of type it needs.

* store/bucket: rename metric into thanos_bucket_store_queries_dropped

* thanos/store: clarify help message

Literally explain what it means in the help message so that it would be
clearer.

* store/gate: rename metric to thanos_bucket_store_queries_in_flight

More fitting as decided by everyone.

* store/gate: fix MustRegister() call

* docs: update

* store/bucket: clarify the name of the span

Make it clearer what it is for.

* store/bucket: inline calculation into the function call

No need to create an extra variable in a hot path in the code if we can
inline it and it will be just as clear.

* CHANGELOG: add item about this

* store/gate: reduce number of buckets

* store/bucket: rename metric to thanos_bucket_store_queries_dropped_total

* store/bucket: move defer out of code block

* store/gate: generalize gate for different kinds of subsystems

* store/limiter: remove non-nil check

* CHANGELOG: fixes

* store/limiter: convert failedCounter to non-ptr

* store/limiter: remove invalid comment

* *: update according to review comments

* CHANGELOG: update

* *: fix according to review

* *: fix according to review

* *: make docs

* CHANGELOG: clean up

* CHANGELOG: update

* *: queries_in_flight_total -> queries_in_flight

* store/bucket: do not wrap samplesLimiter error

The original error already informs us about what is going wrong.

* store/bucket: err -> errors.Wrap

It's still useful to know that we are talking about samples here
exactly.

* store: make store.grpc.series-max-concurrency 20 by default

Setting it to 0 by default doesn't make sense, since the Go channel
becomes unbuffered and all queries will time out. Set it to 20 by default,
since that's the limit on Thanos Query, so there won't be more than 20
by default anyway.

* CHANGELOG: add warning about new limit
bwplotka pushed a commit that referenced this pull request Mar 31, 2019
paulfantom pushed a commit to paulfantom/thanos that referenced this pull request Jul 22, 2019
vendor: add vendor directory as we need self-contained code for OSBS
akanshat referenced this pull request in akanshat/thanos Jan 6, 2022
move cache TTLs to a struct
GiedriusS added a commit that referenced this pull request Jul 21, 2022
cache: cache locally in memcached as a PoC
LeviHarrison pushed a commit to LeviHarrison/thanos that referenced this pull request Aug 23, 2022
jnyi referenced this pull request in databricks/thanos Mar 31, 2023
charlottexl pushed a commit to charlottexl/thanos that referenced this pull request May 31, 2024
Rule do not turn off if resolving fails
2 participants