-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ts: Query rewrite part 2 #25587
ts: Query rewrite part 2 #25587
Conversation
Review status: 0 of 11 files reviewed at latest revision, all discussions resolved, some commit checks failed. pkg/ts/query.go, line 1 at r2 (raw file):
I assume this is just renamed from pkg/ts/query_test.go, line 579 at r1 (raw file):
Remove this or complete it (Our lint policy doesn't require comments on test functions) Comments from Reviewable |
1cec0f5
to
ad2fbcf
Compare
Review status: 0 of 11 files reviewed at latest revision, 2 unresolved discussions, some commit checks failed. pkg/ts/query.go, line 1 at r2 (raw file): Previously, bdarnell (Ben Darnell) wrote…
Correct! pkg/ts/query_test.go, line 579 at r1 (raw file): Previously, bdarnell (Ben Darnell) wrote…
Done. Comments from Reviewable |
3715d06
to
a59a253
Compare
@bdarnell Through a test failure, I found a rather obvious bug with the way I was pre-allocating space for the result; the result space is no longer pre-allocated, and as a result I have changed the way that the memory monitor grows (moved closer to the allocation point). This might be different enough that you want to take another look. |
LGTM Review status: 0 of 11 files reviewed at latest revision, 2 unresolved discussions, some commit checks failed. Comments from Reviewable |
Completes the new query model (by adding support for near-current-time filtering) and swaps tests for tsdb completely to the new query model. + Adds support for filtering data points for "incomplete" sample periods, which had a tendency to create persistent artifacts on our time series graphs. In the old query system this was being done using two different methods, one for downsampled points and another for points which are not being downsampled. The new model uses a single method for filtering both types of points. + Adjust TS tests model to handle incomplete sample filtering. + Adds new tests for incomplete sample filtering, which were not present at all for the old model. + Swaps over tsdb model tests to only use the new query model - this is necessary because the new tests would not pass on the old model. + Adds new unit test for the downsampling code of the new query model. Release note: None
Swap to new query implementation, removing the old query implementation completely. Release note: None
a59a253
to
6c68ad6
Compare
bors r+ |
25587: ts: Query rewrite part 2 r=mrtracy a=mrtracy Two commits, completing the rewrite of the query logic. + First commit adds support for "incomplete sample" filtering into the test model and the new query logic, adds tests for "incomplete sample" logic, swaps model tests to use the new query logic exclusively, and adds additional testing for downsampling in the new query logic. + Second commit swaps over the TS server to begin using the new query logic and completely removes the old query logic. 25804: cmd/roachtest: add testSpec.MinVersion r=danhhz a=petermattis Allow tests to specify the minimum cockroach version they require to run. If the build version (auto-detected using git) is less than the min version specified for the test, the test is skipped. For example, if `upgrade/oldVersion=v2.0.0` (which has a min version of `v2.1.0`) is run on the `release-2.0` branch we see: === RUN upgrade/oldVersion=v2.0.0/nodes=5 [unstable,skip] --- SKIP: upgrade/oldVersion=v2.0.0/nodes=5 (0.00s) build-version (2.0.2) < min-version (2.1.0-0) Fixes #25740 Release note: None Co-authored-by: Matt Tracy <[email protected]> Co-authored-by: Peter Mattis <[email protected]>
Build succeeded |
25789: ts: Implement new on-disk format for time series r=mrtracy a=mrtracy The first commit is #25587 and can be ignored for this PR. Implement a new *columnar* on-disk format for time series samples, replacing the previous row-like format. Previously, each slab of time series data contained a collection of "samples", where each sample was a message containing a number of fields; the timestamp (offset), and a number of value fields intended to contain "rolled-up" data for long-term, low resolution storage. The new format is columnar; the top-level slab contains multiple parallel arrays, with each array containing the ordered values for the individual samples. This gives us the following advantages: This has a columnar layout, made up of parallel arrays. This gives us all of the advantages we were missing in the previous layout: + High-resolution data can leave the aggregate fields completely empty; only the "last" aggregate fields. This will reduce the in-memory size of each sample from the full Sample structure down to a int32 and a float64. This also means we can add several more aggregates without inflating the size of each sample. + The columnar format takes advantage of protobuffer repeated field packing, which should save considerable space for the encoded on-disk format. + When querying, we can iterate directly over the data fields we need, which may improve data locality (and thus cache-miss performance) or allow us to more aggressively release memory for data that is not needed. This commit does the following: + Define new columnar fields in roachpb/internal.proto. + Write new C++ merge logic for the columnar fields. This does not replace the existing row logic; it lives alongside it. + Create an upgrade path in the merge logic for row-formatted slabs; this occurs whenever columnar data is merged into an existing key with row-formatted data. This ensures than any individual slab of data will contain only row-formatted samples or column-formatted samples, but never both. Release note: none. Co-authored-by: Matt Tracy <[email protected]>
Two commits, completing the rewrite of the query logic.