Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ts: Query rewrite part 2 #25587

Merged
merged 2 commits into from
May 22, 2018
Merged

Conversation

mrtracy
Copy link
Contributor

@mrtracy mrtracy commented May 16, 2018

Two commits, completing the rewrite of the query logic.

  • First commit adds support for "incomplete sample" filtering into the test model and the new query logic, adds tests for "incomplete sample" logic, swaps model tests to use the new query logic exclusively, and adds additional testing for downsampling in the new query logic.
  • Second commit swaps over the TS server to begin using the new query logic and completely removes the old query logic.

@mrtracy mrtracy requested review from bdarnell, tbg and a team May 16, 2018 19:08
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@bdarnell
Copy link
Contributor

:lgtm:


Review status: 0 of 11 files reviewed at latest revision, all discussions resolved, some commit checks failed.


pkg/ts/query.go, line 1 at r2 (raw file):

// Copyright 2018 The Cockroach Authors.

I assume this is just renamed from query_new.go without any interesting changes.


pkg/ts/query_test.go, line 579 at r1 (raw file):

}

// TestQueryNearCurrentTime

Remove this or complete it (Our lint policy doesn't require comments on test functions)


Comments from Reviewable

@mrtracy mrtracy force-pushed the mtracy/ts_query_rewrite_ii branch from 1cec0f5 to ad2fbcf Compare May 21, 2018 16:09
@mrtracy
Copy link
Contributor Author

mrtracy commented May 21, 2018

Review status: 0 of 11 files reviewed at latest revision, 2 unresolved discussions, some commit checks failed.


pkg/ts/query.go, line 1 at r2 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

I assume this is just renamed from query_new.go without any interesting changes.

Correct!


pkg/ts/query_test.go, line 579 at r1 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Remove this or complete it (Our lint policy doesn't require comments on test functions)

Done.


Comments from Reviewable

@mrtracy mrtracy force-pushed the mtracy/ts_query_rewrite_ii branch 2 times, most recently from 3715d06 to a59a253 Compare May 21, 2018 22:29
@mrtracy
Copy link
Contributor Author

mrtracy commented May 21, 2018

@bdarnell Through a test failure, I found a rather obvious bug with the way I was pre-allocating space for the result; the result space is no longer pre-allocated, and as a result I have changed the way that the memory monitor grows (moved closer to the allocation point). This might be different enough that you want to take another look.

@bdarnell
Copy link
Contributor

LGTM


Review status: 0 of 11 files reviewed at latest revision, 2 unresolved discussions, some commit checks failed.


Comments from Reviewable

Matt Tracy added 2 commits May 22, 2018 11:50
Completes the new query model (by adding support for near-current-time
filtering) and swaps tests for tsdb completely to the new query model.

+ Adds support for filtering data points for "incomplete" sample
periods, which had a tendency to create persistent artifacts on our time
series graphs. In the old query system this was being done using two
different methods, one for downsampled points and another for points
which are not being downsampled. The new model uses a single method for
filtering both types of points.
+ Adjust TS tests model to handle incomplete sample filtering.
+ Adds new tests for incomplete sample filtering, which were not present
at all for the old model.
+ Swaps over tsdb model tests to only use the new query model - this is
necessary because the new tests would not pass on the old model.
+ Adds new unit test for the downsampling code of the new query model.

Release note: None
Swap to new query implementation, removing the old query implementation
completely.

Release note: None
@mrtracy mrtracy force-pushed the mtracy/ts_query_rewrite_ii branch from a59a253 to 6c68ad6 Compare May 22, 2018 15:50
@mrtracy
Copy link
Contributor Author

mrtracy commented May 22, 2018

bors r+

craig bot pushed a commit that referenced this pull request May 22, 2018
25587: ts: Query rewrite part 2 r=mrtracy a=mrtracy

Two commits, completing the rewrite of the query logic.

+ First commit adds support for "incomplete sample" filtering into the test model and the new query logic, adds tests for "incomplete sample" logic, swaps model tests to use the new query logic exclusively, and adds additional testing for downsampling in the new query logic.
+ Second commit swaps over the TS server to begin using the new query logic and completely removes the old query logic.

25804: cmd/roachtest: add testSpec.MinVersion r=danhhz a=petermattis

Allow tests to specify the minimum cockroach version they require to
run. If the build version (auto-detected using git) is less than the min
version specified for the test, the test is skipped. For example, if
`upgrade/oldVersion=v2.0.0` (which has a min version of `v2.1.0`) is run
on the `release-2.0` branch we see:

    === RUN   upgrade/oldVersion=v2.0.0/nodes=5 [unstable,skip]
    --- SKIP: upgrade/oldVersion=v2.0.0/nodes=5 (0.00s)
        build-version (2.0.2) < min-version (2.1.0-0)

Fixes #25740

Release note: None

Co-authored-by: Matt Tracy <[email protected]>
Co-authored-by: Peter Mattis <[email protected]>
@craig
Copy link
Contributor

craig bot commented May 22, 2018

Build succeeded

@craig craig bot merged commit 6c68ad6 into cockroachdb:master May 22, 2018
@mrtracy mrtracy deleted the mtracy/ts_query_rewrite_ii branch May 22, 2018 17:35
craig bot pushed a commit that referenced this pull request Jun 5, 2018
25789: ts: Implement new on-disk format for time series r=mrtracy a=mrtracy

The first commit is #25587 and can be ignored for this PR.


Implement a new *columnar* on-disk format for time series samples,
replacing the previous row-like format.

Previously, each slab of time series data contained a collection of
"samples", where each sample was a message containing a number of
fields; the timestamp (offset), and a number of value fields intended to
contain "rolled-up" data for long-term, low resolution storage.

The new format is columnar; the top-level slab contains multiple
parallel arrays, with each array containing the ordered values for the
individual samples.

This gives us the following advantages:

This has a columnar layout, made up of parallel arrays. This gives us
all of the advantages we were missing in the previous layout:

+ High-resolution data can leave the aggregate fields completely empty;
only the "last" aggregate fields. This will reduce the in-memory size of
each sample from the full Sample structure down to a int32 and a
float64. This also means we can add several more aggregates without
inflating the size of each sample.
+ The columnar format takes advantage of protobuffer repeated field
packing, which should save considerable space for the encoded on-disk
format.
+ When querying, we can iterate directly over the data fields we need,
which may improve data locality (and thus cache-miss performance) or
allow us to more aggressively release memory for data that is not
needed.

This commit does the following:

+ Define new columnar fields in roachpb/internal.proto.
+ Write new C++ merge logic for the columnar fields. This does not
replace the existing row logic; it lives alongside it.
+ Create an upgrade path in the merge logic for row-formatted slabs;
this occurs whenever columnar data is merged into an existing key with
row-formatted data. This ensures than any individual slab of data will
contain only row-formatted samples or column-formatted samples, but
never both.

Release note: none.

Co-authored-by: Matt Tracy <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants