Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

opt: SimplifyRootOrdering should be an essential rule #84067

Closed
cockroach-teamcity opened this issue Jul 8, 2022 · 7 comments · Fixed by #84194
Closed

opt: SimplifyRootOrdering should be an essential rule #84067

cockroach-teamcity opened this issue Jul 8, 2022 · 7 comments · Fixed by #84194
Assignees
Labels
branch-master Failures and bugs on the master branch. C-test-failure Broken test (automatically or manually discovered). O-roachtest O-robot Originated from a bot. T-sql-queries SQL Queries Team
Milestone

Comments

@cockroach-teamcity
Copy link
Member

cockroach-teamcity commented Jul 8, 2022

roachtest.unoptimized-query-oracle/disable-rules=half failed with artifacts on master @ 88789feea153b61b379c0240a07482bdfaf0c932:

test artifacts and logs in: /artifacts/unoptimized-query-oracle/disable-rules=half/run_1
	query_comparison_util.go:237,query_comparison_util.go:69,unoptimized_query_oracle.go:51,test_runner.go:896: . 8358 statements run: expected unoptimized and optimized results to be equal
		(1) attached stack trace
		  -- stack trace:
		  | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.(*queryComparisonHelper).makeError
		  | 	github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/query_comparison_util.go:345
		  | [...repeated from below...]
		Wraps: (2) . 8358 statements run
		Wraps: (3) attached stack trace
		  -- stack trace:
		  | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.runUnoptimizedQueryOracleImpl
		  | 	github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/unoptimized_query_oracle.go:154
		  | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.registerUnoptimizedQueryOracle.func1.1
		  | 	github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/unoptimized_query_oracle.go:54
		  | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.runOneRoundQueryComparison
		  | 	github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/query_comparison_util.go:236
		  | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.runQueryComparison
		  | 	github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/query_comparison_util.go:69
		  | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.registerUnoptimizedQueryOracle.func1
		  | 	github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/unoptimized_query_oracle.go:51
		  | main.(*testRunner).runTest.func2
		  | 	main/pkg/cmd/roachtest/test_runner.go:896
		  | runtime.goexit
		  | 	GOROOT/src/runtime/asm_amd64.s:1581
		Wraps: (4) expected unoptimized and optimized results to be equal
		  |   []string{
		  |   	",0",
		  |   	",0",
		  | - 	"j\x1bK\x1bgcj\x1bK\x1bgcj\x1bK\x1bgcj\x1bK\x1bgc,0",
		  | + 	",NULL",
		  | - 	"j\x1bK\x1bgcj\x1bK\x1bgcj\x1bK\x1bgcj\x1bK\x1bgcj\x1bK\x1bgcj\x1bK\x1bgcj\x1bK\x1bgcj\x1bK\x1bgc,0",
		  | + 	",NULL",
		  | + 	"j\x1bK\x1bgc,0",
		  | + 	"j\x1bK\x1bgcj\x1bK\x1bgc,NULL",
		  | + 	"j\x1bK\x1bgcj\x1bK\x1bgcj\x1bK\x1bgc,NULL",
		  | + 	"j\x1bK\x1bgcj\x1bK\x1bgcj\x1bK\x1bgcj\x1bK\x1bgcj\x1bK\x1bgcj\x1bK\x1bgc,0",
		  |   }
		  | sql: SELECT
		  | 	concat_agg(tab_10456.col1_14::STRING)::STRING AS col_36431, tab_10456.col1_10 AS col_36432
		  | FROM
		  | 	defaultdb.public.table1@[0] AS tab_10456
		  | WHERE
		  | 	(NOT (_st_crosses('010400008008000000010100008048DC870C2A74DC4134065AF5B54ADCC1505B10E58547F8410101000080D0F97954056EEEC10054369F301CA3C122F0575FB532E9C10101000080F0711B5BD5B3C5C178E0F60AAB5BF4417C994EF18C9EF8C10101000080B2A9C79D74AA00C2F0B1E094207AC1C1BE4CFBD28239E4C10101000080C04C0197A056DEC172DE23B5A1D6F5C1283F8E78EFCA0042010100008048225B53CB84E141BC54514BF58CFBC1C0E887F7F657FAC1010100008072ADDA67FFD3F8C1BC9BEF99C94BD8C1F88F4DB72400E5410101000080CC0B61411F13E841A8A57C14789001C2EAF49BF8901E02C2':::GEOMETRY::GEOMETRY, '010500000002000000010200000003000000000000000000244000000000000024400000000000003440000000000000344000000000000024400000000000004440010200000004000000000000000000444000000000000044400000000000003E400000000000003E40000000000000444000000000000034400000000000003E400000000000002440':::GEOMETRY::GEOMETRY)::BOOL OR false))
		  | GROUP BY
		  | 	tab_10456.col1_13, tab_10456.col1_10, tab_10456.col1_14
		  | ORDER BY
		  | 	tab_10456.col1_14 ASC, tab_10456.col1_13 DESC, tab_10456.col1_13 DESC
		Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *withstack.withStack (4) *errutil.leafError

Parameters: ROACHTEST_cloud=gce , ROACHTEST_cpu=4 , ROACHTEST_ssd=0

Help

See: roachtest README

See: How To Investigate (internal)

/cc @cockroachdb/sql-queries

This test on roachdash | Improve this report!

Epic: CRDB-14914

Jira issue: CRDB-17442

@cockroach-teamcity cockroach-teamcity added branch-master Failures and bugs on the master branch. C-test-failure Broken test (automatically or manually discovered). O-roachtest O-robot Originated from a bot. release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. labels Jul 8, 2022
@cockroach-teamcity cockroach-teamcity added this to the 22.2 milestone Jul 8, 2022
@blathers-crl blathers-crl bot added the T-sql-queries SQL Queries Team label Jul 8, 2022
@cockroach-teamcity

This comment was marked as duplicate.

@cockroach-teamcity

This comment was marked as off-topic.

@cockroach-teamcity

This comment was marked as off-topic.

@mgartner
Copy link
Collaborator

Unfortunately, there appears to be a bug in SET testing_optimizer_random_seed = ... and SET testing_optimizer_disable_rule_probability = ... which prevents those settings from affecting EXPLAIN queries. So, in the output of the first failure, the plans look the same even though they are different. EXPLAIN ANALYZE confirms that. I'm going to try to fix these settings first.

@mgartner mgartner self-assigned this Jul 11, 2022
@mgartner mgartner changed the title roachtest: unoptimized-query-oracle/disable-rules=half failed opt: SimplifyRootOrdering should be an essential rule Jul 11, 2022
craig bot pushed a commit that referenced this issue Jul 11, 2022
82161: ui: Add Jest as test runner to DB Console r=nathanstilwell a=nathanstilwell

DB Console is the last place Cockroach Labs is using a test runner other than [Jest](https://jestjs.io/). This PR adds Jest as the test runner intended to replace [Mocha](https://mochajs.org/). Mocha runs in a headless browser via [Karma](https://karma-runner.github.io/latest/index.html) whereas Jest will run tests in NodeJS and simulate a browser environment using [jsdom](https://github.com/jsdom/jsdom). Due to this change in environment, you will see not only files to set up the Jest test runner, but changes to some tests, some mocking of browser globals that are not included in jsdom by default, and some configuration adjustment to the `tsconfig.json`.

Since configuration changes are infrequent and are highly contextual, we decided to err on the side of verbose inline documentation in configuration files.

Details about individual changes to configs or tests are documented in commit messages.

84068: streamingccl: fix span use-after-finish in ingestion frontier r=samiskin a=stevendanna

This fixes the following use-after-finish panic:

    panic: use of Span after Finish. Span: ingestfntr. Finish previously
    called at: <stack not captured. Set debugUseAfterFinish>

    goroutine 1617744 [running]:
    github.com/cockroachdb/cockroach/pkg/util/tracing.(*Span).detectUseAfterFinish(0xc002c5f180)
    	github.com/cockroachdb/cockroach/pkg/util/tracing/span.go:186 +0x279
    github.com/cockroachdb/cockroach/pkg/util/tracing.(*Tracer).startSpanGeneric(0xc0138b2a50, {0x72291c0, 0xc019ebbe40}, {0x68e4eb2, 0x1d}, {{0x0}, 0x0, {0x0, 0x0, {{0x0, ...}, ...}, ...}, ...})
    	github.com/cockroachdb/cockroach/pkg/util/tracing/tracer.go:1207 +0x997
    github.com/cockroachdb/cockroach/pkg/util/tracing.(*Tracer).StartSpanCtx(0xc0138b2a50, {0x72291c0, 0xc019ebbe40}, {0x68e4eb2, 0x1d}, {0xc010cbb760, 0x1, 0x1})
    	github.com/cockroachdb/cockroach/pkg/util/tracing/tracer.go:1062 +0x1a7
    github.com/cockroachdb/cockroach/pkg/util/tracing.ChildSpan({0x72291c0, 0xc019ebbe40}, {0x68e4eb2, 0x1d})
    	github.com/cockroachdb/cockroach/pkg/util/tracing/tracer.go:1577 +0x145
    github.com/cockroachdb/cockroach/pkg/ccl/streamingccl/streamclient.(*partitionedStreamClient).Heartbeat(0xc00b311080, {0x72291c0, 0xc019ebbe40}, 0xac93aa873318001, {0x16ffc388c7d7c42a, 0x0, 0x0})
    	github.com/cockroachdb/cockroach/pkg/ccl/streamingccl/streamclient/partitioned_stream_client.go:83 +0xb7
    github.com/cockroachdb/cockroach/pkg/ccl/streamingccl/streamingest.(*heartbeatSender).maybeHeartbeat(0xc01980f880, {0x72291c0, 0xc019ebbe40}, {0x16ffc388c7d7c42a, 0x0, 0x0})
    	github.com/cockroachdb/cockroach/pkg/ccl/streamingccl/streamingest/stream_ingestion_frontier_processor.go:180 +0x250
    github.com/cockroachdb/cockroach/pkg/ccl/streamingccl/streamingest.(*heartbeatSender).startHeartbeatLoop.func1.1()
    	github.com/cockroachdb/cockroach/pkg/ccl/streamingccl/streamingest/stream_ingestion_frontier_processor.go:207 +0x41b
    github.com/cockroachdb/cockroach/pkg/ccl/streamingccl/streamingest.(*heartbeatSender).startHeartbeatLoop.func1({0x72291c0, 0xc019ebbe40})
    	github.com/cockroachdb/cockroach/pkg/ccl/streamingccl/streamingest/stream_ingestion_frontier_processor.go:227 +0xb5
    github.com/cockroachdb/cockroach/pkg/util/ctxgroup.Group.GoCtx.func1()
    	github.com/cockroachdb/cockroach/pkg/util/ctxgroup/ctxgroup.go:169 +0x52
    golang.org/x/sync/errgroup.(*Group).Go.func1()
    	golang.org/x/sync/errgroup/external/org_golang_x_sync/errgroup/errgroup.go:74 +0xb4
    created by golang.org/x/sync/errgroup.(*Group).Go
    	golang.org/x/sync/errgroup/external/org_golang_x_sync/errgroup/errgroup.go:71 +0xdd
    I220708 05:29:42.017167 1 (gostd) testmain.go:90  [-] 1  Test //pkg/ccl/streamingccl/streamingest:streamingest_test exited with error code 2

The use after finish was caused by goroutines in the ingestion
frontier processor that lived past a call to
(*ProcessorBase).InternalClose, which finishes the span attached to
the context passed to the processor in Start.

To address this we:

- ensure that we stop our heartbeat thread before calling
  InternalClose in ConsumerClosed, and

- provide a TrailingMetaCallback so that we can perform cleaned when
  DrainHelper() is called. When a TrailingMetaCallback is provided,
  DrainHelper() calls it instead of InternalClose(), allowing us to
  correctly clean up before the span is closed.

- use context cancellation rather than a channel to control the
  heartbeat loop exit to avoid having to deal with avoiding
  double-closes of the channel.

Fixes #84054

Release note: None

84100: kvserver: Clean up empty range directories after snapshots r=nicktrav a=itsbilal

Previously, we were creating subdirectories for ranges and
range snapshots in the auxiliary directory every time we
accepted a snapshot, but only cleaning up the snapshot
subdirectories after a snapshot scratch space closed. This
left empty parent range directories around on the FS,
slowing down future calls to Pebble.Capacity() and indirectly
slowing down AddSSTable in the future.

This change adds code to clean up empty range directories
in the aux directory if they're not being used. Some coordination
and synchronization code had to be added to ensure we wouldn't
remove a directory that was just created by a concurrent snapshot.

Fixes #83137 

Release note (bug fix, performance improvement): Addresses issue where
imports and rebalances were being slowed down due to the accumulation
of empty directories from range snapshot applications.

84170: sql/sqlstats: record QuerySummary when merging stats r=ericharmeling a=stevendanna

During execution of a transaction, all statement statistics are
collected in an struct local to that transaction, and then flushed to
the main ApplicationStats container when the transaction finishes.

Previously, when flushing, we failed to copy the QuerySummary field,
leading to `metadata->'querySummary'` from being empty in most cases.

Prior to ce1b42b this only affected
statements in an explicit transaction. After that commit, it affected
all statements.

Release note (bug fix): Fix a bug that led to the querySummary field
in crdb_internal.statements_statistics's metadata column being empty.

84194: opt: mark SimplifyRootOrdering as an essential rule r=mgartner a=mgartner

The unoptimized query oracle, which disables rules, found a bug in the
execution engine that is only possible to hit if the
`SimplifyRootOrdering` rule is disabled (see #84191). Until the bug is
fixed, we mark the rule as essential so that it is not disabled by these
tests.

Fixes #84067

Release note: None

Co-authored-by: Nathan Stilwell <[email protected]>
Co-authored-by: Sean Barag <[email protected]>
Co-authored-by: Steven Danna <[email protected]>
Co-authored-by: Bilal Akhtar <[email protected]>
Co-authored-by: Marcus Gartner <[email protected]>
@craig craig bot closed this as completed in 37ed376 Jul 11, 2022
@craig craig bot closed this as completed in #84194 Jul 11, 2022
@mgartner mgartner removed the release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. label Jul 13, 2022
@rytaft
Copy link
Collaborator

rytaft commented Jul 17, 2022

Unfortunately, there appears to be a bug in SET testing_optimizer_random_seed = ... and SET testing_optimizer_disable_rule_probability = ... which prevents those settings from affecting EXPLAIN queries. So, in the output of the first failure, the plans look the same even though they are different. EXPLAIN ANALYZE confirms that. I'm going to try to fix these settings first.

What bug are you seeing? It seems to be working as far as I can tell...

@mgartner
Copy link
Collaborator

Oops, forgot to update after my latest findings. I had mistakenly come to the conclusion that the settings weren't working, but later realized that a "bug" in EXPLAIN was confusing me. The plans shown in EXPLAIN and EXPLAIN ANALYZE can be different when the SimplifyRootOrdering rule is disabled. I created an issue to track this: #84577.

@rytaft
Copy link
Collaborator

rytaft commented Jul 18, 2022

Great, thanks!

@mgartner mgartner moved this to Done in SQL Queries Jul 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
branch-master Failures and bugs on the master branch. C-test-failure Broken test (automatically or manually discovered). O-roachtest O-robot Originated from a bot. T-sql-queries SQL Queries Team
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

3 participants