roachtest: -0 is not converted to 0, causing spurious tlp failure #77279

cockroach-teamcity · 2022-03-02T14:42:52Z

roachtest.tlp failed with artifacts on master @ e151a24dc1d495bf694e1e1c3ee8acb518c1424e:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /artifacts/tlp/run_1
	tlp.go:170,tlp.go:76,test_runner.go:779: expected unpartitioned and partitioned results to be equal
		(1) attached stack trace
		  -- stack trace:
		  | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.runTLPQuery.func2
		  | 	github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/tlp.go:243
		  | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.runWithTimeout.func1
		  | 	github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/tlp.go:268
		  | runtime.goexit
		  | 	GOROOT/src/runtime/asm_amd64.s:1581
		Wraps: (2) expected unpartitioned and partitioned results to be equal
		  |   []string{
		  | + 	"-0",
		  |   	"-0.08327121444050486",
		  |   	"-0.3325643758917869",
		  |   	... // 3 identical elements
		  |   	"-1.029317916464786",
		  |   	"-2.5910802100934904",
		  | - 	"0",
		  |   	"0.34983588080578065",
		  |   	"0.4833675094669227",
		  |   	... // 4 identical elements
		  |   }
		  | sql: SELECT DISTINCT tab_43377.col1_0 FROM defaultdb.public.table1 AS tab_43377
		  | (SELECT DISTINCT tab_43377.col1_0 FROM defaultdb.public.table1 AS tab_43377 WHERE (tab_43377.col1_2 <= tab_43377.col1_2)) UNION (SELECT DISTINCT tab_43377.col1_0 FROM defaultdb.public.table1 AS tab_43377 WHERE NOT ((tab_43377.col1_2 <= tab_43377.col1_2))) UNION (SELECT DISTINCT tab_43377.col1_0 FROM defaultdb.public.table1 AS tab_43377 WHERE ((tab_43377.col1_2 <= tab_43377.col1_2)) IS NULL)
		  | with args: []
		Error types: (1) *withstack.withStack (2) *errutil.leafError

Help

See: roachtest README

See: How To Investigate (internal)

/cc @cockroachdb/sql-queries _{This test on roachdash | Improve this report!

Jira issue: CRDB-13514}

The text was updated successfully, but these errors were encountered:

cockroach-teamcity · 2022-03-04T15:03:14Z

roachtest.tlp failed with artifacts on master @ 0e1edf3ad9da29b1d3bb28fcfc0f293d700e3cb9:

		  |   	"-0.6010823668975593,NULL,2003783032,fwAUY\x11,0,3250010446,2014-07-"...,
		  | - 	"-0.6010823668975593,NULL,2003783032,fwAUY\x11,0,3250010446,2014-07-27 14:09:05.000551 +0000 +0000,NULL,1101431691,\x1b\x1c\x05n\x1eu,\x03#~,',NULL,0.1756910732812027,0,32121,NULL,p,NULL,4.681196838114059080E+25,false,46811968381140590800492426.9218368489017,\x02\x1e\x03wa),-0.911081"...,
		  |   	"-0.6010823668975593,NULL,2003783032,fwAUY\x11,0,3250010446,2014-07-"...,
		  |   	"-0.6010823668975593,NULL,2003783032,fwAUY\x11,0,3250010446,2014-07-"...,
		  |   	"-0.6010823668975593,NULL,2003783032,fwAUY\x11,0,3250010446,2014-07-"...,
		  | + 	"-1.612266124236562,#+z 7\x02nX,457251543,\u007fDF:\"\x0e,NULL,2503567087,1972-08-02 14:17:37.000088 +0000 +0000,3257923258,2153347998,\n\x1a=\x10\x1au \b,\x03#~,zP\x1b,\x17d*,#+z 7\x02nx,-0.8354926840577999,-0,32121,NULL,p,NULL,4.681196838114059080E+25,false,46811968381140590800492426.92183"...,
		  |   	"-1.612266124236562,#+z 7\x02nX,457251543,\u007fDF:\"\x0e,NULL,2503567087,197"...,
		  |   	"-1.612266124236562,#+z 7\x02nX,457251543,\u007fDF:\"\x0e,NULL,2503567087,197"...,
		  |   	"-1.612266124236562,#+z 7\x02nX,457251543,\u007fDF:\"\x0e,NULL,2503567087,197"...,
		  | - 	"-1.612266124236562,#+z 7\x02nX,457251543,\u007fDF:\"\x0e,NULL,2503567087,1972-08-02 14:17:37.000088 +0000 +0000,3257923258,2153347998,\n\x1a=\x10\x1au \b,\x03#~,zP\x1b,\x17d*,#+z 7\x02nx,-0.8354926840577999,0,32121,NULL,p,NULL,4.681196838114059080E+25,false,46811968381140590800492426.921836"...,
		  |   	"-1.612266124236562,#+z 7\x02nX,457251543,\u007fDF:\"\x0e,NULL,2503567087,197"...,
		  |   	"-1.612266124236562,#+z 7\x02nX,457251543,\u007fDF:\"\x0e,NULL,2503567087,197"...,
		  |   	"-1.612266124236562,#+z 7\x02nX,457251543,\u007fDF:\"\x0e,NULL,2503567087,197"...,
		  | + 	"0,NULL,NULL,\x13\a,3087847003,3837798169,1972-08-02 14:17:37.000088 +0000 +0000,1985245865,0,\x1b\x1c\x05n\x1eu,\x03#~,',NULL,0.776773440178762,-0,32121,NULL,p,NULL,4.681196838114059080E+25,false,46811968381140590800492426.9218368489017,\x02\x1e\x03wa),-0.9110816121101379,1.046212196"...,
		  |   	"0,NULL,NULL,\x13\a,3087847003,3837798169,1972-08-02 14:17:37.000088 "...,
		  |   	"0,NULL,NULL,\x13\a,3087847003,3837798169,1972-08-02 14:17:37.000088 "...,
		  |   	"0,NULL,NULL,\x13\a,3087847003,3837798169,1972-08-02 14:17:37.000088 "...,
		  | - 	"0,NULL,NULL,\x13\a,3087847003,3837798169,1972-08-02 14:17:37.000088 +0000 +0000,1985245865,0,\x1b\x1c\x05n\x1eu,\x03#~,',NULL,0.776773440178762,0,32121,NULL,p,NULL,4.681196838114059080E+25,false,46811968381140590800492426.9218368489017,\x02\x1e\x03wa),-0.9110816121101379,1.0462121963"...,
		  |   	"0,NULL,NULL,\x13\a,3087847003,3837798169,1972-08-02 14:17:37.000088 "...,
		  |   	"0,NULL,NULL,\x13\a,3087847003,3837798169,1972-08-02 14:17:37.000088 "...,
		  |   	"0,NULL,NULL,\x13\a,3087847003,3837798169,1972-08-02 14:17:37.000088 "...,
		  | + 	"0,NULL,NULL,P@,NULL,1733338272,1972-08-02 14:17:37.000088 +0000 +0000,NULL,1833907620,\n\x1a=\x10\x1au \b,\x03#~,zP\x1b,\x17d*,NULL,0.776773440178762,-0,32121,NULL,p,NULL,4.681196838114059080E+25,false,46811968381140590800492426.9218368489017,\x02\x1e\x03wa),-0.9110816121101379,1.0462"...,
		  |   	"0,NULL,NULL,P@,NULL,1733338272,1972-08-02 14:17:37.000088 +0000 "...,
		  |   	"0,NULL,NULL,P@,NULL,1733338272,1972-08-02 14:17:37.000088 +0000 "...,
		  |   	"0,NULL,NULL,P@,NULL,1733338272,1972-08-02 14:17:37.000088 +0000 "...,
		  | - 	"0,NULL,NULL,P@,NULL,1733338272,1972-08-02 14:17:37.000088 +0000 +0000,NULL,1833907620,\n\x1a=\x10\x1au \b,\x03#~,zP\x1b,\x17d*,NULL,0.776773440178762,0,32121,NULL,p,NULL,4.681196838114059080E+25,false,46811968381140590800492426.9218368489017,\x02\x1e\x03wa),-0.9110816121101379,1.04621"...,
		  |   	"0,NULL,NULL,P@,NULL,1733338272,1972-08-02 14:17:37.000088 +0000 "...,
		  |   	"0,NULL,NULL,P@,NULL,1733338272,1972-08-02 14:17:37.000088 +0000 "...,
		  |   	"0,NULL,NULL,P@,NULL,1733338272,1972-08-02 14:17:37.000088 +0000 "...,
		  | + 	"0.7252422224111514,NULL,NULL,\x1aYIX],NULL,1115311939,1972-08-02 14:17:37.000088 +0000 +0000,NULL,NULL,\n\x1a=\x10\x1au \b,\x03#~,zP\x1b,\x17d*,NULL,1.5020156625899135,-0,32121,NULL,p,NULL,4.681196838114059080E+25,false,46811968381140590800492426.9218368489017,\x02\x1e\x03wa),-0.91108161"...,
		  |   	"0.7252422224111514,NULL,NULL,\x1aYIX],NULL,1115311939,1972-08-02 14"...,
		  |   	"0.7252422224111514,NULL,NULL,\x1aYIX],NULL,1115311939,1972-08-02 14"...,
		  |   	"0.7252422224111514,NULL,NULL,\x1aYIX],NULL,1115311939,1972-08-02 14"...,
		  | - 	"0.7252422224111514,NULL,NULL,\x1aYIX],NULL,1115311939,1972-08-02 14:17:37.000088 +0000 +0000,NULL,NULL,\n\x1a=\x10\x1au \b,\x03#~,zP\x1b,\x17d*,NULL,1.5020156625899135,0,32121,NULL,p,NULL,4.681196838114059080E+25,false,46811968381140590800492426.9218368489017,\x02\x1e\x03wa),-0.911081612"...,
		  |   	"0.7252422224111514,NULL,NULL,\x1aYIX],NULL,1115311939,1972-08-02 14"...,
		  |   	"0.7252422224111514,NULL,NULL,\x1aYIX],NULL,1115311939,1972-08-02 14"...,
		  |   	"0.7252422224111514,NULL,NULL,\x1aYIX],NULL,1115311939,1972-08-02 14"...,
		  | + 	"NaN,M\x0e\x06w\x18}bE#,1030807826,\x12,2859333651,872033927,1989-08-06 19:33:34.000269 +0000 +0000,NULL,NULL,\x1b\x1c\x05n\x1eu,\x03#~,',m\x0e\x06w\x18}be#,NaN,-0,32121,NULL,p,NULL,4.681196838114059080E+25,false,46811968381140590800492426.9218368489017,\x02\x1e\x03wa),-0.9110816121101379,1.0462121963"...,
		  |   	"NaN,M\x0e\x06w\x18}bE#,1030807826,\x12,2859333651,872033927,1989-08-06 19:33"...,
		  |   	"NaN,M\x0e\x06w\x18}bE#,1030807826,\x12,2859333651,872033927,1989-08-06 19:33"...,
		  |   	"NaN,M\x0e\x06w\x18}bE#,1030807826,\x12,2859333651,872033927,1989-08-06 19:33"...,
		  | - 	"NaN,M\x0e\x06w\x18}bE#,1030807826,\x12,2859333651,872033927,1989-08-06 19:33:34.000269 +0000 +0000,NULL,NULL,\x1b\x1c\x05n\x1eu,\x03#~,',m\x0e\x06w\x18}be#,NaN,0,32121,NULL,p,NULL,4.681196838114059080E+25,false,46811968381140590800492426.9218368489017,\x02\x1e\x03wa),-0.9110816121101379,1.04621219635"...,
		  |   	"NaN,M\x0e\x06w\x18}bE#,1030807826,\x12,2859333651,872033927,1989-08-06 19:33"...,
		  |   	"NaN,M\x0e\x06w\x18}bE#,1030807826,\x12,2859333651,872033927,1989-08-06 19:33"...,
		  |   	"NaN,M\x0e\x06w\x18}bE#,1030807826,\x12,2859333651,872033927,1989-08-06 19:33"...,
		  |   }
		  | sql: SELECT * FROM defaultdb.public.table2 AS tab_1822 JOIN defaultdb.public.table3 AS tab_1823 ON true
		  | (SELECT * FROM defaultdb.public.table2 AS tab_1822 JOIN defaultdb.public.table3 AS tab_1823 ON tab_1823.col3_6) UNION ALL (SELECT * FROM defaultdb.public.table2 AS tab_1822 JOIN defaultdb.public.table3 AS tab_1823 ON NOT (tab_1823.col3_6)) UNION ALL (SELECT * FROM defaultdb.public.table2 AS tab_1822 JOIN defaultdb.public.table3 AS tab_1823 ON (tab_1823.col3_6) IS NULL)
		  | with args: []
		Error types: (1) *withstack.withStack (2) *errutil.leafError

Help

See: roachtest README

See: How To Investigate (internal)

_{This test on roachdash | Improve this report!}

rytaft · 2022-03-04T15:17:08Z

Neither of these failures seem to be a logic error, as the only difference in output is that one includes 0 while the other includes -0. We should figure out how to remove this source of spurious failures.

michae2 · 2022-04-05T04:54:15Z

Latest one reduced down to:

CREATE TABLE table1 (
  col1_0  UUID,
  col1_1  DATE,
  col1_3  INT8,
  col1_4  BIT(48) NULL,
  col1_5  BOOL,
  col1_6  FLOAT4,
  col1_7  REGPROC,
  col1_8  BIT(9),
  col1_10 TIME,
  col1_12 STRING AS (lower(CAST(col1_10 AS STRING))) VIRTUAL,
  PRIMARY KEY (col1_6 DESC),
  UNIQUE (lower(CAST(col1_5 AS STRING)), col1_12 ASC) STORING(col1_1, col1_4, col1_8),
  UNIQUE (col1_12 DESC),
  INDEX (col1_1 ASC),
  INDEX (col1_8)
);

INSERT
  INTO table1 (col1_0, col1_1, col1_3, col1_5, col1_6)
VALUES ('a4ecdba0-6c5c-4427-b6d2-a856d72d1b29', '1979-10-30', 0, false, 0);

SELECT col1_6 FROM table1;
SELECT col1_6 FROM table1 WHERE NOT col1_5;

Interestingly we never insert -0 into the table.

michae2 · 2022-04-05T04:57:16Z

So we have some kind of corruption:

[email protected]:26257/defaultdb> SELECT col1_6 FROM table1@table1_pkey;
  col1_6
----------
      -0
(1 row)


Time: 1ms total (execution 1ms / network 0ms)

[email protected]:26257/defaultdb> SELECT col1_6 FROM table1@table1_expr_col1_12_key;
  col1_6
----------
       0
(1 row)


Time: 1ms total (execution 1ms / network 0ms)

[email protected]:26257/defaultdb> SELECT col1_6 FROM table1@table1_col1_12_key;
  col1_6
----------
       0
(1 row)


Time: 1ms total (execution 1ms / network 0ms)

[email protected]:26257/defaultdb> SELECT col1_6 FROM table1@table1_col1_1_idx;
  col1_6
----------
       0
(1 row)


Time: 2ms total (execution 1ms / network 0ms)

[email protected]:26257/defaultdb> SELECT col1_6 FROM table1@table1_col1_8_idx;
  col1_6
----------
       0
(1 row)


Time: 1ms total (execution 1ms / network 0ms)

michae2 · 2022-04-05T05:04:00Z

Reduced a little further:

CREATE TABLE table1 (
  col1_0  UUID,
  col1_1  DATE,
  col1_3  INT8,
  col1_4  BIT(48),
  col1_5  BOOL,
  col1_6  FLOAT4,
  col1_8  BIT(9),
  col1_10 TIME,
  col1_12 STRING,
  PRIMARY KEY (col1_6 DESC),
  UNIQUE (col1_12 ASC) STORING(col1_1, col1_4, col1_8),
  INDEX (col1_1 ASC)
);

INSERT
  INTO table1 (col1_0, col1_1, col1_3, col1_5, col1_6)
VALUES ('a4ecdba0-6c5c-4427-b6d2-a856d72d1b29', '1979-10-30', 0, false, 0);

SELECT col1_6 FROM table1@table1_pkey;

yuzefovich · 2022-04-05T18:57:47Z

I couldn't stop myself from looking into this :)

One way to fix this is:

--- a/pkg/util/encoding/float.go
+++ b/pkg/util/encoding/float.go
@@ -93,5 +93,13 @@ func DecodeFloatAscending(buf []byte) ([]byte, float64, error) {
 // DecodeFloatDescending decodes floats encoded with EncodeFloatDescending.
 func DecodeFloatDescending(buf []byte) ([]byte, float64, error) {
        b, r, err := DecodeFloatAscending(buf)
-       return b, -r, err
+       if r != 0 {
+               // All values except for 0 and NaN were negated in
+               // EncodeFloatDescending, so we have to negate them back. Note that we
+               // don't need to check whether r is NaN since negating NaN gives back
+               // NaN too. Negative zero uses composite indexes to decode itself
+               // correctly.
+               r = -r
+       }
+       return b, r, err
 }

yuzefovich · 2022-04-05T19:13:31Z

Simplest repro:

CREATE TABLE t (f FLOAT, PRIMARY KEY (f DESC));
INSERT INTO t VALUES (0);
SELECT * FROM t;

michae2 · 2022-04-05T19:14:20Z

That was fast!

mgartner · 2022-04-05T20:24:36Z

Is it possible that this wasn't a TLP false-positive all along, and instead this bug?

michae2 · 2022-04-05T22:41:32Z

Is it possible that this wasn't a TLP false-positive all along, and instead this bug?

I reduced the failure involving MAX to:

CREATE TABLE table1 (
  col1_0 FLOAT8 NOT NULL,
  PRIMARY KEY (col1_0 ASC),
  INDEX (col1_0 DESC),
  INDEX (col1_0 ASC),
  INDEX (col1_0 DESC)
);

INSERT INTO table1 (col1_0) VALUES (0);

And in this case I think one query uses an ascending index and the other a descending index (the bug only affects descending indexes).

So yeah, I think this bug was probably the underlying issue for all of these.

mgartner · 2022-04-05T22:46:10Z

That's great news! Shame on us for doubting TLP. 😉

Fixes: cockroachdb#77279 Our old friend -0 is back. This time the problem was in DecodeFloatDescending, which was decoding both +0 and -0 as -0. Fix it to decode both as +0. (-0 is then properly decoded as -0 when the value part of the composite encoding is decoded.) Note that the on-disk data was correct. Only the decoding was affected. Release note (bug fix): Queries reading from an index or primary key on FLOAT or REAL columns DESC would read -0 for every +0 value stored in the index. Fix this to correctly read +0 for +0 and -0 for -0. Co-authored-by: Yahor Yuzefovich <[email protected]>

yuzefovich · 2022-04-06T04:53:54Z

Well, during the last collab session we did come up with a valid case when TLP could produce a false positive with 0 vs -0:

CREATE TABLE t (b BOOL, f FLOAT);
INSERT INTO t VALUES (false, 0), (true, -0);
SELECT max(f) FROM t; -- unpartitioned, returns 0
SELECT max(agg) FROM (SELECT max(f) AS agg FROM t WHERE b UNION ALL SELECT max(f) AS agg FROM t WHERE NOT b UNION ALL SELECT max(f) FROM t WHERE b IS NULL); -- partitioned, returns -0

michae2 · 2022-04-06T16:33:47Z

@yuzefovich good point, thanks for reminding me. I'll go ahead and make changes to TLP, too.

79085: ingesting,backupccl: fix bug in privileges of restored descriptors r=adityamaru a=adityamaru Previously, when restoring a table descriptor as part of a non-cluster backup, we were incorrectly generating its privileges without passing in the parent schema's DefaultPrivilegeDescriptor. This change fixes that. Additionally, this change rewrites the `restore-grants` datadriven test to be more understandble. There were also some errors in the test that have been fixed in this change. The expected behavior is that: 1) A cluster restore will just restore all privileges as was backed up. 2) A database restore will first generate ALL privileges for root and admin on the database, and subsequent table, type, schema descriptors will inherit these privileges. 3) A table restore into an existing database will generate its set of privileges based on the default privileges of the parentDB and parentSchema. All restored schema objects are owned by the user running the restore. All backed up privileges on these descriptors are wiped, since there is no guarantee that a backed up user will be present in the restoring cluster. Release note (bug fix): Privileges for restored table's were being generated incorrectly without taking into consideration their parent schema's default privilege descriptor. 79407: cloud: bump orchestrator to v21.2.8 r=aliher1911 a=celiala Release note: None Jira issue: CRDB-14811 79473: encoding: fix DecodeFloatDescending for positive zero r=yuzefovich,rytaft,mgartner a=michae2 Fixes: #77279 Our old friend -0 is back. This time the problem was in DecodeFloatDescending, which was decoding both +0 and -0 as -0. Fix it to decode both as +0. (-0 is then properly decoded as -0 when the value part of the composite encoding is decoded.) Note that the on-disk data was correct. Only the decoding was affected. Release note (bug fix): Queries reading from an index or primary key on FLOAT or REAL columns DESC would read -0 for every +0 value stored in the index. Fix this to correctly read +0 for +0 and -0 for -0. Co-authored-by: Yahor Yuzefovich <[email protected]> Co-authored-by: Aditya Maru <[email protected]> Co-authored-by: Celia La <[email protected]> Co-authored-by: Michael Erickson <[email protected]>

Fixes: #77279 Our old friend -0 is back. This time the problem was in DecodeFloatDescending, which was decoding both +0 and -0 as -0. Fix it to decode both as +0. (-0 is then properly decoded as -0 when the value part of the composite encoding is decoded.) Note that the on-disk data was correct. Only the decoding was affected. Release note (bug fix): Queries reading from an index or primary key on FLOAT or REAL columns DESC would read -0 for every +0 value stored in the index. Fix this to correctly read +0 for +0 and -0 for -0. Co-authored-by: Yahor Yuzefovich <[email protected]>

Fixes: #77279 Our old friend -0 is back. This time the problem was in DecodeFloatDescending, which was decoding both +0 and -0 as -0. Fix it to decode both as +0. (-0 is then properly decoded as -0 when the value part of the composite encoding is decoded.) Note that the on-disk data was correct. Only the decoding was affected. Release note (bug fix): Queries reading from an index or primary key on FLOAT or REAL columns DESC would read -0 for every +0 value stored in the index. Fix this to correctly read +0 for +0 and -0 for -0. Release justification: low-risk change to existing functionality. Co-authored-by: Yahor Yuzefovich <[email protected]>

Assists: cockroachdb#77279 TLP compares results of two queries, one unpartitioned and one partitioned three ways by a predicate. We were comparing TLP results by printing values to strings. Unfortunately there are some values which are equal according to SQL comparison, but print differently. And TLP, being the nefarious chaos monkey that it is, tends to find these. For example: ```sql 0::FLOAT -0::FLOAT 'hello' COLLATE "en-US-u-ks-level2" 'HELLO' COLLATE "en-US-u-ks-level2" INTERVAL '1 day' INTERVAL '24 hours' ``` When equal-yet-different values like these are used in MAX or MIN aggregations, or are the keys for GROUP BY bucketing, it is nondeterministic which value will be chosen. (This is also true in PostgreSQL.) And sometimes the two TLP queries choose differently. This is not indicative of a bug, yet TLP fails. So this patch teaches TLP to use SQL comparison instead of string comparison. To do so, we wrap both queries in a bigger query: ```sql WITH unpart AS MATERIALIZED ( <unpartitioned query> ), part AS MATERIALIZED ( <partitioned query> ), undiff AS ( TABLE unpart EXCEPT ALL TABLE part ), diff AS ( TABLE part EXCEPT ALL TABLE unpart ) SELECT (SELECT count(*) FROM undiff), (SELECT count(*) FROM diff) ``` Only if this query returns counts other than 0 (meaning the SQL comparison detected a real mismatch) do we diff the printed results. Release note: None

76504: *: fix system config for tenant r=ajwerner a=ajwerner Follow-on from #76279. Fixes #75864. This commit adds a mechanism to combine the system config data of the tenant with the data provided over the GossipSubscription from the system tenant. It then plumbs a version into the zone config methods. In the mixed version state, the tenant uses the existing override from the system tenant. After the span config infrastructure has been activated, the tenant uses the overrides they've set. This affects, realistically, just the GC job, and, to a lesser extent, the optimizer. Release note: None 78595: Week -1: follow-up emails r=rail a=celiala This PR adds follow-up emails for: - the day before `PrepDate` (to be sent if only there are still blockers) - the day of `PrepDate` (to be sent if only there are still blockers) - the day after `PrepDate` (to be sent if only there are still blockers) The `post-blockers` command should be used for all emails before and up to `PrepDate`: ``` # where `DAYS_BEFORE_PREP` is one of: many-days | day-before | day-of SMTP_PASSWORD=$MY_SMTP_PWD SMTP_USER=$MY_SMTP_USER GITHUB_TOKEN=$MY_GITHUB_TOKEN \ $(bazel info bazel-bin)/pkg/cmd/release/release_/release post-blockers \ --release-series 21.1 \ --smtp-user $MY_SMTP_USER --smtp-host smtp.gmail.com --smtp-port 587 \ --to $MY_TO_EMAIL \ --template-dir ./pkg/cmd/release/templates \ --prep-date "Apr 1, 2022" --publish-date "May 1, 2022" \ --days-before-prep-date $DAYS_BEFORE_PREP ``` The `cancel-release-date` should be used for the day after `PrepDate` email: ``` SMTP_PASSWORD=$MY_SMTP_PWD SMTP_USER=$MY_SMTP_USER GITHUB_TOKEN=$MY_GITHUB_TOKEN \ $(bazel info bazel-bin)/pkg/cmd/release/release_/release cancel-release-date \ --release-series 21.1 \ --smtp-user $MY_SMTP_USER --smtp-host smtp.gmail.com --smtp-port 587 \ --to $MY_TO_EMAIL \ --template-dir ./pkg/cmd/release/templates \ --publish-date "May 1, 2022" --next-publish-date "Jun 1, 2022" ``` ### TeamCity changes This will be paired with the following TeamCity jobs: * [Send release blockers - 78595](https://teamcity.cockroachdb.com/buildConfiguration/Internal_Cockroach_Release_Process_SendReleaseBlockers_78595?branch=78595&buildTypeTab=overview&mode=builds#all-projects) * [Cancel release date - 78595](https://teamcity.cockroachdb.com/buildConfiguration/Internal_Cockroach_Release_Process_CancelReleaseDate_78595?branch=78595&mode=builds#all-projects) ![image](https://user-images.githubusercontent.com/3051672/160359980-93ac438f-d8e6-4f6c-a0e0-00fab55b93bc.png) #### Manual tests ##### Test 1: post-blockers --days-before-prep-date many-days <img width="300" src="https://user-images.githubusercontent.com/3051672/160353978-12e8d2e9-1900-471b-b785-f247d19f474a.png"> ##### Test 2: post-blockers --days-before-prep-date day-before <img width="300" src="https://user-images.githubusercontent.com/3051672/160353943-8514a0d4-87f4-4ab7-8d67-78bb42f427c2.png"> ##### Test 3: post-blockers --days-before-prep-date day-of <img width="300" src="https://user-images.githubusercontent.com/3051672/160353904-d18e6d68-594c-48dc-8308-6f0c3621fe91.png"> ##### Test 4: cancel-release-date <img width="300" src="https://user-images.githubusercontent.com/3051672/160353668-1e2886f5-c39f-4d54-884d-f967ba06a114.png"> 79551: roachtest: use SQL comparison in TLP r=michae2 a=michae2 Assists: #77279 TLP compares results of two queries, one unpartitioned and one partitioned three ways by a predicate. We were comparing TLP results by printing values to strings. Unfortunately there are some values which are equal according to SQL comparison, but print differently. And TLP, being the nefarious chaos monkey that it is, tends to find these. For example: ```sql 0::FLOAT -0::FLOAT 'hello' COLLATE "en-US-u-ks-level2" 'HELLO' COLLATE "en-US-u-ks-level2" INTERVAL '1 day' INTERVAL '24 hours' ``` When equal-yet-different values like these are used in MAX or MIN aggregations, or are the keys for GROUP BY bucketing, it is nondeterministic which value will be chosen. (This is also true in PostgreSQL.) And sometimes the two TLP queries choose differently. This is not indicative of a bug, yet TLP fails. So this patch teaches TLP to use SQL comparison instead of string comparison. To do so, we wrap both queries in a bigger query: ```sql WITH unpart AS MATERIALIZED ( <unpartitioned query> ), part AS MATERIALIZED ( <partitioned query> ), undiff AS ( TABLE unpart EXCEPT ALL TABLE part ), diff AS ( TABLE part EXCEPT ALL TABLE unpart ) SELECT (SELECT count(*) FROM undiff), (SELECT count(*) FROM diff) ``` Only if this query returns counts other than 0 (meaning the SQL comparison detected a real mismatch) do we diff the printed results. Release note: None 79767: util/contextutil: clarify `RunWithTimeout` error message r=erikgrinaker a=erikgrinaker `RunWithTimeout` gave an error message of this form: ``` operation "send-snapshot" timed out after 1m (took 1m): context deadline exceeded ``` However, this can be misleading because either the caller or callee can set their own context timeout that is less than then one set by `RunWithTimeout`, in which case it looks like: ``` operation "send-snapshot" timed out after 1m (took 1s): context deadline exceeded ``` This is even worse on old versions, where the "took 1s" is not included, leading the reader to believe the operation actually took 1m when it only took 1s. This patch clarifies the error message somewhat: ``` operation "send-snapshot" timed out after 1s (given timeout 1m): context deadline exceeded ``` Resolves #79424. Release note: None 79787: protectedts: fix panic when fetching protectedts records r=adityamaru a=adityamaru Previously, we incorrectly assumed that all records in the `system.protectedts_records` table would have a non NULL target column. While this is enforced for all protected timestamp records written in a 22.1+ cluster, this is not true for records that might have been prior to the 22.1 cluster version being finalized. This change makes the method responsible for reading protectedts records more defensive when encountering a NULL target column. Fixes: #79684 Release note: None 79796: Publish cockroach-sql in publish bleeding edge r=rickystewart a=rail Previously, the `cockroach-sql` tool was published only as a part of the release workflow. The first commit refactors the way how we publish and test. This is very similar to what we use in publish-provisional-artifacts. The second commit adds the cockroach-sql binary to the list of artifacts to publish. Release note: None Co-authored-by: Andrew Werner <[email protected]> Co-authored-by: Celia La <[email protected]> Co-authored-by: Michael Erickson <[email protected]> Co-authored-by: Erik Grinaker <[email protected]> Co-authored-by: Aditya Maru <[email protected]> Co-authored-by: Rail Aliiev <[email protected]>

Assists: #77279 TLP compares results of two queries, one unpartitioned and one partitioned three ways by a predicate. We were comparing TLP results by printing values to strings. Unfortunately there are some values which are equal according to SQL comparison, but print differently. And TLP, being the nefarious chaos monkey that it is, tends to find these. For example: ```sql 0::FLOAT -0::FLOAT 'hello' COLLATE "en-US-u-ks-level2" 'HELLO' COLLATE "en-US-u-ks-level2" INTERVAL '1 day' INTERVAL '24 hours' ``` When equal-yet-different values like these are used in MAX or MIN aggregations, or are the keys for GROUP BY bucketing, it is nondeterministic which value will be chosen. (This is also true in PostgreSQL.) And sometimes the two TLP queries choose differently. This is not indicative of a bug, yet TLP fails. So this patch teaches TLP to use SQL comparison instead of string comparison. To do so, we wrap both queries in a bigger query: ```sql WITH unpart AS MATERIALIZED ( <unpartitioned query> ), part AS MATERIALIZED ( <partitioned query> ), undiff AS ( TABLE unpart EXCEPT ALL TABLE part ), diff AS ( TABLE part EXCEPT ALL TABLE unpart ) SELECT (SELECT count(*) FROM undiff), (SELECT count(*) FROM diff) ``` Only if this query returns counts other than 0 (meaning the SQL comparison detected a real mismatch) do we diff the printed results. Release note: None

Assists: cockroachdb#77279 TLP compares results of two queries, one unpartitioned and one partitioned three ways by a predicate. We were comparing TLP results by printing values to strings. Unfortunately there are some values which are equal according to SQL comparison, but print differently. And TLP, being the nefarious chaos monkey that it is, tends to find these. For example: ```sql 0::FLOAT -0::FLOAT 'hello' COLLATE "en-US-u-ks-level2" 'HELLO' COLLATE "en-US-u-ks-level2" INTERVAL '1 day' INTERVAL '24 hours' ``` When equal-yet-different values like these are used in MAX or MIN aggregations, or are the keys for GROUP BY bucketing, it is nondeterministic which value will be chosen. (This is also true in PostgreSQL.) And sometimes the two TLP queries choose differently. This is not indicative of a bug, yet TLP fails. So this patch teaches TLP to use SQL comparison instead of string comparison. To do so, we wrap both queries in a bigger query: ```sql WITH unpart AS MATERIALIZED ( <unpartitioned query> ), part AS MATERIALIZED ( <partitioned query> ), undiff AS ( TABLE unpart EXCEPT ALL TABLE part ), diff AS ( TABLE part EXCEPT ALL TABLE unpart ) SELECT (SELECT count(*) FROM undiff), (SELECT count(*) FROM diff) ``` Only if this query returns counts other than 0 (meaning the SQL comparison detected a real mismatch) do we diff the printed results. Release note: None

blathers-crl bot added the T-sql-queries SQL Queries Team label Mar 2, 2022

rytaft removed the release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. label Mar 4, 2022

rytaft changed the title ~~roachtest: tlp failed~~ roachtest: -0 is not converted to 0, causing spurious tlp failure Mar 4, 2022

rharding6373 assigned michae2 Mar 8, 2022

rytaft mentioned this issue Mar 21, 2022

roachtest: tlp failed #78137

Closed

michae2 mentioned this issue Mar 25, 2022

roachtest: tlp failed #78492

Closed

rytaft mentioned this issue Mar 28, 2022

roachtest: tlp failed #78327

Closed

michae2 mentioned this issue Mar 30, 2022

roachtest: tlp failed #79033

Closed

rytaft mentioned this issue Apr 3, 2022

roachtest: tlp failed #79290

Closed

michae2 mentioned this issue Apr 5, 2022

encoding: fix DecodeFloatDescending for positive zero #79473

Merged

craig bot closed this as completed in b5bcb19 Apr 6, 2022

blathers-crl bot mentioned this issue Apr 6, 2022

release-22.1: encoding: fix DecodeFloatDescending for positive zero #79533

Merged

michae2 mentioned this issue Apr 6, 2022

roachtest: use SQL comparison in TLP #79551

Merged

blathers-crl bot mentioned this issue May 4, 2022

release-22.1: roachtest: use SQL comparison in TLP #81015

Merged

jlinder added the sync-me-3 label May 24, 2022

mgartner mentioned this issue May 31, 2022

release-22.1.0: roachtest: use SQL comparison in TLP #82135

Closed

mgartner added this to SQL Queries Jul 24, 2023

mgartner moved this to Done in SQL Queries Jul 24, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

roachtest: -0 is not converted to 0, causing spurious tlp failure #77279

roachtest: -0 is not converted to 0, causing spurious tlp failure #77279

cockroach-teamcity commented Mar 2, 2022 •

edited by cockroach-jira-scripts

Loading

cockroach-teamcity commented Mar 4, 2022

rytaft commented Mar 4, 2022

michae2 commented Apr 5, 2022

michae2 commented Apr 5, 2022

michae2 commented Apr 5, 2022

yuzefovich commented Apr 5, 2022

yuzefovich commented Apr 5, 2022

michae2 commented Apr 5, 2022

mgartner commented Apr 5, 2022

michae2 commented Apr 5, 2022

mgartner commented Apr 5, 2022

yuzefovich commented Apr 6, 2022

michae2 commented Apr 6, 2022

roachtest: -0 is not converted to 0, causing spurious tlp failure #77279

roachtest: -0 is not converted to 0, causing spurious tlp failure #77279

Comments

cockroach-teamcity commented Mar 2, 2022 • edited by cockroach-jira-scripts Loading

cockroach-teamcity commented Mar 4, 2022

rytaft commented Mar 4, 2022

michae2 commented Apr 5, 2022

michae2 commented Apr 5, 2022

michae2 commented Apr 5, 2022

yuzefovich commented Apr 5, 2022

yuzefovich commented Apr 5, 2022

michae2 commented Apr 5, 2022

mgartner commented Apr 5, 2022

michae2 commented Apr 5, 2022

mgartner commented Apr 5, 2022

yuzefovich commented Apr 6, 2022

michae2 commented Apr 6, 2022

cockroach-teamcity commented Mar 2, 2022 •

edited by cockroach-jira-scripts

Loading