Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

QA: text @@ text results don't match Postgres #98804

Closed
msirek opened this issue Mar 16, 2023 · 0 comments · Fixed by #99583
Closed

QA: text @@ text results don't match Postgres #98804

msirek opened this issue Mar 16, 2023 · 0 comments · Fixed by #99583
Assignees
Labels
C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. O-qa T-sql-queries SQL Queries Team

Comments

@msirek
Copy link
Contributor

msirek commented Mar 16, 2023

From TSVector and TSQuery QA,

The following queries return different results on CRDB vs. Postgres:

SELECT  '(fat & rats)' @@ 'fats cats ate fat rats';
SELECT  '(fat | rats)' @@ 'fats cats ate fat rats';

Note that the default text search configuration in Postgres must be updated via:

SET default_text_search_config = 'pg_catalog.simple';

to make the Postgres configuration match the CRDB default (CRDB only supports 'simple').

According to Postgres docs,

The form text @@ text is equivalent to to_tsvector(x) @@ plainto_tsquery(y).

According to this rule, the above queries should be equivalent to:

SELECT to_tsvector('simple', '(fat & rats)') @@ plainto_tsquery('simple', 'fats cats ate fat rats');
SELECT to_tsvector('simple', '(fat | rats)') @@ plainto_tsquery('simple', 'fats cats ate fat rats');

But the results of these queries don't match eachother:

root@localhost:26257/defaultdb> SELECT  '(fat & rats)' @@ 'fats cats ate fat rats';
  ?column?
------------
     t
(1 row)

root@localhost:26257/defaultdb> SELECT to_tsvector('simple', '(fat & rats)') @@ plainto_tsquery('simple', 'fats cats ate fat rats');
  ?column?
------------
     f
(1 row)

root@localhost:26257/defaultdb> SELECT  '(fat | rats)' @@ 'fats cats ate fat rats';
  ?column?
------------
     t
(1 row)

root@localhost:26257/defaultdb> SELECT to_tsvector('simple', '(fat | rats)') @@ plainto_tsquery('simple', 'fats cats ate fat rats');
  ?column?
------------
     f
(1 row)

Jira issue: CRDB-25530

@msirek msirek added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. O-qa labels Mar 16, 2023
@blathers-crl blathers-crl bot added the T-sql-queries SQL Queries Team label Mar 16, 2023
@msirek msirek self-assigned this Mar 25, 2023
craig bot pushed a commit that referenced this issue Mar 28, 2023
99411: roachtest: make `follower-reads` test use protobuf encoded requests r=aadityasondhi a=aadityasondhi

Previously, the roachtest for
`follower-reads/mixed-version/single-region` used JSON encoding for the `ts/query` endpoint. This caused the client to send the newly updated `timerseries` proto to an older cluster in JSON format. Although this was defined as an `optional` field in the proto, since it was encoded in JSON, that information was lost over the wire.

In other places, such as DBConsole, we expect the client to be sending protobuf encoded messages and not JSON. In the same way, roachtests (another client) should do the same. When this message is sent as a protobuf, the `optional` tag of the field is encoded in the message and the server is able to process it as such.

There are related links in the issue:
#99117.

Fixes #99117.

Release note: None

99583: tree: apply functions to TEXT expressions compared with the @@ operator r=DrewKimball a=msirek

The TSQuery and TSVector "matches" operator "`@@"` returns different results on CRDB vs. Postgres when one of the arguments is a TEXT expression. CRDB always applies a CAST, and only for constants. The rules at https://www.postgresql.org/docs/current/textsearch-intro.html#TEXTSEARCH-MATCHING specify:
> The form text `@@` tsquery is equivalent to to_tsvector(x) `@@` y.
> The form text `@@` text is equivalent to to_tsvector(x) `@@` plainto_tsquery(y).

This PR adds these implicit function calls in these "matches" comparison expressions during type checking as well as a cast of TEXT to TSQuery when the other argument is a TSVector, which allows variable expressions to be handled.

Fixes #98875
Fixes #98804

Release note (bug fix): This allows the text search `@@` ("matches") operator to work with variable expressions and fixes incorrect results when one of the arguments is a TEXT expression and the other argument is a TEXT or TSQuery expression.

99721: roachtest: fix distsql version issue with mt upgrade roachtest r=ajstorm a=healthy-pod

This code change only fixes one issue with the test by not validating the cluster in the intermediate state because the SQL servers will have different DistSQL versions.

Release note: None
Epic: none

Co-authored-by: Aaditya Sondhi <[email protected]>
Co-authored-by: Mark Sirek <[email protected]>
Co-authored-by: healthy-pod <[email protected]>
@craig craig bot closed this as completed in 60ad5fb Mar 28, 2023
blathers-crl bot pushed a commit that referenced this issue Mar 28, 2023
The TSQuery and TSVector "matches" operator "@@" returns different results
on CRDB vs. Postgres when one of the arguments is a TEXT expression.
CRDB always applies a CAST, and only for constants. The rules at
https://www.postgresql.org/docs/current/textsearch-intro.html#TEXTSEARCH-MATCHING
specify:
> The form text @@ tsquery is equivalent to to_tsvector(x) @@ y.
> The form text @@ text is equivalent to to_tsvector(x) @@ plainto_tsquery(y).

This PR adds these implicit function calls in these "matches" comparison
expressions during type checking as well as a cast of TEXT to TSQuery
when the other argument is a TSVector, which allows variable expressions
to be handled.

Fixes #98875
Fixes #98804

Release note (bug fix): This allows the text search @@ ("matches")
operator to work with variable expressions and fixes incorrect results
when one of the arguments is a TEXT expression and the other argument is
a TEXT or TSQuery expression.
msirek pushed a commit that referenced this issue Apr 5, 2023
The TSQuery and TSVector "matches" operator "@@" returns different results
on CRDB vs. Postgres when one of the arguments is a TEXT expression.
The rules at
https://www.postgresql.org/docs/current/textsearch-intro.html#TEXTSEARCH-MATCHING
specify:
> The form text @@ tsquery is equivalent to to_tsvector(x) @@ y.
> The form text @@ text is equivalent to to_tsvector(x) @@ plainto_tsquery(y).

This PR adds these implicit function calls in these "matches" comparison
expressions during type checking.

Fixes #98804

Release note (bug fix): This fixes incorrect results from the text search
@@ ("matches") operator when one of the arguments is a TEXT expression and the
other argument is a TEXT or TSQuery expression.
msirek pushed a commit to msirek/cockroach that referenced this issue Apr 7, 2023
The TSQuery and TSVector "matches" operator "@@" returns different results
on CRDB vs. Postgres when one of the arguments is a TEXT expression.
CRDB always applies a CAST, and only for constants. The rules at
https://www.postgresql.org/docs/current/textsearch-intro.html#TEXTSEARCH-MATCHING
specify:
> The form text @@ tsquery is equivalent to to_tsvector(x) @@ y.
> The form text @@ text is equivalent to to_tsvector(x) @@ plainto_tsquery(y).

This PR adds these implicit function calls in these "matches" comparison
expressions during type checking as well as a cast of TEXT to TSQuery
when the other argument is a TSVector, which allows variable expressions
to be handled.

Fixes cockroachdb#98875
Fixes cockroachdb#98804

Release note (bug fix): This allows the text search @@ ("matches")
operator to work with variable expressions and fixes incorrect results
when one of the arguments is a TEXT expression and the other argument is
a TEXT or TSQuery expression.
@mgartner mgartner moved this to Done in SQL Queries Jul 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. O-qa T-sql-queries SQL Queries Team
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

1 participant