Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Faulty pagination in ordered query with over 1024 results #42816

Closed
philschleier opened this issue Nov 27, 2019 · 6 comments · Fixed by #42831
Closed

Faulty pagination in ordered query with over 1024 results #42816

philschleier opened this issue Nov 27, 2019 · 6 comments · Fixed by #42831
Assignees
Labels
C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. S-0-visible-logical-error Database stores inconsistent data in some cases, or queries return invalid results silently.

Comments

@philschleier
Copy link

Describe the problem

In v19.2.1 the pagination on an ordered query continues from top after 1024 results whenever a LIMIT is set.

So, a query with OFFSET 1024 will return all results starting from the 1025th till the end (as expected), but as soon as I introduce a LIMIT of any size, the first returned result will be the first as if I had no offset. Sorting DESC makes things worse...

The problem does not appear right away, but takes a minute or so to manifest for new data.

To Reproduce

  1. Upgrade 3-node CockroachDB cluster (Kubernetes) to v19.2.1.
  2. Create a complex table with more than 1024 values as follows:
db.ExecContext(ctx, "CREATE TABLE test (i INT)")
for i := 0; i < 1026; i++ {
  db.ExecContext(ctx, "INSERT INTO test (i) values ($1)", i)
}
  1. Wait. The problem only appears after about a minute or so.

  2. Run following queries:

root@cockroachdb-public:26257/db> SELECT * FROM test ORDER BY i DESC OFFSET 1020;
  i  
+---+
  5  
  4  
  3  
  2  
  1  
  0  
(6 rows)

Time: 9.5643ms

root@cockroachdb-public:26257/db> SELECT * FROM test ORDER BY i DESC OFFSET 1020 LIMIT 10;
   i    
+------+
     5  
     4  
     3  
     2  
  1025  
  1024  
(6 rows)

Time: 4.090952ms

root@cockroachdb-public:26257/db> SELECT * FROM test ORDER BY i ASC OFFSET 1020;

   i    
+------+
  1020  
  1021  
  1022  
  1023  
  1024  
  1025  
(6 rows)

Time: 4.24624ms

root@cockroachdb-public:26257/db> SELECT * FROM test ORDER BY i ASC OFFSET 1020 LIMIT 10;

   i    
+------+
  1020  
  1021  
  1022  
  1023  
     0  
     1  
(6 rows)

Time: 4.370304ms
  1. Expect different output

Note: Downgrading to v19.2.0 makes the problem go away.

Expected behavior

I would expect ordered result sets to not jump to some other values...

Environment:

  • CockroachDB version v19.2.1
  • Client app: (any)

Additional context

This is the "secure" 3-node CockroachDB cluster on Kubernetes.

@philschleier
Copy link
Author

So I'm unsure about v19.2.0 because I can reproduce the above on v19.2.0, however going back to v19.2.0 did fix our production issues...

@jordanlewis
Copy link
Member

Thanks for the report. I would guess this is a bug in the new vectorized engine. Does it reproduce after you type set vectorize=off?

Please share the plan with explain, explain(distsql) and explain(vec) as well.

@jordanlewis jordanlewis added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. S-0-visible-logical-error Database stores inconsistent data in some cases, or queries return invalid results silently. labels Nov 27, 2019
@jordanlewis
Copy link
Member

I was able to reproduce this without trouble and confirmed that it doesn't repro with vectorized off.

@jordanlewis
Copy link
Member

[12:55:130]% cdemo --nodes=3
[email protected]:55034/defaultdb> create table a (a int);
CREATE TABLE

Time: 2.555ms

[email protected]:55034/defaultdb> insert into a select * from generate_series(0,1025);
INSERT 1026

# Note: I waited a bit here

[email protected]:55034/defaultdb> SELECT * FROM a ORDER BY a offset 1020 limit 10;
   a
+------+
  1020
  1021
  1022
  1023
     0
     1
(6 rows)

Time: 3.485ms

[email protected]:55034/defaultdb> set vectorize=off;
SET

Time: 278µs

[email protected]:55034/defaultdb> SELECT * FROM a ORDER BY a offset 1020 limit 10;
   a
+------+
  1020
  1021
  1022
  1023
  1024
  1025
(6 rows)

@jordanlewis
Copy link
Member


[email protected]:55034/defaultdb> explain(vec) SELECT * FROM a ORDER BY a offset 1020 limit 10;
                text
+-----------------------------------+
  │
  ├ Node 1
  │ └ *colrpc.Inbox
  └ Node 2
    └ *colrpc.Outbox
      └ *colexec.limitOp
        └ *colexec.offsetOp
          └ *colexec.topKSorter
            └ *colexec.colBatchScan
(9 rows)

@yuzefovich
Copy link
Member

Oh, that's concerning. Looking into it.

As a side note, I think "waiting a little bit" is for statistics to be populated so that with auto setting vectorized engine is used. Alternatively, using experimental_on does the job with no waiting (or setting vectorize_row_count_threshold to 0 should work as well).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. S-0-visible-logical-error Database stores inconsistent data in some cases, or queries return invalid results silently.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants