Translate consecutive array of integers to "between x and y" #3272

aleksvujic · 2024-09-13T07:44:23Z

Description

We have the following simple database query. Variable ids is received from the client in our real-world application, but I hard-coded it in this example for simplicity. Column id is the primary key of the ticket table.

var ids = new int[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };

var tickets = _tickDbContext.Tickets
    .Where(x => ids.Contains(x.Id))
    .OrderBy(x => x.Id)
    .ToList();

This generates the following SQL:

--Microsoft.EntityFrameworkCore.Database.Command: Information: Executed DbCommand (96ms) [Parameters=[@__ids_0={ '1', '2', '3', '4', '5', ... } (DbType = Object)], CommandType='Text', CommandTimeout='30']
SELECT t.id, t.assignee_id, t.description, t.priority, t.reporter_id, t.summary
FROM tick.ticket AS t
WHERE t.id = ANY (@__ids_0)
ORDER BY t.id

Output of EXPLAIN ANALYZE with ANY(...) operator:

Sort  (cost=49.80..49.83 rows=11 width=75) (actual time=0.341..0.365 rows=11 loops=1)
  Sort Key: id
  Sort Method: quicksort  Memory: 27kB
  ->  Bitmap Heap Scan on ticket t  (cost=23.14..49.61 rows=11 width=75) (actual time=0.110..0.185 rows=11 loops=1)
        Recheck Cond: (id = ANY ('{1,2,3,4,5,6,7,8,9,10}'::integer[]))
        Heap Blocks: exact=6
        ->  Bitmap Index Scan on ticket_pkey  (cost=0.00..23.11 rows=11 width=0) (actual time=0.084..0.086 rows=11 loops=1)
              Index Cond: (id = ANY ('{1,2,3,4,5,6,7,8,9,10}'::integer[]))
Planning Time: 0.282 ms
Execution Time: 0.464 ms

I am wondering if the query would be faster if it used BETWEEN 1 AND 10 PostgreSQL syntax.

SELECT t.id, t.assignee_id, t.description, t.priority, t.reporter_id, t.summary
FROM tick.ticket AS t
WHERE t.id BETWEEN 1 AND 10
ORDER BY t.id

Output of EXPLAIN ANALYZE with BETWEEN x AND y operator:

Sort  (cost=31.05..31.08 rows=11 width=75) (actual time=0.298..0.322 rows=11 loops=1)
  Sort Key: id
  Sort Method: quicksort  Memory: 27kB
  ->  Bitmap Heap Scan on ticket t  (cost=4.39..30.86 rows=11 width=75) (actual time=0.103..0.180 rows=11 loops=1)
        Recheck Cond: ((id >= 1) AND (id <= 10))
        Heap Blocks: exact=6
        ->  Bitmap Index Scan on ticket_pkey  (cost=0.00..4.39 rows=11 width=0) (actual time=0.051..0.052 rows=11 loops=1)
              Index Cond: ((id >= 1) AND (id <= 10))
Planning Time: 0.276 ms
Execution Time: 0.414 ms

To me, execution plans look the same and should have equal performance. However, it needs to be checked if this is always the case (large tables, filtering is performed for non-indexed column, ...).

Proposal: Instead of using ANY operator for consecutive integers, maybe BETWEEN x AND y can be used. Of course, a thorough performance study has to be performed before jumping to the implementation of this proposal.

Further technical details

Npgsql version: 8.0.4
PostgreSQL version: 16.4
Operating system: Windows 10 64-bit

The text was updated successfully, but these errors were encountered:

roji · 2024-09-13T13:08:11Z

See also dotnet/efcore#12634 (comment), which is about transforming x >= y AND x <= z to x BETWEEN y AND z (not directly related).

Some remarks:

As you say, there's no clear indication that translating to BETWEEN here would have a significant advantage compared to the current translation.
In addition, this specific proposal would have us producing different SQL based on parameter values (since BETWEEN can only be used when the array being parameterized contains consecutive values); this isn't something we do unless there's a good reason to do it, since it means we can't cache the SQL within EF.
Finally, this proposal doesn't have anything to do with PostgreSQL specifically (i.e. could be done for any relational database), so if done, it would be done in EF itself rather than in the PG provider.

So while it's a nice idea, we'd need some more analysis/motivation that shows it's worth pursuing this. I'll go ahead and close the issue for now - if you believe it's worth doing and have more motivation for it, please reopen in the EF repo and we'll revisit.

vonzshik transferred this issue from npgsql/npgsql Sep 13, 2024

roji closed this as not planned Won't fix, can't repro, duplicate, stale Sep 13, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Translate consecutive array of integers to "between x and y" #3272

Translate consecutive array of integers to "between x and y" #3272

aleksvujic commented Sep 13, 2024

roji commented Sep 13, 2024

Translate consecutive array of integers to "between x and y" #3272

Translate consecutive array of integers to "between x and y" #3272

Comments

aleksvujic commented Sep 13, 2024

Description

Further technical details

roji commented Sep 13, 2024