Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: free scanner and parser allocations after large statement #97334

Open
michae2 opened this issue Feb 19, 2023 · 8 comments
Open

sql: free scanner and parser allocations after large statement #97334

michae2 opened this issue Feb 19, 2023 · 8 comments
Labels
A-sql-pgwire pgwire protocol issues. C-performance Perf of queries or internals. Solution not expected to change functional behavior. T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions)

Comments

@michae2
Copy link
Collaborator

michae2 commented Feb 19, 2023

After a large statement is scanned and parsed, we hang onto the scanner and parser allocations in

// Parser wraps a scanner, parser and other utilities present in the parser
// package.
type Parser struct {
scanner scanner.Scanner
lexer lexer
parserImpl sqlParserImpl
tokBuf [8]sqlSymType
stmtBuf [1]Statement
}
until the session is closed. This seems like a temporary memory leak, but with connection pooling we might not actually close the session and free this memory for a long time. We should reset the parser after large statements.

Here's a repro, using python to create a large statement:

Python 3.10.10 (main, Feb  8 2023, 05:40:53) [Clang 14.0.0 (clang-1400.0.29.202)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import psycopg
>>> url = "..."
>>> conn = psycopg.connect(url)
>>> c = conn.cursor()
>>> c.execute("CREATE TABLE IF NOT EXISTS t AS SELECT 0 AS a")
>>> sql = "SELECT count(*) FROM t WHERE a IN (" + ",".join(("0" for _ in range(8000000))) + ")"
>>> len(sql) / 1024 / 1024
15.258822441101074
>>> c.execute(sql)

Even though this statement is less than 16 MiB, it causes us to hold onto over a GiB of memory in the scanner and parser, and this is not released until the connection closes:

Screenshot 2023-02-18 at 22 05 01

Screenshot 2023-02-18 at 22 40 44

This issue is somewhat similar to #47969 and #72581 and #80497

Jira issue: CRDB-24639

@michae2 michae2 added C-performance Perf of queries or internals. Solution not expected to change functional behavior. A-sql-pgwire pgwire protocol issues. T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions) T-sql-queries SQL Queries Team labels Feb 19, 2023
@rytaft
Copy link
Collaborator

rytaft commented Feb 21, 2023

Seems like this may not be a large change, so let's try to get it done this release if possible and backport.

@exalate-issue-sync exalate-issue-sync bot removed the T-sql-queries SQL Queries Team label Feb 22, 2023
@lyang24
Copy link
Contributor

lyang24 commented Feb 22, 2023

two questions out of curiosity:
how do we define large statement?

  • option 1. we state a const threshold in code.
  • option 2. we let customer defined with a session variable.

whats the down side of resetting parser/scanner every loop?

@jordanlewis
Copy link
Member

Awesome find!

Did we confirm that this solves the problem? I think in some cases the prepared statement cache may hang onto some of the same memory. It's worth looking into this - depending on how the allocations are performed (are they slices of nodes?) the prepared statement cache may prevent this memory from being collected if there are pointers into bigger backing slices.

We saw problems like this before when cached things pointed directly to the backing byte buffer of e.g. a protobuf, but the same principle could apply to the bytes that come in from a connection and so on.

@michae2
Copy link
Collaborator Author

michae2 commented Feb 25, 2023

Did we confirm that this solves the problem? I think in some cases the prepared statement cache may hang onto some of the same memory. It's worth looking into this - depending on how the allocations are performed (are they slices of nodes?) the prepared statement cache may prevent this memory from being collected if there are pointers into bigger backing slices.

Great point. We're actually looking into the prepared statement cache now.

@michae2
Copy link
Collaborator Author

michae2 commented Mar 8, 2023

It looks like these two issues with prepared statements were actually responsible for the memory usage we saw:

@mgartner
Copy link
Collaborator

@michae2 so is this issue still valid, or should we close it?

@michae2
Copy link
Collaborator Author

michae2 commented Mar 30, 2023

@michae2 so is this issue still valid, or should we close it?

This issue is valid (the repro still causes it) but is not as urgent as I thought.

@mgartner mgartner moved this to New Backlog in SQL Queries Jul 24, 2023
@yuzefovich yuzefovich added the T-sql-queries SQL Queries Team label May 2, 2024
@rafiss
Copy link
Collaborator

rafiss commented Aug 27, 2024

This is related to this recent issue with how the scanner handles SQL comments: #127710. (Different cause, but same memory leak symptoms.)

@exalate-issue-sync exalate-issue-sync bot removed the T-sql-queries SQL Queries Team label Nov 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-sql-pgwire pgwire protocol issues. C-performance Perf of queries or internals. Solution not expected to change functional behavior. T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions)
Projects
Status: Backlog
Development

No branches or pull requests

7 participants