Fix memory leak when executing vectorized quals #242

japinli · 2024-02-23T03:04:57Z

Currently, when executing vectorized quals, a VectorColumn structure is allocated in the ExecutorState memory context for each tuple. That memory is not freed until execution finishes, so it accumulates for the lifetime of the query. This commit changes the allocation to use a tuple memory context instead.

Here are steps to reproduce:

Clone tpch-kit.

git clone https://github.com/gregrahn/tpch-kit.git
cd tpch-kit/dbgen

Compile tpch-kit.
Make sure the Makefile has the following settings before building:

DATABASE = POSTGRESQL
MACHINE  = LINUX
WORKLOAD = TPC

Generate data.

./dbgen -s 1
mkdir s1 && for i in $(ls *.tbl); do sed 's/|$//' $i > s1/${i/tbl/csv}; rm $i; done;

Generate load schema and data script.

cd s1
echo "CREATE EXTENSION columnar;" > load.sql
sed 's|);$|) USING columnar;|g' ../dss.ddl >> load.sql
for csv in $(ls *.csv); do table=$(echo $csv | cut -d. -f1); echo "COPY $table FROM '$PWD/$csv' WITH (FORMAT csv, DELIMITER '|');"; done >> load.sql
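
To see what the sed rewrite of dss.ddl does, the same expression can be tried on a made-up table definition. This is only a sketch: the table t and the helper name to_columnar are illustrative and do not appear in tpch-kit.

```shell
# Rewrite the closing ");" of a CREATE TABLE statement so that the
# table is created with the columnar access method.
# to_columnar is a hypothetical helper name used only for this example.
to_columnar() {
  sed 's|);$|) USING columnar;|g'
}

# A made-up two-line table definition, not part of the TPC-H schema.
printf 'CREATE TABLE t (\n  a INTEGER NOT NULL\n);\n' | to_columnar
# Last line printed: ) USING columnar;
```

Only lines ending in ");" are changed, so column definitions inside the statement are left alone.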

Initialize the database and load data.

initdb -D s1data
pg_ctl -l logfile -D s1data start
psql -f load.sql postgres

Test

Start a new connection using psql.
Use top to monitor the memory used by the backend process.

Execute the following query:

select
  s_name, s_address
from
  supplier, nation
where
  s_suppkey in (
    select
      ps_suppkey
    from
      partsupp
    where
      ps_partkey in (
        select
          p_partkey
        from
          part
        where
          p_name like 'khaki%'
      )
      and ps_availqty > (
        select
          0.5 * sum(l_quantity)
        from
          lineitem
        where
          l_partkey = ps_partkey
          and l_suppkey = ps_suppkey
          and l_shipdate >= date '1997-01-01'
          and l_shipdate < date '1997-01-01' + interval '1' year
      )
  )
  and s_nationkey = n_nationkey
  and n_name = 'EGYPT'
order by
  s_name;

In top, you will see the backend's memory usage keep growing until the query eventually fails with an out-of-memory error.
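
The growth can also be recorded without watching top interactively. The following is a minimal sketch, assuming a Linux system with /proc mounted; the helper names rss_kb and sample_rss are hypothetical and not part of this PR or of PostgreSQL.

```shell
# Print the resident set size (in kB) of the process with the given PID,
# read from the VmRSS line of /proc/<pid>/status (Linux only).
rss_kb() {
  awk '/^VmRSS:/ { print $2 }' "/proc/$1/status"
}

# Print COUNT timestamped RSS readings for PID, one every INTERVAL seconds.
# Usage: sample_rss PID COUNT INTERVAL
sample_rss() {
  pid=$1
  count=$2
  interval=$3
  i=0
  while [ "$i" -lt "$count" ]; do
    echo "$(date +%T) $(rss_kb "$pid")"
    i=$((i + 1))
    sleep "$interval"
  done
}
```

With the backend's PID taken from SELECT pg_backend_pid(); in the psql session, running sample_rss with that PID, a count of 60 and an interval of 1 in another terminal while the query executes prints one reading per second. Before the fix the readings should climb steadily; after it they should stay roughly flat.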


In ReadStripeNextVector(), columnValueOffset is allocated in the ExecutorState memory context, so it is not freed until execution finishes. Call pfree() to release it explicitly and keep memory usage from growing.

japinli added 2 commits February 23, 2024 10:31

Fix memory leak in ReadStripeNextVector()

8e94b82


wuputah requested a review from mkaruza March 20, 2024 01:28

mkaruza approved these changes Mar 20, 2024


wuputah merged commit 3194451 into hydradatabase:main Mar 20, 2024

wuputah mentioned this pull request Apr 1, 2024

1.1.2 #252

Merged

japinli deleted the memory-leak branch April 3, 2024 05:37
