Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not re-do permission checks on range table for parallel worker #233

Conversation

Deepesh125
Copy link
Contributor

@Deepesh125 Deepesh125 commented Oct 9, 2023

Description

With current design of postgres, workflow of parallel worker looks like something below:

-- main node
ExecutorRun
 standard_ExecutorStart
  InitPlan
   ExecCheckRTPerms <-- does the permission check on all the range table entries from planner stmt
 standard_ExecutorRun
  ExecutePlan
  .
  .
  .
  ExecGather
   gather_getnext
    -- spawns initialises and run parallel workers
  .
  .
  .
-- parallel worker
ParallelQueryMain
 ExecutorRun
  standard_ExecutorStart
   InitPlan
    ExecCheckRTPerms <- redo the permission check on same set of table
 standard_ExecutorRun
 .
 .
 .

Here, Main worker would not have spawned the worker if there is any permission check failures on any of the range table. And parallel worker is again doing the permission check on same set of tables which is kind of redundant. So this commit avoid that unnecessary permission check.

This change is also helpful when select query or subquery involves T-SQL temp tables. Currently, Babelfish is throwing error like "relation with OID 19401 does not exist" because metadata for temp tables is being stored in ENR metadata (which is backend memory) and is not being shared with parallel worker. So this changes will also avoid such error for temp table from parallel worker.

Regression tests run successfully:

========================
 1 of 213 tests failed.
========================

The differences that caused some tests to fail can be viewed in the
file "/local/home/dddhamel/workplace/babel-oss/src/postgresql_modified_for_babelfish/src/test/regress/regression.diffs".  A copy of the test summary that you see
above is saved in the file "/local/home/dddhamel/workplace/babel-oss/src/postgresql_modified_for_babelfish/src/test/regress/regression.out".

make[1]: *** [check] Error 1
make[1]: Leaving directory `/local/home/dddhamel/workplace/babel-oss/src/postgresql_modified_for_babelfish/src/test/regress'
make: *** [check] Error 2

Note: One of the test is failing because I had to make changes in test_setup.sql to force parallel query mode.

Task: BABEL-4450
Signed-off-by: Dipesh Dhameliya [email protected]

Check List

  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is under the terms of the PostgreSQL license, and grant any person obtaining a copy of the contribution permission to relicense all or a portion of my contribution to the PostgreSQL License solely to contribute all or a portion of my contribution to the PostgreSQL open source project.

For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@Deepesh125 Deepesh125 merged commit f492997 into babelfish-for-postgresql:BABEL_3_X_DEV__PG_15_X Oct 12, 2023
2 checks passed
@Deepesh125 Deepesh125 deleted the jira-babel-4450 branch October 12, 2023 12:29
Deepesh125 added a commit to amazon-aurora/postgresql_modified_for_babelfish that referenced this pull request Oct 12, 2023
…belfish-for-postgresql#233)

With current design of postgres, workflow of parallel worker looks like something below:

```
-- main node
ExecutorRun
 standard_ExecutorStart
  InitPlan
   ExecCheckRTPerms <-- does the permission check on all the range table entries from planner stmt
 standard_ExecutorRun
  ExecutePlan
  .
  .
  .
  ExecGather
   gather_getnext
    -- spawns initialises and run parallel workers
  .
  .
  .
-- parallel worker
ParallelQueryMain
 ExecutorRun
  standard_ExecutorStart
   InitPlan
    ExecCheckRTPerms <- redo the permission check on same set of table
 standard_ExecutorRun
 .
 .
 .
```
Here, Main worker would not have spawned the worker if there is any permission check failures on any of the range table. And parallel worker is again doing the permission check on same set of tables which is kind of redundant. So this commit avoid that unnecessary permission check.

This change is also helpful when select query or subquery involves T-SQL temp tables. Currently, Babelfish is throwing error like "relation with OID 19401 does not exist" because metadata for temp tables is being stored in ENR metadata (which is backend memory) and is not being shared with parallel worker. So this changes will also avoid such error for temp table from parallel worker.

Task: BABEL-4450
Signed-off-by: Dipesh Dhameliya <[email protected]>
Sairakan pushed a commit to amazon-aurora/postgresql_modified_for_babelfish that referenced this pull request Nov 16, 2023
…belfish-for-postgresql#233)

With current design of postgres, workflow of parallel worker looks like something below:

```
-- main node
ExecutorRun
 standard_ExecutorStart
  InitPlan
   ExecCheckRTPerms <-- does the permission check on all the range table entries from planner stmt
 standard_ExecutorRun
  ExecutePlan
  .
  .
  .
  ExecGather
   gather_getnext
    -- spawns initialises and run parallel workers
  .
  .
  .
-- parallel worker
ParallelQueryMain
 ExecutorRun
  standard_ExecutorStart
   InitPlan
    ExecCheckRTPerms <- redo the permission check on same set of table
 standard_ExecutorRun
 .
 .
 .
```
Here, Main worker would not have spawned the worker if there is any permission check failures on any of the range table. And parallel worker is again doing the permission check on same set of tables which is kind of redundant. So this commit avoid that unnecessary permission check.

This change is also helpful when select query or subquery involves T-SQL temp tables. Currently, Babelfish is throwing error like "relation with OID 19401 does not exist" because metadata for temp tables is being stored in ENR metadata (which is backend memory) and is not being shared with parallel worker. So this changes will also avoid such error for temp table from parallel worker.

Task: BABEL-4450
Signed-off-by: Dipesh Dhameliya <[email protected]>
priyansx pushed a commit to amazon-aurora/postgresql_modified_for_babelfish that referenced this pull request Nov 22, 2023
…belfish-for-postgresql#233)

With current design of postgres, workflow of parallel worker looks like something below:

```
-- main node
ExecutorRun
 standard_ExecutorStart
  InitPlan
   ExecCheckRTPerms <-- does the permission check on all the range table entries from planner stmt
 standard_ExecutorRun
  ExecutePlan
  .
  .
  .
  ExecGather
   gather_getnext
    -- spawns initialises and run parallel workers
  .
  .
  .
-- parallel worker
ParallelQueryMain
 ExecutorRun
  standard_ExecutorStart
   InitPlan
    ExecCheckRTPerms <- redo the permission check on same set of table
 standard_ExecutorRun
 .
 .
 .
```
Here, Main worker would not have spawned the worker if there is any permission check failures on any of the range table. And parallel worker is again doing the permission check on same set of tables which is kind of redundant. So this commit avoid that unnecessary permission check.

This change is also helpful when select query or subquery involves T-SQL temp tables. Currently, Babelfish is throwing error like "relation with OID 19401 does not exist" because metadata for temp tables is being stored in ENR metadata (which is backend memory) and is not being shared with parallel worker. So this changes will also avoid such error for temp table from parallel worker.

Task: BABEL-4450
Signed-off-by: Dipesh Dhameliya <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants