Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[KYUUBI #5594][AUTHZ] BuildQuery should respect normal node's input #5787

Closed
wants to merge 59 commits into from

Conversation

AngersZhuuuu
Copy link
Contributor

@AngersZhuuuu AngersZhuuuu commented Nov 29, 2023

🔍 Description

Issue References 🔗

This pull request fixes #5594

Describe Your Solution 🔧

For case

def filter_func(iterator):
                for pdf in iterator:
                    yield pdf[pdf.id == 1]

df = spark.read.table("test_mapinpandas")
execute_result = df.mapInPandas(filter_func, df.schema).show()

The logical plan is

GlobalLimit 21
+- LocalLimit 21
   +- Project [cast(id#5 as string) AS id#11, name#6]
      +- MapInPandas filter_func(id#0, name#1), [id#5, name#6]
         +- HiveTableRelation [`default`.`test_mapinpandas`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, Data Cols: [id#0, name#1], Partition Cols: []]

When handle MapInPandas, we didn't match its input with HiveTableRelation, cause we miss input table's columns. This pr fix this

In this pr, we remove the branch of each project such as Project, Aggregate etc, handle it together.

Types of changes 🔖

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Test Plan 🧪

Behavior Without This Pull Request ⚰️

For case

def filter_func(iterator):
                for pdf in iterator:
                    yield pdf[pdf.id == 1]

df = spark.read.table("test_mapinpandas")
execute_result = df.mapInPandas(filter_func, df.schema).show()

We miss column info of table test_mapinpandas

Behavior With This Pull Request 🎉

We got privilege object of table test_mapinpandas with it's column info.

Related Unit Tests


Checklists

📝 Author Self Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • This patch was not authored or co-authored using Generative Tooling

📝 Committer Pre-Merge Checklist

  • Pull request title is okay.
  • No license issues.
  • Milestone correctly set?
  • Test coverage is ok
  • Assignees are selected.
  • Minimum number of approvals
  • No changes are requested

Be nice. Be informative.

@codecov-commenter
Copy link

codecov-commenter commented Dec 1, 2023

Codecov Report

Attention: 8 lines in your changes are missing coverage. Please review.

Comparison is base (777b784) 61.12% compared to head (3d69997) 61.02%.

❗ Current head 3d69997 differs from pull request most recent head e085455. Consider uploading reports for the commit e085455 to get more accurate results

Files Patch % Lines
.../kyuubi/plugin/spark/authz/PrivilegesBuilder.scala 73.33% 7 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master    #5787      +/-   ##
============================================
- Coverage     61.12%   61.02%   -0.10%     
  Complexity       23       23              
============================================
  Files           623      623              
  Lines         37060    37062       +2     
  Branches       5024     5021       -3     
============================================
- Hits          22652    22617      -35     
- Misses        11965    11995      +30     
- Partials       2443     2450       +7     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor Author

@AngersZhuuuu AngersZhuuuu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this pr can be reviewed again since I have running this in our prod for more then two weeks. cc @yaooqinn

@@ -127,7 +127,7 @@ abstract class V2CommandsPrivilegesSuite extends PrivilegesBuilderSuite {
assert(po0.catalog.isEmpty)
assertEqualsIgnoreCase(reusedDb)(po0.dbname)
assertEqualsIgnoreCase(reusedTableShort)(po0.objectName)
assert(po0.columns.take(2) === Seq("key", "value"))
assert(po0.columns === Seq("a", "key", "value"))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added by test test("AlterTableAddColumnsCommand") cc @yaooqinn

@@ -186,7 +186,7 @@ abstract class V2CommandsPrivilegesSuite extends PrivilegesBuilderSuite {
assert(po0.catalog.isEmpty)
assertEqualsIgnoreCase(reusedDb)(po0.dbname)
assertEqualsIgnoreCase(reusedTableShort)(po0.objectName)
assert(po0.columns.take(2) === Seq("key", "value"))
assert(po0.columns === Seq("a", "key", "value"))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

@AngersZhuuuu
Copy link
Contributor Author

ping @yaooqinn

Copy link
Member

@yaooqinn yaooqinn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@pan3793 pan3793 added this to the v1.9.0 milestone Feb 4, 2024
@pan3793 pan3793 closed this in f67140e Feb 4, 2024
@pan3793
Copy link
Member

pan3793 commented Feb 4, 2024

Thanks, merged to master

zhaohehuhu pushed a commit to zhaohehuhu/incubator-kyuubi that referenced this pull request Feb 5, 2024
…nput

# 🔍 Description
## Issue References 🔗

This pull request fixes apache#5594

## Describe Your Solution 🔧

For case
```
def filter_func(iterator):
                for pdf in iterator:
                    yield pdf[pdf.id == 1]

df = spark.read.table("test_mapinpandas")
execute_result = df.mapInPandas(filter_func, df.schema).show()
```

The logical plan is
```
GlobalLimit 21
+- LocalLimit 21
   +- Project [cast(id#5 as string) AS id#11, name#6]
      +- MapInPandas filter_func(id#0, name#1), [id#5, name#6]
         +- HiveTableRelation [`default`.`test_mapinpandas`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, Data Cols: [id#0, name#1], Partition Cols: []]
```
When handle `MapInPandas`, we didn't match its  input with `HiveTableRelation`, cause we miss input table's  columns. This pr fix this

In this pr, we remove the branch of each project such as `Project`, `Aggregate` etc, handle it together.

## Types of changes 🔖

- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️
For case
```
def filter_func(iterator):
                for pdf in iterator:
                    yield pdf[pdf.id == 1]

df = spark.read.table("test_mapinpandas")
execute_result = df.mapInPandas(filter_func, df.schema).show()
```

We miss column info of table `test_mapinpandas`

#### Behavior With This Pull Request 🎉
We got privilege object  of table `test_mapinpandas` with it's column info.

#### Related Unit Tests

---

# Checklists
## 📝 Author Self Checklist

- [x] My code follows the [style guidelines](https://kyuubi.readthedocs.io/en/master/contributing/code/style.html) of this project
- [x] I have performed a self-review
- [x] I have commented my code, particularly in hard-to-understand areas
- [x] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my feature works
- [x] New and existing unit tests pass locally with my changes
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

## 📝 Committer Pre-Merge Checklist

- [x] Pull request title is okay.
- [x] No license issues.
- [x] Milestone correctly set?
- [x] Test coverage is ok
- [x] Assignees are selected.
- [x] Minimum number of approvals
- [x] No changes are requested

**Be nice. Be informative.**

Closes apache#5787 from AngersZhuuuu/KYUUBI-5594-approach2.

Closes apache#5594

e085455 [Angerszhuuuu] Update RangerSparkExtensionSuite.scala
49f09fb [Angerszhuuuu] Update RangerSparkExtensionSuite.scala
4781f75 [Angerszhuuuu] Update PrivilegesBuilderSuite.scala
9e9208d [Angerszhuuuu] Update V2JdbcTableCatalogRangerSparkExtensionSuite.scala
626d3dd [Angerszhuuuu] Update RangerSparkExtensionSuite.scala
3d69997 [Angerszhuuuu] Update PrivilegesBuilderSuite.scala
6eb4b8e [Angerszhuuuu] Update RangerSparkExtensionSuite.scala
61efb8a [Angerszhuuuu] update
794ebb7 [Angerszhuuuu] Merge branch 'master' into KYUUBI-5594-approach2
a236da8 [Angerszhuuuu] Update PrivilegesBuilderSuite.scala
74bd3f4 [Angerszhuuuu] Update RangerSparkExtensionSuite.scala
4acbc42 [Angerszhuuuu] Merge branch 'KYUUBI-5594-approach2' of https://github.com/AngersZhuuuu/incubator-kyuubi into KYUUBI-5594-approach2
266f7e8 [Angerszhuuuu] update
a6c7845 [Angerszhuuuu] Update PrivilegesBuilder.scala
d785d5f [Angerszhuuuu] Merge branch 'master' into KYUUBI-5594-approach2
014ef3b [Angerszhuuuu] Update PrivilegesBuilder.scala
7e1cd37 [Angerszhuuuu] Merge branch 'master' into KYUUBI-5594-approach2
71d2661 [Angerszhuuuu] update
db95941 [Angerszhuuuu] update
490eb95 [Angerszhuuuu] update
70d110e [Angerszhuuuu] Merge branch 'master' into KYUUBI-5594-approach2
e6a5877 [Angerszhuuuu] Update PrivilegesBuilder.scala
5ff22b1 [Angerszhuuuu] Update PrivilegesBuilder.scala
e684301 [Angerszhuuuu] Update PrivilegesBuilder.scala
594b202 [Angerszhuuuu] Update PrivilegesBuilder.scala
2f87c61 [Angerszhuuuu] Update RangerSparkExtensionSuite.scala
1de8c1c [Angerszhuuuu] Update PrivilegesBuilder.scala
ad17255 [Angerszhuuuu] Update PrivilegesBuilderSuite.scala
4f5e850 [Angerszhuuuu] update
64349ed [Angerszhuuuu] Update PrivilegesBuilder.scala
11b7a4c [Angerszhuuuu] Update PrivilegesBuilder.scala
9a58fb0 [Angerszhuuuu] update
d0b022e [Angerszhuuuu] Update RuleApplyPermanentViewMarker.scala
e0f28a6 [Angerszhuuuu] Merge branch 'master' into KYUUBI-5594
0ebdd5d [Angerszhuuuu] Merge branch 'master' into KYUUBI-5594
8e53236 [Angerszhuuuu] update
3bafa7c [Angerszhuuuu] update
d6e984e [Angerszhuuuu] update
b00bf5e [Angerszhuuuu] Update PrivilegesBuilder.scala
8214228 [Angerszhuuuu] update
93fc689 [Angerszhuuuu] Merge branch 'master' into KYUUBI-5594
04184e3 [Angerszhuuuu] update
0bb7624 [Angerszhuuuu] Revert "Revert "Update PrivilegesBuilder.scala""
f481283 [Angerszhuuuu] Revert "Update PrivilegesBuilder.scala"
9f87182 [Angerszhuuuu] Revert "Update PrivilegesBuilder.scala"
29b67c4 [Angerszhuuuu] Update PrivilegesBuilder.scala
8785ad1 [Angerszhuuuu] Update PrivilegesBuilder.scala
270f21d [Angerszhuuuu] Update RangerSparkExtensionSuite.scala
60872ef [Angerszhuuuu] Update RangerSparkExtensionSuite.scala
c34f32e [Angerszhuuuu] Merge branch 'master' into KYUUBI-5594
86fc475 [Angerszhuuuu] Update PrivilegesBuilder.scala
404f1ea [Angerszhuuuu] Update PrivilegesBuilder.scala
dcca394 [Angerszhuuuu] Update PrivilegesBuilder.scala
c2c6fa4 [Angerszhuuuu] Update PrivilegesBuilder.scala
6f6a36e [Angerszhuuuu] Merge branch 'master' into KYUUBI-5594]-AUTH]BuildQuery-should-respect-normal-node's-input
4dd47a1 [Angerszhuuuu] update
c549b6a [Angerszhuuuu] update
80013b9 [Angerszhuuuu] Update PrivilegesBuilder.scala
3cbba42 [Angerszhuuuu] Update PrivilegesBuilder.scala

Authored-by: Angerszhuuuu <[email protected]>
Signed-off-by: Cheng Pan <[email protected]>
zhaohehuhu pushed a commit to zhaohehuhu/incubator-kyuubi that referenced this pull request Mar 21, 2024
…nput

# 🔍 Description
## Issue References 🔗

This pull request fixes apache#5594

## Describe Your Solution 🔧

For case
```
def filter_func(iterator):
                for pdf in iterator:
                    yield pdf[pdf.id == 1]

df = spark.read.table("test_mapinpandas")
execute_result = df.mapInPandas(filter_func, df.schema).show()
```

The logical plan is
```
GlobalLimit 21
+- LocalLimit 21
   +- Project [cast(id#5 as string) AS id#11, name#6]
      +- MapInPandas filter_func(id#0, name#1), [id#5, name#6]
         +- HiveTableRelation [`default`.`test_mapinpandas`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, Data Cols: [id#0, name#1], Partition Cols: []]
```
When handle `MapInPandas`, we didn't match its  input with `HiveTableRelation`, cause we miss input table's  columns. This pr fix this

In this pr, we remove the branch of each project such as `Project`, `Aggregate` etc, handle it together.

## Types of changes 🔖

- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️
For case
```
def filter_func(iterator):
                for pdf in iterator:
                    yield pdf[pdf.id == 1]

df = spark.read.table("test_mapinpandas")
execute_result = df.mapInPandas(filter_func, df.schema).show()
```

We miss column info of table `test_mapinpandas`

#### Behavior With This Pull Request 🎉
We got privilege object  of table `test_mapinpandas` with it's column info.

#### Related Unit Tests

---

# Checklists
## 📝 Author Self Checklist

- [x] My code follows the [style guidelines](https://kyuubi.readthedocs.io/en/master/contributing/code/style.html) of this project
- [x] I have performed a self-review
- [x] I have commented my code, particularly in hard-to-understand areas
- [x] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my feature works
- [x] New and existing unit tests pass locally with my changes
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

## 📝 Committer Pre-Merge Checklist

- [x] Pull request title is okay.
- [x] No license issues.
- [x] Milestone correctly set?
- [x] Test coverage is ok
- [x] Assignees are selected.
- [x] Minimum number of approvals
- [x] No changes are requested

**Be nice. Be informative.**

Closes apache#5787 from AngersZhuuuu/KYUUBI-5594-approach2.

Closes apache#5594

e085455 [Angerszhuuuu] Update RangerSparkExtensionSuite.scala
49f09fb [Angerszhuuuu] Update RangerSparkExtensionSuite.scala
4781f75 [Angerszhuuuu] Update PrivilegesBuilderSuite.scala
9e9208d [Angerszhuuuu] Update V2JdbcTableCatalogRangerSparkExtensionSuite.scala
626d3dd [Angerszhuuuu] Update RangerSparkExtensionSuite.scala
3d69997 [Angerszhuuuu] Update PrivilegesBuilderSuite.scala
6eb4b8e [Angerszhuuuu] Update RangerSparkExtensionSuite.scala
61efb8a [Angerszhuuuu] update
794ebb7 [Angerszhuuuu] Merge branch 'master' into KYUUBI-5594-approach2
a236da8 [Angerszhuuuu] Update PrivilegesBuilderSuite.scala
74bd3f4 [Angerszhuuuu] Update RangerSparkExtensionSuite.scala
4acbc42 [Angerszhuuuu] Merge branch 'KYUUBI-5594-approach2' of https://github.com/AngersZhuuuu/incubator-kyuubi into KYUUBI-5594-approach2
266f7e8 [Angerszhuuuu] update
a6c7845 [Angerszhuuuu] Update PrivilegesBuilder.scala
d785d5f [Angerszhuuuu] Merge branch 'master' into KYUUBI-5594-approach2
014ef3b [Angerszhuuuu] Update PrivilegesBuilder.scala
7e1cd37 [Angerszhuuuu] Merge branch 'master' into KYUUBI-5594-approach2
71d2661 [Angerszhuuuu] update
db95941 [Angerszhuuuu] update
490eb95 [Angerszhuuuu] update
70d110e [Angerszhuuuu] Merge branch 'master' into KYUUBI-5594-approach2
e6a5877 [Angerszhuuuu] Update PrivilegesBuilder.scala
5ff22b1 [Angerszhuuuu] Update PrivilegesBuilder.scala
e684301 [Angerszhuuuu] Update PrivilegesBuilder.scala
594b202 [Angerszhuuuu] Update PrivilegesBuilder.scala
2f87c61 [Angerszhuuuu] Update RangerSparkExtensionSuite.scala
1de8c1c [Angerszhuuuu] Update PrivilegesBuilder.scala
ad17255 [Angerszhuuuu] Update PrivilegesBuilderSuite.scala
4f5e850 [Angerszhuuuu] update
64349ed [Angerszhuuuu] Update PrivilegesBuilder.scala
11b7a4c [Angerszhuuuu] Update PrivilegesBuilder.scala
9a58fb0 [Angerszhuuuu] update
d0b022e [Angerszhuuuu] Update RuleApplyPermanentViewMarker.scala
e0f28a6 [Angerszhuuuu] Merge branch 'master' into KYUUBI-5594
0ebdd5d [Angerszhuuuu] Merge branch 'master' into KYUUBI-5594
8e53236 [Angerszhuuuu] update
3bafa7c [Angerszhuuuu] update
d6e984e [Angerszhuuuu] update
b00bf5e [Angerszhuuuu] Update PrivilegesBuilder.scala
8214228 [Angerszhuuuu] update
93fc689 [Angerszhuuuu] Merge branch 'master' into KYUUBI-5594
04184e3 [Angerszhuuuu] update
0bb7624 [Angerszhuuuu] Revert "Revert "Update PrivilegesBuilder.scala""
f481283 [Angerszhuuuu] Revert "Update PrivilegesBuilder.scala"
9f87182 [Angerszhuuuu] Revert "Update PrivilegesBuilder.scala"
29b67c4 [Angerszhuuuu] Update PrivilegesBuilder.scala
8785ad1 [Angerszhuuuu] Update PrivilegesBuilder.scala
270f21d [Angerszhuuuu] Update RangerSparkExtensionSuite.scala
60872ef [Angerszhuuuu] Update RangerSparkExtensionSuite.scala
c34f32e [Angerszhuuuu] Merge branch 'master' into KYUUBI-5594
86fc475 [Angerszhuuuu] Update PrivilegesBuilder.scala
404f1ea [Angerszhuuuu] Update PrivilegesBuilder.scala
dcca394 [Angerszhuuuu] Update PrivilegesBuilder.scala
c2c6fa4 [Angerszhuuuu] Update PrivilegesBuilder.scala
6f6a36e [Angerszhuuuu] Merge branch 'master' into KYUUBI-5594]-AUTH]BuildQuery-should-respect-normal-node's-input
4dd47a1 [Angerszhuuuu] update
c549b6a [Angerszhuuuu] update
80013b9 [Angerszhuuuu] Update PrivilegesBuilder.scala
3cbba42 [Angerszhuuuu] Update PrivilegesBuilder.scala

Authored-by: Angerszhuuuu <[email protected]>
Signed-off-by: Cheng Pan <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug] BuildQuery should respect normal node's input
4 participants