Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[KYUUBI #5594][AUTHZ] BuildQuery should respect normal node's input #5595

Conversation

AngersZhuuuu
Copy link
Contributor

@AngersZhuuuu AngersZhuuuu commented Nov 1, 2023

Why are the changes needed?

To close #5594
For case

def filter_func(iterator):
                for pdf in iterator:
                    yield pdf[pdf.id == 1]

df = spark.read.table("test_mapinpandas")
execute_result = df.mapInPandas(filter_func, df.schema).show()

The logical plan is

GlobalLimit 21
+- LocalLimit 21
   +- Project [cast(id#5 as string) AS id#11, name#6]
      +- MapInPandas filter_func(id#0, name#1), [id#5, name#6]
         +- HiveTableRelation [`default`.`test_mapinpandas`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, Data Cols: [id#0, name#1], Partition Cols: []]

When handle MapInPandas, we didn't match its input with HiveTableRelation, cause we miss input table's columns. This pr fix this

How was this patch tested?

  • Add some test cases that check the changes thoroughly including negative and positive cases if possible

  • Add screenshots for manual tests if appropriate

  • Run test locally before make a pull request

With this patch, it can extract the correct privilege object now
截屏2023-11-02 上午10 25 38

Was this patch authored or co-authored using generative AI tooling?

@codecov-commenter
Copy link

codecov-commenter commented Nov 1, 2023

Codecov Report

Merging #5595 (86fc475) into master (27d845c) will decrease coverage by 0.22%.
Report is 7 commits behind head on master.
The diff coverage is 94.44%.

@@             Coverage Diff              @@
##             master    #5595      +/-   ##
============================================
- Coverage     61.67%   61.45%   -0.22%     
  Complexity       23       23              
============================================
  Files           603      603              
  Lines         35670    35684      +14     
  Branches       4869     4872       +3     
============================================
- Hits          21999    21930      -69     
- Misses        11291    11381      +90     
+ Partials       2380     2373       -7     
Files Coverage Δ
.../kyuubi/plugin/spark/authz/PrivilegesBuilder.scala 92.19% <94.44%> (+1.50%) ⬆️

... and 28 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@AngersZhuuuu
Copy link
Contributor Author

gentle ping @yaooqinn This is a common issue of authz, pls help check

@yaooqinn
Copy link
Member

yaooqinn commented Nov 3, 2023

How about we add a unit test for this issue?

@yaooqinn
Copy link
Member

yaooqinn commented Nov 3, 2023

Does this issue have a simple solution that we collect leaves for case p: Project =>

@AngersZhuuuu
Copy link
Contributor Author

How about we add a unit test for this issue?

Add a UT, the plan is


GlobalLimit 1
+- LocalLimit 1
   +- Project [cast(id#22 as string) AS id#30, cast(scope#23 as string) AS scope#31]
      +- MapInPandas mapInPandasUDF(id#17, scope#18)#21, [id#22, scope#23]
         +- HiveTableRelation [`spark_catalog`.`default`.`table1`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, Data Cols: [id#17, scope#18], Partition Cols: []]

Without this pr, this query only extract table, miss column privilege

"Permission denied: user [someone] does not have [select] privilege on [default/table1]" did not contain "does not have [select] privilege on [default/table1/id,default/table1/scope]"
ScalaTestFailureLocation: org.apache.kyuubi.plugin.spark.authz.ranger.HiveCatalogRangerSparkExtensionSuite at (RangerSparkExtensionSuite.scala:1125)
org.scalatest.exceptions.TestFailedException: "Permission denied: user [someone] does not have [select] privilege on [default/table1]" did not contain "does not have [select] privilege on [default/table1/id,default/table1/scope]"
	at org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:472)

@AngersZhuuuu
Copy link
Contributor Author

Since local branch name changed, can't push to this pr's branch, move to #5627 close this one

@AngersZhuuuu
Copy link
Contributor Author

Does this issue have a simple solution that we collect leaves for case p: Project =>

Don't think so, it's not project leave's issue.

Change to this
截屏2023-11-06 下午12 01 08
Won't work for the ut

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug] BuildQuery should respect normal node's input
3 participants