Skip to content
This repository has been archived by the owner on Sep 18, 2023. It is now read-only.

[NSE-273] Spark shim layer infrastructure #361

Merged
merged 4 commits into from
Jun 14, 2021
Merged

[NSE-273] Spark shim layer infrastructure #361

merged 4 commits into from
Jun 14, 2021

Conversation

jerrychenhf
Copy link
Contributor

What changes were proposed in this pull request?

We implement Spark shim layer infrastructure for defining the common Spark shim interface, implementing shims for specific Spark versions, and the mechanisms to load the proper shim layer based on Spark versions.

How was this patch tested?

Unit test code that tests the loading of a right shim implementation based on the current Spark version.

@github-actions
Copy link

#273

Copy link
Collaborator

@zhouyuan zhouyuan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the common layer is good to me
i think we should target for 3.1.2/3.2.0 in the shim layer

@@ -45,11 +45,15 @@
<arrow.install.dir>${arrow.script.dir}/build/arrow_install</arrow.install.dir>
<arrow_root>/usr/local</arrow_root>
<build_protobuf>ON</build_protobuf>
<project.prefix>spark-sql-columnar</project.prefix>
<project.name.prefix>OAP Project Spark Columnar Plugin</project.name.prefix>
<spark311.version>3.1.1</spark311.version>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should already supported 3.1.1, should use 3.1.2 or 3.2.0-snapshot?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. I put the initial shim layer for Spark 3.1.1 here because when you need to add a shim for Spark 3.1.2, that is for the reason Spark 3.1.2 is different on at least one api aspect with Spark 3.1.1. At that time you will need to create a shim for both Spark 3.1.1 and Spark 3.1.2. So Spark 3.1.1 shim will be always needed when you have a need to create a shim for another newer version.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course, Spark 3.1.1 layer will also act as a template when you need to add a new one.

<!-- dependencies are always listed in sorted order by groupId, artifactId -->
<dependency>
<groupId>com.intel.oap.</groupId>
<artifactId>${project.prefix}-shims-spark311</artifactId>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it looks like this is not implemented?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a default implementation (empty implementation) now. So you have no shim API to call other than getShimDescriptor. The developer team will need to identify add needed API to common/SparkShims interface and implement in spark311 shims.

There is typo there, "com.intel.oap." -> com.intel.oap". I will correct.

@zhouyuan
Copy link
Collaborator

@jerrychenhf
the aggregator does not compile in my local test. i'll push a change to remove this to merge it quickly and bring it back if necessary

[INFO] OAP Project Spark Columnar Plugin Shims Aggregator . FAILURE [  0.017 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  03:18 min
[INFO] Finished at: 2021-06-11T11:46:48+08:00
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal on project spark-sql-columnar-shims-aggregator: Could not resolve dependencies for project com.intel.oap:spark-sql-columnar-shims-aggregator:jar:1.2.0-snapshot: Failure to find com.intel.oap.:spark-sql-columnar-shims-spark311:jar:1.2.0-snapshot in http://maven.aliyun.com/nexus/content/groups/public/ was cached in the local repository, resolution will not be reattempted until the update interval of alimaven has elapsed or updates are forced -> [Help 1]

@zhouyuan zhouyuan merged commit e9d1859 into oap-project:master Jun 14, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants