[#4545] improvement(paimon-catalog): reduce catalog-lakehouse-paimon libs size from 222MB to 75MB #4547

Merged
26 commits
e71adea
[#4545] improvement(paimon-catalog): reduce catalog-lakehouse-paimon …
LiuQhahah Aug 15, 2024
a4997aa
Merge branch 'main' into #4545-Shrink-Paimon-catalog-binary-package-size
LiuQhahah Aug 15, 2024
345cdee
Merge branch 'main' into #4545-Shrink-Paimon-catalog-binary-package-size
LiuQhahah Aug 18, 2024
02316fe
#4545 rollback common module name and reduce the size of calculateDep…
LiuQhahah Aug 18, 2024
de60d9a
Merge remote-tracking branch 'origin/#4545-Shrink-Paimon-catalog-bina…
LiuQhahah Aug 18, 2024
e9e9f7a
#4591 remove task "Calculates the total size of all dependencies"
LiuQhahah Aug 27, 2024
8ff7e63
Merge branch 'main' into #4545-Shrink-Paimon-catalog-binary-package-size
LiuQhahah Aug 27, 2024
d85c4bd
Merge branch 'main' into #4545-Shrink-Paimon-catalog-binary-package-size
LiuQhahah Aug 27, 2024
3367c69
#4545 remove unused lib
LiuQhahah Aug 31, 2024
8627cde
Merge branch 'main' into #4545-Shrink-Paimon-catalog-binary-package-size
LiuQhahah Aug 31, 2024
888e445
Merge remote-tracking branch 'origin/main' into #4545-Shrink-Paimon-c…
LiuQhahah Sep 4, 2024
c90138c
#4545 remove unused lib
LiuQhahah Sep 4, 2024
dccd5a0
Merge remote-tracking branch 'origin/#4545-Shrink-Paimon-catalog-bina…
LiuQhahah Sep 4, 2024
3022294
Merge branch 'main' into #4545-Shrink-Paimon-catalog-binary-package-size
LiuQhahah Sep 4, 2024
b8924d9
Merge branch 'main' into #4545-Shrink-Paimon-catalog-binary-package-size
LiuQhahah Sep 5, 2024
26c7557
Merge branch 'main' into #4545-Shrink-Paimon-catalog-binary-package-size
LiuQhahah Sep 5, 2024
7bd4f16
Merge branch 'main' into #4545-Shrink-Paimon-catalog-binary-package-size
LiuQhahah Sep 8, 2024
f16d54b
#4270 exclude org.mortbay.jetty
LiuQhahah Sep 9, 2024
fe92499
Merge remote-tracking branch 'origin/main' into #4545-Shrink-Paimon-c…
LiuQhahah Sep 9, 2024
c889962
Merge branch 'main' into #4545-Shrink-Paimon-catalog-binary-package-size
LiuQhahah Sep 18, 2024
4d9417f
#4545 rollback implementation(libs.hadoop2.mapreduce.client.core) , r…
LiuQhahah Sep 18, 2024
84a4e39
continue shrink
FANNG1 Sep 23, 2024
5476118
continue shrink
FANNG1 Sep 23, 2024
ad178c5
add hdfs client
FANNG1 Sep 23, 2024
e895a3d
remove log4j
FANNG1 Sep 23, 2024
d316a76
add exclude
FANNG1 Sep 27, 2024
62 changes: 53 additions & 9 deletions catalogs/catalog-lakehouse-paimon/build.gradle.kts
@@ -30,40 +30,80 @@ val sparkMajorVersion: String = sparkVersion.substringBeforeLast(".")
val paimonVersion: String = libs.versions.paimon.get()

dependencies {
implementation(project(":api"))
implementation(project(":common"))
implementation(project(":core"))
implementation(project(":api")) {
exclude("*")
}
implementation(project(":common")) {
exclude("*")
}
implementation(project(":core")) {
exclude("*")
}
implementation(libs.bundles.paimon) {
exclude("com.sun.jersey")
exclude("javax.servlet")
exclude("org.apache.curator")
exclude("org.apache.hive")
exclude("org.apache.hbase")
exclude("org.apache.zookeeper")
exclude("org.eclipse.jetty.aggregate:jetty-all")
exclude("org.mortbay.jetty")
exclude("org.mortbay.jetty:jetty")
exclude("org.mortbay.jetty:jetty-util")
exclude("org.mortbay.jetty:jetty-sslengine")
exclude("it.unimi.dsi")
exclude("com.ververica")
exclude("org.apache.hadoop")
exclude("org.apache.commons")
exclude("org.xerial.snappy")
exclude("com.github.luben")
exclude("com.google.protobuf")
exclude("joda-time")
exclude("org.apache.parquet:parquet-jackson")
exclude("org.apache.parquet:parquet-format-structures")
exclude("org.apache.parquet:parquet-encoding")
exclude("org.apache.parquet:parquet-common")
exclude("org.apache.parquet:parquet-hadoop")
exclude("org.apache.paimon:paimon-codegen-loader")
exclude("org.apache.paimon:paimon-shade-caffeine-2")
exclude("org.apache.paimon:paimon-shade-guava-30")
}
implementation(libs.bundles.log4j)
implementation(libs.commons.lang3)
implementation(libs.caffeine)
implementation(libs.guava)
implementation(libs.hadoop2.common) {
exclude("com.github.spotbugs")
exclude("com.sun.jersey")
exclude("javax.servlet")
exclude("org.apache.curator")
exclude("org.apache.zookeeper")
exclude("org.mortbay.jetty")
}
implementation(libs.hadoop2.hdfs) {
@caican00 caican00 Sep 9, 2024

@LiuQhahah The Gravitino Paimon catalog supports FilesystemCatalog as its backend catalog, so we cannot remove the HDFS dependency.

@caican00 caican00 Sep 9, 2024

Also, could you please run a basic test locally, for example confirming that the build succeeds and the unit tests pass? Thanks!

exclude("*")
}
implementation(libs.hadoop2.hdfs.client) {
exclude("com.sun.jersey")
exclude("javax.servlet")
exclude("org.fusesource.leveldbjni")
exclude("org.mortbay.jetty")
}
implementation(libs.hadoop2.mapreduce.client.core) {
@caican00 caican00 Sep 15, 2024

@LiuQhahah This dependency is required by the Spark integration tests, so please roll it back, and then update the PR title.

Hi @LiuQhahah, do you have time to fix this?

exclude("com.sun.jersey")
exclude("javax.servlet")
exclude("*")
}

annotationProcessor(libs.lombok)
compileOnly(libs.lombok)

testImplementation(project(":clients:client-java"))
testImplementation(project(":integration-test-common", "testArtifacts"))
testImplementation(project(":server"))
testImplementation(project(":server-common"))
testImplementation(project(":server-common")) {
exclude("org.mortbay.jetty")
exclude("com.sun.jersey.contribs")
}
testImplementation("org.apache.spark:spark-hive_$scalaVersion:$sparkVersion") {
exclude("org.apache.hadoop")
exclude("org.rocksdb")
}
testImplementation("org.apache.spark:spark-sql_$scalaVersion:$sparkVersion") {
exclude("org.apache.avro")
@@ -94,7 +134,11 @@ tasks {

val copyCatalogLibs by registering(Copy::class) {
dependsOn("jar", "runtimeJars")
from("build/libs")
from("build/libs") {
exclude("guava-*.jar")
exclude("log4j-*.jar")
exclude("slf4j-*.jar")
}
into("$rootDir/distribution/package/catalogs/lakehouse-paimon/libs")
}
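
To check the effect of the `copyCatalogLibs` exclusions locally, the copied jars can be summed after the task runs. The following is a minimal sketch and not part of this PR: the task name `reportPaimonLibsSize` is hypothetical, and it assumes the same output directory that `copyCatalogLibs` writes to above.

```kotlin
// Hypothetical helper task for build.gradle.kts; not part of the merged change.
tasks.register("reportPaimonLibsSize") {
  // Run only after the jars have been copied into the distribution directory.
  dependsOn("copyCatalogLibs")

  // Same target directory as the copyCatalogLibs task in the diff above.
  val libsDir = file("$rootDir/distribution/package/catalogs/lakehouse-paimon/libs")

  doLast {
    // listFiles returns null when the directory does not exist, hence orEmpty().
    val jars = libsDir.listFiles { f -> f.isFile && f.extension == "jar" }.orEmpty()
    val totalBytes = jars.map { it.length() }.sum()
    println("lakehouse-paimon libs: ${jars.size} jars, ${totalBytes / (1024 * 1024)} MB")

    // List the largest jars first to show where further shrinking may pay off.
    jars.sortedByDescending { it.length() }.take(10).forEach { jar ->
      println("  ${jar.length() / 1024} KB  ${jar.name}")
    }
  }
}
```

Assuming the Gradle project path mirrors the directory layout, `./gradlew :catalogs:catalog-lakehouse-paimon:reportPaimonLibsSize` would print a total that can be compared against the sizes quoted in the PR title.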

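
Several exclusions in the `dependencies` block above pass a `group:module` string as a single argument, for example `exclude("org.mortbay.jetty:jetty")`. As far as I can tell, the Kotlin DSL `exclude` extension takes `group` and `module` as separate named parameters, so a single positional string is likely read as a group name only. Below is a hedged sketch of the explicit form. It reuses coordinates from the diff and is not the PR's actual code.

```kotlin
// Sketch of a build.gradle.kts fragment; libs.bundles.paimon is the catalog alias used in the diff.
dependencies {
  implementation(libs.bundles.paimon) {
    // Group-only exclusion: drops every module published under this group.
    exclude(group = "org.apache.hive")

    // Module-level exclusion: group and module are passed as separate arguments.
    exclude(group = "org.apache.parquet", module = "parquet-hadoop")
    exclude(group = "org.apache.paimon", module = "paimon-codegen-loader")
  }
}
```

Whether a given exclusion takes effect can be inspected with `./gradlew dependencies --configuration runtimeClasspath` on the module.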