From 57a028a73acba18a08f9c8628faf144c08454f5d Mon Sep 17 00:00:00 2001
From: Gabor Roczei <roczei@cloudera.com>
Date: Sat, 17 Aug 2024 00:04:54 +0200
Subject: [PATCH] [SPARK-48867][BUILD] Upgrade `okhttp` to 4.12.0 and `okio` to
 3.9.0

### What changes were proposed in this pull request?

This PR aims to upgrade `okhttp` to 4.12.0 and `okio` to 3.9.0

### Why are the changes needed?

okhttp depends on okio which has to be upgraded as well.
The new okhttp version fixes the following vulnerabilities:

1)

CVE-2023-0833 - A flaw was found in Red Hat's AMQ-Streams,
which ships a version of the OKHttp component with an
information disclosure flaw via an exception triggered
by a header containing an illegal value. This issue could allow
an authenticated attacker to access information outside of their
regular permissions.

CVSSv3 Score:- 5.5(Medium)

https://nvd.nist.gov/vuln/detail/CVE-2023-0833

2)

CVE-2021-0341 - In verifyHostName of OkHostnameVerifier.java,
there is a possible way to accept a certificate for the wrong
domain due to improperly used crypto. This could lead to remote
information disclosure with no additional execution privileges
needed. User interaction is not needed for exploitation.

CVSSv3 Score:- 7.5(High)

https://nvd.nist.gov/vuln/detail/CVE-2021-0341
https://github.com/square/okhttp/issues/6724

There are two places in the Spark repository where the okhttp dependency comes
in as transitive dependency:

1)

[INFO] +- org.apache.hadoop:hadoop-cloud-storage:jar:3.4.0:compile
[INFO] |  +- org.apache.hadoop:hadoop-annotations:jar:3.4.0:compile
[INFO] |  +- org.apache.hadoop:hadoop-aliyun:jar:3.4.0:compile
[INFO] |  |  +- com.aliyun.oss:aliyun-sdk-oss:jar:3.13.2:compile
[INFO] |  |  |  +- org.jdom:jdom2:jar:2.0.6:compile
[INFO] |  |  |  +- com.aliyun:aliyun-java-sdk-core:jar:4.5.10:compile
[INFO] |  |  |  |  +- org.ini4j:ini4j:jar:0.5.4:compile
[INFO] |  |  |  |  +- io.opentracing:opentracing-api:jar:0.33.0:compile
[INFO] |  |  |  |  \- io.opentracing:opentracing-util:jar:0.33.0:compile
[INFO] |  |  |  |     \- io.opentracing:opentracing-noop:jar:0.33.0:compile
[INFO] |  |  |  +- com.aliyun:aliyun-java-sdk-ram:jar:3.1.0:compile
[INFO] |  |  |  \- com.aliyun:aliyun-java-sdk-kms:jar:2.11.0:compile
[INFO] |  |  \- org.codehaus.jettison:jettison:jar:1.5.4:compile
[INFO] |  +- org.apache.hadoop:hadoop-azure-datalake:jar:3.4.0:compile
[INFO] |  |  \- com.microsoft.azure:azure-data-lake-store-sdk:jar:2.3.9:compile
[INFO] |  \- org.apache.hadoop:hadoop-huaweicloud:jar:3.4.0:compile
[INFO] |     \- com.huaweicloud:esdk-obs-java:jar:3.20.4.2:compile
[INFO] |        +- com.jamesmurty.utils:java-xmlbuilder:jar:1.2:compile
[INFO] |        +- com.squareup.okhttp3:okhttp:jar:3.14.2:compile
[INFO] |        \- com.squareup.okio:okio:jar:1.17.6:compile

The Hadoop team has attempted to remove okhttp from their codebase:

remove okhttp usage: https://issues.apache.org/jira/browse/HADOOP-18890

Unfortunately the hadoop-huaweicloud dependency is still there which
pulls in the vulnerable okhttp 3.x version.

https://github.com/apache/hadoop/blob/trunk/hadoop-cloud-storage-project/hadoop-cloud-storage/pom.xml#L137C19-L137C37

Proposed solution for this:

com.huaweicloud:esdk-obs-java:jar:3.20.4.2 is vulnerable due to
okhttp 3.x (CVE-2023-0833, CVE-2021-0341), it has to be upgraded to 3.24.3
which depends on okhttp 4.12.0

2)

[INFO] +- org.apache.spark:spark-kubernetes_2.13:jar:4.0.0-SNAPSHOT:compile
[INFO] |  +- io.fabric8:kubernetes-httpclient-okhttp:jar:6.13.3:compile
[INFO] |  |  +- io.fabric8:kubernetes-client-api:jar:6.13.3:compile
[INFO] |  |  |  +- io.fabric8:kubernetes-model-core:jar:6.13.3:compile
[INFO] |  |  |  |  \- io.fabric8:kubernetes-model-common:jar:6.13.3:compile
[INFO] |  |  |  +- io.fabric8:kubernetes-model-gatewayapi:jar:6.13.3:compile
[INFO] |  |  |  +- io.fabric8:kubernetes-model-resource:jar:6.13.3:compile
[INFO] |  |  |  +- io.fabric8:kubernetes-model-rbac:jar:6.13.3:compile
[INFO] |  |  |  +- io.fabric8:kubernetes-model-admissionregistration:jar:6.13.3:compile
[INFO] |  |  |  +- io.fabric8:kubernetes-model-apps:jar:6.13.3:compile
[INFO] |  |  |  +- io.fabric8:kubernetes-model-autoscaling:jar:6.13.3:compile
[INFO] |  |  |  +- io.fabric8:kubernetes-model-apiextensions:jar:6.13.3:compile
[INFO] |  |  |  +- io.fabric8:kubernetes-model-batch:jar:6.13.3:compile
[INFO] |  |  |  +- io.fabric8:kubernetes-model-certificates:jar:6.13.3:compile
[INFO] |  |  |  +- io.fabric8:kubernetes-model-coordination:jar:6.13.3:compile
[INFO] |  |  |  +- io.fabric8:kubernetes-model-discovery:jar:6.13.3:compile
[INFO] |  |  |  +- io.fabric8:kubernetes-model-events:jar:6.13.3:compile
[INFO] |  |  |  +- io.fabric8:kubernetes-model-extensions:jar:6.13.3:compile
[INFO] |  |  |  +- io.fabric8:kubernetes-model-flowcontrol:jar:6.13.3:compile
[INFO] |  |  |  +- io.fabric8:kubernetes-model-networking:jar:6.13.3:compile
[INFO] |  |  |  +- io.fabric8:kubernetes-model-metrics:jar:6.13.3:compile
[INFO] |  |  |  +- io.fabric8:kubernetes-model-policy:jar:6.13.3:compile
[INFO] |  |  |  +- io.fabric8:kubernetes-model-scheduling:jar:6.13.3:compile
[INFO] |  |  |  +- io.fabric8:kubernetes-model-storageclass:jar:6.13.3:compile
[INFO] |  |  |  +- io.fabric8:kubernetes-model-node:jar:6.13.3:compile
[INFO] |  |  |  \- org.snakeyaml:snakeyaml-engine:jar:2.7:compile
[INFO] |  |  +- com.squareup.okhttp3:okhttp:jar:3.12.12:compile
[INFO] |  |  |  \- com.squareup.okio:okio:jar:1.17.6:compile
[INFO] |  |  \- com.squareup.okhttp3:logging-interceptor:jar:3.12.12:compile

The kubernetes-client's maintainers do not want upgrade to okhttp 4.x
because it's based on Kotlin, they recommend to exclude 3.x. Related documentation:

https://github.com/fabric8io/kubernetes-client/blob/main/doc/KubernetesClientWithIPv6Clusters.md

My proposed solution based on the above finding:

I think we should exclude the 3.x version and switch to use okhttp 4.x.
It is binary backwards compatible with okhttp 3.x. More details are here:

https://square.github.io/okhttp/upgrading_to_okhttp_4/

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the CIs.

### Was this patch authored or co-authored using generative AI tooling?

No.
---
 dev/deps/spark-deps-hadoop-3-hive-2.3         | 11 +++--
 hadoop-cloud/pom.xml                          | 35 ++++++++++++++
 pom.xml                                       | 10 +++-
 resource-managers/kubernetes/core/pom.xml     | 47 +++++++++++++++++++
 .../kubernetes/integration-tests/pom.xml      | 47 +++++++++++++++++++
 5 files changed, 144 insertions(+), 6 deletions(-)

diff --git a/dev/deps/spark-deps-hadoop-3-hive-2.3 b/dev/deps/spark-deps-hadoop-3-hive-2.3
index c7f4431de5957..8cb591dd9ed72 100644
--- a/dev/deps/spark-deps-hadoop-3-hive-2.3
+++ b/dev/deps/spark-deps-hadoop-3-hive-2.3
@@ -62,7 +62,7 @@ derby/10.16.1.1//derby-10.16.1.1.jar
 derbyshared/10.16.1.1//derbyshared-10.16.1.1.jar
 derbytools/10.16.1.1//derbytools-10.16.1.1.jar
 dropwizard-metrics-hadoop-metrics2-reporter/0.1.2//dropwizard-metrics-hadoop-metrics2-reporter-0.1.2.jar
-esdk-obs-java/3.20.4.2//esdk-obs-java-3.20.4.2.jar
+esdk-obs-java/3.24.3//esdk-obs-java-3.24.3.jar
 flatbuffers-java/24.3.25//flatbuffers-java-24.3.25.jar
 gcs-connector/hadoop3-2.2.21/shaded/gcs-connector-hadoop3-2.2.21-shaded.jar
 gmetric4j/1.0.10//gmetric4j-1.0.10.jar
@@ -119,7 +119,6 @@ jakarta.ws.rs-api/3.0.0//jakarta.ws.rs-api-3.0.0.jar
 jakarta.xml.bind-api/2.3.2//jakarta.xml.bind-api-2.3.2.jar
 janino/3.1.9//janino-3.1.9.jar
 java-diff-utils/4.12//java-diff-utils-4.12.jar
-java-xmlbuilder/1.2//java-xmlbuilder-1.2.jar
 javassist/3.29.2-GA//javassist-3.29.2-GA.jar
 javax.jdo/3.2.0-m3//javax.jdo-3.2.0-m3.jar
 javax.servlet-api/4.0.1//javax.servlet-api-4.0.1.jar
@@ -154,6 +153,7 @@ json4s-scalap_2.13/4.0.7//json4s-scalap_2.13-4.0.7.jar
 jsr305/3.0.0//jsr305-3.0.0.jar
 jta/1.1//jta-1.1.jar
 jul-to-slf4j/2.0.16//jul-to-slf4j-2.0.16.jar
+kotlin-stdlib/2.0.10//kotlin-stdlib-2.0.10.jar
 kryo-shaded/4.0.2//kryo-shaded-4.0.2.jar
 kubernetes-client-api/6.13.3//kubernetes-client-api-6.13.3.jar
 kubernetes-client/6.13.3//kubernetes-client-6.13.3.jar
@@ -189,7 +189,7 @@ log4j-api/2.22.1//log4j-api-2.22.1.jar
 log4j-core/2.22.1//log4j-core-2.22.1.jar
 log4j-layout-template-json/2.22.1//log4j-layout-template-json-2.22.1.jar
 log4j-slf4j2-impl/2.22.1//log4j-slf4j2-impl-2.22.1.jar
-logging-interceptor/3.12.12//logging-interceptor-3.12.12.jar
+logging-interceptor/4.12.0//logging-interceptor-4.12.0.jar
 lz4-java/1.8.0//lz4-java-1.8.0.jar
 metrics-core/4.2.26//metrics-core-4.2.26.jar
 metrics-graphite/4.2.26//metrics-graphite-4.2.26.jar
@@ -223,8 +223,9 @@ netty-transport-native-kqueue/4.1.110.Final/osx-x86_64/netty-transport-native-kq
 netty-transport-native-unix-common/4.1.110.Final//netty-transport-native-unix-common-4.1.110.Final.jar
 netty-transport/4.1.110.Final//netty-transport-4.1.110.Final.jar
 objenesis/3.3//objenesis-3.3.jar
-okhttp/3.12.12//okhttp-3.12.12.jar
-okio/1.17.6//okio-1.17.6.jar
+okhttp/4.12.0//okhttp-4.12.0.jar
+okio-jvm/3.9.0//okio-jvm-3.9.0.jar
+okio/3.9.0//okio-3.9.0.jar
 opencsv/2.3//opencsv-2.3.jar
 opentracing-api/0.33.0//opentracing-api-0.33.0.jar
 opentracing-noop/0.33.0//opentracing-noop-0.33.0.jar
diff --git a/hadoop-cloud/pom.xml b/hadoop-cloud/pom.xml
index 9c2f21e7ab617..f648770c531b3 100644
--- a/hadoop-cloud/pom.xml
+++ b/hadoop-cloud/pom.xml
@@ -171,6 +171,41 @@
           <groupId>org.apache.hadoop</groupId>
           <artifactId>hadoop-cos</artifactId>
         </exclusion>
+        <!--
+          HADOOP-19224 / SPARK-48867: com.huaweicloud:esdk-obs-java:jar:3.20.4.2 is
+          vulnerable due to okhttp 3.x (CVE-2023-0833, CVE-2021-0341),
+          it has to be upgraded to 3.24.3 which depends on okhttp 4.12.0
+        -->
+        <exclusion>
+          <groupId>com.huaweicloud</groupId>
+          <artifactId>esdk-obs-java</artifactId>
+        </exclusion>
+      </exclusions>
+    </dependency>
+    <dependency>
+      <groupId>com.huaweicloud</groupId>
+      <artifactId>esdk-obs-java</artifactId>
+      <version>${esdk.obs.java.version}</version>
+      <exclusions>
+        <exclusion>
+          <groupId>org.jetbrains.kotlin</groupId>
+          <artifactId>kotlin-stdlib-jdk8</artifactId>
+        </exclusion>
+        <exclusion>
+          <groupId>org.jetbrains.kotlin</groupId>
+          <artifactId>kotlin-stdlib</artifactId>
+        </exclusion>
+      </exclusions>
+    </dependency>
+    <dependency>
+      <groupId>org.jetbrains.kotlin</groupId>
+      <artifactId>kotlin-stdlib</artifactId>
+      <version>${kotlin-stdlib.version}</version>
+      <exclusions>
+        <exclusion>
+          <groupId>org.jetbrains</groupId>
+          <artifactId>annotations</artifactId>
+        </exclusion>
       </exclusions>
     </dependency>
     <!--
diff --git a/pom.xml b/pom.xml
index 69aae9d52be1a..36c84fc647eaf 100644
--- a/pom.xml
+++ b/pom.xml
@@ -160,6 +160,12 @@
     <aws.java.sdk.v2.version>2.24.6</aws.java.sdk.v2.version>
     <!-- the producer is used in tests -->
     <aws.kinesis.producer.version>0.12.8</aws.kinesis.producer.version>
+    <!--
+      HADOOP-19224 / SPARK-48867: com.huaweicloud:esdk-obs-java:jar:3.20.4.2 is
+      vulnerable due to okhttp 3.x (CVE-2023-0833, CVE-2021-0341),
+      it has to be upgraded to 3.24.3 which depends on okhttp 4.12.0
+    -->
+    <esdk.obs.java.version>3.24.3</esdk.obs.java.version>
     <!-- Do not use 3.0.0: https://github.com/GoogleCloudDataproc/hadoop-connectors/issues/1114 -->
     <gcs-connector.version>hadoop3-2.2.21</gcs-connector.version>
     <!--  org.apache.httpcomponents/httpclient-->
@@ -231,7 +237,9 @@
     <!-- org.fusesource.leveldbjni will be used except on arm64 platform. -->
     <leveldbjni.group>org.fusesource.leveldbjni</leveldbjni.group>
     <kubernetes-client.version>6.13.3</kubernetes-client.version>
-    <okio.version>1.17.6</okio.version>
+    <okio.version>3.9.0</okio.version>
+    <okhttp.version>4.12.0</okhttp.version>
+    <kotlin-stdlib.version>2.0.10</kotlin-stdlib.version>
 
     <test.java.home>${java.home}</test.java.home>
 
diff --git a/resource-managers/kubernetes/core/pom.xml b/resource-managers/kubernetes/core/pom.xml
index fa0fd454ccc44..64406b41bf701 100644
--- a/resource-managers/kubernetes/core/pom.xml
+++ b/resource-managers/kubernetes/core/pom.xml
@@ -79,6 +79,53 @@
       <groupId>io.fabric8</groupId>
       <artifactId>kubernetes-httpclient-okhttp</artifactId>
       <version>${kubernetes-client.version}</version>
+      <exclusions>
+        <exclusion>
+          <groupId>com.squareup.okhttp3</groupId>
+          <artifactId>okhttp</artifactId>
+        </exclusion>
+        <exclusion>
+          <groupId>com.squareup.okhttp3</groupId>
+          <artifactId>logging-interceptor</artifactId>
+        </exclusion>
+      </exclusions>
+    </dependency>
+    <dependency>
+      <groupId>com.squareup.okhttp3</groupId>
+      <artifactId>okhttp</artifactId>
+      <version>${okhttp.version}</version>
+      <exclusions>
+        <exclusion>
+          <groupId>org.jetbrains.kotlin</groupId>
+          <artifactId>kotlin-stdlib-jdk8</artifactId>
+        </exclusion>
+        <exclusion>
+          <groupId>org.jetbrains.kotlin</groupId>
+          <artifactId>kotlin-stdlib</artifactId>
+        </exclusion>
+      </exclusions>
+    </dependency>
+    <dependency>
+      <groupId>com.squareup.okhttp3</groupId>
+      <artifactId>logging-interceptor</artifactId>
+      <version>${okhttp.version}</version>
+      <exclusions>
+        <exclusion>
+          <groupId>org.jetbrains.kotlin</groupId>
+          <artifactId>kotlin-stdlib-jdk8</artifactId>
+        </exclusion>
+      </exclusions>
+    </dependency>
+    <dependency>
+      <groupId>org.jetbrains.kotlin</groupId>
+      <artifactId>kotlin-stdlib</artifactId>
+      <version>${kotlin-stdlib.version}</version>
+      <exclusions>
+        <exclusion>
+          <groupId>org.jetbrains</groupId>
+          <artifactId>annotations</artifactId>
+        </exclusion>
+      </exclusions>
     </dependency>
     <dependency>
       <groupId>io.fabric8</groupId>
diff --git a/resource-managers/kubernetes/integration-tests/pom.xml b/resource-managers/kubernetes/integration-tests/pom.xml
index 518c5bc217071..a82a083f87c62 100644
--- a/resource-managers/kubernetes/integration-tests/pom.xml
+++ b/resource-managers/kubernetes/integration-tests/pom.xml
@@ -68,6 +68,53 @@
       <groupId>io.fabric8</groupId>
       <artifactId>kubernetes-client</artifactId>
       <version>${kubernetes-client.version}</version>
+      <exclusions>
+        <exclusion>
+          <groupId>com.squareup.okhttp3</groupId>
+          <artifactId>okhttp</artifactId>
+        </exclusion>
+        <exclusion>
+          <groupId>com.squareup.okhttp3</groupId>
+          <artifactId>logging-interceptor</artifactId>
+        </exclusion>
+      </exclusions>
+    </dependency>
+    <dependency>
+      <groupId>com.squareup.okhttp3</groupId>
+      <artifactId>okhttp</artifactId>
+      <version>${okhttp.version}</version>
+      <exclusions>
+        <exclusion>
+          <groupId>org.jetbrains.kotlin</groupId>
+          <artifactId>kotlin-stdlib-jdk8</artifactId>
+        </exclusion>
+        <exclusion>
+          <groupId>org.jetbrains.kotlin</groupId>
+          <artifactId>kotlin-stdlib</artifactId>
+        </exclusion>
+      </exclusions>
+    </dependency>
+    <dependency>
+      <groupId>com.squareup.okhttp3</groupId>
+      <artifactId>logging-interceptor</artifactId>
+      <version>${okhttp.version}</version>
+      <exclusions>
+        <exclusion>
+          <groupId>org.jetbrains.kotlin</groupId>
+          <artifactId>kotlin-stdlib-jdk8</artifactId>
+        </exclusion>
+      </exclusions>
+    </dependency>
+    <dependency>
+      <groupId>org.jetbrains.kotlin</groupId>
+      <artifactId>kotlin-stdlib</artifactId>
+      <version>${kotlin-stdlib.version}</version>
+      <exclusions>
+        <exclusion>
+          <groupId>org.jetbrains</groupId>
+          <artifactId>annotations</artifactId>
+        </exclusion>
+      </exclusions>
     </dependency>
     <dependency>
       <groupId>org.apache.spark</groupId>