Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-36332: [CI][Java] Patch spark to use Netty 4.1.94.Final on our integration tests #36640

Closed
wants to merge 1 commit into from

Conversation

raulcd
Copy link
Member

@raulcd raulcd commented Jul 12, 2023

Rationale for this change

It does seem that the only way to shadow Netty version is to modify the Pom for previous versions.

What changes are included in this PR?

Try to patch version of Netty on the pom when cloning Spark.

Are these changes tested?

Archery integration tests

Are there any user-facing changes?

No

@raulcd
Copy link
Member Author

raulcd commented Jul 12, 2023

@github-actions crossbow submit spark

@github-actions
Copy link

⚠️ GitHub issue #36332 has been automatically assigned in GitHub to PR creator.

@github-actions github-actions bot added the awaiting committer review Awaiting committer review label Jul 12, 2023
@github-actions
Copy link

Revision: 44ec8f8

Submitted crossbow builds: ursacomputing/crossbow @ actions-e71df5add3

Task Status
test-conda-python-3.10-spark-master Github Actions
test-conda-python-3.8-spark-v3.1.2 Github Actions
test-conda-python-3.9-spark-v3.2.0 Github Actions

@raulcd
Copy link
Member Author

raulcd commented Jul 12, 2023

This is not going to be enough because from my understanding Spark also pins the versions used on Hadoop here:
https://github.com/apache/spark/blob/master/dev/deps/spark-deps-hadoop-3-hive-2.3#L186-L203
and I am not sure we want to patch those too.
@kiszk any idea?

@kiszk
Copy link
Member

kiszk commented Jul 14, 2023

This PR is motivated by #36332

@kiszk
Copy link
Member

kiszk commented Jul 14, 2023

LGTM.
It looks like a practical solution to resolve a chicken-and-egg problem.
We need to ensure the latest Arrow works well with Spark. Then, we can release the latest Arrow with Netty 4.1.94. After that, Spark would update the version of Arrow.

@kiszk
Copy link
Member

kiszk commented Jul 14, 2023

cc @BryanCutler

@raulcd
Copy link
Member Author

raulcd commented Jul 14, 2023

@kiszk if I understand correctly you are suggesting to also patch the versions here: https://github.com/apache/spark/blob/master/dev/deps/spark-deps-hadoop-3-hive-2.3#L186-L203 to see if the job changing the versions succeeds (at least on master spark)?

@kiszk
Copy link
Member

kiszk commented Jul 14, 2023

@raulcd Sorry for confusing you. I am fine with this PR.

@raulcd
Copy link
Member Author

raulcd commented Jul 14, 2023

but the spark jobs still fail unless we also patch those. I mean the current change doesn't seem enough to make our CI integration jobs with spark successful as seen on the crossbow report comment here: #36640 (comment)

@LuciferYang
Copy link

LuciferYang commented Jul 18, 2023

@kiszk if I understand correctly you are suggesting to also patch the versions here: https://github.com/apache/spark/blob/master/dev/deps/spark-deps-hadoop-3-hive-2.3#L186-L203 to see if the job changing the versions succeeds (at least on master spark)?

I think this just used by dependencies change check(like a golden file), will not affect the actual netty version usage

@@ -45,6 +45,13 @@ export MAVEN_OPTS="${MAVEN_OPTS} -Dorg.slf4j.simpleLogger.log.org.apache.maven.c

pushd ${spark_dir}

# Due to CVE-2023-34462 we upgraded to a memory netty version which is incompatible
# with previous spark versions. Patch the pom to use newer version.
sed -i.bak -E -e \
Copy link

@LuciferYang LuciferYang Jul 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I run this sed statement locally, but I found that the netty.version in the pom.xml still 4.1.93.Final

Maybe we can test build/mvn versions:set-property -Dproperty=netty.version -DnewVersion=4.1.94.Final -DgenerateBackupPoms=false

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, this line may also say to use 4.1.93.final

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I'll investigate, I might have messed up on my tests :)

@github-actions github-actions bot added awaiting changes Awaiting changes and removed awaiting committer review Awaiting committer review labels Jul 19, 2023
@raulcd
Copy link
Member Author

raulcd commented Jul 28, 2023

I am closing this PR as upgrading to netty 4.1.96 fixed the issue reverting the regression introduced on 4.1.94.

@raulcd raulcd closed this Jul 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
awaiting changes Awaiting changes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[CI][Java] Integration jobs with Spark fail with NoSuchMethodError:io.netty.buffer.PooledByteBufAllocator
3 participants