fix: disable spark async profiler on non-arm based systems #1533

Merged
merged 1 commit into from
Oct 21, 2024


derklaro

Motivation

The async-profiler version bundled with spark (which itself is bundled in modern Paper versions) does not support Java 23 and causes a segfault when executed. This crashes at least all modern Paper services running on amd64 systems:

# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x000079c43ba06ecf, pid=39327, tid=39528
#
# JRE version: OpenJDK Runtime Environment (23.0+37) (build 23+37-2369)
# Java VM: OpenJDK 64-Bit Server VM (23+37-2369, mixed mode, sharing, tiered, compressed class ptrs, g1 gc, linux-amd64)
# Problematic frame:
# C  [spark-502cce8be50-libasyncProfiler.so.tmp+0x6ecf]  NMethod::isNMethod()+0x1f

Modification

Disable the async-profiler integration in spark when an old version of async-profiler is used. Detecting the async-profiler version relies on a pull request to spark that is not yet merged, so the detection may need to change again once it is. The issue does not occur on ARM systems, so the async profiler is left enabled there. On matching spark versions, the load method of AsyncProfilerAccess is changed so that it always throws an exception stating that the async profiler is not available. The exception message is printed to the console when the profiler starts, so the user is informed why the profiler is disabled:

[23:22:44 INFO]: [spark] Starting background profiler...
[23:22:44 WARN]: [spark] Unable to initialise the async-profiler engine: this version of spark uses a version of async-profiler which does not support java 23+
[23:22:44 WARN]: [spark] Please see here for more information: https://spark.lucko.me/docs/misc/Using-async-profiler
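
The decision described above can be sketched roughly as follows. This is only an illustration of one reading of the description, not the code from this pull request: the class `AsyncProfilerGuard`, its methods, and the `oldProfilerBundled` flag are hypothetical, and the sketch assumes the architecture is read from the `os.arch` system property (typically `amd64` or `aarch64` on Linux). Only the exception message is taken from the log output above.

```java
import java.util.Locale;

// Hypothetical helper, not part of CloudNet or spark.
final class AsyncProfilerGuard {

  // Message text as it appears in the console output quoted in this pull request.
  static final String UNSUPPORTED_MESSAGE =
    "this version of spark uses a version of async-profiler which does not support java 23+";

  private AsyncProfilerGuard() {
  }

  // Assumption: ARM systems report an os.arch value of "aarch64" or one starting with "arm".
  static boolean isArm(String osArch) {
    String arch = osArch.toLowerCase(Locale.ROOT);
    return arch.equals("aarch64") || arch.startsWith("arm");
  }

  // Disable only when an old async-profiler is bundled and the system is not ARM.
  static boolean shouldDisable(String osArch, boolean oldProfilerBundled) {
    return oldProfilerBundled && !isArm(osArch);
  }

  // Stand-in for a patched load method: it fails fast instead of loading the native library.
  static void load(String osArch, boolean oldProfilerBundled) {
    if (shouldDisable(osArch, oldProfilerBundled)) {
      throw new UnsupportedOperationException(UNSUPPORTED_MESSAGE);
    }
    // The real native library loading would happen here; omitted in this sketch.
  }
}
```

A caller would pass `System.getProperty("os.arch")` as the first argument. Throwing from the load step, rather than skipping it silently, is what lets the reason reach the console as a warning.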

Result

Non-ARM servers that run modern Paper versions, and all servers running spark with an old version of async-profiler, no longer segfault on startup or when the plugin is enabled.

@derklaro derklaro added the v: 4.X, t: fix, and in: wrapper labels Oct 19, 2024
@derklaro derklaro requested a review from 0utplay October 19, 2024 23:42
@derklaro derklaro self-assigned this Oct 19, 2024
@derklaro derklaro added this to the 4.0.0-RC11.1 milestone Oct 19, 2024

Test Results

 48 files  ±0   48 suites  ±0   1m 57s ⏱️ +5s
420 tests ±0  420 ✅ ±0  0 💤 ±0  0 ❌ ±0 
751 runs  ±0  751 ✅ ±0  0 💤 ±0  0 ❌ ±0 

Results for commit 4455564. ± Comparison against base commit 93cdaf4.

This pull request removes 37 and adds 37 tests. Note that renamed tests count towards both.
eu.cloudnetservice.driver.document.DocumentSerialisationTest ‑ [4] {"b":1,"s":2,"i":3,"l":4,"f":5.0,"d":6.0,"c":"/","string":"Hello, World!","bol":true,"cloud":["Ben?","Yes","No","HoHoHoHo"],"world":{"hello":"world","this":"is","insane":"!"}}, PRETTY
eu.cloudnetservice.driver.document.gson.JavaTimeSerializerTest ‑ [14] 2024-10-18
eu.cloudnetservice.driver.document.gson.JavaTimeSerializerTest ‑ [23] 17:49:17.137256662
eu.cloudnetservice.driver.document.gson.JavaTimeSerializerTest ‑ [28] 17:49:17.137425975Z
eu.cloudnetservice.driver.document.gson.JavaTimeSerializerTest ‑ [29] 17:49:17.137456602Z
eu.cloudnetservice.driver.document.gson.JavaTimeSerializerTest ‑ [30] 17:49:17.137479354+05:00
eu.cloudnetservice.driver.document.gson.JavaTimeSerializerTest ‑ [31] 17:49:17.137498650-03:00
eu.cloudnetservice.driver.document.gson.JavaTimeSerializerTest ‑ [34] 2024-10-18T17:49:17.137517445
eu.cloudnetservice.driver.document.gson.JavaTimeSerializerTest ‑ [39] 2024-10-18T17:49:17.137627910Z
eu.cloudnetservice.driver.document.gson.JavaTimeSerializerTest ‑ [4] 2024-10-18T17:49:17.137132561Z
…
eu.cloudnetservice.driver.document.DocumentSerialisationTest ‑ [4] {"b":1,"s":2,"i":3,"l":4,"f":5.0,"d":6.0,"c":"/","string":"Hello, World!","bol":true,"cloud":["Ben?","Yes","No","HoHoHoHo"],"world":{"insane":"!","hello":"world","this":"is"}}, PRETTY
eu.cloudnetservice.driver.document.gson.JavaTimeSerializerTest ‑ [14] 2024-10-20
eu.cloudnetservice.driver.document.gson.JavaTimeSerializerTest ‑ [23] 06:50:27.729482040
eu.cloudnetservice.driver.document.gson.JavaTimeSerializerTest ‑ [28] 06:50:27.729658809Z
eu.cloudnetservice.driver.document.gson.JavaTimeSerializerTest ‑ [29] 06:50:27.729691049Z
eu.cloudnetservice.driver.document.gson.JavaTimeSerializerTest ‑ [30] 06:50:27.729724532+05:00
eu.cloudnetservice.driver.document.gson.JavaTimeSerializerTest ‑ [31] 06:50:27.729748686-03:00
eu.cloudnetservice.driver.document.gson.JavaTimeSerializerTest ‑ [34] 2024-10-20T06:50:27.729779654
eu.cloudnetservice.driver.document.gson.JavaTimeSerializerTest ‑ [39] 2024-10-20T06:50:27.729927269Z
eu.cloudnetservice.driver.document.gson.JavaTimeSerializerTest ‑ [4] 2024-10-20T06:50:27.729344864Z
…

@derklaro derklaro merged commit 32dd3fc into nightly Oct 21, 2024
7 checks passed
@derklaro derklaro deleted the disable-spark-profiler branch October 21, 2024 18:34