[BUG] compile-time references to classes potentially unavailable at run time #5648
Labels: bug, reliability
Describe the bug
There is code that uses class loading by String name
spark-rapids/sql-plugin/src/main/scala/org/apache/spark/sql/rapids/ExternalSource.scala
Line 37 in 19124fe
to check whether an optional module like spark-avro is available at run time.
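For illustration, a by-name availability check of this kind can be sketched as follows. This is a minimal sketch, not the code at the line referenced above: OptionalModuleCheck and isClassAvailable are hypothetical names, and contextOrOwnClassLoader is an assumed stand-in for Utils.getContextOrSparkClassLoader so that the example runs without Spark on the classpath.

```scala
import scala.util.Try

// Hypothetical helper, not part of spark-rapids.
object OptionalModuleCheck {
  // Assumed stand-in for Spark's Utils.getContextOrSparkClassLoader:
  // prefer the thread context class loader, fall back to our own loader.
  def contextOrOwnClassLoader: ClassLoader =
    Option(Thread.currentThread().getContextClassLoader)
      .getOrElse(getClass.getClassLoader)

  // The class is named only by a String, so this object's bytecode carries
  // no compile-time reference to it. initialize = false avoids running the
  // target's static initializers just to test for its presence.
  def isClassAvailable(className: String): Boolean =
    Try(Class.forName(className, false, contextOrOwnClassLoader)).isSuccess
}
```

A missing class surfaces as a ClassNotFoundException, which Try captures, so the check returns false instead of failing.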
However, at the same time it contains compile-time references to AvroScan.
spark-rapids/sql-plugin/src/main/scala/org/apache/spark/sql/rapids/ExternalSource.scala
Line 76 in 19124fe
I don't think there is any guarantee that more aggressive JIT compilation or class loading in some JVM implementation would not trigger loading of AvroScan using the classloader of object ExternalSource, which is not necessarily the right one. It would be cleaner if ExternalSource, after checking hasSparkAvroJar, loaded some other class such as "GpuAvroScans" by name using Utils.getContextOrSparkClassLoader. That class may safely use compile-time references.
Something to the tune of:
ScansProvider.scala
AvroScansProvider.scala
Then change ExternalSource:
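A minimal sketch of the proposed split, under stated assumptions: ScansProvider and AvroScansProvider take their names from the files listed above, but their members (isSupportedScan), the ExternalSourceSketch object, and its methods are hypothetical. java.util.BitSet stands in for AvroScan, and a plain context-class-loader lookup stands in for Utils.getContextOrSparkClassLoader, so that the example runs without Spark or spark-avro.

```scala
import scala.util.Try

// Hypothetical interface: its signatures mention no Avro types, so referring
// to it from ExternalSource cannot pull in spark-avro classes.
trait ScansProvider {
  def isSupportedScan(scan: AnyRef): Boolean
}

// Hypothetical implementation. In the plugin this would be the only class
// with compile-time references to AvroScan; java.util.BitSet is a stand-in.
class AvroScansProvider extends ScansProvider {
  override def isSupportedScan(scan: AnyRef): Boolean =
    scan.isInstanceOf[java.util.BitSet]
}

// Sketch of the ExternalSource side: everything optional is named by String.
object ExternalSourceSketch {
  // Assumed stand-in for Utils.getContextOrSparkClassLoader.
  private def loader: ClassLoader =
    Option(Thread.currentThread().getContextClassLoader)
      .getOrElse(getClass.getClassLoader)

  def isLoadable(className: String): Boolean =
    Try(Class.forName(className, false, loader)).isSuccess

  // Instantiate the provider only after the optional module's marker class
  // has been found, and only through the same class loader.
  def loadProvider(markerClass: String, providerClass: String): Option[ScansProvider] =
    if (isLoadable(markerClass)) {
      val cls = Class.forName(providerClass, true, loader)
      Some(cls.getDeclaredConstructor().newInstance().asInstanceOf[ScansProvider])
    } else {
      None
    }
}
```

In the plugin both arguments would be string literals (the AvroScan class name and the provider class name), and the result would typically be cached in a lazy val next to hasSparkAvroJar.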
Steps/Code to reproduce bug
Expected behavior
ExternalSource bytecode should not have any compile-time references to AvroScan
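One rough way to check this expectation is to scan a compiled class file for the internal (slash-separated) name of the class in question. The sketch below is an assumption-laden illustration: BytecodeReferenceCheck and classFileMentions are hypothetical names, and the check is a plain substring search over the class file bytes rather than a real constant-pool parser, so it can also match longer names that contain the search string.

```scala
import java.nio.charset.StandardCharsets

// Hypothetical helper, not part of spark-rapids.
object BytecodeReferenceCheck {
  // Returns true if the class file of `cls` contains `internalName`,
  // e.g. "java/lang/Cloneable". Compile-time references use '/' separators,
  // while names passed to Class.forName use '.', so a by-name String
  // literal does not produce a match.
  def classFileMentions(cls: Class[_], internalName: String): Boolean = {
    val name = cls.getName
    val resource = name.substring(name.lastIndexOf('.') + 1) + ".class"
    val in = cls.getResourceAsStream(resource)
    require(in != null, s"class file for $name not found")
    try {
      val bytes = Iterator.continually(in.read()).takeWhile(_ != -1).map(_.toByte).toArray
      // ISO-8859-1 maps each byte to one char, so ASCII names survive intact.
      new String(bytes, StandardCharsets.ISO_8859_1).contains(internalName)
    } finally {
      in.close()
    }
  }
}
```

Applied to the compiled ExternalSource class, with the search string set to the slash-separated name of AvroScan (assumed here to be org/apache/spark/sql/v2/avro/AvroScan), the expected result after the fix would be false.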
Environment details (please complete the following information)
Any
Additional context
Similar pattern in GpuHiveOverrides