Enable Apache Arrow for improved Snowflake integration #9475
Comments
Tried to reproduce by enabling arrow in the connection. So far couldn't reproduce. |
|
As per @JaroslavTulach's request, I will try out his patch and (a) create a temporary solution, then (b) submit a patch to arrow/snowflake that will deprecate the temporary solution once merged. |
So I don't think the patch that simply replaces throwing the exception with logging it will work for Snowflake.
For the custom Snowflake to be picked up, you will need this quick and easy hack that includes the unmanaged classpath:
diff --git a/project/StdBits.scala b/project/StdBits.scala
index 1e17616de3..76c89bf7a3 100644
--- a/project/StdBits.scala
+++ b/project/StdBits.scala
@@ -44,7 +44,7 @@ object StdBits {
!graalVmOrgs.contains(orgName)
})
)
- val relevantFiles =
+ val relevantFiles0 =
libraryUpdates
.select(
configuration = configFilter,
@@ -52,6 +52,12 @@ object StdBits {
artifact = DependencyFilter.artifactFilter()
)
+ val relevantFiles = if (destination.getPath.contains("Snowflake")) {
+ val all = (Compile/unmanagedJars).value.map(_.data)
+ relevantFiles0 ++ all
+ } else {
+ relevantFiles0
+ }
val dependencyStore =
streams.value.cacheStoreFactory.make("std-bits-dependencies")
Tracked.diffInputs(dependencyStore, FileInfo.hash)(relevantFiles.toSet) {
and
--- a/build.sbt
+++ b/build.sbt
@@ -513,7 +513,7 @@ val hamcrestVersion = "1.3"
val netbeansApiVersion = "RELEASE180"
val fansiVersion = "0.4.0"
val httpComponentsVersion = "4.4.1"
-val apacheArrowVersion = "14.0.1"
+val apacheArrowVersion = "10.0.1"
val snowflakeJDBCVersion = "3.15.0"
// ============================================================================
@@ -2996,8 +2996,8 @@ lazy val `std-snowflake` = project
Compile / packageBin / artifactPath :=
`std-snowflake-polyglot-root` / "std-snowflake.jar",
libraryDependencies ++= Seq(
- "org.netbeans.api" % "org-openide-util-lookup" % netbeansApiVersion % "provided",
- "net.snowflake" % "snowflake-jdbc" % snowflakeJDBCVersion
+ "org.netbeans.api" % "org-openide-util-lookup" % netbeansApiVersion % "provided"//,
+ //"net.snowflake" % "snowflake-jdbc" % snowflakeJDBCVersion
),
After all is done you will still get something along the lines of
when trying to return Arrow rows. Note that it would be nice to simply replace package Arrow's |
Thank you for the investigation. Can we get a stacktrace that fails on |
Roughly
|
Hubert Plociniczak reports a new STANDUP for yesterday (2024-03-26): Progress: Attempting to patch #9475, as requested, to avoid opening Java modules. Snowflake appears to use some custom Arrow version, making the process difficult. It should be finished by 2024-03-27. Next Day: I will continue working on the #9475 task and also go back to the benchmark issue. |
I assume the code here could just
...catching the exception and doing some regular Java operation ( If you don't share my enthusiasm for patching upstream projects, then please share a reproducer that works with |
Steps to reproduce:
--- a/distribution/lib/Standard/Snowflake/0.0.0-dev/src/Snowflake_Details.enso
+++ b/distribution/lib/Standard/Snowflake/0.0.0-dev/src/Snowflake_Details.enso
@@ -46,7 +46,7 @@ type Snowflake_Details
jdbc_properties : Vector (Pair Text Text)
jdbc_properties self =
## Avoid the Arrow dependency (https://community.snowflake.com/s/article/SAP-BW-Java-lang-NoClassDefFoundError-for-Apache-arrow)
- no_arrow = [Pair.new 'jdbc_query_result_format' 'json']
+ no_arrow = [] #[Pair.new 'jdbc_query_result_format' 'json']
account = [Pair.new 'account' self.account]
credentials = [Pair.new 'user' self.credentials.username, Pair.new 'password' self.credentials.password]
database = [Pair.new 'db' self.database]
You will need a snowflake account with full access to If you run runner with |
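At the JDBC level, the toggle in the diff above amounts to whether the `jdbc_query_result_format` property is set to `json`. The following is a minimal Java sketch of that idea, not the project's actual code: the class `SnowflakeProps`, the method `jdbcProperties`, and the placeholder account values are hypothetical, while the property keys and the `json` value are taken from the diff. No connection is opened.

```java
import java.util.Properties;

public class SnowflakeProps {
    // Hypothetical helper mirroring the property list built in Snowflake_Details.enso.
    // When useArrow is false, the result format is forced to JSON to avoid Arrow.
    static Properties jdbcProperties(String account, String user, String password,
                                     String db, boolean useArrow) {
        Properties props = new Properties();
        props.setProperty("account", account);
        props.setProperty("user", user);
        props.setProperty("password", password);
        props.setProperty("db", db);
        if (!useArrow) {
            // Same key and value as the `no_arrow` pair in the diff.
            props.setProperty("jdbc_query_result_format", "json");
        }
        return props;
    }

    public static void main(String[] args) {
        // Placeholder values, for illustration only.
        Properties withoutArrow = jdbcProperties("my_account", "me", "secret", "MY_DB", false);
        Properties withArrow = jdbcProperties("my_account", "me", "secret", "MY_DB", true);
        System.out.println("without Arrow: " + withoutArrow.getProperty("jdbc_query_result_format"));
        System.out.println("with Arrow: " + withArrow.getProperty("jdbc_query_result_format"));
    }
}
```

Leaving the property unset corresponds to the reproducer: the driver is then free to return Arrow results, which is where the failure shows up.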
Report from today's investigation. The Arrow version is specified here, and it is 10.0.1.
Get sources from:
wget https://repo1.maven.org/maven2/org/apache/arrow/arrow-memory-core/10.0.1/arrow-memory-core-10.0.1-sources.jar
wget https://repo1.maven.org/maven2/org/apache/arrow/arrow-memory-netty/10.0.1/arrow-memory-netty-10.0.1-sources.jar
Compile as:
javac -cp snowflake-jdbc-3.15.0.jar:$HOME/.m2/repository/org/slf4j/slf4j-api/1.7.29/slf4j-api-1.7.29.jar MemoryUtil.java -d .
Alas, the furthest I could get is to:
Debugging shows there is https://github.com/apache/arrow/blob/84f6edef697fd0fa0f5fce252c017a31e4ba3944/java/memory/memory-core/src/main/java/org/apache/arrow/memory/DefaultAllocationManagerOption.java#L94 and one can use a property to specify https://github.com/apache/arrow/blob/84f6edef697fd0fa0f5fce252c017a31e4ba3944/java/memory/memory-core/src/main/java/org/apache/arrow/memory/DefaultAllocationManagerOption.java#L39C30-L39C67 allocation manager. However neither At the end they want to convert Looks like PlatformDependent was written by people who don't trust Java GC much...
|
In the absence of other workarounds we need to ensure that every Java process is started with extra options: `--add-opens=java.base/java.nio=ALL-UNNAMED` Verified locally. Note: this only affects LS and cli will continue to crash on examples using Snowflake. Closes #9475.
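One way to confirm that a given Java process was actually started with the option is to ask the module system at runtime. This is a minimal sketch, assuming the check runs from classpath code (the unnamed module); the class name `NioOpenCheck` is hypothetical, and `Module.isOpen` is the standard Java 9+ API.

```java
public class NioOpenCheck {
    // Returns true when the java.nio package of java.base is open to the
    // module this class lives in (the unnamed module for classpath code).
    static boolean isNioOpen() {
        Module javaBase = java.nio.ByteBuffer.class.getModule();
        Module caller = NioOpenCheck.class.getModule();
        return javaBase.isOpen("java.nio", caller);
    }

    public static void main(String[] args) {
        System.out.println("java.nio open to this module: " + isNioOpen());
    }
}
```

Running it once with and once without `--add-opens=java.base/java.nio=ALL-UNNAMED` shows whether the flag reaches the process in question.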
Lack of Arrow is apparently problematic for Snowflake. Enabling it also means we need to add
as per https://arrow.apache.org/docs/java/install.html