
[BUG] Rapids Accelerator (0.2) failing to read CSV file on Databricks 7.0 ML GPU Runtime #1322

Closed
Tracked by #2063
krajendrannv opened this issue Dec 8, 2020 · 4 comments
@krajendrannv (Contributor)

Describe the bug
The customer is replicating the Mortgage ETL query in a Databricks environment, reading data from S3. The same S3 data reads fine on a CPU cluster, but the GPU scan fails on the first read:

def read_acq_csv(spark, path):
    return (spark.read.format('csv')
            .option('nullValue', '')
            .option('header', 'false')
            .option('delimiter', '|')
            .schema(_csv_acq_schema)
            .load(path)
            .withColumn('quarter', _get_quarter_from_csv_file_name()))

acq = read_acq_csv(spark, orig_acq_path)
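
The helper _get_quarter_from_csv_file_name is not shown in the report. A minimal sketch, assuming it matches the public Mortgage ETL example, where the quarter is derived from the input file name:

from pyspark.sql.functions import input_file_name, substring_index

def _get_quarter_from_csv_file_name():
    # Sketch based on the public Mortgage ETL example, not taken from this issue.
    # e.g. '.../Acquisition_2007Q1.txt' -> text before the first '.',
    # then text after the last '_' -> '2007Q1'
    return substring_index(substring_index(input_file_name(), '.', 1), '_', -1)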

Steps/Code to reproduce bug
The cluster uses p3.2xlarge (V100) instances for both driver and executors, running Rapids Accelerator (0.2) on the Databricks 7.0 ML GPU Runtime.

Expected behavior
Attached are logs from both the CPU cluster (working-log4j_cpu.log) and the GPU cluster (log4j_gpu_databricks7.0.log).

Environment details (please complete the following information)

  • Environment location: Databricks 7.0ML Runtime on AWS

Additional context
log4j_gpu_databricks7.0.log
working-log4j_cpu.log

krajendrannv added the bug and "? - Needs Triage" labels Dec 8, 2020
sameerz removed the "? - Needs Triage" label Dec 8, 2020
@tgravescs (Collaborator)

This looks like spark.io.compression.codec is set to zstd. If Databricks sets it, I would expect that to just work; if the setup scripts you are using set it, we should not.

The code that is erroring isn't even in the plugin:

java.lang.NoSuchMethodError: com.github.luben.zstd.Zstd.setCompressionLevel(JI)I
	at com.github.luben.zstd.ZstdOutputStream.<init>(ZstdOutputStream.java:64)
	at org.apache.spark.io.ZStdCompressionCodec.compressedOutputStream(CompressionCodec.scala:224)
	at org.apache.spark.MapOutputTracker$.serializeMapStatuses(MapOutputTracker.scala:963)
	at org.apache.spark.ShuffleStatus.$anonfun$serializedMapStatus$2(MapOutputTracker.scala:234)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
	at org.apache.spark.ShuffleStatus.withWriteLock(MapOutputTracker.scala:75)
	at org.apache.spark.ShuffleStatus.serializedMapStatus(MapOutputTracker.scala:231)
	at org.apache.spark.MapOutputTrackerMaster$MessageLoop.run(MapOutputTracker.scala:485)

The issue is that the underlying library/jar is either missing or incompatible with what Spark expects. The method descriptor in the error, (JI)I, refers to a setCompressionLevel overload taking a long and an int and returning an int, which the zstd-jni version on the classpath evidently does not provide:

java.lang.NoSuchMethodError: com.github.luben.zstd.Zstd.setCompressionLevel(JI)I
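
One way to confirm which zstd-jni jar the driver JVM actually loaded is to ask the JVM through PySpark's py4j gateway. A minimal sketch (this diagnostic is a suggestion, not something run in this thread; it assumes a live spark session on the cluster):

# Locate the jar that provides com.github.luben.zstd.Zstd on the driver.
zstd_cls = spark._jvm.java.lang.Class.forName("com.github.luben.zstd.Zstd")
print(zstd_cls.getProtectionDomain().getCodeSource().getLocation())
# A zstd-jni version older than 1.4.0 here would explain the NoSuchMethodError.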

When I start a Databricks 7.0ML cluster I don't see spark.io.compression.codec being set, so are you setting it?
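
For reference, a quick way to check what the session actually resolved, plus a workaround sketch if the codec is being forced (neither was confirmed in this thread):

# Spark defaults spark.io.compression.codec to 'lz4' when nothing overrides it.
print(spark.conf.get("spark.io.compression.codec", "lz4"))

# The codec is read at JVM startup, so an override must go in the cluster's
# Spark config rather than spark.conf.set() at runtime, e.g.:
#   spark.io.compression.codec lz4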

revans2 mentioned this issue Oct 27, 2022
@firestarman (Collaborator) commented Oct 31, 2022

Do we still need to fix this?

@revans2 (Collaborator) commented Oct 31, 2022

I agree that it is probably not an issue anymore. We support zstd, and we no longer support Databricks ML7.0. I mostly want to be sure that we are testing zstd on Databricks with CSV. Even if we just manually verify it works once, that is good enough.

@firestarman (Collaborator)

@revans2 FYI

This is more likely a version issue, because setCompressionLevel was introduced in zstd-jni v1.4.0.
Anyway, I just verified on Azure Databricks 9.1 and 10.4; both work now with zstd + the plugin.
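
A manual check along those lines could look like the sketch below: force zstd shuffle compression in the cluster Spark config, enable the plugin, and run a CSV read that triggers a shuffle. The config keys are the standard ones; the test body is an illustration, not the exact verification performed:

# Cluster Spark config (set before startup):
#   spark.plugins               com.nvidia.spark.SQLPlugin
#   spark.io.compression.codec  zstd

# A CSV read followed by a shuffle exercises the zstd codec end to end,
# including the map-status serialization path that failed in this issue.
df = spark.read.option('header', 'false').option('delimiter', '|').csv(path)
df.groupBy(df.columns[0]).count().show()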

I am going to close this.

tgravescs pushed a commit to tgravescs/spark-rapids that referenced this issue Nov 30, 2023