This issue is copied from trinodb/trino#17792 since I believe this repo is where zstd de/compression is handled.
I have a Hive table built on top of zstd-compressed (.zst) data. On a Trino 419 cluster, I get the following error when reading this table. That version of Trino uses aircompressor 0.23.
Query 20230607_172621_00003_5hpz7 failed: Error opening Hive split s3://path/to/file.csv.access.log.zst (offset=0, length=1544108): Window size too large (not yet supported): offset=3084
We are currently running Trino 405, where this query executes without an issue. It also executed without an issue on the earlier Trino/Presto versions we have run. Trino 405 uses aircompressor 0.21.
Did something change between aircompressor 0.21 and 0.24 that might have caused this? And is there anything I can do to get past this error? Thanks in advance for your help!
Full stack trace
io.trino.spi.TrinoException: Error opening Hive split s3://path/to/file.csv.access.log.zst (offset=0, length=1544108): Window size too large (not yet supported): offset=3084
at io.trino.plugin.hive.line.LinePageSourceFactory.createPageSource(LinePageSourceFactory.java:179)
at io.trino.plugin.hive.HivePageSourceProvider.createHivePageSource(HivePageSourceProvider.java:218)
at io.trino.plugin.hive.HivePageSourceProvider.createPageSource(HivePageSourceProvider.java:156)
at io.trino.plugin.base.classloader.ClassLoaderSafeConnectorPageSourceProvider.createPageSource(ClassLoaderSafeConnectorPageSourceProvider.java:48)
at io.trino.split.PageSourceManager.createPageSource(PageSourceManager.java:61)
at io.trino.operator.TableScanOperator.getOutput(TableScanOperator.java:298)
at io.trino.operator.Driver.processInternal(Driver.java:402)
at io.trino.operator.Driver.lambda$process$8(Driver.java:305)
at io.trino.operator.Driver.tryWithLock(Driver.java:701)
at io.trino.operator.Driver.process(Driver.java:297)
at io.trino.operator.Driver.processForDuration(Driver.java:268)
at io.trino.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:888)
at io.trino.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:187)
at io.trino.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:556)
at io.trino.$gen.Trino_18f7842____20230607_162211_2.run(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: io.airlift.compress.MalformedInputException: Window size too large (not yet supported): offset=3084
at io.airlift.compress.zstd.Util.verify(Util.java:45)
at io.airlift.compress.zstd.ZstdFrameDecompressor.decodeCompressedBlock(ZstdFrameDecompressor.java:303)
at io.airlift.compress.zstd.ZstdIncrementalFrameDecompressor.partialDecompress(ZstdIncrementalFrameDecompressor.java:236)
at io.airlift.compress.zstd.ZstdInputStream.read(ZstdInputStream.java:89)
at io.airlift.compress.zstd.ZstdHadoopInputStream.read(ZstdHadoopInputStream.java:53)
at com.google.common.io.CountingInputStream.read(CountingInputStream.java:64)
at java.base/java.io.InputStream.readNBytes(InputStream.java:506)
at io.trino.hive.formats.line.text.TextLineReader.fillBuffer(TextLineReader.java:248)
at io.trino.hive.formats.line.text.TextLineReader.<init>(TextLineReader.java:67)
at io.trino.hive.formats.line.text.TextLineReaderFactory.createLineReader(TextLineReaderFactory.java:77)
at io.trino.plugin.hive.line.LinePageSourceFactory.createPageSource(LinePageSourceFactory.java:171)
... 17 more
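In case it helps with diagnosis, here is a minimal sketch for checking what window size the file itself declares. It is not part of aircompressor or Trino: the class name `ZstdWindowSize` and its methods are hypothetical, and the header layout is taken from the Zstandard frame format (RFC 8878). It only reads the first frame header of a file. It does not say what limit aircompressor enforces, so the printed value would still need to be compared against whatever the decompressor supports.

```java
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical diagnostic helper, not an aircompressor or Trino API.
public class ZstdWindowSize {
    // Zstandard frame magic number, stored little-endian on disk (28 B5 2F FD).
    static final int ZSTD_MAGIC = 0xFD2FB528;

    // Returns the window size in bytes declared by the first frame header.
    static long windowSize(byte[] header) {
        ByteBuffer buf = ByteBuffer.wrap(header).order(ByteOrder.LITTLE_ENDIAN);
        if (buf.getInt() != ZSTD_MAGIC) {
            throw new IllegalArgumentException("not a zstd frame");
        }
        int descriptor = buf.get() & 0xFF;
        boolean singleSegment = (descriptor & 0x20) != 0;
        if (!singleSegment) {
            // Window_Descriptor: upper 5 bits are the exponent, lower 3 the mantissa.
            int windowDescriptor = buf.get() & 0xFF;
            int exponent = windowDescriptor >>> 3;
            int mantissa = windowDescriptor & 0x07;
            long base = 1L << (10 + exponent);
            return base + (base / 8) * mantissa;
        }
        // Single-segment frames have no Window_Descriptor; the window size
        // equals Frame_Content_Size, which follows the optional Dictionary_ID.
        int dictionaryIdFlag = descriptor & 0x03;
        int dictionaryIdBytes = dictionaryIdFlag == 3 ? 4 : dictionaryIdFlag;
        buf.position(buf.position() + dictionaryIdBytes);
        switch (descriptor >>> 6) {
            case 0:
                return buf.get() & 0xFF;
            case 1:
                return (buf.getShort() & 0xFFFF) + 256;
            case 2:
                return buf.getInt() & 0xFFFFFFFFL;
            default:
                return buf.getLong();
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: java ZstdWindowSize <file.zst>");
            return;
        }
        // 18 bytes covers the magic number plus the largest possible frame header.
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            byte[] header = in.readNBytes(18);
            System.out.println("declared window size (bytes): " + windowSize(header));
        }
    }
}
```

Running it against a local copy of one of the affected `.zst` files (for example `java ZstdWindowSize.java file.csv.access.log.zst`) prints the window size that the compressor recorded in the frame header.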