Double close of ParquetFileWriter in ParquetWriter #2935
hellishfire changed the title from "Double close of parquetFileWriter in ParquetWriter" to "Double close of ParquetFileWriter in ParquetWriter" on Jun 27, 2024.
devOpsHazelcast pushed a commit to hazelcast/hazelcast that referenced this issue on Jul 10, 2024:

…[5.3.z] (#2561) Fixes #22541 Fixes #24981 Fixes #26354 Closes https://hazelcast.atlassian.net/browse/REL-279 Backports https://github.com/hazelcast/hazelcast-mono/pull/2467

Notes:
1. apache/parquet-java@274dc51b has broken `ParquetWriter#close()`. See also: https://issues.apache.org/jira/browse/PARQUET-2496 and apache/parquet-java#2935.
2. `hadoop2` classifier has been removed from `avro-mapred`. See also: https://github.com/hazelcast/hazelcast-mono/pull/834.
3. Upgrades `software.amazon.awssdk` from 2.20.95 to 2.24.13.
4. Upgrades `maven-shade-plugin` to 3.6.0 because `org.apache.parquet:parquet-jackson:1.14.1` has classes compiled with Java 21.
5. Allows `MIT-0` license, which is used by `org.reactivestreams:reactive-streams:1.0.4`. See also: #25325.
6. Adds `jar-with-dependencies` classifier to `hazelcast-jet-kafka` and `hazelcast-jet-mongodb` in enterprise-sql-it/pom.xml because `animal-sniffer-maven-plugin` cannot find some transitive dependencies (`kafka-clients` and `mongodb-driver-sync`) in `mvn verify` (it can find them in `mvn install`). See also: https://hazelcast.slack.com/archives/C07066ELRRD/p1720539966962809.

GitOrigin-RevId: e838e0abe0123ef6580d31ada5e675dab1526c20
devOpsHazelcast pushed a commit to hazelcast/hazelcast that referenced this issue on Jul 10, 2024:

…#2571) Fixes #22541 Fixes #24981 Fixes #26354 Closes https://hazelcast.atlassian.net/browse/REL-257 Forwardports https://github.com/hazelcast/hazelcast-mono/pull/2467

Notes:
1. apache/parquet-java@274dc51b has broken `ParquetWriter#close()`. See also: https://issues.apache.org/jira/browse/PARQUET-2496 and apache/parquet-java#2935.
2. Adds `jdk8` classifier to `jline` because it contains classes compiled with Java 22, which breaks the build. See also: jline/jline3#937 (comment).

GitOrigin-RevId: 519e71667822b3fd7d2c7cf654f261a8a238d583
Fokko pushed a commit that referenced this issue on Jul 22, 2024:

GH-2935: Avoid double close of ParquetFileWriter; fix comment. Co-authored-by: youming.whl <[email protected]>
ParquetWriter.close() invokes InternalParquetRecordWriter.close() with the following logic:
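A paraphrased sketch of that logic, approximated from the 1.14.x `InternalParquetRecordWriter` source rather than quoted verbatim:

```java
// InternalParquetRecordWriter.close() — paraphrased, approximate 1.14.x logic
@Override
public void close() throws IOException, InterruptedException {
  if (!closed) {
    try {
      if (!aborted) {
        flushRowGroupToStore();
        Map<String, String> finalMetadata = new HashMap<>(extraMetaData);
        finalMetadata.putAll(writeSupport.finalizeWrite().getExtraMetaData());
        parquetFileWriter.end(finalMetadata); // ends the file and closes parquetFileWriter
      }
    } finally {
      // closes parquetFileWriter a second time
      AutoCloseables.uncheckedClose(columnStore, pageStore, bloomFilterWriteStore, parquetFileWriter);
      closed = true;
    }
  }
}
```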
Apparently `parquetFileWriter` is closed twice here: the first time by `parquetFileWriter.end(finalMetadata)`, which eventually calls `parquetFileWriter.close()`, and the second time by `AutoCloseables.uncheckedClose(columnStore, pageStore, bloomFilterWriteStore, parquetFileWriter)`. This causes the underlying `PositionOutputStream` in `ParquetFileWriter` to be flushed again after it has already been closed, which may raise an exception depending on the underlying stream implementation.
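As a rough illustration (a hypothetical stream class, not taken from the report or the Parquet codebase), a strict `PositionOutputStream` along these lines fails in exactly this way when the second close flushes again:

```java
import java.io.IOException;
import org.apache.parquet.io.PositionOutputStream;

// Hypothetical strict stream: flush() after close() fails,
// as many Hadoop FSDataOutputStream wrappers do.
class StrictPositionOutputStream extends PositionOutputStream {
  private long pos = 0;
  private boolean closed = false;

  @Override
  public long getPos() {
    return pos;
  }

  @Override
  public void write(int b) throws IOException {
    if (closed) {
      throw new IOException("stream is already closed");
    }
    pos++;
  }

  @Override
  public void flush() throws IOException {
    // The second close of ParquetFileWriter flushes again and ends up here.
    if (closed) {
      throw new IOException("stream is already closed");
    }
  }

  @Override
  public void close() {
    closed = true;
  }
}
```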
A sample exception:
Caused by: org.apache.parquet.util.AutoCloseables$ParquetCloseResourceException: Unable to close resource
at org.apache.parquet.util.AutoCloseables.uncheckedClose(AutoCloseables.java:85)
at org.apache.parquet.util.AutoCloseables.uncheckedClose(AutoCloseables.java:94)
at org.apache.parquet.hadoop.InternalParquetRecordWriter.close(InternalParquetRecordWriter.java:144)
at org.apache.parquet.hadoop.ParquetWriter.close(ParquetWriter.java:437)
... 70 more
Caused by: java.io.IOException: stream is already closed
(-------- specific stream implementation ----------------)
at java.io.FilterOutputStream.flush(FilterOutputStream.java:140)
at java.io.DataOutputStream.flush(DataOutputStream.java:123)
at org.apache.parquet.hadoop.util.HadoopPositionOutputStream.flush(HadoopPositionOutputStream.java:59)
at org.apache.parquet.hadoop.ParquetFileWriter.close(ParquetFileWriter.java:1659)
at org.apache.parquet.util.AutoCloseables.close(AutoCloseables.java:49)
at org.apache.parquet.util.AutoCloseables.uncheckedClose(AutoCloseables.java:83)
This issue has been observed since 1.14.0, and I suspect PARQUET-2496 is caused by the same underlying problem.