Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-35095: [C++] Prevent write after close in arrow::ipc::IpcFormatWriter #37783

Conversation

chrisjordansquire
Copy link
Contributor

@chrisjordansquire chrisjordansquire commented Sep 18, 2023

This addresses GH-35095 by adding a flag to IpcFormatWriter to track when a writer has been closed, and check this flag before writes.

Rationale for this change

This addresses #35095 , preventing stream and file IPC writers from writing record batches once the IPC writer has been closed.

What changes are included in this PR?

Adding a flag so that an IpcFormatWriter to track when it's been closed, a check before writes in IpcFormatWriter, and two tests to confirm it works as expected.

Are these changes tested?

Yes, the changes are tested. The two tests were added, and the C++ test suite ran. No unexpected failures appeared.

Are there any user-facing changes?

Other than newly returning an invalid status when writing after close, no, there should not be any user-facing changes.

@github-actions
Copy link

⚠️ GitHub issue #35095 has been automatically assigned in GitHub to PR creator.


// Write after close raises status
auto foo = writer_helper.WriteBatch(batch_ints);
// ASSERT_RAISES(Invalid, writer_helper.WriteBatch(batch_ints));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this assertion commented out?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops. An extremely embarrassing mistake. Thank you for catching it. It's been fixed.

@github-actions github-actions bot added awaiting changes Awaiting changes awaiting change review Awaiting change review and removed awaiting review Awaiting review awaiting changes Awaiting changes labels Sep 19, 2023
@@ -1070,6 +1070,9 @@ class ARROW_EXPORT IpcFormatWriter : public RecordBatchWriter {
Status WriteRecordBatch(
const RecordBatch& batch,
const std::shared_ptr<const KeyValueMetadata>& custom_metadata) override {
if (closed_) {
return Status::Invalid("Destination already closed");
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need similar check in WriteTable too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question.
It appears that WriteTable ultimately calls WriteRecordBatch. This appears to be documented as well, in that the documentation for RecordBatchWriter::WriteTable is that a a table is written by creating a sequence of record batches. So testing WriteTable doesn't seem to be necessary.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I see.

@github-actions github-actions bot added awaiting changes Awaiting changes and removed awaiting change review Awaiting change review labels Sep 19, 2023
@kou kou changed the title GH-35095: [C++] Prevent write after close GH-35095: [C++] Prevent write after close in arrow::ipc::IpcFormatWriter Sep 19, 2023
Copy link
Member

@kou kou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

cpp/src/arrow/ipc/read_write_test.cc Outdated Show resolved Hide resolved
@@ -1070,6 +1070,9 @@ class ARROW_EXPORT IpcFormatWriter : public RecordBatchWriter {
Status WriteRecordBatch(
const RecordBatch& batch,
const std::shared_ptr<const KeyValueMetadata>& custom_metadata) override {
if (closed_) {
return Status::Invalid("Destination already closed");
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I see.

@github-actions
Copy link

⚠️ GitHub issue #35095 has been automatically assigned in GitHub to PR creator.

@github-actions github-actions bot added awaiting change review Awaiting change review and removed awaiting changes Awaiting changes labels Sep 19, 2023
@kou
Copy link
Member

kou commented Sep 20, 2023

Could you check the "C++ / AMD64 Ubuntu 22.04 C++ ASAN UBSAN" failure?

https://github.com/apache/arrow/actions/runs/6242111241/job/16945464958?pr=37783#step:6:3569

==15144==ERROR: LeakSanitizer: detected memory leaks

Indirect leak of 512 byte(s) in 1 object(s) allocated from:
...

@chrisjordansquire chrisjordansquire force-pushed the GH-35095-write-record-batch-after-close-is-error branch from c1754fe to 85f3916 Compare September 20, 2023 13:31
@chrisjordansquire
Copy link
Contributor Author

Yep, taking a look. Rebasing on main first, in case it was fixed elsewhere.

This addresses apacheGH-35095 by adding a flag to IpcFormatWriter
to track when a writer has been closed, and check this flag
before writes.
@chrisjordansquire chrisjordansquire force-pushed the GH-35095-write-record-batch-after-close-is-error branch from 85f3916 to ae6e0fa Compare September 20, 2023 20:13
@kou
Copy link
Member

kou commented Sep 21, 2023

Oh, the failure has gone.
It'll not be related to this change.
I'll merge this.

@kou kou merged commit 6cd34f3 into apache:main Sep 21, 2023
30 of 33 checks passed
@kou kou removed the awaiting change review Awaiting change review label Sep 21, 2023
@conbench-apache-arrow
Copy link

After merging your PR, Conbench analyzed the 6 benchmarking runs that have been run so far on merge-commit 6cd34f3.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about possible false positives for unstable benchmarks that are known to sometimes produce them.

@chrisjordansquire chrisjordansquire deleted the GH-35095-write-record-batch-after-close-is-error branch September 21, 2023 14:51
etseidl pushed a commit to etseidl/arrow that referenced this pull request Sep 28, 2023
…matWriter (apache#37783)

This addresses apacheGH-35095 by adding a flag to IpcFormatWriter to track when a writer has been closed, and check this flag before writes.

### Rationale for this change

This addresses apache#35095 , preventing stream and file IPC writers from writing record batches once the IPC writer has been closed. 

### What changes are included in this PR?

Adding a flag so that an IpcFormatWriter to track when it's been closed, a check before writes in IpcFormatWriter, and two tests to confirm it works as expected. 

### Are these changes tested?

Yes, the changes are tested. The two tests were added, and the C++ test suite ran. No unexpected failures appeared. 

### Are there any user-facing changes?

Other than newly returning an invalid status when writing after close, no, there should not be any user-facing changes. 

* Closes: apache#35095

Lead-authored-by: Chris Jordan-Squire <[email protected]>
Co-authored-by: Sutou Kouhei <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>
JerAguilon pushed a commit to JerAguilon/arrow that referenced this pull request Oct 23, 2023
…matWriter (apache#37783)

This addresses apacheGH-35095 by adding a flag to IpcFormatWriter to track when a writer has been closed, and check this flag before writes.

### Rationale for this change

This addresses apache#35095 , preventing stream and file IPC writers from writing record batches once the IPC writer has been closed. 

### What changes are included in this PR?

Adding a flag so that an IpcFormatWriter to track when it's been closed, a check before writes in IpcFormatWriter, and two tests to confirm it works as expected. 

### Are these changes tested?

Yes, the changes are tested. The two tests were added, and the C++ test suite ran. No unexpected failures appeared. 

### Are there any user-facing changes?

Other than newly returning an invalid status when writing after close, no, there should not be any user-facing changes. 

* Closes: apache#35095

Lead-authored-by: Chris Jordan-Squire <[email protected]>
Co-authored-by: Sutou Kouhei <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>
loicalleyne pushed a commit to loicalleyne/arrow that referenced this pull request Nov 13, 2023
…matWriter (apache#37783)

This addresses apacheGH-35095 by adding a flag to IpcFormatWriter to track when a writer has been closed, and check this flag before writes.

### Rationale for this change

This addresses apache#35095 , preventing stream and file IPC writers from writing record batches once the IPC writer has been closed. 

### What changes are included in this PR?

Adding a flag so that an IpcFormatWriter to track when it's been closed, a check before writes in IpcFormatWriter, and two tests to confirm it works as expected. 

### Are these changes tested?

Yes, the changes are tested. The two tests were added, and the C++ test suite ran. No unexpected failures appeared. 

### Are there any user-facing changes?

Other than newly returning an invalid status when writing after close, no, there should not be any user-facing changes. 

* Closes: apache#35095

Lead-authored-by: Chris Jordan-Squire <[email protected]>
Co-authored-by: Sutou Kouhei <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>
dgreiss pushed a commit to dgreiss/arrow that referenced this pull request Feb 19, 2024
…matWriter (apache#37783)

This addresses apacheGH-35095 by adding a flag to IpcFormatWriter to track when a writer has been closed, and check this flag before writes.

### Rationale for this change

This addresses apache#35095 , preventing stream and file IPC writers from writing record batches once the IPC writer has been closed. 

### What changes are included in this PR?

Adding a flag so that an IpcFormatWriter to track when it's been closed, a check before writes in IpcFormatWriter, and two tests to confirm it works as expected. 

### Are these changes tested?

Yes, the changes are tested. The two tests were added, and the C++ test suite ran. No unexpected failures appeared. 

### Are there any user-facing changes?

Other than newly returning an invalid status when writing after close, no, there should not be any user-facing changes. 

* Closes: apache#35095

Lead-authored-by: Chris Jordan-Squire <[email protected]>
Co-authored-by: Sutou Kouhei <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[C++] RecordBatchWriter::Close does not prevent future writes
2 participants