Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-37670: [C++] IO FileInterface extend from enable_shared_from_this #37713

Merged
merged 8 commits into from
Sep 19, 2023

Conversation

mapleFU
Copy link
Member

@mapleFU mapleFU commented Sep 14, 2023

Rationale for this change

S3 FlushAsync might has lifetime problem, this patch fixes that.

What changes are included in this PR?

  1. Move enable_shared_from_this to FileInterface
  2. Update S3 FlushAsync
  3. Implement sync Flush to avoid call share_from_this in dtor.

Are these changes tested?

no

Are there any user-facing changes?

no

@github-actions
Copy link

⚠️ GitHub issue #37670 has been automatically assigned in GitHub to PR creator.

@mapleFU mapleFU force-pushed the io/extend-from-enable-share-from-this branch from ec41eee to 624612b Compare September 14, 2023 04:51
@mapleFU
Copy link
Member Author

mapleFU commented Sep 14, 2023

Ooops.

OutputStream: FileIterface
InputStream: FileIterface, Readable (Advance, Peak, supports_zero_copy, ReadMetadata, ReadMetadataAsync)
RandomAccessFile: InputStream, Seekable (GetStream, GetSize, ReadAt, ReadAsync, ReadManyAsync, WillNeed)
WritableFile: OutputStream (WriteAt)
ReadWriteFileInterface: RandomAccessFile, WritableFile

If OutputStream has enable_shared_from_this, ReadWriteFileInterface might meet problem of multi-base class enable_shared_from_this?

@@ -1472,36 +1515,9 @@ class ObjectOutputStream final : public io::OutputStream {
RETURN_NOT_OK(UploadPart("", 0));
}

auto self = shared_from_this();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another problem is that we cannot call shared_from_this in dtor ( https://stackoverflow.com/questions/28338978/stdenable-shared-from-this-is-it-allowed-to-call-shared-from-this-in-destru ), so I separate a FinishPartUploadAfterFlush.

@mapleFU mapleFU requested review from lidavidm and pitrou September 15, 2023 10:43
@github-actions github-actions bot added awaiting committer review Awaiting committer review and removed awaiting review Awaiting review labels Sep 15, 2023
@mapleFU mapleFU changed the title GH-37670: [C++] IO Output extend from enable_shared_from_this GH-37670: [C++] S3 IO Output extend from enable_shared_from_this Sep 15, 2023
@pitrou
Copy link
Member

pitrou commented Sep 18, 2023

If OutputStream has enable_shared_from_this, ReadWriteFileInterface might meet problem of multi-base class enable_shared_from_this?

If that's the case, then we can move enable_shared_from_this at the upper inheritance level?

@mapleFU
Copy link
Member Author

mapleFU commented Sep 18, 2023

Sure, let me try it

@mapleFU
Copy link
Member Author

mapleFU commented Sep 18, 2023

After some trying, we have FileInterface, and can make FileInterface extended from enable_shared_from_this.

However, because of the derive is virtual, we need to change checked_pointer_cast (which might use std::static_pointer_cast) to std::dynamic_pointer_cast, like below:

// Default ReadMetadataAsync() implementation: simply issue the read on the context's
// executor
Future<std::shared_ptr<const KeyValueMetadata>> InputStream::ReadMetadataAsync(
    const IOContext& ctx) {
  std::shared_ptr<InputStream> self = std::dynamic_pointer_cast<InputStream>(shared_from_this());
  return DeferNotOk(internal::SubmitIO(ctx, [self] { return self->ReadMetadata(); }));
}

Do you think this is ok? @pitrou

@pitrou
Copy link
Member

pitrou commented Sep 18, 2023

cc @bkietz for opinions on dynamic_pointer_cast and diamond inheritance from enable_shared_from_this

@bkietz
Copy link
Member

bkietz commented Sep 18, 2023

dynamic_pointer_cast is indeed necessary to move shared_from_this to FileInterface. AFAIU, this shouldn't cause any problems (technically enable_shared_from_this will not be a virtual base, only FileInterface is); enable_shared_from_this is usually implemented as a data member private weak pointer to itself so we don't need to worry about offsetof hackery falling down.

@mapleFU
Copy link
Member Author

mapleFU commented Sep 19, 2023

@pitrou @bkietz I've done moving enable_shared_from_this to parent, would you mind take a look?

Copy link
Member

@bkietz bkietz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good, just a few nits

cpp/src/arrow/filesystem/s3fs.cc Outdated Show resolved Hide resolved
cpp/src/arrow/filesystem/s3fs_test.cc Show resolved Hide resolved
@github-actions github-actions bot added awaiting changes Awaiting changes and removed awaiting committer review Awaiting committer review labels Sep 19, 2023
@github-actions github-actions bot added awaiting change review Awaiting change review and removed awaiting changes Awaiting changes labels Sep 19, 2023
@mapleFU mapleFU force-pushed the io/extend-from-enable-share-from-this branch from 009d8cb to d87f09b Compare September 19, 2023 13:49
@mapleFU
Copy link
Member Author

mapleFU commented Sep 19, 2023

@bkietz I've resolve the comment, would you mind take a look again?

Copy link
Member

@bkietz bkietz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I wasn't specific enough:

cpp/src/arrow/filesystem/s3fs_test.cc Outdated Show resolved Hide resolved
@github-actions github-actions bot added awaiting changes Awaiting changes awaiting review Awaiting review and removed awaiting change review Awaiting change review labels Sep 19, 2023
@github-actions github-actions bot added awaiting review Awaiting review awaiting merge Awaiting merge and removed awaiting review Awaiting review awaiting changes Awaiting changes labels Sep 19, 2023
@mapleFU mapleFU changed the title GH-37670: [C++] S3 IO Output extend from enable_shared_from_this GH-37670: [C++] IO FileInterface extend from enable_shared_from_this Sep 19, 2023
@mapleFU
Copy link
Member Author

mapleFU commented Sep 19, 2023

RemoteHost: 127.0.0.1
Host: 127.0.0.8:51443
UserAgent: aws-sdk-cpp/1.11.68 Windows/10.0.17763.4252 AMD64 MSVC/1929
Error: file access denied (cmd.StorageErr)
       6: internal\logger\logger.go:278:logger.LogIf()
       5: cmd\fs-v1-helpers.go:430:cmd.fsDeleteFile()
       4: cmd\fs-v1.go:1196:cmd.(*FSObjects).DeleteObject()
       3: cmd\fs-v1.go:1142:cmd.(*FSObjects).DeleteObjects()
       2: cmd\bucket-handlers.go:592:cmd.objectAPIHandlers.DeleteMultipleObjectsHandler()
       1: net\http\server.go:2047:http.HandlerFunc.ServeHTTP()
API: DeleteMultipleObjects(bucket=stress, multiObject=true, numberOfObjects=1000)
Time: 14:48:36 UTC 09/19/2023
DeploymentID: e8cea2aa-a72f-4074-90e9-0d3a974514d4
RequestID: 178654398B35E984
RemoteHost: 127.0.0.1
Host: 127.0.0.8:51443
UserAgent: aws-sdk-cpp/1.11.68 Windows/10.0.17763.4252 AMD64 MSVC/1929
Error: file access denied (cmd.StorageErr)
       6: internal\logger\logger.go:278:logger.LogIf()
       5: cmd\fs-v1-helpers.go:430:cmd.fsDeleteFile()
       4: cmd\fs-v1.go:1196:cmd.(*FSObjects).DeleteObject()
       3: cmd\fs-v1.go:1142:cmd.(*FSObjects).DeleteObjects()
       2: cmd\bucket-handlers.go:592:cmd.objectAPIHandlers.DeleteMultipleObjectsHandler()
       1: net\http\server.go:2047:http.HandlerFunc.ServeHTTP()
[FATAL] 2023-09-19 14:47:20.923 WinHttpSyncHttpClient [5648] Failed setting secure crypto protocols with error code: 87
[FATAL] 2023-09-19 14:47:21.524 WinHttpSyncHttpClient [5648] Failed setting secure crypto protocols with error code: 87
C:/projects/arrow/cpp/src/arrow/filesystem/s3fs_test.cc(865): error: Failed
'fs_->DeleteDirContents("stress")' failed with IOError: Got the following 1 errors when deleting objects in S3 bucket 'stress':
- key '1/508': Access Denied.
C:/projects/arrow/cpp/src/arrow/status.cc:155: When trying to delete temporary directory: IOError: Cannot delete directory entry 'C:/Users/appveyor/AppData/Local/Temp/1/s3fs-test-1v1m07bb/.minio.sys': . Detail: [Windows error 145] The directory is not empty.
[  FAILED  ] TestS3FS.GetFileInfoGeneratorStress (75971 ms)
[ RUN      ] TestS3FS.GetFileInfoGeneratorCancelled
[FATAL] 2023-09-19 14:48:36.903 WinHttpSyncHttpClient [5648] Failed setting secure crypto protocols with error code: 87

Hmm seems this is not caused by me...?

@bkietz
Copy link
Member

bkietz commented Sep 19, 2023

Yes, that seems to be known flakiness #30568

I've restarted that CI job

@bkietz bkietz merged commit 0e6a683 into apache:main Sep 19, 2023
@bkietz bkietz removed the awaiting merge Awaiting merge label Sep 19, 2023
@mapleFU mapleFU deleted the io/extend-from-enable-share-from-this branch September 19, 2023 16:52
@conbench-apache-arrow
Copy link

After merging your PR, Conbench analyzed the 6 benchmarking runs that have been run so far on merge-commit 0e6a683.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about possible false positives for unstable benchmarks that are known to sometimes produce them.

loicalleyne pushed a commit to loicalleyne/arrow that referenced this pull request Nov 13, 2023
…_this (apache#37713)

### Rationale for this change

S3 `FlushAsync` might has lifetime problem, this patch fixes that.

### What changes are included in this PR?

1. Move `enable_shared_from_this` to `FileInterface`
2. Update S3 `FlushAsync`
3. Implement sync Flush to avoid call `share_from_this` in dtor.

### Are these changes tested?

no

### Are there any user-facing changes?

no

* Closes: apache#37670

Lead-authored-by: mwish <[email protected]>
Co-authored-by: mwish <[email protected]>
Co-authored-by: Benjamin Kietzman <[email protected]>
Signed-off-by: Benjamin Kietzman <[email protected]>
dgreiss pushed a commit to dgreiss/arrow that referenced this pull request Feb 19, 2024
…_this (apache#37713)

### Rationale for this change

S3 `FlushAsync` might has lifetime problem, this patch fixes that.

### What changes are included in this PR?

1. Move `enable_shared_from_this` to `FileInterface`
2. Update S3 `FlushAsync`
3. Implement sync Flush to avoid call `share_from_this` in dtor.

### Are these changes tested?

no

### Are there any user-facing changes?

no

* Closes: apache#37670

Lead-authored-by: mwish <[email protected]>
Co-authored-by: mwish <[email protected]>
Co-authored-by: Benjamin Kietzman <[email protected]>
Signed-off-by: Benjamin Kietzman <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[C++] FS: Would S3 OutputStream CloseAsync meets lifetime issue?
3 participants