GH-40068: [C++] Possible data race when reading metadata of a parquet file #40111
…e we access the value.
@github-actions crossbow submit -g cpp
@raulcd This would be good for 15.0.1, if not too late.
Revision: a3fe9ec Submitted crossbow builds: ursacomputing/crossbow @ actions-f5521fde34
After merging your PR, Conbench analyzed the 7 benchmarking runs that have been run so far on merge-commit a7ac7e0. There were no benchmark performance regressions. 🎉 The full Conbench report has more details. It also includes information about 3 possible false positives for unstable benchmarks that are known to sometimes produce them.
…arquet file (apache#40111) ### Rationale for this change The `ParquetFileFragment` will cache the parquet metadata when loading it. The `metadata()` method accesses this metadata (a shared_ptr) but does not grab the lock used to set that shared_ptr. It's possible then that we are reading a shared_ptr at the same time some other thread is setting the shared_ptr which is technically (I think) undefined behavior. ### What changes are included in this PR? Guard access to the metadata by grabbing the mutex first ### Are these changes tested? Existing tests should regress this change ### Are there any user-facing changes? No * Closes: apache#40068 Authored-by: Weston Pace <[email protected]> Signed-off-by: Antoine Pitrou <[email protected]>
### Rationale for this change

The `ParquetFileFragment` will cache the parquet metadata when loading it. The `metadata()` method accesses this metadata (a shared_ptr) but does not grab the lock used to set that shared_ptr. It's possible then that we are reading a shared_ptr at the same time some other thread is setting the shared_ptr, which is technically (I think) undefined behavior.

### What changes are included in this PR?

Guard access to the metadata by grabbing the mutex first.

### Are these changes tested?

Existing tests should serve as regression tests for this change.

### Are there any user-facing changes?

No
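
To illustrate the pattern described under "What changes are included in this PR?", here is a minimal, self-contained sketch. It is not the actual Arrow code: `FileMetadata`, `CachingFragment`, `EnsureMetadataCached` and `RunConcurrentAccess` are hypothetical names invented for this example, and the sketch assumes a plain `std::mutex` guards the cached `shared_ptr`. The point it shows is that the reader takes the same mutex as the writer before copying the pointer, so a read can never overlap a write.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for the parquet file metadata; not an Arrow type.
struct FileMetadata {
  int num_rows = 0;
};

// Hypothetical stand-in for a fragment that lazily caches its metadata.
class CachingFragment {
 public:
  // Reader: take the same mutex the writer uses, then copy the shared_ptr.
  // Without the lock, this copy could race with the assignment below.
  std::shared_ptr<FileMetadata> metadata() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return metadata_;
  }

  // Writer: build the metadata and cache it once, under the mutex.
  void EnsureMetadataCached() {
    std::lock_guard<std::mutex> lock(mutex_);
    if (metadata_ == nullptr) {
      auto loaded = std::make_shared<FileMetadata>();
      loaded->num_rows = 42;  // Arbitrary value for the illustration.
      metadata_ = std::move(loaded);
    }
  }

 private:
  mutable std::mutex mutex_;  // mutable so const readers can lock it.
  std::shared_ptr<FileMetadata> metadata_;
};

// Runs several readers concurrently with one writer and returns the
// row count that ends up cached.
int RunConcurrentAccess() {
  CachingFragment fragment;
  std::vector<std::thread> threads;
  threads.emplace_back([&fragment] { fragment.EnsureMetadataCached(); });
  for (int i = 0; i < 4; ++i) {
    threads.emplace_back([&fragment] {
      // A reader sees either "not cached yet" or a fully built object.
      std::shared_ptr<FileMetadata> md = fragment.metadata();
      assert(md == nullptr || md->num_rows == 42);
    });
  }
  for (auto& t : threads) t.join();
  return fragment.metadata()->num_rows;
}
```

Returning the `shared_ptr` by value while the lock is held means the caller keeps its own reference after the lock is released, so later writes to the cached member cannot affect the copy the caller already holds.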