Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-44846: [C++] Fix thread-unsafe access in ConcurrentQueue::UnsyncFront #44849

Merged
merged 1 commit into from
Nov 26, 2024

Conversation

pitrou
Copy link
Member

@pitrou pitrou commented Nov 25, 2024

Rationale for this change

The UnsyncFront method claims that it can access a std::deque's front element without synchronizing with another thread that would call std::deque::push_back, but std::deque::front can actually be implemented in terms of std::deque::begin while std::deque::push_back is specified to invalidate iterators.

In the end, UnsyncFront is concretely thread-unsafe even though it might ideally be thread-safe. This shows up as occasional Thread Sanitizer failures.

What changes are included in this PR?

Replace UnsyncFront with a thread-safe Front method.

Are these changes tested?

Yes.

Are there any user-facing changes?

No.

@pitrou
Copy link
Member Author

pitrou commented Nov 25, 2024

@github-actions crossbow submit test-ubuntu-24.04-cpp-thread-sanitizer

Copy link

Unable to match any tasks for `[test-ubuntu-24.04-cpp-thread-sanitizer](https://github.com/ursacomputing/crossbow/actions/runs/12001035805/job/33450981551)`
The Archery job run can be found at: https://github.com/apache/arrow/actions/runs/12013415989

Copy link

Revision: 2f92a5b

Submitted crossbow builds: ursacomputing/crossbow @ actions-5084f8405a

Task Status
test-ubuntu-24.04-cpp-thread-sanitizer GitHub Actions

@pitrou pitrou marked this pull request as ready for review November 25, 2024 16:11
@pitrou pitrou requested a review from westonpace as a code owner November 25, 2024 16:11
@pitrou pitrou requested review from felipecrv and bkietz November 25, 2024 16:14
@pitrou
Copy link
Member Author

pitrou commented Nov 25, 2024

@github-actions crossbow submit -g cpp

Copy link

Revision: 2f92a5b

Submitted crossbow builds: ursacomputing/crossbow @ actions-7059b221ca

Task Status
example-cpp-minimal-build-static GitHub Actions
example-cpp-minimal-build-static-system-dependency GitHub Actions
example-cpp-tutorial GitHub Actions
test-alpine-linux-cpp GitHub Actions
test-build-cpp-fuzz GitHub Actions
test-conda-cpp GitHub Actions
test-conda-cpp-valgrind GitHub Actions
test-cuda-cpp-ubuntu-20.04-cuda-11.2.2 GitHub Actions
test-cuda-cpp-ubuntu-22.04-cuda-11.7.1 GitHub Actions
test-debian-12-cpp-amd64 GitHub Actions
test-debian-12-cpp-i386 GitHub Actions
test-fedora-39-cpp GitHub Actions
test-ubuntu-20.04-cpp GitHub Actions
test-ubuntu-20.04-cpp-bundled GitHub Actions
test-ubuntu-22.04-cpp GitHub Actions
test-ubuntu-22.04-cpp-20 GitHub Actions
test-ubuntu-22.04-cpp-emscripten GitHub Actions
test-ubuntu-22.04-cpp-no-threading GitHub Actions
test-ubuntu-24.04-cpp GitHub Actions
test-ubuntu-24.04-cpp-bundled-offline GitHub Actions
test-ubuntu-24.04-cpp-gcc-13-bundled GitHub Actions
test-ubuntu-24.04-cpp-gcc-14 GitHub Actions
test-ubuntu-24.04-cpp-minimal-with-formats GitHub Actions
test-ubuntu-24.04-cpp-thread-sanitizer GitHub Actions

size_t UnsyncSize() const { return queue_.size(); }
const T& Front() const {
// Need to lock the queue because `front()` may be implemented in terms
// of `begin()`, which isn't safe with concurrent calls to e.g. `push()`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this push means deque::push_back/front or ConcurrentQueue::Push()?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here it means std::queue::push :) (which is concretely the same as std::deque::push_back)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, the original code is too optimistic. The deque might implemented as something like std::vector<block*> and block* for elements, the front() might visit std::vector<block*> and push_back might enlarge it, so it really has problem here

@github-actions github-actions bot added awaiting committer review Awaiting committer review and removed awaiting review Awaiting review labels Nov 25, 2024
Copy link
Member

@westonpace westonpace left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for tackling this. Approve minus one question.

// of `begin()`, which isn't safe with concurrent calls to e.g. `push()`.
// (see GH-44846)
std::unique_lock<std::mutex> lock(mutex_);
return queue_.front();
Copy link
Member

@westonpace westonpace Nov 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it safe to return a reference or does a copy need to be made (the reference will outlive the lock)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

References to deque contents are not invalidated by container mutations, so it's ok.
(unlike iterators, which are invalidated)

@github-actions github-actions bot added awaiting review Awaiting review awaiting merge Awaiting merge and removed awaiting committer review Awaiting committer review labels Nov 25, 2024
@mapleFU
Copy link
Member

mapleFU commented Nov 25, 2024

🤔 would std::list be ok here? Should we merge this or trying list since it would not invalidate anything ( but has less cache locality and requires more spaces)

@github-actions github-actions bot added awaiting committer review Awaiting committer review and removed awaiting review Awaiting review awaiting merge Awaiting merge labels Nov 25, 2024
@pitrou
Copy link
Member Author

pitrou commented Nov 26, 2024

std::list will perform much more poorly. My intuition is that locking the front() access is better.

@pitrou pitrou merged commit c4d17fd into apache:main Nov 26, 2024
47 of 48 checks passed
@pitrou pitrou removed the awaiting committer review Awaiting committer review label Nov 26, 2024
@pitrou pitrou deleted the gh44846-tsan-concurrent-queue branch November 26, 2024 09:37
@pitrou
Copy link
Member Author

pitrou commented Nov 26, 2024

Thanks for the reviews @westonpace @mapleFU !

Copy link

After merging your PR, Conbench analyzed the 3 benchmarking runs that have been run so far on merge-commit c4d17fd.

There were 132 benchmark results with an error:

There were no benchmark performance regressions. 🎉

The full Conbench report has more details.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants