add support for async backtraces of Tasks on any thread #51430

Merged Sep 25, 2023 (2 commits)

@vtjnash commented Sep 22, 2023

No description provided.

@vtjnash added the multithreading (Base.Threads and related functionality) and feature (Indicates new feature / enhancement requests) labels Sep 22, 2023
The jl_live_tasks API now reports Tasks from all threads, instead of only Tasks
first started by the current thread. There is a new abstraction called
mtarraylist, which adds functionality to small_arraylist (it is
layout-compatible). In particular, it makes it safe for another thread
to observe the content of the list concurrently with any mutations.
@vtjnash merged commit e5c6340 into master Sep 25, 2023
1 check passed
@vtjnash deleted the jn/async-stacktrace branch September 25, 2023 11:20
kpamnany pushed a commit to RelationalAI/julia that referenced this pull request Sep 25, 2023
@kpamnany
Contributor

What this PR does (for future reference):

Previously, the C/C++ part of the runtime had only arraylist_t and small_arraylist_t for lists of things; neither is inherently thread-safe, so both need an external lock.

This PR adds small_mtarraylist_t, which is basically a small_arraylist_t with atomics: multiple threads can safely read the list concurrently with a single writer thread that pushes and adds elements (with implicit resizes when necessary).
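
To make the single-writer, multi-reader idea concrete, here is a minimal sketch using C11 atomics. It is not the runtime's mtarraylist code: every name (sketch_mtlist_t, sketch_mtlist_push, and so on) is hypothetical, and the way old buffers are kept alive until teardown is a simplifying assumption rather than what the PR does.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch; not the Julia runtime's mtarraylist implementation. */
#define SKETCH_MAX_RETIRED 48

typedef struct {
    _Atomic(size_t) len;     /* number of published elements */
    _Atomic(void **) items;  /* current backing buffer */
    size_t max;              /* capacity; only the writer touches it */
    void **retired[SKETCH_MAX_RETIRED]; /* old buffers readers may still hold */
    size_t nretired;
} sketch_mtlist_t;

static void sketch_mtlist_init(sketch_mtlist_t *l)
{
    void **buf = calloc(4, sizeof(void *));
    assert(buf != NULL);
    l->max = 4;
    l->nretired = 0;
    atomic_init(&l->items, buf);
    atomic_init(&l->len, 0);
}

/* Writer thread only: append, growing the buffer when it is full. */
static void sketch_mtlist_push(sketch_mtlist_t *l, void *elt)
{
    size_t len = atomic_load_explicit(&l->len, memory_order_relaxed);
    void **buf = atomic_load_explicit(&l->items, memory_order_relaxed);
    if (len == l->max) {
        size_t newmax = l->max * 2;
        void **newbuf = calloc(newmax, sizeof(void *));
        assert(newbuf != NULL && l->nretired < SKETCH_MAX_RETIRED);
        memcpy(newbuf, buf, len * sizeof(void *));
        /* Publish the copy before retiring the old buffer. */
        atomic_store_explicit(&l->items, newbuf, memory_order_release);
        l->retired[l->nretired++] = buf; /* not freed: readers may use it */
        l->max = newmax;
        buf = newbuf;
    }
    buf[len] = elt;
    /* The release store makes the new element visible to acquire loads. */
    atomic_store_explicit(&l->len, len + 1, memory_order_release);
}

/* Any thread: number of elements published so far. */
static size_t sketch_mtlist_length(sketch_mtlist_t *l)
{
    return atomic_load_explicit(&l->len, memory_order_acquire);
}

/* Any thread: element at idx, or NULL if idx is not published yet. */
static void *sketch_mtlist_get(sketch_mtlist_t *l, size_t idx)
{
    size_t len = atomic_load_explicit(&l->len, memory_order_acquire);
    if (idx >= len)
        return NULL;
    void **buf = atomic_load_explicit(&l->items, memory_order_acquire);
    return buf[idx];
}

/* Call only once no reader can still be using the list. */
static void sketch_mtlist_destroy(sketch_mtlist_t *l)
{
    for (size_t i = 0; i < l->nretired; i++)
        free(l->retired[i]);
    free(atomic_load_explicit(&l->items, memory_order_relaxed));
}
```

The point of the sketch is the ordering: the writer fills a slot (or a whole new buffer) first and only then publishes it with a release store, so a reader that observes a given length can always find the elements below that length in whichever buffer it loads afterwards.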

Three lists in ptls->heap (weak_refs, live_tasks and free_stacks) are converted to small_mtarraylist_t, and the code that touches them has been updated to use the new thread-safe accessors.

Then, jl_live_tasks() is also updated and, it looks like, hardened somewhat to handle foreign threads and GC threads correctly.
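
As a rough illustration of what "report tasks from all threads, and tolerate threads without runtime state" can look like, here is a small sketch. The types and the function sketch_collect_live_tasks are hypothetical, and representing a foreign or not-yet-initialized thread as a NULL entry is an assumption made for the example, not a description of the actual jl_live_tasks() code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical per-thread state; not the runtime's ptls layout. */
typedef struct {
    _Atomic(size_t) ntasks; /* number of published task pointers */
    void *tasks[64];        /* stand-in for a per-thread live_tasks list */
} sketch_thread_t;

/*
 * Copy up to `cap` live task pointers from every thread into `out` and
 * return how many were copied. NULL thread entries are skipped, and the
 * `cap` bound keeps the copy safe even if lists grew after `out` was sized.
 */
static size_t sketch_collect_live_tasks(sketch_thread_t **threads,
                                        size_t nthreads,
                                        void **out, size_t cap)
{
    assert(out != NULL || cap == 0);
    size_t j = 0;
    for (size_t t = 0; t < nthreads; t++) {
        sketch_thread_t *th = threads[t];
        if (th == NULL)
            continue; /* assumed: thread has no runtime state */
        size_t n = atomic_load_explicit(&th->ntasks, memory_order_acquire);
        for (size_t i = 0; i < n && j < cap; i++) {
            void *task = th->tasks[i];
            if (task != NULL)
                out[j++] = task;
        }
    }
    return j;
}
```

A caller would typically size `out` from a first pass over the per-thread counts and then treat the return value, not that first count, as the number of valid entries.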

There are some changes to signal_listener, jl_thread_suspend_and_get_state and jl_thread_resume that I think are used by the profiling mechanism, but I'm not very clear on what's going on there.

All of that essentially enables the following: jl_rec_backtrace() no longer simply skips over sticky tasks or tasks running on other threads; it now uses jl_thread_suspend_and_get_state to stop the thread and get its state (which includes the backtrace). And jl_print_task_backtraces() is updated to use the correct (small_mtarraylist_t) calls to access live_tasks.

@oscardssmith
Member

What's the difference between an arraylist_t and a Julia Vector (which we can also use from the C side)?

@vtjnash
Member Author

vtjnash commented Sep 26, 2023

An arraylist_t is not a safepoint, and as such, cannot hold GC references and cannot be reclaimed by the GC. In most cases, Vector{Int} is the preferable thing to use, if the GC is available and initialized on that thread.

kpamnany pushed a commit to RelationalAI/julia that referenced this pull request Sep 26, 2023
vtjnash pushed a commit that referenced this pull request Sep 27, 2023
In `jl_print_task_backtraces()`. Follow-on to
#51430.
kpamnany pushed a commit to RelationalAI/julia that referenced this pull request Sep 27, 2023
kpamnany added a commit to RelationalAI/julia that referenced this pull request Sep 27, 2023
In `jl_print_task_backtraces()`. Follow-on to
JuliaLang#51430.
@kpamnany
Contributor

The previous implementation of this capability was insufficiently safe and segfaulted on occasion, so this is actually a bugfix. Can we backport it to 1.10, @vtjnash and @KristofferC?

@kpamnany
Contributor

If so, #51471 should be backported also.

kpamnany pushed a commit to RelationalAI/julia that referenced this pull request Oct 19, 2023
kpamnany added a commit to RelationalAI/julia that referenced this pull request Oct 19, 2023
In `jl_print_task_backtraces()`. Follow-on to
JuliaLang#51430.
@kpamnany added the backport 1.10 (Change should be backported to the 1.10 release) label Oct 20, 2023
DelveCI pushed a commit to RelationalAI/julia that referenced this pull request Oct 20, 2023
DelveCI pushed a commit to RelationalAI/julia that referenced this pull request Nov 1, 2023
In `jl_print_task_backtraces()`. Follow-on to
JuliaLang#51430.
DelveCI pushed a commit to RelationalAI/julia that referenced this pull request Nov 2, 2023
DelveCI pushed a commit to RelationalAI/julia that referenced this pull request Nov 2, 2023
In `jl_print_task_backtraces()`. Follow-on to
JuliaLang#51430.
@KristofferC mentioned this pull request Nov 6, 2023
39 tasks
DelveCI pushed a commit to RelationalAI/julia that referenced this pull request Nov 7, 2023
DelveCI pushed a commit to RelationalAI/julia that referenced this pull request Nov 7, 2023
In `jl_print_task_backtraces()`. Follow-on to
JuliaLang#51430.
DelveCI pushed a commit to RelationalAI/julia that referenced this pull request Nov 10, 2023
DelveCI pushed a commit to RelationalAI/julia that referenced this pull request Nov 10, 2023
In `jl_print_task_backtraces()`. Follow-on to
JuliaLang#51430.
DelveCI pushed a commit to RelationalAI/julia that referenced this pull request Nov 14, 2023
DelveCI pushed a commit to RelationalAI/julia that referenced this pull request Nov 14, 2023
In `jl_print_task_backtraces()`. Follow-on to
JuliaLang#51430.
DelveCI pushed a commit to RelationalAI/julia that referenced this pull request Nov 15, 2023
DelveCI pushed a commit to RelationalAI/julia that referenced this pull request Nov 15, 2023
In `jl_print_task_backtraces()`. Follow-on to
JuliaLang#51430.
kpamnany pushed a commit to RelationalAI/julia that referenced this pull request Nov 16, 2023
kpamnany added a commit to RelationalAI/julia that referenced this pull request Nov 16, 2023
In `jl_print_task_backtraces()`. Follow-on to
JuliaLang#51430.
KristofferC pushed a commit that referenced this pull request Nov 27, 2023
KristofferC pushed a commit that referenced this pull request Nov 27, 2023
In `jl_print_task_backtraces()`. Follow-on to
#51430.

(cherry picked from commit cde964f)
KristofferC added a commit that referenced this pull request Dec 2, 2023
Backported PRs:
- [x] #51213 <!-- Wait for other threads to finish compiling before
exiting -->
- [x] #51520 <!-- Make allocopt respect the GC verifier rules with non
usual address spaces -->
- [x] #51598 <!-- Use a simple error when reporting sysimg load
failures. -->
- [x] #51757 <!-- fix parallel peakflop usage -->
- [x] #51781 <!-- Don't make pkgimages global editable -->
- [x] #51848 <!-- allow finalizers to take any locks and yield during
exit -->
- [x] #51847 <!-- add missing wait during Timer and AsyncCondition close
-->
- [x] #50824 <!-- Add some aliasing warnings to docstrings for mutating
functions in Base -->
- [x] #51885 <!-- remove chmodding the pkgimages -->
- [x] #50207 <!-- [devdocs] Improve documentation about building
external forks of LLVM -->
- [x] #51967 <!-- further fix to the new promoting method for
AbstractDateTime subtraction -->
- [x] #51980 <!-- macroexpand: handle const/atomic struct fields
correctly -->
- [x] #51995 <!-- [Artifacts] Pass artifacts dictionary to
`ensure_artifact_installed` dispatch -->
- [x] #52098 <!-- Fix errors in `sort` docstring -->
- [x] #52136 <!-- Bump JuliaSyntax to 0.4.7 -->
- [x] #52140 <!-- Make c func `abspath` consistent on Windows. Fix
tracking path conversion. -->
- [x] #52009 <!-- fix completion that resulted in startpos of 0 for `\\
-->
- [x] #52192 <!-- cap the number of GC threads to number of cpu cores
-->
- [x] #52206 <!-- Make have_fma consistent between interpreter and
compiled -->
- [x] #52027 <!-- fix Unicode.julia_chartransform for Julia 1.10 -->
- [x] #52217 <!-- More helpful error message for empty `cpu_target` in
`Base.julia_cmd` -->
- [x] #51371 <!-- Memoize `cwstring` when used for env lookup /
modification on Windows -->
- [x] #52214 <!-- Turn Method Overwritten Error into a PrecompileError
-- turning off caching -->
- [x] #51895 <!-- Devdocs on fixing precompile hangs, take 2 -->
- [x] #51596 <!-- Reland "Don't mark nonlocal symbols as hidden"" -->
- [x] #51834 <!-- [REPLCompletions] allow symbol completions within
incomplete macrocall expression -->
- [x] #52010 <!-- Revert "Support sorting iterators (#46104)" -->
- [x] #51430 <!-- add support for async backtraces of Tasks on any
thread -->
- [x] #51471 <!-- Fix segfault if root task is NULL -->
- [x] #52194 <!-- Fix multiversioning issues caused by the parallel llvm
work -->
- [x] #51035 <!-- refactor GC scanning code to reflect jl_binding_t are
now first class -->
- [x] #52030 <!-- Bump Statistics -->
- [x] #52189 <!-- codegen: ensure i1 bool is widened to i8 before
storing -->
- [x] #52228 <!-- Widen diagonal var during `Type` unwrapping in
`instanceof_tfunc` -->
- [x] #52182 <!-- jitlayers: replace sharedbytes intern pool with one
that respects alignment -->

Contains multiple commits, manual intervention needed:
- [ ] #51092 <!-- inference: fix bad effects for recursion -->

Non-merged PRs with backport label:
- [ ] #52196 <!-- Fix creating custom log level macros -->
- [ ] #52170 <!-- fix invalidations related to `ismutable` -->
- [ ] #51479 <!-- prevent code loading from looking in the versioned
environment when building Julia -->
@KristofferC removed the backport 1.10 (Change should be backported to the 1.10 release) label Dec 2, 2023