forked from JuliaLang/julia
Update/julia master #2
Merged
Conversation
qinsoon
approved these changes
Jan 31, 2023
LGTM
qinsoon
pushed a commit
to qinsoon/julia
that referenced
this pull request
Feb 2, 2023
This PR updates the binding to the latest Julia master (up to this commit: 134f3e7).
kpamnany
pushed a commit
that referenced
this pull request
Mar 13, 2023
This PR updates the binding to the latest Julia master (up to this commit: 134f3e7).
kpamnany
pushed a commit
that referenced
this pull request
Mar 15, 2023
This PR updates the binding to the latest Julia master (up to this commit: 134f3e7).
qinsoon
pushed a commit
to qinsoon/julia
that referenced
this pull request
May 2, 2024
…#51489)

This exposes the GC "stop the world" API to the user, for causing a thread to quickly stop executing Julia code. This adds two APIs (that will need to be exported and documented later):

```
julia> @ccall jl_safepoint_suspend_thread(#=tid=#1::Cint, #=magicnumber=#2::Cint)::Cint # roughly tkill(1, SIGSTOP)
julia> @ccall jl_safepoint_resume_thread(#=tid=#1::Cint)::Cint # roughly tkill(1, SIGCONT)
```

You can even suspend yourself, if there is another task to resume you 10 seconds later:

```
julia> ccall(:jl_enter_threaded_region, Cvoid, ())

julia> t = @task let; Libc.systemsleep(10); print("\nhello from $(Threads.threadid())\n"); @ccall jl_safepoint_resume_thread(0::Cint)::Cint; end; ccall(:jl_set_task_tid, Cint, (Any, Cint), t, 1); schedule(t);

julia> @time @ccall jl_safepoint_suspend_thread(0::Cint, 2::Cint)::Cint

hello from 2
  10 seconds (6 allocations: 264 bytes)
1
```

The meaning of the magic number is actually the kind of stop that you want:

```
// n.b. suspended threads may still run in the GC or GC safe regions
// but shouldn't be observable, depending on which enum the user picks (only 1 and 2 are typically recommended here)
// waitstate = 0 : do not wait for suspend to finish
// waitstate = 1 : wait for gc_state != 0 (JL_GC_STATE_WAITING or JL_GC_STATE_SAFE)
// waitstate = 2 : wait for gc_state != 0 (JL_GC_STATE_WAITING or JL_GC_STATE_SAFE) and that GC is not running on that thread
// waitstate = 3 : wait for full suspend (gc_state == JL_GC_STATE_WAITING) -- this may never happen if thread is sleeping currently
// if another thread comes along and calls jl_safepoint_resume, we also return early
// return new suspend count on success, 0 on failure
```

Only magic number 2 is currently meaningful to the user though. The difference between waitstate 1 and 2 is only relevant in C code which is calling this from JL_GC_STATE_SAFE, since otherwise it is a priori known that GC isn't running, else we too would be running the GC. But the distinction of those states might be useful if we have a concurrent collector.

Very important warning: if the stopped thread is holding any locks (e.g. for codegen or types) that you then attempt to acquire, your thread will deadlock. This is very likely, unless you are very careful. A future update to this API may try to change the waitstate to give the option to wait for the thread to release internal or known locks.
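The waitstate values described above could be given names in a thin Julia wrapper. This is only a sketch: `WaitState`, its member names, `suspend_thread` and `resume_thread` are hypothetical and are not part of the commit; only the two `jl_safepoint_*` C symbols and the meaning of the numbers come from the message. It assumes a Julia build that contains this commit, and the wrappers are deliberately not called here because of the deadlock warning.

```julia
# Hypothetical names for the four waitstate values listed in the commit message.
@enum WaitState::Cint begin
    NO_WAIT      = 0  # do not wait for the suspend to finish
    WAIT_GC_SAFE = 1  # wait for gc_state != 0
    WAIT_GC_IDLE = 2  # like 1, and GC is not running on that thread
    WAIT_FULL    = 3  # wait for gc_state == JL_GC_STATE_WAITING (may never happen)
end

# `tid` is the 0-based runtime thread id, i.e. Threads.threadid() - 1,
# matching the self-suspend example where thread 0 is the main thread.
suspend_thread(tid::Integer, state::WaitState = WAIT_GC_IDLE) =
    @ccall jl_safepoint_suspend_thread(tid::Cint, Cint(state)::Cint)::Cint

resume_thread(tid::Integer) =
    @ccall jl_safepoint_resume_thread(tid::Cint)::Cint
```

Defaulting to `WAIT_GC_IDLE` mirrors the statement that only magic number 2 is currently meaningful to the user. Per the commit message, a return value of 0 from the suspend call indicates failure.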
qinsoon
pushed a commit
to qinsoon/julia
that referenced
this pull request
May 2, 2024
`@something` eagerly unwraps any `Some` given to it, while keeping the variable between its arguments the same. This can be an issue if a previously unpacked value is used as input to `@something`, leading to a type instability on more than two arguments (e.g. because of a fallback to `Some(nothing)`). By using different variables for each argument, type inference has an easier time handling these cases that are isolated to single branches anyway. This also adds some comments to the macro, since it's non-obvious what it does. Benchmarking the specific case I encountered this in led to a ~2x performance improvement on multiple machines. 1.10-beta3/master: ``` [sukera@tower 01]$ jl1100 -q --project=. -L 01.jl -e 'bench()' v"1.10.0-beta3" BenchmarkTools.Trial: 10000 samples with 1 evaluation. Range (min … max): 38.670 μs … 70.350 μs ┊ GC (min … max): 0.00% … 0.00% Time (median): 43.340 μs ┊ GC (median): 0.00% Time (mean ± σ): 43.395 μs ± 1.518 μs ┊ GC (mean ± σ): 0.00% ± 0.00% ▆█▂ ▁▁ ▂▂▂▂▂▂▂▂▂▁▂▂▂▃▃▃▂▂▃▃▃▂▂▂▂▂▄▇███▆██▄▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂ ▃ 38.7 μs Histogram: frequency by time 48 μs < Memory estimate: 0 bytes, allocs estimate: 0. ``` This PR: ``` [sukera@tower 01]$ julia -q --project=. -L 01.jl -e 'bench()' v"1.11.0-DEV.970" BenchmarkTools.Trial: 10000 samples with 1 evaluation. Range (min … max): 22.820 μs … 44.980 μs ┊ GC (min … max): 0.00% … 0.00% Time (median): 24.300 μs ┊ GC (median): 0.00% Time (mean ± σ): 24.370 μs ± 832.239 ns ┊ GC (mean ± σ): 0.00% ± 0.00% ▂▅▇██▇▆▅▁ ▂▂▂▂▂▂▂▂▃▃▄▅▇███████████▅▄▃▃▂▂▂▂▂▂▂▂▂▂▁▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▁▂▂ ▃ 22.8 μs Histogram: frequency by time 27.7 μs < Memory estimate: 0 bytes, allocs estimate: 0. ``` <details> <summary>Benchmarking code (spoilers for Advent Of Code 2023 Day 01, Part 01). Running this requires the input of that Advent Of Code day.</summary> ```julia using BenchmarkTools using InteractiveUtils isdigit(d::UInt8) = UInt8('0') <= d <= UInt8('9') someDigit(c::UInt8) = isdigit(c) ? 
Some(c - UInt8('0')) : nothing function part1(data) total = 0 may_a = nothing may_b = nothing for c in data digitRes = someDigit(c) may_a = @something may_a digitRes Some(nothing) may_b = @something digitRes may_b Some(nothing) if c == UInt8('\n') digit_a = may_a::UInt8 digit_b = may_b::UInt8 total += digit_a*0xa + digit_b may_a = nothing may_b = nothing end end return total end function bench() data = read("input.txt") display(VERSION) println() display(@benchmark part1($data)) nothing end ``` </details> <details> <summary>`@code_warntype` before</summary> ```julia julia> @code_warntype part1(data) MethodInstance for part1(::Vector{UInt8}) from part1(data) @ Main ~/Documents/projects/AOC/2023/01/01.jl:7 Arguments #self#::Core.Const(part1) data::Vector{UInt8} Locals @_3::Union{Nothing, Tuple{UInt8, Int64}} may_b::Union{Nothing, UInt8} may_a::Union{Nothing, UInt8} total::Int64 c::UInt8 digit_b::UInt8 digit_a::UInt8 val@_10::Any val@_11::Any digitRes::Union{Nothing, Some{UInt8}} @_13::Union{Some{Nothing}, Some{UInt8}, UInt8} @_14::Union{Some{Nothing}, Some{UInt8}} @_15::Some{Nothing} @_16::Union{Some{Nothing}, Some{UInt8}, UInt8} @_17::Union{Some{Nothing}, UInt8} @_18::Some{Nothing} Body::Int64 1 ── (total = 0) │ (may_a = Main.nothing) │ (may_b = Main.nothing) │ %4 = data::Vector{UInt8} │ (@_3 = Base.iterate(%4)) │ %6 = (@_3 === nothing)::Bool │ %7 = Base.not_int(%6)::Bool └─── goto mmtk#24 if not %7 2 ┄─ Core.NewvarNode(:(digit_b)) │ Core.NewvarNode(:(digit_a)) │ Core.NewvarNode(:(val@_10)) │ %12 = @_3::Tuple{UInt8, Int64} │ (c = Core.getfield(%12, 1)) │ %14 = Core.getfield(%12, 2)::Int64 │ (digitRes = Main.someDigit(c)) │ (val@_11 = may_a) │ %17 = (val@_11::Union{Nothing, UInt8} !== Base.nothing)::Bool └─── goto mmtk#4 if not %17 3 ── (@_13 = val@_11::UInt8) └─── goto mmtk#11 4 ── (val@_11 = digitRes) │ %22 = (val@_11::Union{Nothing, Some{UInt8}} !== Base.nothing)::Bool └─── goto mmtk#6 if not %22 5 ── (@_14 = val@_11::Some{UInt8}) └─── goto mmtk#10 6 ── (val@_11 = 
Main.Some(Main.nothing)) │ %27 = (val@_11::Core.Const(Some(nothing)) !== Base.nothing)::Core.Const(true) └─── goto mmtk#8 if not %27 7 ── (@_15 = val@_11::Core.Const(Some(nothing))) └─── goto mmtk#9 8 ── Core.Const(:(@_15 = Base.nothing)) 9 ┄─ (@_14 = @_15) 10 ┄ (@_13 = @_14) 11 ┄ %34 = @_13::Union{Some{Nothing}, Some{UInt8}, UInt8} │ (may_a = Base.something(%34)) │ (val@_10 = digitRes) │ %37 = (val@_10::Union{Nothing, Some{UInt8}} !== Base.nothing)::Bool └─── goto mmtk#13 if not %37 12 ─ (@_16 = val@_10::Some{UInt8}) └─── goto mmtk#20 13 ─ (val@_10 = may_b) │ %42 = (val@_10::Union{Nothing, UInt8} !== Base.nothing)::Bool └─── goto mmtk#15 if not %42 14 ─ (@_17 = val@_10::UInt8) └─── goto mmtk#19 15 ─ (val@_10 = Main.Some(Main.nothing)) │ %47 = (val@_10::Core.Const(Some(nothing)) !== Base.nothing)::Core.Const(true) └─── goto mmtk#17 if not %47 16 ─ (@_18 = val@_10::Core.Const(Some(nothing))) └─── goto mmtk#18 17 ─ Core.Const(:(@_18 = Base.nothing)) 18 ┄ (@_17 = @_18) 19 ┄ (@_16 = @_17) 20 ┄ %54 = @_16::Union{Some{Nothing}, Some{UInt8}, UInt8} │ (may_b = Base.something(%54)) │ %56 = c::UInt8 │ %57 = Main.UInt8('\n')::Core.Const(0x0a) │ %58 = (%56 == %57)::Bool └─── goto mmtk#22 if not %58 21 ─ (digit_a = Core.typeassert(may_a, Main.UInt8)) │ (digit_b = Core.typeassert(may_b, Main.UInt8)) │ %62 = total::Int64 │ %63 = (digit_a * 0x0a)::UInt8 │ %64 = (%63 + digit_b)::UInt8 │ (total = %62 + %64) │ (may_a = Main.nothing) └─── (may_b = Main.nothing) 22 ┄ (@_3 = Base.iterate(%4, %14)) │ %69 = (@_3 === nothing)::Bool │ %70 = Base.not_int(%69)::Bool └─── goto mmtk#24 if not %70 23 ─ goto mmtk#2 24 ┄ return total ``` </details> <details> <summary>`@code_native debuginfo=:none` Before </summary> ```julia julia> @code_native debuginfo=:none part1(data) .text .file "part1" .globl julia_part1_418 # -- Begin function julia_part1_418 .p2align 4, 0x90 .type julia_part1_418,@function julia_part1_418: # @julia_part1_418 # %bb.0: # %top push rbp mov rbp, rsp push r15 push r14 push r13 
push r12 push rbx sub rsp, 40 mov rax, qword ptr [rdi + 8] test rax, rax je .LBB0_1 # %bb.2: # %L17 mov rcx, qword ptr [rdi] dec rax mov r10b, 1 xor r14d, r14d # implicit-def: $r12b # implicit-def: $r13b # implicit-def: $r9b # implicit-def: $sil mov qword ptr [rbp - 64], rax # 8-byte Spill mov al, 1 mov dword ptr [rbp - 48], eax # 4-byte Spill # implicit-def: $al # kill: killed $al xor eax, eax mov qword ptr [rbp - 56], rax # 8-byte Spill mov qword ptr [rbp - 72], rcx # 8-byte Spill # implicit-def: $cl jmp .LBB0_3 .p2align 4, 0x90 .LBB0_8: # in Loop: Header=BB0_3 Depth=1 mov dword ptr [rbp - 48], 0 # 4-byte Folded Spill .LBB0_24: # %post_union_move # in Loop: Header=BB0_3 Depth=1 movzx r13d, byte ptr [rbp - 41] # 1-byte Folded Reload mov r12d, r8d cmp qword ptr [rbp - 64], r14 # 8-byte Folded Reload je .LBB0_13 .LBB0_25: # %guard_exit113 # in Loop: Header=BB0_3 Depth=1 inc r14 mov r10d, ebx .LBB0_3: # %L19 # =>This Inner Loop Header: Depth=1 mov rax, qword ptr [rbp - 72] # 8-byte Reload xor ebx, ebx xor edi, edi movzx r15d, r9b movzx ecx, cl movzx esi, sil mov r11b, 1 # implicit-def: $r9b movzx edx, byte ptr [rax + r14] lea eax, [rdx - 58] lea r8d, [rdx - 48] cmp al, -10 setae bl setb dil test r10b, 1 cmovne r15d, edi mov edi, 0 cmovne ecx, ebx mov bl, 1 cmovne esi, edi test r15b, 1 jne .LBB0_7 # %bb.4: # %L76 # in Loop: Header=BB0_3 Depth=1 mov r11b, 2 test cl, 1 jne .LBB0_5 # %bb.6: # %L78 # in Loop: Header=BB0_3 Depth=1 mov ebx, r10d mov r9d, r15d mov byte ptr [rbp - 41], r13b # 1-byte Spill test sil, 1 je .LBB0_26 .LBB0_7: # %L82 # in Loop: Header=BB0_3 Depth=1 cmp al, -11 jbe .LBB0_9 jmp .LBB0_8 .p2align 4, 0x90 .LBB0_5: # in Loop: Header=BB0_3 Depth=1 mov ecx, r8d mov sil, 1 xor ebx, ebx mov byte ptr [rbp - 41], r8b # 1-byte Spill xor r9d, r9d xor ecx, ecx cmp al, -11 ja .LBB0_8 .LBB0_9: # %L90 # in Loop: Header=BB0_3 Depth=1 test byte ptr [rbp - 48], 1 # 1-byte Folded Reload jne .LBB0_23 # %bb.10: # %L115 # in Loop: Header=BB0_3 Depth=1 cmp dl, 10 jne 
.LBB0_11 # %bb.14: # %L122 # in Loop: Header=BB0_3 Depth=1 test r15b, 1 jne .LBB0_15 # %bb.12: # %L130.thread # in Loop: Header=BB0_3 Depth=1 movzx eax, byte ptr [rbp - 41] # 1-byte Folded Reload mov bl, 1 add eax, eax lea eax, [rax + 4*rax] add al, r12b movzx eax, al add qword ptr [rbp - 56], rax # 8-byte Folded Spill mov al, 1 mov dword ptr [rbp - 48], eax # 4-byte Spill cmp qword ptr [rbp - 64], r14 # 8-byte Folded Reload jne .LBB0_25 jmp .LBB0_13 .p2align 4, 0x90 .LBB0_23: # %L115.thread # in Loop: Header=BB0_3 Depth=1 mov al, 1 # implicit-def: $r8b mov dword ptr [rbp - 48], eax # 4-byte Spill cmp dl, 10 jne .LBB0_24 jmp .LBB0_21 .LBB0_11: # in Loop: Header=BB0_3 Depth=1 mov r8d, r12d jmp .LBB0_24 .LBB0_1: xor eax, eax mov qword ptr [rbp - 56], rax # 8-byte Spill .LBB0_13: # %L159 mov rax, qword ptr [rbp - 56] # 8-byte Reload add rsp, 40 pop rbx pop r12 pop r13 pop r14 pop r15 pop rbp ret .LBB0_21: # %L122.thread test r15b, 1 jne .LBB0_15 # %bb.22: # %post_box_union58 movabs rdi, offset .L_j_str1 movabs rax, offset ijl_type_error movabs rsi, 140008511215408 movabs rdx, 140008667209736 call rax .LBB0_15: # %fail cmp r11b, 1 je .LBB0_19 # %bb.16: # %fail movzx eax, r11b cmp eax, 2 jne .LBB0_17 # %bb.20: # %box_union54 movzx eax, byte ptr [rbp - 41] # 1-byte Folded Reload movabs rcx, offset jl_boxed_uint8_cache mov rdx, qword ptr [rcx + 8*rax] jmp .LBB0_18 .LBB0_26: # %L80 movabs rax, offset ijl_throw movabs rdi, 140008495049392 call rax .LBB0_19: # %box_union movabs rdx, 140008667209736 jmp .LBB0_18 .LBB0_17: xor edx, edx .LBB0_18: # %post_box_union movabs rdi, offset .L_j_str1 movabs rax, offset ijl_type_error movabs rsi, 140008511215408 call rax .Lfunc_end0: .size julia_part1_418, .Lfunc_end0-julia_part1_418 # -- End function .type .L_j_str1,@object # @_j_str1 .section .rodata.str1.1,"aMS",@progbits,1 .L_j_str1: .asciz "typeassert" .size .L_j_str1, 11 .section ".note.GNU-stack","",@progbits ``` </details> <details> <summary>`@code_warntype` After</summary> 
```julia [sukera@tower 01]$ julia -q --project=. -L 01.jl julia> data = read("input.txt"); julia> @code_warntype part1(data) MethodInstance for part1(::Vector{UInt8}) from part1(data) @ Main ~/Documents/projects/AOC/2023/01/01.jl:7 Arguments #self#::Core.Const(part1) data::Vector{UInt8} Locals @_3::Union{Nothing, Tuple{UInt8, Int64}} may_b::Union{Nothing, UInt8} may_a::Union{Nothing, UInt8} total::Int64 val@_7::Union{} val@_8::Union{} c::UInt8 digit_b::UInt8 digit_a::UInt8 #JuliaLang#215::Some{Nothing} #JuliaLang#216::Union{Nothing, UInt8} #JuliaLang#217::Union{Nothing, Some{UInt8}} #JuliaLang#212::Some{Nothing} #JuliaLang#213::Union{Nothing, Some{UInt8}} #JuliaLang#214::Union{Nothing, UInt8} digitRes::Union{Nothing, Some{UInt8}} @_19::Union{Nothing, UInt8} @_20::Union{Nothing, UInt8} @_21::Nothing @_22::Union{Nothing, UInt8} @_23::Union{Nothing, UInt8} @_24::Nothing Body::Int64 1 ── (total = 0) │ (may_a = Main.nothing) │ (may_b = Main.nothing) │ %4 = data::Vector{UInt8} │ (@_3 = Base.iterate(%4)) │ %6 = @_3::Union{Nothing, Tuple{UInt8, Int64}} │ %7 = (%6 === nothing)::Bool │ %8 = Base.not_int(%7)::Bool └─── goto mmtk#24 if not %8 2 ┄─ Core.NewvarNode(:(val@_7)) │ Core.NewvarNode(:(val@_8)) │ Core.NewvarNode(:(digit_b)) │ Core.NewvarNode(:(digit_a)) │ Core.NewvarNode(:(#JuliaLang#215)) │ Core.NewvarNode(:(#JuliaLang#216)) │ Core.NewvarNode(:(#JuliaLang#217)) │ Core.NewvarNode(:(#JuliaLang#212)) │ Core.NewvarNode(:(#JuliaLang#213)) │ %19 = @_3::Tuple{UInt8, Int64} │ (c = Core.getfield(%19, 1)) │ %21 = Core.getfield(%19, 2)::Int64 │ %22 = c::UInt8 │ (digitRes = Main.someDigit(%22)) │ %24 = may_a::Union{Nothing, UInt8} │ (#JuliaLang#214 = %24) │ %26 = Base.:!::Core.Const(!) 
│ %27 = #JuliaLang#214::Union{Nothing, UInt8} │ %28 = Base.isnothing(%27)::Bool │ %29 = (%26)(%28)::Bool └─── goto mmtk#4 if not %29 3 ── %31 = #JuliaLang#214::UInt8 │ (@_19 = Base.something(%31)) └─── goto mmtk#11 4 ── %34 = digitRes::Union{Nothing, Some{UInt8}} │ (#JuliaLang#213 = %34) │ %36 = Base.:!::Core.Const(!) │ %37 = #JuliaLang#213::Union{Nothing, Some{UInt8}} │ %38 = Base.isnothing(%37)::Bool │ %39 = (%36)(%38)::Bool └─── goto mmtk#6 if not %39 5 ── %41 = #JuliaLang#213::Some{UInt8} │ (@_20 = Base.something(%41)) └─── goto mmtk#10 6 ── %44 = Main.Some::Core.Const(Some) │ %45 = Main.nothing::Core.Const(nothing) │ (#JuliaLang#212 = (%44)(%45)) │ %47 = Base.:!::Core.Const(!) │ %48 = #JuliaLang#212::Core.Const(Some(nothing)) │ %49 = Base.isnothing(%48)::Core.Const(false) │ %50 = (%47)(%49)::Core.Const(true) └─── goto mmtk#8 if not %50 7 ── %52 = #JuliaLang#212::Core.Const(Some(nothing)) │ (@_21 = Base.something(%52)) └─── goto mmtk#9 8 ── Core.Const(nothing) │ Core.Const(:(val@_8 = Base.something(Base.nothing))) │ Core.Const(nothing) │ Core.Const(:(val@_8)) └─── Core.Const(:(@_21 = %58)) 9 ┄─ %60 = @_21::Core.Const(nothing) └─── (@_20 = %60) 10 ┄ %62 = @_20::Union{Nothing, UInt8} └─── (@_19 = %62) 11 ┄ %64 = @_19::Union{Nothing, UInt8} │ (may_a = %64) │ %66 = digitRes::Union{Nothing, Some{UInt8}} │ (#JuliaLang#217 = %66) │ %68 = Base.:!::Core.Const(!) │ %69 = #JuliaLang#217::Union{Nothing, Some{UInt8}} │ %70 = Base.isnothing(%69)::Bool │ %71 = (%68)(%70)::Bool └─── goto mmtk#13 if not %71 12 ─ %73 = #JuliaLang#217::Some{UInt8} │ (@_22 = Base.something(%73)) └─── goto mmtk#20 13 ─ %76 = may_b::Union{Nothing, UInt8} │ (#JuliaLang#216 = %76) │ %78 = Base.:!::Core.Const(!) 
│ %79 = #JuliaLang#216::Union{Nothing, UInt8} │ %80 = Base.isnothing(%79)::Bool │ %81 = (%78)(%80)::Bool └─── goto mmtk#15 if not %81 14 ─ %83 = #JuliaLang#216::UInt8 │ (@_23 = Base.something(%83)) └─── goto mmtk#19 15 ─ %86 = Main.Some::Core.Const(Some) │ %87 = Main.nothing::Core.Const(nothing) │ (#JuliaLang#215 = (%86)(%87)) │ %89 = Base.:!::Core.Const(!) │ %90 = #JuliaLang#215::Core.Const(Some(nothing)) │ %91 = Base.isnothing(%90)::Core.Const(false) │ %92 = (%89)(%91)::Core.Const(true) └─── goto mmtk#17 if not %92 16 ─ %94 = #JuliaLang#215::Core.Const(Some(nothing)) │ (@_24 = Base.something(%94)) └─── goto mmtk#18 17 ─ Core.Const(nothing) │ Core.Const(:(val@_7 = Base.something(Base.nothing))) │ Core.Const(nothing) │ Core.Const(:(val@_7)) └─── Core.Const(:(@_24 = %100)) 18 ┄ %102 = @_24::Core.Const(nothing) └─── (@_23 = %102) 19 ┄ %104 = @_23::Union{Nothing, UInt8} └─── (@_22 = %104) 20 ┄ %106 = @_22::Union{Nothing, UInt8} │ (may_b = %106) │ %108 = Main.:(==)::Core.Const(==) │ %109 = c::UInt8 │ %110 = Main.UInt8('\n')::Core.Const(0x0a) │ %111 = (%108)(%109, %110)::Bool └─── goto mmtk#22 if not %111 21 ─ %113 = may_a::Union{Nothing, UInt8} │ (digit_a = Core.typeassert(%113, Main.UInt8)) │ %115 = may_b::Union{Nothing, UInt8} │ (digit_b = Core.typeassert(%115, Main.UInt8)) │ %117 = Main.:+::Core.Const(+) │ %118 = total::Int64 │ %119 = Main.:+::Core.Const(+) │ %120 = Main.:*::Core.Const(*) │ %121 = digit_a::UInt8 │ %122 = (%120)(%121, 0x0a)::UInt8 │ %123 = digit_b::UInt8 │ %124 = (%119)(%122, %123)::UInt8 │ (total = (%117)(%118, %124)) │ (may_a = Main.nothing) └─── (may_b = Main.nothing) 22 ┄ (@_3 = Base.iterate(%4, %21)) │ %129 = @_3::Union{Nothing, Tuple{UInt8, Int64}} │ %130 = (%129 === nothing)::Bool │ %131 = Base.not_int(%130)::Bool └─── goto mmtk#24 if not %131 23 ─ goto mmtk#2 24 ┄ %134 = total::Int64 └─── return %134 ``` </details> <details> <summary>`@code_native debuginfo=:none` After </summary> ```julia julia> @code_native debuginfo=:none part1(data) .text 
.file "part1" .globl julia_part1_1203 # -- Begin function julia_part1_1203 .p2align 4, 0x90 .type julia_part1_1203,@function julia_part1_1203: # @julia_part1_1203 ; Function Signature: part1(Array{UInt8, 1}) # %bb.0: # %top #DEBUG_VALUE: part1:data <- [DW_OP_deref] $rdi push rbp mov rbp, rsp push r15 push r14 push r13 push r12 push rbx sub rsp, 40 vxorps xmm0, xmm0, xmm0 #APP mov rax, qword ptr fs:[0] #NO_APP lea rdx, [rbp - 64] vmovaps xmmword ptr [rbp - 64], xmm0 mov qword ptr [rbp - 48], 0 mov rcx, qword ptr [rax - 8] mov qword ptr [rbp - 64], 4 mov rax, qword ptr [rcx] mov qword ptr [rbp - 72], rcx # 8-byte Spill mov qword ptr [rbp - 56], rax mov qword ptr [rcx], rdx #DEBUG_VALUE: part1:data <- [DW_OP_deref] 0 mov r15, qword ptr [rdi + 16] test r15, r15 je .LBB0_1 # %bb.2: # %L34 mov r14, qword ptr [rdi] dec r15 mov r11b, 1 mov r13b, 1 # implicit-def: $r12b # implicit-def: $r10b xor eax, eax jmp .LBB0_3 .p2align 4, 0x90 .LBB0_4: # in Loop: Header=BB0_3 Depth=1 xor r11d, r11d mov ebx, edi mov r10d, r8d .LBB0_9: # %L114 # in Loop: Header=BB0_3 Depth=1 mov r12d, esi test r15, r15 je .LBB0_12 .LBB0_10: # %guard_exit126 # in Loop: Header=BB0_3 Depth=1 inc r14 dec r15 mov r13d, ebx .LBB0_3: # %L36 # =>This Inner Loop Header: Depth=1 movzx edx, byte ptr [r14] test r13b, 1 movzx edi, r13b mov ebx, 1 mov ecx, 0 cmove ebx, edi cmovne edi, ecx movzx ecx, r10b lea esi, [rdx - 48] lea r9d, [rdx - 58] movzx r8d, sil cmove r8d, ecx cmp r9b, -11 ja .LBB0_4 # %bb.5: # %L89 # in Loop: Header=BB0_3 Depth=1 test r11b, 1 jne .LBB0_8 # %bb.6: # %L102 # in Loop: Header=BB0_3 Depth=1 cmp dl, 10 jne .LBB0_7 # %bb.13: # %L106 # in Loop: Header=BB0_3 Depth=1 test r13b, 1 jne .LBB0_14 # %bb.11: # %L114.thread # in Loop: Header=BB0_3 Depth=1 add ecx, ecx mov bl, 1 mov r11b, 1 lea ecx, [rcx + 4*rcx] add cl, r12b movzx ecx, cl add rax, rcx test r15, r15 jne .LBB0_10 jmp .LBB0_12 .p2align 4, 0x90 .LBB0_8: # %L102.thread # in Loop: Header=BB0_3 Depth=1 mov r11b, 1 # implicit-def: $sil cmp dl, 
10 jne .LBB0_9 jmp .LBB0_15 .LBB0_7: # in Loop: Header=BB0_3 Depth=1 mov esi, r12d jmp .LBB0_9 .LBB0_1: xor eax, eax .LBB0_12: # %L154 mov rcx, qword ptr [rbp - 56] mov rdx, qword ptr [rbp - 72] # 8-byte Reload mov qword ptr [rdx], rcx add rsp, 40 pop rbx pop r12 pop r13 pop r14 pop r15 pop rbp ret .LBB0_15: # %L106.thread test r13b, 1 jne .LBB0_14 # %bb.16: # %post_box_union47 movabs rax, offset jl_nothing movabs rcx, offset jl_small_typeof movabs rdi, offset ".L_j_str_typeassert#1" mov rdx, qword ptr [rax] mov rsi, qword ptr [rcx + 336] movabs rax, offset ijl_type_error mov qword ptr [rbp - 48], rsi call rax .LBB0_14: # %post_box_union movabs rax, offset jl_nothing movabs rcx, offset jl_small_typeof movabs rdi, offset ".L_j_str_typeassert#1" mov rdx, qword ptr [rax] mov rsi, qword ptr [rcx + 336] movabs rax, offset ijl_type_error mov qword ptr [rbp - 48], rsi call rax .Lfunc_end0: .size julia_part1_1203, .Lfunc_end0-julia_part1_1203 # -- End function .type ".L_j_str_typeassert#1",@object # @"_j_str_typeassert#1" .section .rodata.str1.1,"aMS",@progbits,1 ".L_j_str_typeassert#1": .asciz "typeassert" .size ".L_j_str_typeassert#1", 11 .section ".note.GNU-stack","",@progbits ``` </details> Co-authored-by: Sukera <[email protected]>
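For readers unfamiliar with `@something`, the semantics this commit relies on can be sketched as follows, assuming Julia 1.7 or later. The function `first_something` and the variables `v1`, `v2`, `v3` are hypothetical; the function is a simplified hand expansion in the spirit of "a different variable for each argument", not the actual code in Base.

```julia
# `@something` returns the first argument that is not `nothing`, unwrapping a
# `Some` if that is what it finds. Later arguments are not evaluated.
may_a    = nothing
digitRes = Some(UInt8(3))

a = @something may_a digitRes Some(nothing)     # picks digitRes and unwraps it
b = @something nothing nothing Some(nothing)    # fallback Some(nothing) unwraps to nothing
c = @something UInt8(5) error("not evaluated")  # short-circuits on the first argument

# Simplified hand expansion with one variable per argument, so each branch
# has its own narrowly typed local.
function first_something(x, y, z)
    v1 = x
    !isnothing(v1) && return something(v1)
    v2 = y
    !isnothing(v2) && return something(v2)
    v3 = z
    return something(v3)  # throws if every argument was `nothing`
end

d = first_something(may_a, digitRes, Some(nothing))
```

The `Some(nothing)` fallback is what lets the macro return `nothing` instead of throwing when no argument holds a value, which is how `part1` in the benchmark uses it.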
qinsoon
pushed a commit
to qinsoon/julia
that referenced
this pull request
May 2, 2024
Adds a convenient way to enable PGO+LTO on Julia and LLVM together:

1. `cd contrib/pgo-lto`
2. `make -j$(nproc) stage1`
3. `make clean-profiles`
4. `./stage1.build/julia -O3 -e 'using Pkg; Pkg.add("LoopVectorization"); Pkg.test("LoopVectorization")'`
5. `make -j$(nproc) stage2`

<details>
<summary>* Output looks roughly as follows</summary>

```c++
$ make -C contrib/pgo-lto top
make: Entering directory '/dev/shm/julia/contrib/pgo-lto'
llvm-profdata show --topn=50 /dev/shm/julia/contrib/pgo-lto/profiles/merged.prof | c++filt
Instrumentation level: IR  entry_first = 0
Total functions: 85943
Maximum function count: 7867557260
Maximum internal block count: 3468437590
Top 50 functions with the largest internal block counts:
  llvm::BitVector::operator|=(llvm::BitVector const&), max count = 7867557260
  LateLowerGCFrame::ComputeLiveness(State&), max count = 3468437590
  llvm::hashing::detail::hash_combine_recursive_helper::hash_combine_recursive_helper(), max count = 1742259834
  llvm::SUnit::addPred(llvm::SDep const&, bool), max count = 511396575
  llvm::LiveRange::overlaps(llvm::LiveRange const&, llvm::CoalescerPair const&, llvm::SlotIndexes const&) const, max count = 508061762
  llvm::StringMapImpl::LookupBucketFor(llvm::StringRef), max count = 505682177
  std::map<llvm::BasicBlock*, BBState, std::less<llvm::BasicBlock*>, std::allocator<std::pair<llvm::BasicBlock* const, BBState> > >::operator[](llvm::BasicBlock* const&), max count = 395628888
  llvm::LiveRange::advanceTo(llvm::LiveRange::Segment const*, llvm::SlotIndex) const, max count = 384642728
  llvm::LiveRange::isLiveAtIndexes(llvm::ArrayRef<llvm::SlotIndex>) const, max count = 380291040
  llvm::PassRegistry::enumerateWith(llvm::PassRegistrationListener*), max count = 352313953
  ijl_method_instance_add_backedge, max count = 349608221
  llvm::SUnit::ComputeHeight(), max count = 336604330
  llvm::LiveRange::advanceTo(llvm::LiveRange::Segment*, llvm::SlotIndex), max count = 331030109
  llvm::SmallPtrSetImplBase::insert_imp(void const*), max count = 272966545
  llvm::LiveIntervals::checkRegMaskInterference(llvm::LiveInterval&, llvm::BitVector&), max count = 257449540
  LateLowerGCFrame::ComputeLiveSets(State&), max count = 252096274
  /dev/shm/julia/src/jltypes.c:has_free_typevars, max count = 230879464
  ijl_get_pgcstack, max count = 216953592
  LateLowerGCFrame::RefineLiveSet(llvm::BitVector&, State&, std::vector<int, std::allocator<int> > const&), max count = 188013152
  /dev/shm/julia/src/flisp/flisp.c:apply_cl, max count = 174863813
  /dev/shm/julia/src/flisp/builtins.c:fl_memq, max count = 168621603
```

</details>

This quite often results in spectacular speedups for time to first X, as it reduces the time spent in LLVM optimization passes by 25 or even 30%.

Example 1:

```julia
using LoopVectorization
function f!(a, b)
    @turbo for i in eachindex(a)
        a[i] *= b[i]
    end
    return a
end
f!(rand(1), rand(1))
```

```console
$ time ./julia -O3 lv.jl
```

Without PGO+LTO: 14.801s
With PGO+LTO: 11.978s (-19%)

Example 2:

```console
$ time ./julia -e 'using Pkg; Pkg.test("Unitful");'
```

Without PGO+LTO: 1m47.688s
With PGO+LTO: 1m35.704s (-11%)

Example 3 (taken from issue JuliaLang#45395, which is almost only LLVM):

```console
$ JULIA_LLVM_ARGS=-time-passes ./julia script-45395.jl
```

Without PGO+LTO:

```
===-------------------------------------------------------------------------===
                      ... Pass execution timing report ...
===-------------------------------------------------------------------------===
  Total Execution Time: 101.0130 seconds (98.6253 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
  53.6961 ( 54.7%)   0.1050 (  3.8%)  53.8012 ( 53.3%)  53.8045 ( 54.6%)  Unroll loops
  25.5423 ( 26.0%)   0.0072 (  0.3%)  25.5495 ( 25.3%)  25.5444 ( 25.9%)  Global Value Numbering
   7.1995 (  7.3%)   0.0526 (  1.9%)   7.2521 (  7.2%)   7.2517 (  7.4%)  Induction Variable Simplification
   5.0541 (  5.1%)   0.0098 (  0.3%)   5.0639 (  5.0%)   5.0561 (  5.1%)  Combine redundant instructions #2
```

With PGO+LTO:

```
===-------------------------------------------------------------------------===
                      ... Pass execution timing report ...
===-------------------------------------------------------------------------===
  Total Execution Time: 72.6507 seconds (70.1337 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
  36.0894 ( 51.7%)   0.0825 (  2.9%)  36.1719 ( 49.8%)  36.1738 ( 51.6%)  Unroll loops
  16.5713 ( 23.7%)   0.0129 (  0.5%)  16.5843 ( 22.8%)  16.5794 ( 23.6%)  Global Value Numbering
   5.9047 (  8.5%)   0.0395 (  1.4%)   5.9442 (  8.2%)   5.9438 (  8.5%)  Induction Variable Simplification
   4.7566 (  6.8%)   0.0078 (  0.3%)   4.7645 (  6.6%)   4.7575 (  6.8%)  Combine redundant instructions #2
```

That is 28% less time spent in LLVM. `perf` reports show this is mostly due to fewer instructions and a reduction in icache misses.

---

Finally there's a significant reduction in binary sizes. For libLLVM.so:

```
79M	usr/lib/libLLVM-13jl.so (before)
67M	usr/lib/libLLVM-13jl.so (after)
```

And it can be reduced by another 2MB with `--icf=safe` when using LLD as a linker anyway.

- [x] Two out-of-source builds would be better than a single in-source build, so that it's easier to find good profile data

---------

Co-authored-by: Oscar Smith <[email protected]>
Co-authored-by: Lilith Orion Hafner <[email protected]>
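The percentages quoted in the three examples can be re-derived from the raw timings. The sketch below is only a sanity check of that arithmetic; `pct_change` is a hypothetical helper name, and the inputs are the wall-clock and total execution times copied from the message.

```julia
# Relative change in percent; a negative value means the "after" run was faster.
pct_change(before, after) = 100 * (after - before) / before

lv      = pct_change(14.801, 11.978)            # Example 1: LoopVectorization script
unitful = pct_change(60 + 47.688, 60 + 35.704)  # Example 2: 1m47.688s vs 1m35.704s
llvm    = pct_change(101.0130, 72.6507)         # Example 3: total LLVM pass time

println(round.((lv, unitful, llvm); digits = 1))
```

Rounded to whole percent these come out at the -19%, -11% and -28% reported in the message.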
udesou
pushed a commit
to udesou/julia
that referenced
this pull request
Aug 29, 2024
…aLang#55600)

As an application of JuliaLang#55545, this commit avoids the insertion of `:throw_undef_if_not` nodes when the defined-ness of a slot is guaranteed by abstract interpretation.

```julia
julia> function isdefined_nothrow(c, x)
           local val
           if c
               val = x
           end
           if @isdefined val
               return val
           end
           return zero(Int)
       end;

julia> @code_typed isdefined_nothrow(true, 42)
```

```diff
diff --git a/old b/new
index c4980a5c9c..3d1d6d30f0 100644
--- a/old
+++ b/new
@@ -4,7 +4,6 @@ CodeInfo(
 3 ┄ %3 = φ (#2 => x, #1 => #undef)::Int64
 │   %4 = φ (#2 => true, #1 => false)::Bool
 └──      goto #5 if not %4
-4 ─      $(Expr(:throw_undef_if_not, :val, :(%4)))::Any
-└──      return %3
+4 ─      return %3
 5 ─      return 0
 ) => Int64
```
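The function from the commit message can be run as a plain script to see why the undef check is redundant: `val` is only read on the path guarded by `@isdefined val`. This is a minimal sketch of the observable behavior only; the exact IR printed by `@code_typed` depends on the Julia version and is not checked here.

```julia
function isdefined_nothrow(c, x)
    local val
    if c
        val = x
    end
    # `val` is read only when it is known to be assigned, so the read can never
    # throw an UndefVarError, regardless of `c`.
    if @isdefined val
        return val
    end
    return zero(Int)
end

taken     = isdefined_nothrow(true, 42)   # `val` was assigned, so it is returned
not_taken = isdefined_nothrow(false, 42)  # `val` was never assigned, falls through to zero(Int)
```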
udesou
pushed a commit
to udesou/julia
that referenced
this pull request
Oct 16, 2024
Prior to this, especially on macOS, the gc-safepoint here would cause the process to segfault, as we had already freed the current_task state. Rearrange this code so that the GC interactions (except for the atomic store to current_task) are all handled before entering GC safe, and so that signaling that the thread is deleted (via setting current_task = NULL, published by jl_unlock_profile_wr to other threads) happens last.

```
ERROR: Exception handler triggered on unmanaged thread.
Process 53827 stopped
* thread #5, stop reason = EXC_BAD_ACCESS (code=2, address=0x100018008)
    frame #0: 0x0000000100b74344 libjulia-internal.1.12.0.dylib`jl_delete_thread [inlined] jl_gc_state_set(ptls=0x000000011f8b3200, state='\x02', old_state=<unavailable>) at julia_threads.h:272:9 [opt]
   269 	    assert(old_state != JL_GC_CONCURRENT_COLLECTOR_THREAD);
   270 	    jl_atomic_store_release(&ptls->gc_state, state);
   271 	    if (state == JL_GC_STATE_UNSAFE || old_state == JL_GC_STATE_UNSAFE)
-> 272 	        jl_gc_safepoint_(ptls);
   273 	    return old_state;
   274 	}
   275 	STATIC_INLINE int8_t jl_gc_state_save_and_set(jl_ptls_t ptls,
Target 0: (julia) stopped.
(lldb) up
frame #1: 0x0000000100b74320 libjulia-internal.1.12.0.dylib`jl_delete_thread [inlined] jl_gc_state_save_and_set(ptls=0x000000011f8b3200, state='\x02') at julia_threads.h:278:12 [opt]
   275 	STATIC_INLINE int8_t jl_gc_state_save_and_set(jl_ptls_t ptls,
   276 	                                              int8_t state)
   277 	{
-> 278 	    return jl_gc_state_set(ptls, state, jl_atomic_load_relaxed(&ptls->gc_state));
   279 	}
   280 	#ifdef __clang_gcanalyzer__
   281 	// these might not be a safepoint (if they are no-op safe=>safe transitions), but we have to assume it could be (statically)
(lldb)
frame #2: 0x0000000100b7431c libjulia-internal.1.12.0.dylib`jl_delete_thread(value=0x000000011f8b3200) at threading.c:537:11 [opt]
   534 	    ptls->root_task = NULL;
   535 	    jl_free_thread_gc_state(ptls);
   536 	    // then park in safe-region
-> 537 	    (void)jl_gc_safe_enter(ptls);
   538 	}
```

(test incorporated into JuliaLang#55793)
udesou
pushed a commit
to udesou/julia
that referenced
this pull request
Oct 16, 2024
Rebase and extension of @alexfanqi's initial work on porting Julia to RISC-V. Requires LLVM 19.

Tested on a VisionFive2, built with:

```make
MARCH := rv64gc_zba_zbb
MCPU := sifive-u74
USE_BINARYBUILDER:=0
DEPS_GIT = llvm
override LLVM_VER=19.1.1
override LLVM_BRANCH=julia-release/19.x
override LLVM_SHA1=julia-release/19.x
```

```julia-repl
❯ ./julia
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.12.0-DEV.1374 (2024-10-14)
 _/ |\__'_|_|_|\__'_|  |  riscv/25092a3982* (fork: 1 commits, 0 days)
|__/                   |

julia> versioninfo(; verbose=true)
Julia Version 1.12.0-DEV.1374
Commit 25092a3* (2024-10-14 09:57 UTC)
Platform Info:
  OS: Linux (riscv64-unknown-linux-gnu)
  uname: Linux 6.11.3-1-riscv64 #1 SMP Debian 6.11.3-1 (2024-10-10) riscv64 unknown
  CPU: unknown:
              speed         user         nice          sys         idle          irq
       #1  1500 MHz        922 s          0 s        265 s     160953 s          0 s
       #2  1500 MHz        457 s          0 s        280 s     161521 s          0 s
       #3  1500 MHz        452 s          0 s        270 s     160911 s          0 s
       #4  1500 MHz        638 s         15 s        301 s     161340 s          0 s
  Memory: 7.760246276855469 GB (7474.08203125 MB free)
  Uptime: 16260.13 sec
  Load Avg:  0.25  0.23  0.1
  WORD_SIZE: 64
  LLVM: libLLVM-19.1.1 (ORCJIT, sifive-u74)
Threads: 1 default, 0 interactive, 1 GC (on 4 virtual cores)
Environment:
  HOME = /home/tim
  PATH = /home/tim/.local/bin:/usr/local/bin:/usr/bin:/bin:/usr/games
  TERM = xterm-256color

julia> ccall(:jl_dump_host_cpu, Nothing, ())
CPU: sifive-u74
Features: +zbb,+d,+i,+f,+c,+a,+zba,+m,-zvbc,-zksed,-zvfhmin,-zbkc,-zkne,-zksh,-zfh,-zfhmin,-zknh,-v,-zihintpause,-zicboz,-zbs,-zvknha,-zvksed,-zfa,-ztso,-zbc,-zvknhb,-zihintntl,-zknd,-zvbb,-zbkx,-zkt,-zvkt,-zicond,-zvksh,-zvfh,-zvkg,-zvkb,-zbkb,-zvkned

julia> @code_native debuginfo=:none 1+2.
        .text
        .attribute      4, 16
        .attribute      5, "rv64i2p1_m2p0_a2p1_f2p2_d2p2_c2p0_zicsr2p0_zifencei2p0_zmmul1p0_zba1p0_zbb1p0"
        .file   "+"
        .globl  "julia_+_3003"
        .p2align        1
        .type   "julia_+_3003",@function
"julia_+_3003":
        addi    sp, sp, -16
        sd      ra, 8(sp)
        sd      s0, 0(sp)
        addi    s0, sp, 16
        fcvt.d.l        fa5, a0
        ld      ra, 8(sp)
        ld      s0, 0(sp)
        fadd.d  fa0, fa5, fa0
        addi    sp, sp, 16
        ret
.Lfunc_end0:
        .size   "julia_+_3003", .Lfunc_end0-"julia_+_3003"

        .type   ".L+Core.Float64#3005",@object
        .section        .data.rel.ro,"aw",@progbits
        .p2align        3, 0x0
".L+Core.Float64#3005":
        .quad   ".L+Core.Float64#3005.jit"
        .size   ".L+Core.Float64#3005", 8

.set ".L+Core.Float64#3005.jit", 272467692544
        .size   ".L+Core.Float64#3005.jit", 8
        .section        ".note.GNU-stack","",@progbits
```

Lots of bugs guaranteed, but with this we at least have a functional build and REPL for further development by whoever is interested.

Also requires Linux 6.4+, since the fallback processor detection used here relies on LLVM's `sys::getHostCPUFeatures`, which for RISC-V is implemented using hwprobe, introduced in 6.4. We could probably add a fallback that parses `/proc/cpuinfo`, either by building a CPU database much like we've done for AArch64, or by parsing the actual ISA string contained there. That would probably also be a good place to add support for profiles, which are supposedly the way forward for packaging RISC-V binaries. That can happen in follow-up PRs though. For now, on older kernels, use the `-C` arg to Julia to specify an ISA.

Co-authored-by: Alex Fan <[email protected]>
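To see what a given build detected for the host, a minimal sketch using only general-purpose `Sys` and `Base` queries may help. Nothing here is specific to the RISC-V port, the printed values depend on the machine, and the `:riscv64` value in the comment is an assumption based on the platform triple shown above.

```julia
# Architecture of this build and the CPU name LLVM detected for the host.
println("arch: ", Sys.ARCH)      # assumed to be :riscv64 on the board above
println("cpu:  ", Sys.CPU_NAME)

# The command Julia would use to relaunch itself carries the `-C` CPU target,
# which shows whether an explicit CPU/ISA was selected on the command line.
cmd = Base.julia_cmd()
println(cmd)
```

If detection falls back to a generic CPU on an older kernel, relaunching with an explicit `-C` value (as the commit message suggests) should show up in the printed command.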
udesou added a commit that referenced this pull request Oct 22, 2024
* Add filesystem func to transform a path to a URI (#55454) In a few places across Base and the stdlib, we emit paths that we like people to be able to click on in their terminal and editor. Up to this point, we have relied on auto-filepath detection, but this does not allow for alternative link text, such as contracted paths. Doing so (via OSC 8 terminal links for example) requires filepath URI encoding. This functionality was previously part of a PR modifying stacktrace printing (#51816), but after that became held up for unrelated reasons and another PR appeared that would benefit from this utility (#55335), I've split out this functionality so it can be used before the stacktrace printing PR is resolved. * constrain the path argument of `include` functions to `AbstractString` (#55466) Each `Module` defined with `module` automatically gets an `include` function with two methods. Each of those two methods takes a file path as its last argument. Even though the path argument is unconstrained by dispatch, it's documented as constrained with `::AbstractString`: https://docs.julialang.org/en/v1.11-dev/base/base/#include Furthermore, I think that any invocation of `include` with a non-`AbstractString` path will necessarily throw a `MethodError` eventually. Thus this change should be harmless. Adding the type constraint to the path argument is an improvement because any possible exception would be thrown earlier than before. Apart from modules defined with `module`, the same issue is present with the anonymous modules created by `evalfile`, which is also addressed. Sidenote: `evalfile` seems to be completely untested apart from the test added here. Co-authored-by: Florian <[email protected]> * Mmap: fix grow! for non file IOs (#55849) Fixes https://github.com/JuliaLang/julia/issues/54203 Requires #55641 Based on https://github.com/JuliaLang/julia/pull/55641#issuecomment-2334162489 cc. 
@JakeZw @ronisbr --------- Co-authored-by: Jameson Nash <[email protected]> * codegen: split gc roots from other bits on stack (#55767) In order to help avoid memory provenance issues, and better utilize stack space (somewhat), and use FCA less, change the preferred representation of an immutable object to be a pair of `<packed-data,roots>` values. This packing requires some care at the boundaries and if the expected field alignment exceeds that of a pointer. The change is expected to eventually make codegen more flexible at representing unions of values with both bits and pointer regions. Eventually we can also have someone improve the late-gc-lowering pass to take advantage of this increased information accuracy, but currently it will not be any better than before at laying out the frame. * Refactoring to be considered before adding MMTk * Removing jl_gc_notify_image_load, since it's a new function and not part of the refactoring * Moving gc_enable code to gc-common.c * Addressing PR comments * Push resolution of merge conflict * Removing jl_gc_mark_queue_obj_explicit extern definition from scheduler.c * Don't need the getter function since it's possible to use jl_small_typeof directly * WIP: Adding support for MMTk/Immix * Refactoring to be considered before adding MMTk * Adding fastpath allocation * Fixing removed newlines * Refactoring to be considered before adding MMTk * Adding a few comments; Moving some functions to be closer together * Fixing merge conflicts * Applying changes from refactoring before adding MMTk * Update TaskLocalRNG docstring according to #49110 (#55863) Since #49110, which is included in 1.10 and 1.11, spawning a task no longer advances the parent task's RNG state, so this statement in the docs was incorrect. 
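The `TaskLocalRNG` docstring entry above can be checked with a small sketch. It assumes Julia 1.10 or later (where #49110 is included) and uses only `Random.seed!`, `rand`, and `Threads.@spawn`; the variable names are illustrative.

```julia
using Random

# Draw once from a freshly seeded parent stream.
Random.seed!(1234)
without_spawn = rand()

# Reseed, spawn a child that draws from its own forked stream, then draw again.
Random.seed!(1234)
wait(Threads.@spawn rand())
with_spawn = rand()

# On Julia >= 1.10 the parent's stream should be unaffected by the spawn.
println(without_spawn == with_spawn)
```

On versions before 1.10 the two draws would be expected to differ, which is the behavior the old docstring described.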
* Root globals in toplevel exprs (#54433) This fixes #54422, the code here assumes that top level exprs are always rooted, but I don't see that referenced anywhere else, or guaranteed, so conservatively always root objects that show up in code. * codegen: fix alignment typos (#55880) So easy to type jl_datatype_align to get the natural alignment instead of julia_alignment to get the actual alignment. This should fix the Revise workload. Change is visible with ``` julia> code_llvm(Random.XoshiroSimd.forkRand, (Random.TaskLocalRNG, Base.Val{8})) ``` * Fix some corner cases of `isapprox` with unsigned integers (#55828) * 🤖 [master] Bump the Pkg stdlib from ef9f76c17 to 51d4910c1 (#55896) * Profile: fix order of fields in heapsnapshot & improve formatting (#55890) * Profile: Improve generation of clickable terminal links (#55857) * inference: add missing `TypeVar` handling for `instanceof_tfunc` (#55884) I thought these sort of problems had been addressed by d60f92c, but it seems some were missed. Specifically, `t.a` and `t.b` from `t::Union` could be `TypeVar`, and if they are passed to a subroutine or recursed without being unwrapped or rewrapped, errors like JuliaLang/julia#55882 could occur. This commit resolves the issue by calling `unwraptv` in the `Union` handling within `instanceof_tfunc`. I also found a similar issue inside `nfields_tfunc`, so that has also been fixed, and test cases have been added. While I haven't been able to make up a test case specifically for the fix in `instanceof_tfunc`, I have confirmed that this commit certainly fixes the issue reported in JuliaLang/julia#55882. - fixes JuliaLang/julia#55882 * Install terminfo data under /usr/share/julia (#55881) Just like all other libraries, we don't want internal Julia files to mess with system files. Introduced by https://github.com/JuliaLang/julia/pull/55411. * expose metric to report reasons why full GCs were triggered (#55826) Additional GC observability tool. 
This will help us to diagnose why some of our servers are triggering so many full GCs in certain circumstances. * Revert "Improve printing of several arguments" (#55894) Reverts JuliaLang/julia#55754 as it overrode some performance heuristics which appeared to be giving a significant gain/loss in performance: Closes https://github.com/JuliaLang/julia/issues/55893 * Do not trigger deprecation warnings in `Test.detect_ambiguities` and `Test.detect_unbound_args` (#55869) #55868 * do not intentionally suppress errors in precompile script from being reported or failing the result (#55909) I was slightly annoying that the build was set up to succeed if this step failed, so I removed the error suppression and fixed up the script slightly * Remove eigvecs method for SymTridiagonal (#55903) The fallback method does the same, so this specialized method isn't necessary * add --trim option for generating smaller binaries (#55047) This adds a command line option `--trim` that builds images where code is only included if it is statically reachable from methods marked using the new function `entrypoint`. Compile-time errors are given for call sites that are too dynamic to allow trimming the call graph (however there is an `unsafe` option if you want to try building anyway to see what happens). The PR has two other components. One is changes to Base that generally allow more code to be compiled in this mode. These changes will either be merged in separate PRs or moved to a separate part of the workflow (where we will build a custom system image for this purpose). The branch is set up this way to make it easy to check out and try the functionality. The other component is everything in the `juliac/` directory, which implements a compiler driver script based on this new option, along with some examples and tests. This will eventually become a package "app" that depends on PackageCompiler and provides a CLI for all of this stuff, so it will not be merged here. 
To try an example: ``` julia contrib/juliac.jl --output-exe hello --trim test/trimming/hello.jl ``` When stripped the resulting executable is currently about 900kb on my machine. Also includes a lot of work by @topolarity --------- Co-authored-by: Gabriel Baraldi <[email protected]> Co-authored-by: Tim Holy <[email protected]> Co-authored-by: Cody Tapscott <[email protected]> * fix rawbigints OOB issues (#55917) Fixes issues introduced in #50691 and found in #55906: * use `@inbounds` and `@boundscheck` macros in rawbigints, for catching OOB with `--check-bounds=yes` * fix OOB in `truncate` * prevent loading other extensions when precompiling an extension (#55589) The current way of loading extensions when precompiling an extension very easily leads to cycles. For example, if you have more than one extension and you happen to transitively depend on the triggers of one of your extensions you will immediately hit a cycle where the extensions will try to load each other indefinitely. This is an issue because you cannot directly influence your transitive dependency graph so from this p.o.v the current system of loading extension is "unsound". The test added here checks this scenario and we can now precompile and load it without any warnings or issues. Would have made https://github.com/JuliaLang/julia/issues/55517 a non issue. Fixes https://github.com/JuliaLang/julia/issues/55557 --------- Co-authored-by: KristofferC <[email protected]> * TOML: Avoid type-pirating `Base.TOML.Parser` (#55892) Since stdlibs can be duplicated but Base never is, `Base.require_stdlib` makes type piracy even more complicated than it normally would be. 
To adapt, this changes `TOML.Parser` to be a type defined by the TOML stdlib, so that we can define methods on it without committing type-piracy and avoid problems like Pkg.jl#4017 Resolves https://github.com/JuliaLang/Pkg.jl/issues/4017#issuecomment-2377589989 * [FileWatching] fix PollingFileWatcher design and add workaround for a stat bug What started as an innocent fix for a stat bug on Apple (#48667) turned into a full blown investigation into the design problems with the libuv backend for PollingFileWatcher, and writing my own implementation of it instead which could avoid those singled-threaded concurrency bugs. * [FileWatching] fix FileMonitor similarly and improve pidfile reliability Previously pidfile used the same poll_interval as sleep to detect if this code made any concurrency mistakes, but we do not really need to do that once FileMonitor is fixed to be reliable in the presence of parallel concurrency (instead of using watch_file). * [FileWatching] reorganize file and add docs * Add `--trace-dispatch` (#55848) * relocation: account for trailing path separator in depot paths (#55355) Fixes #55340 * change compiler to be stackless (#55575) This change ensures the compiler uses very little stack, making it compatible with running on any arbitrary system stack size and depths much more reliably. It also could be further modified now to easily add various forms of pause-able/resumable inference, since there is no implicit state on the stack--everything is local and explicit now. Whereas before, less than 900 frames would crash in less than a second: ``` $ time ./julia -e 'f(::Val{N}) where {N} = N <= 0 ? 0 : f(Val(N - 1)); f(Val(1000))' Warning: detected a stack overflow; program state may be corrupted, so further execution might be unreliable. Internal error: during type inference of f(Base.Val{1000}) Encountered stack overflow. This might be caused by recursion over very long tuples or argument lists. 
[23763] signal 6: Abort trap: 6
in expression starting at none:1
__pthread_kill at /usr/lib/system/libsystem_kernel.dylib (unknown line)
Allocations: 1 (Pool: 1; Big: 0); GC: 0
Abort trap: 6

real    0m0.233s
user    0m0.165s
sys     0m0.049s
```

Now: it is effectively unlimited, as long as you are willing to wait for it:

```
$ time ./julia -e 'f(::Val{N}) where {N} = N <= 0 ? 0 : f(Val(N - 1)); f(Val(50000))'
info: inference of f(Base.Val{50000}) from f(Base.Val{N}) where {N} exceeding 2500 frames (may be slow).
info: inference of f(Base.Val{50000}) from f(Base.Val{N}) where {N} exceeding 5000 frames (may be slow).
info: inference of f(Base.Val{50000}) from f(Base.Val{N}) where {N} exceeding 10000 frames (may be slow).
info: inference of f(Base.Val{50000}) from f(Base.Val{N}) where {N} exceeding 20000 frames (may be slow).
info: inference of f(Base.Val{50000}) from f(Base.Val{N}) where {N} exceeding 40000 frames (may be slow).
real    7m4.988s

$ time ./julia -e 'f(::Val{N}) where {N} = N <= 0 ? 0 : f(Val(N - 1)); f(Val(1000))'
real    0m0.214s
user    0m0.164s
sys     0m0.044s

$ time ./julia -e '@noinline f(::Val{N}) where {N} = N <= 0 ? GC.safepoint() : f(Val(N - 1)); f(Val(5000))'
info: inference of f(Base.Val{5000}) from f(Base.Val{N}) where {N} exceeding 2500 frames (may be slow).
info: inference of f(Base.Val{5000}) from f(Base.Val{N}) where {N} exceeding 5000 frames (may be slow).
real    0m8.609s
user    0m8.358s
sys     0m0.240s
```

* optimizer: simplify the finalizer inlining pass a bit (#55934)

Minor adjustments have been made to the algorithm of the finalizer inlining pass. Previously, it required that the finalizer registration dominate all uses, but this is not always necessary as long as the finalizer inlining point dominates all the uses. So the check has been relaxed. Other minor fixes have been made as well, but their importance is low.
* Limit `@inbounds` to indexing in the dual-iterator branch in `copyto_unaliased!` (#55919)

This simplifies the `copyto_unaliased!` implementation where the source and destination have different `IndexStyle`s, and limits the `@inbounds` to only the indexing operation. In particular, the iteration over `eachindex(dest)` is not marked as `@inbounds` anymore. This seems to help with performance when the destination uses Cartesian indexing.

Reduced implementation of the branch:

```julia
function copyto_proposed!(dest, src)
    axes(dest) == axes(src) || throw(ArgumentError("incompatible sizes"))
    iterdest, itersrc = eachindex(dest), eachindex(src)
    for (destind, srcind) in zip(iterdest, itersrc)
        @inbounds dest[destind] = src[srcind]
    end
    dest
end

function copyto_current!(dest, src)
    axes(dest) == axes(src) || throw(ArgumentError("incompatible sizes"))
    iterdest, itersrc = eachindex(dest), eachindex(src)
    ret = iterate(iterdest)
    @inbounds for a in src
        idx, state = ret::NTuple{2,Any}
        dest[idx] = a
        ret = iterate(iterdest, state)
    end
    dest
end

function copyto_current_limitinbounds!(dest, src)
    axes(dest) == axes(src) || throw(ArgumentError("incompatible sizes"))
    iterdest, itersrc = eachindex(dest), eachindex(src)
    ret = iterate(iterdest)
    for isrc in itersrc
        idx, state = ret::NTuple{2,Any}
        @inbounds dest[idx] = src[isrc]
        ret = iterate(iterdest, state)
    end
    dest
end
```

```julia
julia> a = zeros(40000,4000); b = rand(size(a)...);

julia> av = view(a, UnitRange.(axes(a))...);

julia> @btime copyto_current!($av, $b);
  617.704 ms (0 allocations: 0 bytes)

julia> @btime copyto_current_limitinbounds!($av, $b);
  304.146 ms (0 allocations: 0 bytes)

julia> @btime copyto_proposed!($av, $b);
  240.217 ms (0 allocations: 0 bytes)

julia> versioninfo()
Julia Version 1.12.0-DEV.1260
Commit 4a4ca9c8152 (2024-09-28 01:49 UTC)
Build Info:
  Official https://julialang.org release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 8 × Intel(R) Core(TM) i5-10310U CPU @ 1.70GHz
  WORD_SIZE: 64
  LLVM: libLLVM-18.1.7 (ORCJIT, skylake)
Threads: 1 default, 0 interactive, 1 GC (on 8 virtual cores)
Environment:
  JULIA_EDITOR = subl
```

I'm not quite certain why the proposed implementation here (`copyto_proposed!`) is even faster than `copyto_current_limitinbounds!`. In any case, `copyto_proposed!` is easier to read, so I'm not complaining.

This fixes https://github.com/JuliaLang/julia/issues/53158

* Strong zero in Diagonal triple multiplication (#55927)

Currently, triple multiplication with a `LinearAlgebra.BandedMatrix` sandwiched between two `Diagonal`s isn't associative, as this is implemented using broadcasting, which doesn't assume a strong zero, whereas the two-term matrix multiplication does.

```julia
julia> D = Diagonal(StepRangeLen(NaN, 0, 3));

julia> B = Bidiagonal(1:3, 1:2, :U);

julia> D * B * D
3×3 Matrix{Float64}:
 NaN  NaN  NaN
 NaN  NaN  NaN
 NaN  NaN  NaN

julia> (D * B) * D
3×3 Bidiagonal{Float64, Vector{Float64}}:
 NaN    NaN       ⋅
    ⋅   NaN    NaN
    ⋅      ⋅   NaN

julia> D * (B * D)
3×3 Bidiagonal{Float64, Vector{Float64}}:
 NaN    NaN       ⋅
    ⋅   NaN    NaN
    ⋅      ⋅   NaN
```

This PR ensures that the 3-term multiplication is evaluated as a sequence of two-term multiplications, which fixes this issue. This also improves performance, as only the bands need to be evaluated now.
```julia
julia> D = Diagonal(1:1000); B = Bidiagonal(1:1000, 1:999, :U);

julia> @btime $D * $B * $D;
  656.364 μs (11 allocations: 7.63 MiB) # v"1.12.0-DEV.1262"
  2.483 μs (12 allocations: 31.50 KiB) # This PR
```

* Fix dispatch on `alg` in Float16 Hermitian eigen (#55928)

Currently,

```julia
julia> using LinearAlgebra

julia> A = Hermitian(reshape(Float16[1:16;], 4, 4));

julia> eigen(A).values |> typeof
Vector{Float16} (alias for Array{Float16, 1})

julia> eigen(A, LinearAlgebra.QRIteration()).values |> typeof
Vector{Float32} (alias for Array{Float32, 1})
```

This PR moves the specialization on the `eltype` to an internal method, so that firstly all `alg`s dispatch to that method, and secondly, there are no ambiguities introduced by specializing the top-level `eigen`. The latter currently causes test failures in `StaticArrays` (https://github.com/JuliaArrays/StaticArrays.jl/actions/runs/11092206012/job/30816955210?pr=1279), and should be fixed by this PR.

* Remove specialized `ishermitian` method for `Diagonal{<:Real}` (#55948)

The fallback method for `Diagonal{<:Number}` handles this already by checking that the `diag` is real, so we don't need this additional specialization.

* Fix logic in `?` docstring example (#55945)

* fix `unwrap_macrocalls` (#55950)

The implementation of `unwrap_macrocalls` has assumed that what `:macrocall` wraps is always an `Expr` object, but that is not necessarily correct:

```julia
julia> Base.@assume_effects :nothrow @show 42
ERROR: LoadError: TypeError: in typeassert, expected Expr, got a value of type Int64
Stacktrace:
 [1] unwrap_macrocalls(ex::Expr)
   @ Base ./expr.jl:906
 [2] var"@assume_effects"(__source__::LineNumberNode, __module__::Module, args::Vararg{Any})
   @ Base ./expr.jl:756
in expression starting at REPL[1]:1
```

This commit addresses this issue.

* make faster BigFloats (#55906)

We can coalesce the two required allocations for the MPFR BigFloat API design into one allocation, hopefully giving an easy performance boost.
It would have been slightly easier and more efficient if MPFR BigFloat was already a VLA instead of containing a pointer here, but that does not prevent the optimization. * Add propagate_inbounds_meta to atomic genericmemory ops (#55902) `memoryref(mem, i)` will otherwise emit a boundscheck. ``` ; │ @ /home/vchuravy/WorkstealingQueues/src/CLL.jl:53 within `setindex_atomic!` @ genericmemory.jl:329 ; │┌ @ boot.jl:545 within `memoryref` %ptls_field = getelementptr inbounds i8, ptr %tls_pgcstack, i64 16 %ptls_load = load ptr, ptr %ptls_field, align 8 %"box::GenericMemoryRef" = call noalias nonnull align 8 dereferenceable(32) ptr @ijl_gc_small_alloc(ptr %ptls_load, i32 552, i32 32, i64 23456076646928) #9 %"box::GenericMemoryRef.tag_addr" = getelementptr inbounds i64, ptr %"box::GenericMemoryRef", i64 -1 store atomic i64 23456076646928, ptr %"box::GenericMemoryRef.tag_addr" unordered, align 8 store ptr %memoryref_data, ptr %"box::GenericMemoryRef", align 8 %.repack8 = getelementptr inbounds { ptr, ptr }, ptr %"box::GenericMemoryRef", i64 0, i32 1 store ptr %memoryref_mem, ptr %.repack8, align 8 call void @ijl_bounds_error_int(ptr nonnull %"box::GenericMemoryRef", i64 %7) unreachable ``` For the Julia code: ```julia function Base.setindex_atomic!(buf::WSBuffer{T}, order::Symbol, val::T, idx::Int64) where T @inbounds Base.setindex_atomic!(buf.buffer, order, val,((idx - 1) & buf.mask) + 1) end ``` from https://github.com/gbaraldi/WorkstealingQueues.jl/blob/0ebc57237cf0c90feedf99e4338577d04b67805b/src/CLL.jl#L41 * fix rounding mode in construction of `BigFloat` from pi (#55911) The default argument of the method was outdated, reading the global default rounding directly, bypassing the `ScopedValue` stuff. 
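For the `BigFloat`-from-pi rounding entry above, the following is a minimal sketch of the kind of call the fix concerns. It assumes only the documented `setrounding(f, T, mode)` do-block form; whether the two results differ depends on running a build that includes the fix.

```julia
# Construct BigFloat(pi) under two scoped rounding modes.
lo = setrounding(BigFloat, RoundDown) do
    BigFloat(pi)
end
hi = setrounding(BigFloat, RoundUp) do
    BigFloat(pi)
end

# If the scoped mode is respected, `lo` should sit just below `hi`;
# if it is bypassed, the two values would be expected to be equal.
println(lo <= hi)
println(hi - lo)
```

Comparing the two results, rather than checking digits, keeps the sketch independent of the active `BigFloat` precision.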
* fix `nonsetable_type_hint_handler` (#55962)

The current implementation is wrong, causing it to display inappropriate hints like the following:

```julia
julia> s = Some("foo");

julia> s[] = "bar"
ERROR: MethodError: no method matching setindex!(::Some{String}, ::String)
The function `setindex!` exists, but no method is defined for this combination of argument types.
You attempted to index the type String, rather than an instance of the type. Make sure you create the type using its constructor: d = String([...]) rather than d = String
Stacktrace:
 [1] top-level scope
   @ REPL[2]:1
```

* REPL: make UndefVarError aware of imported modules (#55932)

* fix test/staged.jl (#55967)

In particular, the implementation of `overdub_generator54341` was dangerous. This fixes it up.

* Explicitly store a module's location (#55963)

Revise wants to know what file a module's `module` definition is in. Currently it does this by looking at the source location for the implicitly generated `eval` method. This is terrible for two reasons:

1. The method may not exist if the module is a baremodule (which is not particularly common, which is probably why we haven't seen it).
2. The fact that the implicitly generated `eval` method has this location information is an implementation detail that I'd like to get rid of (#55949).

This PR adds explicit file/line info to `Module`, so that Revise doesn't have to use the hack anymore.

* mergewith: add single argument example to docstring (#55964)

I ran into this edge case. I thought it should be documented.

---------

Co-authored-by: Lilith Orion Hafner <[email protected]>

* [build] avoid libedit linkage and align libccalllazy* SONAMEs (#55968)

While building the 1.11.0-rc4 in Homebrew[^1] in preparation for the 1.11.0 release (and to confirm Sequoia successfully builds) I noticed some odd linkage for our Linux builds, which included:

1. LLVM libraries were linking to `libedit.so`, e.g.
``` Dynamic Section: NEEDED libedit.so.0 NEEDED libz.so.1 NEEDED libzstd.so.1 NEEDED libstdc++.so.6 NEEDED libm.so.6 NEEDED libgcc_s.so.1 NEEDED libc.so.6 NEEDED ld-linux-x86-64.so.2 SONAME libLLVM-16jl.so ``` CMakeCache.txt showed ``` //Use libedit if available. LLVM_ENABLE_LIBEDIT:BOOL=ON ``` Which might be overriding `HAVE_LIBEDIT` at https://github.com/JuliaLang/llvm-project/blob/julia-release/16.x/llvm/cmake/config-ix.cmake#L222-L225. So just added `LLVM_ENABLE_LIBEDIT` 2. Wasn't sure if there was a reason for this but `libccalllazy*` had mismatched SONAME: ```console ❯ objdump -p lib/julia/libccalllazy* | rg '\.so' lib/julia/libccalllazybar.so: file format elf64-x86-64 NEEDED ccalllazyfoo.so SONAME ccalllazybar.so lib/julia/libccalllazyfoo.so: file format elf64-x86-64 SONAME ccalllazyfoo.so ``` Modifying this, but can drop if intentional. --- [^1]: https://github.com/Homebrew/homebrew-core/pull/192116 * Add missing `copy!(::AbstractMatrix, ::UniformScaling)` method (#55970) Hi everyone! First PR to Julia here. It was noticed in a Slack thread yesterday that `copy!(A, I)` doesn't work, but `copyto!(A, I)` does. This PR adds the missing method for `copy!(::AbstractMatrix, ::UniformScaling)`, which simply defers to `copyto!`, and corresponding tests. I added a `compat` notice for Julia 1.12. --------- Co-authored-by: Lilith Orion Hafner <[email protected]> * Add forward progress update to NEWS.md (#54089) Closes #40009 which was left open because of the needs news tag. --------- Co-authored-by: Ian Butterworth <[email protected]> * Fix an intermittent test failure in `core` test (#55973) The test wants to assert that `Module` is not resolved in `Main`, but other tests do resolve this identifier, so the test can fail depending on test order (and I've been seeing such failures on CI recently). Fix that by running the test in a fresh subprocess. 
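For the `copy!(::AbstractMatrix, ::UniformScaling)` entry above, a short sketch of the difference between the two calls. It probes for the new method with `hasmethod` rather than assuming a specific Julia version; the matrix names are illustrative.

```julia
using LinearAlgebra

A = rand(3, 3)
copyto!(A, I)                       # the call that already worked
println(A == Matrix{Float64}(I, 3, 3))

# `copy!(A, I)` is only available once the method from this entry exists,
# so check for it before calling.
if hasmethod(copy!, Tuple{Matrix{Float64}, UniformScaling{Bool}})
    B = rand(3, 3)
    copy!(B, I)
    println(B == A)
else
    println("copy!(::AbstractMatrix, ::UniformScaling) is not defined on this version")
end
```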
* fix comma logic in time_print (#55977) Minor formatting fix * optimizer: fix up the inlining algorithm to use correct `nargs`/`isva` (#55976) It appears that inlining.jl was not updated in JuliaLang/julia#54341. Specifically, using `nargs`/`isva` from `mi.def::Method` in `ir_prepare_inlining!` causes the following error to occur: ```julia function generate_lambda_ex(world::UInt, source::LineNumberNode, argnames, spnames, @nospecialize body) stub = Core.GeneratedFunctionStub(identity, Core.svec(argnames...), Core.svec(spnames...)) return stub(world, source, body) end function overdubbee54341(a, b) return a + b end const overdubee_codeinfo54341 = code_lowered(overdubbee54341, Tuple{Any, Any})[1] function overdub_generator54341(world::UInt, source::LineNumberNode, selftype, fargtypes) if length(fargtypes) != 2 return generate_lambda_ex(world, source, (:overdub54341, :args), (), :(error("Wrong number of arguments"))) else return copy(overdubee_codeinfo54341) end end @eval function overdub54341(args...) $(Expr(:meta, :generated, overdub_generator54341)) $(Expr(:meta, :generated_only)) end topfunc(x) = overdub54341(x, 2) ``` ```julia julia> topfunc(1) Internal error: during type inference of topfunc(Int64) Encountered unexpected error in runtime: BoundsError(a=Array{Any, 1}(dims=(2,), mem=Memory{Any}(8, 0x10632e780)[SSAValue(2), SSAValue(3), #<null>, #<null>, #<null>, #<null>, #<null>, #<null>]), i=(3,)) throw_boundserror at ./essentials.jl:14 getindex at ./essentials.jl:909 [inlined] ssa_substitute_op! at ./compiler/ssair/inlining.jl:1798 ssa_substitute_op! at ./compiler/ssair/inlining.jl:1852 ir_inline_item! at ./compiler/ssair/inlining.jl:386 ... ``` This commit updates the abstract interpretation and inlining algorithm to use the `nargs`/`isva` values held by `CodeInfo`. Similar modifications have also been made to EscapeAnalysis.jl. 
@nanosoldier `runbenchmarks("inference", vs=":master")` * Add `.zed` directory to `.gitignore` (#55974) Similar to the `vscode` config directory, we may ignore the `zed` directory as well. * typeintersect: reduce unneeded allocations from `merge_env` `merge_env` and `final_merge_env` could be skipped for emptiness test or if we know there's only 1 valid Union state. * typeintersect: trunc env before nested `intersect_all` if valid. This only covers the simplest cases. We might want a full dependence analysis and keep env length minimum in the future. * `@time` actually fix time report commas & add tests (#55982) https://github.com/JuliaLang/julia/pull/55977 looked simple but wasn't quite right because of a bad pattern in the lock conflicts report section. So fix and add tests. * adjust EA to JuliaLang/julia#52527 (#55986) `EnterNode.catch_dest` can now be `0` after the `try`/`catch` elision feature implemented in JuliaLang/julia#52527, and we actually need to adjust `EscapeAnalysis.compute_frameinfo` too. * Improvements to JITLink Seeing what this will look like, since it has a number of features (delayed compilation, concurrent compilation) that are starting to become important, so it would be nice to switch to only supporting one common implementation of memory management. Refs #50248 I am expecting https://github.com/llvm/llvm-project/issues/63236 may cause some problems, since we reconfigured some CI machines to minimize that issue, but it is still likely relevant. * rewrite catchjmp asm to use normal relocations instead of manual editing * add logic to prefer loading modules that are already loaded (#55908) Iterate over the list of existing loaded modules for PkgId whenever loading a new module for PkgId, so that we will use that existing build_id content if it otherwise passes the other stale_checks. 
* Apple: fix bus error on smaller readonly file in unix (#55859) Enables the fix for #28245 in #44354 for Apple now that the Julia bugs are fixed by #55641 and #55877. Closes #28245 * Add `Float16` to `Base.HWReal` (#55929) * docs: make mod an operator (#55988) * InteractiveUtils: add `@trace_compile` and `@trace_dispatch` (#55915) * Profile: document heap snapshot viewing tools (#55743) * [REPL] Fix #55850 by using `safe_realpath` instead of `abspath` in `projname` (#55851) * optimizer: enable load forwarding with the `finalizer` elision (#55991) When the finalizer elision pass is used, load forwarding is not performed currently, regardless of whether the pass succeeds or not. But this is not necessary, and by keeping the `setfield!` call, we can safely forward `getfield` even if finalizer elision is tried. * Avoid `stat`-ing stdlib path if it's unreadable (#55992) * doc: manual: cmd: fix Markdown in table entry for `--trim` (#55979) * Avoid conversions to `Float64` in non-literal powers of `Float16` (#55994) Co-authored-by: Alex Arslan <[email protected]> * Remove unreachable error branch in memset calls (and in repeat) (#55985) Some places use the pattern memset(A, v, length(A)), which requires a conversion UInt(length(A)). This is technically fallible, but can't actually fail when A is a Memory or Array. Remove the dead error branch by casting to UInt instead. Similarly, in repeat(x, r), r is first checked to be nonnegative, then converted to UInt, then used in multiple calls where it is converted to UInt each time. Here, only do it once. * fix up docstring of `mod` (#56000) * fix typos (#56008) these are all in markdown files Co-authored-by: spaette <[email protected]> * Vectorise random vectors of `Float16` (#55997) * Clarify `div` docstring for floating-point input (#55918) Closes #55837 This is a variant of the warning found in the `fld` docstring clarifying floating-point behaviour. 
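The floating-point caveat mentioned in the `div` docstring entry above can be illustrated with the same inputs the existing `fld` docstring uses. This is a sketch of the behavior, not the docstring text itself.

```julia
# 0.1 is stored as a Float64 slightly larger than 1/10, so the true quotient
# of 6.0 and 0.1 lies just below 60, although `/` rounds the result to 60.0.
quotient  = 6.0 / 0.1
truncated = div(6.0, 0.1)   # truncates the true quotient
floored   = fld(6.0, 0.1)   # floors the true quotient

println((quotient, truncated, floored))
println(6.0 / big(0.1))     # the quotient at higher precision
```

Because `div` and `fld` work from the true value of the operands, they can differ from `trunc(x / y)` and `floor(x / y)` when the rounded quotient lands exactly on an integer.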
* improve getproperty(Pairs) warnings (#55989) - Only call `depwarn` if the field is `itr` or `data`; otherwise let the field error happen as normal - Give a more specific deprecation warning. * Document type-piracy / type-leakage restrictions for `require_stdlib` (#56005) I was a recent offender in https://github.com/JuliaLang/Pkg.jl/issues/4017#issuecomment-2377589989 This PR tries to lay down some guidelines for the behavior that stdlibs and the callers of `require_stdlib` must adhere to to avoid "duplicate stdlib" bugs These bugs are particularly nasty because they are experienced semi-rarely and under pretty specific circumstances (they only occur when `require_stdlib` loads another copy of a stdlib, often in a particular order and/or with a particular state of your pre-compile / loading cache) so they may make it a long way through a pre-release cycle without an actionable bug report. * [LinearAlgebra] Remove unreliable doctests (#56011) The exact textual representation of the output of these doctests depend on the specific kernel used by the BLAS backend, and can vary between versions of OpenBLAS (as it did in #41973), or between different CPUs, which makes these doctests unreliable. Fix #55998. * cleanup functions of Hermitian matrices (#55951) The functions of Hermitian matrices are a bit of a mess. For example, if we have a Hermitian matrix `a` with negative eigenvalues, `a^0.5` doesn't produce the `Symmetric` wrapper, but `sqrt(a)` does. On the other hand, if we have a positive definite `b`, `b^0.5` will be `Hermitian`, but `sqrt(b)` will be `Symmetric`: ```julia using LinearAlgebra a = Hermitian([1.0 2.0;2.0 1.0]) a^0.5 sqrt(a) b = Hermitian([2.0 1.0; 1.0 2.0]) b^0.5 sqrt(b) ``` This sort of arbitrary assignment of wrappers happens with pretty much all functions defined there. 
There's also some oddities, such as `cis` being the only function defined for `SymTridiagonal`, even though all `eigen`-based functions work, and `cbrt` being the only function not defined for complex Hermitian matrices. I did a cleanup: I defined all functions for `SymTridiagonal` and `Hermitian{<:Complex}`, and always assigned the appropriate wrapper, preserving the input one when possible. There's an inconsistency remaining that I didn't fix, that only `sqrt` and `log` accept a tolerance argument, as changing that is probably breaking. There were also hardly any tests that I could find (only `exp`, `log`, `cis`, and `sqrt`). I'm happy to add them if it's desired. * Fix no-arg `ScopedValues.@with` within a scope (#56019) Fixes https://github.com/JuliaLang/julia/issues/56017 * LinearAlgebra: make matprod_dest public (#55537) Currently, in a matrix multiplication `A * B`, we use `B` to construct the destination. However, this may not produce the optimal destination type, and is essentially single-dispatch. Letting packages specialize `matprod_dest` would help us obtain the optimal type by dispatching on both the arguments. This may significantly improve performance in the matrix multiplication. As an example: ```julia julia> using LinearAlgebra, FillArrays, SparseArrays julia> F = Fill(3, 10, 10); julia> s = sprand(10, 10, 0.1); julia> @btime $F * $s; 15.225 μs (10 allocations: 4.14 KiB) julia> typeof(F * s) SparseMatrixCSC{Float64, Int64} julia> nnz(F * s) 80 julia> VERSION v"1.12.0-DEV.1074" ``` In this case, the destination is a sparse matrix with 80% of its elements filled and being set one-by-one, which is terrible for performance. 
Instead, if we specialize `matprod_dest` to return a dense destination, we may obtain ```julia julia> LinearAlgebra.matprod_dest(F::FillArrays.AbstractFill, S::SparseMatrixCSC, ::Type{T}) where {T} = Matrix{T}(undef, size(F,1), size(S,2)) julia> @btime $F * $s; 754.632 ns (2 allocations: 944 bytes) julia> typeof(F * s) Matrix{Float64} ``` Potentially, this may be improved further by specializing `mul!`, but this is a 20x improvement just by choosing the right destination. Since this is being made public, we may want to bikeshed on an appropriate name for the function. * Sockets: Warn when local network access not granted. (#56023) Works around https://github.com/JuliaLang/julia/issues/56022 * Update test due to switch to intel syntax by default in #48103 (#55993) * add require_lock call to maybe_loaded_precompile (#56027) If we expect this to be a public API (https://github.com/timholy/Revise.jl for some reason is trying to access this state), we should lock around it for consistency with the other similar functions. Needed for https://github.com/timholy/Revise.jl/pull/856 * fix `power_by_squaring`: use `promote` instead of type inference (#55634) Fixes #53504 Fixes #55633 * Don't show keymap `@error` for hints (#56041) It's too disruptive to show errors for hints. The error will still be shown if tab is pressed. 
Helps issues like https://github.com/JuliaLang/julia/issues/56037 * Refactoring to be considered before adding MMTk * Removing jl_gc_notify_image_load, since it's a new function and not part of the refactoring * Moving gc_enable code to gc-common.c * Addressing PR comments * Push resolution of merge conflict * Removing jl_gc_mark_queue_obj_explicit extern definition from scheduler.c * Don't need the getter function since it's possible to use jl_small_typeof directly * Remove extern from free_stack declaration in julia_internal.h * Putting everything that is common GC tls into gc-tls-common.h * Typo * Adding gc-tls-common.h to Makefile as a public header * Removing gc-tls-common fields from gc-tls-mmtk.h * Fix typo in sockets tests. (#56038) * EA: use `is_mutation_free_argtype` for the escapability check (#56028) EA has been using `isbitstype` for type-level escapability checks, but a better criterion (`is_mutation_free`) is available these days, so we would like to use that instead. * effects: fix `Base.@_noub_meta` (#56061) This had the incorrect number of arguments to `Expr(:purity, ...)` causing it to be silently ignored. * effects: improve `:noub_if_noinbounds` documentation (#56060) Just a small touch-up * Disallow assigning asymmetric values to SymTridiagonal (#56068) Currently, we can assign an asymmetric value to a `SymTridiagonal`, which goes against what `setindex!` is expected to do. This is because `SymTridiagonal` symmetrizes the values along the diagonal, so setting a diagonal entry to an asymmetric value would lead to a subsequent `getindex` producing a different result. 
```julia julia> s = SMatrix{2,2}(1:4); julia> S = SymTridiagonal(fill(s,4), fill(s,3)) 4×4 SymTridiagonal{SMatrix{2, 2, Int64, 4}, Vector{SMatrix{2, 2, Int64, 4}}}: [1 3; 3 4] [1 3; 2 4] ⋅ ⋅ [1 2; 3 4] [1 3; 3 4] [1 3; 2 4] ⋅ ⋅ [1 2; 3 4] [1 3; 3 4] [1 3; 2 4] ⋅ ⋅ [1 2; 3 4] [1 3; 3 4] julia> S[1,1] = s 2×2 SMatrix{2, 2, Int64, 4} with indices SOneTo(2)×SOneTo(2): 1 3 2 4 julia> S[1,1] == s false julia> S[1,1] 2×2 Symmetric{Int64, SMatrix{2, 2, Int64, 4}} with indices SOneTo(2)×SOneTo(2): 1 3 3 4 ``` After this PR, ```julia julia> S[1,1] = s ERROR: ArgumentError: cannot set a diagonal entry of a SymTridiagonal to an asymmetric value ``` * Remove unused matrix type params in diag methods (#56048) These parameters are not used in the method, and are unnecessary for dispatch. * LinearAlgebra: diagzero for non-OneTo axes (#55252) Currently, the off-diagonal zeros for a block-`Diagonal` matrix is computed using `diagzero`, which calls `zeros` for the sizes of the elements. This returns an `Array`, unless one specializes `diagzero` for the custom `Diagonal` matrix type. This PR defines a `zeroslike` function that dispatches on the axes of the elements, which lets packages specialize on the axes to return custom `AbstractArray`s. Choosing to specialize on the `eltype` avoids the need to specialize on the container, and allows packages to return appropriate types for custom axis types. 
With this, ```julia julia> LinearAlgebra.zeroslike(::Type{S}, ax::Tuple{SOneTo, Vararg{SOneTo}}) where {S<:SMatrix} = SMatrix{map(length, ax)...}(ntuple(_->zero(eltype(S)), prod(length, ax))) julia> D = Diagonal(fill(SMatrix{2,3}(1:6), 2)) 2×2 Diagonal{SMatrix{2, 3, Int64, 6}, Vector{SMatrix{2, 3, Int64, 6}}}: [1 3 5; 2 4 6] ⋅ ⋅ [1 3 5; 2 4 6] julia> D[1,2] # now an SMatrix 2×3 SMatrix{2, 3, Int64, 6} with indices SOneTo(2)×SOneTo(3): 0 0 0 0 0 0 julia> LinearAlgebra.zeroslike(::Type{S}, ax::Tuple{SOneTo, Vararg{SOneTo}}) where {S<:MMatrix} = MMatrix{map(length, ax)...}(ntuple(_->zero(eltype(S)), prod(length, ax))) julia> D = Diagonal(fill(MMatrix{2,3}(1:6), 2)) 2×2 Diagonal{MMatrix{2, 3, Int64, 6}, Vector{MMatrix{2, 3, Int64, 6}}}: [1 3 5; 2 4 6] ⋅ ⋅ [1 3 5; 2 4 6] julia> D[1,2] # now an MMatrix 2×3 MMatrix{2, 3, Int64, 6} with indices SOneTo(2)×SOneTo(3): 0 0 0 0 0 0 ``` The reason this can't be the default behavior is that we are not guaranteed that there exists a `similar` method that accepts the combination of axes. This is why we have to fall back to using the sizes, unless a specialized method is provided by a package. One positive outcome of this is that indexing into such a block-diagonal matrix will now usually be type-stable, which mitigates https://github.com/JuliaLang/julia/issues/45535 to some extent (although it doesn't resolve the issue). I've also updated the `getindex` for `Bidiagonal` to use `diagzero`, instead of the similarly defined `bidiagzero` function that it was using. Structured block matrices may now use `diagzero` uniformly to generate the zero elements. * Multi-argument `gcdx(a, b, c...)` (#55935) Previously, `gcdx` only worked for two arguments - but the underlying idea extends to any (nonzero) number of arguments. Similarly, `gcd` already works for 1, 2, 3+ arguments. 
This PR implements the 1 and 3+ argument versions of `gcdx`, following the [wiki page](https://en.wikipedia.org/wiki/Extended_Euclidean_algorithm#The_case_of_more_than_two_numbers) for the Extended Euclidean algorithm. * Refactoring to be considered before adding MMTk * Removing jl_gc_notify_image_load, since it's a new function and not part of the refactoring * Moving gc_enable code to gc-common.c * Addressing PR comments * Push resolution of merge conflict * Removing jl_gc_mark_queue_obj_explicit extern definition from scheduler.c * Don't need the getter function since it's possible to use jl_small_typeof directly * Remove extern from free_stack declaration in julia_internal.h * Putting everything that is common GC tls into gc-tls-common.h * Typo * Adding gc-tls-common.h to Makefile as a public header * Adding jl_full_sweep_reasons since timing.jl depends on it * Fixing issue with jl_full_sweep_reasons (missing constants) * fix `_growbeg!` unncessary resizing (#56029) This was very explicitly designed such that if there was a bunch of extra space at the end of the array, we would copy rather than allocating, but by making `newmemlen` be at least `overallocation(memlen)` rather than `overallocation(len)`, this branch was never hit. found by https://github.com/JuliaLang/julia/issues/56026 * REPL: hide any prints to stdio during `complete_line` (#55959) * teach llvm-alloc-helpers about `gc_loaded` (#56030) combined with https://github.com/JuliaLang/julia/pull/55913, the compiler is smart enough to fully remove ``` function f() m = Memory{Int}(undef, 3) @inbounds m[1] = 2 @inbounds m[2] = 2 @inbounds m[3] = 4 @inbounds return m[1] + m[2] + m[3] end ``` * mpfr: prevent changing precision (#56049) Changing precision requires reallocating the data field, which is better done by making a new BigFloat (since they are conceptually immutable anyways). Also do a bit a cleanup while here. 
Closes #56044 * stackwalk: fix jl_thread_suspend_and_get_state race (#56047) There was a missing re-assignment of old = -1; at the end of that loop which means in the ABA case, we accidentally actually acquire the lock on the thread despite not actually having stopped the thread; or in the counter-case, we try to run through this logic with old==-1 on the next iteration, and that isn't valid either (jl_thread_suspend_and_get_state should return failure and the loop will abort too early). Fix #56046 * irrationals: restrict assume effects annotations to known types (#55886) Other changes: * replace `:total` with the less powerful `:foldable` * add an `<:Integer` dispatch constraint on the `rationalize` method, closes #55872 * replace `Rational{<:Integer}` with just `Rational`, they're equal Other issues, related to `BigFloat` precision, are still present in irrationals.jl, to be fixed by followup PRs, including #55853. Fixes #55874 * update `hash` doc string: `widen` not required any more (#55867) Implementing `widen` isn't a requirement any more, since #26022. * Merge `diag` methods for triangular matrices (#56086) * slightly improve inference in precompilation code (#56084) Avoids the ``` 11: signature Tuple{typeof(convert), Type{String}, Any} triggered MethodInstance for Base.Precompilation.ExplicitEnv(::String) (84 children) ``` shown in https://github.com/JuliaLang/julia/issues/56080#issuecomment-2404765120 Co-authored-by: KristofferC <[email protected]> * avoid defining `convert(Vector{String}, ...)` in LibGit2 (#56082) This is a weird conversion function to define. Seems cleaner to use the iteration interface for this. Also avoids some invalidations (https://github.com/JuliaLang/julia/issues/56080#issuecomment-2404765120) Co-authored-by: KristofferC <[email protected]> * array: inline `convert` where possible (#56034) This improves a common scenario, where someone wants to `push!` a poorly-typed object onto a well-typed Vector. 
For example: ```julia const NT = @NamedTuple{x::Int,y::Any} foo(v::Vector{NT}, x::Int, @nospecialize(y)) = push!(v, (; x, y)) ``` The `(; x, y)` is slightly poorly-typed here. It could have any type for its `.y` field before it is converted inside the `push!` to a NamedTuple with `y::Any` Without this PR, the dispatch for this `push!` cannot be inferred: ```julia julia> code_typed(foo, (Vector{NT}, Int, Any))[1] CodeInfo( 1 ─ ... │ %4 = %new(%3, x, y)::NamedTuple{(:x, :y), <:Tuple{Int64, Any}} │ %5 = Main.push!(v, %4)::Vector{@NamedTuple{x::Int64, y}} └── return %5 ) => Vector{@NamedTuple{x::Int64, y}} ``` With this PR, the above dynamic call is fully statically resolved and inlined (and therefore `--trim` compatible) * Remove some unnecessary `real` specializations for structured matrices (#56083) The `real(::AbstractArray{<:Rea})` fallback method should handle these cases correctly. * Combine `diag` methods for `SymTridiagonal` (#56014) Currently, there are two branches, one for an `eltype` that is a `Number`, and the other that deals with generic `eltype`s. They do similar things, so we may combine these, and use branches wherever necessary to retain the performance. We also may replace explicit materialized arrays by generators in `copyto!`. Overall, this improves performance in `diag` for matrices of matrices, whereas the performance in the common case of matrices of numbers remains unchanged. ```julia julia> using StaticArrays, LinearAlgebra julia> s = SMatrix{2,2}(1:4); julia> S = SymTridiagonal(fill(s,100), fill(s,99)); julia> @btime diag($S); 1.292 μs (5 allocations: 7.16 KiB) # nightly, v"1.12.0-DEV.1317" 685.012 ns (3 allocations: 3.19 KiB) # This PR ``` This PR also allows computing the `diag` for more values of the band index `n`: ```julia julia> diag(S,99) 1-element Vector{SMatrix{2, 2, Int64, 4}}: [0 0; 0 0] ``` This would work as long as `getindex` works for the `SymTridiagonal` for that band, and the zero element may be converted to the `eltype`. 
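As a companion to the `diag` description above, here is a small sketch of band extraction from a `SymTridiagonal` with plain integer entries. It only needs `LinearAlgebra`; the matrix-of-matrices case shown in the PR additionally requires StaticArrays and is not reproduced here. The variable names are illustrative, and band indices beyond the matrix size are deliberately left out because their behavior depends on the Julia version.

```julia
using LinearAlgebra

# A 4×4 symmetric tridiagonal matrix: main diagonal and one off-diagonal are stored.
S = SymTridiagonal([1, 2, 3, 4], [5, 6, 7])

d0  = diag(S)       # main diagonal
d1  = diag(S, 1)    # first superdiagonal
dm1 = diag(S, -1)   # first subdiagonal, equal to d1 by symmetry
d2  = diag(S, 2)    # band outside the stored ones: structural zeros

println(d0)
println(d1)
println(d2)
```

For numeric element types the zero entries are just `zero(eltype(S))`. The PR text above concerns the harder case where the elements are themselves matrices and a suitable zero element has to be constructed.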
* fix `Vararg{T,T} where T` crashing `code_typed` (#56081) Not sure this is the right place to fix this error, perhaps `match.spec_types` should always be a tuple of valid types? fixes #55916 --------- Co-authored-by: Jameson Nash <[email protected]> * [libblastrampoline_jll] Upgrade to v5.11.1 (#56094) v5.11.1 is a patch release with a couple of RISC-V fixes. * Revert "REPL: hide any prints to stdio during `complete_line`" (#56102) * Remove warning from c when binding is ambiguous (#56103) * make `Base.ANSIIterator` have a concrete field (#56088) Avoids the invalidation ``` backedges: 1: superseding sizeof(s::AbstractString) @ Base strings/basic.jl:177 with MethodInstance for sizeof(::AbstractString) (75 children) ``` shown in https://github.com/JuliaLang/julia/issues/56080#issuecomment-2404765120. Co-authored-by: KristofferC <[email protected]> * Subtype: some performance tuning. (#56007) The main motivation of this PR is to fix #55807. dc689fe8700f70f4a4e2dbaaf270f26b87e79e04 tries to remove the slow `may_contain_union_decision` check by re-organizing the code path. Now the fast path has been removed and most of its optimization has been integrated into the preserved slow path. Since the slow path stores all inner ∃ decisions on the outer most R stack, there might be overflow risk. aee69a41441b4306ba3ee5e845bc96cb45d9b327 should fix that concern. 
The reported MWE now becomes ```julia 0.000002 seconds 0.000040 seconds (105 allocations: 4.828 KiB, 52.00% compilation time) 0.000023 seconds (105 allocations: 4.828 KiB, 49.36% compilation time) 0.000026 seconds (105 allocations: 4.828 KiB, 50.38% compilation time) 0.000027 seconds (105 allocations: 4.828 KiB, 54.95% compilation time) 0.000019 seconds (106 allocations: 4.922 KiB, 49.73% compilation time) 0.000024 seconds (105 allocations: 4.828 KiB, 52.24% compilation time) ``` Local bench also shows that 72855cd slightly accelerates `OmniPackage.jl`'s loading ```julia julia> @time using OmniPackage # v1.11rc4 20.525278 seconds (25.36 M allocations: 1.606 GiB, 8.48% gc time, 12.89% compilation time: 77% of which was recompilation) # v1.11rc4+aee69a4+72855cd 19.527871 seconds (24.92 M allocations: 1.593 GiB, 8.88% gc time, 15.13% compilation time: 82% of which was recompilation) ``` * rearrange jl_delete_thread to be thread-safe (#56097) Prior to this, especially on macOS, the gc-safepoint here would cause the process to segfault as we had already freed the current_task state. Rearrange this code so that the GC interactions (except for the atomic store to current_task) are all handled before entering GC safe, and then signaling the thread is deleted (via setting current_task = NULL, published by jl_unlock_profile_wr to other threads) is last. ``` ERROR: Exception handler triggered on unmanaged thread. 
Process 53827 stopped * thread #5, stop reason = EXC_BAD_ACCESS (code=2, address=0x100018008) frame #0: 0x0000000100b74344 libjulia-internal.1.12.0.dylib`jl_delete_thread [inlined] jl_gc_state_set(ptls=0x000000011f8b3200, state='\x02', old_state=<unavailable>) at julia_threads.h:272:9 [opt] 269 assert(old_state != JL_GC_CONCURRENT_COLLECTOR_THREAD); 270 jl_atomic_store_release(&ptls->gc_state, state); 271 if (state == JL_GC_STATE_UNSAFE || old_state == JL_GC_STATE_UNSAFE) -> 272 jl_gc_safepoint_(ptls); 273 return old_state; 274 } 275 STATIC_INLINE int8_t jl_gc_state_save_and_set(jl_ptls_t ptls, Target 0: (julia) stopped. (lldb) up frame #1: 0x0000000100b74320 libjulia-internal.1.12.0.dylib`jl_delete_thread [inlined] jl_gc_state_save_and_set(ptls=0x000000011f8b3200, state='\x02') at julia_threads.h:278:12 [opt] 275 STATIC_INLINE int8_t jl_gc_state_save_and_set(jl_ptls_t ptls, 276 int8_t state) 277 { -> 278 return jl_gc_state_set(ptls, state, jl_atomic_load_relaxed(&ptls->gc_state)); 279 } 280 #ifdef __clang_gcanalyzer__ 281 // these might not be a safepoint (if they are no-op safe=>safe transitions), but we have to assume it could be (statically) (lldb) frame #2: 0x0000000100b7431c libjulia-internal.1.12.0.dylib`jl_delete_thread(value=0x000000011f8b3200) at threading.c:537:11 [opt] 534 ptls->root_task = NULL; 535 jl_free_thread_gc_state(ptls); 536 // then park in safe-region -> 537 (void)jl_gc_safe_enter(ptls); 538 } ``` (test incorporated into https://github.com/JuliaLang/julia/pull/55793) * OpenBLAS: Use dynamic architecture support on AArch64. (#56107) We already do so on Yggdrasil, so this just makes both source and binary builds behave similarly. 
Closes https://github.com/JuliaLang/julia/issues/56075 * IRShow: label builtin / intrinsic / dynamic calls in `code_typed` (#56036) This makes it much easier to spot dynamic dispatches * 🤖 [master] Bump the Pkg stdlib from 51d4910c1 to fbaa2e337 (#56124) * Fix type instability of closures capturing types (2) (#40985) Instead of closures lowering to `typeof` for the types of captured fields, this introduces a new function `_typeof_captured_variable` that returns `Type{T}` if `T` is a type (w/o free typevars). - replaces/closes #35970 - fixes #23618 --------- Co-authored-by: Takafumi Arakaki <[email protected]> Co-authored-by: Shuhei Kadowaki <[email protected]> * Remove debug error statement from Makefile. (#56127) * align markdown table (#56122) @<!-- -->gbaraldi `#51197` @<!-- -->spaette `#56008` fix innocuous malalignment of table after those pulls were merged * Improve IOBuffer docs (#56024) Based on the discussion in #55978, I have tried to clarify the documentation of `IOBuffer`. * Comment out url and fix typo in stackwalk.c (#56131) Introduced in #55623 * libgit2: Always use the bundled PCRE library. (#56129) This is how Yggdrasil builds the library. * Update JLL build versions (#56133) This commit encompasses the following changes: - Updating the JLL build version for Clang, dSFMT, GMP, LibUV, LibUnwind, LLD, LLVM, libLLVM, MbedTLS, MPFR, OpenBLAS, OpenLibm, p7zip, PCRE2, SuiteSparse, and Zlib. - Updating CompilerSupportLibraries to v1.2.0. The library versions contained in this release of CSL don't differ from v1.1.1, the only difference is that v1.2.0 includes FreeBSD AArch64. - Updating nghttp2 from 1.60.0 to 1.63.0. See [here](https://github.com/nghttp2/nghttp2/releases) for changes between these versions. - Adding `aarch64-unknown-freebsd` to the list of triplets to check when refreshing checksums. Note that dependencies that link to MbedTLS (Curl, LibSSH2, LibGit2) are excluded here. 
They'll be updated once a resolution is reached for the OpenSSL switching saga. Once that happens, FreeBSD AArch64 should be able to be built without any dependency source builds. * typo in `Compiler.Effects` doc string: `checkbounds` -> `boundscheck` (#56140) Follows up on #56060 * HISTORY: fix missing links (#56137) * OpenBLAS: Fix cross-compilation detection for source build. (#56139) We may be cross-compiling Linux-to-Linux, in which case `BUILD_OS` == `OS`, so look at `XC_HOST` to determine whether we're cross compiling. * `diag` for `BandedMatrix`es for off-limit bands (#56065) Currently, one can only obtain the `diag` for a `BandedMatrix` (such as a `Diagonal`) when the band index is bounded by the size of the matrix. This PR relaxes this requirement to match the behavior for arrays, where `diag` returns an empty vector for a large band index instead of throwing an error. ```julia julia> D = Diagonal(ones(4)) 4×4 Diagonal{Float64, Vector{Float64}}: 1.0 ⋅ ⋅ ⋅ ⋅ 1.0 ⋅ ⋅ ⋅ ⋅ 1.0 ⋅ ⋅ ⋅ ⋅ 1.0 julia> diag(D, 10) Float64[] julia> diag(Array(D), 10) Float64[] ``` Something similar for `SymTridiagonal` is being done in https://github.com/JuliaLang/julia/pull/56014 * Port progress bar improvements from Pkg (#56125) Includes changes from https://github.com/JuliaLang/Pkg.jl/pull/4038 and https://github.com/JuliaLang/Pkg.jl/pull/4044. Co-authored-by: Kristoffer Carlsson <[email protected]> * Add support for LLVM 19 (#55650) Co-authored-by: Zentrik <[email protected]> * 🤖 [master] Bump the Pkg stdlib from fbaa2e337 to 27c1b1ee5 (#56146) * HISTORY entry for deletion of `length(::Stateful)` (#55861) xref #47790 xref #51747 xref #54953 xref #55858 * ntuple: ensure eltype is always `Int` (#55901) Fixes #55790 * Improve remarks of the alloc opt pass slightly. (#55995) The Value printer LLVM uses just prints the kind of instruction so it just shows call. 
--------- Co-authored-by: Oscar Smith <[email protected]> * Implement Base.fd() for TCPSocket, UDPSocket, and TCPServer (#53721) This is quite handy if you want to pass off the file descriptor to a C library. I also added a warning to the `fd()` docstring to warn folks about duplicating the file descriptor first. * Fix `JULIA_CPU_TARGET` being propagated to workers precompiling stdlib pkgimages (#54093) Apparently (thanks ChatGPT) each line in a makefile is executed in a separate shell so adding an `export` line on one line does not propagate to the next line. * Merge tr methods for triangular matrices (#56154) Since the methods do identical things, we don't need multiple of these. * Reduce duplication in triangular indexing methods (#56152) This uses an orthogonal design to reduce code duplication in the indexing methods for triangular matrices. * update LLVM docs (#56162) dump with raw=true so you don't get random erorrs, and show how to run single modules. --------- Co-authored-by: Valentin Churavy <[email protected]> Co-authored-by: Mosè Giordano <[email protected]> Co-authored-by: Jameson Nash <[email protected]> * Fix zero elements for block-matrix kron involving Diagonal (#55941) Currently, it's assumed that the zero element is identical for the matrix, but this is not necessary if the elements are matrices themselves and have different sizes. This PR ensures that `kron` for a `Diagonal` has the correct zero elements. 
Current: ```julia julia> D = Diagonal(1:2) 2×2 Diagonal{Int64, UnitRange{Int64}}: 1 ⋅ ⋅ 2 julia> B = reshape([ones(2,2), ones(3,2), ones(2,3), ones(3,3)], 2, 2); julia> size.(kron(D, B)) 4×4 Matrix{Tuple{Int64, Int64}}: (2, 2) (2, 3) (2, 2) (2, 2) (3, 2) (3, 3) (2, 2) (2, 2) (2, 2) (2, 2) (2, 2) (2, 3) (2, 2) (2, 2) (3, 2) (3, 3) ``` This PR ```julia julia> size.(kron(D, B)) 4×4 Matrix{Tuple{Int64, Int64}}: (2, 2) (2, 3) (2, 2) (2, 3) (3, 2) (3, 3) (3, 2) (3, 3) (2, 2) (2, 3) (2, 2) (2, 3) (3, 2) (3, 3) (3, 2) (3, 3) ``` Note the differences e.g. in the `CartesianIndex(4,1)`, `CartesianIndex(3,2)` and `CartesianIndex(3,3)` elements. * Call `MulAddMul` instead of multiplication in _generic_matmatmul! (#56089) Fix https://github.com/JuliaLang/julia/issues/56085 by calling a newly created `MulAddMul` object that only wraps the `alpha` (with `beta` set to `false`). This avoids the explicit multiplication if `alpha` is known to be `isone`. * improve `allunique`'s type stability (#56161) Caught by https://github.com/aviatesk/JET.jl/issues/667. * Add invalidation barriers for `displaysize` and `implicit_typeinfo` (#56159) These are invalidated by our own stdlibs (Dates and REPL) unfortunately so we need to put this barrier in. This fix is _very_ un-satisfying, because it doesn't do anything to solve this problem for downstream libraries that use e.g. `displaysize`. To fix that, I think we need a way to make sure callers get these invalidation barriers by default... * Fix markdown list in installation.md (#56165) Documenter.jl requires all trailing list content to follow the same indentation as the header. So, in the current view (https://docs.julialang.org/en/v1/manual/installation/#Command-line-arguments) the list appears broken. * [Random] Add more comments and a helper function in Xoshiro code (#56144) Follow up to #55994 and #55997. 
This should basically be a non-functional change and I see no performance difference, but the comments and the definition of a helper function should make the code easier to follow (I initially struggled in #55997) and extend to other types. * add objects to concisely specify initialization PerProcess: once per process PerThread: once per thread id PerTask: once per task object * add precompile support for recording fields to change Somewhat generalizes our support for changing Ptr to C_NULL. Not particularly fast, since it is just using the builtins implementation of setfield, and delaying the actual stores, but it should suffice. * improve OncePer implementation Address reviewer feedback, add more fixes and more tests, rename to add Once prefix. * fix use-after-free in test (detected in win32 CI) * Make loading work when stdlib deps are missing in the manifest (#56148) Closes https://github.com/JuliaLang/julia/issues/56109 Simulating a bad manifest by having `LibGit2_jll` missing as a dep of `LibGit2` in my default env, say because the manifest was generated by a different julia version or different master julia commit. ## This PR, it just works ``` julia> using Revise julia> ``` i.e. ``` % JULIA_DEBUG=loading ./julia --startup-file=no julia> using Revise ... ┌ Debug: Stdlib LibGit2 [76f85450-5226-5b5a-8eaa-529ad045b433] is trying to load `LibGit2_jll` │ which is not listed as a dep in the load path manifests, so resorting to search │ in the stdlib Project.tomls for true deps └ @ Base loading.jl:387 ┌ Debug: LibGit2 [76f85450-5226-5b5a-8eaa-529ad045b433] indeed depends on LibGit2_jll in project /Users/ian/Documents/GitHub/julia/usr/share/julia/stdlib/v1.12/LibGit2/Project.toml └ @ Base loading.jl:395 ... 
julia> ``` ## Master ``` julia> using Revise Info Given Revise was explicitly requested, output will be shown live ERROR: LoadError: ArgumentError: Package LibGit2 does not have LibGit2_jll in its dependencies: - Note that the following manifests in the load path were resolved with a potentially different DEV version of the current version, which may be the cause of the error. Try to re-resolve them in the current version, or consider deleting them if that fails: /Users/ian/.julia/environments/v1.12/Manifest.toml - You may have a partially installed environment. Try `Pkg.instantiate()` to ensure all packages in the environment are installed. - Or, if you have LibGit2 checked out for development and have added LibGit2_jll as a dependency but haven't updated your primary environment's manifest file, try `Pkg.resolve()`. - Otherwise you may need to report an issue with LibGit2 ... ``` * Remove llvm-muladd pass and move it's functionality to to llvm-simdloop (#55802) Closes https://github.com/JuliaLang/julia/issues/55785 I'm not sure if we want to backport this like this. Because that removes some functionality (the pass itself). So LLVM.jl and friends might need annoying version code. We can maybe keep the code there and just not run the pass in a backport. * Fix implicit `convert(String, ...)` in several places (#56174) This removes several `convert(String, ...)` from this code, which really shouldn't be something we invalidate on in the first place (see https://github.com/JuliaLang/julia/issues/56173) but this is still an improvement in code quality so let's take it. * Change annotations to use a NamedTuple (#55741) Due to popular demand, the type of annotations is to be changed from a `Tuple{UnitRange{Int}, Pair{Symbol, Any}}` to a `NamedTuple{(:region, :label, :value), Tuple{UnitRange{Int}, Symbol, Any}}`. This requires the expected code churn to `strings/annotated.jl`, and some changes to the StyledStrings and JuliaSyntaxHighlighting libraries. 
Closes #55249 and closes #55245. * Getting rid of mmtk_julia.c in the binding and moving it to gc-mmtk.c * Trying to organize and label the code in gc-mmtk.c * Remove redundant `convert` in `_setindex!` (#56178) Follow up to #56034, ref: https://github.com/JuliaLang/julia/pull/56034#discussion_r1798573573. --------- Co-authored-by: Cody Tapscott <[email protected]> * Improve type inference of Artifacts.jl (#56118) This also has some changes that move platform selection to compile time together with https://github.com/JuliaPackaging/JLLWrappers.jl/commit/45cc04963f3c99d4eb902f97528fe16fc37002cc, move the platform selection to compile time. (this helps juliac a ton) * Initial support for RISC-V (#56105) Rebase and extension of @alexfanqi's initial work on porting Julia to RISC-V. Requires LLVM 19. Tested on a VisionFive2, built with: ```make MARCH := rv64gc_zba_zbb MCPU := sifive-u74 USE_BINARYBUILDER:=0 DEPS_GIT = llvm override LLVM_VER=19.1.1 override LLVM_BRANCH=julia-release/19.x override LLVM_SHA1=julia-release/19.x ``` ```julia-repl ❯ ./julia _ _ _ _(_)_ | Documentation: https://docs.julialang.org (_) | (_) (_) | _ _ _| |_ __ _ | Type "?" for help, "]?" for Pkg help. | | | | | | |/ _` | | | | |_| | | | (_| | | Version 1.12.0-DEV.1374 (2024-10-14) _/ |\__'_|_|_|\__'_| | riscv/25092a3982* (fork: 1 commits, 0 days) |__/ | julia> versioninfo(; verbose=true) Julia Version 1.12.0-DEV.1374 Commit 25092a3982* (2024-10-14 09:57 UTC) Platform Info: OS: Linux (riscv64-unknown-linux-gnu) uname: Linux 6.11.3-1-riscv64 #1 SMP Debian 6.11.3-1 (2024-10-10) riscv64 unknown CPU: …
This PR updates the binding to the latest Julia master (up to this commit). Should be merged together with mmtk/mmtk-julia#18.