Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[NDTensors] [ITensors] Update tests to use Julia version 1.10 and 1.11 #1539

Merged
merged 28 commits into from
Oct 12, 2024

Conversation

kmp5VT
Copy link
Collaborator

@kmp5VT kmp5VT commented Oct 9, 2024

Description

Per this issue #1537 updating test parameters to test against Julia version 1.10 and 1.11 on Github and on Jenkins CI

Closes #1537 and closes #1538.

@kmp5VT kmp5VT requested a review from mtfishman October 9, 2024 14:46
@mtfishman

This comment was marked as resolved.

@kmp5VT

This comment was marked as resolved.

@kmp5VT

This comment was marked as resolved.

@mtfishman

This comment was marked as resolved.

@mtfishman

This comment was marked as resolved.

@ogauthe

This comment was marked as resolved.

@mtfishman

This comment was marked as resolved.

@mtfishman

This comment was marked as resolved.

@mtfishman mtfishman changed the title [ITensors][NDTensors] Update tests to use Julia version 1.10 and 1.11 [NDTensors] [ITensors] Update tests to use Julia version 1.10 and 1.11 Oct 10, 2024
jenkins/Jenkinsfile Outdated Show resolved Hide resolved
jenkins/Jenkinsfile Outdated Show resolved Hide resolved
jenkins/Jenkinsfile Outdated Show resolved Hide resolved
@mtfishman
Copy link
Member

mtfishman commented Oct 10, 2024

@kmp5VT I've skipped the broken BlockSparseArrays tests and fixed the failing ITensorMPS tests by using StableRNGs, so this should be good to go once you address my comments about the Jenkins workflow.

@mtfishman
Copy link
Member

mtfishman commented Oct 10, 2024

The ITensorUnicodePlots.jl test failure is being fixed in ITensor/ITensorUnicodePlots.jl#6. EDIT: That PR was merged and registered.

@mtfishman
Copy link
Member

mtfishman commented Oct 10, 2024

@kmp5VT it looks like this test: https://github.com/kmp5VT/ITensors.jl/blob/73d432f2009d4772caf4773091269b9f5b73e230/NDTensors/src/lib/Expose/test/runtests.jl#L253 is failing in Julia 1.11 for JLArrays.jl with the error:

ArgumentError: Attempt to use a freed reference.

I guess that will need some investigating.

EDIT: It looks like it could be related to JuliaGPU/GPUArrays.jl#546.

@kmp5VT
Copy link
Collaborator Author

kmp5VT commented Oct 10, 2024

@mtfishman Looking into the issue now, Thanks!

@mtfishman
Copy link
Member

Here's a more minimal reproduction of that error:

using JLArrays: jl
A = jl(randn(2))
B = jl(randn(2))
C = append!(parent(view(A, :)), B)
sum(C)

which outputs:

ERROR: LoadError: ArgumentError: Attempt to use a freed reference.
Stacktrace:
  [1] getindex
    @ ~/.julia/packages/GPUArrays/qt4ax/src/host/abstractarray.jl:70 [inlined]
  [2] unsafe_convert
    @ ~/.julia/packages/JLArrays/hD5YX/src/JLArrays.jl:271 [inlined]
  [3] pointer
    @ ~/.julia/packages/JLArrays/hD5YX/src/JLArrays.jl:254 [inlined]
  [4] copyto!(dest::Vector{Float64}, d_offset::Int64, source::JLArrays.JLArray{Float64, 1}, s_offset::Int64, amount::Int64)
    @ JLArrays ~/.julia/packages/JLArrays/hD5YX/src/JLArrays.jl:335
  [5] getindex
    @ ~/.julia/packages/GPUArrays/qt4ax/src/host/indexing.jl:52 [inlined]
  [6] _broadcast_getindex
    @ ./broadcast.jl:626 [inlined]
  [7] _getindex
    @ ./broadcast.jl:670 [inlined]
  [8] _broadcast_getindex
    @ ./broadcast.jl:645 [inlined]
  [9] getindex
    @ ./broadcast.jl:605 [inlined]
 [10] iterate
    @ ./broadcast.jl:273 [inlined]
 [11] iterate
    @ ./broadcast.jl:267 [inlined]
 [12] iterate
    @ ./generator.jl:45 [inlined]
 [13] collect(itr::Base.Generator{Base.Broadcast.Broadcasted{…}, typeof(identity)})
    @ Base ./array.jl:780
 [14] map(f::Function, A::Base.Broadcast.Broadcasted{JLArrays.JLArrayStyle{…}, Tuple{…}, typeof(identity), Tuple{…}})
    @ Base ./abstractarray.jl:3399
 [15] macro expansion
    @ ~/.julia/packages/GPUArraysCore/GMsgk/src/GPUArraysCore.jl:210 [inlined]
 [16] mapreducedim!(f::Function, op::Function, R::JLArrays.JLArray{…}, A::Base.Broadcast.Broadcasted{…}; init::Float64)
    @ JLArrays ~/.julia/packages/JLArrays/hD5YX/src/JLArrays.jl:413
 [17] mapreducedim!
    @ ~/.julia/packages/JLArrays/hD5YX/src/JLArrays.jl:408 [inlined]
 [18] _mapreduce(f::typeof(identity), op::typeof(Base.add_sum), As::JLArrays.JLArray{Float64, 1}; dims::Colon, init::Nothing)
    @ GPUArrays ~/.julia/packages/GPUArrays/qt4ax/src/host/mapreduce.jl:67
 [19] _mapreduce
    @ ~/.julia/packages/GPUArrays/qt4ax/src/host/mapreduce.jl:33 [inlined]
 [20] mapreduce
    @ ~/.julia/packages/GPUArrays/qt4ax/src/host/mapreduce.jl:28 [inlined]
 [21] _sum
    @ ./reducedim.jl:987 [inlined]
 [22] _sum
    @ ./reducedim.jl:986 [inlined]
 [23] sum(a::JLArrays.JLArray{Float64, 1})
    @ Base ./reducedim.jl:982

with versions:

julia> versioninfo()
Julia Version 1.11.0
Commit 501a4f25c2b (2024-10-07 11:40 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: macOS (arm64-apple-darwin22.4.0)
  CPU: 10 × Apple M1 Max
  WORD_SIZE: 64
  LLVM: libLLVM-16.0.6 (ORCJIT, apple-m1)
Threads: 1 default, 0 interactive, 1 GC (on 8 virtual cores)

(julia_1.11) pkg> st JLArrays
Status `*/Project.toml`
  [27aeb0d3] JLArrays v0.1.5

though interestingly I also see that same error in Julia 1.10 so I don't know why the tests in Julia 1.10 are passing.

Maybe we should just not be capturing and using the output of append! and that goes against the memory management model of GPU arrays.

@kmp5VT
Copy link
Collaborator Author

kmp5VT commented Oct 11, 2024

though interestingly I also see that same error in Julia 1.10 so I don't know why the tests in Julia 1.10 are passing.

Maybe we should just not be capturing and using the output of append! and that goes against the memory management >model of GPU arrays.

Interestingly, I only see this bug occasionally, I made this to see how frequently the error occurs

using JLArrays: jl
num = 0
for i in 1:1000
A = jl(randn(2))
B = jl(randn(2))
C = append!(parent(view(A, :)), B)
try 
  sum(C)
  catch
    @show i
    @show num
  end
  num += 1
end

And the first time I saw the error was num = 30477 and the second time was num = 51163

@mtfishman
Copy link
Member

mtfishman commented Oct 11, 2024

It looks like the issue doesn't have to do with wrapping the input, and is an issue with the reference counting/memory management being done by append!/push!, since this errors in Julia 1.10 and 1.11:

using JLArrays: jl
A = jl(randn(2))
B = jl(randn(2))
append!(A, B)
sum(A)

This issue was fixed by JuliaGPU/GPUArrays.jl#546 (I checked that it is fixed when I use the latest commit of the master branch of the GPUArrays.jl repository) but it wasn't included in a new registered version of JLArrays.jl yet. I'll raise an issue over there to ask for a new registered version. EDIT: I requested a new registered version of JLArrays.jl here: JuliaGPU/GPUArrays.jl#562.

@mtfishman
Copy link
Member

@kmp5VT it looks like instead of installing Julia directly here: https://github.com/kmp5VT/ITensors.jl/blob/73d432f2009d4772caf4773091269b9f5b73e230/jenkins/Dockerfile#L15-L16 we could use this command: https://julialang.org/downloads/#install_julia to install juliaup, and then use juliaup to install Julia in the Jenkins workflows.

@dylex does that sound right? We are trying to configure it so we can use juliaup to have more flexibility about which Julia versions we install in the Jenkins workflow. For example, juliaup supports automatically installing the long term support (LTS) and latest Julia release with nice syntax, so we don't have to manually change the versions in the workflow file when there are new releases.

@kmp5VT
Copy link
Collaborator Author

kmp5VT commented Oct 11, 2024

@mtfishman I tried to install juliaup in the docker file installation. This seems to work fine and I can grab correct versions (using on LTS or 1). However there are issues with juliaup and the installed julia executable during the runtime of the docker. I could try to install juliaup when the docker is instantiated?

@mtfishman
Copy link
Member

mtfishman commented Oct 11, 2024

@mtfishman I tried to install juliaup in the docker file installation. This seems to work fine and I can grab correct versions (using on LTS or 1). However there are issues with juliaup and the installed julia executable during the runtime of the docker. I could try to install juliaup when the docker is instantiated?

I think this is beyond my expertise. I see errors like this in the Jenkins log:

+ /usr/local/.juliaup/bin/juliaup add 1
Error: Failed to update versions db.

Caused by:
0: run_command_update_version_db command failed to load configuration db.
1: Failed to open juliaup config file.
2: Permission denied (os error 13)
script returned exit code 1

It looks like it is some issue with not having permission to access the juliaup config file. Anyway, if it isn't obvious, I don't think this is worth bothering with right now, we can just revert it to the way it was before your last recent set of commits. Maybe we can get help from @dylex about this at some point.

@mtfishman mtfishman merged commit eba5e17 into ITensor:main Oct 12, 2024
15 checks passed
@kmp5VT kmp5VT deleted the kmp5/debug/update_test_1_11 branch October 16, 2024 16:24
corbett5 pushed a commit to corbett5/ITensors.jl that referenced this pull request Nov 4, 2024
* Require Julia 1.10 and above

* Fixes for Julia 1.11, update CI to use Julia version 1.10 and 1.11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants