support GPU (`CUDA.jl`) acceleration for `evolution` and `spectrum`
Showing 19 changed files with 383 additions and 46 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
steps: | ||
  - label: "HierarchicalEOM_CUDAExt" | ||
    plugins: | ||
      - JuliaCI/julia#v1: | ||
          version: "1" | ||
      - JuliaCI/julia-test#v1: | ||
          test_args: "--quickfail" | ||
          coverage: false # 1000x slowdown | ||
    agents: | ||
      queue: "juliagpu" | ||
      cuda: "*" | ||
    env: | ||
      GROUP: "HierarchicalEOM_CUDAExt" | ||
      JULIA_PKG_SERVER: "" # it often struggles with our large artifacts | ||
      # SECRET_CODECOV_TOKEN: "..." | ||
    timeout_in_minutes: 30 | ||
    # Don't run Buildkite if the commit message includes the text [skip tests] | ||
    if: build.message !~ /\[skip tests\]/ |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,95 @@ | ||
# [Extension for CUDA.jl](@id doc-ext-CUDA) | ||
|
||
This extension adds GPU ([`CUDA.jl`](https://github.com/JuliaGPU/CUDA.jl)) acceleration for solving the [time evolution](@ref doc-Time-Evolution) and the [spectrum](@ref doc-Spectrum). It reduces execution time and memory usage, especially when the HEOMLS matrix is very large. | ||
|
||
!!! compat "Compat" | ||
    The described feature requires `Julia 1.9+`. | ||
|
||
The functions [`evolution`](@ref doc-Time-Evolution) (only the ODE method with a time-independent system Hamiltonian is supported) and [`spectrum`](@ref doc-Spectrum) automatically choose to solve on the CPU or the GPU depending on the type of the sparse matrix in the `M::AbstractHEOMLSMatrix` object (i.e., the type of the field `M.data`). | ||
|
||
```julia | ||
typeof(M.data) <: SparseMatrixCSC # solve on CPU | ||
typeof(M.data) <: CuSparseMatrixCSC # solve on GPU | ||
``` | ||
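
The type-based selection described above can be illustrated with Julia's multiple dispatch. This is only a minimal CPU-only sketch: `ToyHEOMLS` and `backend` are hypothetical names invented for illustration and are not part of `HierarchicalEOM`.

```julia
using SparseArrays

# Hypothetical stand-in for an HEOMLS matrix object: only the `data` field matters here.
struct ToyHEOMLS{T<:AbstractMatrix}
    data::T
end

# Hypothetical helper: choose a label from the storage type of `M.data`.
backend(M::ToyHEOMLS) = backend(M.data)
backend(::SparseMatrixCSC) = :cpu
backend(::AbstractMatrix) = :unknown   # a GPU sparse type would get its own method

M = ToyHEOMLS(sparse(ComplexF64[0 1; 1 0]))
println(backend(M))
```

Because the choice is made from the type of the stored matrix alone, the calling code stays the same regardless of where the matrix lives.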
|
||
Therefore, we wrap several functions from `CUDA` and `CUDA.CUSPARSE` so that they return a new HEOMLS-matrix-type object whose `M.data` is of type `CuSparseMatrixCSC`, with the element type converted to `ComplexF32` and the index type to `Int32` (since GPUs perform better with these types). The functions are listed as follows: | ||
- `cu(M::AbstractHEOMLSMatrix)` : Convert `M.data` into the type `CuSparseMatrixCSC{ComplexF32, Int32}` | ||
- `CuSparseMatrixCSC(M::AbstractHEOMLSMatrix)` : Convert `M.data` into the type `CuSparseMatrixCSC{ComplexF32, Int32}` | ||
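
To get a feel for what the narrower element types imply, the same conversion can be tried on the CPU with `SparseArrays` alone. The sketch below is an illustration under that assumption (no GPU involved, and the small matrix `A` is made up); it rebuilds a sparse matrix with `Int32` indices and `ComplexF32` values and measures the rounding error this introduces.

```julia
using SparseArrays

# A small CPU sparse matrix with the default types (ComplexF64 values, Int indices).
A = sparse([1, 2, 2], [1, 1, 2], ComplexF64[1.0 + 0.1im, 2.0, 3.0])

# Rebuild it with 32-bit indices and single-precision values.
A32 = SparseMatrixCSC{ComplexF32, Int32}(
    size(A, 1), size(A, 2),
    Vector{Int32}(A.colptr),
    Vector{Int32}(A.rowval),
    Vector{ComplexF32}(A.nzval),
)

# Largest rounding error introduced by the narrower element type.
max_err = maximum(abs, Matrix(A32) - Matrix(A))
println(typeof(A32), "  max rounding error = ", max_err)
```

Results computed on the GPU are therefore expected to agree with CPU results only to roughly single precision.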
|
||
### Demonstration | ||
|
||
The extension is loaded automatically when the user imports the package `CUDA.jl`: | ||
|
||
```julia | ||
using CUDA | ||
using HierarchicalEOM | ||
using LinearSolve # to change the solver for better GPU performance | ||
``` | ||
|
||
### Setup | ||
|
||
Here, we demonstrate this extension by using the example of [the single-impurity Anderson model](@ref exp-SIAM). | ||
|
||
```julia | ||
ϵ = -5 | ||
U = 10 | ||
Γ = 2 | ||
μ = 0 | ||
W = 10 | ||
kT = 0.5 | ||
N = 5 | ||
tier = 3 | ||
|
||
tlist = 0f0:1f-1:10f0 # same as 0:0.1:10 but in the type of `Float32` | ||
ωlist = -10f0:1f0:10f0 # same as -10:1:10 but in the type of `Float32` | ||
|
||
σm = [0 1; 0 0] | ||
σz = [1 0; 0 -1] | ||
II = [1 0; 0 1] | ||
d_up = kron( σm, II) | ||
d_dn = kron(-1 * σz, σm) | ||
ρ0 = kron([1 0; 0 0], [1 0; 0 0]) | ||
Hsys = ϵ * (d_up' * d_up + d_dn' * d_dn) + U * (d_up' * d_up * d_dn' * d_dn) | ||
|
||
bath_up = Fermion_Lorentz_Pade(d_up, Γ, μ, W, kT, N) | ||
bath_dn = Fermion_Lorentz_Pade(d_dn, Γ, μ, W, kT, N) | ||
bath_list = [bath_up, bath_dn] | ||
|
||
# even HEOMLS matrix | ||
M_even_cpu = M_Fermion(Hsys, tier, bath_list; verbose=false) | ||
M_even_gpu = cu(M_even_cpu) | ||
|
||
# odd HEOMLS matrix | ||
M_odd_cpu = M_Fermion(Hsys, tier, bath_list, ODD; verbose=false) | ||
M_odd_gpu = cu(M_odd_cpu) | ||
|
||
# solve steady state with CPU | ||
ados_ss = SteadyState(M_even_cpu); | ||
``` | ||
|
||
!!! note "Note" | ||
    This extension does not support solving [`SteadyState`](@ref doc-Stationary-State) on the GPU, since it is inefficient and may return wrong solutions. If you really need to obtain the stationary state with the GPU, you can repeatedly solve the [`evolution`](@ref doc-Time-Evolution) until the state stops changing. | ||
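
The idea of reaching a stationary state by repeated time evolution can be sketched on a toy problem. Everything below is an assumption for illustration: the 2x2 rate matrix `L` stands in for an HEOMLS matrix, and `relax_to_steady_state` is a hypothetical helper, not a function of `HierarchicalEOM`. With the package itself, each chunk would instead be a call to `evolution` on the GPU matrix.

```julia
using LinearAlgebra

# Toy generator standing in for an HEOMLS matrix (hypothetical 2x2 rate matrix).
# Its columns sum to zero, so the total population is conserved.
L = [-1.0  2.0;
      1.0 -2.0]

# Hypothetical helper: propagate in chunks of length `dt` until the state stops changing.
function relax_to_steady_state(L, ρ0; dt = 0.5, tol = 1e-12, maxiter = 1_000)
    P = exp(L * dt)          # propagator for one chunk of time evolution
    ρ = copy(ρ0)
    for _ in 1:maxiter
        ρ_next = P * ρ
        norm(ρ_next - ρ) < tol && return ρ_next
        ρ = ρ_next
    end
    error("no convergence within $maxiter chunks")
end

ρ_ss = relax_to_steady_state(L, [1.0, 0.0])
println(ρ_ss)
```

The stopping rule compares successive states, so the tolerance should be chosen with the working precision in mind; on the GPU path that is single precision.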
|
||
### Solving time evolution with CPU | ||
|
||
```julia | ||
ados_list_cpu = evolution(M_even_cpu, ρ0, tlist; verbose=false) | ||
``` | ||
|
||
### Solving time evolution with GPU | ||
|
||
```julia | ||
ados_list_gpu = evolution(M_even_gpu, ρ0, tlist; verbose=false) | ||
``` | ||
|
||
### Solving Spectrum with CPU | ||
|
||
```julia | ||
dos_cpu = spectrum(M_odd_cpu, ados_ss, d_up, ωlist; verbose=false) | ||
``` | ||
|
||
### Solving Spectrum with GPU | ||
|
||
```julia | ||
dos_gpu = spectrum(M_odd_gpu, ados_ss, d_up, ωlist; solver=KrylovJL_BICGSTAB(rtol=1f-10, atol=1f-12), verbose=false) | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,73 @@ | ||
module HierarchicalEOM_CUDAExt | ||
|
||
using HierarchicalEOM | ||
import HierarchicalEOM.HeomAPI: _HandleVectorType, _HandleSteadyStateMatrix | ||
import HierarchicalEOM.Spectrum: _HandleIdentityType | ||
import CUDA | ||
import CUDA: cu, CuArray | ||
import CUDA.CUSPARSE: CuSparseMatrixCSC | ||
import SparseArrays: sparse, SparseVector, SparseMatrixCSC | ||
|
||
@doc raw""" | ||
    cu(M::AbstractHEOMLSMatrix) | ||
Return a new HEOMLS-matrix-type object whose `M.data` is of type `CuSparseMatrixCSC{ComplexF32, Int32}`, for GPU calculations. | ||
""" | ||
cu(M::AbstractHEOMLSMatrix) = CuSparseMatrixCSC(M) | ||
|
||
@doc raw""" | ||
    CuSparseMatrixCSC(M::AbstractHEOMLSMatrix) | ||
Return a new HEOMLS-matrix-type object whose `M.data` is of type `CuSparseMatrixCSC{ComplexF32, Int32}`, for GPU calculations. | ||
""" | ||
function CuSparseMatrixCSC(M::T) where T <: AbstractHEOMLSMatrix | ||
    A = M.data | ||
    if typeof(A) <: CuSparseMatrixCSC | ||
        return M | ||
    else | ||
        colptr = CuArray{Int32}(A.colptr) | ||
        rowval = CuArray{Int32}(A.rowval) | ||
        nzval = CuArray{ComplexF32}(A.nzval) | ||
        A_gpu = CuSparseMatrixCSC{ComplexF32, Int32}(colptr, rowval, nzval, size(A)) | ||
        if T <: M_S | ||
            return M_S(A_gpu, M.tier, M.dim, M.N, M.sup_dim, M.parity) | ||
        elseif T <: M_Boson | ||
            return M_Boson(A_gpu, M.tier, M.dim, M.N, M.sup_dim, M.parity, M.bath, M.hierarchy) | ||
        elseif T <: M_Fermion | ||
            return M_Fermion(A_gpu, M.tier, M.dim, M.N, M.sup_dim, M.parity, M.bath, M.hierarchy) | ||
        else | ||
            return M_Boson_Fermion(A_gpu, M.Btier, M.Ftier, M.dim, M.N, M.sup_dim, M.parity, M.Bbath, M.Fbath, M.hierarchy) | ||
        end | ||
    end | ||
end | ||
|
||
# for changing a `CuArray` back to `ADOs` | ||
function _HandleVectorType(V::T, cp::Bool=false) where T <: CuArray | ||
    return Vector{ComplexF64}(V) | ||
end | ||
|
||
# for changing the type of `ADOs` to match the type of HEOMLS matrix | ||
function _HandleVectorType(MatrixType::Type{TM}, V::SparseVector) where TM <: CuSparseMatrixCSC | ||
    TE = eltype(MatrixType) | ||
    return CuArray{TE}(V) | ||
end | ||
|
||
##### This part is removed for now because there are errors when solving steady states on the GPU | ||
# function _HandleSteadyStateMatrix(MatrixType::Type{TM}, M::AbstractHEOMLSMatrix, S::Int) where TM <: CuSparseMatrixCSC | ||
# colptr = Vector{Int32}(M.data.colPtr) | ||
# rowval = Vector{Int32}(M.data.rowVal) | ||
# nzval = Vector{ComplexF32}(M.data.nzVal) | ||
# A = SparseMatrixCSC{ComplexF32, Int32}(S, S, colptr, rowval, nzval) | ||
# A[1,1:S] .= 0f0 | ||
# | ||
# # sparse(row_idx, col_idx, values, row_dims, col_dims) | ||
# A += sparse(ones(Int32, M.dim), [Int32((n - 1) * (M.dim + 1) + 1) for n in 1:(M.dim)], ones(ComplexF32, M.dim), S, S) | ||
# return CuSparseMatrixCSC(A) | ||
# end | ||
|
||
function _HandleIdentityType(MatrixType::Type{TM}, S::Int) where TM <: CuSparseMatrixCSC | ||
    colptr = CuArray{Int32}(Int32(1):Int32(S+1)) | ||
    rowval = CuArray{Int32}(Int32(1):Int32(S)) | ||
    nzval = CUDA.ones(ComplexF32, S) | ||
    return CuSparseMatrixCSC{ComplexF32, Int32}(colptr, rowval, nzval, (S, S)) | ||
end | ||
|
||
end |
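
The index arrays used by `_HandleIdentityType` follow the compressed sparse column (CSC) layout of an identity matrix. As a rough CPU-only check of that layout, assuming plain `SparseArrays` in place of the CUDA types and an arbitrary size `S = 4`, the same three arrays can be fed to `SparseMatrixCSC` and compared with a dense identity:

```julia
using SparseArrays, LinearAlgebra

S = 4
colptr = collect(Int32(1):Int32(S + 1))   # column j owns the single stored entry j
rowval = collect(Int32(1):Int32(S))       # that entry sits on row j (the diagonal)
nzval  = ones(ComplexF32, S)              # every stored value is 1

Id = SparseMatrixCSC{ComplexF32, Int32}(S, S, colptr, rowval, nzval)
println(Matrix(Id) == Matrix{ComplexF32}(I, S, S))
```

Building the arrays directly avoids constructing the identity on the CPU first and then copying it to the device.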