Batch evaluations #819

Merged (15 commits into master, Oct 16, 2023)
Conversation

termi-official (Member) commented Oct 12, 2023

This PR introduces internal batched evaluations where possible, which is interesting for matrix-free and high-order stuff (and maybe threading?). We might want to expose the functionality to users in #766. It should also help with eval times in #790. In interpolations.jl there is an outline for #389.

Due to the interaction, this PR might precede #790 and #798. Furthermore, it is part of the infrastructure for #766.
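For orientation, here is a minimal usage sketch of the batched entry points (shape_values! and shape_gradients_and_values!, as they appear in the diff); the concrete buffer types are just one choice, any AbstractArray of matching length should do:

using Ferrite

ip = Lagrange{RefQuadrilateral, 2}()
ξ  = Vec{2}((0.1, -0.3))

# Preallocate buffers once, then fill them in a single batched call instead of
# looping over shape_value(ip, ξ, i) for every base function separately.
N    = zeros(getnbasefunctions(ip))
dNdξ = zeros(Vec{2, Float64}, getnbasefunctions(ip))

shape_values!(N, ip, ξ)                      # all shape function values at ξ
shape_gradients_and_values!(dNdξ, N, ip, ξ)  # gradients and values in one pass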

TODOs

  • exports for new functions
  • better docs

codecov-commenter commented Oct 13, 2023

Codecov Report

Attention: 5 lines in your changes are missing coverage. Please review.

Comparison is base (e9ae940) 92.77% compared to head (a736e65) 92.33%.
Report is 1 commit behind head on master.


Additional details and impacted files
@@            Coverage Diff             @@
##           master     #819      +/-   ##
==========================================
- Coverage   92.77%   92.33%   -0.44%     
==========================================
  Files          33       33              
  Lines        4952     4955       +3     
==========================================
- Hits         4594     4575      -19     
- Misses        358      380      +22     
Files Coverage Δ
src/FEValues/cell_values.jl 100.00% <100.00%> (ø)
src/FEValues/face_values.jl 98.21% <100.00%> (-0.07%) ⬇️
src/PointEval/PointEvalHandler.jl 92.21% <ø> (-0.60%) ⬇️
src/PointEval/point_values.jl 96.96% <100.00%> (-0.40%) ⬇️
src/interpolations.jl 96.23% <68.75%> (-0.78%) ⬇️

... and 3 files with indirect coverage changes


termi-official marked this pull request as ready for review October 13, 2023 01:10
AbdAlazezAhmed (Collaborator) left a comment:

LGTM

Comment on lines 212 to 213
function shape_values!(values::AT, ip::IP, ξ::Vec) where {IP <: Interpolation, AT <: AbstractArray}
@inbounds for i in 1:getnbasefunctions(ip)
Member:

Shouldn't there be a checkbounds here as default, which can be removed by calling @inbounds shape_values!(...)?

Suggested change:
- function shape_values!(values::AT, ip::IP, ξ::Vec) where {IP <: Interpolation, AT <: AbstractArray}
-     @inbounds for i in 1:getnbasefunctions(ip)
+ @propagate_inbounds function shape_values!(values::AT, ip::IP, ξ::Vec) where {IP <: Interpolation, AT <: AbstractArray}
+     @boundscheck checkbounds(values, 1:getnbasefunctions(ip))
+     @inbounds for i in 1:getnbasefunctions(ip)
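With @propagate_inbounds the caller decides whether the check runs; a minimal sketch of the resulting calling pattern (the buffer name is illustrative):

N = zeros(getnbasefunctions(ip))

shape_values!(N, ip, ξ)            # @boundscheck runs, N is checked against getnbasefunctions(ip)
@inbounds shape_values!(N, ip, ξ)  # the @boundscheck block is elided inside the callee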

Comment on lines 493 to 561
# struct TensorProductLagrange{shape, order} <: ScalarInterpolation{shape, order}
# end

# adjust_dofs_during_distribution(::TensorProductLagrange) = true
# adjust_dofs_during_distribution(::TensorProductLagrange{<:Any, 2}) = false
# adjust_dofs_during_distribution(::TensorProductLagrange{<:Any, 1}) = false

# TODO support these orderings in the dof handler close
# vertexdof_indices(::TensorProductLagrange{RefLine,order}) where {order} = ((1,),(order+1,))
# vertexdof_indices(::TensorProductLagrange{RefQuadrilateral,order}) where {order} = ((1,),(order+1,), (order*(order+1),), ((order+1)*(order+1),))
# ...

# Sum factorized eval for tensor product elements
# function shape_values!(values::AT, ip::Lagrange{RefHypercube{2},order}, ξ::Vec{2,T}) where {T, AT <: AbstractArray, order}
# # Auxiliary 1D interpolation
# ip_1d = Lagrange{RefLine,order}()
# n_basefunctions_1d = getnbasefunctions(ip_1d)
# # Evaluate 1D
# evals_x_1d = ntuple(i->shape_value(ip_1d, Vec{1,T}((ξ[1],)), i), order+1)
# evals_y_1d = ntuple(i->shape_value(ip_1d, Vec{1,T}((ξ[2],)), i), order+1)
# # Sum factorization
# values_2d = reshape(values, (n_basefunctions_1d, n_basefunctions_1d))
# for ix in 1:n_basefunctions_1d, iy in 1:n_basefunctions_1d
# values_2d[ix, iy] = evals_x_1d[ix] * evals_y_1d[iy]
# end
# TODO permute values_2d
# end

# function shape_gradients!(gradients_2d::AT, ip::Lagrange{RefHypercube{2},order}, ξ::Vec{2,T}) where {T, AT <: AbstractArray, order}
# # Auxiliary 1D interpolation
# ip_1d = Lagrange{RefLine,order}()
# n_basefunctions_1d = getnbasefunctions(ip_1d)
# # Evaluate 1D
# evals_x_1d = ntuple(i->shape_value(ip_1d, Vec{1,T}((ξ[1],)), i), order+1)
# evals_y_1d = ntuple(i->shape_value(ip_1d, Vec{1,T}((ξ[2],)), i), order+1)
# evals_dx_1d = ntuple(i->shape_gradient(ip_1d, Vec{1,T}((ξ[1],)), i), order+1)
# evals_dy_1d = ntuple(i->shape_gradient(ip_1d, Vec{1,T}((ξ[2],)), i), order+1)
# # Sum factorization
# gradients_2d_2d = reshape(gradients_2d, (n_basefunctions_1d, n_basefunctions_1d))
# for ix in 1:n_basefunctions_1d, iy in 1:n_basefunctions_1d
# gradients_2d_2d[ix, iy, 1] = ...
# gradients_2d_2d[ix, iy, 2] = ...
# end
# TODO permute gradients_2d_2d
# end

# function shape_gradients_and_values!(gradients::GAT, shapes::SAT, ip::Lagrange{RefHypercube{2},order}, ξ::Vec ) where {T, SAT <: AbstractArray, GAT <: AbstractArray, dim, order}
# # Auxiliary 1D interpolation
# ip_1d = Lagrange{RefLine,order}()
# n_basefunctions_1d = getnbasefunctions(ip_1d)
# # Evaluate 1D
# evals_x_1d = ntuple(i->shape_value(ip_1d, Vec{1,T}((ξ[1],)), i), order+1)
# evals_y_1d = ntuple(i->shape_value(ip_1d, Vec{1,T}((ξ[2],)), i), order+1)
# evals_dx_1d = ntuple(i->shape_gradient(ip_1d, Vec{1,T}((ξ[1],)), i), order+1)
# evals_dy_1d = ntuple(i->shape_gradient(ip_1d, Vec{1,T}((ξ[2],)), i), order+1)
# # Sum factorization
# values_2d = reshape(values, (n_basefunctions_1d, n_basefunctions_1d))
# gradients_2d = reshape(gradients, (n_basefunctions_1d, n_basefunctions_1d))
# for ix in 1:n_basefunctions_1d, iy in 1:n_basefunctions_1d
# values_2d[ix, iy, 1] = ...
# values_2d[ix, iy, 2] = ...
# end
# TODO permute values_2d
# TODO permute gradients_2d_2d
# end

# TODO implement kernels above for 3D
# TODO implement vectorized interpolation case

KnutAM (Member) commented Oct 13, 2023

Should this be removed before merge?
Seems better to add as a comment to #389 ?

termi-official (Member, Author) replied:

I think so. This is basically a template to guide how the evals might be implemented in #790.

ip::IP
end

function PointValuesInternal(ξ::Vec{dim, T}, ip::IP) where {dim, T, shape <: AbstractRefShape{dim}, IP <: Interpolation{shape}}
n_func_basefuncs = getnbasefunctions(ip)
N = [shape_value(ip, ξ, i) for i in 1:n_func_basefuncs]
return PointValuesInternal{IP, eltype(N)}(N, ip)
N = MVector(ntuple(i->shape_value(ip, ξ, i), n_func_basefuncs))
Member:

Is there any performance benefit of using MVector here? (Nice to document why since we don't use it elsewhere)

termi-official (Member, Author) replied:

Not sure. I am just trying to get stuff ready for GPU mode. :)

Member:

OK, would it make sense to leave that for a separate issue then, instead of hard-coding a different vector type?

My impression is that other efforts (at least those I'm involved in) are more about allowing GPU-compatible arrays instead of enforcing them?
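For illustration, keeping the signatures generic over AbstractArray (as they are in this PR) already lets the caller pick the storage without hard-coding MVector; a small sketch, where the StaticArrays buffer is only an example:

using StaticArrays

ip = Lagrange{RefQuadrilateral, 1}()
ξ  = Vec{2}((0.0, 0.0))

# Heap-allocated buffer for regular CPU assembly
N_vec = zeros(getnbasefunctions(ip))
shape_values!(N_vec, ip, ξ)

# Stack-allocated buffer, e.g. for GPU kernels or hot inner loops
N_mvec = zero(MVector{4, Float64})  # 4 base functions for linear quadrilaterals
shape_values!(N_mvec, ip, ξ)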

KnutAM (Member) commented Oct 13, 2023

These functions are really nice!
Will use them in #764 if this gets merged!

termi-official (Member, Author) commented

So this one precedes #764?

KnutAM (Member) commented Oct 13, 2023

So this one precedes #764?

If this could be merged quickly, then yes.
Most of the changes in this PR would get reverted though.
But in that case I can use these functions internally, which is nice (I kind of did something similar, but not as nice/general, with

function precompute_values!(fv::FunctionValues, qr::QuadratureRule)
    n_shape = getnbasefunctions(fv.ip)
    for (qp, ξ) in pairs(getpoints(qr))
        for i in 1:n_shape
            fv.dNdξ[i, qp], fv.N_ξ[i, qp] = shape_gradient_and_value(fv.ip, ξ, i)
        end
    end
end

Seeing this, I'll at least use this internally, and potentially an even better option would be to extend to shape_gradient_value!(dNdxi::AbstractMatrix, N_xi::AbstractMatrix, ip, qr::QuadratureRule) and skip the precompute_values! function.
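A rough sketch of what such a quadrature-batched method could look like, reusing the shape_gradients_and_values! name from this PR (this QuadratureRule method is hypothetical and not part of the PR):

function shape_gradients_and_values!(dNdξ::AbstractMatrix, N::AbstractMatrix, ip::Interpolation, qr::QuadratureRule)
    # One row per base function, one column per quadrature point.
    for (qp, ξ) in pairs(getpoints(qr))
        for i in 1:getnbasefunctions(ip)
            dNdξ[i, qp], N[i, qp] = shape_gradient_and_value(ip, ξ, i)
        end
    end
    return dNdξ, N
end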

termi-official (Member, Author) commented

So this one precedes #764?

If this could be merged quickly, then yes. Most of the changes in this PR would get reverted though. But in that case I can use these functions internally, which is nice (I kind of did something similar, but not as nice/general, with

The purpose of this PR is also more to get the discussion going on these changes, so I don't end up with a 3k-line diff for the GPU PR. Even if large parts are reverted, I am fine with it, no worries. :)

Seeing this, I'll at least use this internally, and potentially an even better option would be to extend to shape_gradient_value!(dNdxi::AbstractMatrix, N_xi::AbstractMatrix, ip, qr::QuadratureRule) and skip the precompute_values! function.

Out of curiosity, do you have a use case where we need such large batches? But yes, we can easily extend this one here to the full quadrature batch.

KnutAM (Member) commented Oct 14, 2023

Out of curiosity, do you have a use case where we need such large batches? But yes, we can easily extend this one here to the full quadrature batch.

Nah, I was perhaps overly enthusiastic at first :) (because it is only used in one location, so it would just be moving code around for now). I think just having these should suffice, and I'll use views in precompute_values!. In that respect, it doesn't matter much either way which gets merged first.

One case I thought of, for future work, could be to actually define shape_values(ip) = @inbounds shape_values!(Vector{shape_value_type(ip)}(undef, getnbasefunctions(ip)), ip) and similar for shape_gradients_values and potentially shape_hessians_gradients_values. But I think this should be left for the future for now. :) Internally, this would be used in the creation of FunctionValues and GeometryMapping, but it would make things convenient for users wanting to extend things.
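For illustration, such an allocating convenience wrapper could look roughly like this; shape_value_type does not exist yet, so this sketch derives the element type from shape_value instead, and the ξ argument is my assumption about the eventual signature:

# Hypothetical allocating wrapper around the in-place batched evaluation.
function shape_values(ip::Interpolation, ξ::Vec)
    N = Vector{typeof(shape_value(ip, ξ, 1))}(undef, getnbasefunctions(ip))
    @inbounds shape_values!(N, ip, ξ)
    return N
end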

fredrikekre (Member) commented

I added the bounds checks and removed the extra stuff. I backed up the PR on the branch fe/do/ip-batched-eval in case you need it. Looks like shape_gradients! is untested, by the way; it used an undefined variable before my push.

fredrikekre merged commit fe44e7f into master Oct 16, 2023
7 checks passed
fredrikekre deleted the do/ip-batched-eval branch October 16, 2023 07:43
fredrikekre removed their request for review October 16, 2023 07:44
Evaluate all shape function gradients and values of `ip` at once at the reference point `ξ`
and store them in `gradients` and `values`, respectively.
"""
function shape_gradients_and_values!(gradients::GAT, values::SAT, ip::IP, ξ::Vec) where {IP <: Interpolation, SAT <: AbstractArray, GAT <: AbstractArray}
Member:

Isn't @propagate_inbounds required here (and for shape_gradients! above) if we want to avoid the bounds checks, @fredrikekre?

Member:

Yes, missed these ones. It would also be good to benchmark whether it is useful or not, I suppose.
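A quick way to check could be something along these lines (BenchmarkTools-based sketch; the interpolation and buffers are illustrative):

using BenchmarkTools, Ferrite

ip   = Lagrange{RefHexahedron, 2}()
ξ    = Vec{3}((0.1, 0.2, 0.3))
N    = zeros(getnbasefunctions(ip))
dNdξ = zeros(Vec{3, Float64}, getnbasefunctions(ip))

checked!(g, v, ip, ξ)   = shape_gradients_and_values!(g, v, ip, ξ)
unchecked!(g, v, ip, ξ) = @inbounds shape_gradients_and_values!(g, v, ip, ξ)

@btime checked!($dNdξ, $N, $ip, $ξ)    # with bounds checks
@btime unchecked!($dNdξ, $N, $ip, $ξ)  # checks elided via @propagate_inbounds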
