Batch evaluations #819
Conversation
Codecov Report

Attention: Your organization needs to install the Codecov GitHub app to enable full functionality.

```diff
@@            Coverage Diff             @@
##           master     #819      +/-   ##
==========================================
- Coverage   92.77%   92.33%   -0.44%
==========================================
  Files          33       33
  Lines        4952     4955       +3
==========================================
- Hits         4594     4575      -19
- Misses        358      380      +22
```

☔ View full report in Codecov by Sentry.
LGTM
src/interpolations.jl
Outdated
```julia
function shape_values!(values::AT, ip::IP, ξ::Vec) where {IP <: Interpolation, AT <: AbstractArray}
    @inbounds for i in 1:getnbasefunctions(ip)
```
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't there be a `checkbounds` here as default, which can be removed by calling `@inbounds shape_values!(...)`?
Suggested change:

```diff
-function shape_values!(values::AT, ip::IP, ξ::Vec) where {IP <: Interpolation, AT <: AbstractArray}
-    @inbounds for i in 1:getnbasefunctions(ip)
+@propagate_inbounds function shape_values!(values::AT, ip::IP, ξ::Vec) where {IP <: Interpolation, AT <: AbstractArray}
+    @boundscheck checkbounds(values, 1:getnbasefunctions(ip))
+    @inbounds for i in 1:getnbasefunctions(ip)
```
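For reviewers unfamiliar with the mechanics, here is a minimal, self-contained toy example of the suggested pattern (not Ferrite code; `fill_squares!` is a made-up name): the `@boundscheck` block runs on an ordinary call, but is elided when the caller opts in with `@inbounds`, because `Base.@propagate_inbounds` propagates the caller's inbounds state into the callee.

```julia
using Base: @propagate_inbounds

@propagate_inbounds function fill_squares!(out::AbstractVector{<:Integer}, n::Int)
    @boundscheck checkbounds(out, 1:n)  # runs only when bounds checking is active
    @inbounds for i in 1:n
        out[i] = i^2
    end
    return out
end

out = zeros(Int, 3)
fill_squares!(out, 3)            # bounds are checked
@inbounds fill_squares!(out, 3)  # the @boundscheck block is skipped
```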
src/interpolations.jl
Outdated
```julia
# struct TensorProductLagrange{shape, order} <: ScalarInterpolation{shape, order}
# end

# adjust_dofs_during_distribution(::TensorProductLagrange) = true
# adjust_dofs_during_distribution(::TensorProductLagrange{<:Any, 2}) = false
# adjust_dofs_during_distribution(::TensorProductLagrange{<:Any, 1}) = false

# TODO support these orderings in the dof handler
# vertexdof_indices(::TensorProductLagrange{RefLine, order}) where {order} = ((1,), (order + 1,))
# vertexdof_indices(::TensorProductLagrange{RefQuadrilateral, order}) where {order} = ((1,), (order + 1,), (order * (order + 1),), ((order + 1) * (order + 1),))
# ...

# Sum-factorized evaluation for tensor-product elements
# function shape_values!(values::AT, ip::Lagrange{RefHypercube{2}, order}, ξ::Vec{2, T}) where {T, AT <: AbstractArray, order}
#     # Auxiliary 1D interpolation
#     ip_1d = Lagrange{RefLine, order}()
#     n_basefunctions_1d = getnbasefunctions(ip_1d)
#     # Evaluate 1D
#     evals_x_1d = ntuple(i -> shape_value(ip_1d, Vec{1, T}((ξ[1],)), i), order + 1)
#     evals_y_1d = ntuple(i -> shape_value(ip_1d, Vec{1, T}((ξ[2],)), i), order + 1)
#     # Sum factorization
#     values_2d = reshape(values, (n_basefunctions_1d, n_basefunctions_1d))
#     for ix in 1:n_basefunctions_1d, iy in 1:n_basefunctions_1d
#         values_2d[ix, iy] = evals_x_1d[ix] * evals_y_1d[iy]
#     end
#     # TODO permute values_2d
# end

# function shape_gradients!(gradients::AT, ip::Lagrange{RefHypercube{2}, order}, ξ::Vec{2, T}) where {T, AT <: AbstractArray, order}
#     # Auxiliary 1D interpolation
#     ip_1d = Lagrange{RefLine, order}()
#     n_basefunctions_1d = getnbasefunctions(ip_1d)
#     # Evaluate 1D
#     evals_x_1d = ntuple(i -> shape_value(ip_1d, Vec{1, T}((ξ[1],)), i), order + 1)
#     evals_y_1d = ntuple(i -> shape_value(ip_1d, Vec{1, T}((ξ[2],)), i), order + 1)
#     evals_dx_1d = ntuple(i -> shape_gradient(ip_1d, Vec{1, T}((ξ[1],)), i), order + 1)
#     evals_dy_1d = ntuple(i -> shape_gradient(ip_1d, Vec{1, T}((ξ[2],)), i), order + 1)
#     # Sum factorization
#     gradients_2d = reshape(gradients, (n_basefunctions_1d, n_basefunctions_1d, 2))
#     for ix in 1:n_basefunctions_1d, iy in 1:n_basefunctions_1d
#         gradients_2d[ix, iy, 1] = ...
#         gradients_2d[ix, iy, 2] = ...
#     end
#     # TODO permute gradients_2d
# end

# function shape_gradients_and_values!(gradients::GAT, values::SAT, ip::Lagrange{RefHypercube{2}, order}, ξ::Vec{2, T}) where {T, SAT <: AbstractArray, GAT <: AbstractArray, order}
#     # Auxiliary 1D interpolation
#     ip_1d = Lagrange{RefLine, order}()
#     n_basefunctions_1d = getnbasefunctions(ip_1d)
#     # Evaluate 1D
#     evals_x_1d = ntuple(i -> shape_value(ip_1d, Vec{1, T}((ξ[1],)), i), order + 1)
#     evals_y_1d = ntuple(i -> shape_value(ip_1d, Vec{1, T}((ξ[2],)), i), order + 1)
#     evals_dx_1d = ntuple(i -> shape_gradient(ip_1d, Vec{1, T}((ξ[1],)), i), order + 1)
#     evals_dy_1d = ntuple(i -> shape_gradient(ip_1d, Vec{1, T}((ξ[2],)), i), order + 1)
#     # Sum factorization
#     values_2d = reshape(values, (n_basefunctions_1d, n_basefunctions_1d))
#     gradients_2d = reshape(gradients, (n_basefunctions_1d, n_basefunctions_1d, 2))
#     for ix in 1:n_basefunctions_1d, iy in 1:n_basefunctions_1d
#         values_2d[ix, iy] = evals_x_1d[ix] * evals_y_1d[iy]
#         gradients_2d[ix, iy, 1] = ...
#         gradients_2d[ix, iy, 2] = ...
#     end
#     # TODO permute values_2d and gradients_2d
# end

# TODO implement kernels above for 3D
# TODO implement vectorized interpolation case
```
Should this be removed before merge? Seems better to add it as a comment to #389?
I think so. This is basically a template to guide how the evals might be implemented in #790.
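Since the template above elides the actual kernels, here is a minimal, self-contained sketch of the sum-factorization idea for 2D tensor-product Lagrange values, independent of Ferrite's API. `lagrange_1d` and `tensor_product_values!` are hypothetical helpers on equispaced nodes, and the node-ordering permutation (the TODOs above) is ignored:

```julia
# Evaluate all (order + 1) 1D Lagrange basis functions on equispaced
# nodes in [-1, 1] at the point x.
function lagrange_1d(order::Integer, x::T) where {T <: Real}
    nodes = range(-one(T), one(T); length = order + 1)
    vals = ones(T, order + 1)
    for i in 1:(order + 1), j in 1:(order + 1)
        i == j && continue
        vals[i] *= (x - nodes[j]) / (nodes[i] - nodes[j])
    end
    return vals
end

# Fill `values` (length (order + 1)^2) with the 2D tensor-product basis
# values at (ξx, ξy): two 1D evaluations and an outer product instead of
# (order + 1)^2 independent 2D evaluations.
function tensor_product_values!(values::AbstractVector, order::Integer, ξx::Real, ξy::Real)
    nx = lagrange_1d(order, ξx)
    ny = lagrange_1d(order, ξy)
    n = order + 1
    values_2d = reshape(values, (n, n))
    for iy in 1:n, ix in 1:n
        values_2d[ix, iy] = nx[ix] * ny[iy]  # sum factorization: outer product
    end
    return values
end

values = zeros(9)
tensor_product_values!(values, 2, 0.25, -0.5)  # quadratic basis at ξ = (0.25, -0.5)
```

Roughly speaking, for order p in d dimensions this needs d·(p + 1) 1D evaluations plus an outer product, rather than (p + 1)^d independent multivariate evaluations.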
src/PointEval/point_values.jl
Outdated
```diff
     ip::IP
 end

 function PointValuesInternal(ξ::Vec{dim, T}, ip::IP) where {dim, T, shape <: AbstractRefShape{dim}, IP <: Interpolation{shape}}
     n_func_basefuncs = getnbasefunctions(ip)
-    N = [shape_value(ip, ξ, i) for i in 1:n_func_basefuncs]
+    N = MVector(ntuple(i -> shape_value(ip, ξ, i), n_func_basefuncs))
     return PointValuesInternal{IP, eltype(N)}(N, ip)
 end
```
Is there any performance benefit of using `MVector` here? (Nice to document why, since we don't use it elsewhere.)
Not sure. I am just trying to get stuff ready for GPU mode. :)
OK, would it make sense to leave that for a separate issue then, instead of hard-coding a different vector type? My impression is that other efforts (at least those I'm involved in) are more about allowing GPU-compatible arrays rather than enforcing them.
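As a hedged illustration of the trade-off under discussion (not code from this PR; `basefun` is a stand-in for `shape_value(ip, ξ, i)`):

```julia
using StaticArrays

basefun(i) = Float64(i)^2  # stand-in for shape_value(ip, ξ, i)

# A comprehension builds a heap-allocated Vector whose length is only
# known at runtime:
as_vector = [basefun(i) for i in 1:4]

# An MVector built from an ntuple with a Val length carries its size in
# the type, so it is a fixed-size container that can stay on the stack
# when it does not escape — which is what matters for GPU kernels:
as_mvector = MVector(ntuple(basefun, Val(4)))
```

Whether this pays off here is exactly the benchmark/documentation question raised above.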
These functions are really nice!
So this one precedes #764?
If this could be merged quickly, then yes. (See Ferrite.jl/src/FEValues/FunctionValues.jl, lines 65 to 72 at 72f1ea2.)
Seeing this, I'll at least use this internally, and potentially an even better option would be to extend to …
The purpose of this PR is also more to open up the discussion for the changes, so I don't end up with a 3k-line diff for the GPU PR. Even if large parts are reverted, I am fine with it, no worries. :)
Out of curiosity, do you have a use case where we need such large batches? But yes, we can easily extend this one here to the full quadrature batch.
Nah, I was perhaps overly enthusiastic at first :) (because it is only used in one location, so it would just be moving code around for now). I think just having these should suffice, and I'll use views in …

One case I thought of, for future work, could be to actually define …
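A hedged sketch of the "use views" idea: preallocate the shape values for all quadrature points in one matrix and fill it column-by-column with the batched kernel. `shape_values!` is the internal function from this PR (signature taken from the diff above); the setup around it uses Ferrite's public API but is illustrative only:

```julia
using Ferrite

ip = Lagrange{RefQuadrilateral, 1}()
qr = QuadratureRule{RefQuadrilateral}(2)
n  = getnbasefunctions(ip)

# One column of shape values per quadrature point.
Nξ = zeros(Float64, n, getnquadpoints(qr))

for (qp, ξ) in enumerate(Ferrite.getpoints(qr))
    Ferrite.shape_values!(view(Nξ, :, qp), ip, ξ)  # one batched call per point
end
```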
I added the bounds checks and removed the extra stuff. I backed up the PR on the branch …
```julia
Evaluate all shape function gradients and values of `ip` at once at the reference point `ξ`
and store them in `gradients` and `values`.
"""
function shape_gradients_and_values!(gradients::GAT, values::SAT, ip::IP, ξ::Vec) where {IP <: Interpolation, SAT <: AbstractArray, GAT <: AbstractArray}
```
Isn't `@propagate_inbounds` required here (and for `shape_gradients!` above) if we want to avoid the bounds checks, @fredrikekre?
Yes, missed these ones. It would also be good to benchmark whether it is useful or not, I suppose.
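For reference, a hedged usage sketch of the function above (an internal function, so the exact name and argument order are taken from the diff in this thread), showing a checked call and an opted-in call once `@propagate_inbounds` is in place:

```julia
using Ferrite  # Vec is re-exported from Tensors.jl

ip = Lagrange{RefQuadrilateral, 1}()
n  = getnbasefunctions(ip)
ξ  = Vec{2}((0.0, 0.0))

N    = zeros(Float64, n)          # shape values
dNdξ = zeros(Vec{2, Float64}, n)  # shape gradients

Ferrite.shape_gradients_and_values!(dNdξ, N, ip, ξ)            # bounds checked
@inbounds Ferrite.shape_gradients_and_values!(dNdξ, N, ip, ξ)  # checks elided
```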
This PR introduces internal batched evaluations where possible, which is interesting for matrix-free and high-order methods (and maybe threading?). We might want to expose the functionality to users in #766. It should also help with evaluation times in #790. In interpolations.jl there is an outline for #389.
Due to this interaction, this PR might precede #790 and #798. Furthermore, it is part of the infrastructure for #766.
TODOs