[docs] completing pages
pmartorell committed May 10, 2024
1 parent 45ac51f commit 7a5024e
Showing 2 changed files with 139 additions and 18 deletions.
106 changes: 90 additions & 16 deletions docs/src/distributed.md
# Distributed usage

The STLCutters package provides both multi-threading and distributed memory computing implementations. Multi-threading is transparent to the user, whereas setting up distributed memory computations requires some extra work at the driver level.

Here, we provide an example of a distributed computation with STLCutters.

## Introduction

When dealing with large-scale problems, this package can be accelerated through two types of parallelization. The first one is multi-threading, which uses [Julia Threads](https://docs.julialang.org/en/v1/base/multi-threading/) for shared memory parallelization (e.g., `julia -t 4`). This method adds some speed-up, but it is only efficient for a reduced number of threads.

The second one is distributed memory computing. For such parallelization, we use MPI (`mpiexec -np 4 julia input.jl`) through [`PartitionedArrays`](https://www.francescverdugo.com/PartitionedArrays.jl/stable) and [`GridapDistributed`](https://gridap.github.io/GridapDistributed.jl/dev/). With MPI, we expect to efficiently compute large-scale problems on up to thousands of cores.

Additionally, the distributed memory implementation is built on top of [GridapEmbedded](https://github.com/gridap/GridapEmbedded.jl), which has provided parallelization tools since [v0.9.2](https://github.com/gridap/GridapEmbedded.jl/releases/tag/v0.9.2).


## Underlying distributed algorithms

Multi-threading is applied straightforwardly to the embarrassingly parallel loops. The distributed memory implementation, however, requires more involved algorithms for the global computations, e.g., the classification of the background cells as inside or outside (a toy sketch of this idea is given below). These algorithms are described in Chapter 4 of the following PhD thesis.

> Pere A. Martorell. "Unfitted finite element methods for explicit boundary representations". PhD Thesis. Universitat Politècnica de Catalunya. 2024. [hdl.handle.net/10803/690625](https://www.tdx.cat/handle/10803/690625)
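
To give an intuition only, the following is a toy serial sketch of the cell classification idea, not the algorithm of the package (which is described in the thesis above): cells with a known state seed a propagation over the background grid, cut cells act as barriers, and in a distributed run the propagated state must additionally be exchanged across the parts until it is globally consistent.

```julia
# Toy sketch of inside/outside classification on a 2D background grid (serial).
# NOT the package's implementation; it only illustrates the propagation idea.
const UNSET, IN, OUT, CUT = 0, 1, 2, 3

function classify!(cell_state::Matrix{Int})
  # Flood-fill OUT from cells already known to be outside; CUT cells are barriers.
  queue = [I for I in CartesianIndices(cell_state) if cell_state[I] == OUT]
  while !isempty(queue)
    I = popfirst!(queue)
    for d in (CartesianIndex(1,0),CartesianIndex(-1,0),
              CartesianIndex(0,1),CartesianIndex(0,-1))
      J = I + d
      if checkbounds(Bool,cell_state,J) && cell_state[J] == UNSET
        cell_state[J] = OUT
        push!(queue,J)
      end
    end
  end
  # Cells never reached from the exterior lie inside the geometry.
  replace!(cell_state, UNSET => IN)
end

# Example: a ring of cut cells enclosing one interior cell on a 5×5 grid.
cell_state = fill(UNSET,5,5)
cell_state[2:4,2] .= CUT; cell_state[2:4,4] .= CUT
cell_state[2,3] = CUT; cell_state[4,3] = CUT
cell_state[1,1] = OUT             # touches the domain boundary, known to be outside
classify!(cell_state)             # cell (3,3) ends up IN, the rest OUT or CUT
```
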
## Multi-threading usage

The usage of this package with multi-threading is the same as for the serial case (see [Serial example](@ref)). The user only needs to set the number of threads when initializing Julia. E.g.,

```bash
julia --threads 4
```
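
Once Julia has started, the number of threads actually available can be checked with:

```julia
Threads.nthreads()  # returns 4 when Julia was started with `julia --threads 4`
```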

## Distributed memory usage

Distributed usage needs to be set up at the driver level: the user must install and load `MPI.jl`, `PartitionedArrays.jl`, and `GridapDistributed.jl`.
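
For instance, these dependencies can be installed once through the Julia package manager:

```julia
using Pkg
Pkg.add(["MPI","PartitionedArrays","GridapDistributed"])
```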

Here, we provide a basic example of solving the Poisson equation with distributed STLCutters using 8 MPI tasks. Run the following command.

```bash
mpiexec -np 8 julia poisson.jl
```

Here, `poisson.jl` contains the following code.

```julia
using Gridap        # added so the snippet is self-contained (FE spaces, measures, solver)
using STLCutters
using GridapEmbedded
using GridapDistributed
using PartitionedArrays
using MPI

parts = (2,2,2)
cells = (10,10,10)
filename = "stl_file_path.stl"

with_mpi() do distribute
  ranks = distribute(LinearIndices((prod(parts),)))
  # Domain and discretization
  geo = STLGeometry(filename)
  pmin,pmax = get_bounding_box(geo)
  model = CartesianDiscreteModel(ranks,parts,pmin,pmax,cells)
  cutgeo = cut(model,geo)
  # Cell aggregation
  model,cutgeo,aggregates = aggregate(AggregateAllCutCells(),cutgeo)
  # Triangulations
  Ω_act = Triangulation(cutgeo,ACTIVE)
  Ω = Triangulation(cutgeo)
  Γ = EmbeddedBoundary(cutgeo)
  nΓ = get_normal_vector(Γ)
  dΩ = Measure(Ω,2)
  dΓ = Measure(Γ,2)
  # FE spaces
  Vstd = TestFESpace(Ω_act,ReferenceFE(lagrangian,Float64,1))
  V = AgFEMSpace(Vstd,aggregates)
  U = TrialFESpace(V)
  # Weak form
  γ = 10.0
  h = (pmax - pmin)[1] / cells[1]
  ud(x) = x[1] - x[2]
  f = 0
  a(u,v) =
    ∫( ∇(v)⋅∇(u) )dΩ +
    ∫( (γ/h)*v*u - v*(nΓ⋅∇(u)) - (nΓ⋅∇(v))*u )dΓ
  l(v) =
    ∫( v*f )dΩ +
    ∫( (γ/h)*v*ud - (nΓ⋅∇(v))*ud )dΓ
  # Solve
  op = AffineFEOperator(a,l,U,V)
  uh = solve(op)
  writevtk(Ω,"results",cellfields=["uh"=>uh])
end
```
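
For reference, the forms `a` and `l` assembled in the driver impose the Dirichlet condition `ud` weakly on the embedded boundary via Nitsche's method; written out, they read

```math
a(u,v) = \int_\Omega \nabla v \cdot \nabla u \,\mathrm{d}\Omega
+ \int_\Gamma \left( \frac{\gamma}{h}\, v\, u - v\,(n_\Gamma \cdot \nabla u) - (n_\Gamma \cdot \nabla v)\, u \right) \mathrm{d}\Gamma ,
```

```math
l(v) = \int_\Omega v\, f \,\mathrm{d}\Omega
+ \int_\Gamma \left( \frac{\gamma}{h}\, v\, u_d - (n_\Gamma \cdot \nabla v)\, u_d \right) \mathrm{d}\Gamma .
```
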
!!! note
    The STL file can be downloaded using [`download_thingi10k`](@ref).
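
For example, assuming [`download_thingi10k`](@ref) takes a Thingi10K file id and returns the path of the downloaded STL (check its docstring for the exact signature), the geometry could be set up as follows; the id is an arbitrary illustration.

```julia
# Assumption: `download_thingi10k` accepts a Thingi10K file id and returns a local path.
filename = download_thingi10k(293137)  # arbitrary example id
geo = STLGeometry(filename)
```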

!!! note
    It is recommended to use `with_debug()` instead of `with_mpi()` for debugging in serialized execution; see more details in [`PartitionedArrays`](https://www.francescverdugo.com/PartitionedArrays.jl/stable).
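
A minimal sketch of such a debug run, assuming the same driver body as above, looks as follows; it runs with a plain `julia poisson.jl`, without `mpiexec`.

```julia
using PartitionedArrays

parts = (2,2,2)
with_debug() do distribute
  # The "parallel" data is emulated serially, so the code can be stepped through
  # in a regular Julia session.
  ranks = distribute(LinearIndices((prod(parts),)))
  # ... same body as in the MPI driver above ...
end
```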

!!! note
    One can consider a different stabilization of the small cut-cell problem instead of AgFEM. In that case, the `aggregate` and `AgFEMSpace` calls need to be removed.
    See more examples in [`GridapEmbedded`](https://github.com/gridap/GridapEmbedded.jl).
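
As a minimal sketch of that variant (the additional stabilization term itself is not shown and would have to be added to the weak form):

```julia
# Without AgFEM: skip `aggregate` and `AgFEMSpace` and build the FE space
# directly on the active triangulation; a small-cut stabilization (e.g. a
# ghost-penalty term) must then be added to the weak form.
V = TestFESpace(Ω_act,ReferenceFE(lagrangian,Float64,1))
U = TrialFESpace(V)
```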


!!! warning
    Even though the distributed algorithms have proven efficient in large-scale weak scaling tests [Martorell, 2024](https://www.tdx.cat/handle/10803/690625), the performance of this implementation has not been tested yet.



51 changes: 49 additions & 2 deletions docs/src/usage.md

Like in GridapEmbedded, we extract the embedded triangulations as follows.

```julia
Λ = SkeletonTriangulation(cutgeo)
```
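
For instance, an extracted triangulation can be written to a VTK file for inspection (here assuming `Λ` from the snippet above is in scope):

```julia
writevtk(Λ,"skeleton")  # open e.g. in ParaView
```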

## Serial example

Now, we provide an example of solving a Poisson problem on the embedded domain.

```julia
using Gridap        # added so the snippet is self-contained (FE spaces, measures, solver)
using STLCutters
using GridapEmbedded
cells = (10,10,10)
filename = "stl_file_path.stl"
# Domain and discretization
geo = STLGeometry(filename)
pmin,pmax = get_bounding_box(geo)
model = CartesianDiscreteModel(pmin,pmax,cells)
cutgeo = cut(model,geo)
# Cell aggregation
aggregates = aggregate(AggregateAllCutCells(),cutgeo)
# Triangulations
Ω_act = Triangulation(cutgeo,ACTIVE)
Ω = Triangulation(cutgeo)
Γ = EmbeddedBoundary(cutgeo)
nΓ = get_normal_vector(Γ)
dΩ = Measure(Ω,2)
dΓ = Measure(Γ,2)
# FE spaces
Vstd = TestFESpace(Ω_act,ReferenceFE(lagrangian,Float64,1))
V = AgFEMSpace(Vstd,aggregates)
U = TrialFESpace(V)
# Weak form
γ = 10.0
h = (pmax - pmin)[1] / cells[1]
ud(x) = x[1] - x[2]
f = 0
a(u,v) =
  ∫( ∇(v)⋅∇(u) )dΩ +
  ∫( (γ/h)*v*u - v*(nΓ⋅∇(u)) - (nΓ⋅∇(v))*u )dΓ
l(v) =
  ∫( v*f )dΩ +
  ∫( (γ/h)*v*ud - (nΓ⋅∇(v))*ud )dΓ
# Solve
op = AffineFEOperator(a,l,U,V)
uh = solve(op)
writevtk(Ω,"results",cellfields=["uh"=>uh])
```

!!! note
    The STL file can be downloaded using [`download_thingi10k`](@ref).

!!! note
    One can consider a different stabilization of the small cut-cell problem instead of AgFEM. In that case, the `aggregate` and `AgFEMSpace` calls need to be removed.
    See more examples in [`GridapEmbedded`](https://github.com/gridap/GridapEmbedded.jl).
