Update algorithm.md - Made Tim's Requested Changes
Aero-Spec authored Sep 22, 2024
1 parent 4678dc0 commit b4b08b0
Showing 1 changed file with 25 additions and 111 deletions.
136 changes: 25 additions & 111 deletions docs/src/algorithm.md
@@ -1,45 +1,50 @@
# Algorithm

The divided rectangles algorithm, or DIRECT (for DIvided RECTangles), incrementally refines a retangular partition of the design space. The refinement is driven a heuristic that involves reasoning about potential Lipschitz constants.
The divided rectangles algorithm, or DIRECT (for DIvided RECTangles), incrementally refines a rectangular partition of the design space. The refinement is driven by a heuristic that involves reasoning about potential Lipschitz constants.

The strength of the DIRECT algorithm lies in its ability to systematically explore the entire search space while focusing on the most promising areas. This systematic coverage helps the algorithm escape local minima, making it particularly effective for objective functions with multiple local minima.

Additionally, by not requiring the Lipschitz constant, the DIRECT algorithm is adaptable to various optimization problems, including those where the smoothness of the objective function is not well understood.

---

- The figure below shows the DIRECT method after 16 iterations on the Branin function. The cells are much denser around the minima of the Branin function because the DIRECT method is designed to increase its resolution in promising regions.

![page_11](https://github.com/user-attachments/assets/b833bedd-41aa-40c5-a27f-26188a171797)
---
### Key Concepts of the DIRECT Algorithm

1. **Division of Search Space**:
1. **Search Space**:

- The algorithm begins by treating the entire feasible region as a single hyper-rectangle.
- The algorithm minimizes an objective function $f(x)$ over a hyper-rectangular search space.
- The search space is normalized to the unit hypercube to avoid oversensitivity to dimensions with larger domains. If minimizing $f(x)$ between lower and upper bounds $a$ and $b$, DIRECT will instead minimize the following function, where $\odot$ denotes the elementwise product:

```math
g(\mathbf{x}) = f(\mathbf{x} \odot (\mathbf{b} - \mathbf{a}) + \mathbf{a})
```

After finding the minimum $x^*$ of $g$, the minimum of $f$ is
After finding the minimizer $x^*$ of $g$, the minimizer of $f$ is

```math
\mathbf{x}^* \odot (\mathbf{b} - \mathbf{a}) + \mathbf{a}
```
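
The normalization above can be sketched in a few lines of Julia. This is a minimal illustration rather than the package's implementation: the helper names `from_unit`, `to_unit`, and `normalize_objective`, as well as the example objective and bounds, are assumptions made for this sketch.

```julia
# Map a point u in the unit hypercube [0, 1]^n back to the original box [a, b].
from_unit(u, a, b) = u .* (b .- a) .+ a

# Map a point x in [a, b] into the unit hypercube (inverse of from_unit).
to_unit(x, a, b) = (x .- a) ./ (b .- a)

# Wrap f so that it can be minimized over the unit hypercube instead.
normalize_objective(f, a, b) = u -> f(from_unit(u, a, b))

# Illustrative objective and bounds (assumptions, not taken from the text).
f(x) = (x[1] - 1.0)^2 + (x[2] + 2.0)^2
a = [-5.0, -10.0]
b = [5.0, 10.0]

g = normalize_objective(f, a, b)
u_star = to_unit([1.0, -2.0], a, b)   # known minimizer of f, in unit coordinates
x_star = from_unit(u_star, a, b)      # mapped back to the original coordinates
```

Here `g` plays the role of $g$ in the formula above, and `from_unit(u_star, a, b)` corresponds to $\mathbf{x}^* \odot (\mathbf{b} - \mathbf{a}) + \mathbf{a}$.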
---

- The figure below shows DIRECT method after 16 iterations on the Branin function. The cells are much denser around the minima of the Branin function because the DIRECT method is designed to increase resolution in promising regions.

![page_11](https://github.com/user-attachments/assets/b833bedd-41aa-40c5-a27f-26188a171797)
2. **Function Evaluation**:
- DIRECT partitions its search space into hyperrectangular intervals.
- The objective function is evaluated at the center of each hyper-rectangle.
- Each interval has a center $c^{(i)}$, an associated objective function value $f(c^{(i)})$, and a radius $r^{(i)}$. The radius is the distance from the center to a vertex.

---
3. **Selection of Potentially Optimal Rectangles**:
- In each iteration, the algorithm identifies potentially optimal rectangles. A rectangle is considered potentially optimal if it could contain the global minimum based on the evaluations performed so far.

2. **Function Evaluation**:
- The function is evaluated at the center of each hyper-rectangle.
- Each interval has a center $c^{(i)}$ and an associated objective function value $f(c^{(i)})$, as well as a radius $r^{(i)}$, which is the distance from the center to a vertex.
### Lipschitz Lower Bound:

3. **Selection of Potentially Optimal Rectangles**:
- After evaluation, the algorithm identifies potentially optimal rectangles. A rectangle is considered potentially optimal if it could contain the global minimum based on the evaluations performed so far.
- The Lipschitz lower bound for an interval is a circular cone extending downward from its center $c^{(i)}$.

4. **Lipschitz Lower Bound**:
- The Lipschitz lower bound for an interval is a circular cone extending downward from its center $c^{(i)}$

```math
f(\mathbf{x}) \geq f(\mathbf{c}^{(i)}) - \ell \|\mathbf{x} - \mathbf{c}^{(i)}\|_2
```
```math
f(\mathbf{x}) \geq f(\mathbf{c}^{(i)}) - \ell \|\mathbf{x} - \mathbf{c}^{(i)}\|_2
```
- This lower bound is constrained by the extents of the interval, and its lowest value is achieved at the vertices, which are all a distance $r^{(i)}$ from the center.
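
As a rough sketch of how this bound can be used, the snippet below evaluates the lowest value of the cone, $f(c^{(i)}) - \ell r^{(i)}$, for a few intervals and several candidate values of $\ell$. The interval data and the helper names `lower_bound` and `best_interval` are hypothetical and chosen only for illustration. The sketch assumes that an interval is worth refining when it has the smallest lower bound for some value of $\ell$.

```julia
# Lowest value of the Lipschitz cone inside an interval: it is reached at the
# vertices, a distance r from the center, so the bound is y - ℓ * r.
lower_bound(y, r, ℓ) = y - ℓ * r

# Hypothetical intervals, each with a center value y and a radius r.
intervals = [(y = 1.0, r = 0.1), (y = 2.0, r = 0.5), (y = 5.0, r = 0.7)]

# Index of the interval with the smallest lower bound for a given ℓ.
best_interval(intervals, ℓ) = argmin([lower_bound(I.y, I.r, ℓ) for I in intervals])

i_small = best_interval(intervals, 1.0)    # a small ℓ favors low center values
i_mid   = best_interval(intervals, 5.0)    # an intermediate ℓ balances both
i_large = best_interval(intervals, 20.0)   # a large ℓ favors large radii
```

With these made-up numbers, each of the three intervals is the best candidate for some $\ell$, which is why the true Lipschitz constant does not need to be known in advance.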

```math
@@ -78,7 +83,7 @@ f(c^{(i)}) - \ell r^{(i)}
## Splitting Intervals

When splitting a region without equal side lengths, only the longest dimensions are split. Splitting proceeds on these dimensions in the same manner as with a hypercube. The width in a given dimension depends on how many times that dimension has been split. Since DIRECT always splits axis directions by thirds, a dimension
that has been split d times will have a width of $3^−d$. If we have $n$ dimensions and track how many times each dimension of a given interval has been split in a vector $d$, then the radius of that interval is
that has been split $d$ times will have a width of $3^{-d}$. If we have $n$ dimensions and track how many times each dimension of a given interval has been split in a vector $d$, then the radius of that interval is

```math
r = \left\|\left[ \frac{1}{2} \cdot 3^{-d_1}, \dots, \frac{1}{2} \cdot 3^{-d_n} \right]\right\|_2
@@ -92,94 +97,3 @@ r = \left\|\left[ \frac{1}{2} \cdot 3^{-d_1}, \dots, \frac{1}{2} \cdot 3^{-d_n} \right]\right\|_2

![page_17](https://github.com/user-attachments/assets/99caea66-02b5-4371-90e2-69305c035ddf)
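
The radius computation described in this section can be sketched as follows. The function names `interval_widths` and `interval_radius` are assumptions for this example, and the radius is taken to be the norm of the half-widths $\frac{1}{2} \cdot 3^{-d_i}$ in the unit hypercube.

```julia
using LinearAlgebra: norm

# Width of each side of an interval in the unit hypercube, given split counts d.
interval_widths(d) = 3.0 .^ (-d)

# Radius: distance from the center to a vertex, i.e. the norm of the half-widths.
interval_radius(d) = norm(0.5 .* interval_widths(d))

r_unit = interval_radius([0, 0])   # unsplit unit square
r_once = interval_radius([1, 0])   # first dimension split once
r_cube = interval_radius([1, 1])   # both dimensions split once
```

Each additional split shrinks one side by a factor of three, so the radius decreases as the entries of `d` grow.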


---
## Practical Implementations:

- Struct `DirectRectangle`:

```julia
struct DirectRectangle
    c # center point
    y # center point value
    d # number of divisions per dimension
    r # the radius of the interval
end
```

- `direct` Function:

```julia
function direct(f, a, b, k_max, r_min)
    g = x -> f(x .* (b - a) + a) # evaluate within unit hypercube
    n = length(a)
    c = fill(0.5, n)
    # Radius of the unit hypercube: distance from the center to a vertex, 0.5 * sqrt(n)
    □s = [DirectRectangle(c, g(c), fill(0, n), 0.5 * sqrt(n))]
    c_best = c
    for k in 1 : k_max
        □s_split = get_split_intervals(□s, r_min)
        setdiff!(□s, □s_split)
        for □_split in □s_split
            append!(□s, split_interval(□_split, g))
        end
        c_best = □s[findmin(□.y for □ in □s)[2]].c
    end
    return c_best .* (b - a) + a # from unit hypercube
end
```
- `is_ccw` Function:

```julia
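# Signed-area test on the points (r, y) of three intervals. Despite the name, this
# returns true when a -> b -> c is not a strict counterclockwise turn (within tolerance).
function is_ccw(a, b, c)
    return a.r * (b.y - c.y) - a.y * (b.r - c.r) + (b.r * c.y - b.y * c.r) < 1e-6
end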
```
- `get_split_intervals` Function:

```julia
function get_split_intervals(□s, r_min)
    hull = DirectRectangle[]
    # Sort the rectangles by increasing r, then by increasing y
    sort!(□s, by = □ -> (□.r, □.y))
    for □ in □s
        if length(hull) >= 1 && □.r == hull[end].r
            continue # Repeated r values cannot be improvements
        end
        if length(hull) >= 1 && □.y <= hull[end].y
            pop!(hull) # Remove the last point if the new one is better
        end
        if length(hull) >= 2 && is_ccw(hull[end-1], hull[end], □)
            pop!(hull)
        end
        push!(hull, □)
    end
    filter!(□ -> □.r >= r_min, hull) # Only split intervals larger than the minimum radius
    return hull
end
```
- `split_interval` Function:

```julia
using LinearAlgebra: norm # `norm` is used for the interval radius

# i-th standard basis vector of length n (helper assumed here; not defined in this file)
basis(i, n) = [k == i ? 1.0 : 0.0 for k in 1:n]

function split_interval(□, g)
    c, n, d_min, d = □.c, length(□.c), minimum(□.d), copy(□.d)
    dirs, δ = findall(d .== d_min), 3.0^(-d_min-1)
    # Sample one point on each side of the center along every longest dimension
    Cs = [(c + δ*basis(i, n), c - δ*basis(i, n)) for i in dirs]
    Ys = [(g(C[1]), g(C[2])) for C in Cs]
    minvals = [min(Y[1], Y[2]) for Y in Ys]
    □s = DirectRectangle[]
    for j in sortperm(minvals)
        d[dirs[j]] += 1 # Increment the number of splits
        C, Y, r = Cs[j], Ys[j], norm(0.5 * 3.0 .^ (-d))
        push!(□s, DirectRectangle(C[1], Y[1], copy(d), r))
        push!(□s, DirectRectangle(C[2], Y[2], copy(d), r))
    end
    r = norm(0.5 * 3.0 .^ (-d))
    push!(□s, DirectRectangle(c, □.y, d, r))
    return □s
end
```
---

### Strengths of the DIRECT Algorithm
The strength of the DIRECT algorithm lies in its ability to systematically explore the entire search space while focusing on the most promising areas. This systematic coverage helps the algorithm escape local minima, making it particularly effective for objective functions with multiple local minima.
By not requiring the Lipschitz constant, the DIRECT algorithm is adaptable to various optimization problems, including those where the smoothness of the objective function is not well understood.
