Commit: cleanup

CarloLucibello committed Apr 4, 2024
1 parent 9a28998 commit 8d35864
Showing 5 changed files with 6 additions and 2,366 deletions.
docs/src/training/optimisers.md: 2 changes (1 addition & 1 deletion)
@@ -41,7 +41,7 @@ opt = OptimiserChain(WeightDecay(1e-4), Descent())
```

Here we apply the weight decay to the `Descent` optimiser.
-The resultin optimser `opt` can be used as any optimiser.
+The resulting optimiser `opt` can be used as any optimiser.

```julia
w = [randn(10, 10), randn(10, 10)]
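
As a self-contained sketch of the sentence above, assuming the explicit-style `setup`/`update!` API (the model, data and learning rate here are illustrative, not from the page):

```julia
using Flux

model = Dense(10 => 2)
opt = OptimiserChain(WeightDecay(1e-4), Descent(0.1))

# The chain behaves like any single optimiser: set it up once, then update as usual.
opt_state = Flux.setup(opt, model)

x, y = rand(Float32, 10, 8), rand(Float32, 2, 8)
grads = Flux.gradient(m -> Flux.mse(m(x), y), model)
Flux.update!(opt_state, model, grads[1])
```
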
docs/src/training/reference.md: 2 changes (0 additions & 2 deletions)
@@ -60,8 +60,6 @@ See the [Optimisers documentation](https://fluxml.ai/Optimisers.jl/dev/) for det

```@docs
Flux.params
-Flux.update!(opt::Flux.Optimise.AbstractOptimiser, xs::AbstractArray, gs)
-Flux.train!(loss, ps::Flux.Params, data, opt::Flux.Optimise.AbstractOptimiser; cb)
```

## Callbacks
docs/src/training/training.md: 10 changes (5 additions & 5 deletions)
@@ -117,13 +117,13 @@ fmap(model, grads[1]) do p, g
end
```

-A slightly more refined version of this loop to update all the parameters is wrapped up as a function [`update!`](@ref Flux.Optimise.update!)`(opt_state, model, grads[1])`.
-And the learning rate is the only thing stored in the [`Descent`](@ref Flux.Optimise.Descent) struct.
+A slightly more refined version of this loop to update all the parameters is wrapped up as a function [`update!`](@ref)`(opt_state, model, grads[1])`.
+And the learning rate is the only thing stored in the [`Descent`](@ref) struct.

However, there are many other optimisation rules, which adjust the step size and
direction in various clever ways.
Most require some memory of the gradients from earlier steps, rather than always
-walking straight downhill -- [`Momentum`](@ref Flux.Optimise.Momentum) is the simplest.
+walking straight downhill -- [`Momentum`](@ref) is the simplest.
The function [`setup`](@ref Flux.Train.setup) creates the necessary storage for this, for a particular model.
It should be called once, before training, and returns a tree-like object which is the
first argument of `update!`. Like this:
@@ -140,7 +140,7 @@ for data in train_set
end
```
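
To make the `setup`/`update!` pattern above concrete, here is a self-contained sketch using `Momentum`; the model, data and loss are illustrative placeholders, not part of the original page:

```julia
using Flux

model = Chain(Dense(2 => 8, relu), Dense(8 => 1))
x, y = rand(Float32, 2, 32), rand(Float32, 1, 32)

# `setup` is called once, before training, and returns a tree of optimiser state
# mirroring the model; for Momentum each leaf also stores a velocity buffer.
opt_state = Flux.setup(Momentum(0.01, 0.9), model)

# One training step: compute gradients, then mutate model and state in place.
grads = Flux.gradient(m -> Flux.mse(m(x), y), model)
Flux.update!(opt_state, model, grads[1])
```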

-Many commonly-used optimisation rules, such as [`Adam`](@ref Flux.Optimise.Adam), are built-in.
+Many commonly-used optimisation rules, such as [`Adam`](@ref), are built-in.
These are listed on the [optimisers](@ref man-optimisers) page.
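
A hedged sketch of swapping in `Adam`: the model, data and the use of the explicit-style `Flux.train!` below are illustrative choices, not taken from the page.

```julia
using Flux

model = Chain(Dense(2 => 8, relu), Dense(8 => 1))
train_set = [(rand(Float32, 2, 32), rand(Float32, 1, 32)) for _ in 1:10]

# Only the rule passed to `setup` changes; the update loop stays the same.
opt_state = Flux.setup(Adam(3e-4), model)

# One pass over the data; the loss is called as loss(model, x, y) for each batch.
Flux.train!((m, x, y) -> Flux.mse(m(x), y), model, train_set, opt_state)
```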

!!! compat "Implicit-style optimiser state"
@@ -325,7 +325,7 @@ After that, in either case, [`Adam`](@ref Flux.Adam) computes the final update.
The same trick works for *L₁ regularisation* (also called Lasso), where the penalty is
`pen_l1(x::AbstractArray) = sum(abs, x)` instead. This is implemented by `SignDecay(0.42)`.

-The same `OptimiserChain` mechanism can be used for other purposes, such as gradient clipping with [`ClipGrad`](@ref Flux.Optimise.ClipValue) or [`ClipNorm`](@ref Flux.Optimise.ClipNorm).
+The same `OptimiserChain` mechanism can be used for other purposes, such as gradient clipping with [`ClipGrad`](@ref) or [`ClipNorm`](@ref).
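
A sketch combining the two ideas above, with illustrative values: the L₁ penalty (`SignDecay`) and gradient clipping (`ClipNorm`) are chained in front of the rule that takes the step.

```julia
using Flux

model = Dense(10 => 2)

# Rules in an OptimiserChain transform the gradient in order, left to right,
# so here the sign-decay penalty and norm clipping run before Adam's step.
opt = OptimiserChain(SignDecay(0.42), ClipNorm(1.0), Adam(1e-3))
opt_state = Flux.setup(opt, model)
```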

Besides L1 / L2 / weight decay, another common and quite different kind of regularisation is
provided by the [`Dropout`](@ref Flux.Dropout) layer. This turns off some outputs of the
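
For context on the `Dropout` paragraph above, a minimal sketch of the layer inside a model; the sizes and drop probability are arbitrary:

```julia
using Flux

model = Chain(Dense(10 => 64, relu), Dropout(0.5), Dense(64 => 1))

# Dropout randomly zeroes activations while training; disable it for evaluation.
Flux.testmode!(model)
y = model(rand(Float32, 10, 4))
```
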
error.jl: 8 changes (0 additions & 8 deletions)

This file was deleted.
