v0.31.0-rc1
Pre-release
borg323 released this 25 Mar 22:53 · 25 commits to release/0.31 since this release
In this version:
- The blas, cuda, eigen, metal and onnx backends now support the multihead network architecture and can run BT3/BT4 nets.
- Updated the internal Elo model to better align with regular Elo for human players.
- There is a new XLA backend that uses the OpenXLA compiler to produce code to execute the neural network. See https://github.com/LeelaChessZero/lc0/wiki/XLA-backend for details; a usage sketch appears after this list. Related are new leela2onnx options to output the HLO format that XLA understands.
- There is a vastly simplified lc0 interface, available by renaming the executable to `lc0simple`.
- The backends can now suggest a minibatch size to the search; this is enabled by `--minibatch-size=0` (the new default, demonstrated after this list).
- If the cudnn backend detects an unsupported network architecture, it switches to the cuda backend.
- Two new selfplay options enable value and policy tournaments. A policy tournament uses the single-node policy to select the move to play, while a value tournament searches all possible moves at depth 1 and selects the one with the best Q.
- While it is easy to get a single-node policy evaluation (`go nodes 1` over uci), there was no simple way to get the effect of a value-only evaluation, so the `--value-only` option was added (see the sketch after this list).
- Button uci options were implemented, and a button to clear the tree was added (as a hidden option).
- Support for the uci `go mate` option was added (example after this list).
- The rescorer can now be built from the lc0 code base instead of a separate branch.
- A discrete onnx layernorm implementation was added to get around an onnxruntime bug with directml. This has some overhead, so it is only enabled for onnx-dml and can be switched off with the `alt_layernorm=false` backend option (see the examples after this list).
- The `--onnx2pytorch` option was added to leela2onnx to generate pytorch-compatible models.
- There is a cuda `min_batch` backend option to reduce non-determinism with small batches.
- New options were added to onnx2leela to fix tf-exported onnx models.
- The onnx backend can now be built for AMD's ROCm.
- Fixed a bug where the Contempt effect on eval was too low for nets with natively higher draw rates.
- Made the WDL Rescale sharpness limit configurable via the `--wdl-max-s` hidden option.
- The search task workers can now be set automatically: 0 for cpu backends, or up to 4 depending on the number of cpu cores. This is enabled by `--task-workers=-1` (the new default; example after this list).
- Several assorted fixes and code cleanups.
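Some usage sketches for the new options follow. First, the XLA backend; this is a minimal sketch assuming the backend name is `xla` and that an OpenXLA PJRT plugin is already set up as described on the wiki page linked above (the weights path is a placeholder):

```
# Assumes an OpenXLA PJRT plugin is installed (see the XLA-backend wiki page).
# The backend name and weights path here are illustrative, not authoritative.
./lc0 --backend=xla --weights=/path/to/bt4-net.pb.gz
```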
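The backend-suggested minibatch size needs no explicit configuration since `--minibatch-size=0` is now the default, but it can be spelled out; the backend choice here is just an example:

```
# 0 means "let the backend suggest the minibatch size" (the new default).
./lc0 --backend=cuda-fp16 --minibatch-size=0
```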
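For the two single-move evaluation styles, a sketch contrasting a value-only run with a plain single-node policy evaluation:

```
# Value-only evaluation: start lc0 with the new option, then use uci as usual.
./lc0 --value-only

# Policy-only evaluation needs no special option; from any uci session:
#   position startpos
#   go nodes 1
```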
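`go mate` follows the standard uci syntax; for example, with a position already set up (the fen is a placeholder):

```
position fen <fen of a mating position>
go mate 5
```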
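Sketches of the new backend and conversion options; `--backend-opts` is the usual way to pass backend options, while the exact leela2onnx invocation and all values here are illustrative:

```
# Disable the discrete layernorm workaround on onnx-dml:
./lc0 --backend=onnx-dml --backend-opts=alt_layernorm=false

# Reduce small-batch non-determinism on cuda (the value is a placeholder):
./lc0 --backend=cuda --backend-opts=min_batch=8

# Convert a network to a pytorch-compatible onnx model:
./lc0 leela2onnx --input=weights.pb.gz --output=model.onnx --onnx2pytorch
```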
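Finally, the automatic task-worker selection, which is also the new default:

```
# -1 picks the task worker count automatically: 0 for cpu backends,
# otherwise up to 4 depending on the number of cpu cores.
./lc0 --task-workers=-1
```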