Lc0.28 released

bmp1974
Posts: 75
Joined: Wed Dec 04, 2019 11:25 am
Full name: Prasanna Bandihole

Lc0.28 released

Post by bmp1974 »

The much awaited Lc0.28 version is released.
https://github.com/LeelaChessZero/lc0/releases

In this release:
  • Multigather is now made the default (and also improved). Some search settings have changed meaning, so if you have modified values please discard them. Specifically, max-collision-events, max-collision-visits and max-out-of-order-evals-factor have changed default values, but other options also affect the search. Similarly, check that your GUI is not caching the old values.
  • Updated several other default parameter values, including the MLH ones.
  • Performance improvements for the cuda/cudnn backends. This includes the multi_stream cuda backend option that is off by default. You should test adding multi_stream=true to backend-opts (command line) or BackendOptions (UCI) if you have a recent GPU with a lot of VRAM.
  • Support for policy focus during training.
  • Larger/stronger 15b default net for all packages except android, blas and dnnl that get a new 10b network.
  • The distributed binaries come with the mimalloc memory allocator for better performance when a large tree has to be destroyed (e.g. after an unexpected move).
  • The legacy time manager is again the default and will use more time for the first move after a long book line.
  • The --preload command line flag will initialize the backend and load the network during startup. This may help in cases where the GUI is confused by long start times, but only if backend and network are not changed via UCI options.
  • A 'fen' command was added as a UCI extension to print the current position.
  • Experimental onednn backend for recent Intel CPUs and GPUs.
  • Added support for ONNX network files and runtime with the onnx backend.
  • Several bug and stability fixes.

    Note: Some small third-party nets seem to play really badly with the dx12 backend and certain GPU drivers. Setting the enable-gemm-metacommand=false backend option is reported to work around this issue.
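
For anyone unsure where multi_stream=true goes, here is a minimal Python sketch that just builds the two strings the release notes mention: the command-line flag and the UCI option line. The helper names (backend_opts_flag, uci_setoption) are made up for illustration and are not part of Lc0. The setoption line follows the generic UCI "setoption name ... value ..." form, so check what your GUI actually sends.

```python
def backend_opts_flag(option_string: str) -> str:
    """Command-line form, as in `lc0 benchmark --backend-opts=...`."""
    return f"--backend-opts={option_string}"


def uci_setoption(name: str, value: str) -> str:
    """Generic UCI `setoption` line a GUI would send to the engine."""
    return f"setoption name {name} value {value}"


# Hypothetical helpers; only the option names come from the release notes.
option = "multi_stream=true"
print(backend_opts_flag(option))
print(uci_setoption("BackendOptions", option))
```

Most GUIs expose BackendOptions as a text field in the engine settings, in which case only the multi_stream=true part needs to be typed in.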
bmp1974
Posts: 75
Joined: Wed Dec 04, 2019 11:25 am
Full name: Prasanna Bandihole

Re: Lc0.28 released

Post by bmp1974 »

I was wondering how to activate the current move tab for Lc0.28. It does not come up automatically.

MMarco
Posts: 214
Joined: Sun Apr 12, 2020 1:09 am
Full name: Marc-O Moisan-Plante

Re: Lc0.28 released

Post by MMarco »

Performance improvements for the cuda/cudnn backends. This includes the multi_stream cuda backend option that is off by default. You should test adding multi_stream=true to backend-opts (command line) or BackendOptions (UCI) if you have a recent GPU with a lot of VRAM.
Nice!

The multi_stream option gave me a +21% speed-up on my 3080 using the large J94-100 network. The speed-up was smaller with the newer 15b default net (around 6-7%, but I didn't save the results).

./lc0.exe benchmark

J94-100
===========================
Total time (ms) : 340891
Nodes searched  : 10836183
Nodes/second    : 31788

./lc0.exe benchmark --backend-opts=multi_stream=true

J94-100
===========================
Total time (ms) : 340524
Nodes searched  : 13105102
Nodes/second    : 38485
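
The +21% figure can be checked directly from the two summaries above. A small Python sketch, assuming the summary always contains a "Nodes/second : N" line as in these two runs (the function names are illustrative, not an Lc0 tool):

```python
import re


def nodes_per_second(benchmark_output: str) -> int:
    """Extract the 'Nodes/second' value from a benchmark summary."""
    match = re.search(r"Nodes/second\s*:\s*(\d+)", benchmark_output)
    if match is None:
        raise ValueError("no 'Nodes/second' line found")
    return int(match.group(1))


def percent_change(baseline_nps: int, new_nps: int) -> float:
    """Relative change of new_nps versus baseline_nps, in percent."""
    return (new_nps - baseline_nps) / baseline_nps * 100.0


# Summaries copied from the two J94-100 runs in this post.
default_run = """Total time (ms) : 340891
Nodes searched  : 10836183
Nodes/second    : 31788"""
multi_stream_run = """Total time (ms) : 340524
Nodes searched  : 13105102
Nodes/second    : 38485"""

speedup = percent_change(nodes_per_second(default_run),
                         nodes_per_second(multi_stream_run))
print(f"{speedup:+.1f}%")  # prints +21.1%
```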
Jouni
Posts: 3846
Joined: Wed Mar 08, 2006 8:15 pm
Full name: Jouni Uski

Re: Lc0.28 released

Post by Jouni »

On my already slow GTX 1650 card, 0.28 is 20% slower than 0.27 :( . BTW, what's the shortest reasonable time control for Lc0? The default time buffer is 200 ms (SF has 10 ms), so 60+0.6 doesn't make any sense! Even a 1 sec increment is short vs 200 ms. Can I change the time buffer to 10 ms?
Jouni
AlexChess
Posts: 1563
Joined: Sat Feb 06, 2021 8:06 am
Full name: Alex Morales

Re: Lc0.28 released

Post by AlexChess »

bmp1974 wrote: Thu Aug 26, 2021 12:39 pm The much awaited Lc0.28 version is released.
https://github.com/LeelaChessZero/lc0/releases

Giancarlo, can I assign the Belgian flag to LC0 on my SuperBlitz, or should I set an Earth flag? Thank you, just started testing the latest version :D
Chess engines and dedicated chess computers fan since 1981 :D macOS Sequoia 16GB-512GB, Windows 11 & Ubuntu ARM64.
ProteusSF Dev Forum
Viren
Posts: 57
Joined: Fri Jun 18, 2021 7:54 pm
Full name: Viren Peanut

Re: Lc0.28 released

Post by Viren »

Jouni wrote: Wed Sep 01, 2021 9:57 pm On my already slow GTX 1650 card, 0.28 is 20% slower than 0.27 :( . BTW, what's the shortest reasonable time control for Lc0? The default time buffer is 200 ms (SF has 10 ms), so 60+0.6 doesn't make any sense! Even a 1 sec increment is short vs 200 ms. Can I change the time buffer to 10 ms?
Are you using the cuda package instead of the cudnn one? For that card cudnn will probably be faster.

Also, MoveOverheadMs is the "Amount of time, in milliseconds, that the engine subtracts from its total available time (to compensate for slow connection, interprocess communication, etc)", so it is unrelated to the increment. You can find descriptions for most of the parameters here:

https://lczero.org/play/flags/
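
To put numbers on that description, here is a rough Python sketch that takes the quoted text literally: the overhead is subtracted from the time left on the clock. This is only an illustration of the wording above, not Lc0's actual time manager, and usable_time_ms is a made-up name.

```python
def usable_time_ms(remaining_ms: int, move_overhead_ms: int) -> int:
    """Clock time left after subtracting the overhead, floored at zero."""
    return max(0, remaining_ms - move_overhead_ms)


# Assumption: a 60 s base time, as in the 60+0.6 control asked about.
base_ms = 60_000
for overhead_ms in (200, 10):
    print(overhead_ms, usable_time_ms(base_ms, overhead_ms))
```

Under this reading, the difference between a 200 ms and a 10 ms setting is 190 ms out of the time remaining on the clock.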
Guenther
Posts: 4718
Joined: Wed Oct 01, 2008 6:33 am
Location: Regensburg, Germany
Full name: Guenther Simon

Re: Lc0.28 released

Post by Guenther »

AlexChess wrote: Thu Sep 02, 2021 8:50 pm
bmp1974 wrote: Thu Aug 26, 2021 12:39 pm The much awaited Lc0.28 version is released.
https://github.com/LeelaChessZero/lc0/releases

Giancarlo, can I assign the Belgian flag to LC0 on my SuperBlitz, or should I set an Earth flag? Thank you, just started testing the latest version :D
bmp1974

Full name: Prasanna Bandihole
https://rwbc-chess.de

[Trolls n'existent pas...]
mclane
Posts: 18968
Joined: Thu Mar 09, 2006 6:40 pm
Location: US of Europe, germany
Full name: Thorsten Czub

Re: Lc0.28 released

Post by mclane »

multi_stream=true may only work for CUDA. Do we have something similar for AMD GPUs using DX12?
What seems like a fairy tale today may be reality tomorrow.
Here we have a fairy tale of the day after tomorrow....
Jouni
Posts: 3846
Joined: Wed Mar 08, 2006 8:15 pm
Full name: Jouni Uski

Re: Lc0.28 released

Post by Jouni »

Benchmark on a GTX 1650:

0.27.0
===========================
Total time (ms) : 346868
Nodes searched : 616413
Nodes/second : 1777

0.28.0
===========================
Total time (ms) : 351131
Nodes searched : 489606
Nodes/second : 1394

0.28.0 removed from hard drive now.
Jouni
brianr
Posts: 540
Joined: Thu Mar 09, 2006 3:01 pm
Full name: Brian Richardson

Re: Lc0.28 released

Post by brianr »

For better or for worse, there are a great many configuration options for Lc0.
Which options are best depends on the GPU (or CPU) hardware (and drivers and libraries), the net size, the time controls, and the Lc0 version.
With the v28 changes, nps may be lower or higher depending on the particular configuration.
However, there are search changes, so nps is not as important as actual playing strength.
That said, overall the 1650 is a pretty weak GPU for Lc0.

Even for SF, nps in the NNUE era does not mean much relative to strength.
See: https://github.com/dkappe/leela-chess-w ... NPS-Debate