Huge experimental round-robin tournament (10500 games, 3 min + 1 sec) with 3 engines (Stockfish 14, KomodoDragon 2.5 and KomodoDragon 2.5 MCTS), each with 5 different MultiPV settings (1, 2, 3, 5 and 7, where 1 is the normal, default playing mode). Goal: to measure how much Elo is lost by calculating more than one PV line, and whether Dragon 2.5 MCTS loses less Elo than the AlphaBeta engines when MultiPV is 3 or higher...
I think the results are pretty interesting, especially if you use engines in MultiPV mode for analyzing human games.
https://www.sp-cc.de/experiments.htm
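To make the size of the tournament concrete, here is a small sketch of the arithmetic behind it. It assumes that every engine/MultiPV combination counts as its own participant and that every pairing plays the same number of games; neither detail is stated explicitly above, so treat the per-pairing figure as derived, not quoted.

```python
from itertools import combinations

# Engines and MultiPV settings as listed in the post.
engines = ["Stockfish 14", "KomodoDragon 2.5", "KomodoDragon 2.5 MCTS"]
multipv_settings = [1, 2, 3, 5, 7]

# Assumption: each (engine, MultiPV) combination is a separate participant.
participants = [(engine, mpv) for engine in engines for mpv in multipv_settings]

# In a round robin, every participant meets every other participant.
pairings = list(combinations(participants, 2))

total_games = 10500  # figure from the post

# Assumption: the games are spread evenly over all pairings.
games_per_pairing, remainder = divmod(total_games, len(pairings))
games_per_participant = games_per_pairing * (len(participants) - 1)

print(len(participants), "participants,", len(pairings), "pairings")
print(games_per_pairing, "games per pairing, remainder", remainder)
print(games_per_participant, "games per participant")
```

Under these assumptions the 10500 games divide evenly over the pairings, which fits the round-robin format described in the post.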
SPCC: Huge MultiPV-rating experiment
Moderator: Ras
-
pohl4711
- Posts: 2921
- Joined: Sat Sep 03, 2011 7:25 am
- Location: Berlin, Germany
- Full name: Stefan Pohl
-
MMarco
- Posts: 214
- Joined: Sun Apr 12, 2020 1:09 am
- Full name: Marc-O Moisan-Plante
Re: SPCC: Huge MultiPV-rating experiment
pohl4711 wrote: And to measure, if Dragon 2.5 MCTS has less Elo-loss, than the AlphaBeta-engines, when MultiPV is 3 or higher...
If it works like Lc0, no extra calculation is done in MultiPV mode. Lc0 simply reports the values of the other moves it has already calculated while searching for the best move. So for playing strength, the MultiPV parameter is irrelevant. Any differences that show up are likely to be noise.
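The point can be illustrated with a toy sketch. This is not code from Lc0 or Dragon; the `RootChild` class, the function names and all numbers are made up for illustration, and it assumes the engine plays the root move with the most visits. In such a design the search statistics exist for every root move anyway, so MultiPV only changes how many of them are printed.

```python
from dataclasses import dataclass


@dataclass
class RootChild:
    """Statistics an MCTS search keeps for one root move (hypothetical layout)."""
    move: str
    visits: int
    q: float  # average value of the move from the side to move's view


def report_lines(children, multipv):
    """Return the top `multipv` root moves by visit count, for display only."""
    ranked = sorted(children, key=lambda c: c.visits, reverse=True)
    return ranked[:multipv]


def move_to_play(children):
    """Assumption: the engine plays the most visited root move."""
    return max(children, key=lambda c: c.visits).move


# Invented example numbers, not measured data.
root = [
    RootChild("e2e4", 5200, 0.54),
    RootChild("d2d4", 3100, 0.53),
    RootChild("g1f3", 900, 0.51),
    RootChild("c2c4", 600, 0.50),
]

for multipv in (1, 2, 3, 5, 7):
    shown = [c.move for c in report_lines(root, multipv)]
    print(multipv, move_to_play(root), shown)
```

The move that is played does not depend on `multipv` here; only the length of the reported list changes. An AlphaBeta engine, in contrast, typically has to search the extra root moves with a wider window to give them exact scores, which costs search depth.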
-
pohl4711
Re: SPCC: Huge MultiPV-rating experiment
MMarco wrote: ↑Sat Oct 02, 2021 2:58 am
If it works like Lc0, no extra calculation is done in MultiPV mode. Lc0 simply reports the values of the other moves it has already calculated while searching for the best move. So for playing strength, the MultiPV parameter is irrelevant. Any differences that show up are likely to be noise.
That is exactly what my experiment has shown, and what I wrote in my conclusions: "As you can see, all 5 KomodoDragon 2.5 MCTS MultiPV-engines are in a very small range of 11 Elo, only (!) and with MultiPV=5 and =7, KomodoDragon 2.5 MCTS is clearly stronger, than Stockfish and KomodoDragon 2.5 AlphaBeta."
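For a feeling of how small an 11 Elo range is, the standard logistic Elo formula can be used to convert a rating difference into an expected score. The sketch below uses only that textbook formula; the function names are my own, and it says nothing about how the rating list on the website was actually computed.

```python
import math


def expected_score(elo_diff):
    """Expected score (0..1) of the stronger side for a given Elo difference."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def elo_diff_from_score(score):
    """Inverse of expected_score: Elo difference implied by a score in (0, 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)


# The 11 Elo range quoted above, expressed as an expected score.
print(round(expected_score(11) * 100, 1), "% expected score for +11 Elo")
```

A gap of 11 Elo corresponds to an expected score of a bit under 52%, so the top and bottom of that range are very close to each other in playing strength.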