Recommendation for difficult test suite

gordonr
Posts: 222
Joined: Thu Aug 06, 2009 8:04 pm
Location: UK

Re: Recommendation for difficult test suite

Post by gordonr »

peter wrote: Sun Jun 29, 2025 9:35 am New version of the 256 and a list of MultiPV=4 as well as MultiPV=1 runs with 6 threads of the 16x3.5GHz CPU (5 concurrencies) and 1G hash, the 3070ti GPU and 1G NN-cache, 30"/position.

    Program                                    Elo   +/-  Matches  Score   Av.Op.   S.Pos.   MST1    MST2   RIndex

  1 RemsM-091224-6t-26-4-2000                : 3551    4   5147    57.7 %   3497   204/256    5.4s   10.4s   0.68
  2 ShashChess250623-6t-MuPV4                : 3547    4   5134    57.1 %   3497   201/256    5.8s   11.0s   0.68
  3 Lc0v0.32.0-6147500PT-MuPV4               : 3545    5   5198    56.7 %   3498   190/256    4.3s   10.9s   0.66
  4 CorChess4.5-250618-6t-MuPV4              : 3542    4   5084    56.3 %   3497   197/256    5.5s   11.2s   0.66
  5 Lc0v0.32.0-6147500PT-MuPV1               : 3527    5   5044    54.0 %   3499   175/256    4.1s   12.3s   0.59
Hi Peter, I'm curious about the results for Lc0v0.32.0-6147500PT-MuPV4 and Lc0v0.32.0-6147500PT-MuPV1, because using MultiPV doesn't affect Lc0 analysis, or at least that was my understanding for MCTS. I'd expect these to be identical for analysis, although results may vary across test runs. The error bars are small.
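For anyone reading the table: the Elo, Score and Av.Op. columns appear to be consistent with the standard logistic Elo relation, where the rating is the average opponent rating plus 400*log10(s/(1-s)) for a score fraction s. The short check below is only a sketch under that assumption; the post does not say how the rating tool actually computes its figures, and the function name `elo_from_score` is made up for illustration.

```python
import math

def elo_from_score(score_fraction, avg_opponent_elo):
    """Performance rating under the standard logistic Elo model (assumed here)."""
    diff = 400.0 * math.log10(score_fraction / (1.0 - score_fraction))
    return avg_opponent_elo + diff

# (score fraction, Av.Op., listed Elo) copied from three rows of the table above
rows = {
    "RemsM-091224-6t-26-4-2000": (0.577, 3497, 3551),
    "Lc0v0.32.0-6147500PT-MuPV4": (0.567, 3498, 3545),
    "Lc0v0.32.0-6147500PT-MuPV1": (0.540, 3499, 3527),
}

for name, (score, av_op, listed) in rows.items():
    estimate = elo_from_score(score, av_op)
    print(f"{name}: estimated {estimate:.1f}, listed {listed}")
```

Because the score percentages are rounded to one decimal place, the estimates can only be expected to match the listed values to within about one Elo point.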
peter
Posts: 3393
Joined: Sat Feb 16, 2008 7:38 am
Full name: Peter Martan

Re: Recommendation for difficult test suite

Post by peter »

gordonr wrote: Sun Jun 29, 2025 11:53 am I'm curious about the results for Lc0v0.32.0-6147500PT-MuPV4 and Lc0v0.32.0-6147500PT-MuPV1, because using MultiPV doesn't affect Lc0 analysis, or at least that was my understanding for MCTS. I'd expect these to be identical for analysis, although results may vary across test runs. The error bars are small.
You discussed this item in more detail here:
viewtopic.php?p=979611&hilit=lc0+multipv#p979611
Also, the search of Lc0 isn't pure MCTS (e.g. even Dragon's one doesn't profit as much from MultiPV in most positions I tried).
In my personal view, MultiPV has affected Lc0's search more and more since about version 0.28, and at least in this one test, with this one hardware and time control, the difference falls outside the error bars, as you can see, even if it's not as big a difference as with most of the newer A-B engines. Rems, for example, profits very much from its internal MultiPV: 26-4-2000 means Random Op. Plies 26, Random Op. MultiPV=4, Random Op. Score=2000. I chose this setting to be somewhat near the "external" (set by GUI) MultiPV=4 of the others. Regards,
Peter.
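The error-bar point can be checked with simple arithmetic on the two Lc0 rows (3545 +/- 5 for MuPV4, 3527 +/- 5 for MuPV1). This is a rough sketch only: it assumes the +/- column is a symmetric interval and that the two runs are independent, and the posts do not state the confidence level behind those bars. The helper name `compare_ratings` is hypothetical.

```python
import math

def compare_ratings(elo_a, err_a, elo_b, err_b):
    """Return (difference, intervals_overlap, combined_error) for two ratings."""
    diff = elo_a - elo_b
    # Plain interval check: do [elo - err, elo + err] ranges intersect?
    intervals_overlap = (elo_a - err_a) <= (elo_b + err_b) and \
                        (elo_b - err_b) <= (elo_a + err_a)
    # Error of the difference, assuming independent runs (an assumption)
    combined_error = math.hypot(err_a, err_b)
    return diff, intervals_overlap, combined_error

# Lc0 MuPV4 vs Lc0 MuPV1, values taken from the table
diff, overlap, combined = compare_ratings(3545, 5, 3527, 5)
print(f"difference {diff}, intervals overlap: {overlap}, combined error {combined:.1f}")

# For contrast: the top two rows (3551 +/- 4 vs 3547 +/- 4)
print(compare_ratings(3551, 4, 3547, 4))
```

Under these assumptions the 18-point gap between the two Lc0 runs is larger than the combined error, while the 4-point gap between the top two engines is not.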
gordonr
Posts: 222
Joined: Thu Aug 06, 2009 8:04 pm
Location: UK

Re: Recommendation for difficult test suite

Post by gordonr »

Hi Peter,

Yes, previously I had asked why the highest eval isn't always "multipv 1" for Lc0. I was informed that the ordering is based on visit count. But I still didn't think the number of MultiPV lines affected the analysis. However, I will now take your thoughts into consideration. Thanks for clarifying.
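To illustrate the visit-count point: if lines are ranked by visits rather than by eval, the line reported first need not have the highest eval. The toy example below uses made-up moves and numbers and is not Lc0's actual code; the `RootMove` class is purely hypothetical.

```python
from dataclasses import dataclass

@dataclass
class RootMove:
    move: str      # move in UCI notation
    visits: int    # number of search visits (invented values)
    eval_cp: int   # evaluation in centipawns (invented values)

# Hypothetical root statistics, for illustration only
moves = [
    RootMove("e2e4", visits=5200, eval_cp=31),
    RootMove("d2d4", visits=4100, eval_cp=38),
    RootMove("g1f3", visits=900, eval_cp=25),
]

# Ranking by visit count, as described in the thread
by_visits = sorted(moves, key=lambda m: m.visits, reverse=True)
# Ranking by eval, which is what one might naively expect
by_eval = sorted(moves, key=lambda m: m.eval_cp, reverse=True)

for rank, m in enumerate(by_visits, start=1):
    print(f"multipv {rank}: {m.move} visits={m.visits} eval={m.eval_cp}cp")
```

In this invented data the most-visited move (e2e4) comes first even though d2d4 has the higher eval.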