The rating-list and regression test runs of Stockfish 220917 are finished.
https://www.sp-cc.de
Also take a look at the EAS-Ratinglist, the world's first engine rating list that measures not the strength of engines but their style of play:
https://www.sp-cc.de/eas-ratinglist.htm
(You may have to clear your browser cache or reload the website.)
SPCC: Testruns of Stockfish 220917 finished
Moderator: Ras
-
pohl4711
- Posts: 2821
- Joined: Sat Sep 03, 2011 7:25 am
- Location: Berlin, Germany
- Full name: Stefan Pohl
-
mehmet123
- Posts: 692
- Joined: Sun Jan 26, 2020 10:38 pm
- Location: Turkey
- Full name: Mehmet Karaman
Re: SPCC: Testruns of Stockfish 220917 finished
The Elo difference in the rating list and in the VLTC regression test is very different.
I don't think it is related to the patch (simplify trend and optimism), because there isn't any radical change in the code. The NCM test shows only a -1.3 Elo difference, which is a very small change. Some new values in the search code are very effective at long time controls, and the time control in the VLTC test is more than 3x that of the rating-list test.
-
pohl4711
- Posts: 2821
- Joined: Sat Sep 03, 2011 7:25 am
- Location: Berlin, Germany
- Full name: Stefan Pohl
Re: SPCC: Testruns of Stockfish 220917 finished
mehmet123 wrote: ↑Wed Sep 21, 2022 8:09 pm
Elo difference at Rating List and VLTC Regression Test is very different.
I dont think its relevant about the patch (simplfy trend and optimism), because there isnt any radical change at the code. NCM test show only -1.3 elo difference and this is a very smalll elo change. Some new values at search codes are very effective at long time controls and time control at VLTC is more than 3x according to Ratlng List test.

IMHO the "problem" is that the new SF plays a little bit more "drawish". In my rating-list test run, the draw rate of SF 220917 is 62% (and SF 220907, against the same opponents, had "only" 60.3% draws). This reduces the number of wins for SF 220917, and with it the score (and the Elo).
In my VLTC regression test runs, I use my UHO openings (otherwise there would be 95%+ draws). Using these openings reduces the draw rate massively, and this "hides" the fact that one engine version plays more drawish than another... So in this test setup we see progress, and in the rating-list test setup we see a regression. My 2 cents...
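To illustrate the point about draws eating into the score, here is a small sketch using the standard logistic Elo formula. Only the two draw rates (60.3% and 62%) come from the post above; the win and loss counts are made up for illustration (losses are simply held constant at a hypothetical 20 per 1000 games), and the actual rating-list tools may compute Elo differently.

```python
import math


def score_fraction(wins, draws, losses):
    """Fraction of points scored: a win counts 1, a draw 0.5, a loss 0."""
    games = wins + draws + losses
    return (wins + 0.5 * draws) / games


def elo_from_score(score):
    """Elo difference implied by a score fraction (standard logistic model)."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)


# Hypothetical results per 1000 games. Only the draw counts (603 and 620)
# mirror the draw rates quoted in the post; wins and losses are assumptions.
older = elo_from_score(score_fraction(wins=377, draws=603, losses=20))
newer = elo_from_score(score_fraction(wins=360, draws=620, losses=20))

print(f"older version: {older:+.1f} Elo vs. the opponent pool")
print(f"newer version: {newer:+.1f} Elo vs. the opponent pool")
```

With losses unchanged, every extra draw is a win that was not scored, so the implied Elo against the same opponents goes down even though the engine does not lose more often.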
-
Lazy_Frank
- Posts: 74
- Joined: Mon Jul 23, 2018 10:56 pm
- Location: Latvia
- Full name: Raivis Baumanis
Re: SPCC: Testruns of Stockfish 220917 finished
Let me give you an analogy with bridge (a card game, and likewise an interactive two-player game; in bridge, two pairs).
Let's assume you and your partner always get deals where you can bid 3NT and play such contracts (the balance and suit structure are well suited for that).
After thousands or millions of games you can say: I know everything about how to bid and play 3NT.
Fine. Sounds great and true.
But you do not have much of a clue how to defend 3NT,
because you always get the deals where you can bid 3NT (believe me, in defense everything looks much more complicated than from the declarer's side).
Also, you do not have a clue (in a practical sense) about other contracts besides 3NT, such as small or grand slam contracts (for example 6 clubs), apart from general game principles.
-
Jouni
- Posts: 3713
- Joined: Wed Mar 08, 2006 8:15 pm
- Full name: Jouni Uski
Re: SPCC: Testruns of Stockfish 220917 finished
Sorry, but I find EAS testing a total waste of time. The only thing that matters is the result. Remember SF's 200-move shuffling wins at TCEC. 1-0!
Jouni