Stockfish progress is stalled at NCM testing, why?

Jouni
Posts: 3278
Joined: Wed Mar 08, 2006 8:15 pm

Stockfish progress is stalled at NCM testing, why?

Post by Jouni »

After hundreds of thousands of games it's still only +10 Elo over SF9. Maybe SF7 is too weak an opponent?
Jouni
Vizvezdenec
Posts: 52
Joined: Fri Jan 12, 2018 1:30 am

Re: Stockfish progress is stalled at NCM testing, why?

Post by Vizvezdenec »

SF now has a slightly lower default contempt (12 instead of 20), so it may be scoring a bit less against SF7 than against SF9. The latest SF regression test was +14 Elo, which is firmly within NCM's error bars.
In fact the SF9 run is 150.5 +/- 4.5 and the latest SF run is 161 +/- 4.5, so the difference is 10.5 Elo, somewhere between roughly 2 and 20 Elo.
NCM was never good at showing progress on the order of 5 or even 10 Elo.
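
To make the error-bar arithmetic explicit, here is a rough Python sketch using the two results quoted in this post. The function name `elo_difference` is made up for illustration, and it assumes the two runs are independent. It reports both the worst-case range (error bars simply added, which matches the "2 to 20" estimate) and the somewhat tighter range you get if the errors are combined in quadrature.

```python
import math

def elo_difference(elo_old, err_old, elo_new, err_new):
    """Return (difference, worst-case error, quadrature error).

    Illustrative helper, not part of any NCM or Stockfish tooling.
    Assumes the two measurements are independent.
    """
    diff = elo_new - elo_old
    worst_case = err_old + err_new               # error bars simply added
    quadrature = math.hypot(err_old, err_new)    # sqrt(err_old^2 + err_new^2)
    return diff, worst_case, quadrature

# Numbers from the post: SF9 run vs latest SF run, both measured against SF7.
diff, worst, quad = elo_difference(150.5, 4.5, 161.0, 4.5)

print(f"difference: {diff:.1f} Elo")
print(f"worst case: {diff - worst:.1f} to {diff + worst:.1f} Elo")
print(f"quadrature: {diff - quad:.1f} to {diff + quad:.1f} Elo")
```

With these inputs the difference is 10.5 Elo, the worst-case range is 1.5 to 19.5 Elo, and the quadrature range is about 4.1 to 16.9 Elo. Either way a 5 to 10 Elo gain is hard to tell apart from noise at this precision.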
Jouni
Posts: 3278
Joined: Wed Mar 08, 2006 8:15 pm

Re: Stockfish progress is stalled at NCM testing, why?

Post by Jouni »

The average for the last 10 runs is almost exactly 160. After 100 000 games the error bar is definitely small :)
Jouni
Vizvezdenec
Posts: 52
Joined: Fri Jan 12, 2018 1:30 am

Re: Stockfish progress is stalled at NCM testing, why?

Post by Vizvezdenec »

But there is no average for SF9, since it only had 1 run :D So that run could perfectly well have fluked some +5 Elo, which makes the progress look lower than it really is.
On top of that, lowering contempt also loses a few Elo vs SF7 (it was part of the new dynamic contempt, which showed good performance in self-play and was within the 1.5 Elo error bars vs SF7; maybe it actually was 1.5 Elo weaker against SF7...).
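
A small Python sketch of why the single SF9 run matters. It assumes, purely for illustration, that each of the 10 recent runs has the same +/- 4.5 error bar as the latest one and that all runs are independent; neither assumption comes from NCM itself, and the helper names are made up. Averaging 10 runs shrinks the error by a factor of sqrt(10), but the error on the difference to SF9 is still dominated by the one SF9 run.

```python
import math

def error_of_mean(err_per_run, n_runs):
    """Error bar of the average of n independent runs with equal error bars."""
    return err_per_run / math.sqrt(n_runs)

def error_of_difference(err_a, err_b):
    """Error bar of the difference of two independent measurements."""
    return math.hypot(err_a, err_b)

ERR_PER_RUN = 4.5   # from the thread; assumed to apply to every run
N_RECENT_RUNS = 10  # the "10 last runs" being averaged

avg_err = error_of_mean(ERR_PER_RUN, N_RECENT_RUNS)
diff_err = error_of_difference(ERR_PER_RUN, avg_err)  # one SF9 run vs the average

print(f"error of 10-run average: +/- {avg_err:.2f} Elo")
print(f"error of (average - SF9): +/- {diff_err:.2f} Elo")
```

Under these assumptions the 10-run average is good to about +/- 1.4 Elo, yet the difference to SF9 is only known to about +/- 4.7 Elo, barely better than a single run.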