Stockfish dev on fire in NCM!

Jouni
Posts: 3283
Joined: Wed Mar 08, 2006 8:15 pm

Post by Jouni »

https://nextchessmove.com/dev-builds

+22 ELO suddenly! Latest patch was a new net, which gave +3 ELO in framework :o . So what's happening?
Jouni
yurikvelo
Posts: 710
Joined: Sat Dec 06, 2014 1:53 pm

Re: Stockfish dev on fire in NCM!

Post by yurikvelo »

Jouni wrote: Tue Sep 15, 2020 7:23 pm which gave +3 ELO in framework :o . So what's happening?
Elo was not measured in the framework.
Patch tests (not regression tests) are stopped as early as possible once a ±2.95 Elo difference is reached, or run to the specified number of games if the Elo difference stays within the ±2.95 range.

A +100 Elo patch test will be terminated very early, as soon as 2.95 Elo is reached with confidence.
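
For reference, the early-stopping idea can be sketched roughly as below. This is a generic Wald sequential probability ratio test (SPRT) with a normal approximation to the log-likelihood ratio (LLR), not the framework's actual code; the function names and the hypothesis bounds `elo0` and `elo1` are assumptions made up for illustration. One thing worth noting: with alpha = beta = 0.05 the stopping thresholds on the LLR come out at ±ln(19) ≈ ±2.94, which is close to the 2.95 figure quoted above, so that number may well be an LLR threshold rather than an Elo value.

```python
import math

def expected_score(elo):
    """Expected score (0..1) for a given Elo advantage, logistic model."""
    return 1.0 / (1.0 + 10.0 ** (-elo / 400.0))

def sprt_bounds(alpha=0.05, beta=0.05):
    """Wald's lower/upper thresholds on the log-likelihood ratio."""
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    return lower, upper

def llr(wins, losses, draws, elo0, elo1):
    """Approximate LLR of H1 (elo1) against H0 (elo0), normal approximation."""
    games = wins + losses + draws
    if games == 0:
        return 0.0
    mean = (wins + 0.5 * draws) / games
    # Per-game variance of the score: E[x^2] - mean^2
    var = (wins + 0.25 * draws) / games - mean ** 2
    if var <= 0.0:
        return 0.0
    s0, s1 = expected_score(elo0), expected_score(elo1)
    return games * (s1 - s0) * (2.0 * mean - s0 - s1) / (2.0 * var)

def sprt_decision(wins, losses, draws, elo0, elo1, alpha=0.05, beta=0.05):
    """Return 'accept H1', 'accept H0' or 'continue' for the games so far."""
    lower, upper = sprt_bounds(alpha, beta)
    value = llr(wins, losses, draws, elo0, elo1)
    if value >= upper:
        return "accept H1"
    if value <= lower:
        return "accept H0"
    return "continue"
```

With hypothetical bounds such as `elo0=0, elo1=5`, a patch that is far stronger than `elo1` crosses the upper threshold after relatively few games, so the test stops long before its true Elo gain has been measured with any precision.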
Jouni
Posts: 3283
Joined: Wed Mar 08, 2006 8:15 pm

Re: Stockfish dev on fire in NCM!

Post by Jouni »

It was a 20400-game test in the framework.
Jouni
Vinvin
Posts: 5228
Joined: Thu Mar 09, 2006 9:40 am
Full name: Vincent Lejeune

Re: Stockfish dev on fire in NCM!

Post by Vinvin »

Jouni wrote: Tue Sep 15, 2020 7:23 pm https://nextchessmove.com/dev-builds

+22 ELO suddenly! Latest patch was a new net, which gave +3 ELO in framework :o . So what's happening?
Incredible improvement !!! :twisted: :twisted: :twisted:
Alayan
Posts: 550
Joined: Tue Nov 19, 2019 8:48 pm
Full name: Alayan Feh

Re: Stockfish dev on fire in NCM!

Post by Alayan »

Elo isn't transitive. The Elo difference between two versions depends on the opponent and the settings, and the bigger the Elo difference with the opponent, the more extreme this effect can be.
RogerC
Posts: 41
Joined: Tue Oct 29, 2019 8:33 pm
Location: French Polynesia
Full name: Roger C.

Re: Stockfish dev on fire in NCM!

Post by RogerC »

I was following the progress of this patch minute by minute, and I noticed that Windows games showed very big improvements with the new network (nn-03744f8d56d8.nnue), more than on Linux. Each "worker" played 200 games against the previous master (Double probability of using classical eval):

Worker  Info        Last Updated         Played  Wins  Losses  Draws
1       Windows 10  2020-09-14 10:42:33  200     23    6       171
30      Windows 10  2020-09-14 14:33:22  200     13    9       178
45      Windows 10  2020-09-14 13:02:03  200     18    8       174
69      Windows 10  2020-09-14 14:12:34  200     15    8       177

TOTAL: over 800 games played on Windows, 69 wins, 31 losses, 700 draws :shock:
That's +16.5 Elo!
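
The figure can be checked with the usual logistic Elo formula. Below is a small sketch; the function names are made up for illustration, and the 95% interval uses a plain normal approximation on the score, which is an assumption and only a rough guide for a sample of this size.

```python
import math

def score_to_elo(score):
    """Convert a score fraction (strictly between 0 and 1) to an Elo difference."""
    return 400.0 * math.log10(score / (1.0 - score))

def elo_from_wld(wins, losses, draws):
    """Return (elo, low, high): point estimate and an approximate 95% interval."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    # Per-game variance of the result around the mean score
    var = (wins * (1.0 - score) ** 2
           + losses * score ** 2
           + draws * (0.5 - score) ** 2) / games
    stderr = math.sqrt(var / games)
    low = score_to_elo(score - 1.96 * stderr)
    high = score_to_elo(score + 1.96 * stderr)
    return score_to_elo(score), low, high

# Totals from the four Windows workers listed above
elo, low, high = elo_from_wld(69, 31, 700)
print(f"{elo:+.1f} Elo, approx. 95% interval [{low:+.1f}, {high:+.1f}]")
```

The point estimate matches the +16.5 above, but the interval on 800 games is wide, so the Windows versus Linux gap could still be partly noise.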

I think the platform affects the results, which would explain the huge improvement against SF7 in the nextchessmove dev-builds tests.
Deberger
Posts: 91
Joined: Sat Nov 02, 2019 6:42 pm
Full name: ɹǝƃɹǝqǝᗡ ǝɔnɹꓭ

Re: Stockfish dev on fire in NCM!

Post by Deberger »

RogerC wrote: Wed Sep 16, 2020 2:25 am I noticed that Windows games showed very big improvements with the new network (nn-03744f8d56d8.nnue), more than on Linux.
Yeah.

Are you sure this isn't due to the phase of the moon ? :roll: