Discussion of chess software programming and technical issues.
Moderators: hgm, Rebel, chrisw
Deberger
Posts: 91 Joined: Sat Nov 02, 2019 6:42 pm
Full name: ɹǝƃɹǝqǝᗡ ǝɔnɹꓭ
by Deberger » Sat Feb 01, 2020 1:57 am
I wonder if this could lead to some paradigm shift.
For over a decade it was assumed that small, incremental changes which are functionally independent are generally additive.
lucasart
Posts: 3232 Joined: Mon May 31, 2010 1:29 pm
Full name: lucasart
by lucasart » Sat Feb 01, 2020 2:08 am
Deberger wrote: ↑ Sat Feb 01, 2020 1:57 am
I wonder if this could lead to some paradigm shift.
For over a decade it was assumed that small, incremental changes which are functionally independent are generally additive.
Yes, people abuse the notion of "simplification" to commit anything. Combine that with pervasive p-hacking, and that's no surprise.
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.
Alayan
Posts: 550 Joined: Tue Nov 19, 2019 8:48 pm
Full name: Alayan Feh
by Alayan » Sat Feb 01, 2020 1:41 pm
The reverted patches passed as "elo gainers".
Deberger
Posts: 91 Joined: Sat Nov 02, 2019 6:42 pm
Full name: ɹǝƃɹǝqǝᗡ ǝɔnɹꓭ
by Deberger » Sat Feb 01, 2020 2:19 pm
The 5 reverted patches were all developmental, for a future Version 12.
Today a 6th patch was reverted, the final LMR which was included in Version 11.
https://github.com/official-stockfish/S ... 261b26ac3a
lucasart
Posts: 3232 Joined: Mon May 31, 2010 1:29 pm
Full name: lucasart
by lucasart » Sun Feb 02, 2020 8:57 am
Alayan wrote: ↑ Sat Feb 01, 2020 1:41 pm
The reverted patches passed as "elo gainers".
Are we sure this pentanomial test is correct? Looking at these [0-2] results, I'm very surprised by how short the stopping time is compared to what you'd expect for SPRT(0,2). And considering that SPRT is asymptotically optimal, something doesn't make sense...
Another problem is the bounds used for STC: they provide almost no filtering. Previously, we had 0-5 for both STC and LTC, so p-hacking was much reduced.
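To make the SPRT(0,2) discussion concrete, here is a minimal sketch of the quantities involved: the Wald stopping thresholds and a log-likelihood ratio for win/draw/loss counts. It assumes logistic Elo, alpha = beta = 0.05, and a normal approximation to the LLR on a plain trinomial model. The function names are made up for illustration, and this is not claimed to be what Fishtest actually runs.

```python
import math


def wald_bounds(alpha=0.05, beta=0.05):
    """Wald's SPRT thresholds on the log-likelihood ratio (LLR)."""
    lower = math.log(beta / (1 - alpha))   # stop and accept H0 at or below this
    upper = math.log((1 - beta) / alpha)   # stop and accept H1 at or above this
    return lower, upper


def expected_score(elo):
    """Expected score per game for a logistic Elo difference (assumption)."""
    return 1.0 / (1.0 + 10.0 ** (-elo / 400.0))


def llr_normal_approx(wins, draws, losses, elo0, elo1):
    """Approximate LLR of H1 (elo1) against H0 (elo0) from W/D/L counts.

    Uses the sample mean and variance of the per-game score; this is an
    illustrative trinomial approximation, not the pentanomial test.
    """
    n = wins + draws + losses
    if n == 0:
        return 0.0
    mean = (wins + 0.5 * draws) / n
    var = (wins * (1.0 - mean) ** 2
           + draws * (0.5 - mean) ** 2
           + losses * (0.0 - mean) ** 2) / n
    if var == 0.0:
        return 0.0
    s0, s1 = expected_score(elo0), expected_score(elo1)
    return n * (s1 - s0) * (2.0 * mean - s0 - s1) / (2.0 * var)
```

With these defaults the thresholds are symmetric, about -2.94 and +2.94. Narrow bounds such as (0, 2) make `s1 - s0` tiny, so each game moves the LLR very little and a test is expected to need many games before it crosses either threshold.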
Michel
Posts: 2272 Joined: Mon Sep 29, 2008 1:50 am
by Michel » Sun Feb 02, 2020 10:53 am
The validity of the pentanomial model can be verified by simulation.
https://github.com/vdbergh/pentanomial
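For readers unfamiliar with the term, a rough sketch of what "pentanomial" refers to: results are counted per game pair (same opening, colors reversed), so each pair scores 0, 0.5, 1, 1.5 or 2 points. The helper below is hypothetical and is not taken from the linked repository; it only computes the mean and variance of the per-game score averaged over a pair.

```python
def pentanomial_stats(counts):
    """Mean and variance of the pair-averaged score from pentanomial counts.

    counts[i] is the number of game pairs that scored i/2 points in total,
    for i = 0..4 (i.e. 0, 0.5, 1, 1.5, 2 points per pair).
    """
    if len(counts) != 5:
        raise ValueError("expected five counts")
    n_pairs = sum(counts)
    if n_pairs == 0:
        raise ValueError("no game pairs")
    # Average score per game within a pair: 0, 0.25, 0.5, 0.75, 1.
    scores = [i / 4.0 for i in range(5)]
    mean = sum(c * s for c, s in zip(counts, scores)) / n_pairs
    var = sum(c * (s - mean) ** 2 for c, s in zip(counts, scores)) / n_pairs
    return mean, var
```

If the two games of a pair were independent, this variance would equal half the per-game trinomial variance. When paired games are correlated the two differ, which is the kind of effect a pair-based model is meant to capture.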
Concerning short tests: there are various things to consider, notably:
- Fishtest Elo bounds are no longer BayesElo.
- The stopping time distribution for an SPRT has long tails.
- The great majority of patches submitted to Fishtest are at best neutral (the Elo prior was measured some time ago to be ~ N(-1,1)).
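The last two points can be explored with a small Monte Carlo sketch. It assumes a plain trinomial model with a fixed draw ratio of 0.6, logistic Elo, alpha = beta = 0.05, and reads N(-1,1) as mean -1 with standard deviation 1. All of these are illustrative assumptions, and the function names are hypothetical; this is not the pentanomial simulator linked above.

```python
import math
import random
import statistics


def wdl_probs(elo, draw_ratio):
    """Win/draw/loss probabilities for a logistic Elo and fixed draw ratio."""
    score = 1.0 / (1.0 + 10.0 ** (-elo / 400.0))
    return score - draw_ratio / 2.0, draw_ratio, 1.0 - score - draw_ratio / 2.0


def sprt_stopping_time(true_elo, elo0=0.0, elo1=2.0, alpha=0.05, beta=0.05,
                       draw_ratio=0.6, max_games=200_000, rng=random):
    """Play one simulated SPRT; return (games_played, passed)."""
    lower = math.log(beta / (1 - alpha))
    upper = math.log((1 - beta) / alpha)
    w0, _, l0 = wdl_probs(elo0, draw_ratio)
    w1, _, l1 = wdl_probs(elo1, draw_ratio)
    win_step = math.log(w1 / w0)    # LLR increment for a win
    loss_step = math.log(l1 / l0)   # LLR increment for a loss (draws add 0)
    w, _, l = wdl_probs(true_elo, draw_ratio)
    llr = 0.0
    for n in range(1, max_games + 1):
        u = rng.random()
        if u < w:
            llr += win_step
        elif u < w + l:
            llr += loss_step
        if llr >= upper:
            return n, True
        if llr <= lower:
            return n, False
    return max_games, False  # truncated runs are counted as failures


def stopping_time_summary(runs, seed=0, max_games=200_000):
    """Draw each patch's true Elo from the assumed prior and summarize."""
    rng = random.Random(seed)
    lengths, passes = [], 0
    for _ in range(runs):
        true_elo = rng.gauss(-1.0, 1.0)
        n, passed = sprt_stopping_time(true_elo, max_games=max_games, rng=rng)
        lengths.append(n)
        passes += passed
    lengths.sort()
    return {
        "median": statistics.median(lengths),
        "mean": statistics.mean(lengths),
        "p95": lengths[int(0.95 * (runs - 1))],
        "pass_rate": passes / runs,
    }
```

Calling `stopping_time_summary(200)` takes a while in pure Python, since each run can last tens of thousands of games. Comparing the returned median, mean and 95th percentile gives a feel for how skewed the stopping-time distribution is under these assumptions.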
Ideas=science. Simplification=engineering.
Without ideas there is nothing to simplify.
Michel
Posts: 2272 Joined: Mon Sep 29, 2008 1:50 am
by Michel » Sun Feb 02, 2020 11:28 pm
I wrote a simple multi-threaded C version of the pentanomial simulator.
https://github.com/vdbergh/simul
Everything in a single C file. As it is much much faster than the Python version one can see better how accurate the implementation is.
Michel
Posts: 2272 Joined: Mon Sep 29, 2008 1:50 am
by Michel » Mon Feb 03, 2020 12:04 pm
Michel wrote: ↑ Sun Feb 02, 2020 11:28 pm
I wrote a simple multi-threaded C version of the pentanomial simulator.
https://github.com/vdbergh/simul
Everything in a single C file. As it is much much faster than the Python version one can see better how accurate the implementation is.
Now with a decent README.md!
Jouni
Posts: 3283 Joined: Wed Mar 08, 2006 8:15 pm
by Jouni » Mon Feb 03, 2020 8:27 pm
SF is so strong that today all changes need an almost astronomical number of games to pass. In the good old days it was 100-200 games and the engine was better.
Jouni