
Stockfish Reverts 5 Recent Patches

Posted: Sat Feb 01, 2020 1:57 am
by Deberger
I wonder if this could lead to some paradigm shift.

For over a decade it was assumed that small, incremental changes which are functionally independent are generally additive.

Revert 5 patches which were merged, but led to a regression test that showed negative Elo gain:

http://tests.stockfishchess.org/tests/v ... d58394fdb9

This was discussed in depth in:

https://github.com/official-stockfish/S ... ssues/2531

Re: Stockfish Reverts 5 Recent Patches

Posted: Sat Feb 01, 2020 2:08 am
by lucasart
Deberger wrote: Sat Feb 01, 2020 1:57 am I wonder if this could lead to some paradigm shift.

For over a decade it was assumed that small, incremental changes which are functionally independent are generally additive.

Revert 5 patches which were merged, but led to a regression test that showed negative Elo gain:

http://tests.stockfishchess.org/tests/v ... d58394fdb9

This was discussed in depth in:

https://github.com/official-stockfish/S ... ssues/2531
Yes, people abuse the notion of "simplification" to commit anything. Combine that with pervasive p-hacking, and that's no surprise.

Re: Stockfish Reverts 5 Recent Patches

Posted: Sat Feb 01, 2020 1:41 pm
by Alayan
The reverted patches passed as "elo gainers".

Re: Stockfish Reverts 5 Recent Patches

Posted: Sat Feb 01, 2020 2:19 pm
by Deberger
The 5 reverted patches were all developmental, for a future Version 12.

Today a 6th patch was reverted, the final LMR which was included in Version 11.

https://github.com/official-stockfish/S ... 261b26ac3a

Re: Stockfish Reverts 5 Recent Patches

Posted: Sun Feb 02, 2020 8:03 am
by Deberger
Today a 7th patch was reverted.

(Simplify away king infiltration:)
https://github.com/official-stockfish/S ... 0d916f447c

A patch which was committed twelve days before Version 11 was released.

(Introduce king infiltration bonus:)
https://github.com/official-stockfish/S ... e025bf8f46

Re: Stockfish Reverts 5 Recent Patches

Posted: Sun Feb 02, 2020 8:57 am
by lucasart
Alayan wrote: Sat Feb 01, 2020 1:41 pm The reverted patches passed as "elo gainers".
Are we sure this pentanomial test is correct? When I look at these [0-2] results, I'm very surprised by how low the stopping time is compared to what you'd expect for SPRT(0,2). And considering that SPRT is asymptotically optimal, something doesn't make sense...

Another problem is the bounds used for STC. They provide almost no filtering. Previously, we had 0-5 bounds for both STC and LTC, which greatly reduced p-hacking.
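
To make the SPRT(0,2) terminology concrete, here is a minimal Python sketch of the quantities involved: Wald's approximate stopping thresholds and a normal-approximation log-likelihood ratio (LLR). It assumes logistic Elo and alpha = beta = 0.05. The function names are hypothetical, and this is not claimed to be the exact computation Fishtest performs.

```python
import math

def elo_to_score(elo):
    """Expected score per game under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo / 400.0))

def sprt_bounds(alpha=0.05, beta=0.05):
    """Wald's approximate LLR thresholds: (lower = accept H0, upper = accept H1)."""
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    return lower, upper

def llr_normal(mean, var, n, elo0, elo1):
    """Normal-approximation LLR after n games.

    mean and var are the observed per-game score mean and variance.
    H0: true Elo = elo0, H1: true Elo = elo1.
    """
    s0, s1 = elo_to_score(elo0), elo_to_score(elo1)
    return n * (s1 - s0) * (2.0 * mean - s0 - s1) / (2.0 * var)

if __name__ == "__main__":
    lower, upper = sprt_bounds()
    print("thresholds:", lower, upper)
    # Hypothetical observation: 10000 games, mean score 0.503, variance 0.1.
    print("LLR for [0,2]:", llr_normal(0.503, 0.1, 10000, 0, 2))
    print("LLR for [0,5]:", llr_normal(0.503, 0.1, 10000, 0, 5))
```

The test stops as soon as the LLR leaves the interval between the two thresholds, so the choice of bounds decides both how long a test runs and how much filtering it provides.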

Re: Stockfish Reverts 5 Recent Patches

Posted: Sun Feb 02, 2020 10:53 am
by Michel
  • The validity of the pentanomial model can be verified by simulation.
    https://github.com/vdbergh/pentanomial
  • Concerning short tests: there are various things to consider, notably:
    • Fishtest Elo bounds are no longer BayesElo.
    • The stopping time distribution for an SPRT has long tails.
    • The great majority of patches submitted to Fishtest are at best neutral (the Elo prior was measured some time ago to be ~ N(-1,1)).
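
As a rough illustration of the stopping-time point, the Python sketch below runs a Monte Carlo simulation of an SPRT. It is deliberately simplified: it uses a trinomial win/draw/loss model with a fixed draw ratio rather than the pentanomial game-pair model, a normal-approximation LLR, and logistic Elo bounds. The draw ratio of 0.6, the 100-game minimum before the first check, and all function names are assumptions made for illustration only.

```python
import math
import random

def elo_to_score(elo):
    """Expected score per game under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo / 400.0))

def simulate_sprt(true_elo, elo0, elo1, draw_ratio=0.6,
                  alpha=0.05, beta=0.05, max_games=200_000, rng=random):
    """Run one simulated SPRT; return (games_played, passed).

    passed is True (H1 accepted), False (H0 accepted),
    or None if max_games was reached without a decision.
    """
    s = elo_to_score(true_elo)
    p_win = s - draw_ratio / 2.0          # so that p_win + draw_ratio / 2 == s
    s0, s1 = elo_to_score(elo0), elo_to_score(elo1)
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    n, total, total_sq = 0, 0.0, 0.0
    while n < max_games:
        r = rng.random()
        x = 1.0 if r < p_win else (0.5 if r < p_win + draw_ratio else 0.0)
        n += 1
        total += x
        total_sq += x * x
        if n < 100:                        # wait for a usable variance estimate
            continue
        mean = total / n
        var = total_sq / n - mean * mean
        if var <= 0.0:
            continue
        llr = n * (s1 - s0) * (2.0 * mean - s0 - s1) / (2.0 * var)
        if llr >= upper:
            return n, True
        if llr <= lower:
            return n, False
    return n, None

def stopping_time_quantiles(times):
    """Return (median, 95th percentile) of a list of stopping times."""
    ordered = sorted(times)
    return ordered[len(ordered) // 2], ordered[int(0.95 * (len(ordered) - 1))]

if __name__ == "__main__":
    random.seed(1)
    runs = [simulate_sprt(true_elo=-1.0, elo0=0.0, elo1=2.0) for _ in range(50)]
    print(stopping_time_quantiles([n for n, _ in runs]))
```

Comparing the median with the 95th percentile gives a feel for how skewed the stopping times are. Drawing `true_elo` from `random.gauss(-1, 1)` for each run would mimic the N(-1,1) prior mentioned above.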

Re: Stockfish Reverts 5 Recent Patches

Posted: Sun Feb 02, 2020 11:28 pm
by Michel
I wrote a simple multi-threaded C version of the pentanomial simulator.

https://github.com/vdbergh/simul

Everything is in a single C file. Since it is much, much faster than the Python version, one can better see how accurate the implementation is.

Re: Stockfish Reverts 5 Recent Patches

Posted: Mon Feb 03, 2020 12:04 pm
by Michel
Michel wrote: Sun Feb 02, 2020 11:28 pm I wrote a simple multi-threaded C version of the pentanomial simulator.

https://github.com/vdbergh/simul

Everything is in a single C file. Since it is much, much faster than the Python version, one can better see how accurate the implementation is.
Now with a decent README.md!

Re: Stockfish Reverts 5 Recent Patches

Posted: Mon Feb 03, 2020 8:27 pm
by Jouni
SF is so strong that today every change needs an almost astronomical number of games to pass. In the good old days it was 100-200 games and the engine was better :D