Stockfish randomicity

syzygy · Post by **syzygy** » Thu Oct 05, 2023 11:45 pm

Uri Blass wrote: ↑Thu Oct 05, 2023 8:20 am
I think this type of test is interesting, but for some reason people who donate computer time prefer to donate it only for testing at bullet or blitz.
I wonder if you can convince some of them to change their choice; there is no problem if one test takes some months instead of one day.

There is no reason to believe that his claim has any merit, which is why people are rightly reluctant to donate resources. Anyone can make claims.

Uri Blass · Post by **Uri Blass** » Fri Oct 06, 2023 10:19 am

syzygy wrote: ↑Thu Oct 05, 2023 11:45 pm
Uri Blass wrote: ↑Thu Oct 05, 2023 8:20 am
I think this type of test is interesting, but for some reason people who donate computer time prefer to donate it only for testing at bullet or blitz.
I wonder if you can convince some of them to change their choice; there is no problem if one test takes some months instead of one day.
There is no reason to believe that his claim has any merit, which is why people are rightly reluctant to donate resources. Anyone can make claims.

If pruning prevents engines from solving some test positions even after a long search, then there is reason to believe that the engine may be stronger at long time controls without the pruning.

I think the correct way to test this is first to check whether you can show that the engine performs relatively better at 60+0.6 than at 10+0.1, even if it is weaker at both time controls.

I do not suggest measuring this with Elo, because of possible diminishing returns, but with a speed handicap.

If you give the new version hardware that is 20% faster and find that it is weaker at 10+0.1 but stronger at 60+0.6 with the same speed handicap, then there is reason to suspect that the new version will perform better at longer time controls. In that case it is logical to donate resources to test on equal hardware at a time control clearly longer than 60+0.6.

Maybe the effect does not show up between 10+0.1 and 60+0.6, and you need to compare 20+0.2 with 120+1.2 to show it. But it is clear that if there is an improvement at very long time controls, there is also some effect of performing relatively better that is easier to prove at relatively faster time controls.
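To make the comparison concrete, here is a rough sketch of how the two handicap matches could be compared. Everything in it is an assumption for illustration: the function names are made up, the results are passed in as plain (wins, draws, losses) counts from the new version's point of view, and the z value uses a simple normal approximation rather than whatever statistics a real testing framework applies.

```python
import math

def score(wins, draws, losses):
    """Fraction of points scored by the new version in one match."""
    games = wins + draws + losses
    if games == 0:
        raise ValueError("no games played")
    return (wins + 0.5 * draws) / games

def score_stderr(wins, draws, losses):
    """Standard error of the score, from the per-game result variance."""
    games = wins + draws + losses
    s = score(wins, draws, losses)
    variance = (wins * (1.0 - s) ** 2
                + draws * (0.5 - s) ** 2
                + losses * (0.0 - s) ** 2) / games
    return math.sqrt(variance / games)

def scaling_signal(short_tc, long_tc):
    """Compare two matches played with the SAME speed handicap.

    short_tc and long_tc are (wins, draws, losses) tuples, e.g. for
    10+0.1 and 60+0.6. Returns (score difference, z value); a clearly
    positive z suggests the new version does relatively better at the
    longer time control.
    """
    diff = score(*long_tc) - score(*short_tc)
    stderr = math.sqrt(score_stderr(*short_tc) ** 2
                       + score_stderr(*long_tc) ** 2)
    if stderr == 0:
        raise ValueError("results have no variance; cannot compute z")
    return diff, diff / stderr
```

Calling `scaling_signal(short_results, long_results)` with the counts from the two matches gives the score difference and how many standard errors it is away from zero. Only if that is clearly positive would the much more expensive equal-hardware test at a longer time control be worth asking for.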

syzygy · Post by **syzygy** » Fri Oct 06, 2023 5:26 pm

Uri Blass wrote: ↑Fri Oct 06, 2023 10:19 am
syzygy wrote: ↑Thu Oct 05, 2023 11:45 pm
Uri Blass wrote: ↑Thu Oct 05, 2023 8:20 am
I think this type of test is interesting, but for some reason people who donate computer time prefer to donate it only for testing at bullet or blitz.
I wonder if you can convince some of them to change their choice; there is no problem if one test takes some months instead of one day.
There is no reason to believe that his claim has any merit, which is why people are rightly reluctant to donate resources. Anyone can make claims.
If pruning prevents engines from solving some test positions even after a long search, then there is reason to believe that the engine may be stronger at long time controls without the pruning.

If someone comes with big claims and only empty statements to back them up, then the right thing to do is to prune the proposal and not to waste resources on it. This is the situation here.

Not everybody who claims that the rest of the world is wrong is an Einstein.
