Uri Blass wrote:
I am not a chess programmer, but it seems to me it is not about who has more cores, but do you have enough cores to run your tests and ideas.
Even if we had 1000 cores it wouldn't be enough. We have more ideas than we can test reasonably, and in fact we have to take shortcuts. One shortcut we take is to reject an idea that doesn't immediately start out strong. For example, if we are down 5 ELO after 1000 games and have 10 more ideas in the queue, we may simply move on to the next test and take a big chance on rejecting a good idea. But 5 ELO is way under the error margin, and the idea still has an excellent chance to succeed.
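To put a number on that error margin, here is a rough sketch of the 95% confidence margin on a 1000-game match. It treats each game as an independent win/loss trial and ignores draws (which would actually shrink the variance), so it is a conservative back-of-the-envelope estimate, not a full model; the function name and defaults are my own.

```python
import math

def elo_margin(games, score=0.5, z=1.96):
    """Approximate 95% confidence margin (in ELO) on a match score.

    Simplified: each game is an independent Bernoulli trial, draws ignored,
    so this overestimates the margin slightly.
    """
    se = z * math.sqrt(score * (1.0 - score) / games)  # std error of the score

    def to_elo(s):
        # Logistic model: score s corresponds to an ELO difference.
        return -400.0 * math.log10(1.0 / s - 1.0)

    return (to_elo(score + se) - to_elo(score - se)) / 2.0

print(elo_margin(1000))  # roughly +/- 21.6 ELO after 1000 games
```

So a 5 ELO deficit after 1000 games is well inside the noise, which is exactly the point being made above.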
My opinion is that you are too quick to take shortcuts.
I guess that you can probably make more progress
if you simply use SPRT like the Stockfish team.
We are working on something like SPRT now, but it will STILL have the characteristic that it's easier to reject than accept. Any test that accepts changes too readily will cause your program to regress over time, and running long tests that have little chance of succeeding just wastes CPU power, which is at a premium for us.
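For readers unfamiliar with SPRT, here is a minimal sketch of the stopping rule. It tests H1 ("the change is worth elo1") against H0 ("the change is worth elo0") and stops as soon as the evidence is strong enough either way. This version ignores draws entirely; real frameworks such as fishtest model draw rates explicitly, so treat this as an illustration of the idea, with the function names and default bounds my own.

```python
import math

def sprt_llr(wins, losses, elo0, elo1):
    """Log-likelihood ratio of H1 (elo1) over H0 (elo0), draws ignored."""
    if wins == 0 or losses == 0:
        return 0.0
    p0 = 1.0 / (1.0 + 10.0 ** (-elo0 / 400.0))  # win probability under H0
    p1 = 1.0 / (1.0 + 10.0 ** (-elo1 / 400.0))  # win probability under H1
    return wins * math.log(p1 / p0) + losses * math.log((1 - p1) / (1 - p0))

def sprt_state(wins, losses, elo0=0.0, elo1=5.0, alpha=0.05, beta=0.05):
    """Return 'accept', 'reject', or 'continue' under Wald's SPRT bounds."""
    llr = sprt_llr(wins, losses, elo0, elo1)
    upper = math.log((1.0 - beta) / alpha)  # cross this: accept the change
    lower = math.log(beta / (1.0 - alpha))  # cross this: reject the change
    if llr >= upper:
        return "accept"
    if llr <= lower:
        return "reject"
    return "continue"

print(sprt_state(650, 350))  # a lopsided result crosses the acceptance bound
print(sprt_state(505, 495))  # a near-even result keeps the test running
```

Note the asymmetry mentioned above is tunable: with alpha below beta, crossing the acceptance bound takes more evidence than crossing the rejection bound.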
Generally, if your test accepts many regressions due to sampling error, your progress will depend more on what fraction of your candidate changes are genuine improvements than on the test results themselves. For example, if only 1 out of 10 ideas is actually good but the false positive rate is not very low, you will start accepting many of the other 9 regressions. It's a serious problem unless you are testing BIG improvements or BIG regressions.
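The arithmetic behind that claim is simple. With hypothetical numbers (a 10% base rate of good ideas, 80% power, 5% false positive rate), the accepted changes end up containing a large share of regressions:

```python
# Hypothetical illustration: when good ideas are rare, even a modest
# false positive rate lets regressions through at a rate comparable
# to the real gains.
ideas = 100
good = 10            # genuinely good changes among the candidates
bad = ideas - good
tpr = 0.80           # chance a good change passes the test (power)
fpr = 0.05           # chance a bad change passes (false positive rate)

accepted_good = good * tpr   # 8.0 good changes accepted
accepted_bad = bad * fpr     # 4.5 regressions accepted
print(accepted_good, accepted_bad)
```

Here over a third of all accepted changes are regressions, so net progress hinges on the good changes being bigger than the bad ones — which is why testing only BIG effects sidesteps the problem.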
It's not currently a big problem for Stockfish because they are still catching up - most of the changes in the log are just re-tuning something that should have been done long ago, and they are getting big enough gains from those changes not to suffer the regression problem.
The damage from rejecting an idea after a small number of games is not only the ELO loss from passing up a good idea; it is also that you gain less knowledge for suggesting good ideas in the future, because you never really learn whether the idea was good or bad.
We might reject a test, but not an idea. We are not stupid about these things. We consider most of the rejected ideas merely unproven and will often revisit them later. In fact, we think the proper procedure is to start with a simple, fast test that rejects changes with little chance of success but allows a change to continue on to a more stringent test even if it is somewhat negative. That concept will be built into our new testing infrastructure, which I have wanted to do for years now.
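That staged procedure could be sketched as a simple gating policy. The thresholds here are hypothetical placeholders, not the author's actual numbers: the cheap first stage culls only clear losers, so a mildly negative result still gets promoted to the long, stringent test.

```python
def staged_decision(stage1_elo, stage2_elo=None,
                    stage1_floor=-10.0, stage2_bar=1.0):
    """Two-stage test gate (hypothetical thresholds).

    Stage 1 is short and cheap: it rejects only changes that look
    clearly bad.  Survivors, even somewhat negative ones, are promoted
    to a longer stage 2 with a stricter acceptance bar.
    """
    if stage1_elo < stage1_floor:
        return "reject"            # cheap test: throw out clear losers only
    if stage2_elo is None:
        return "promote"           # passed the quick filter; run the long test
    return "accept" if stage2_elo >= stage2_bar else "reject"

print(staged_decision(-5.0))        # slightly negative, but still promoted
print(staged_decision(-5.0, 2.3))   # accepted after the stringent stage
```

The point of the design is exactly the asymmetry discussed earlier: a quick early read is only trusted for rejection of obvious failures, never for acceptance.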
The Stockfish project has inspired me to get this working.
I believe that knowing that the idea is good or knowing that the idea is bad can help you to suggest better ideas in the future.
The number of ideas is not what is important but the quality of the ideas.
We agree on this, and we think we might be able to keep up even with a major hardware disadvantage because of it. But not having the hardware is still a major disadvantage. As Larry pointed out, we might be able to keep up despite a serious disadvantage, but probably not one of 10 to 1 or more.
Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.