Jouni wrote:The testing framework now has Scaling Trend Prediction! Is it based on the fact that faster computers have more draws?
Faster computers do have more draws.
But did you notice that the latest patch gets BETTER with longer time control?
That is very unusual. Normally, a patch's gain becomes less noticeable as the time control gets longer. That is why I think it is important. I never do that ultra-bullet kind of junk, so high-speed testing will tend to bubble up changes that are not very important to me.
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
Do we have enough games to be practically sure that the patch really gets better with longer time control?
Passing SPRT faster at LTC may also be because of luck.
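To make the "luck" point concrete, here is a minimal sketch of an SPRT check on win/draw/loss counts. It uses the plain normal approximation of the log-likelihood ratio, which is an assumption on my part and not necessarily the exact computation the testing framework runs. The names `expectedScore`, `SprtResult` and `sprt` are hypothetical. The test stops as soon as the ratio crosses either bound, so the number of games needed is itself random, and a quick pass does not by itself show that the gain is larger.

```cpp
#include <cassert>
#include <cmath>

// Expected score (0..1) for a given Elo difference, logistic model.
double expectedScore(double elo) {
    return 1.0 / (1.0 + std::pow(10.0, -elo / 400.0));
}

struct SprtResult {
    double llr;    // log-likelihood ratio of H1 (elo1) against H0 (elo0)
    double lower;  // H0 is accepted once llr <= lower
    double upper;  // H1 is accepted once llr >= upper
};

// Normal-approximation SPRT (an assumed simplification, see the text above).
SprtResult sprt(int wins, int draws, int losses,
                double elo0, double elo1, double alpha, double beta) {
    const double n = wins + draws + losses;
    assert(n > 0);
    // Mean score per game: win = 1, draw = 0.5, loss = 0.
    const double mean = (wins + 0.5 * draws) / n;
    // Per-game variance of the score.
    const double var = (wins + 0.25 * draws) / n - mean * mean;
    assert(var > 0);
    const double s0 = expectedScore(elo0);
    const double s1 = expectedScore(elo1);
    const double llr = n * (s1 - s0) * (2.0 * mean - s0 - s1) / (2.0 * var);
    return {llr, std::log(beta / (1.0 - alpha)), std::log((1.0 - beta) / alpha)};
}
```

Calling `sprt(wins, draws, losses, 0.0, 5.0, 0.05, 0.05)` after each batch of games and comparing `llr` with the two bounds gives the pass/fail decision. The Elo bounds and error rates in that call are illustrative values, not the framework's settings.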
Dann Corbit wrote:There are lots of differences for Sugar, and not just the copyright headers. It's a real fork, and not one of those phony tweak things.
Nice to see that somebody else can see that SugaR is different enough.
No, but I think it is a good guess.
And looking at what the patch does, it seems logical to me also.
Dann Corbit wrote:It is twice as good at what they call "long time control" compared to the fast time control (about 7% improvement versus 3.5% improvement).
Hence, when it moves to real time control (e.g. analyzing games or positions) I think it may be really excellent.
The idea is simple and so other chess engines may also benefit. If you are an author, I suggest you examine "Tweak statScore condition" by GuardianRM.
if (ss->statScore > 0 && ss->statScore > (ss-1)->statScore)
    r -= ONE_PLY;
else if (ss->statScore < 0 && ss->statScore < (ss-1)->statScore)
    r += ONE_PLY;
, but it will never pass fishtest's 1-minute TC games, let alone the 10 sec/game time control that a patch must pass before getting to their so-called "LTC testing" (1-minute bullet chess).
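For authors of other engines, here is a self-contained sketch of the condition quoted above so it can be tried in isolation. `StackEntry`, the value of `ONE_PLY` and the function `adjustReduction` are hypothetical stand-ins, not Stockfish's real definitions, and I am assuming `r` is the search reduction being adjusted. Only the two comparisons are taken from the quoted snippet.

```cpp
#include <cassert>

// Hypothetical stand-ins for the engine's own types and constants.
constexpr int ONE_PLY = 1;

struct StackEntry {
    int statScore;  // statistics score stored for this ply
};

// Applies the quoted condition: ss points at the current ply's entry and
// (ss - 1) at the previous ply's entry. Returns the adjusted reduction.
int adjustReduction(const StackEntry* ss, int r) {
    if (ss->statScore > 0 && ss->statScore > (ss - 1)->statScore)
        r -= ONE_PLY;  // positive and higher than the previous ply: reduce less
    else if (ss->statScore < 0 && ss->statScore < (ss - 1)->statScore)
        r += ONE_PLY;  // negative and lower than the previous ply: reduce more
    return r;
}
```

When neither branch applies, the reduction comes back unchanged, so the tweak only acts when the current score is both on one side of zero and beyond the previous ply's score in that direction.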
Too bad that testing at long time control is never done by the programming groups, but only by the testing groups.
But I do understand that everyone is in a big fluffy hurry.
It is the decision of the people who give computer time.
If many people tell the Stockfish team that they will give computer time only if testing is done at longer time controls, then the Stockfish team will have no choice.
I do not understand the hurry for being number 1 as fast as possible.
For what?
Nobody earns money from Stockfish, and I think it is better to spend computer time on testing in order to measure the value of different patches.
But nobody in the Stockfish team really cares whether some patch earns 5 Elo or 1 Elo, and they even submit "non functional" patches without testing them seriously to verify that they are non functional, so I am not sure that all the non functional patches really are non functional.
A serious test to verify that a patch is non functional should take many hours. The fact that bench is the same proves nothing, because bench does not have many positions, and it is possible that some patch causes problems not in bench but in games. So I think the test should be on games with fixed depth or a fixed number of nodes.
If you use 1 core with fixed depth or fixed number of nodes, then I think games should be exactly the same because I think that stockfish should be deterministic with 1 core.
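A rough sketch of how such a check could be scored, assuming the base and patched binaries have each played the same openings on 1 core with the same fixed depth or node limit, and that each game is stored as a list of move strings. How the games are produced is outside this sketch, and the names `firstDivergence` and `sameBehaviour` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using Game = std::vector<std::string>;  // one game as a list of move strings

// Index of the first move where two games differ, or -1 if they are identical.
long firstDivergence(const Game& a, const Game& b) {
    const std::size_t n = a.size() < b.size() ? a.size() : b.size();
    for (std::size_t i = 0; i < n; ++i)
        if (a[i] != b[i])
            return static_cast<long>(i);
    // Same moves so far: identical only if the lengths also match.
    return a.size() == b.size() ? -1 : static_cast<long>(n);
}

// True when every pair of games is move-for-move identical.
bool sameBehaviour(const std::vector<Game>& base, const std::vector<Game>& patched) {
    assert(base.size() == patched.size());
    for (std::size_t g = 0; g < base.size(); ++g)
        if (firstDivergence(base[g], patched[g]) != -1)
            return false;
    return true;
}
```

If the engine is deterministic under those conditions, a single differing move is enough to show that the patch changes behaviour, and `firstDivergence` points at the position to look at.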
I mention this because it does appear likely to me that Sugar xpro 1.2 is stronger than either the previous 9/17/17 version of SF or the 8/25/17 version of asmfish.
If not, would you know what kind of patch or patches SugaR is using to make it possibly stronger than its source?
-Norm
That patch was not in Sugar, so I added it for my own copy of it.
There are lots of differences for Sugar, and not just the copyright headers. It's a real fork, and not one of those phony tweak things.