Uri Blass wrote: bob wrote: Kirill Kryukov wrote: Kempelen wrote:
For testing purposes, where the majority of us test our engines at fast controls like 1m-1s or 1m-0s, do you think that if an engine improves at that fast control it will also improve in slow/normal games? Is there a correlation?
If not, what do you (assuming you test in fast games) do to take care of normal games?
I think the general assumption is that results of fast games correlate very well with results of slow games. This is what we expect. I have yet to see any convincing data showing that this is not the case, meaning any example of an engine that performs significantly differently depending on time control. I think such an example is necessary before this discussion can get anywhere.
Best,
Kirill
Suppose I post some test results between Crafty and Fruit, from, say, 8000-game matches played from the same starting positions. And I show you that at _very_ fast time controls, Crafty is +300 Elo better. At fast (but slower than the previous run) time controls, Crafty is +100 better. And at long time controls, Crafty is about +40 better. Would that be convincing?
This depends on the definition of a very fast time control.
People in this thread suggested 1 minute per game + 1 second per move as a very fast time control, and the idea is that testing at slower time controls is usually a waste of time.
It may be interesting if you can show a case where Crafty is 100 Elo better than Fruit 2.1 at 1+1 and also weaker than Fruit 2.1 at a slower time control of the same type (X minutes per game + X seconds per move).
Note that a very fast time control of 1 second per game against Fruit is not interesting, because in that case it is possible that Fruit loses most of its games on time.
An increment is important to avoid losses on time.
You can try 6 seconds per game + 0.1 second per move, but the main problem is that I am not sure programs use their time rationally in this case. Testers usually do not test at this time control, so programmers may build in a safety margin that is too big and tell their program to assume it has 0.3 seconds less time than it really has.
That is not going to cause big changes at a 1+1 time control, but it can cause big changes at faster time controls, because at some point the program may search only to depth 1 in spite of having enough time to reach depth 4, simply because it tries not to take risks.
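To put rough numbers on that, here is a minimal sketch of a made-up allocation scheme (not taken from Fruit, Crafty or any real engine): spend the remaining time divided by an assumed number of moves to go, plus the increment, minus a fixed safety margin. The function names, the 30-move horizon and the use of the 0.3 second figure as a per-move margin are all assumptions for illustration.

```python
def raw_budget(remaining_s, increment_s, moves_to_go=30):
    """Seconds per move before any safety margin (hypothetical scheme)."""
    return remaining_s / moves_to_go + increment_s


def move_budget(remaining_s, increment_s, safety_margin_s, moves_to_go=30):
    """Seconds per move after subtracting a fixed safety margin, never below zero."""
    raw = raw_budget(remaining_s, increment_s, moves_to_go)
    return max(raw - safety_margin_s, 0.0)


def fraction_lost(remaining_s, increment_s, safety_margin_s, moves_to_go=30):
    """Share of the raw per-move budget eaten by the safety margin."""
    raw = raw_budget(remaining_s, increment_s, moves_to_go)
    kept = move_budget(remaining_s, increment_s, safety_margin_s, moves_to_go)
    return 1.0 - kept / raw


# Assumed margin of 0.3 s, applied at the two time controls mentioned above.
for label, base_s, inc_s in [("1 min + 1 s", 60.0, 1.0), ("6 s + 0.1 s", 6.0, 0.1)]:
    kept = move_budget(base_s, inc_s, 0.3)
    lost = fraction_lost(base_s, inc_s, 0.3)
    print(f"{label}: {kept:.2f} s per move, {lost:.0%} of the budget lost")
```

Under these assumptions the margin costs about a tenth of the per-move budget at 1+1 and essentially all of it at 6 s + 0.1 s, which is the kind of effect described above.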
Uri
I have not found a time control where Crafty loses to Fruit since the recent versions came out, at least on the 3981 starting positions I use for my testing.
But the point remains the same... 1+1 might be fast to you. I do a lot of 20sec+0.1sec games (20 seconds on clock, 0.1 second increment) so that I can play a 40K game match in an hour or so. With Crafty and Glaurung, I can play ridiculously fast games (1 sec on clock, .01 sec increment) and not see any difference in results from when using 1+1 or 5+5, which is why I speculated that the case of Fruit suggests a time allocation issue that faster games highlight.
But today's 1+1 might be the same as .1 + .1 in a couple of years. And then the problem shows up again.
I was pointing out that I can easily dig up the data to show this kind of behaviour, which some refuse to believe happens. And it happens over tens of thousands of games, not just over a hundred or so as most are using for testing...
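For anyone who wants to see why the game count matters, here is a rough sketch of the usual logistic Elo conversion with an approximate 95% error margin. The win/draw/loss counts are invented for illustration and are not results from these tests; the function names and the normal approximation are assumptions as well.

```python
import math


def elo_from_score(score):
    """Elo difference implied by an average score in (0, 1), logistic model."""
    eps = 1e-9
    score = min(max(score, eps), 1.0 - eps)  # keep the logarithm finite
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_with_margin(wins, draws, losses, z=1.96):
    """Return (elo, low, high) using a normal approximation on the mean score."""
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result, where a game scores 1, 0.5 or 0.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    se = math.sqrt(var / n)
    return (elo_from_score(score),
            elo_from_score(score - z * se),
            elo_from_score(score + z * se))


# Same hypothetical 40/30/30 split of wins/draws/losses at two match sizes.
for w, d, l in [(40, 30, 30), (12800, 9600, 9600)]:
    elo, low, high = elo_with_margin(w, d, l)
    print(f"{w + d + l:6d} games: {elo:+.1f} Elo, interval [{low:+.1f}, {high:+.1f}]")
```

With these made-up counts the point estimate is the same in both matches, but the 100-game interval still includes zero while the 32,000-game one does not.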
All of my testing is done with increments. I play 20sec+.1sec for quick tests, 1+1 (takes about 12 hours for 32,000+ games) and 5+5 (takes 2 days for 32,000 games, roughly). I have also run 30+30 and 60+60, but they take an extended amount of time, so I cut the number of positions significantly so that I am not sitting around for weeks. Some programs just can't cope with fast time controls. I had problems with Arasan in that regard. Fruit doesn't lose those fast games on time, but it loses them badly. Toga2, on the other hand, seems not to be affected nearly as much, although there is a difference between fast and slow time control performance.