brianr wrote: I have been trying to run some games with "fast" time controls under Arena (it looks like Arena was used in your games also).
After running the games, I load all of the separate tournaments (I run on two quads) with SCID and delete the losses on time (search header for "on time" and negate the filter before exporting). Then, I use PGNEXTRACT to remove any duplicate games before finally using BAYESELO. Even using random games from Bob Hyatt's 4,000 position suite, there are typically some duplicate games.
I have found that some engines handle "fast" times much better than others. Moreover, engine startup time can result in time losses (say for initially handling EGTB files, although they are not often used with search times of 0.1 seconds or less).
One thing that I have found helps a lot with Arena is to run only one pair of engines per copy of Arena (I run 8 copies on the two quads, with no pondering, of course). It is a bit more tedious to set things up this way, but the tournament duplicate command is helpful. This works better than running more engines, since Arena will not shut down and restart each engine if there are only two. I have tried turning off engine restart, but Arena always seems to restart the engines anyway with more than two engines per tournament. This method saves a lot of time, since in some cases the engine startup time is a significant portion of the entire game time. For much longer time controls it would not be worth the bother to do separate individual pairings.
My hope is that removing the time losses and duplicate games will minimize any resulting ratings impact. Incidentally, the fast games seem to work quite well when testing evaluation changes, but I use longer times for search-related testing (as others have mentioned). These suggestions may not be as important for larger engine ranking tournaments, but I am primarily interested in testing/measuring small improvements in Tinker, which usually takes several thousand games.
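For anyone who wants to script the clean-up step described above (drop the losses on time, then drop duplicate games before feeding the PGN to BAYESELO), here is a rough Python sketch. It is not what SCID or PGNEXTRACT do internally. It assumes that each game starts with an [Event ...] tag, that a time loss shows up as the text "on time" somewhere in the game's header tags, and that two games count as duplicates when their starting FEN and their movetext (comments removed) are identical. The function names are made up for illustration.

```python
import re

def split_games(pgn_text):
    """Split PGN text into games; assumes each game starts with an [Event ...] tag."""
    games, current = [], []
    for line in pgn_text.splitlines():
        if line.startswith("[Event ") and current:
            games.append("\n".join(current).strip())
            current = []
        current.append(line)
    last = "\n".join(current).strip()
    if last:
        games.append(last)
    return games

def headers_and_moves(game):
    """Return (header tag lines, movetext joined into one string)."""
    lines = game.splitlines()
    headers = [l for l in lines if l.startswith("[")]
    moves = " ".join(l for l in lines if l and not l.startswith("["))
    return headers, moves

def normalize_moves(moves):
    """Strip {...} comments and collapse whitespace so formatting does not matter."""
    moves = re.sub(r"\{[^}]*\}", " ", moves)
    return " ".join(moves.split())

def clean_games(pgn_text):
    """Keep games that are neither time losses nor duplicates of an earlier game."""
    seen, kept = set(), []
    for game in split_games(pgn_text):
        headers, moves = headers_and_moves(game)
        # Assumption: a loss on time is marked by "on time" in some header tag.
        if any("on time" in h.lower() for h in headers):
            continue
        # Include the start position, since test suites begin from different FENs.
        fen = next((h for h in headers if h.startswith("[FEN ")), "")
        key = (fen, normalize_moves(moves))
        if key in seen:
            continue
        seen.add(key)
        kept.append(game)
    return kept
```

To use it, read all the tournament PGN files into one string, call clean_games on it, and write the kept games out joined by blank lines. A movetext line that itself begins with "[" would be mistaken for a header tag, so check the output on a small file first.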
In Arena I use a 1/4+1/4 time control (15 sec + 0.25 sec increment) for fast games. Even faster games are not useful in Arena because of the large GUI time overhead. I have never had a loss on time at 1/4+1/4, but at faster time controls I have. For even faster games I would prefer cutechess. I use only 2 engines per tournament to avoid engine startup time. I don't use EGTB, and I use a small hash table to avoid any trouble in ultra-fast games.
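To put rough numbers on the time control and on why startup time matters at these speeds, here is a small Python sketch. The 60-move game length and the 2-second startup figure in the usage note are purely hypothetical, not measurements from Arena or any engine, and the function names are made up for illustration.

```python
def clock_budget(base_seconds, increment_seconds, moves):
    """Total clock time one side can use over a game of the given length."""
    return base_seconds + increment_seconds * moves

def startup_share(startup_seconds, game_seconds):
    """Fraction of total wall time spent on engine startup rather than play."""
    return startup_seconds / (startup_seconds + game_seconds)
```

With 1/4+1/4, clock_budget(15, 0.25, 60) gives the per-side budget for a hypothetical 60-move game; doubling it bounds the thinking time for the whole game when there is no pondering. Passing that bound and a hypothetical startup delay (say 2.0 seconds) to startup_share shows how the overhead grows as the time control shrinks.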
Thank you for your observations and thoughtful advice, Stefan.
I will have to rethink and probably revise my ultra-fast testing methodology.