GPL Blitz update

Post by **lucasart** » Sat Dec 24, 2011 6:58 pm

Here's an update on my blitz rating list, which covers open source engines only.

Results

Rank Name                Elo    +    - games score oppo. draws 
   1 Umko 1.2 (x2)      2899   44   41   200   81%  2673   28% 
   2 Fruit 2.1          2700   24   23   625   64%  2587   28% 
   3 GNU Chess 5.1      2659   26   26   450   55%  2627   32% 
   4 Sloppy 0.2.2       2631   27   27   425   49%  2634   29% 
   5 Pepito 1.59        2587   32   32   300   53%  2559   30% 
   6 Greko 8.2          2519   30   30   350   47%  2539   30% 
   7 Pawny 0.3.1        2488   24   24   600   45%  2520   21% 
   8 DoubleCheck 2.3.1  2397   33   32   300   59%  2335   22% 
   9 Sungorus 1.4       2357   22   22   675   44%  2410   19% 
  10 Jazz 5.01          2349   23   23   600   45%  2388   24% 
  11 DoubleCheck 2.3    2340   21   21   725   47%  2366   22% 
  12 Beowulf 2.4        2286   32   33   300   45%  2329   18% 
  13 GNU Chess 5.08     2191   37   38   250   30%  2346   18%
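For readers who want to relate the "score" and "oppo." columns to the Elo column, here is a minimal sketch using the standard logistic Elo formula. It is not how BayesElo computes the list: BayesElo fits all games jointly with its own draw and first-move-advantage parameters and a prior, so the figures below only approximate the table. The function names `expected_score` and `performance_rating` are made up for this illustration.

```python
import math

def expected_score(elo_diff):
    """Expected score for a player rated elo_diff points above the opponent
    (plain logistic Elo model, an assumption; not BayesElo's model)."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

def performance_rating(avg_opponent, score):
    """Rating at which the logistic model predicts the observed score
    against opposition of average rating avg_opponent (0 < score < 1)."""
    return avg_opponent + 400.0 * math.log10(score / (1.0 - score))

# Rows taken from the table above: (name, score, average opponent rating).
rows = [
    ("Umko 1.2 (x2)", 0.81, 2673),
    ("Sloppy 0.2.2", 0.49, 2634),
    ("GNU Chess 5.08", 0.30, 2346),
]
for name, score, oppo in rows:
    print(f"{name:16s} {performance_rating(oppo, score):7.0f}")
```

The estimates land near the listed ratings but not on them, and the gap is largest for lopsided scores such as Umko's 81%.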

Conditions
* only open source and portable programs: I'm not interested in proprietary and/or Windows-only programs. Ideally they are licensed under the GNU GPL; otherwise, no license, or a license whose copyright terms are not "excessive".
* 1 min + 1 sec increment: for any given amount of CPU time, it's better to play ten times as many games than to play games that are ten times longer.
* 64 MB hash, no EGTB: 64 MB is certainly enough for such rapid games. As for EGTB, any good program will show almost zero Elo increase with it.
* book: performance.bin by Marc Lacrosse, limited to 10 moves (20 half-moves).
* 64-bit versions only: I don't see any good reason to double the testing work by testing both the 32-bit and 64-bit versions of a given engine.
* interface: cutechess-cli. This is a command line interface, which has two benefits compared to GUIs:
- it allows multi-threaded testing. For example, if engines A and B don't have an SMP search, I can run 2 games in parallel on my 2-CPU hardware. When A and/or B are SMP, games must be played one at a time, so that the SMP engines can use both CPUs.
- it is very fast and doesn't cause programs to lose on time at such a quick time control.
* SMP-capable programs play with 2 CPUs: parallelizing the search algorithm is not a trivial task for engine developers, so it's only fair to give them that advantage over non-SMP programs.
* pondering off: using pondering with multi-threaded testing or multi-threaded programs is a bad idea, as the engine pondering may significantly reduce the CPU allocation of its opponent.
* Elo calculator: BayesElo, certainly better than EloStat for many reasons. The list is calibrated with Fruit 2.1 at 2700 Elo.
* no automatic resigning for "weak" engines: some programs are buggy and may not be able to win a dead won endgame, so they should be penalised accordingly. Of course, some engines (typically xboard ones) have a resign feature hardcoded in the program, so I let them resign as they please.
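As a rough illustration of the time-control argument above (more games rather than longer games), the sketch below estimates how the error margin of a rating shrinks with the number of games. It assumes independent games, a normal approximation and the plain logistic Elo curve. BayesElo computes its "+" and "-" columns differently, so this shows the trend only. The function name `elo_margin` and its parameters are hypothetical.

```python
import math

def elo_margin(score, draw_rate, games, z=1.96):
    """Approximate half-width, in Elo, of a confidence interval for an engine
    with the given mean score and draw rate after `games` games.
    Assumes independent games and a normal approximation."""
    win = score - draw_rate / 2.0
    loss = 1.0 - win - draw_rate
    # Per-game variance of the result (win = 1, draw = 0.5, loss = 0).
    variance = (win * (1.0 - score) ** 2
                + draw_rate * (0.5 - score) ** 2
                + loss * (0.0 - score) ** 2)
    std_error = math.sqrt(variance / games)
    # Slope of the logistic Elo curve at this score: d(Elo)/d(score).
    slope = 400.0 / (math.log(10) * score * (1.0 - score))
    return z * std_error * slope

# Same CPU budget spent two ways: few long games or ten times as many short ones.
for games in (100, 1000):
    print(games, round(elo_margin(0.5, 0.3, games), 1))
```

Under these assumptions the margin scales as 1/sqrt(games), so ten times as many games cuts it by a factor of about 3.2. The same trend is visible in the table, where the engines with 600 or more games have roughly half the error bars of the one with 200.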