LS-rankinglist: restart with double thinking-time

pohl4711
Posts: 2434
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

LS-rankinglist: restart with double thinking-time

Post by pohl4711 »

The LS-rankinglist (LightSpeed-rankinglist) has now been restarted with more than double the thinking time (45''+500ms instead of 20''+250ms).

Intel i7-2630QM (SSE42 support, Windows 7 64-bit, 2 GHz quad-core, FritzMark=20.2), 64 MB hash, 1 core per engine (Hyperthreading off), no ponder, no endgame tablebases, no resign. 500 selected opening positions (all 8 moves deep, from the Frank Q. database).
Elos calculated with bayeselo (mm 0 1), anchored to Robbolito 0.085g3 = 3000 Elo. LittleBlitzerGUI in gauntlet mode only, because in round-robin mode this GUI picks the opening positions from the PGN file at random.
Time: 45''+500ms Fischer bonus (= 85-90 seconds per game/engine).
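
The 85-90 seconds per game and engine follow from the time control if a game lasts roughly 80-90 moves per side. A minimal Python sketch of that arithmetic; the move counts are assumptions for illustration, not measured values from the list, and the total is an upper bound because an engine need not use all of its clock:

```python
def time_budget(base_s, increment_s, moves):
    """Upper bound on the clock time one engine can use in a game.

    base_s      -- base time in seconds (45 for this list)
    increment_s -- Fischer bonus per move in seconds (0.5 for this list)
    moves       -- number of moves played by this engine (assumed value)
    """
    return base_s + increment_s * moves


def average_per_move(base_s, increment_s, moves):
    """Average time per move if the whole budget is spent."""
    return time_budget(base_s, increment_s, moves) / moves


if __name__ == "__main__":
    for moves in (80, 90):  # assumed game lengths
        total = time_budget(45, 0.5, moves)
        avg = average_per_move(45, 0.5, moves)
        print(f"{moves} moves: {total:.0f} s total, {avg:.2f} s per move")
```

With these assumed game lengths the average comes out at about one second per move.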

LS-rankinglist with best engine-versions only (no betas, no settings, no development-versions):


Rank Name                    Elo    +    - games score oppo. draws 
   1 Houdini 2.0c x64       3104    5    5 10000   62%  3016   43% 
   2 Critter 1.6a x64       3071    5    5 10000   57%  3019   52% 
   3 Strelka 5.5 x64        3070    5    5 10000   57%  3020   52% (singlecore)
   4 Komodo 5 x64           3058    5    5 10000   55%  3021   44% (singlecore)
   5 Ivanhoe 46h x64        3020    5    5 10000   49%  3024   54% (best open source)
   6 Robbolito 0.10 x64s    3018    5    5 10000   49%  3025   56% 
   7 Rybka 4.1 x64s         3012    5    5 10000   48%  3025   47% 
   8 Robbolito 0.085g3 x64  3000    5    5 10000   46%  3026   53% (singlecore)(Ippolit 2009)
   9 Stockfish 2.2.2 x64s   2994    5    5 10000   45%  3027   45% 
  10 Saros 3.0 x64          2988    5    5 10000   44%  3028   47% 
  11 Bouquet 1.4 x64s       2930    5    5 10000   35%  3034   44% 
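
As a rough consistency check, the "score" column can be compared with the expected score that the standard logistic Elo curve gives for an engine's Elo against its average opponent Elo ("oppo."). This is only a sketch: bayeselo fits its own model that also accounts for draws, and using the average opponent rating is an approximation.

```python
def expected_score(elo, opponent_elo):
    """Expected score from the standard logistic Elo curve."""
    return 1.0 / (1.0 + 10.0 ** ((opponent_elo - elo) / 400.0))


# (name, Elo, average opponent Elo, listed score) copied from the table above
ROWS = [
    ("Houdini 2.0c x64", 3104, 3016, 0.62),
    ("Robbolito 0.085g3 x64", 3000, 3026, 0.46),
    ("Bouquet 1.4 x64s", 2930, 3034, 0.35),
]

if __name__ == "__main__":
    for name, elo, oppo, listed in ROWS:
        predicted = expected_score(elo, oppo)
        print(f"{name:24s} listed {listed:.0%}  logistic {predicted:.1%}")
```

For these three rows the logistic value agrees with the listed score to within one percentage point.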

The complete LS-rankinglist:


Rank Name                    Elo    +    - games score oppo. draws 
   1 Houdini 2.0c x64       3105    5    5 10000   62%  3017   43% 
   2 Houdini 1.5a x64       3084    5    5 10000   59%  3017   44% (best freeware (multicore))
   3 Critter 1.6a x64       3073    5    5 11000   57%  3026   52% 
   4 Strelka 5.5 x64        3073    5    5 11000   57%  3026   52% (singlecore)
   5 Komodo 5 x64           3060    5    5 11000   55%  3028   43% (singlecore)
   6 Ivanhoe 46h x64        3021    5    5 11000   49%  3031   54% (best open source)
   7 Robbolito 0.10 x64s    3019    5    5 11000   48%  3031   55% 
   8 Rybka 4.1 x64s         3013    5    5 11000   47%  3032   46% 
   9 Robbolito 0.085g3 x64  3000    5    5 11000   45%  3033   53% (singlecore)(Ippolit 2009)
  10 Stockfish 2.2.2 x64s   2996    5    5 11000   45%  3033   44% 
  11 Saros 3.0 x64          2988    5    5 11000   43%  3034   47% 
  12 Bouquet 1.4 x64s       2931    5    5 11000   35%  3039   43% 
(x64=64bit version, x64s=64bit SSE42-version)
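
The "+" and "-" columns of about 5 Elo are plausible for 10000-11000 games per engine. The sketch below estimates an error margin from the score, the draw rate and the number of games. It assumes independent games, a normal approximation, a 95% interval and the plain logistic Elo curve; bayeselo computes its intervals with its own method, so this is not a reproduction of it.

```python
import math


def elo_margin(score, draw_rate, games, z=1.96):
    """Approximate Elo error margin for one engine.

    score     -- fraction of points scored (e.g. 0.62)
    draw_rate -- fraction of drawn games (e.g. 0.43)
    games     -- number of games played
    z         -- normal quantile; 1.96 corresponds to an assumed 95% interval
    """
    win_rate = score - draw_rate / 2.0
    # Per-game variance of the result (win = 1, draw = 0.5, loss = 0).
    variance = win_rate + draw_rate / 4.0 - score ** 2
    std_error = math.sqrt(variance / games)
    # Derivative of Elo difference with respect to score on the logistic curve.
    slope = 400.0 / (math.log(10) * score * (1.0 - score))
    return z * std_error * slope


if __name__ == "__main__":
    # Values copied from the complete list above.
    print(f"Houdini 2.0c: +/- {elo_margin(0.62, 0.43, 10000):.1f} Elo")
    print(f"Bouquet 1.4:  +/- {elo_margin(0.35, 0.43, 11000):.1f} Elo")
```

Both rows come out close to 5 Elo, which matches the margins shown in the list.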


deleted betas, development-versions, settings: none
aborted test-gauntlets (because of too bad results): none


If you want the games from the LS-list, send me your e-mail address by PM. I will send the games (PGN file) as soon as possible.

Greetings – Stefan
ThatsIt
Posts: 991
Joined: Thu Mar 09, 2006 2:11 pm

Re: LS-rankinglist: restart with double thinking-time

Post by ThatsIt »

Wow!, average (nearly) 1 (one) second per move !
Not too bad ... ermm, too short.
pohl4711
Posts: 2434
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

Re: LS-rankinglist: restart with double thinking-time

Post by pohl4711 »

1 second is a very long time on a modern computer. In the middlegame the time per move is around 2-3 seconds, and in the endgame about 0.5 seconds. The 1 second per move is the average displayed by the LittleBlitzerGUI (over all moves in all games played by one engine).

Best - Stefan
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: LS-rankinglist: restart with double thinking-time

Post by lkaufman »

pohl4711 wrote:1 second is a very long time on a modern computer. And in the middlegame the time per move is around 2-3 seconds and 0.5 seconds in endgame. 1 second per move is the average time per move displayed by the LittleBlitzerGUI (for all moves in all games played by one engine).

Best - Stefan
I'm glad you doubled the time limit; it is now at least about what people call "bullet chess". I note that your computer is only 2 GHz, so it's still faster than bullet chess on any recent model computer. Your list will be pretty accurate for comparing similar engines, for example all Ippo-related engines or different versions of the same engine, but it will not be a good predictor of results at slower time controls when comparing dissimilar engines (e.g. Komodo or Stockfish vs. Ippos). Our latest (unreleased) Komodo on Windows is clearly stronger than H 1.5 (and probably equal to 2.0) at the 5' level, but when I test at a level similar to the one you are using we are still 9 Elo behind H 1.5 (after over 5000 games), though a PGO compile should close this gap. We still have no explanation for the extreme bullet strength of these Ippo-cousins, but it fades rapidly with longer time limits.
pohl4711
Posts: 2434
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

Re: LS-rankinglist: restart with double thinking-time

Post by pohl4711 »

lkaufman wrote: Your list will not be a good predictor of slower results
Compared with my old LS-rankinglist (20''+250ms), only Stockfish scores really better with the doubled time in the new LS-list (around +20 Elo), and only Houdini 2.0c scores weaker (-14 Elo). Houdini 1.5a's score and Elo did not change with the doubled time. Komodo 5 is only around 8-10 Elo stronger with the doubled time.
So I believe that only Houdini 2.0c gets stronger and stronger with less thinking time, but not Houdini 1.5a (perhaps because Houdini 2 plays more aggressively than Houdini 1.5a?). And only Stockfish gets really stronger with more thinking time, and Komodo a little bit.

Best - Stefan
Houdini
Posts: 1471
Joined: Tue Mar 16, 2010 12:00 am

Re: LS-rankinglist: restart with double thinking-time

Post by Houdini »

pohl4711 wrote:
lkaufman wrote: Your list will not be a good predictor of slower results
Compared with my old LS-rankinglist (20''+250ms) only Stockfish scores really better with the double time in the new LS-list (around +20 Elo). And only Houdini 2.0c scores weaker (-14 Elo). Houdini 1.5a score and elo did not change with doubled time. Komodo 5 is only around 8-10 Elo stronger with doubled time.
So I believe, that only Houdini 2.0c gets stronger and stronger with less thinking time,but not Houdini 1.5a (perhaps because Houdini 2 plays more aggressive than Houdini 1.5a ?!?). And only Stockfish gets really stronger with more thinking time and Komodo a little bit.

Best - Stefan
Your 20"+250ms result for Houdini 2.0c was clearly a statistical outlier.
I play even faster games (e.g. 8"+80ms) on weaker hardware, and my results are very much in line with your current 45"+500ms results: Houdini 2.0c is about 20 to 25 Elo stronger than Houdini 1.5a.

It shows how dangerous it is to extrapolate from a limited number of data points; most of the simple conclusions tend to be incorrect.

Robert
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: LS-rankinglist: restart with double thinking-time

Post by lkaufman »

pohl4711 wrote:
lkaufman wrote: Your list will not be a good predictor of slower results
Compared with my old LS-rankinglist (20''+250ms) only Stockfish scores really better with the double time in the new LS-list (around +20 Elo). And only Houdini 2.0c scores weaker (-14 Elo). Houdini 1.5a score and elo did not change with doubled time. Komodo 5 is only around 8-10 Elo stronger with doubled time.
So I believe, that only Houdini 2.0c gets stronger and stronger with less thinking time,but not Houdini 1.5a (perhaps because Houdini 2 plays more aggressive than Houdini 1.5a ?!?). And only Stockfish gets really stronger with more thinking time and Komodo a little bit.

Best - Stefan
That agrees pretty well with my findings. But even a swing of ten Elo for a doubling is not insignificant: it would imply that Komodo would gain 20 Elo relative to H 1.5, and a lot more relative to H2, with two more doublings, which would be in the range of normal blitz.
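
The extrapolation can be written out as a small Python sketch. It assumes a constant gain per doubling of the time control, which is the simplification made in this post and not an established law. The starting gap of -24 Elo is Komodo 5 minus Houdini 1.5a from the complete LS-list (3060 vs. 3084), and the 10 Elo per doubling is the upper end of the 8-10 Elo reported earlier in the thread.

```python
def extrapolate(base_s, inc_s, gap_elo, gain_per_doubling, doublings):
    """Linear extrapolation of an Elo gap over repeated time doublings.

    Returns (base time in s, increment in s, projected Elo gap).
    Assumes the gain per doubling stays constant, which is a simplification.
    """
    factor = 2 ** doublings
    projected_gap = gap_elo + gain_per_doubling * doublings
    return base_s * factor, inc_s * factor, projected_gap


if __name__ == "__main__":
    base, inc, gap = extrapolate(45, 0.5, -24, 10, 2)
    print(f"{base} s + {inc} s per move, projected gap {gap:+d} Elo")
```

Two doublings of 45''+500ms give 180''+2'', i.e. roughly 3'+2'' blitz, and under the constant-gain assumption the gap shrinks by 20 Elo.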