Own Books tournament with learning and ponder

Spock · Post by **Spock** » Mon Apr 28, 2008 11:53 pm

Just for something completely different, I decided to run a tournament with
- own books
- learning on
- ponder on

i.e. just about the opposite of normal CCRL testing. It allows each engine to be played as the author intended, and if an engine really does benefit more than other engines from own books and learning, then we might see evidence of that. My rule for books was:

- own book
- if no own book, then a book recommended by the author, used to its full depth, e.g. Zappa here uses Perfect13.ctg
- if neither, then balanced-12.ctg is used. Toga is an example here

The field is:

Rybka 2.3.2a 64-bit     * RybkaII.ctg
Naum 3 64-bit           * "Tiny" NaumBook.bin
Zappa Mexico II 64-bit  * Perfect13.ctg (author recommended book)
Toga II 1.4 beta5c      * standard generic book, use Balanced-12.ctg
Spike 1.2 Turin         * built-in book
Glaurung 2.0.1 64-bit   * Book.bin
Shredder 11             * built-in book
Fritz 11                * Fritz11.ctg
Hiarcs 12               * built-in book
Deep Sjeng 2.7          * sjeng.bok
Junior 10               * Junior 10.ctg
Ktulu 8                 * KBook.bin
Chess Tiger 2007.1      * Ct.tbk

Conditions
----------
1CPU
ponder on
Deep Fritz 10 GUI
each engine 256MB hash
40/4 repeating
learning enabled where supported
chessbase book learning enabled

So far:
- Rybka is still clear number one. Own books from other engines still don't let them get close
- nothing to choose between Shredder 11, Hiarcs 12, Fritz 11 and Zappa Mexico II
- Junior 10, Toga and Naum pretty equal behind that bunch
- Glaurung, Spike and Sjeng pretty evenly matched below them
- Chess Tiger and Ktulu bringing up the rear

With an odd number of engines, it's difficult to find a point in time when all engines have played an equal number of games. So far the tournament rankings are:

 100.0/130  Rybka 2.3.2a x64
  78.0/130  Fritz 11
  78.0/130  Shredder 11
  76.0/131  Hiarcs 12
  75.0/130  Zappa Mexico II x64
  69.5/130  Naum 3 x64
  69.0/130  Toga II 1.4 beta5c
  66.0/130  Junior 10
  50.5/129  Glaurung 2.0.1 x64
  50.0/130  Spike 1.2 Turin
  49.0/130  Deep Sjeng 2.7
  44.0/130  Chess Tiger 2007.1
  40.0/130  Ktulu 8
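The uneven game counts above (129, 130, 131) are what you would expect from a round robin with 13 engines: one engine has to sit out each round. Here is a minimal sketch of that effect using the standard circle method. How the Fritz GUI actually schedules games is not stated here, so the function name `round_robin` and the bye handling are illustrative assumptions only.

```python
def round_robin(engines):
    """Yield rounds of pairings using the circle method.

    With an odd number of engines a placeholder (None) is added,
    so exactly one engine has a bye in every round.
    """
    players = list(engines)
    if len(players) % 2:
        players.append(None)  # None marks the bye slot
    n = len(players)
    for _ in range(n - 1):
        pairs = []
        for i in range(n // 2):
            a, b = players[i], players[n - 1 - i]
            if a is not None and b is not None:
                pairs.append((a, b))
        yield pairs
        # keep the first entry fixed, rotate the rest by one position
        players = [players[0], players[-1]] + players[1:-1]


if __name__ == "__main__":
    engines = [f"Engine{i}" for i in range(1, 14)]  # 13 hypothetical names
    for number, pairs in enumerate(round_robin(engines), start=1):
        playing = {e for pair in pairs for e in pair}
        resting = [e for e in engines if e not in playing]
        print(f"round {number}: {len(pairs)} games, bye: {resting[0]}")
```

With 13 engines each cycle has 13 rounds of 6 games, so the engines only draw level on games played at the end of a complete cycle.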

Of particular note are some of the scores against Rybka:
Rybka vs Shredder 11 9.5 - 0.5
Rybka vs Hiarcs 12 6.5 - 4.5

OK, not many games so far, but in an actual tournament like Leiden, Paderborn etc., if an engine is to stand any chance of winning, it must at a minimum have a reasonable chance of a draw against Rybka. So if Shredder's performance here is anything to go by, a win in such a tournament would be very difficult for it. Hiarcs on the other hand stands a much better chance (in my opinion...)

I have also run the PGN for games to date through ELOstat and I get the following (start value 2,800 for no particular reason):

   Program                          Elo    +   -   Games   Score   Av.Op.  Draws

  1 Rybka 2.3.2a 64-bit            : 2992   52  50   130    76.9 %   2783   35.4 %
  2 Fritz 11                       : 2864   48  47   130    60.0 %   2794   38.5 %
  3 Shredder 11 UCI                : 2863   49  49   130    60.0 %   2793   35.4 %
  4 Zappa Mexico II 64-bit         : 2851   46  46   130    57.7 %   2797   41.5 %
  5 HIARCS 12 SP                   : 2851   47  47   131    58.0 %   2794   39.7 %
  6 Naum 3 64-bit                  : 2823   47  47   130    53.5 %   2799   39.2 %
  7 Toga II 1.4 beta5c             : 2820   47  47   130    53.1 %   2799   38.5 %
  8 Junior 10                      : 2805   51  51   130    50.8 %   2799   29.2 %
  9 Glaurung 2.0.1 64-bit          : 2727   50  50   129    39.1 %   2803   33.3 %
 10 Spike 1.2 Turin                : 2725   47  47   130    38.5 %   2806   40.0 %
 11 Deep Sjeng 2.7                 : 2719   47  47   130    37.7 %   2806   40.0 %
 12 Chess Tiger 2007.1             : 2691   44  45   130    33.8 %   2808   46.2 %
 13 Ktulu 8                        : 2669   52  53   130    30.8 %   2810   29.2 %

Results won't mean much until I get more games...
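For anyone wondering how the Elo column relates to the Score and Av.Op. columns: the figures in the table appear consistent with the usual logistic Elo relation applied to the score percentage and the average opponent rating. The sketch below is only that standard formula, not necessarily what ELOstat does internally, and the function name `elo_from_score` is made up for illustration.

```python
import math


def elo_from_score(score_fraction, avg_opponent):
    """Estimate a rating from a score fraction (0 < s < 1) and the
    average opponent rating, using the standard logistic Elo relation.

    Assumption: this is the textbook formula, not ELOstat's documented method.
    """
    diff = 400 * math.log10(score_fraction / (1 - score_fraction))
    return avg_opponent + diff


if __name__ == "__main__":
    # Score % and Av.Op. values taken from the table above
    for name, score, avg_op in [
        ("Rybka 2.3.2a 64-bit", 0.769, 2783),
        ("Fritz 11", 0.600, 2794),
        ("Ktulu 8", 0.308, 2810),
    ]:
        print(f"{name}: {elo_from_score(score, avg_op):.0f}")
```

The table's percentages are rounded to one decimal place, so agreement to within a point or so is the most that can be expected.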

Ovyron · Post by **Ovyron** » Tue Apr 29, 2008 12:43 am

Thank you for running this.

Interesting fact is that Naum 3 fell from possible #2 to possible #6 place.

Graham Banks · Post by **Graham Banks** » Tue Apr 29, 2008 12:45 am

Ovyron wrote:Thank you for running this.

Interesting fact is that Naum 3 fell from possible #2 to possible #6 place.

I wonder how much of this is due to its tiny book. It would be interesting to know the sizes of the various books.
Although in theory, shouldn't a small but good book benefit more from learning?
Not my area of expertise which is why I pose the question.

Ovyron · Post by **Ovyron** » Tue Apr 29, 2008 12:58 am

Yes, in the case of a really bad own book, I'd like the idea of finding a good book that fits the engine (best book > own book.)

Spock · Post by **Spock** » Thu May 01, 2008 10:26 pm

Tournament update:

 135.0/179  Rybka 2.3.2a x64
 109.0/179  Fritz 11
 109.0/179  Zappa Mexico II x64
 104.0/180  Hiarcs 12
 101.0/179  Naum 3 x64
  99.5/179  Shredder 11
  94.0/179  Toga II 1.4 beta5c
  87.0/179  Junior 10
  70.5/179  Spike 1.2 Turin
  69.0/179  Glaurung 2.0.1 x64
  64.0/179  Deep Sjeng 2.7
  63.5/179  Chess Tiger 2007.1
  58.5/179  Ktulu 8

Shredder has slipped back now, and Zappa is making a run for 2nd spot.
Considering Zappa is supposedly not that good at blitz, and is only using a generic book (Perfect13), it is doing very well indeed.

One comment about Shredder: according to Stefan its book learning feature only works in its own GUI, so it's not working here under the ChessBase GUI. But as for learning, my suspicion is that an engine probably needs hundreds or more likely thousands of games before learning starts to have an effect. But I could be wrong, pure guesswork

Spock · Post by **Spock** » Fri May 02, 2008 12:02 am

A few games later, after each has 180 games:

    Program                          Elo    +   -   Games   Score   Av.Op.  Draws

  1 Rybka 2.3.2a 64-bit            : 2976   44  43   180    75.0 %   2785   34.4 %
  2 Zappa Mexico II 64-bit         : 2870   41  40   180    60.8 %   2794   38.3 %
  3 Fritz 11                       : 2868   40  40   180    60.6 %   2794   40.0 %
  4 HIARCS 12 SP                   : 2850   41  40   180    57.8 %   2795   37.8 %
  5 Naum 3 64-bit                  : 2843   41  41   180    56.7 %   2796   36.7 %
  6 Deep Shredder 11 UCI           : 2837   42  42   180    55.8 %   2796   33.9 %
  7 Toga II 1.4 beta5c             : 2817   41  41   180    52.8 %   2798   35.6 %
  8 Deep Junior 10                 : 2789   44  44   180    48.3 %   2800   25.6 %
  9 Spike 1.2 Turin                : 2733   41  42   180    39.7 %   2805   35.0 %
 10 Glaurung 2.0.1 64-bit 1CPU     : 2723   43  44   180    38.3 %   2806   30.0 %
 11 Deep Sjeng 2.7                 : 2708   42  43   180    36.1 %   2807   33.3 %
 12 Chess Tiger 2007.1             : 2704   39  39   180    35.6 %   2807   42.2 %
 13 Ktulu 8                        : 2682   46  46   180    32.5 %   2809   25.0 %
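The error margins have narrowed a little compared with the 130-game list. As a rough rule of thumb, and assuming the margin shrinks with the square root of the number of games (a common approximation, not something ELOstat is confirmed to use), the change can be sketched like this. The function name `scaled_margin` is hypothetical.

```python
import math


def scaled_margin(margin, games_before, games_after):
    """Rescale an error margin assuming it is proportional to 1/sqrt(games).

    Assumption: simple square-root scaling; draw rates and opponent
    strength are ignored.
    """
    return margin * math.sqrt(games_before / games_after)


if __name__ == "__main__":
    # Rybka's upper margin was 52 after 130 games
    print(f"{scaled_margin(52, 130, 180):.1f}")
    # Games needed to halve a margin under the same assumption: four times as many
    print(130 * 4)
```

Under that assumption, halving the margins from the first list would take roughly four times as many games per engine.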

Tony Thomas · Post by **Tony Thomas** » Fri May 02, 2008 7:25 am

Graham Banks wrote:
Ovyron wrote:Thank you for running this.

Interesting fact is that Naum 3 fell from possible #2 to possible #6 place.
I wonder how much of this is due to its tiny book. It would be interesting to know the sizes of the various books.
Although in theory, shouldn't a small but good book benefit more from learning?
Not my area of expertise which is why I pose the question.

Don't misunderstand the name. The Tiny book isn't exactly tiny (43MB); it's the book that uses the highest quality games. Of course the other books are much bigger, but I doubt that lower quality games will help Naum get into better positions.

Graham Banks · Post by **Graham Banks** » Fri May 02, 2008 11:04 am

Tony Thomas wrote:
Graham Banks wrote:
Ovyron wrote:Thank you for running this.

Interesting fact is that Naum 3 fell from possible #2 to possible #6 place.
I wonder how much of this is due to its tiny book. It would be interesting to know the sizes of the various books.
Although in theory, shouldn't a small but good book benefit more from learning?
Not my area of expertise which is why I pose the question.
Don't misunderstand the name. The Tiny book isn't exactly tiny (43MB); it's the book that uses the highest quality games. Of course the other books are much bigger, but I doubt that lower quality games will help Naum get into better positions.

Thanks for that information Tony.

Bill Rogers · Post by **Bill Rogers** » Fri May 02, 2008 8:53 pm

Hi Ray
Are you going to make the PGNs for the games available for all of us?
Thanks
Bill

Spock · Post by **Spock** » Fri May 02, 2008 9:09 pm

Bill Rogers wrote:Hi Ray
Are you going to make the PGNs for the games available for all of us?
Thanks
Bill

Yes, I can make them available from the CCRL public forum. I didn't think anyone would be interested in studying blitz games, more just the actual results
