Ratinglist based on positional openingpositions

Yarget · Post by **Yarget** » Mon Feb 04, 2008 12:45 pm

Hello everyone!

As I wrote earlier in this thread, I have taken a break from my test of Bright 0.2c because I want to test Toga II 1.4beta5c 2 CPU as quickly as possible. I have now finished the positional games and the result is very good for Toga. Here is the updated positional rating list:

    Program                          Elo    +   -   Games   Score   Av.Op.  Draws

  1 Rybka 2.3.2a mp 32-bit         : 2935   42  41   200    70.2 %   2786   31.5 %
  2 Toga II 1.4 beta5c             : 2847   39  39   200    57.5 %   2795   35.0 %
  3 Deep Shredder 11 UCI           : 2830   40  39   200    54.8 %   2797   33.5 %
  4 Deep Fritz 10                  : 2827   40  40   200    54.2 %   2797   31.5 %
  5 Zap!Chess Zanzibar             : 2800   39  39   200    50.0 %   2800   34.0 %
  6 LoopMP 11A.32                  : 2779   38  38   200    46.8 %   2802   37.5 %
  7 Deep Junior 10.1               : 2776   44  44   200    46.2 %   2802   18.5 %
  8 SpikeMP 1.2 Turin              : 2776   39  40   200    46.2 %   2802   33.5 %
  9 HIARCS 11.1 MP UCI             : 2773   39  39   200    45.8 %   2802   35.5 %
 10 Naum 2.2                       : 2765   37  38   200    44.5 %   2803   40.0 %
 11 Glaurung 2.0.1                 : 2693   41  41   200    33.8 %   2810   31.5 %
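
As a rough cross-check of the list above: the Score column is close to what the standard logistic Elo formula predicts from the gap between Elo and Av.Op. The sketch below assumes that formula (the rating tool used for the list is not stated here, so its exact model may differ slightly), and `expected_score` is only an illustrative helper name.

```python
def expected_score(elo: float, avg_opp: float) -> float:
    """Expected score (0..1) under the standard logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((avg_opp - elo) / 400.0))

# (Elo, Av.Op., Score %) copied from three rows of the list above
rows = {
    "Rybka 2.3.2a mp 32-bit": (2935, 2786, 70.2),
    "Zap!Chess Zanzibar": (2800, 2800, 50.0),
    "Glaurung 2.0.1": (2693, 2810, 33.8),
}

for name, (elo, avg_opp, listed) in rows.items():
    predicted = 100.0 * expected_score(elo, avg_opp)
    print(f"{name:24s} listed {listed:5.1f} %  predicted {predicted:5.1f} %")
```

For these rows the predicted and listed scores should agree to within a few tenths of a percent, which suggests the three columns are consistent with each other.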

I expected Toga to perform strongly, but not that strongly. Again I have compared my test results with the CEGT 40/4 rating list, and I have calculated that Toga's performance is 25.9 rating points above the expected value. Very impressive (only Junior, Fritz and Rybka have done better than that).

The gambit games are running now and it seems that Toga is struggling a bit more. I'll post a new update when the gambit games are finished.

Regards
Per

Yarget · Post by **Yarget** » Mon Feb 04, 2008 12:47 pm

Sorry if I missed it, but were the games posted somewhere?

Send me a PM with your e-mail address and I'll send you the games.

Regards
Per

maxchgr · Post by **maxchgr** » Tue Feb 05, 2008 6:49 am

these results are completely counterintuitive
the 'positional' engines are doing well in gambit games

maxchgr · Post by **maxchgr** » Tue Feb 05, 2008 7:13 am

Like I have stated earlier in this thread Junior is in many ways an extreme/sensible engine so I'm not surprised by this result.

I can't make sense of it. What do you mean by extreme/sensible? I'm having trouble applying sensible to an engine, what does it mean? And aren't extreme and sensible kind of.. opposite words?

thanks for your clarification

Yarget · Post by **Yarget** » Tue Feb 05, 2008 8:47 am

I can't make sense of it. What do you mean by extreme/sensible? I'm having trouble applying sensible to an engine, what does it mean?

Well, let me try to clarify what I mean. In the first post in this thread I wrote the following, which explains the idea behind these tests:

As some of you might remember, I used to do the MP tests for the former CSS rating list. This rating list was based on fixed opening positions, and engines were not allowed to use any kind of opening book. I still remember that especially Deep Junior 10 was performing extremely well in certain closed openings like the English (opening position after: 1. c2-c4 c7-c5 2. Nb1-c3 Nb8-c6 3. g2-g3 g7-g6 4. Bf1-g2 Bf8-g7 5. e2-e4 e7-e5) while performing less well in other, more often "open", openings. Inspired by this I got the idea for the current project, which I started a couple of weeks ago.

In my opinion an engine is "extreme" if it scores very high in some openings (even better than Rybka) while performing very badly in others. This is indeed true for Junior, and I think that my new tests confirm this. Perhaps the word "sensible" is better, meaning that the playing performance of Junior is very dependent on the opening and the type of position. The "sensibility" of Junior also shows in several rating lists. When (Deep) Junior 10 uses a common engine book, its playing strength is clearly behind engines like Fritz 10, Zanzibar, Shredder 10 and Hiarcs 11 (check the lists at CEGT and CCRL). However, if Junior is allowed to play with its own well-tuned book, then it's another story, as the SSDF rating list shows:

http://ssdf.bosjo.net/list.htm

Only Hiarcs is then in front of Junior, and only by a few points.

What is the explanation for the "sensibility" of Junior? Well, first of all I would say that Deep Junior 10.1 is a sharp engine. Just take a look at the draw frequency in the positional rating list: below 20% for Junior is very remarkable when you compare with the other engines. What IMO makes Junior very special (and one of the reasons it does so well in these rather closed and positional openings) is its habit of making "positional" sacrifices that have a long-term aim. Especially in very closed positions such sacrifices can be very effective. Regarding Junior and sacrifices, Steven Lopez recently wrote this: "Almost any chess engine will sacrifice material for an immediate gain; for example, if a Queen sacrifice results in a forced mate-in-two, you'll see a chess engine sac the Queen with no problem. Junior, though, will sometimes sacrifice minor material to clear a line or to otherwise free its game, which is something almost unheard of among chessplaying programs." I agree 100% with Steven and it's worth reading his article here:

http://www.chessbase.com/newsdetail.asp?newsid=4357
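
To put Junior's draw frequency into game counts: the Games, Score and Draws columns of the positional list are enough to recover approximate win/draw/loss totals. This is only a sketch; it assumes the printed percentages were rounded from whole games and half points, and `wdl_from_row` is a name made up for the example.

```python
def wdl_from_row(games: int, score_pct: float, draw_pct: float):
    """Recover approximate (wins, draws, losses) from a rating-list row.

    Assumes the percentages were rounded from exact game counts.
    """
    half_points = round(score_pct / 100 * games * 2)  # total score in half points
    draws = round(draw_pct / 100 * games)
    wins = (half_points - draws) // 2                 # each win is two half points
    losses = games - wins - draws
    return wins, draws, losses

# Deep Junior 10.1 in the positional list: 200 games, 46.2 %, 18.5 % draws
print("Junior:", wdl_from_row(200, 46.2, 18.5))
# Naum 2.2 for comparison: 200 games, 44.5 %, 40.0 % draws
print("Naum:  ", wdl_from_row(200, 44.5, 40.0))
```

Under these assumptions Junior has far more decisive games than Naum, even though the two have similar overall scores.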

Regarding the "sensibility" of engines: I have calculated the rating difference (between the positional rating list and the gambit rating list) for each engine (the larger the number, the more "sensible" the engine is):

1-2. Junior & Spike: 100 rating points each!
3. Naum: 54 rating points
4. Hiarcs: 48 rating points
5. Glaurung: 40 rating points
6. Rybka: 28 rating points
7. Loop: 24 rating points
8. Fritz: 4 rating points
9. Shredder: 2 rating points
10. Zap: 0 rating points!

In other words: for Zap it doesn't matter at all whether it plays the gambits or the positional games while engines like Junior and Spike are very "sensible" engines. Like I have stated earlier in this thread Junior is in many ways an extreme/sensible engine so I'm not surprised by this result.
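
The figure in the list above is presumably just the absolute gap between an engine's rating in the two lists. A minimal sketch of that calculation follows. The engine names and ratings in it are made-up placeholders (the gambit list is not reproduced in this post), and `sensitivity_ranking` is a hypothetical helper, not part of any rating tool.

```python
def sensitivity_ranking(positional: dict, gambit: dict) -> list:
    """Rank engines by |positional Elo - gambit Elo|, largest gap first."""
    common = positional.keys() & gambit.keys()  # engines present in both lists
    gaps = {name: abs(positional[name] - gambit[name]) for name in common}
    # Sort by gap (descending), then by name so ties come out in a stable order
    return sorted(gaps.items(), key=lambda item: (-item[1], item[0]))

# Placeholder ratings for illustration only, NOT real test results
positional = {"Engine A": 2800, "Engine B": 2750, "Engine C": 2700}
gambit = {"Engine A": 2800, "Engine B": 2850, "Engine C": 2724}

for rank, (name, gap) in enumerate(sensitivity_ranking(positional, gambit), 1):
    print(f"{rank}. {name}: {gap} rating points")
```

Using the absolute value means the ranking says nothing about which type of opening an engine prefers, only how much its rating moves between the two lists.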

Finally, let me emphasize that the rating lists I have produced are quite small, so big, firm conclusions shouldn't be drawn from my tests. However, these tests might provide some indications regarding the preferred type of positions for a number of engines.

Regards
Per

Marek Soszynski · Post by **Marek Soszynski** » Tue Feb 05, 2008 12:52 pm

these results are completely counterintuitive
the 'positional' engines are doing well in gambit games

One way to make sense of this is to understand that there is no need to play sharply or speculatively in a position that is already "wild" or complex and tactical. Similarly, on balance there is no winning advantage in playing positionally in a position that remains "quiet" or simple and strategical.

This explains the apparent paradox that quiet engines do better in wild positions, whereas wild engines do better in quiet positions.

Marek Soszynski · Post by **Marek Soszynski** » Tue Feb 05, 2008 1:14 pm

Per,

You are performing some very interesting and valuable tests.

Unfortunately, extreme, sensible and sensibility are certainly not the best terms to use. You also use the word sharp to describe an engine (Junior) that has a low draw-frequency. In that case, what word would you use to describe an engine that has a relatively low draw-frequency but whose playing style is the complete opposite?

Tony Thomas · Post by **Tony Thomas** » Tue Feb 05, 2008 1:23 pm

Instead of sensible, he can use the word sensitive. Sensitive as in: Junior is sensitive to the opening it plays; if it plays an opening it doesn't like, it pretty much loses most of the time.

Yarget · Post by **Yarget** » Tue Feb 05, 2008 2:35 pm

I admit that by using several words like sensitive, sensible, extreme, sharp and so on I have caused some confusion. However, I hope the point is clear: the playing strength of Junior in particular depends on the type of opening played.

From now on I'll only use the word suggested by Tony: sensitive.

Regards
Per

Uri Blass · Post by **Uri Blass** » Tue Feb 05, 2008 2:40 pm

Yarget wrote: I admit that by using several words like sensitive, sensible, extreme, sharp and so on I have caused some confusion. However, I hope the point is clear: the playing strength of Junior in particular depends on the type of opening played.

From now on I'll only use the word suggested by Tony: sensitive.

Regards
Per

I think it may also be interesting to know whether some openings are more sensitive to time.

In other words, openings in which extra time matters more, and openings in which time is relatively unimportant because evaluation matters more than search.

Uri
