Ratinglist based on positional openingpositions

Yarget · Post by **Yarget** » Sat Jan 12, 2008 8:16 pm

Hello everyone!

As some of you might remember, I used to do the MP-tests for the former CSS Ratinglist. This ratinglist was based on fixed openingpositions, and engines were not allowed to use any kind of openingbook. I still remember that Deep Junior 10 in particular performed extremely well in certain closed openings like the English (openingposition after 1. c2-c4 c7-c5 2. Nb1-c3 Nb8-c6 3. g2-g3 g7-g6 4. Bf1-g2 Bf8-g7 5. e2-e4 e7-e5) while performing less well in other (mostly "open") openings. Inspired by this, I got the idea for the current project, which I started a couple of weeks ago.

I have selected 10 fixed openingpositions that I would describe as positional. Most of them are very closed openingpositions (like the above-mentioned English, Benoni, Stonewall and closed King's Indian, to mention some of them), and common to all 10 openingpositions is that sharp and tactical play is not "just around the corner". It's more about "long" knight manoeuvres, pushing the pawns at the right moment after careful preparation, optimizing small advantages and so on. Needless to say, tactics and combinations can and very often will occur in these games, but again: they are not likely to happen before the middlegame or, more often, the late middlegame.

In contrast to this, I have selected 10 fixed openingpositions that consist of (very often sharp) gambits like the King's Gambit, Nordic Gambit, Morra Gambit and Blackmar-Diemer, to mention a few. The aim of all these tests is to determine which engines "prefer" the sharp gambit openings, which "prefer" the closed, positional ones and which don't mind either way. It should be emphasized that these tests won't result in firm conclusions stating that Engine X is a positional one or the opposite. Reaching such conclusions requires more openingpositions, more games etc., much more than one person can do on one PC. However, these tests might provide some indications regarding the preferred type of positions for a number of engines. Here are the exact testconditions:

Windows XP Pro 32 bit
(Deep) Fritz 10 GUI
AMD Athlon 64 X2 4200
128 MB Hashtables for each engine
3-4-5 Tablebases (32 MB cache)
Ponder OFF
Timecontrol: 40/4 repeatedly (4 minutes for 40 moves)
Books: No books allowed, engines play on their own from the startpoint of each openingposition
Games: Each engine is playing each openingposition against all opponents with both white and black meaning that each enginematch will consist of 20 games
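
The game counts follow directly from this round-robin format. A quick sketch of the arithmetic, assuming a field of 10 engines as in the first list (the variable names are only illustrative):

```python
from itertools import combinations

ENGINES = 10     # engines in the round robin (size of the first list)
POSITIONS = 10   # fixed openingpositions per test
COLOURS = 2      # every position is played with white and with black

games_per_match = POSITIONS * COLOURS                    # one pair of engines
pairings = len(list(combinations(range(ENGINES), 2)))    # distinct engine pairs
total_games = pairings * games_per_match
games_per_engine = (ENGINES - 1) * games_per_match

print(games_per_match, pairings, total_games, games_per_engine)
# prints: 20 45 900 180
```

Every added engine costs 20 games against each engine already in the field, so the workload grows quickly with the size of the list.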

I have just finished the first 900 games in the positional test meaning that 10 engines have played 180 games each. This is how the first positional ratinglist looks (averagerating 2800):

    Program                          Elo    +   -   Games   Score   Av.Op.  Draws

  1 Rybka 2.3.2a mp 32-bit         : 2928   44  43   180    69.4 %   2785   32.2 %
  2 Deep Shredder 11 UCI           : 2833   42  42   180    55.3 %   2796   33.9 %
  3 Deep Fritz 10                  : 2831   43  42   180    55.0 %   2796   31.1 %
  4 Zap!Chess Zanzibar             : 2798   42  42   180    49.7 %   2800   32.8 %
  5 Deep Junior 10.1               : 2793   46  46   180    48.9 %   2800   20.0 %
  6 LoopMP 11A.32                  : 2784   40  40   180    47.5 %   2801   37.2 %
  7 HIARCS 11.1 MP UCI             : 2780   42  42   180    46.9 %   2802   32.8 %
  8 SpikeMP 1.2 Turin              : 2780   42  42   180    46.9 %   2802   32.8 %
  9 Naum 2.2                       : 2768   39  39   180    45.0 %   2803   41.1 %
 10 Glaurung 2.0.1                 : 2705   43  44   180    35.3 %   2810   30.6 %
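
As a sanity check on the table, the Elo column, the Av.Op. column and the Score column should be roughly linked by the standard Elo logistic formula. A minimal sketch, assuming the list was computed on the usual 400-point logistic curve (the post does not name the rating tool, so small deviations are to be expected):

```python
def expected_score(rating: float, opponent_avg: float) -> float:
    """Expected score (0..1) under the standard Elo logistic model."""
    return 1.0 / (1.0 + 10.0 ** ((opponent_avg - rating) / 400.0))

# (Elo, Av.Op., reported Score %) copied from three rows of the list
rows = {
    "Rybka 2.3.2a mp 32-bit": (2928, 2785, 69.4),
    "Deep Junior 10.1": (2793, 2800, 48.9),
    "Glaurung 2.0.1": (2705, 2810, 35.3),
}

for name, (elo, av_op, reported) in rows.items():
    predicted = 100.0 * expected_score(elo, av_op)
    print(f"{name:24s} predicted {predicted:5.1f} %   reported {reported:5.1f} %")
```

For these three rows the predicted percentages land within a few tenths of a point of the reported ones.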

At first sight everything looks quite normal: Rybka leads ahead of Deep Shredder 11, and Glaurung 2.0.1 is at the bottom of the list. However, this small list holds a couple of small surprises, and to reveal them I've used the CEGT 40/4 list as a kind of referencelist. I have compared the ratingdifferences between all engines in my list with the ratings of these engines in the CEGT 40/4 list. That makes it possible to produce a Top 10 list showing which engines (compared to the CEGT list) have benefited from my tests and which haven't:

1. Deep Junior 10.1 +70,33 ratingpoints
2. Deep Fritz 10 +49,22 ratingpoints
3. Rybka 2.3.2a mp +30,33 ratingpoints
4. Zap!Chess Zanzibar 2CPU +17,00 ratingpoints
5. SpikeMP 1.2 Turin +8,11 ratingpoints
6. LoopMP 11A.32 +1,44 ratingpoints
7. Deep Shredder 11 UCI -18,56 ratingpoints
8. Naum 2.2 2CPU -36,33 ratingpoints
9. Hiarcs 11.1 MP -57,44 ratingpoints
10. Glaurung 2.0.1 2CPU -64,11 ratingpoints
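
The fractional parts in this list (,11 ,22 ,33 ,44 ,56) are multiples of 1/9, which is what one would get by averaging an engine's ratingdifferences over its 9 opponents. A sketch of the comparison under that reading follows. This is a reconstruction, not a confirmed description of the method, and the reference ratings in the example are made-up placeholders, not actual CEGT 40/4 values:

```python
def relative_gains(test: dict[str, float], reference: dict[str, float]) -> dict[str, float]:
    """For each engine, average (gap in test list - gap in reference list) over all opponents."""
    gains = {}
    for a in test:
        diffs = [
            (test[a] - test[b]) - (reference[a] - reference[b])
            for b in test
            if b != a
        ]
        gains[a] = sum(diffs) / len(diffs)
    return gains

# Test ratings of three engines from the positional list
test_list = {"Rybka": 2928, "Junior": 2793, "Hiarcs": 2780}
# HYPOTHETICAL reference ratings, chosen only to make the example run
reference_list = {"Rybka": 2950, "Junior": 2760, "Hiarcs": 2850}

gains = relative_gains(test_list, reference_list)
for engine, gain in sorted(gains.items(), key=lambda item: item[1], reverse=True):
    print(f"{engine:8s} {gain:+.2f}")
```

With this definition the gains always sum to zero, because only differences between engines are compared. The ten published values also sum to roughly zero, which fits this reading.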

In other words: Deep Junior 10.1 has gained approx. 70 ratingpoints compared to the CEGT referencelist by competing in my tests, Deep Fritz 10 approx. 49 points, and so on. Considering what I wrote at the beginning, it's hardly a surprise that Deep Junior 10.1 gains 70 ratingpoints when the games start from a mostly closed, positional position. Junior is an extreme engine in many ways. More surprising is the performance of Deep Fritz 10 (losing only 9½-10½ against Rybka!) and perhaps also Rybka. I certainly didn't expect Hiarcs to be more than 50 ratingpoints below its CEGT rating, and the 5-15 defeat against Deep Junior 10.1 was especially painful.

I have just started the tests of the Gambitopenings. When these tests are done I'll make a new ratinglist and then I'll compare the two lists. If someone is interested in the games send me a PM and I'll send you a pgn-file with the games.

Best regards
Per

Oscar L · Post by **Oscar L** » Sat Jan 12, 2008 10:30 pm

Your CSSF list is quite missed

Any possibility of a comeback by Klaus and you?

Thanks for the interesting testing and analysis. I always thought that Junior was better for sharp and open positions

Waiting for the results with the gambit openings

Yarget · Post by **Yarget** » Sun Jan 13, 2008 1:40 am

Producing the CSS Ratinglist (in fact it was 2 ratinglists: the original single ratinglist and later the MP ratinglist as well) was quite demanding, especially for Klaus Wlotzka, who besides testing was responsible for the website including all the statistical details etc. For the time being there are no plans to resume the CSS Ratinglist; both Klaus and I are too busy at the moment for such time-demanding projects. Having said that, I should mention that we are still in touch and that one day (when the conditions are right) we would like to work together again (either to resume the CSS Ratinglist or to start a new project).

Regarding Junior, I would say that Deep Junior 10.1 is indeed a sharp engine. Just take a look at the drawfrequency in the ratinglist: only 20% for Junior is very remarkable when you compare it with the other engines. What IMO makes Junior very special (and one of the reasons for it doing so well in these rather closed and positional openings) is its habit of making "positional" sacrifices that have a long-term aim. Especially in very closed positions such sacrifices can be very effective. Regarding Junior and sacrifices, Steven Lopez recently wrote this: "Almost any chess engine will sacrifice material for an immediate gain; for example, if a Queen sacrifice results in a forced mate-in-two, you'll see a chess engine sac the Queen with no problem. Junior, though, will sometimes sacrifice minor material to clear a line or to otherwise free its game, which is something almost unheard of among chessplaying programs." I agree 100% with Steven and it's worth reading his article here:

http://www.chessbase.com/newsdetail.asp?newsid=4357

Speaking of Junior, at this very moment it's playing the gambitgames against Rybka, and Junior is down 2-13 after 15 games!! In the positional games it managed to hold Rybka to a very respectable 8-12 defeat, but that certainly won't happen here in these gambitgames.

Regards
Per

Mike S. · Post by **Mike S.** » Mon Jan 14, 2008 12:24 am

This is a good idea and a very interesting comparison. I am surprised that Junior 10.1 gains so much if only positional (or closed) starting positions are used. But I am not very surprised that Hiarcs 11 lost points in comparison to CCRL, because - unlike many commentators - I consider tactics to be Hiarcs' biggest talent, not positional play (although I'm sure it's not bad at that either).

I am curious how the other test with the tactical openings will end. I predict that Hiarcs will gain from it significantly, probably Glaurung too. Rybka could lose some points relative to CCRL, or at least in comparison to the positional result... OTOH, if the gambit openings are, or lead to imbalanced material and/or compensation for material, maybe Rybka gains even more than in the closed positions. We will see.

Can you post the two opening sets in PGN? With your permission, I think other testers would like to use them for tests like yours, for other or new engines etc.

Yarget · Post by **Yarget** » Mon Jan 14, 2008 9:27 am

Hello Mike!

Your prediction regarding Hiarcs and tactical abilities seems to be right. It is still early days in the gambitgames but so far Hiarcs is scoring more than 50% while engines like Fritz and Shredder are below 50% (and Junior way below 50%). Rybka is as one would expect leading.

Here follow the 10 fixed openingpositions in the gambitgames:

1. e4 e5 2. f4 exf4 *

1. d4 f5 2. e4 fxe4 *

1. d4 d5 2. e4 dxe4 3. Nc3 Nf6 4. f3 exf3 *

1. e4 c5 2. d4 cxd4 3. c3 dxc3 *

1. e4 e5 2. d4 exd4 3. c3 dxc3 4. Bc4 cxb2 *

1. f4 e5 2. fxe5 d6 3. exd6 *

1. d4 Nf6 2. c4 e5 3. dxe5 *

1. e4 e5 2. Nf3 f5 3. Nxe5 *

1. e4 d5 2. exd5 Nf6 3. Bb5+ *

1. d4 Nf6 2. c4 c5 3. d5 b5 4. cxb5 a6 5. bxa6 Bxa6 6. Nc3 g6 *

As you see, I have chosen 5 white and 5 black gambits, and I haven't avoided the sharp ones (Nordic Gambit and Latvian Gambit being probably the sharpest). And here come the positional ones:

1. c4 c5 2. Nc3 Nc6 3. g3 g6 4. Bg2 Bg7 5. e4 e5 *

1. e4 e6 2. d4 d5 3. e5 c5 4. c3 Nc6 5. Nf3 Qb6 6. a3 c4 *

1. d4 e6 2. c4 f5 3. Nc3 d5 4. Nf3 Nf6 5. Bg5 c6 6. e3 Be7 *

1. e4 e5 2. Nf3 Nc6 3. Bb5 Nf6 4. O-O Nxe4 5. d4 Nd6 6. Bxc6 dxc6 7. dxe5 Nf5 8. Qxd8+ *

1. d4 Nf6 2. c4 e6 3. Nc3 Bb4 4. e3 c5 5. Nf3 Nc6 6. Bd3 Bxc3+ 7. bxc3 d6 8. O-O e5 9. d5 Ne7 *

1. e4 c5 2. Nc3 Nc6 3. g3 g6 4. Bg2 Bg7 5. d3 d6 *

1. d4 Nf6 2. c4 c5 3. d5 e5 4. Nc3 d6 5. e4 Be7 *

1. e4 c6 2. d4 d5 3. e5 Bf5 4. Nf3 e6 5. Be2 *

1. d4 Nf6 2. c4 g6 3. Nc3 Bg7 4. e4 d6 5. Nf3 O-O 6. Be2 e5 7. O-O Nc6 8. d5 Ne7 *

1. d4 d5 2. c4 c6 3. cxd5 cxd5 4. Nc3 Nf6 5. Bf4 Nc6 6. Nf3 a6 *
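
For anyone who wants to load these lines into a GUI as a PGN file, each line can be wrapped in a minimal PGN game with the seven standard tags and the result `*` (game unfinished). A small sketch using only the Python standard library; the function name `to_pgn` and the event name are illustrative choices, not part of the original test setup:

```python
def to_pgn(lines: list[str], event: str) -> str:
    """Wrap bare movetext lines in minimal PGN games (Seven Tag Roster)."""
    games = []
    for number, movetext in enumerate(lines, start=1):
        movetext = " ".join(movetext.split())        # collapse extra whitespace
        if not movetext.endswith("*"):
            movetext += " *"                         # add the missing terminator
        elif not movetext.endswith(" *"):
            movetext = movetext[:-1] + " *"          # "e5*" becomes "e5 *"
        tags = [
            ("Event", event),
            ("Site", "?"),
            ("Date", "????.??.??"),
            ("Round", str(number)),
            ("White", "?"),
            ("Black", "?"),
            ("Result", "*"),
        ]
        header = "\n".join(f'[{key} "{value}"]' for key, value in tags)
        games.append(f"{header}\n\n{movetext}\n")
    return "\n".join(games)

positional = [
    "1. c4 c5 2. Nc3 Nc6 3. g3 g6 4. Bg2 Bg7 5. e4 e5*",
    "1. e4 c6 2. d4 d5 3. e5 Bf5 4. Nf3 e6 5. Be2 *",
]
pgn_text = to_pgn(positional, "Positional openings")
print(pgn_text)
```

The sketch only formats text; it does not check that the moves are legal.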

The clear majority are closed openings, with at least 2 important exceptions: the Berlin defence in the Ruy Lopez (number 4) and the Slav (number 10). I consider the first one positional in the sense that it's about accumulating small advantages, placing the pieces on strong squares, making the right pawnmoves at the right moment(s) and so on. This is the only way to get an advantage, and tactics are certainly not "just around the corner". They could be in the Slav, but a major exchange of rooks and queens is likely to happen on the open c-file. I admit that including the Slav among these positional (and mostly closed) openings is a bit problematic, but I have a weakness for this opening.

The gambitgames are running at this moment and I must say that it is a pleasure following these games. The drawfrequency will for sure be lower than in the positional games and I've already seen many spectacular knock-out games. Compared to the often very long positional games this is a completely different game.

Regards
Per

Spock · Post by **Spock** » Mon Jan 14, 2008 1:36 pm

Thanks Per, this sort of stuff is really interesting. Normal ratings lists can be a little boring at times

ArmyBridge · Post by **ArmyBridge** » Mon Jan 14, 2008 4:40 pm

Hi Per!! Your idea is great. About the tactical openings, have you taken the Wing Gambit into account? 1. e2-e4 c7-c5 2. b2-b4 c5xb4 3. Bf1-c4 Ng8-f6. In the French we have the same gambit: 1.e4 e6 2.d4 d5 3.e5 c5 4.b4 (or 1.e4 e6 2.d4 d5 3.c4). Then there is the Halloween Attack (a very wild opening): 1.e4 e5 2.Nc3 Nf6 3.Nf3 Nc6 4.Nxe5 (some people think it is a bad or losing move, but in my database White scored about 60%). For Black the Traxler defence would be interesting... hey, I just forgot the Greco-Moller attack in the Italian game, and the Italian gambit. I've noticed that Junior performs very well in the English, as you said above. To me it is very strange that Junior performs poorly in open games; add to that that Junior did very badly in several tactical tests. Is Junior a bad tactician?
Regards

Yarget · Post by **Yarget** » Mon Jan 14, 2008 6:32 pm

Hello Armando!

Thanks for your suggestions regarding gambits. Indeed, there are a lot of gambits to choose from and it wasn't easy at all to make the final selection. I'm sure that testing your suggested gambits would be great, and if I make some changes at some point I'll keep your suggestions in mind.

Regarding the playingstyle of Junior please check my answer to Oscar above. One thing is for sure, Junior is an extreme engine and after having performed very well in the positional games it's now facing huge problems in the gambits.

Regards
Per

Oscar L · Post by **Oscar L** » Mon Jan 14, 2008 7:43 pm

Hi Per.

It would be great to see you working with Klaus again, you are a nice team

Thanks for the ChessBase link, the comments about Junior's play are very interesting indeed. So the key word is speculation. But seeing the early results in your gambit openings, what can we conclude? That Junior only likes its own speculations?

I say this because gambit openings created by humans are based at least partly on speculation.

In the Hiarcs forum it has been said that Junior 11 will be released soon, this time as a UCI engine, not a ChessBase engine. What do you think we can expect? Will Amir Ban be able to maintain this particular style and at the same time increase the Elo strength enough to be competitive vs Rybka? Perhaps that is only possible by being solid and positional.

Yarget · Post by **Yarget** » Mon Jan 14, 2008 9:53 pm

Hello Oscar

Yes, I've heard the rumours about Junior 11 as well. This release should be different, as Junior will become a UCI engine. Before that release I think we will see a Junior 10 UCI release. What should we expect from Junior 11? Well, hopefully a stronger engine, but hopefully also an engine which maintains this special way of playing. The world of chessengines will become more boring if Junior 11 turns out to be another strong "mainstream" engine. We'll see.

As mentioned earlier, Junior is quite an extreme engine, but perhaps the word "sensitive" fits better. Sensitive in the sense that Junior is very, very strong when playing a position it "likes" and vice versa. Basically this is true for all engines (and humans as well), but for Junior in particular. Just take a look at Deep Junior 10's results so far in the gambitgames and compare them with its results in the positional games:

versus Rybka 2.3.2a mp 3-17 (8-12 in POS-games)
versus Hiarcs 11.1 MP 5½-14½ (15-5 in POS-games!!)
versus Deep Fritz 10 6-14 (5½-14½ in POS-games)
versus Deep Shredder 11 8-12 (9-11 in POS-games)
versus Glaurung 2.0.1 currently 4½-8½ (!) (11½-8½ in POS-games)
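
To make the two sets of results easier to compare, the match scores can be turned into percentages. A small sketch using only the four finished matches listed above (the Glaurung match is still running and is left out); the dictionary layout and function names are just for illustration:

```python
# Junior's results as (points for Junior, points for opponent)
RESULTS = {
    "Rybka 2.3.2a mp": {"gambit": (3.0, 17.0), "positional": (8.0, 12.0)},
    "Hiarcs 11.1 MP": {"gambit": (5.5, 14.5), "positional": (15.0, 5.0)},
    "Deep Fritz 10": {"gambit": (6.0, 14.0), "positional": (5.5, 14.5)},
    "Deep Shredder 11": {"gambit": (8.0, 12.0), "positional": (9.0, 11.0)},
}

def percent(score: tuple[float, float]) -> float:
    """Share of the available points, in percent."""
    won, lost = score
    return 100.0 * won / (won + lost)

def total_percent(kind: str) -> float:
    """Junior's overall percentage across all listed matches of one kind."""
    won = sum(match[kind][0] for match in RESULTS.values())
    played = sum(sum(match[kind]) for match in RESULTS.values())
    return 100.0 * won / played

for opponent, match in RESULTS.items():
    print(f"{opponent:18s} gambit {percent(match['gambit']):5.1f} %   "
          f"positional {percent(match['positional']):5.1f} %")
print(f"{'Total':18s} gambit {total_percent('gambit'):5.1f} %   "
      f"positional {total_percent('positional'):5.1f} %")
```

Over these four matches Junior scores 22½ out of 80 in the gambits against 37½ out of 80 in the positional games.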

The "sensitivity" of Junior also shows in several ratinglists. When (Deep) Junior 10 uses a common enginebook, its playingstrength is clearly behind engines like Fritz 10, Zanzibar, Shredder 10 and Hiarcs 11 (check the lists at CEGT and CCRL). However, if Junior is allowed to play with its own well-tuned book it's another story, as the SSDF ratinglist shows:

http://ssdf.bosjo.net/list.htm

Only Hiarcs is then in front of Junior, and only by a few points.

Regards
Per
