I'm running some handicap matches between current Komodo (just a couple of elo stronger than 9.1) and older engines, partly out of curiosity and partly to gain some experience with handicaps in case we decide to run some Komodo vs. GM (or IM or FM) matches. All of these matches use four threads per engine, which pretty much guarantees variety in the games even if the initial position is repeated. All are run on a recent (Haswell) i7. So far I'm just running straight game/1' matches with 256 MB hash. Since four threads produce about a 3 to 1 effective speedup, this is roughly equivalent to three-minute chess on one core.
The first match was against Crafty 20.14, a rather old engine that, although far below current standards, is nonetheless probably of human GM strength, higher at blitz. The handicap was knight odds (Komodo plays White and removes the b1 knight). Much to my surprise, Komodo won the 15 game match without losing a single game, scoring 10 wins and 5 draws, which is a +280 elo result. Hard to imagine such a result at knight odds.
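For readers who want to check the elo figures quoted in this thread, here is a minimal sketch of the usual conversion from a match score to an elo difference. It assumes the standard logistic rating model; the function name `elo_diff` is made up for illustration and is not taken from any tool used in these matches.

```python
import math

def elo_diff(wins, draws, losses):
    """Elo difference implied by a match score under the logistic model."""
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    if score <= 0.0 or score >= 1.0:
        # A 0% or 100% score has no finite elo equivalent.
        raise ValueError("elo difference is unbounded for a 0% or 100% score")
    return 400 * math.log10(score / (1 - score))

# Komodo vs. Crafty 20.14 at knight odds: 10 wins, 5 draws, 0 losses.
print(round(elo_diff(10, 5, 0)))  # 280
```

With only 15 games the figure is of course a rough estimate, not a measurement.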
Next, I decided to try to determine a reasonable handicap against Deep Rybka 4, the strongest Rybka and the top engine (or nearly so) a few years ago. Of course knight odds is ridiculous, which I verified with Rybka winning 11 games to one, no draws. Next I tried the traditional pawn and move (Komodo plays Black without f7). For added variety I programmed in five good opening moves for White and Black. Still, Rybka won by 7 to 3 (5 wins, one loss, 4 draws). White gets a big positional advantage on top of the extra pawn at this handicap. Then I tried normal chess but with the opening sequence 1.e4 f6? 2.d4 Kf7?. Despite playing Black from this awful position, Komodo won by an astonishing 8 to 2 (7 wins, one loss, two draws).
After this I tried Exchange odds (remove a1 rook, b8 knight, Komodo plays White). It was very close, but Komodo lost by one game (three wins, four losses, three draws). I then tried a modification where the Black rook starts on b8 so both sides start without long castling rights and without protection of the "a" pawn. The result was similar (three wins, five losses, eight draws).
So it seems that Exchange odds is just a bit too much to give Rybka 4 at bullet speed, but it is the closest match I've found so far. Naturally all of these handicaps would be harder for Komodo if both sides had more time.
More tests to come...
Handicap engine matches
Moderator: Ras
-
lkaufman
- Posts: 6284
- Joined: Sun Jan 10, 2010 6:15 am
- Location: Maryland USA
- Full name: Larry Kaufman
Handicap engine matches
Komodo rules!
-
Ajedrecista
- Posts: 2189
- Joined: Wed Jul 13, 2011 9:04 pm
- Location: Madrid, Spain.
Re: Handicap engine matches.
Hello Larry:
Interesting matches so far. Maybe this post could be useful to you? The info was extracted from Aquarium Demo.
Good luck with the next matches and the development of Komodo.
Regards from Spain.
Ajedrecista.
-
lkaufman
- Posts: 6284
- Joined: Sun Jan 10, 2010 6:15 am
- Location: Maryland USA
- Full name: Larry Kaufman
Re: Handicap engine matches.
Ajedrecista wrote:
Hello Larry:
Interesting matches so far. Maybe this post could be useful to you? The info was extracted from Aquarium Demo.
Good luck with the next matches and the development of Komodo.
Regards from Spain.
Ajedrecista.
Well, it's interesting, but what is the source of the rating estimates? Probably just one person's opinion. Actually, it's possible that at least part of the table is based on estimates from me; I have some recollection of helping Aquarium in this matter.
In any event the rating equivalence of a handicap depends heavily on the time limit. It's quite difficult for me to beat Komodo at knight odds in blitz (I use 4' +2" for blitz), but I don't think it would be a challenge for me at 30 minutes or so. One rather strong GM (I won't name him, but over 2600 FIDE) told me that he is a big underdog against Houdini at 20' with the traditional pawn and move (f7) handicap, and about even at two pawn handicap (b2 and f2). Presumably Komodo would be even harder for him. But I suppose he would do much better with two hours or so.
I may work up a table of rating equivalences for engines at various handicaps, but it will be time-limit dependent. I think that a human will always do better than a similarly rated engine when taking a handicap from a much stronger engine, because he knows how to modify his play to take into account that the opponent is much stronger.
Komodo rules!
-
lkaufman
- Posts: 6284
- Joined: Sun Jan 10, 2010 6:15 am
- Location: Maryland USA
- Full name: Larry Kaufman
Re: Handicap engine matches
Some more 1' results vs. Rybka 4:
At 2 knights for rook odds, White rook on b1, Rybka (Black) won by a huge 16 to 4 margin (14 wins, two losses, 4 draws), about 240 elo.
At odds of the c7 pawn, Rybka narrowly won by 5 wins to 4, with 11 draws, so 10.5 to 9.5, just 18 elo.
At odds of starting the game by 1.e4 e5 2.Nf3 Nc6 3.Nxe5??, getting just a pawn and a tempo for the knight, Rybka as Black won with 14 wins to one with five draws, so 16.5 to 3.5, about 270 elo.
At odds of starting the game by 1.e4 Nc6 2.d4 Nb8? (so three move odds), Komodo as Black won by six wins to two with a dozen draws, so 12 to 8, which is plus 70 elo.
At odds of the f2 pawn Komodo won by 7 wins to 3 with ten draws, so 12 to 8 which is plus 70 elo.
The average elo difference between Komodo 9.1 and Rybka 4 or 4.1 on the CCRL and CEGT blitz lists is about 185 elo. So based on the above results at bullet engine chess the f2 pawn is about 115 elo, three moves is about 115 elo handicap, c7 pawn is about 202 elo, rook for knight is about 225 elo, f7 pawn is about 345 elo, two knights for a rook is about 425 elo, the unsound 1.e4 e5 2.Nf3 Nc6 3.Nxe5?? is about 455 elo, and knight odds is about 600 elo. Of course these numbers are subject to fairly large margins of error, but seem generally consistent enough.
The elo gap between Komodo 9.1 and Crafty 20.14 is about 820 on average, and since Komodo won that knight odds match by 280 elo that puts knight odds at 540 elo, not too far off from the estimate based on Rybka.
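The arithmetic behind these handicap values can be sketched as follows: take the rating gap from the lists (about 185 elo, as stated above) and subtract the elo result Komodo actually achieved while giving the handicap. This is only an illustration under the standard logistic model; the names `BASELINE_GAP`, `elo_from_score` and `handicap_value` are made up here, and the match scores are the ones reported earlier in the thread, taken from Komodo's side.

```python
import math

BASELINE_GAP = 185  # Komodo 9.1 over Rybka 4/4.1, as quoted from CCRL/CEGT blitz

def elo_from_score(points, games):
    """Elo difference implied by scoring `points` out of `games`."""
    p = points / games
    return 400 * math.log10(p / (1 - p))

def handicap_value(points, games, gap=BASELINE_GAP):
    """Elo worth of a handicap: normal gap minus Komodo's result when giving it."""
    return gap - elo_from_score(points, games)

# (handicap, Komodo's points, games played) as reported in the posts above
results = [
    ("f2 pawn", 12.0, 20),
    ("three moves (1.e4 Nc6 2.d4 Nb8)", 12.0, 20),
    ("c7 pawn", 9.5, 20),
    ("two knights for rook", 4.0, 20),
    ("1.e4 e5 2.Nf3 Nc6 3.Nxe5", 3.5, 20),
    ("knight", 1.0, 12),
]

for name, points, games in results:
    print(f"{name}: about {handicap_value(points, games):.0f} elo")
```

The printed values should land within a few elo of the rounded figures in the post; small differences come only from rounding.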
My general impression from these matches is that material seems more important than position. By that I mean that given two positions with an equal advantage according to Komodo or Rybka, the weaker player will score better when his plus is mostly material rather than positional. The awful opening 1.e4 f6? 2.d4 Kf7? did not even prove to be a measurable handicap in ten games, and three moves was only 115 elo. But the c7 pawn was over 200, and the Exchange even more. I'm not sure why this should be, since the relative value of material and positional terms is quite well tuned in Komodo, so you would think a plus 1.00 eval would produce a similar score regardless of whether it was an actual pawn or just positional advantages. But these results say otherwise. Probably there's something to be learned from this, but I haven't figured out exactly what that is!
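As noted above, these numbers carry fairly large margins of error. A rough sketch of how large, for a 20 game match, is given below. It assumes games are independent and uses a normal approximation for the mean score, which is crude for so few games; the function name `elo_interval` is hypothetical and this is not how any rating list computes its error bars.

```python
import math

def elo_interval(wins, draws, losses, z=1.96):
    """Approximate 95% interval (in elo) for a match result.

    Uses the per-game score variance and a normal approximation,
    which is only a rough guide for short matches.
    """
    n = wins + draws + losses
    mean = (wins + 0.5 * draws) / n
    var = (wins * (1.0 - mean) ** 2
           + draws * (0.5 - mean) ** 2
           + losses * (0.0 - mean) ** 2) / n
    half_width = z * math.sqrt(var / n)

    def to_elo(p):
        # Clamp so the logarithm stays finite at extreme scores.
        p = min(max(p, 1e-6), 1 - 1e-6)
        return 400 * math.log10(p / (1 - p))

    return to_elo(mean - half_width), to_elo(mean + half_width)

# c7 pawn match from Komodo's side: 4 wins, 11 draws, 5 losses.
low, high = elo_interval(4, 11, 5)
print(f"Komodo's result: between {low:.0f} and {high:.0f} elo")
```

For that match the interval spans roughly a hundred elo in either direction, so the ordering of nearby handicaps should be read with caution.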
Komodo rules!
-
Vinvin
- Posts: 5312
- Joined: Thu Mar 09, 2006 9:40 am
- Full name: Vincent Lejeune
Re: Handicap engine matches
I'd be very glad to see a top GM vs. a top engine with a handicap.
What I like most is two pawn odds (computer plays White) or f-pawn odds (computer plays Black).
I don't like Rook vs. Minor because it tends to give an advantage in the opening and the early middlegame to the side with the minor piece.
Maybe it's possible to generate some hype on ICC and challenge some very highly rated players in 5 or 15 minute games!?
7 years ago: http://en.chessbase.com/post/the-milov- ... icap-match
Since then, computers (speed gives around +150 elo), software (Komodo is 250 elo above Rybka 3) and humans have improved a lot!
Is there any recent analysis with Komodo to see where the computer's play could be improved in this match?
The software would have to be tweaked a bit to exchange as few pieces as possible while still holding a good position.
-
Uri Blass
- Posts: 11161
- Joined: Thu Mar 09, 2006 12:37 am
- Location: Tel-Aviv Israel
Re: Handicap engine matches.
lkaufman wrote:
Ajedrecista wrote:
Hello Larry:
Interesting matches so far. Maybe this post could be useful to you? The info was extracted from Aquarium Demo.
Good luck with the next matches and the development of Komodo.
Regards from Spain.
Ajedrecista.
Well, it's interesting, but what is the source of the rating estimates? Probably just one person's opinion. Actually, it's possible that at least part of the table is based on estimates from me; I have some recollection of helping Aquarium in this matter.
In any event the rating equivalence of a handicap depends heavily on the time limit. It's quite difficult for me to beat Komodo at knight odds in blitz (I use 4' +2" for blitz), but I don't think it would be a challenge for me at 30 minutes or so. One rather strong GM (I won't name him, but over 2600 FIDE) told me that he is a big underdog against Houdini at 20' with the traditional pawn and move (f7) handicap, and about even at two pawn handicap (b2 and f2). Presumably Komodo would be even harder for him. But I suppose he would do much better with two hours or so.
I may work up a table of rating equivalences for engines at various handicaps, but it will be time-limit dependent. I think that a human will always do better than a similarly rated engine when taking a handicap from a much stronger engine, because he knows how to modify his play to take into account that the opponent is much stronger.
I think that if you can usually win against Komodo with a knight handicap, then there is no reason to play differently based on the opponent at knight odds (you can play in the same style against weaker opponents and still win).
If an engine rated similarly to you (and this includes Komodo at a time control that is fast enough) cannot perform as well as you do against Komodo with a knight handicap, then it suggests that the engine has some weakness of its own, rather than a failure to consider the level of the opponent.
-
lkaufman
- Posts: 6284
- Joined: Sun Jan 10, 2010 6:15 am
- Location: Maryland USA
- Full name: Larry Kaufman
Re: Handicap engine matches
Vinvin wrote:
I'd be very glad to see a top GM vs. a top engine with a handicap.
What I like most is two pawn odds (computer plays White) or f-pawn odds (computer plays Black).
I don't like Rook vs. Minor because it tends to give an advantage in the opening and the early middlegame to the side with the minor piece.
Maybe it's possible to generate some hype on ICC and challenge some very highly rated players in 5 or 15 minute games!?
7 years ago: http://en.chessbase.com/post/the-milov- ... icap-match
Since then, computers (speed gives around +150 elo), software (Komodo is 250 elo above Rybka 3) and humans have improved a lot!
Is there any recent analysis with Komodo to see where the computer's play could be improved in this match?
The software would have to be tweaked a bit to exchange as few pieces as possible while still holding a good position.
I've at least started to talk to strong players about possible future matches. I haven't looked at the Milov games with Komodo, but my general comment about Exchange odds games is that it's better to start the extra rook on b8 instead of a8, so that neither side can castle long or has more pawns initially protected. I agree that the opening tends to look good for White, but perhaps play is a bit more normal than with a pawn or two missing, and Black has a clear plan to victory: trade down. Komodo 9.1 has some new code to deter trading down when losing, which we could make stronger for such matches. Two pawn odds might be appropriate for an ordinary GM, but probably not for an Elite one, unless we are talking about blitz chess or near-blitz. If so, should it be f2 and c2, or f2 and b2, or something else?
Regards,
Larry
Komodo rules!
-
Jesse Gersenson
- Posts: 593
- Joined: Sat Aug 20, 2011 9:43 am
Re: Handicap engine matches
lkaufman wrote:
My general impression from these matches is that material seems more important than position. By that I mean that given two positions with an equal advantage according to Komodo or Rybka, the weaker player will score better when his plus is mostly material rather than positional. The awful opening 1.e4 f6? 2.d4 Kf7? did not even prove to be a measurable handicap in ten games, and three moves was only 115 elo. But the c7 pawn was over 200, and the Exchange even more. I'm not sure why this should be, since the relative value of material and positional terms is quite well tuned in Komodo, so you would think a plus 1.00 eval would produce a similar score regardless of whether it was an actual pawn or just positional advantages. But these results say otherwise. Probably there's something to be learned from this, but I haven't figured out exactly what that is!
Komodo is either:
a) under-evaluating the missing material in these starting positions
b) over-evaluating the positional hindrances in these positions
c) both a and b
The starting position is the most-distant point from the game result. Is the eval, for example a 1.00 eval, directly related to an expected game result? What does 1.00 mean?
-
lkaufman
- Posts: 6284
- Joined: Sun Jan 10, 2010 6:15 am
- Location: Maryland USA
- Full name: Larry Kaufman
Re: Handicap engine matches
Jesse Gersenson wrote:
lkaufman wrote:
My general impression from these matches is that material seems more important than position. By that I mean that given two positions with an equal advantage according to Komodo or Rybka, the weaker player will score better when his plus is mostly material rather than positional. The awful opening 1.e4 f6? 2.d4 Kf7? did not even prove to be a measurable handicap in ten games, and three moves was only 115 elo. But the c7 pawn was over 200, and the Exchange even more. I'm not sure why this should be, since the relative value of material and positional terms is quite well tuned in Komodo, so you would think a plus 1.00 eval would produce a similar score regardless of whether it was an actual pawn or just positional advantages. But these results say otherwise. Probably there's something to be learned from this, but I haven't figured out exactly what that is!
Komodo is either:
a) under-evaluating the missing material in these starting positions
b) over-evaluating the positional hindrances in these positions
c) both a and b
The starting position is the most-distant point from the game result. Is the eval, for example a 1.00 eval, directly related to an expected game result? What does 1.00 mean?
A score of 1.00 is supposed to mean that White's expected result is the same as if he had one extra pawn in an endgame where all positional factors were equal. I think it's the same in Stockfish. In the opening a clear extra pawn shows less than 1.00 because White's chances of victory are less than they would be in a similar endgame.
If what you say is correct, then although the relative values of material and positional factors in Komodo are all well-tuned in general, they may not be well-tuned for the initial position. That is quite possible. But I'm not sure how to correct this without lowering elo overall. Maybe we're missing some important idea here.
It is also possible that Komodo does all this pretty well, and the problem arises solely from defects in Rybka 4, but I don't believe that.
Komodo rules!
-
kranium
- Posts: 2130
- Joined: Thu May 29, 2008 10:43 am
Re: Handicap engine matches
lkaufman wrote:
My general impression from these matches is that material seems more important than position.
Perhaps positional defects can be remedied more easily than lost material can be regained...