
Of course, the first player after Komodo would be the winner of the tournament, but in the meantime, we could have a better idea of the computer's actual ranking.
Thanks again for these extra matches. Based on a one minute search on 4 cores of the handicap starting positions, f7 actually shows as more of a handicap than even a2 + c2, 1.32 vs 1.14. Probably this means something isn't quite right about the material vs. positional scoring in Komodo, although it is well-tuned. Something we need to investigate. f7 is quite a large handicap because Black's development must be very modest due to tactical problems, but still a2 + c2 seems larger (and scores as higher in your tests). I think that Komodo's eval tricked me into underestimating the two pawn handicaps.
Laskos wrote: I will go on a trip for a week, so I tried to test the handicaps you proposed now, quick 500 game matches at ultra-fast time controls with Komodo on 2 cores, extrapolated to 45'+15'':
lkaufman wrote: Thanks for running these tests! The relative size of the handicaps looks about right to me, and the 710 value for the c2/f2 handicap is consistent with my direct results if we assume Gaussian rather than Logistic distribution, as you say. I'm running similar tests myself with the aim of measuring the handicaps without having to rely much (if at all) on extrapolation, but I don't expect my results will differ too much from yours.
Laskos wrote: I have computed the handicaps in self-play of Komodo at ultra-fast (2 cores) with time odds, and extrapolated to 45'+15'' by analogy with the c2 and f2 pawns handicap, the only one where I tested this long time control. First, I got the values of doubling in time for Komodo (2 cores) at contempt 0, and the doubling seems indeed to be worth more than that of Houdini:
lkaufman wrote: It seems most of us underestimated the grandmaster in this match or at least underestimated the handicap. I think it's pretty obvious now that the handicap of f2 and c2 pawns was just too much for anyone or anything to give to a grandmaster in a serious game.
Yet Komodo won fairly easily playing Black and giving the f7 pawn only, which is the worst pawn for Black to remove.
I think a two-pawn handicap is still playable against an ordinary (around 2500) grandmaster, if we are a bit more conservative about the choice of the pawns. As it was, White was not only two pawns down, but his king was weakened and his pawns were split up into three groups. Moreover, he only had two of the four pawns that can control central squares. I chose this handicap out of deference to the tradition of giving the "f" pawn as a handicap, but it's just too difficult, especially if repeated game after game while the grandmaster learns each time.
When Kasparov gave two pawns to Terence Chapman (said to be 2150 level) in a match, he removed the "a" pawn plus one other varying pawn. This is what we should have done too, although I think it was a bit unfair to play one game with both edge pawns removed, which is probably no more than the f7 handicap. I think Komodo can still offer two pawns to a grandmaster, if one is the "a" pawn and the other rotates between "b", "c", "d", and "e". These feel more like "just" a two-pawn handicap with no added positional advantages on top.
Comments, anyone?
Score of Komodo 3s+0.03s vs Komodo 1.5s+0.015s:
680 - 28 - 292 [0.826] 1000
ELO difference: 271
Score of Komodo 6s+0.06s vs Komodo 3s+0.03s:
507 - 61 - 432 [0.723] 1000
ELO difference: 167
Score of Komodo 24s+0.24s vs Komodo 12s+0.12s:
385 - 51 - 564 [0.667] 1000
ELO difference: 121
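For reference, the "ELO difference" lines above are consistent with the usual logistic conversion from match score to rating gap. A minimal sketch, assuming the three numbers in each result are wins - losses - draws from the first engine's side; the function name is only for illustration:

```python
import math

def elo_from_wld(wins: int, losses: int, draws: int) -> float:
    """Logistic Elo difference implied by a win/loss/draw record."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games  # fraction of points scored
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return 400.0 * math.log10(score / (1.0 - score))

# Match records quoted above, read as (wins, losses, draws).
matches = {
    "3s+0.03s vs 1.5s+0.015s": (680, 28, 292),
    "6s+0.06s vs 3s+0.03s": (507, 61, 432),
    "24s+0.24s vs 12s+0.12s": (385, 51, 564),
}

for label, wld in matches.items():
    print(f"{label}: {elo_from_wld(*wld):.0f}")
```

Rounded to whole points this gives 271, 167 and 121 for the three doublings.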
Then, similarly (500 games each data point), the extrapolated handicaps to 45'+15'' on 2 cores (at faster control the handicaps are significantly lower):
- Knight b1: 1170 ELO points
- Pawns c2 and f2: 710 ELO points
- Pawns a2 and d2: 490 ELO points
- Pawns a2 and h2: 320 ELO points
- Pawn f7: 510 ELO points
- Pawn d7: 370 ELO points
If Komodo is 3200 FIDE ELO at 45'+15'' on multicore, then one can see the fair matches, for example d7 pawn handicap or a2 and h2 pawns handicap against Carlsen.
The anomalies I get:
- Two pawns c2 and f2 handicap seems fair game for a 2500 GM, it was not.
- f7 pawn handicap seems unexpectedly large to me, much larger than two pawns a2 and h2 handicap, and on a par with two pawns a2 and d2 handicap.
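The "fair match" reading above is plain subtraction: take the handicap value off the engine's rating, and what is left is the human rating at which the match should be roughly even. A small sketch using the extrapolated values from the list; the 3200 figure is the post's own "if", and the sketch ignores that a handicap's value changes with the strength of the giver:

```python
# Extrapolated handicap values quoted in the list above (Elo points).
HANDICAPS = {
    "Knight b1": 1170,
    "Pawns c2 and f2": 710,
    "Pawn f7": 510,
    "Pawns a2 and d2": 490,
    "Pawn d7": 370,
    "Pawns a2 and h2": 320,
}

ENGINE_RATING = 3200  # the post's own "if"; an assumption, not a measurement

def break_even_rating(engine_rating: int, handicap_elo: int) -> int:
    """Human rating at which the handicap match would be roughly even."""
    return engine_rating - handicap_elo

for name, value in HANDICAPS.items():
    print(f"{name}: even against about {break_even_rating(ENGINE_RATING, value)}")
```

With these inputs c2 + f2 comes out at 2490, which is why it looked like a fair game for a 2500 GM on paper, while d7 and a2 + h2 land at 2830 and 2880.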
If you have the computer time available, could you run a couple more handicaps? Especially the Exchange (remove a1 rook, b8 knight, move rook from a8 to b8), since we will surely have matches with this one. Also "pawn and two moves" (remove f7, play 1.e4 and it's still White's move). And perhaps a couple more of the two pawn handicaps we may still use, probably a/b, a/c, and a/e.
It is a bit strange that c2/f2 is 200 more than f7 in your results, but it seemed like a day and night difference against Neuman. But maybe with preparation and learning from experience he would do well at f7 as well.
I think that the handicaps will always be more difficult to give to humans than the engine tests show since they know to avoid unclear tactics while the engines do not. But there's nothing you can do about this.
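On the Gaussian versus Logistic remark: the two models turn the same match score into different rating gaps, and the difference grows as the score gets more lopsided. A rough sketch, assuming the classical Elo setup in which each player's performance has a standard deviation of 200 points (so the difference between two players has 200 times the square root of 2); the function names are illustrative only:

```python
import math
from statistics import NormalDist

# Standard deviation of the performance difference under the assumed model.
SIGMA_DIFF = 200.0 * math.sqrt(2.0)

def elo_logistic(score: float) -> float:
    """Rating gap implied by a score under the logistic model."""
    return 400.0 * math.log10(score / (1.0 - score))

def elo_gaussian(score: float) -> float:
    """Rating gap implied by the same score under the Gaussian model."""
    return SIGMA_DIFF * NormalDist().inv_cdf(score)

for score in (0.60, 0.75, 0.90, 0.99):
    print(f"score {score:.2f}: logistic {elo_logistic(score):6.0f}, "
          f"gaussian {elo_gaussian(score):6.0f}")
```

Near 50% the two agree closely, but at very one-sided scores the Gaussian model assigns a noticeably smaller gap, so the choice of model matters for handicaps worth many hundreds of points.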
Definitely missing f7 seems very unpleasant in many circumstances, and a2 + b2 handicap seems significantly smaller than a2 + c2.
- Exchange: 540 ELO points
- Pawn f7 and two moves (to e4): 600 ELO points
- Pawns a2 and c2: 560 ELO points
- Pawns a2 and b2: 440 ELO points
Draw odds plus the White pieces for the human might be fair for a 2800 player (the computer would almost always win with the White pieces), but it puts all the emphasis on the opening book rather than on the engine. I might still want to try it sometime, when I have enough time to make a proper book for this. Maybe book length would be severely limited.
JJJ wrote: Maybe for a 2700-2800 GM, no handicap, and a draw = win for the human.
Of course, Komodo is allowed to play with a little book to avoid easy drawish openings. At least we are going to see how Komodo wins against a top human.
I do not think that a rating against humans is always the same as a rating against computers, and it is possible that some 2500 GMs can perform better against computers than 2800 GMs.
I am sure you are correct, but the main interest is the general case, not whether some individual has found a way to get draws. We can easily modify Komodo to avoid fortress-type draws at some tiny Elo cost.
Uri Blass wrote: I do not think that a rating against humans is always the same as a rating against computers, and it is possible that some 2500 GMs can perform better against computers than 2800 GMs.
I remember that in the Israeli league some 2200 player got 3 draws from 3 computers when the chess programs performed at least at the level of 2500 against other players.
I do not quite understand.
Laskos wrote: When rating (or time control or hardware) increases, the same handicap increases its ELO value. Say with Komodo 340 ELO points stronger than today, the handicap of c2 and f2 increases by 240 points to 950 ELO points, so there is only little progress in playing this handicap against Carlsen. I don't know if a perfect engine would stand this handicap against Carlsen.
duncan wrote: Following from this, since pawns c2 and f2 (710) is 340 more than pawn d7 (370), Komodo would have to gain another 340 points before having a chance against Carlsen with pawns c2 and f2. Taking into account hardware and software improvements, maybe 3 years? OTOH they are not unimportant pawns, so maybe never.
The 710 figure was based on the handicap giver being Komodo with 45 minutes + 15", whatever that rating might be. But this was Komodo on 2 threads. For Komodo on 23 threads, the rating and therefore the handicap value would be higher than 710.
duncan wrote: I do not quite understand.
If pawns c2 and f2 is 710 ELO points at 3200, then at GM 2500 it will be a lot less (let's say 400), as a GM does not have the skill to turn the c2 and f2 handicap into a win that a 3200 Komodo has.
So to be equivalent to a 2500 GM, all Komodo would need is 2500 + 400 = 2900.
But you are using the 710 figure and deducting it from Komodo's 3200?
But to work out the likely result of Komodo playing a 2500 GM, do you not need to know the handicap rating for the 2500 GM? Just because it is a huge handicap for the computer, it may not be such a big handicap for the weaker GM.
No; the point of Kai's methodology is that it simulates the 2500 GM by trying to determine how Komodo with less than a minute per game would do with the handicap. So a 50% score on two threads against any human would mean (if you accept his methodology) that Komodo on two threads would rate 710 above that human, since the human would have performed the same as Komodo with less than a minute per game.