Two Pawn Handicap


JJJ
Posts: 1346
Joined: Sat Apr 19, 2014 1:47 pm

Re: Two Pawn Handicap

Post by JJJ »

Yeah, it would be nice to see Komodo in an elite GM tournament :)
Of course, the first player after Komodo would be the winner of the tournament, but in the meantime, we could have a better idea of the actual ranking of computers.
lkaufman
Posts: 6259
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: Two Pawn Handicap

Post by lkaufman »

Laskos wrote:
lkaufman wrote:
Laskos wrote:
lkaufman wrote:It seems most of us underestimated the grandmaster in this match or at least underestimated the handicap. I think it's pretty obvious now that the handicap of f2 and c2 pawns was just too much for anyone or anything to give to a grandmaster in a serious game. Yet Komodo won fairly easily playing Black and giving the f7 pawn only, which is the worst pawn for Black to remove.
I think two pawn handicap is still playable against an ordinary (around 2500) grandmaster, if we are a bit more conservative about the choice of the pawns. As it was, White was not only two pawns down, but his king was weakened and his pawns were split up into three groups. Moreover he only had two of the four pawns that can control central squares. I chose this handicap out of deference to the tradition of giving the "f" pawn as a handicap, but it's just too difficult, especially if repeated game after game while the grandmaster learns each time.
When Kasparov gave two pawns to Terrence Chapman (said to be 2150 level) in a match, he removed the "a" pawn plus one other varying pawn. This is what we should have done too, although I think it was a bit unfair to play one game with both edge pawns removed, which is probably no more than the f7 handicap. I think Komodo can still offer two pawns to a grandmaster, if one is the "a" pawn and the other rotates between "b", "c", "d", and "e". These feel more like "just" a two pawn handicap with no added positional advantages on top.
Comments, anyone?
I have computed the handicaps in self-play of Komodo at ultra-fast (2 cores) with time odds, and extrapolated to 45'+15'' by analogy with the c2 and f2 pawns handicap, the only one where I tested this long time control. First, I got the values of doubling in time for Komodo (2 cores) at contempt 0, and the doubling seems indeed to be worth more than that of Houdini:

Score of Komodo 3s+0.03s vs Komodo 1.5s+0.015s:
680 - 28 - 292 [0.826] 1000
ELO difference: 271

Score of Komodo 6s+0.06s vs Komodo 3s+0.03s:
507 - 61 - 432 [0.723] 1000
ELO difference: 167

Score of Komodo 24s+0.24s vs Komodo 12s+0.12s:
385 - 51 - 564 [0.667] 1000
ELO difference: 121
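For readers who want to check the three results above, here is a minimal sketch of the usual logistic conversion from a match score to an ELO difference. It assumes the score lines read wins - losses - draws, which matches the bracketed score fractions; the function name elo_diff is made up for this illustration.

```python
import math

def elo_diff(wins: int, losses: int, draws: int) -> float:
    """Logistic ELO difference implied by a match result (draws count as half a point)."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    return -400.0 * math.log10(1.0 / score - 1.0)

# The three time-doubling matches quoted above, as (wins, losses, draws).
matches = {
    "3s+0.03s vs 1.5s+0.015s": (680, 28, 292),
    "6s+0.06s vs 3s+0.03s": (507, 61, 432),
    "24s+0.24s vs 12s+0.12s": (385, 51, 564),
}

for label, (w, l, d) in matches.items():
    print(f"{label}: {elo_diff(w, l, d):.0f} ELO")
```

Rounded to whole points this gives the 271, 167 and 121 figures quoted in the post, so the value of a doubling shrinks as the time control gets longer.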

Then, similarly (500 games each data point), the extrapolated handicaps to 45'+15'' on 2 cores (at faster control the handicaps are significantly lower):
  • Knight b1: 1170 ELO points
  • Pawns c2 and f2: 710 ELO points
  • Pawns a2 and d2: 490 ELO points
  • Pawns a2 and h2: 320 ELO points
  • Pawn f7: 510 ELO points
  • Pawn d7: 370 ELO points
The anomalies I get:
  • By these numbers the c2 and f2 two-pawn handicap looks like a fair game against a 2500 GM; it was not.
  • The f7 pawn handicap seems unexpectedly large to me, much larger than the a2 and h2 two-pawn handicap, and on a par with the a2 and d2 two-pawn handicap.
If Komodo is 3200 FIDE ELO at 45'+15'' on multicore, then one can see the fair matches, for example d7 pawn handicap or a2 and h2 pawns handicap against Carlsen.
Thanks for running these tests! The relative size of the handicaps looks about right to me, and the 710 value for the c2/f2 handicap is consistent with my direct results if we assume Gaussian rather than Logistic distribution, as you say. I'm running similar tests myself with the aim of measuring the handicaps without having to rely much (if at all) on extrapolation, but I don't expect my results will differ too much from yours.
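To illustrate why the Gaussian versus Logistic choice matters at a gap as large as 710, here is a rough sketch comparing the expected score under both models. It is not a reconstruction of the calculation referred to in the post. The Gaussian version assumes each player's performance has a standard deviation of 200 points (so the difference has 200 * sqrt(2)), which is the classical ELO assumption; both function names are invented for this example.

```python
import math
from statistics import NormalDist

def expected_logistic(diff: float) -> float:
    """Expected score of the stronger side under the logistic ELO model."""
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))

def expected_gaussian(diff: float, sigma: float = 200.0) -> float:
    """Expected score under a normal model; sigma per player is an assumption."""
    return NormalDist().cdf(diff / (sigma * math.sqrt(2.0)))

for diff in (100, 370, 710):
    print(f"{diff:4d} ELO: logistic {expected_logistic(diff):.4f}, "
          f"gaussian {expected_gaussian(diff):.4f}")
```

The two curves are close for small differences but diverge in the tails: at 710 points the Gaussian model gives the weaker side a noticeably smaller expected score than the logistic one, so the same match result translates into different ELO gaps depending on the model.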
If you have the computer time available, could you run a couple more handicaps? Especially the Exchange (remove a1 rook, b8 knight, move rook from a8 to b8), since we will surely have matches with this one. Also "pawn and two moves" (remove f7, play 1.e4 and it's still White's move). And perhaps a couple more of the two pawn handicaps we may still use, probably a/b, a/c, and a/e.
It is a bit strange that c2/f2 is 200 more than f7 in your results, but it seemed like a day and night difference against Neuman. But maybe with preparation and learning from experience he would do well at f7 as well.
I think that the handicaps will always be more difficult to give to humans than the engine tests show since they know to avoid unclear tactics while the engines do not. But there's nothing you can do about this.
I will go on a trip for a week, so I tried to test the handicaps you proposed now, quick 500 game matches at ultra-fast time controls with Komodo on 2 cores, extrapolated to 45'+15'':
  • Exchange: 540 ELO points
  • Pawn f7 and two moves (to e4): 600 ELO points
  • Pawns a2 and c2: 560 ELO points
  • Pawns a2 and b2: 440 ELO points
Definitely missing f7 seems very unpleasant in many circumstances, and a2 + b2 handicap seems significantly smaller than a2 + c2.
Thanks again for these extra matches. Based on a one minute search on 4 cores of the handicap starting positions, f7 actually shows as more of a handicap than even a2 + c2, 1.32 vs 1.14. Probably this means something isn't quite right about the material vs. positional scoring in Komodo, although it is well-tuned. Something we need to investigate. f7 is quite a large handicap because Black's development must be very modest due to tactical problems, but still a2 + c2 seems larger (and scores as higher in your tests). I think that Komodo's eval tricked me into underestimating the two pawn handicaps.
Komodo rules!
JJJ
Posts: 1346
Joined: Sat Apr 19, 2014 1:47 pm

Re: Two Pawn Handicap

Post by JJJ »

Maybe for a 2700-2800 GM: no handicap, and a draw = win for the human.

Of course, Komodo is allowed to play with a little book to avoid easy drawing openings. At least we're going to see how Komodo wins against a top human :)
lkaufman
Posts: 6259
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: Two Pawn Handicap

Post by lkaufman »

JJJ wrote: Maybe for a 2700-2800 GM: no handicap, and a draw = win for the human.

Of course, Komodo is allowed to play with a little book to avoid easy drawing openings. At least we're going to see how Komodo wins against a top human :)
Draw odds plus White pieces for human might be fair for 2800 (computer would win almost always with White pieces), but it puts all the emphasis on the opening book rather than on the engine. I might still want to try it sometime, when I have enough time to make a proper book for this. Maybe book length would be severely limited.
Uri Blass
Posts: 10906
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Two Pawn Handicap

Post by Uri Blass »

lkaufman wrote:
JJJ wrote: Maybe for a 2700-2800 GM: no handicap, and a draw = win for the human.

Of course, Komodo is allowed to play with a little book to avoid easy drawing openings. At least we're going to see how Komodo wins against a top human :)
Draw odds plus White pieces for human might be fair for 2800 (computer would win almost always with White pieces), but it puts all the emphasis on the opening book rather than on the engine. I might still want to try it sometime, when I have enough time to make a proper book for this. Maybe book length would be severely limited.
I do not think that rating against humans is always the same as rating against computers, and it is possible that some 2500 GM can perform better against computers than a 2800 GM.

I remember that in the Israeli league some 2200 player got 3 draws from 3 computers when the chess programs performed at least at the level of 2500 against other players.
lkaufman
Posts: 6259
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: Two Pawn Handicap

Post by lkaufman »

Uri Blass wrote:
lkaufman wrote:
JJJ wrote: Maybe for a 2700-2800 GM: no handicap, and a draw = win for the human.

Of course, Komodo is allowed to play with a little book to avoid easy drawing openings. At least we're going to see how Komodo wins against a top human :)
Draw odds plus White pieces for human might be fair for 2800 (computer would win almost always with White pieces), but it puts all the emphasis on the opening book rather than on the engine. I might still want to try it sometime, when I have enough time to make a proper book for this. Maybe book length would be severely limited.
I do not think that rating against humans is always the same as rating against computers, and it is possible that some 2500 GM can perform better against computers than a 2800 GM.

I remember that in the Israeli league some 2200 player got 3 draws from 3 computers when the chess programs performed at least at the level of 2500 against other players.
I am sure you are correct, but the main interest is the general case, not whether some individual has found a way to get draws. We can easily modify Komodo to avoid fortress-type draws at some tiny elo cost.
duncan
Posts: 12038
Joined: Mon Jul 07, 2008 10:50 pm

Re: Two Pawn Handicap

Post by duncan »

Laskos wrote:
duncan wrote:
Laskos wrote:
  • Knight b1: 1170 ELO points
  • Pawns c2 and f2: 710 ELO points
  • Pawns a2 and d2: 490 ELO points
  • Pawns a2 and h2: 320 ELO points
  • Pawn f7: 510 ELO points
  • Pawn d7: 370 ELO points
The anomalies I get:
  • By these numbers the c2 and f2 two-pawn handicap looks like a fair game against a 2500 GM; it was not.
  • The f7 pawn handicap seems unexpectedly large to me, much larger than the a2 and h2 two-pawn handicap, and on a par with the a2 and d2 two-pawn handicap.
If Komodo is 3200 FIDE ELO at 45'+15'' on multicore, then one can see the fair matches, for example d7 pawn handicap or a2 and h2 pawns handicap against Carlsen.
Following from this: since pawns c2 and f2 (710) is 340 more than pawn d7 (370), Komodo would have to gain another 340 points before having a chance against Carlsen with pawns c2 and f2. Taking into account hardware and software improvements, maybe 3 years? OTOH they are not unimportant pawns, so maybe never.
When rating (or time control or hardware) increases, the same handicap increases its ELO value. Say with Komodo 340 ELO points stronger than today, handicap of c2 and f2 increases by 240 points to 950 ELO points, so there is only little progress in playing this handicap against Carlsen. I don't know if a perfect engine would stand this handicap against Carlsen.
I do not quite understand.

If pawns c2 and f2 is 710 ELO points at 3200, then at GM 2500 level it will be a lot less (let's say 400), as a GM does not have the skill to turn the c2 and f2 handicap into a win that a 3200 Komodo has.

So to be equivalent to a 2500 GM, all Komodo would need is 2500 + 400 = 2900.

But you are using the 710 figure and deducting it from Komodo's 3200?
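The subtraction being questioned here can be written out explicitly. The sketch below only restates the arithmetic from the quoted posts (a hypothetical 3200 rating for Komodo, the extrapolated handicap values, and the "340 stronger, handicap grows by 240" example); the helper name break_even_rating is invented, and whether a plain subtraction is the right model is exactly what this exchange is about.

```python
# Hypothetical rating taken from the "If Komodo is 3200 FIDE ELO" remark in the thread.
KOMODO_RATING = 3200

# Extrapolated handicap values quoted in the thread (ELO points).
HANDICAPS = {
    "pawns c2 and f2": 710,
    "pawn d7": 370,
    "pawns a2 and h2": 320,
}

def break_even_rating(giver_rating: int, handicap: int) -> int:
    """Opponent rating at which the handicap match would be even (simple subtraction model)."""
    return giver_rating - handicap

for name, value in HANDICAPS.items():
    print(f"{name}: even match against about {break_even_rating(KOMODO_RATING, value)}")

# The scaling example: engine 340 points stronger, c2/f2 handicap worth 240 more.
before = break_even_rating(KOMODO_RATING, 710)
after = break_even_rating(KOMODO_RATING + 340, 710 + 240)
print(f"break-even opponent moves from {before} to {after}")
```

Under this model c2/f2 is even against roughly a 2490 player today, and a 340-point gain in engine strength only moves that to 2590, which is the "only little progress" point made in the quoted reply.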
lkaufman
Posts: 6259
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: Two Pawn Handicap

Post by lkaufman »

duncan wrote:
Laskos wrote:
duncan wrote:
Laskos wrote:
  • Knight b1: 1170 ELO points
  • Pawns c2 and f2: 710 ELO points
  • Pawns a2 and d2: 490 ELO points
  • Pawns a2 and h2: 320 ELO points
  • Pawn f7: 510 ELO points
  • Pawn d7: 370 ELO points
The anomalies I get:
  • By these numbers the c2 and f2 two-pawn handicap looks like a fair game against a 2500 GM; it was not.
  • The f7 pawn handicap seems unexpectedly large to me, much larger than the a2 and h2 two-pawn handicap, and on a par with the a2 and d2 two-pawn handicap.
If Komodo is 3200 FIDE ELO at 45'+15'' on multicore, then one can see the fair matches, for example d7 pawn handicap or a2 and h2 pawns handicap against Carlsen.
Following from this: since pawns c2 and f2 (710) is 340 more than pawn d7 (370), Komodo would have to gain another 340 points before having a chance against Carlsen with pawns c2 and f2. Taking into account hardware and software improvements, maybe 3 years? OTOH they are not unimportant pawns, so maybe never.
When rating (or time control or hardware) increases, the same handicap increases its ELO value. Say with Komodo 340 ELO points stronger than today, handicap of c2 and f2 increases by 240 points to 950 ELO points, so there is only little progress in playing this handicap against Carlsen. I don't know if a perfect engine would stand this handicap against Carlsen.
I do not quite understand.

If pawns c2 and f2 is 710 ELO points at 3200, then at GM 2500 level it will be a lot less (let's say 400), as a GM does not have the skill to turn the c2 and f2 handicap into a win that a 3200 Komodo has.

So to be equivalent to a 2500 GM, all Komodo would need is 2500 + 400 = 2900.

But you are using the 710 figure and deducting it from Komodo's 3200?
The 710 figure was based on the handicap giver being Komodo with 45 minutes + 15", whatever that rating might be. But this was Komodo on 2 threads. For Komodo on 23 threads, the rating and therefore the handicap value would be higher than 710.
duncan
Posts: 12038
Joined: Mon Jul 07, 2008 10:50 pm

Re: Two Pawn Handicap

Post by duncan »

lkaufman wrote:
The 710 figure was based on the handicap giver being Komodo with 45 minutes + 15", whatever that rating might be. But this was Komodo on 2 threads. For Komodo on 23 threads, the rating and therefore the handicap value would be higher than 710.
But to work out the likely result of Komodo playing a 2500 GM, do you not need to know the handicap rating for the 2500 GM? Just because it is a huge handicap for the computer, it may not be such a big handicap for the weaker GM.
lkaufman
Posts: 6259
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: Two Pawn Handicap

Post by lkaufman »

duncan wrote:
lkaufman wrote:
The 710 figure was based on the handicap giver being Komodo with 45 minutes + 15", whatever that rating might be. But this was Komodo on 2 threads. For Komodo on 23 threads, the rating and therefore the handicap value would be higher than 710.
But to work out the likely result of Komodo playing a 2500 GM, do you not need to know the handicap rating for the 2500 GM? Just because it is a huge handicap for the computer, it may not be such a big handicap for the weaker GM.
No; the point of Kai's methodology is that it simulates the 2500 GM by trying to determine how Komodo with less than a minute per game would do with the handicap. So a 50% score on two threads against any human would mean (if you accept his methodology) that Komodo on two threads would rate 710 above that human, since the human would have performed the same as Komodo with less than a minute per game.