Thermopylay Marathon 2011 (live!)

Discussion of computer chess matches and engine tournaments.

Moderators: hgm, Rebel, chrisw

hgm
Posts: 27789
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Thermopylay Marathon 2011 (live!)

Post by hgm »

Indeed, this one is a lot stronger. :D It is still the version with the movestogo bug; I haven't downloaded the latest one yet. (But as we are playing from the default position, the bug is not triggered.)
Richard Allbert
Posts: 792
Joined: Wed Jul 19, 2006 9:58 am

Re: Thermopylay Marathon 2011 (live!)

Post by Richard Allbert »

Ok, I hope it survives, because I found another protocol problem straight after the upload :) , hence the quick notice.

I've just seen a bishop fork two kings - you don't see that a lot. Fairy has just two kings vs a Queen + a few pieces.
hgm
Posts: 27789
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Thermopylay Marathon 2011 (live!)

Post by hgm »

Are you sure castling is OK? I saw Catalyst play Ke1-f1, where O-O would have seemed much more logical.
Richard Allbert
Posts: 792
Joined: Wed Jul 19, 2006 9:58 am

Re: Thermopylay Marathon 2011 (live!)

Post by Richard Allbert »

There's nothing in the eval to say that's bad.

I'll add it though!

I concentrated a bit more on the Spartan eval with the last update - specifically hoplites and the structure.

There's a lot of room for improvement!
Evert
Posts: 2929
Joined: Sat Jan 22, 2011 12:42 am
Location: NL

Re: Thermopylay Marathon 2011 (live!)

Post by Evert »

Richard Allbert wrote:There's nothing in the eval to say that's bad.

I'll add it though!

I concentrated a bit more on the Spartan eval with the last update - specifically hoplites and the structure.

There's a lot of room for improvement!
When sorting the move list I give a small bonus to castling moves, so they're tried before other king moves when they're possible, and also before other quiet moves (except the hash move, killer moves and possible generalisations of killer moves). This seems to help encourage the program to castle rather than move the king. The only concern then is to make sure the program doesn't wreck its castling rights before it has a chance to castle. :)
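For illustration, here is a rough sketch of that ordering idea in Python. It is not code from Sjaak: the `Move` fields, the tier numbers and the `history` tie-breaker are all assumptions made up for the example. Castling sits in its own tier below the hash move and killers but above every other quiet move, including plain king moves.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Move:
    name: str
    is_castle: bool = False
    history: int = 0  # hypothetical history-heuristic value, used only as a tie-breaker

def order_key(move, hash_move=None, killers=()):
    """Return a sort key: higher tiers are searched first."""
    if move == hash_move:
        tier = 3          # hash move always first
    elif move in killers:
        tier = 2          # then killer moves
    elif move.is_castle:
        tier = 1          # castling ahead of all remaining quiet moves
    else:
        tier = 0          # everything else, including ordinary king moves
    return (tier, move.history)

def order_quiet_moves(moves, hash_move=None, killers=()):
    """Sort quiet moves so the most promising ones are tried first."""
    return sorted(moves, key=lambda m: order_key(m, hash_move, killers), reverse=True)

if __name__ == "__main__":
    moves = [Move("Ke1-f1", history=50), Move("O-O", is_castle=True), Move("a2-a3", history=10)]
    print([m.name for m in order_quiet_moves(moves)])
```

With this key, O-O is tried before Ke1-f1 even when the king move has the larger history value, which is the effect described above. It only changes the order of the search; an eval term would still be needed to make the engine actually prefer the castled position.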
hgm
Posts: 27789
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Thermopylay Marathon 2011 (live!)

Post by hgm »

Hmm, it seems Nebiyu has a bit of a problem seeing checkmates...

Even at 33 ply it does not come up with mate-in-7.
Evert
Posts: 2929
Joined: Sat Jan 22, 2011 12:42 am
Location: NL

Re: Thermopylay Marathon 2011 (live!)

Post by Evert »

hgm wrote:Hmm, it seems Nebiyu has a bit of a problem seeing checkmates...

Even at 33 ply it does not come up with mate-in-7.
Some of the search depths reported by Nebiyu look insane: I've seen it claim 21 ply after a fraction of a second at time odds against Sjaak (which was doing 13-14 ply at the same time). I suspect it's extremely selective in its search.
hgm
Posts: 27789
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Thermopylay Marathon 2011 (live!)

Post by hgm »

Even so, at some point it was even missing mate-in-3, and still evaluating as 9.6 in KQK.

In the last game there was also something that looked suspiciously like a search bug. In one move its score dropped from a steady 4.3 at move 37 (Spartacus ~-2) to 0.3, while anyone could see it had very bad (promotion) trouble coming way before that.

8/2k1k3/1h2h1g1/1PPBhhch/P7/8/2K5/R6R w - - 0 38
[d]8/2k1k3/1p2p1q1/1PPBpprp/P7/8/2K5/R6R w - - 0 38

Code: Select all

[Event "Thermopilae Marathon 2011"]
[Site "SCHAAKPC"]
[Date "2011.02.10"]
[Round "7.3"]
[White "Spartacus 0.23 / 6"]
[Black "Nebiyu 1.1 / 6"]
[Result "1-0"]
[TimeControl "40/1440"]
[Variant "spartan"]
[Number "42"]
[Annotator "1. -1.88   1... +0.30"]

1. Nc3 {-1.88/15 34} Lf6 {+0.30/17 33} 2. Nf3 {-1.58/15 30} Lc6
{+0.40/19 28} 3. d4 {-1.57/16 34} Hbd5 {+0.80/19 33} 4. e4 {-1.55/16 28}
Lxd4 {+1.00/21 30} 5. Nxd4 {-1.50/16 27} Hxd4 {+1.30/19 21} 6. Qxd4
{-1.61/15 35} Ce6 {+1.10/20 24} 7. f3 {-1.52/14 29} Hge5 {+1.50/17 18} 8.
g4 {-1.53/14 31} Hxe4 {+2.00/19 29} 9. h4 {-1.52/14 35} Wf6 {+1.90/19 38}
10. Qf2 {-1.62/14 32} Cdd6 {+1.90/19 39} 11. Bd3 {-1.59/14 36} We5
{+2.10/17 38} 12. a3 {-1.84/14 36} Cd4 {+2.00/18 27} 13. Be2 {-1.96/15 36}
Cf4 {+1.80/18 20} 14. Qg3 {-1.74/14 29} Hhf5 {+1.80/17 56} 15. Rf1
{-1.89/14 36} Hf6 {+1.90/17 38} 16. h5 {-1.93/14 36} Ke8 {+2.00/14 41} 17.
h6 {-1.26/14 28} Kb7 {+1.00/15 29} 18. h7 {-1.03/14 31} Ke7 {+1.20/18 40}
19. Bd2 {-1.28/14 31} Cd4 {+1.70/16 28} 20. Qf2 {-1.95/14 30} Cxd2
{+2.40/17 20} 21. Kxd2 {-2.05/14 38} Gh8 {+2.00/19 22} 22. Rh1
{-1.97/14 38} Wf4+ {+2.30/20 29} 23. Ke1 {-2.23/15 38} Ce5 {+2.00/19 33}
24. fxe4 {-2.10/14 38} Lxe4 {+2.20/20 29} 25. Nxe4 {-2.06/14 31} Cxe4
{+2.20/20 53} 26. gxf5 {-1.86/15 39} Hxf5 {+2.00/17 24} 27. Bf1
{-2.01/14 37} Ce3+ {+2.20/20 24} 28. Kd1 {-2.02/15 35} Hde6 {+2.20/19 31}
29. c4 {-1.72/14 33} Wg3 {+2.30/19 38} 30. Qxg3 {-2.02/16 39} Cxg3
{+2.20/20 31} 31. c5 {-1.98/16 34} Cg5 {+2.90/17 1:05} 32. Kc2
{-2.16/17 42} Hh5 {+3.70/23 48} 33. b4 {-2.61/17 42} Gxh7 {+3.70/21 42} 34.
a4 {-2.36/16 42} He5 {+3.80/18 23} 35. Bc4 {-2.24/15 43} Gg6 {+3.90/18 25}
36. Bd5 {-2.26/15 43} Kc7 {+4.10/19 33} 37. b5 {-1.40/18 43} Hb6
{+4.30/20 55} 38. a5 {+0.19/18 35} He4 {+0.30/19 29} 39. cxb6 {+0.30/17 50}
Kb8 {+0.00/24 1:04} 40. a6 {+0.35/17 45} Hf5 {+0.70/22 41} 41. Bc6
{+1.00/18 27} Gd6 {-1.10/22 34} 42. a7 {+0.97/18 35} Gc5+ {-1.50/23 37} 43.
Kb3 {+1.62/19 35} Gxb6 {-1.30/23 31} 44. axb8=N {+1.96/18 35} Gxb8
{-1.40/23 37} 45. Ra6 {+2.07/17 35} Gd8 {+0.80/19 21} 46. b6 {+4.09/16 35}
Gd3+ {-2.40/18 22} 47. Kb2 {+5.60/16 35} Gd8 {-2.90/18 36} 48. b7
{+6.50/17 35} Cg6 {-6.70/20 23} 49. Ra8 {+7.44/15 26} Gc7 {-7.60/22 34} 50.
b8=Q {+7.79/15 35} Gxb8+ {-7.20/22 42} 51. Rxb8 {+7.51/14 33} Cf6
{-7.90/19 39} 52. Rxh5 {+8.11/13 30} Hd3 {-8.90/19 37} 53. Bb5
{+9.13/13 36} Hd4 {-10.60/21 43} 54. Rh7+ {+11.26/13 36} Kd6 {-10.50/25 48}
55. Rb6+ {+11.64/13 35} Kc5 {-10.80/29 37} 56. Rxf6 {+14.27/14 35} Kxb5
{-10.80/28 34} 57. Rxf5+ {+10.95/14 33} Kc4 {-10.90/28 19} 58. Rh4
{+319.96/85 36} He2 {-10.90/36 39} 59. Kc2 {+319.91/16 36} Hf1=G
{-10.90/29 1:01} 60. Rxf1 {+319.92/17 36} Kd5 {-10.90/27 39} 61. Rf5+
{+319.93/15 37} Ke6 {-10.90/28 19} 62. Ra5 {+319.94/17 29} He3
{-10.90/27 40} 63. Rh6+ {+319.96/19 29} Ke7 {-10.90/26 3} 64. Ra7+
{+319.98/100 30} Kd8 {-999.98/41 0.1} 65. Rh8# {+319.99/100 0.1}
{Xboard adjudication: Checkmate} 1-0
Spartacus had a spurious mate score at move 58 as well. I will also have to look into that. Again you see that Nebiyu does not see the mate coming at all, despite 26 ply, and misses 'best defence' on many occasions because of it.

Code: Select all

Cross table, sorted by score percentage, Buchholz, SB

                              Spar Nebi Sjaa Fair Ober Ches Cata
 1. Spartacus 0.23 / 6        #### =101 0111 =110 10=1 1111 1110   73%  17.5 (266.0, 187.0)
 2. Nebiyu 1.1 / 6            =010 #### 11=0 1111 1110 0=10 1111   69%  16.5 (270.0, 176.5)
 3. Sjaak 92                  1000 00=1 #### 1=10 1011 0011 1101   54%  13.0 (284.0, 139.8)
 4. Fairy-Max  4.8R           =001 0000 0=01 #### 10=1 11=1 1110   50%  12.0 (288.0, 118.8)
 5. Oberon                    01=0 0001 0100 01=0 #### 10=1 1111   48%  11.5 (290.0, 116.5)
 6. ChessV (Spartan)          0000 1=01 1100 00=0 01=0 #### 1000   31%   7.5 (306.0,  96.5)
 7. Catalyst 3                0001 0000 0010 0001 0000 0111 ####   25%   6.0 (312.0,  65.0)
Evert
Posts: 2929
Joined: Sat Jan 22, 2011 12:42 am
Location: NL

Re: Thermopylay Marathon 2011 (live!)

Post by Evert »

hgm wrote:Even so, at some point it was even missing mate-in-3, and still evaluating as 9.6 in KQK.

In the last game there was also something that looked suspiciously like a search bug. In one move its score dropped from a steady 4.3 at move 37 (Spartacus ~-2) to 0.3, while anyone could see it had very bad (promotion) trouble coming way before that.
Perhaps a null-move issue, then?
I think I saw a random mate score in one of Sjaak's games as well, but I don't remember if it was mate for or against. If it was against, it probably just got lucky.

It'll also be interesting to look at positions that the two engines evaluate very differently (and I don't mean -2 versus -4, but things like -3 vs 0, or -2 vs +1). I've seen a few of those and it'd be nice to know first of all whether one engine is systematically right or wrong, and second of all, what evaluation terms cause the large score difference. My guess is the evaluation of passed pawns, but I'm not entirely sure.
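Something like the following could pull those positions out of the saved games automatically. It is only a sketch under a few assumptions: every move carries a `{score/depth time}` comment as in the PGN posted above, the game starts with a White move, and each score is from the point of view of the engine that made the move (so two engines that agree report scores that sum to roughly zero). The function names and the 2.5 threshold are made up for the example.

```python
import re

# One move plus its comment, e.g. "Nc3 {-1.88/15 34}" or "Cg5 {+2.90/17 1:05}".
MOVE_RE = re.compile(r"(\S+)\s*\{([+-]?\d+\.\d+)/(\d+)\s+[^}]*\}")

def scored_moves(pgn_text):
    """Return (move, score, depth) for every commented move, in game order."""
    return [(san, float(score), int(depth))
            for san, score, depth in MOVE_RE.findall(pgn_text)]

def disagreements(pgn_text, threshold=2.5):
    """Pair each White move with Black's reply and flag large evaluation gaps.

    Assumption: scores are from the mover's own point of view, so the gap
    between the two engines is the absolute value of the sum of the scores.
    """
    moves = scored_moves(pgn_text)
    flagged = []
    for i in range(0, len(moves) - 1, 2):
        (w_san, w_score, _), (b_san, b_score, _) = moves[i], moves[i + 1]
        gap = abs(w_score + b_score)
        if gap >= threshold:
            # Move number is counted from the first commented move in the text.
            flagged.append((i // 2 + 1, w_san, w_score, b_san, b_score, gap))
    return flagged

if __name__ == "__main__":
    sample = "1. Nc3 {-1.88/15 34} Lf6 {+0.30/17 33} 2. Nf3 {-1.58/15 30} Lc6 {+0.40/19 28}"
    print(disagreements(sample, threshold=1.5))
```

Mate scores will of course dominate the output, so in practice one would probably want to skip anything above some large bound before looking at which evaluation terms are responsible.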
Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: Thermopylay Marathon 2011 (live!)

Post by Daniel Shawul »

It seems I have a serious bug. I will try to fix it and send it to you later in the afternoon, hopefully before the knockout stages start. I suspect it is a bug in my incremental move generation.