
Re: Thermopylay Marathon 2011 (live!)

Posted: Wed Feb 09, 2011 8:05 pm
by hgm
Indeed, this one is a lot stronger. :D It is still the version with the movestogo bug; I haven't downloaded the latest one yet. (But as we are playing from the default position, the bug is not triggered.)

Re: Thermopylay Marathon 2011 (live!)

Posted: Wed Feb 09, 2011 8:19 pm
by Richard Allbert
OK, I hope it survives, because I found another protocol problem straight after the upload :), hence the quick notice.

I've just seen a bishop fork two kings - you don't see that a lot. Fairy has just two kings vs a Queen + a few pieces.

Re: Thermopylay Marathon 2011 (live!)

Posted: Wed Feb 09, 2011 9:29 pm
by hgm
Are you sure castling is OK? I saw Catalyst play Ke1-f1, where O-O would have seemed much more logical.

Re: Thermopylay Marathon 2011 (live!)

Posted: Thu Feb 10, 2011 6:44 am
by Richard Allbert
There's nothing in the eval to say that's bad.

I'll add it though!

I concentrated a bit more on the Spartan eval with the last update - specifically hoplites and the structure.

There's a lot of room for improvement!

Re: Thermopylay Marathon 2011 (live!)

Posted: Thu Feb 10, 2011 12:40 pm
by Evert
Richard Allbert wrote: There's nothing in the eval to say that's bad.

I'll add it though!

I concentrated a bit more on the Spartan eval with the last update - specifically hoplites and the structure.

There's a lot of room for improvement!
When sorting the move list I give a small bonus to castling moves, so they're tried before other king moves if they're possible, and also before other quiet moves (except the hash move, killer moves and possible generalisations of killer moves). This seems to help encourage the program to castle rather than move the king. The only concern then is to make sure the program doesn't wreck its castling rights before it has a chance to castle. :)
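A rough sketch of the ordering described above, in Python for brevity. The `Move` record, the `order_quiet_moves` function and the bucket numbers are all made up for illustration and are not taken from any of the engines in this tournament; captures and the killer-move generalisations are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Move:
    """Hypothetical move record; real engines usually pack this into an int."""
    name: str
    is_castling: bool = False


def order_quiet_moves(moves, hash_move=None, killers=()):
    """Sort quiet moves: hash move, killers, castling, then everything else."""
    def bucket(move):
        if move == hash_move:
            return 0
        if move in killers:
            return 1
        if move.is_castling:
            return 2  # the "small bonus": ahead of all other quiet moves
        return 3
    # sorted() is stable, so moves keep their generation order within a bucket.
    return sorted(moves, key=bucket)


moves = [Move("Ke1-f1"), Move("a2-a3"), Move("O-O", is_castling=True), Move("Ng1-f3")]
ordered = order_quiet_moves(moves, killers=(Move("a2-a3"),))
print([m.name for m in ordered])  # ['a2-a3', 'O-O', 'Ke1-f1', 'Ng1-f3']
```

With this ordering O-O is searched before Ke1-f1 whenever both are legal, so on equal scores the search keeps the castling move.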

Re: Thermopylay Marathon 2011 (live!)

Posted: Thu Feb 10, 2011 1:32 pm
by hgm
Hmm, it seems Nebiyu has a bit of a problem seeing checkmates...

Even at 33 ply it does not come up with mate-in-7.

Re: Thermopylay Marathon 2011 (live!)

Posted: Thu Feb 10, 2011 2:15 pm
by Evert
hgm wrote: Hmm, it seems Nebiyu has a bit of a problem seeing checkmates...

Even at 33 ply it does not come up with mate-in-7.
Some of the search depths reported by Nebiyu look insane: I've seen it claim 21 ply after a fraction of a second at time odds against Sjaak (which was doing 13-14 ply at the same time). I suspect it's extremely selective in its search.

Re: Thermopylay Marathon 2011 (live!)

Posted: Thu Feb 10, 2011 3:16 pm
by hgm
Even so, at some point it was even missing mate-in-3, and still evaluating as 9.6 in KQK.

In the last game there was also something that looked suspiciously like a search bug. In one move its score dropped from a steady 4.3 at move 37 (Spartacus ~-2) to 0.3, while anyone could see it had very bad (promotion) trouble coming way before that.

8/2k1k3/1h2h1g1/1PPBhhch/P7/8/2K5/R6R w - - 0 38
[d]8/2k1k3/1p2p1q1/1PPBpprp/P7/8/2K5/R6R w - - 0 38

Code:

[Event "Thermopilae Marathon 2011"]
[Site "SCHAAKPC"]
[Date "2011.02.10"]
[Round "7.3"]
[White "Spartacus 0.23 / 6"]
[Black "Nebiyu 1.1 / 6"]
[Result "1-0"]
[TimeControl "40/1440"]
[Variant "spartan"]
[Number "42"]
[Annotator "1. -1.88   1... +0.30"]

1. Nc3 {-1.88/15 34} Lf6 {+0.30/17 33} 2. Nf3 {-1.58/15 30} Lc6
{+0.40/19 28} 3. d4 {-1.57/16 34} Hbd5 {+0.80/19 33} 4. e4 {-1.55/16 28}
Lxd4 {+1.00/21 30} 5. Nxd4 {-1.50/16 27} Hxd4 {+1.30/19 21} 6. Qxd4
{-1.61/15 35} Ce6 {+1.10/20 24} 7. f3 {-1.52/14 29} Hge5 {+1.50/17 18} 8.
g4 {-1.53/14 31} Hxe4 {+2.00/19 29} 9. h4 {-1.52/14 35} Wf6 {+1.90/19 38}
10. Qf2 {-1.62/14 32} Cdd6 {+1.90/19 39} 11. Bd3 {-1.59/14 36} We5
{+2.10/17 38} 12. a3 {-1.84/14 36} Cd4 {+2.00/18 27} 13. Be2 {-1.96/15 36}
Cf4 {+1.80/18 20} 14. Qg3 {-1.74/14 29} Hhf5 {+1.80/17 56} 15. Rf1
{-1.89/14 36} Hf6 {+1.90/17 38} 16. h5 {-1.93/14 36} Ke8 {+2.00/14 41} 17.
h6 {-1.26/14 28} Kb7 {+1.00/15 29} 18. h7 {-1.03/14 31} Ke7 {+1.20/18 40}
19. Bd2 {-1.28/14 31} Cd4 {+1.70/16 28} 20. Qf2 {-1.95/14 30} Cxd2
{+2.40/17 20} 21. Kxd2 {-2.05/14 38} Gh8 {+2.00/19 22} 22. Rh1
{-1.97/14 38} Wf4+ {+2.30/20 29} 23. Ke1 {-2.23/15 38} Ce5 {+2.00/19 33}
24. fxe4 {-2.10/14 38} Lxe4 {+2.20/20 29} 25. Nxe4 {-2.06/14 31} Cxe4
{+2.20/20 53} 26. gxf5 {-1.86/15 39} Hxf5 {+2.00/17 24} 27. Bf1
{-2.01/14 37} Ce3+ {+2.20/20 24} 28. Kd1 {-2.02/15 35} Hde6 {+2.20/19 31}
29. c4 {-1.72/14 33} Wg3 {+2.30/19 38} 30. Qxg3 {-2.02/16 39} Cxg3
{+2.20/20 31} 31. c5 {-1.98/16 34} Cg5 {+2.90/17 1:05} 32. Kc2
{-2.16/17 42} Hh5 {+3.70/23 48} 33. b4 {-2.61/17 42} Gxh7 {+3.70/21 42} 34.
a4 {-2.36/16 42} He5 {+3.80/18 23} 35. Bc4 {-2.24/15 43} Gg6 {+3.90/18 25}
36. Bd5 {-2.26/15 43} Kc7 {+4.10/19 33} 37. b5 {-1.40/18 43} Hb6
{+4.30/20 55} 38. a5 {+0.19/18 35} He4 {+0.30/19 29} 39. cxb6 {+0.30/17 50}
Kb8 {+0.00/24 1:04} 40. a6 {+0.35/17 45} Hf5 {+0.70/22 41} 41. Bc6
{+1.00/18 27} Gd6 {-1.10/22 34} 42. a7 {+0.97/18 35} Gc5+ {-1.50/23 37} 43.
Kb3 {+1.62/19 35} Gxb6 {-1.30/23 31} 44. axb8=N {+1.96/18 35} Gxb8
{-1.40/23 37} 45. Ra6 {+2.07/17 35} Gd8 {+0.80/19 21} 46. b6 {+4.09/16 35}
Gd3+ {-2.40/18 22} 47. Kb2 {+5.60/16 35} Gd8 {-2.90/18 36} 48. b7
{+6.50/17 35} Cg6 {-6.70/20 23} 49. Ra8 {+7.44/15 26} Gc7 {-7.60/22 34} 50.
b8=Q {+7.79/15 35} Gxb8+ {-7.20/22 42} 51. Rxb8 {+7.51/14 33} Cf6
{-7.90/19 39} 52. Rxh5 {+8.11/13 30} Hd3 {-8.90/19 37} 53. Bb5
{+9.13/13 36} Hd4 {-10.60/21 43} 54. Rh7+ {+11.26/13 36} Kd6 {-10.50/25 48}
55. Rb6+ {+11.64/13 35} Kc5 {-10.80/29 37} 56. Rxf6 {+14.27/14 35} Kxb5
{-10.80/28 34} 57. Rxf5+ {+10.95/14 33} Kc4 {-10.90/28 19} 58. Rh4
{+319.96/85 36} He2 {-10.90/36 39} 59. Kc2 {+319.91/16 36} Hf1=G
{-10.90/29 1:01} 60. Rxf1 {+319.92/17 36} Kd5 {-10.90/27 39} 61. Rf5+
{+319.93/15 37} Ke6 {-10.90/28 19} 62. Ra5 {+319.94/17 29} He3
{-10.90/27 40} 63. Rh6+ {+319.96/19 29} Ke7 {-10.90/26 3} 64. Ra7+
{+319.98/100 30} Kd8 {-999.98/41 0.1} 65. Rh8# {+319.99/100 0.1}
{Xboard adjudication: Checkmate} 1-0
Spartacus had a spurious mate score at move 58 as well. I will also have to look into that. Again you see that Nebiyu does not see the mate coming at all, despite 26 ply, and misses 'best defence' on many occasions because of it.

Code:

Cross table, sorted by score percentage, Buchholz, SB

                              Spar Nebi Sjaa Fair Ober Ches Cata
 1. Spartacus 0.23 / 6        #### =101 0111 =110 10=1 1111 1110   73%  17.5 (266.0, 187.0)
 2. Nebiyu 1.1 / 6            =010 #### 11=0 1111 1110 0=10 1111   69%  16.5 (270.0, 176.5)
 3. Sjaak 92                  1000 00=1 #### 1=10 1011 0011 1101   54%  13.0 (284.0, 139.8)
 4. Fairy-Max  4.8R           =001 0000 0=01 #### 10=1 11=1 1110   50%  12.0 (288.0, 118.8)
 5. Oberon                    01=0 0001 0100 01=0 #### 10=1 1111   48%  11.5 (290.0, 116.5)
 6. ChessV (Spartan)          0000 1=01 1100 00=0 01=0 #### 1000   31%   7.5 (306.0,  96.5)
 7. Catalyst 3                0001 0000 0010 0001 0000 0111 ####   25%   6.0 (312.0,  65.0)
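As a quick check on the table, the score percentage is just points divided by games played. The small sketch below assumes the usual reading of the result strings ('1' win, '=' draw, '0' loss); the helper name `row_points` is made up for this example.

```python
def row_points(results):
    """Sum a cross-table row: '1' is a win, '=' a draw, '0' a loss."""
    value = {"1": 1.0, "=": 0.5, "0": 0.0}
    games = [c for c in results if c in value]  # skip the spaces between opponents
    return sum(value[c] for c in games), len(games)


# Spartacus' row from the table above.
points, games = row_points("=101 0111 =110 10=1 1111 1110")
print(points, games, round(100 * points / games))  # 17.5 24 73
```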

Re: Thermopylay Marathon 2011 (live!)

Posted: Thu Feb 10, 2011 3:35 pm
by Evert
hgm wrote: Even so, at some point it was even missing mate-in-3, and still evaluating as 9.6 in KQK.

In the last game there was also something that looked suspiciously like a search bug. In one move its score dropped from a steady 4.3 at move 37 (Spartacus ~-2) to 0.3, while anyone could see it had very bad (promotion) trouble coming way before that.
Perhaps a null-move issue, then?
I think I saw a random mate score in one of Sjaak's games as well, but I don't remember if it was mate for or against. If it was against, it probably just got lucky.

It'll also be interesting to look at positions that the two engines evaluate very differently (and I don't mean -2 versus -4, but things like -3 vs 0, or -2 vs +1). I've seen a few of those and it'd be nice to know first of all whether one engine is systematically right or wrong, and second of all, what evaluation terms cause the large score difference. My guess is the evaluation of passed pawns, but I'm not entirely sure.
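One way to find such positions would be to scan the score comments in the PGN. The sketch below is only a starting point: it assumes every move carries a `{score/depth time}` comment as in the game posted earlier, that comments therefore alternate White, Black, and that each engine reports the score from its own side. The function names and the 2.0 threshold are arbitrary choices for illustration.

```python
import re

# Matches move comments such as {+0.30/17 33} or {+2.90/17 1:05}:
# score in pawns, then search depth, then time used.
SCORE_COMMENT = re.compile(r"\{([+-]\d+\.\d+)/(\d+)\s+[^}]*\}")


def engine_scores(movetext):
    """Return the (score, depth) pairs in the order they appear."""
    return [(float(s), int(d)) for s, d in SCORE_COMMENT.findall(movetext)]


def disagreements(movetext, threshold=2.0, first_move=1):
    """List (move_number, white_view, black_view) where the engines differ.

    Both scores are expressed from White's point of view before comparing.
    """
    scores = engine_scores(movetext)
    found = []
    for i in range(0, len(scores) - 1, 2):
        white = scores[i][0]
        black_as_white = -scores[i + 1][0]  # flip Black's score to White's view
        if abs(white - black_as_white) >= threshold:
            found.append((first_move + i // 2, white, black_as_white))
    return found


sample = """36. Bd5 {-2.26/15 43} Kc7 {+4.10/19 33} 37. b5 {-1.40/18 43} Hb6
{+4.30/20 55} 38. a5 {+0.19/18 35} He4 {+0.30/19 29}"""
for number, white, black in disagreements(sample, first_move=36):
    print(number, white, black)  # 37 -1.4 -4.3
```

On this fragment only move 37 is flagged: Spartacus thought it was 1.4 down while Nebiyu thought it was 4.3 up, just before Nebiyu's score collapsed.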

Re: Thermopylay Marathon 2011 (live!)

Posted: Thu Feb 10, 2011 4:51 pm
by Daniel Shawul
It seems I have a serious bug. I will try to fix it and send it to you later in the afternoon, hopefully before the knockout stages start. I suspect it is a bug in my incremental move generation.