Test Suite for evaluating qSearch ?

Fguy64 · Post by **Fguy64** » Sat Oct 24, 2009 5:01 am

OK, I have been using the WAC suite for testing my alphaBeta search, and the positions are useful when qSearch is turned off, because you can see with your own eyes exactly how many plies are required and what the final eval should be.

But things get a little cloudy with qSearch, for obvious reasons. So I was hoping that someone could recommend a test suite with some benchmarks, some cut-and-dried numbers that say something like: if your alphabeta search is set for n ply, and you have a full capture search with MVV/LVA, you should solve the position at the given regular search depth.

See what I mean? Anyway, I should probably have perft or something, but I don't and probably won't for a while, so a test suite is the way for me to go.

regards.
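
For readers unfamiliar with the MVV/LVA ordering mentioned above, here is a minimal sketch of the idea: try the capture of the most valuable victim first, and among equal victims prefer the least valuable attacker. The piece values and the `(attacker, victim)` tuple representation are assumptions made for illustration, not taken from any particular engine.

```python
# Illustrative piece values in centipawns (an assumption, not a standard).
PIECE_VALUE = {"P": 100, "N": 300, "B": 300, "R": 500, "Q": 900, "K": 10000}


def mvv_lva_key(capture):
    """Sort key: most valuable victim first, then least valuable attacker."""
    attacker, victim = capture
    return (-PIECE_VALUE[victim], PIECE_VALUE[attacker])


def order_captures(captures):
    """Return the captures in MVV/LVA order without modifying the input."""
    return sorted(captures, key=mvv_lva_key)


# Each capture is a hypothetical (attacker, victim) pair of piece letters.
captures = [("Q", "P"), ("P", "R"), ("N", "Q"), ("R", "Q"), ("P", "Q")]
ordered = order_captures(captures)
print(ordered)
```

Pawn-takes-queen sorts ahead of rook-takes-queen, and every queen capture sorts ahead of the rook and pawn captures.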

Dann Corbit · Post by **Dann Corbit** » Sat Oct 24, 2009 6:12 am

Here are some very active positions (several are from real games; the position with 218 possible moves is contrived):
r1b1rqk1/pp3pbp/4p1p1/1Q6/4BBN1/R7/1P3PPP/2R3K1 w - - acd 13; acn 147362823; acs 60; ce 516; pv Nh6+ Kh8 Bd6 Bxh6 Bxf8 Rxf8 Rc2 Bg7 Bxb7 Bxb7 Qxb7 Rab8 Qe7 a5 g3;
r2qr2k/1b1n2pp/p3Qb2/npp5/5B2/2N2N2/PPB2PPP/R3R1K1 w - - acd 12; acn 153418423; acs 60; ce 79; pv Qh3 Nf8 Rxe8 Qxe8 Re1 Qf7 Ng5 Qg8 Qf5 Rd8 Nce4 Bxe4 Nxe4 Bxb2 Nxc5 Qxa2 Nxa6;
rnb1qrk1/2p3pp/pn1bp3/1P2N1B1/2pR4/2N5/1PQ1BPPP/R5K1 w - - acd 12; acn 138298998; acs 60; ce 81; pv Nxc4 Nxc4 Bxc4 Qh5 h4 Bc5 Be2 Qe8 Rd8 Bxf2+ Kh1 Qg6 Qxg6 hxg6 Rxc8 Rxc8 bxa6;
3Q4/1Q4Q1/4Q3/2Q4R/Q4Q2/3Q4/1Q4Rp/1K1BBNNk w - - acd 4; acn 1049094; acs 1; ce 32766; pv Rhxh2#;
r1bqkb1r/ppp5/2n4p/3p1pp1/Q2PnP2/2PB4/PP1NN1PP/R1B2RK1 b kq - acd 11; acn 99227043; acs 60; ce -82; pv a6 Nf3 Be7 Ne5 Bd7 Nxd7 Qxd7 fxg5 hxg5 Bb5 Bd6 h3;

This position will cause more than half the engines that try to solve it to crash. It's not a legal position because it cannot happen in a real game. But if your engine can survive this, then it is very robust.
rnbqkbnr/qqqqqqqq/8/8/8/8/QQQQQQQQ/RNBQKBNR w - - acd 6/32; acn 357064337; acs 520; bm Qfxf7+; ce 609; pv Qfxf7+ Qxf7 Qhxc7 Qd7xc7 Rxh7 Qxd2+ Qdxd2 Qxh7 Qxc7 Qxc7 Qxa8 Qxa8 Qxa8;
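
For anyone who wants to feed records like these to a test harness, here is a minimal sketch of splitting one EPD line into its four position fields and its operations (`acd`, `acn`, `acs`, `ce`, `pv`, `bm`). The function name `parse_epd` is hypothetical, and the sketch assumes no operand contains a semicolon inside a quoted string.

```python
def parse_epd(line):
    """Split one EPD record into (position fields, operations dict)."""
    # The first four whitespace-separated fields describe the position:
    # piece placement, side to move, castling rights, en passant square.
    parts = line.strip().split(None, 4)
    board, side, castling, ep = parts[:4]
    position = {"board": board, "side": side, "castling": castling, "ep": ep}

    # Everything after that is a list of "opcode operand;" operations.
    ops = {}
    rest = parts[4] if len(parts) > 4 else ""
    for chunk in rest.split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue
        opcode, _, operand = chunk.partition(" ")
        ops[opcode] = operand.strip()
    return position, ops


record = ("3Q4/1Q4Q1/4Q3/2Q4R/Q4Q2/3Q4/1Q4Rp/1K1BBNNk w - - "
          "acd 4; acn 1049094; acs 1; ce 32766; pv Rhxh2#;")
position, ops = parse_epd(record)
print(position["side"], ops["acd"], ops["ce"], ops["pv"])
```

Operands are kept as strings here, so a caller that needs numbers would convert them itself, for example `int(ops["acd"])`.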

Fguy64 · Post by **Fguy64** » Sat Oct 24, 2009 6:55 am

Thanks Dann, but I think I need something a little less challenging. You're talking to a guy who hasn't yet mastered the art of collecting the principal variation and has no transposition table; it's a pretty vanilla setup I have. Plain old alphaBeta + Q, and no positional evaluation. I tried the first problem; my engine wanted to play ...Bd6.

Dann Corbit · Post by **Dann Corbit** » Sat Oct 24, 2009 10:14 am

The idea of these EPD records is to see if your qsearch explodes due to extensions or things of that nature.

Some engines will never exit from ply 1 before they crash in a ball of flame.

Fguy64 · Post by **Fguy64** » Sat Oct 24, 2009 2:51 pm

OK thanks, I'll file them for future reference. I think it is clear that my immediate needs are more basic.

Is my original question a reasonable one? I'm looking for a set of basic positions where I can establish a base set of conditions such as negamax + abp + MVV/LVA + Q, with a given set of piece values, and be able to identify what ply I should have to set my regular search at to get a given eval. Furthermore, it would be easy for me to gauge the contribution of the Q by virtue of the fact that I can easily turn it off and on. All this without perft or collecting the PV. And the WAC positions have been very helpful for evaluating my regular search without using Q.
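
To make the "turn Q off and on" idea concrete, here is a minimal sketch on a hand-built toy tree rather than real chess positions. The `Node` class, the `use_q` flag and the evals are all assumptions made for illustration; the point is only that the same fixed-depth negamax returns a different value once the leaves are resolved by a capture-only quiescence search with a stand-pat score.

```python
from dataclasses import dataclass, field

INF = 10**9


@dataclass
class Node:
    static_eval: int                               # from the side to move's view
    captures: list = field(default_factory=list)   # children reached by captures
    quiets: list = field(default_factory=list)     # children reached by quiet moves


def qsearch(node, alpha, beta):
    """Capture-only search; the side to move may also 'stand pat'."""
    stand_pat = node.static_eval
    if stand_pat >= beta:
        return stand_pat
    alpha = max(alpha, stand_pat)
    for child in node.captures:
        score = -qsearch(child, -beta, -alpha)
        if score >= beta:
            return score
        alpha = max(alpha, score)
    return alpha


def negamax(node, depth, alpha, beta, use_q):
    """Fixed-depth alpha-beta; leaves use qsearch only when use_q is True."""
    if depth == 0:
        return qsearch(node, alpha, beta) if use_q else node.static_eval
    children = node.captures + node.quiets
    if not children:
        return node.static_eval
    best = -INF
    for child in children:
        score = -negamax(child, depth - 1, -beta, -alpha, use_q)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break
    return best


# Toy line: our quiet move looks like it wins a pawn (+100 for us), but the
# opponent has a recapture that leaves us 400 down.
after_recapture = Node(static_eval=-400)
after_our_move = Node(static_eval=-100, captures=[after_recapture])
root = Node(static_eval=0, quiets=[after_our_move])

without_q = negamax(root, 1, -INF, INF, use_q=False)
with_q = negamax(root, 1, -INF, INF, use_q=True)
print(without_q, with_q)
```

At depth 1 the plain search trusts the static eval after the quiet move, while the version with Q sees the recapture. That gap is the kind of per-position difference a qsearch benchmark would need to pin down.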

P. Villanueva · Post by **P. Villanueva** » Sat Oct 24, 2009 2:56 pm

Dann Corbit wrote:This position will cause more than half the engines that try to solve it to crash. It's not a legal position because it cannot happen in a real game. But if your engine can survive this, then it is very robust.
rnbqkbnr/qqqqqqqq/8/8/8/8/QQQQQQQQ/RNBQKBNR w - - acd 6/32; acn 357064337; acs 520; bm Qfxf7+; ce 609; pv Qfxf7+ Qxf7 Qhxc7 Qd7xc7 Rxh7 Qxd2+ Qdxd2 Qxh7 Qxc7 Qxc7 Qxa8 Qxa8 Qxa8;

ply  score  nodes   time     bf    principal variation
1    1199   235M    9m17s    *     a2f7 g7f7 f2f7 h7f7 h2c7 d8c7 a1a7 b7a7 c2c7 a7c7 d2d7 b8d7 h1h8
2    1199   254M    10m04s   *     a2f7 g7f7 f2f7 h7f7 h2c7 d8c7 a1a7 b7a7 c2c7 a7c7 d2d7 b8d7 h1h8
3    1199   272M    10m48s   1.08  a2f7 g7f7 f2f7 h7f7 h2c7 d8c7 a1a7 b7a7 c2c7 a7c7 d2d7 b8d7 h1h8
4    1199   1369M   56m08s   2.32  a2f7 g7f7 f2f7 h7f7 h2c7 d8c7 a1a7 b7a7 c2c7 a7c7 d2d7 b8d7 h1h8

NPS: 405240
Time: 360000cs
Branching factor: 1.70
TT cutoffs: 0%
Depth: 4

235 million nodes for the first iteration!!!
My KMT Chess doesn't crash, but it can't find the best move after a one-hour search. It thinks it can win a queen instead of a rook.

P. Villanueva · Post by **P. Villanueva** » Sat Oct 24, 2009 3:23 pm

Rybka 2 finds the same first move as KMT, with a different score and PV.

FEN: rnbqkbnr/qqqqqqqq/8/8/8/8/QQQQQQQQ/RNBQKBNR w - - 0 1

Rybka v2.2n2.mp.w32:
2 04:18 22.046.633 87.409 +4,44 Qf2xf7+
3 05:57 30.440.138 87.230 +4,44 Qf2xf7+
4 08:12 43.408.557 90.303 +2,95 Qf2xf7+ Qg7xf7 Qc2xc7
4 08:29 45.172.063 90.762 +5,03 Qa2xf7+ Qg7xf7 Ra1xa7
5 10:14 56.252.737 93.702 +5,39 Qa2xf7+ Qg7xf7 Ra1xa7 Qb7xg2
6 11:56 66.980.925 95.687 +5,40 Qa2xf7+ Qg7xf7 Ra1xa7 Qb7xg2 Qd2xd7+ Nb8xd7 Qf2xf7+ Qh7xf7 Qh2xg2
7 13:31 76.908.000 97.087 +5,49 Qa2xf7+ Qg7xf7 Ra1xa7 Qc7xh2 Rh1xh2 Qb7xg2 Qf2xf7+ Qh7xf7 Bf1xg2
8 17:20 101.053.671 99.505 +5,49 Qa2xf7+ Qg7xf7 Ra1xa7 Qc7xh2 Rh1xh2 Qb7xg2 Qf2xf7+ Qh7xf7 Bf1xg2 Ra8xa7

michiguel · Post by **michiguel** » Sat Oct 24, 2009 6:11 pm

P. Villanueva wrote: Rybka 2 finds the same first move as KMT, with a different score and PV.

Gaviota 0.74 later changes to another move

Code: Select all

       435   1       0.0    -2.71  1.Qhxc7
     51779   1       0.3    +1.39  1.Qaxf7+ Qxf7 2.Qxd7+ Qdxd7
    322304   2       1.9    +1.23  1.Qaxf7+ Qxf7 2.Qxd7+ Qdxd7 3.Qxd7+
                                   Nxd7 4.Qxf7+ Qxf7 5.Qhxc7 Qxc7 6.Qxa8
   1942939   3       5.4    +1.22  1.Qaxf7+ Qxf7 2.Qxd7+ Bxd7 3.Qhxc7
                                   Qaxf2+ 4.Qxf2
   8271710   4      19.2    +1.80  1.Qaxf7+ Qxf7 2.Qxd7+ Bxd7 3.Qxf7+ Qxf7
                                   4.Qhxc7 Qdxc7 5.Qxc7
  27373800   5      54.1    +2.19  1.Qaxf7+ Qxf7 2.Qxd7+ Nxd7 3.Qhxh7 Rxh7
                                   4.Rxa7 Qa5+ 5.Nc3 Qbxa7 6.Qxh7
  84461515   6     147.1      :-)  1.Qaxf7+
 124675879   6     213.2    +2.91  1.Qaxf7+ Qxf7 2.Qxd7+ Qdxd7 3.Qxd7+
                                   Qxd7 4.Qhxh7 Rxh7 5.Rxa7 Qxf2+ 6.Qxf2
                                   Rxh1 7.Rxb7 Bxb7 8.Qxe7+ Bxe7
 175777069   6:    298.4    +2.91  1.Qaxf7+ Qxf7 2.Qxd7+ Qdxd7 3.Qxd7+
                                   Qxd7 4.Qhxh7 Rxh7 5.Rxa7 Qxf2+ 6.Qxf2
                                   Rxh1 7.Rxb7 Bxb7 8.Qxe7+ Bxe7
 320042730   7     534.3    +2.99  1.Qaxf7+ Qxf7 2.Qxd7+ Qcxd7 3.Qxd7+
                                   Qdxd7 4.Qhxh7 Rxh7 5.Rxa7 Qxf2+ 6.Qxf2
                                   Rxh1 7.Qg6+ Kd8 8.Rxb7 Qxe2+ 9.Bxe2
                                   Bxb7
1243731417   7    1994.8    +3.04  1.Qfxf7+ Qxf7 2.Qcxh7 Qxd2+ 3.Bxd2
                                   Qaxa2 4.Qxa2 Nh6 5.Qaxf7+ Nxf7 6.Rxa8
                                   Rxh7 7.Qxc7
1250599592   7:   2005.7    +3.04  1.Qfxf7+ Qxf7 2.Qcxh7 Qxd2+ 3.Bxd2
                                   Qaxa2 4.Qxa2 Nh6 5.Qaxf7+ Nxf7 6.Rxa8
                                   Rxh7 7.Qxc7
1810768561   8    2843.0    +3.04  1.Qfxf7+ Qxf7 2.Qcxh7 Qxd2+ 3.Bxd2
                                   Qaxa2 4.Qxa2 Nh6 5.Qaxf7+ Nxf7 6.Rxa8
                                   Rxh7 7.Qxc7
2277907910   8    3588.9    +3.14  1.Qaxf7+ Qxf7 2.Qxd7+ Nxd7 3.Qhxh7 Rxh7
                                   4.Rxh7 Qca5+ 5.Rxa5 Qdxa5+ 6.Nc3 Qxh7
                                   7.Qxh7
2621727371   8:   4126.5    +3.14  1.Qaxf7+ Qxf7 2.Qxd7+ Nxd7 3.Qhxh7 Rxh7
                                   4.Rxh7 Qca5+ 5.Rxa5 Qdxa5+ 6.Nc3 Qxh7
                                   7.Qxh7
3210177585   9    5019.9    +3.39  1.Qaxf7+ Qxf7 2.Qxd7+ Bxd7 3.Qhxh7
                                   Qaxf2+ 4.Qxf2 Rxh7 5.Rxh7 Qxf2+ 6.Kxf2
                                   Qxc2 7.Rxe7+ Qxe7 8.Qxb7
3899450090   9:   6071.6    +3.39  1.Qaxf7+ Qxf7 2.Qxd7+ Bxd7 3.Qhxh7
                                   Qaxf2+ 4.Qxf2 Rxh7 5.Rxh7 Qxf2+ 6.Kxf2
                                   Qxc2 7.Rxe7+ Qxe7 8.Qxb7
8418052761  10   13003.6    +3.43  1.Qaxf7+ Qxf7 2.Qxd7+ Nxd7 3.Qhxh7 Rxh7
                                   4.Rxh7 Qaxf2+ 5.Qxf2 Qxf2+ 6.Kxf2 Qbb6+
                                   7.Qxb6 Qxb6+ 8.Kg3 Qdc7+ 9.Qxc7 Qxc7+
                                   10.Bf4
11533922311  10   17623.7    +3.68  1.Qxd7+ Bxd7 2.Qaxf7+ Qxf7 3.Qhxh7
                                   Qaxf2+ 4.Qxf2 Rxh7 5.Rxh7 Qxf2+ 6.Kxf2
                                   Qbb6+ 7.Qdd4 Qxc2 8.Rxe7+ Qxe7 9.Qbxb6
13308312234  10:  20342.2    +3.68  1.Qxd7+ Bxd7 2.Qaxf7+ Qxf7 3.Qhxh7
                                   Qaxf2+ 4.Qxf2 Rxh7 5.Rxh7 Qxf2+ 6.Kxf2
                                   Qbb6+ 7.Qdd4 Qxc2 8.Rxe7+ Qxe7 9.Qbxb6
14979768473  11   22845.8      :-)  1.Qxd7+

with the following stats. Quiescence-search nodes make up 0.80 of the total, so they outnumber regular (path) nodes by about 4:1.

Code: Select all

Score: 3.68 (899)  Evals: 11492496967   Time: 23639.4s   nps: 655082  Q/all: 0.80
-------------------------------------------------------------------
               nodes        cutoffs         missed       tree_exp
-------------------------------------------------------------------
path      3039589229      380458655       27694506         1.07
quies    12446196099     3458656721      284531786         1.08
all      15485785328     3839115376      312226292         1.08
-------------------------------------------------------------------
hashtable=  attempts: 3039506276   hits: 22.6%   perfect: 15.6%

Side-to-wait attack calls: 131255213 calls/node: 0.848%
Lazy evals = low:0  cutoff:0  normal:0
-------------------------------------------------------------------
                            hits         missed     efficiency
-------------------------------------------------------------------
 BB cached in eval    1036161423    10426824476      0.090
  Checks Cheap/Exp   17640005585              0      1.000
InChecks Cheap/Exp   16149600862     1453842288      0.917
Attacks generation  219278755024    74228964011      0.747
       Attacks SEE     320159739      745991839      0.300
         Pawn hash             0    11462979098      0.000
     Material hash   11460955298        2029680      1.000
 Make normal moves   17640005498              0      1.000
-------------------------------------------------------------------

GTB CACHE STATS
  probes:                 0
  efficiency:           0.0%
  average searches      0.0
  occupancy:            0.0%

Profile = fragment_failhigh()
Total    time: 23639.44 s
External time: 127.93 s
Internal time: 23511.52 s
Ratio Internal/Total: 99.46
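
The 0.80 figure and the roughly 4:1 ratio can be re-derived from the node counts in the stats table above. This is only a small arithmetic check using the printed numbers; the variable names are mine, not Gaviota's.

```python
# Node counts copied from the "path", "quies" and "all" rows of the table.
path_nodes = 3039589229
quies_nodes = 12446196099
all_nodes = 15485785328
total_seconds = 23639.44  # "Total time" from the profile section

# The two rows should add up to the "all" row.
assert path_nodes + quies_nodes == all_nodes

q_fraction = quies_nodes / all_nodes      # share of nodes spent in qsearch
q_to_path = quies_nodes / path_nodes      # qsearch nodes per regular node
nps = all_nodes / total_seconds           # nodes per second over the run

print(f"Q/all = {q_fraction:.2f}")
print(f"quies : path = {q_to_path:.1f} : 1")
print(f"nps = {nps:.0f}")
```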

Martin Brown · Post by **Martin Brown** » Sun Oct 25, 2009 11:44 am

Fguy64 wrote: OK, I have been using the WAC suite for testing my alphaBeta search, and they are useful for when qSearch is turned off, cause you can see with your own eyes exactly how many ply is required and what the final eval should be.

Actually I was wondering about the same sort of thing. It would be nice to have a simple toy evaluation function that could be used to produce a set of perft-style numbers (BTW, why is it called that? I am a newbie here) for basic testing of alpha-beta with and without various enhancements like killer moves, null move and qsearch. My setup could do some of these tests, but whether it would give the right answers is another matter!

I also tweak my engine based on the testing with WAC suite. It manages about 180/300 at ply 4, 200/300 at 5 ply and 240/300 at 6 ply with a still buggy qsearch on and with it off gives 111, 119 and 158 respectively. It is a bit depressing that state of the art amateur engines now score 296/300!!!
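
A score such as 180/300 is the number of suite positions where the engine's chosen move matches one of the best moves listed for that position. Here is a minimal sketch of that tally, assuming the engine's answers are already available as SAN strings. The position ids and moves below are made-up placeholders, not real WAC data or real engine output.

```python
def suite_score(expected, answers):
    """Count positions whose answer is among the expected best moves."""
    solved = sum(
        1
        for pos_id, best_moves in expected.items()
        if answers.get(pos_id) in best_moves
    )
    return solved, len(expected)


# Placeholder data: position id -> set of accepted best moves.
expected = {"pos1": {"Qg6"}, "pos2": {"Rxb2"}, "pos3": {"Nf6+", "Qh7+"}}
# Placeholder data: position id -> move the engine chose at a fixed depth.
answers = {"pos1": "Qg6", "pos2": "Kh1", "pos3": "Qh7+"}

solved, total = suite_score(expected, answers)
print(f"{solved}/{total}")
```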

The engine is a resurrected version of a very old VAX Pascal chess program written, according to the surviving headers, by Bob Kushlis with a little help from Peter Gilbert. I have improved it to meet modern expectations and removed obvious bugs. If I could find either of these guys to ask permission I would show the source code, which is presently ported to XDS Modula-2. I doubt it will ever be massively strong, but it is of some historical interest. And rarity value: I don't know of any other surviving chess engines implemented in Modula-2.

Fguy64 · Post by **Fguy64** » Sun Oct 25, 2009 2:50 pm

I don't even know what you mean by 296/300. I just wing it with these positions. As a benchmark I see how many ply it takes to find a given solution (move & eval) then I measure milliseconds and leaf nodes. I then judge the success of any improvements to my algorithm on whether or not there is a significant reduction in time/nodes to reach the same solution. If I get a different eval or move for a given search depth, then I know there is a problem.
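
That benchmarking routine can be sketched as a small regression check: at the same depth the move and eval must be unchanged, and only then are the node and time ratios worth looking at. The `Result` record, the `compare` function and the sample numbers are all hypothetical, chosen only to illustrate the comparison.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    move: str       # best move reported at the fixed search depth
    eval_cp: int    # evaluation in centipawns
    nodes: int      # leaf nodes visited
    millis: int     # elapsed time in milliseconds


def compare(baseline, candidate):
    """Return (same_answer, node_ratio, time_ratio) for one position."""
    same_answer = (candidate.move == baseline.move
                   and candidate.eval_cp == baseline.eval_cp)
    node_ratio = candidate.nodes / baseline.nodes
    time_ratio = candidate.millis / baseline.millis
    return same_answer, node_ratio, time_ratio


# Placeholder numbers for one position before and after a search change.
before = Result(move="Qg6", eval_cp=900, nodes=200000, millis=800)
after = Result(move="Qg6", eval_cp=900, nodes=150000, millis=600)

same_answer, node_ratio, time_ratio = compare(before, after)
print(same_answer, node_ratio, time_ratio)
```

A `same_answer` of `False` flags a problem regardless of how good the ratios look. Ratios below 1.0 with an unchanged answer indicate a genuine saving.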
