Hi Dann,
Interesting! These relative positional scores remind me of the "Chess Magazine's" puzzles.
Back in 1989(!), I created a system to evaluate chess engines using this type of score. The interesting part of the approach was that it tried to evaluate how chess engines' strength changed at different time controls. It was published by Eric Hallsworth in his Selective Search magazine. You can read about it here:
http://www.chesscomputeruk.com/Evaluati ... rams_1.pdf
It would be interesting to use these positions and create an automated evaluation system using something like PyChess.
- Steve
Tony's positional test suite
Moderator: Ras
-
- Posts: 1252
- Joined: Wed Mar 08, 2006 8:28 pm
- Location: Florida, USA
Re: Tony's positional test suite
http://www.chessprogramming.net - Maverick Chess Engine
-
- Posts: 4845
- Joined: Sun Aug 10, 2008 3:15 pm
- Location: Philippines
Sample regression
Code: Select all
A. Processor
Brand : Intel(R) Celeron(R) CPU B800 @ 1.50GHz
Arch : X86_64
Count : 2
B. Engine settings
Threads : 1
Hash (mb) : 128
Time(s)/pos : 30.0
C. Test set
Filename : tony-dcc-caleb.epd
NumPos : 16
D. Results
Engine                   : Rating  Best  Score  SRate  Elap(s)
Stockfish 8 64           :   3334    10     86   0.82      451
Fire 5 x64               :   3132     8     82   0.78      451
Komodo 9.02 64-bit       :   3200     8     75   0.71      450
Bobcat v8.0              :   2816     8     70   0.67      428
Texel 1.06               :   2947     7     69   0.66      451
Hannibal 1.7 x64         :   2981     8     67   0.64      451
Cheng 4.39               :   2785     6     67   0.64      451
Deuterium v2017.1.35.431 :   2760     6     63   0.60      451
Arasan 20.2              :   2880     5     62   0.59      450
Rhetoric 1.4.3 x64       :   2631     6     61   0.58      429
Ethereal 8.19            :   2506     7     59   0.56      451
spark-1.0                :   2778     5     58   0.55      450
Gaviota v1.0             :   2716     4     55   0.52      450
Alaric 707               :   2479     3     54   0.51      453
Arminius 2014-01-18      :   2346     4     53   0.50      450
Cheese 1.9 64 bits       :   2558     4     52   0.50      450
Maverick 1.5 x64         :   2380     3     43   0.41      451
Estimated Rating = (2443 x ScoreRate) + 1306
ScoreRate = totalScore/maxScore
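
The regression line can be checked against the table itself. The following is a minimal Python sketch, assuming the fit was an ordinary least-squares line through the (SRate, Rating) pairs as printed; the function names `fit_line` and `estimated_rating` are illustrative and not taken from the tool that produced the table.

```python
# (SRate, Rating) pairs copied from the results table above.
RESULTS = [
    (0.82, 3334), (0.78, 3132), (0.71, 3200), (0.67, 2816), (0.66, 2947),
    (0.64, 2981), (0.64, 2785), (0.60, 2760), (0.59, 2880), (0.58, 2631),
    (0.56, 2506), (0.55, 2778), (0.52, 2716), (0.51, 2479), (0.50, 2346),
    (0.50, 2558), (0.41, 2380),
]


def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimated_rating(score_rate, slope=2443.0, intercept=1306.0):
    """Apply the posted formula: Estimated Rating = (2443 x ScoreRate) + 1306."""
    return slope * score_rate + intercept


if __name__ == "__main__":
    slope, intercept = fit_line(RESULTS)
    print(f"fitted: rating = {slope:.0f} * score_rate + {intercept:.0f}")
    print(f"posted formula at SRate 0.82: {estimated_rating(0.82):.0f}")
```

Because the SRate column is rounded to two decimals, the fitted coefficients should land close to, but not necessarily exactly on, the posted 2443 and 1306.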

-
- Posts: 12721
- Joined: Wed Mar 08, 2006 8:57 pm
- Location: Redmond, WA USA
Re: Sample regression
Thank you for running such a fun experiment.
I really think this is a new kind of result.
Typically, there is a very poor regression between engine strength and EPD test suites.
I remember back in the day, when Shredder topped the Elo charts, it scored 285/300 on WAC which was very average.
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
-
- Posts: 4845
- Joined: Sun Aug 10, 2008 3:15 pm
- Location: Philippines
Re: Sample regression
Dann Corbit wrote: Thank you for running such a fun experiment.
I really think this is a new kind of result.
Typically, there is a very poor regression between engine strength and EPD test suites.
I remember back in the day, when Shredder topped the Elo charts, it scored 285/300 on WAC which was very average.

EPD suites with multiple solutions behave very differently from single-solution suites when used to compare engine strength.
One way to improve WAC is to supply it with a second solution.

Identifying which positions should carry more weight than others takes more time. One idea is to give more weight to positions whose best move takes longer to find; this is also Steve's idea in the PDF, if I interpreted it correctly.
-
- Posts: 4845
- Joined: Sun Aug 10, 2008 3:15 pm
- Location: Philippines
Re: Tony's positional test suite
Rebel wrote: Here is some more human analysis, snippet:
Code: Select all
[Event ""]
[Site "C3E2=10 G2G4=06 F1D3=05 D2D6=02 D1E1=0"]
[Date "1994.05.05"]
[Round "1"]
[White "beat10 (01)"]
[Black "Ply : 7"]
[Result "*"]
[BlackElo ""]
[WhiteElo ""]
[FEN "r3r1k1/1p3nqp/2pp4/p4p2/Pn3P1Q/2N4P/1PPR2P1/3R1BK1 w - - 0 1"]
{ C3E2=10 G2G4=06 F1D3=05 D2D6=02 D1E1=02 H4H5=01 G1H2=01 F1E2=01 } *
About 700 of them, I all typed in myself from paper. No internet in those days.
http://www.top-5000.nl/misc.htm

This is impressive considering that you had done this by hand.
I downloaded the file rebel.pgn and tried to convert it to Tony's format. Here are the errors I encountered on move legality.

[d]r1bqrbk1/2n4p/3p1pp1/pppP3n/4P3/P1N2NPP/1P1B1PB1/R2QR1K1 w - - 0 1
game: 16
comment: B2B4=10 G3G4=08 A4A5=05 F3H4=02 F3H2=02 D1C2=02
uciMove: a4a5, score: 5
probably illegal move: a4a5
[d]2rqkb1r/3n1p1p/p3p1pn/1p1pP1N1/5P2/2N5/PPP3PP/R1BQ1R1K w kq - 0 1
game: 45
comment: F4F5=10 C3E2=06 G2G4=06 A2A4=06 D1D3=05 D2E3=03
uciMove: d2e3, score: 3
probably illegal move: d2e3
[d]2r1rbk1/1b1n1pp1/p6p/1p1nPB2/2q5/P4NNP/1B1Q1PP1/R3R1K1 b - - 0 1
game: 76
comment: D7C5=10 C8D8=07 E8E6=04 E8D8=04 C4C7=02 C8C7=02 F6H5=02
uciMove: f6h5, score: 2
probably illegal move: f6h5
[d]3q1rk1/pp1bpp1p/3p1npQ/8/3NP1P1/2r2P2/PPP5/2KR3R w - - 0 1
game: 533
comment: G2G4=12
uciMove: g2g4, score: 12
probably illegal move: g2g4
[d]8/8/p1BN1k2/3P4/1p1K1P1p/r7/8/8 b - - 0 1
game: 638
comment: A2A1=10 B4B3=08 H4H3=06 A3F3=05
uciMove: a2a1, score: 10
probably illegal move: a2a1
[d]r1b1kb1r/pp3pp1/4p2p/4q3/3N2PP/4n3/PPPQ1PB1/2KR3R w kq - 0 1
game: 673
comment: F2E3=10 G1E1=08 F2F4=07 D4C6=05 D1E1=03 D2E3=03
uciMove: g1e1, score: 8
probably illegal move: g1e1
[d]r4rk1/pp4bp/2pq2p1/3p1P1n/PP1P4/3B1P2/4NP1P/1R1Q1RK1 w - - 0 1
game: 676
comment: G1H1=10 D1D2=08 B4B5=07 D1C1=06 C1C2=04 F5G6=02
uciMove: c1c2, score: 4
probably illegal move: c1c2
[d]8/8/p1BN1k2/3P4/1p1K1P1p/r7/8/8 b - - 0 1
game: 705
comment: A2A1=10 B4B3=08 H4H3=06 A3F3=05
uciMove: a2a1, score: 10
probably illegal move: a2a1
duplicate
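
In every case listed here the origin square of the flagged move is empty in the given FEN, so a cheap pre-check can catch this class of typing error without a move generator. What follows is a minimal Python sketch, not the converter actually used; `piece_at`, `parse_scores` and `suspicious_moves` are hypothetical helper names, and the check is necessary but not sufficient (a move can pass it and still be illegal).

```python
def piece_at(fen, square):
    """Return the piece letter on `square` (e.g. "a4"), or None if it is empty."""
    rows = fen.split()[0].split("/")  # rank 8 comes first, rank 1 last
    row = rows[8 - int(square[1])]
    expanded = "".join("." * int(c) if c.isdigit() else c for c in row)
    piece = expanded[ord(square[0]) - ord("a")]
    return None if piece == "." else piece


def parse_scores(comment):
    """Turn "B2B4=10 G3G4=08" into [("b2b4", 10), ("g3g4", 8)]."""
    pairs = []
    for token in comment.split():
        move, _, score = token.partition("=")
        pairs.append((move.lower(), int(score)))
    return pairs


def suspicious_moves(fen, comment):
    """Moves whose origin square is empty or holds a piece of the wrong colour."""
    white_to_move = fen.split()[1] == "w"
    bad = []
    for move, _score in parse_scores(comment):
        piece = piece_at(fen, move[:2])
        if piece is None or piece.isupper() != white_to_move:
            bad.append(move)
    return bad


if __name__ == "__main__":
    fen = "r1bqrbk1/2n4p/3p1pp1/pppP3n/4P3/P1N2NPP/1P1B1PB1/R2QR1K1 w - - 0 1"
    comment = "B2B4=10 G3G4=08 A4A5=05 F3H4=02 F3H2=02 D1C2=02"
    print(suspicious_moves(fen, comment))
```

A full legality test still needs a real move generator; this only filters out moves that start from a square with no friendly piece on it.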
-
- Posts: 4845
- Joined: Sun Aug 10, 2008 3:15 pm
- Location: Philippines
Re: Tony's positional test suite
These are the dupes in rebel.pgn.
Code: Select all
dupes: 8/4bkpp/p4p2/r1pR4/2P2P1P/4BK2/6P1/8 b - - 0 1
dupes: 8/4k1pp/p4p2/7R/2r2P1P/5K2/6P1/8 b - - 0 1
dupes: 8/4k1p1/p4p1p/R7/2r2P1P/5K2/6P1/8 b - - 0 1
dupes: 8/4k1p1/p1r2p1p/R7/5PKP/8/6P1/8 b - - 0 1
dupes: 8/4k1p1/pr3p1p/R4P2/6KP/8/6P1/8 b - - 0 1
dupes: 8/5k2/pr3p1K/5R2/7P/8/6P1/8 b - - 0 1
dupes: 1r6/5k2/p4p2/5R1K/7P/8/6P1/8 b - - 0 1
dupes: 8/5k2/5p2/p7/4K2P/8/6P1/8 b - - 0 1
dupes: 8/p4pkp/1p4p1/3R4/4pK2/4P1P1/n4P1P/8 b - - 0 1
dupes: 8/p4pkp/1p4p1/1R6/3K4/4P1P1/5n1P/8 b - - 0 1
dupes: 8/p5kp/1p4p1/5p2/3K4/4P1P1/1R3n1P/8 b - - 0 1
dupes: 8/p5kp/1p3np1/5p2/3K4/4P1PP/2R5/8 b - - 0 1
dupes: 8/p6p/1p3kp1/5P2/3Kn3/4P2P/2R5/8 b - - 0 1
dupes: 8/p1R4p/1p4p1/5k2/3Kn3/4P2P/8/8 b - - 0 1
dupes: 8/R6p/1p4p1/5kn1/3K4/4P2P/8/8 b - - 0 1
dupes: 8/8/1p4p1/5knp/3K4/R3P2P/8/8 b - - 0 1
dupes: 8/8/1p4p1/5k1p/8/R2KPn1P/8/8 b - - 0 1
dupes: 8/8/1p4p1/5k1p/8/R3P2P/3K4/6n1 b - - 0 1
dupes: 8/8/1p6/5kpp/8/1R2P2P/3K4/6n1 b - - 0 1
dupes: 8/8/2R3k1/p4p2/r6P/6P1/2P3K1/8 b - - 0 1
dupes: 8/5k2/8/p1R2p2/4r2P/5KP1/2P5/8 b - - 0 1
dupes: 8/8/R4k2/5p2/2r4P/5KP1/2P5/8 b - - 0 1
dupes: 8/8/8/4kp2/2r4P/5KP1/1RP5/8 b - - 0 1
dupes: 1R6/8/2r2k2/5p2/7P/6PK/2P5/8 b - - 0 1
dupes: 8/8/1R3k2/5p2/7P/6PK/2r5/8 b - - 0 1
dupes: 8/6k1/1R6/5p1P/8/6PK/2r5/8 b - - 0 1
dupes: r2q1rk1/ppp2pbp/3n4/4p3/3nP3/2NBB2P/PPP2QP1/R4RK1 w - - 0 1
dupes: r2q1r1k/ppp2pbp/3n4/4p3/3nP3/2NBB1QP/PPP3P1/R4RK1 w - - 0 1
dupes: r2q1r1k/pp3pbp/2pn4/4p3/3nP1Q1/2NBB2P/PPP3P1/R4RK1 w - - 0 1
dupes: r3qr1k/pp3pbp/2pn4/4p2Q/3nP3/2NBB2P/PPP3P1/R4RK1 w - - 0 1
dupes: 1r4k1/pB3p1p/4b1p1/8/2P5/1PR5/r4PPP/5RK1 w - - 0 1
dupes: 1R6/4p2p/5k2/1p3P2/p2p1K2/P1n4P/5P2/8 w - - 0 1
dupes: 1r6/6pp/4pn2/2k5/1r1pP3/N4P2/1PKR2PP/7R w - - 0 1
dupes: 6k1/3b1ppp/p7/P1b5/3NpP2/1p2B1P1/1P4KP/8 b - - 0 1
dupes: 5k2/n7/5P2/2N2KP1/8/7p/8/8 b - - 0 1
dupes: 3R1b2/1p3pkp/p3p1pn/2P1P3/1P6/1b5P/6P1/R5K1 w - - 0 1
dupes: 6k1/r2np2p/4N1p1/2pPp3/6P1/1P6/P6P/R4K2 w - - 0 1
dupes: 8/3bk3/2r2p2/2P1p1pp/PK6/1PB3R1/7P/8 b - - 0 1
dupes: 8/2R3pp/5k2/5p2/4nP1P/2p5/6KP/8 w - - 0 1
dupes: 2R2nk1/pp3ppp/4p3/4P3/3p3P/1P3N2/q4PP1/3R2K1 b - - 0 1
dupes: 6k1/pb1r1pp1/1p3n1p/1P2p3/4P3/P3BP2/1n2N1PP/1BR3K1 b - - 0 1
dupes: r2bBk2/pp3p2/6p1/1P2p3/P3P3/2N1B1P1/2n2PK1/R7 w - - 0 1
dupes: 8/2p2p2/p5p1/2p1P3/4kPKp/7P/PP4P1/8 w - - 0 1
dupes: 8/8/p3k3/1pB3p1/5p1p/P1N5/1P1K2bP/8 w - - 0 1
dupes: 8/8/p1r1p1R1/2p2rp1/2K3k1/1PP1R1P1/P7/8 b - - 0 1
dupes: 8/8/4k3/5p2/p1N5/3K2P1/1P5P/3n4 b - - 0 1
dupes: 8/8/p1BN1k2/3P4/1p1K1P1p/r7/8/8 b - - 0 1
dupes: 3r4/3Pkpp1/8/p2R1Pp1/3p4/P5P1/5K1P/8 w - - 0 1
dupes: 8/6pp/3kpp2/p2p4/P7/1K6/4B1PP/8 b - - 0 1
dupes: 8/8/4p3/2Kpk3/5p1R/5P2/4n2P/8 w - - 0 1
dupes: 4B1k1/p1r5/4bP2/8/4R3/1np1B3/5K1P/8 b - - 0 1
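
A duplicate list like this one can be produced by grouping positions on their FEN. The following is a minimal Python sketch under the assumption that two records count as duplicates when the first four FEN fields (piece placement, side to move, castling rights, en passant square) match; `position_key` and `find_duplicates` are illustrative names, not the script used for the list above.

```python
from collections import Counter


def position_key(fen):
    """Piece placement, side to move, castling and en passant fields only."""
    return " ".join(fen.split()[:4])


def find_duplicates(fens):
    """Return each position key that occurs more than once, in first-seen order."""
    counts = Counter(position_key(f) for f in fens)
    seen = []
    for f in fens:
        key = position_key(f)
        if counts[key] > 1 and key not in seen:
            seen.append(key)
    return seen


if __name__ == "__main__":
    fens = [
        "8/8/p1BN1k2/3P4/1p1K1P1p/r7/8/8 b - - 0 1",
        "8/6pp/3kpp2/p2p4/P7/1K6/4B1PP/8 b - - 0 1",
        "8/8/p1BN1k2/3P4/1p1K1P1p/r7/8/8 b - - 0 1",
    ]
    for key in find_duplicates(fens):
        print("dupes:", key)
```

Ignoring the move counters means the same position entered with different halfmove or fullmove numbers is still reported.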
-
- Posts: 2821
- Joined: Fri Sep 25, 2015 9:38 pm
- Location: Sortland, Norway
Re: Tony's positional test suite
Code: Select all
3R1b2/1p3pkp/p3p1pn/2P1P3/1P6/1b5P/6P1/R5K1 w - - 0 1
[pgn][Event "?"]
[Site "?"]
[Date "????.??.??"]
[Round "?"]
[White "New game"]
[Black "?"]
[Result "*"]
[SetUp "1"]
[FEN "3R1b2/1p3pkp/p3p1pn/2P1P3/1P6/1b5P/6P1/R5K1 w - - 0 1"]
[PlyCount "11"]
1. Rxa6 bxa6 2. c6 Nf5 3. c7 Ne7 4. Re8 Bd5 5. Rxe7 Bxe7 6. c8=Q *[/pgn]
-
- Posts: 7280
- Joined: Thu Aug 18, 2011 12:04 pm
- Full name: Ed Schröder
Re: Tony's positional test suite
Ferdy wrote: I download the file rebel.pgn and tried to convert it to tony format. Here are the errors I encountered on move legality.
Work to do. How nice, mistakes of about 30 years ago are backfiring.

Ferdy wrote: This is impressive considering that you had done this by hand for more than 700 positions with multi good move test suite.

No PC in those days, only the Apple 2E with 32 KB RAM, running at 1 MHz and doing 100-150 NPS, plus 2 floppy drives, each with a capacity of (if I remember right) 360 KB.
No PGN or EPD, let alone match facilities. What to do? So positions chosen by good players, with multiple good moves, were a gift. The positions came from chess magazines; Steve already mentioned the Beat the Masters series.
Regarding the errors, I must have either typed the wrong position setup or made a mistake typing the moves. Boring work.
-
- Posts: 4845
- Joined: Sun Aug 10, 2008 3:15 pm
- Location: Philippines
Re: Tony's positional test suite
This is now fully converted. Duplicates are also removed. Illegal moves are discarded and not replaced; if a position has only one move and that move is illegal, the EPD record is removed.
Download rebel.epd
https://drive.google.com/file/d/0BwAOsu ... sp=sharing
Code: Select all
r3r1k1/1p3nqp/2pp4/p4p2/Pn3P1Q/2N4P/1PPR2P1/3R1BK1 w - - bm Ne2; c0 "positional scores are: Ne2=10, g4=6, Bd3=5, Rxd6=2, Re1=2, Qh5=1, Kh2=1, Be2=1"; id "rebel.pos.01";
4rrk1/pp1b2pp/5n2/3p1N2/8/2QB1qP1/PP3P1P/4RRK1 w - - bm Rxe8; c0 "positional scores are: Rxe8=10, Ne7+=7, Re3=6, Nd4=4"; id "rebel.pos.02";
r6r/p6p/1pnpkn2/q1p2p1p/2P5/2P1P3/P4PP1/1RBQKB1R w K - bm Rb3; c0 "positional scores are: Rb3=10, Qc2=7, Rxh5=7, Be2=7, Bd3=2, g4=2, e4=2, Rb5=1"; id "rebel.pos.03";
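
Records in this format can be scored by reading the move/points pairs out of the `c0` comment. Below is a minimal Python sketch; `parse_epd_scores` and `score_rate` are hypothetical names, and treating maxScore as the sum of each position's highest listed score is an assumption, since the thread does not state how the test tool defines it.

```python
import re

# Matches the c0 comment used in the records above.
C0_PATTERN = re.compile(r'c0 "positional scores are: ([^"]*)"')


def parse_epd_scores(epd_line):
    """Return {SAN move: points} from the c0 comment of one EPD record."""
    match = C0_PATTERN.search(epd_line)
    if match is None:
        return {}
    scores = {}
    for item in match.group(1).split(","):
        # rsplit keeps promotion moves such as "c8=Q=10" intact.
        move, points = item.strip().rsplit("=", 1)
        scores[move] = int(points)
    return scores


def score_rate(epd_lines, engine_moves):
    """totalScore / maxScore; a move not listed for a position scores 0.

    ASSUMPTION: maxScore is the sum of the highest listed score per position.
    """
    total = 0
    maximum = 0
    for line, move in zip(epd_lines, engine_moves):
        scores = parse_epd_scores(line)
        if not scores:
            continue
        total += scores.get(move, 0)
        maximum += max(scores.values())
    return total / maximum if maximum else 0.0
```

The engine's move has to be given in the same SAN spelling as the record (for example `Ne7+` with the check sign) for the lookup to match.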
Sample run at 1s/pos
Code: Select all
A. Processor
Brand : Intel(R) Celeron(R) CPU B800 @ 1.50GHz
Arch : X86_64
Count : 2
B. Engine settings
Threads : 1
Hash (mb) : 128
Time(s)/pos : 1.0
C. Test set
Filename : rebel.epd
NumPos : 657
D. Results
Engine                   : Rating  Best  Score  SRate  Elap(s)
Stockfish 8 64           :   3334   345   3193   0.64      674
Deuterium v2017.1.35.431 :   2760   278   2650   0.53      673
-
- Posts: 1056
- Joined: Fri Mar 10, 2006 6:07 am
- Location: Basque Country (Spain)
Re: Sample regression
Ferdy wrote: Linear regression.
Estimated Rating = (2443 x ScoreRate) + 1306
ScoreRate = totalScore/maxScore

For maxScore it seems that you have used 104; however, adding up the scores in the EPD file, I think I get 114.