Re: Stockfish Handicap Matches

Posted: Sun Jun 21, 2020 5:20 pm
by lkaufman
Rebel wrote: Sun Jun 21, 2020 4:20 pm
chrisw wrote: Sun Jun 21, 2020 1:40 pm
Rebel wrote: Sun Jun 21, 2020 12:31 pm Added a handicap factor of 100, fun with the known names of the past and/or former rating list leaders and world champions.

Progress has been amazing...

http://rebel13.nl/rebel13/stockfish-han ... tches.html
I guess the thought of giving knight odds against oldie version did already occur to you?
Would be fun also.
If you do that it's important to set Stockfish Contempt to maximum value of 100. Also there is the question of how to get variety; do you generate an opening book, and if so how? Also, should it be just the b1 knight, or alternate b1/g1 knights? Another way to get variety is to use four threads for Stockfish 11, which is also reasonable since old engines might not even have MP support. I think that it is likely that Komodo, KomodoMCTS, Lc0 70xxx, and even Stockfish NNUE will all be stronger than Stockfish 11 at giving knight odds, although that's not certain.
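
For anyone scripting such a match, here is a minimal sketch of the engine setup described above (Contempt at 100, four threads). The helper names `uci_setoption` and `handicap_setup` are made up for illustration; the `setoption name ... value ...` syntax is the standard UCI form, and `Contempt` and `Threads` are the option names Stockfish 11 exposes. A real GUI or script would also wait for `uciok` and `readyok` between steps, which is not shown.

```python
from typing import List


def uci_setoption(name: str, value) -> str:
    """Format a single UCI 'setoption' command line."""
    return f"setoption name {name} value {value}"


def handicap_setup(contempt: int = 100, threads: int = 4) -> List[str]:
    """Commands to send to the odds-giving engine before a game (hypothetical helper)."""
    return [
        "uci",
        uci_setoption("Contempt", contempt),  # maximum contempt, as suggested above
        uci_setoption("Threads", threads),    # extra threads also add some variety
        "isready",
    ]


if __name__ == "__main__":
    for line in handicap_setup():
        print(line)
```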

Re: Stockfish Handicap Matches

Posted: Sun Jun 21, 2020 9:21 pm
by Rebel
chrisw wrote: Sun Jun 21, 2020 1:40 pm I guess the thought of giving knight odds against oldie version did already occur to you?
Made a first setup. 10 positions.

Code:

rnbqkbnr/pppppppp/8/8/3PP3/2N2N2/PPP2PPP/R1BQKB1R w KQkq - id "4 tempi";
rnbqkbnr/pppppppp/8/8/2BPP3/2N2N2/PPP2PPP/R1BQK2R w KQkq - id "5 tempi";
rnbqkbnr/pppppppp/8/8/2BPPB2/2N2N2/PPP2PPP/R2QK2R w KQkq - id "6 tempi";
rnbqkbnr/pppp1ppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - id "minus pawn e7";
rnbqkbnr/ppp1pppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - id "minus pawn d7";
rnbqkbnr/ppp2ppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - id "minus pawn e7 and d7";
rnbqkb1r/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - id "minus Ng8";
r1bqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - id "minus Nb8";
rnbqk1nr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - id "minus Bf8";
rn1qkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - id "minus Bc8";
Oldies should at least get 4 points from the knight and bishop odds.
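
The handicap in each position can be cross-checked by counting pieces against the normal starting array. This is only a sketch; `piece_counts` and `missing_from_start` are hypothetical helper names, and the function only looks at the first (piece placement) field of each line.

```python
from collections import Counter

# Piece letters of the normal starting position (lowercase = Black, uppercase = White).
START = Counter("rnbqkbnrpppppppp" + "RNBQKBNRPPPPPPPP")


def piece_counts(fen: str) -> Counter:
    """Count piece letters in the placement field of a FEN/EPD line."""
    board = fen.split()[0]
    return Counter(ch for ch in board if ch.isalpha())


def missing_from_start(fen: str) -> dict:
    """Return {piece letter: how many are missing} compared with the start position."""
    have = piece_counts(fen)
    return {p: START[p] - have[p] for p in START if START[p] > have[p]}


if __name__ == "__main__":
    print(missing_from_start("rnbqkb1r/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq -"))
    print(missing_from_start("rnbqkbnr/ppp2ppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq -"))
```

The tempo positions come back empty because no material is missing there; the handicap is time only.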

Results at 40m/15s

Code:

No. Engine             1     2     3     4     5     6     7     8  Score  Games   Perc   Moves
-----------------------------------------------------------------------------------------------
 1 Stockfish 11    xxxxx   1.5   2.0   2.5   3.5   7.5   8.0   8.5   33.5 /   70 (47.86%)  64.1  
 2 Komodo 14         8.5 xxxxx   0.0   0.0   0.0   0.0   0.0   0.0    8.5 /   10 (85.00%)  58.1  
 3 Ethereal 12       8.0   0.0 xxxxx   0.0   0.0   0.0   0.0   0.0    8.0 /   10 (80.00%)  56.5  
 4 rofChade_2.3      7.5   0.0   0.0 xxxxx   0.0   0.0   0.0   0.0    7.5 /   10 (75.00%)  67.6  
 5 Laser_1.7         6.5   0.0   0.0   0.0 xxxxx   0.0   0.0   0.0    6.5 /   10 (65.00%)  84.4  
 6 Rybka_1.0         2.5   0.0   0.0   0.0   0.0 xxxxx   0.0   0.0    2.5 /   10 (25.00%)  69.6  
 7 Benjamin          2.0   0.0   0.0   0.0   0.0   0.0 xxxxx   0.0    2.0 /   10 (20.00%)  58.0  
 8 Fruit_2.1         1.5   0.0   0.0   0.0   0.0   0.0   0.0 xxxxx    1.5 /   10 (15.00%)  54.3  
1. I can use 10 more interesting positions in which White has a clear advantage.

2. I will increase the time control to CCRL 40m/15m, or so.
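
For reading the cross-table, a small sketch of the arithmetic behind the Perc column, plus a conversion from score to a rating gap. The conversion uses the usual logistic Elo formula, which is my assumption and not something used in the table; with only 10 games per opponent, and with a handicap in play, the result is a rough indication at best. Function names are hypothetical.

```python
import math


def score_percent(points: float, games: int) -> float:
    """Score as a percentage, as in the Perc column."""
    return 100.0 * points / games


def elo_diff(score_fraction: float) -> float:
    """Rating gap implied by a score fraction under the logistic Elo model (assumption)."""
    if not 0.0 < score_fraction < 1.0:
        raise ValueError("score fraction must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score_fraction - 1.0)


if __name__ == "__main__":
    print(round(score_percent(33.5, 70), 2))  # Stockfish 11 total
    print(round(elo_diff(0.85)))              # Komodo 14 vs handicapped Stockfish 11
```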

Re: Stockfish Handicap Matches

Posted: Sun Jun 21, 2020 9:43 pm
by chrisw
Rebel wrote: Sun Jun 21, 2020 9:21 pm Made a first setup. 10 positions. [...]
I’m tempted to generate a few tens of thousands of test positions. How many would be enough?

Re: Stockfish Handicap Matches

Posted: Sun Jun 21, 2020 10:14 pm
by chrisw
Rebel wrote: Sun Jun 21, 2020 9:21 pm Made a first setup. 10 positions. [...]
With one white knight removed, generated brute force 4-deep, 179096 epds, of which 85159 are unique.

I'll run them by SF11 at 10ms, and find the average eval and the standard deviation, from that should be able to pluck out several thousand positions where neither side did anything stupid (like threw away material in 4-ply) and that should work as a large knight odds test suite ...
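
The gap between generated and unique positions comes from transpositions: different move orders reach the same position. A minimal sketch of that de-duplication step, assuming one FEN/EPD per line and treating two lines as the same position when their first four fields (placement, side to move, castling, en passant) match. `position_key` and `unique_positions` are hypothetical names, not the code actually used.

```python
from typing import Iterable, List


def position_key(epd_line: str) -> str:
    """First four fields identify the position; move counters and opcodes are ignored."""
    return " ".join(epd_line.split()[:4])


def unique_positions(lines: Iterable[str]) -> List[str]:
    """Keep the first occurrence of each position, preserving input order."""
    seen = set()
    kept = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        key = position_key(line)
        if key not in seen:
            seen.add(key)
            kept.append(line)
    return kept


if __name__ == "__main__":
    generated = [
        # 1.c4 c5 2.e3 a6 and 1.e3 a6 2.c4 c5 reach the same position
        "rnbqkbnr/1p1ppppp/p7/2p5/2P5/4P3/PP1P1PPP/RNBQKB1R w KQkq - 0 3",
        "rnbqkbnr/1p1ppppp/p7/2p5/2P5/4P3/PP1P1PPP/RNBQKB1R w KQkq - 0 3",
        "rnbqkbnr/pppp1p1p/4p3/6p1/P7/3P4/1PP1PPPP/RNBQKB1R w KQkq - 0 3",
    ]
    print(len(generated), "generated,", len(unique_positions(generated)), "unique")
```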

Re: Stockfish Handicap Matches

Posted: Sun Jun 21, 2020 10:29 pm
by Rebel
chrisw wrote: Sun Jun 21, 2020 9:43 pm I’m tempted to generate a few tens of thousands of test positions. How many would be enough?
I prefer 10 special positions, positions like:

8/4kpbn/p1p3p1/Pp2p2p/1P2Pn2/N1P1BP2/5P1P/5BK1 w - - bm Nxb5; id "Karpov - Hansen";
1.Nxb5 instantly wins. Since not every engine finds it quickly enough, they should at least play 1.c4, which gives good winning chances.

A job for Larry?

:D

Re: Stockfish Handicap Matches

Posted: Sun Jun 21, 2020 10:32 pm
by Rebel
chrisw wrote: Sun Jun 21, 2020 10:14 pm
With one white knight removed, generated brute force 4-deep, 179096 epds, of which 85159 are unique.

I'll run them by SF11 at 10ms, and find the average eval and the standard deviation, from that should be able to pluck out several thousand positions where neither side did anything stupid (like threw away material in 4-ply) and that should work as a large knight odds test suite ...
Takes too much cpu time. I have decided to test at 40/240, 6 seconds average.

Re: Stockfish Handicap Matches

Posted: Sun Jun 21, 2020 10:45 pm
by chrisw
Rebel wrote: Sun Jun 21, 2020 10:32 pm
Takes too much cpu time. I have decided to test at 40/240, 6 seconds average.
OK, just throw away what you don't need.

Average eval after four plies from startpos minus knight is -300 centipawns (which is to be expected), and 66% of those positions are within +/-100 centipawns of that average. I'll try culling at +/-10 centipawns and publish the list. Later ...
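
A sketch of the culling step, assuming the engine scores have already been collected as (EPD, centipawn) pairs; getting the scores out of SF11 is not shown. The numbers in the demo are invented, and `summarize` and `within_band` are hypothetical names.

```python
from statistics import mean, pstdev
from typing import List, Sequence, Tuple


def summarize(evals_cp: Sequence[float]) -> Tuple[float, float]:
    """Average and (population) standard deviation of the evaluations."""
    return mean(evals_cp), pstdev(evals_cp)


def within_band(scored: Sequence[Tuple[str, float]], center: float, tol: float) -> List[str]:
    """Keep positions whose eval lies within +/-tol centipawns of center."""
    return [epd for epd, cp in scored if abs(cp - center) <= tol]


if __name__ == "__main__":
    # Made-up scores, only to show the mechanics.
    scored = [("pos_a", -300), ("pos_b", -305), ("pos_c", -420), ("pos_d", -180)]
    avg, sd = summarize([cp for _, cp in scored])
    print("average", avg, "stdev", round(sd, 1))
    print("kept at +/-10:", within_band(scored, avg, 10))
    print("kept at +/-100:", len(within_band(scored, avg, 100)), "of", len(scored))
```

Culling around the average throws out positions where one side blundered material in the first four plies, at the cost of keeping only the most typical ones.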

Re: Stockfish Handicap Matches

Posted: Sun Jun 21, 2020 11:08 pm
by lkaufman
Rebel wrote: Sun Jun 21, 2020 10:29 pm
chrisw wrote: Sun Jun 21, 2020 9:43 pm I’m tempted to generate a few tens of thousands of test positions. How many would be enough?
I prefer 10 special positions, positions like:

8/4kpbn/p1p3p1/Pp2p2p/1P2Pn2/N1P1BP2/5P1P/5BK1 w - - bm Nxb5; id "Karpov - Hansen";
1.Nxb5 instantly wins. Since not every engine finds it quickly enough, they should at least play 1.c4, which gives good winning chances.

A job for Larry?

:D
I don't see much point in creating artificial problems for this purpose; your current method is much better. But the more interesting way to do it in my opinion is to take chrisw's knight odds opening book, and then run gauntlets of top engines like SF and Komodo vs. several relatively weak engines, whatever engines are about the right strength to score 30-70% at knight odds. Knight odds (either White knight removed) is a clearly defined handicap of a nearly constant magnitude, which makes it ideal for the purpose of seeing how much improvement there has been and which top engine is better at giving the handicap. By the way, knight odds is always used rather than bishop odds in chess because with one bishop removed the game changes much more: you try to put pawns on particular colors, and it doesn't feel like chess anymore. Knights are interchangeable, bishops are not. Also, knight odds means the odds giver has White; otherwise it is called "knight and move" odds.

Re: Stockfish Handicap Matches

Posted: Sun Jun 21, 2020 11:20 pm
by chrisw
lkaufman wrote: Sun Jun 21, 2020 11:08 pm [...] But the more interesting way to do it in my opinion is to take chrisw's knight odds opening book, and then run gauntlets of top engines like SF and Komodo vs. several relatively weak engines, whatever engines are about the right strength to score 30-70% at knight odds. [...]
Done: 5600 EPDs off the start position minus b1 knight. Played out all four-ply combinations, culled all duplicates, then culled all positions where SF11 evaluated more than +/-10 centipawns away from -300 centipawns (SF11 average score for all EPDs), which leaves 5600 EPDs.

Link: https://github.com/ChrisWhittington/Che ... t-odds.epd

Will upload for no knight at g1 tomorrow am.

Small randomised sample below

Code:

rnbqkbnr/pppp2pp/5p2/4p3/4P2P/8/PPPP1PP1/RNBQKB1R w KQkq - 0 3
rnbqk1nr/pppp1ppp/3bp3/8/3P4/8/PPP1PPPP/RNBQKBR1 w Qkq - 2 3
rnbqkb1r/pppppp1p/6pn/8/3P2P1/8/PPP1PP1P/RNBQKB1R w KQkq - 0 3
rnbqkbnr/p1p1pppp/1p1p4/8/8/P1N5/1PPPPPPP/R1BQKB1R w KQkq - 0 3
rnbqkb1r/ppppnppp/8/4p3/8/2P3P1/PP1PPP1P/RNBQKB1R w KQkq - 1 3
rnbqkbnr/p1ppppp1/8/1p5p/8/1QP5/PP1PPPPP/RNB1KB1R w KQkq - 0 3
rnbqkbnr/pp1ppp1p/2p5/6p1/8/N4P2/PPPPP1PP/R1BQKB1R w KQkq - 0 3
rnbqkbnr/p1ppppp1/7p/1p6/1P6/P7/2PPPPPP/RNBQKB1R w KQkq - 0 3
rnbqkbnr/1p1ppppp/p7/2p5/2P5/4P3/PP1P1PPP/RNBQKB1R w KQkq - 0 3
rnbqkbnr/pppp1p1p/4p3/6p1/P7/3P4/1PP1PPPP/RNBQKB1R w KQkq - 0 3
rnbqkb1r/1ppppppp/p6n/8/1P6/2N5/P1PPPPPP/R1BQKB1R w KQkq - 1 3
r1bqkbnr/ppppp1pp/2n5/5p2/4P3/3P4/PPP2PPP/RNBQKB1R w KQkq - 0 3
rnbqkbnr/pppppp2/6p1/7p/3P4/2N5/PPP1PPPP/R1BQKB1R w KQkq - 0 3
r1bqkbnr/pppppp1p/n5p1/8/8/P6P/1PPPPPP1/RNBQKB1R w KQkq - 1 3
r1bqkbnr/ppppp1pp/2n2p2/8/8/P1N5/1PPPPPPP/R1BQKB1R w KQkq - 1 3
rnbqkb1r/pp1ppppp/7n/2p5/3P4/8/PPPBPPPP/RN1QKB1R w KQkq - 0 3
r1bqkbnr/pppppppp/8/n3P3/8/8/PPPP1PPP/RNBQKB1R w KQkq - 1 3
rnbqkb1r/pppppppp/8/6P1/4n3/8/PPPPPP1P/RNBQKB1R w KQkq - 1 3
rnbqkbr1/pppppppp/5n2/8/1P2P3/8/P1PP1PPP/RNBQKB1R w KQq - 1 3
rnbq1bnr/pppkpppp/8/3p4/8/NP6/P1PPPPPP/R1BQKB1R w KQ - 1 3
r1bqkbnr/ppppp1pp/n4p2/8/8/1P1P4/P1P1PPPP/RNBQKB1R w KQkq - 0 3
rnbqkbnr/1pppppp1/p6p/8/3P1B2/8/PPP1PPPP/RN1QKB1R w KQkq - 0 3
rnbqkb1r/pppppp1p/7n/6p1/5P2/8/PPPPP1PP/RNBQKBR1 w Qkq - 1 3
r1bqkbnr/pp1ppppp/n1p5/8/3P4/P7/1PP1PPPP/RNBQKB1R w KQkq - 1 3
r1bqkbnr/ppppp1pp/2n2p2/8/4P3/6P1/PPPP1P1P/RNBQKB1R w KQkq - 1 3
r1bqkbnr/ppppppp1/n6p/2P5/8/8/PP1PPPPP/RNBQKB1R w KQkq - 1 3
rnbqkbn1/pppppppr/7p/8/7P/1P6/P1PPPPP1/RNBQKB1R w KQq - 1 3
rnbqkbnr/ppp1p1pp/3p4/5p2/6P1/2P5/PP1PPP1P/RNBQKB1R w KQkq - 0 3
rnbqkb1r/pppp1ppp/4p2n/8/8/4P3/PPPPBPPP/RNBQK2R w KQkq - 2 3
rnbqkbnr/pppppp1p/8/8/6p1/P5P1/1PPPPP1P/RNBQKB1R w KQkq - 0 3
1nbqkbnr/1ppppppp/r7/p7/8/NP6/P1PPPPPP/R1BQKB1R w KQk - 2 3
r1bqkb1r/pppppppp/2n4n/8/8/2NP4/PPP1PPPP/R1BQKB1R w KQkq - 3 3
rnbqkbr1/pppppppp/5n2/8/8/P6P/1PPPPPP1/RNBQKB1R w KQq - 1 3
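
A quick way to confirm that lines like these really sit four plies into the game is to read the side-to-move and fullmove fields. This is a sketch for full six-field FEN lines such as the sample above (it does not apply to four-field EPD lines with opcodes); `plies_played` is a hypothetical name.

```python
def plies_played(fen: str) -> int:
    """Half-moves played so far, derived from side to move and fullmove number."""
    fields = fen.split()
    if len(fields) < 6:
        raise ValueError("expected a six-field FEN")
    side_to_move = fields[1]
    fullmove = int(fields[5])
    return 2 * (fullmove - 1) + (1 if side_to_move == "b" else 0)


if __name__ == "__main__":
    sample = [
        "rnbqkbnr/pppp2pp/5p2/4p3/4P2P/8/PPPP1PP1/RNBQKB1R w KQkq - 0 3",
        "rnbqk1nr/pppp1ppp/3bp3/8/3P4/8/PPP1PPPP/RNBQKBR1 w Qkq - 2 3",
    ]
    for fen in sample:
        print(plies_played(fen), fen)
```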

Re: Stockfish Handicap Matches

Posted: Mon Jun 22, 2020 12:00 am
by lkaufman
chrisw wrote: Sun Jun 21, 2020 11:20 pm
Done: 5600 EPDs off the start position minus b1 knight. Played out all four-ply combinations, culled all duplicates, then culled all positions where SF11 evaluated more than +/-10 centipawns away from -300 centipawns (SF11 average score for all EPDs), which leaves 5600 EPDs.
[...]
Thanks, but something seems very wrong here, because you say it's based on a Stockfish average score of -300 centipawns for the positions. But Stockfish 11's evaluation of the knight odds position is way worse than -300, so it doesn't sound possible to me that the positions could average only down 300 centipawns. Maybe if I check some of the positions I'll get a clue as to what the problem might be.