Stockfish Handicap Matches

lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Stockfish Handicap Matches

Post by lkaufman »

Rebel wrote: Sun Jun 21, 2020 4:20 pm
chrisw wrote: Sun Jun 21, 2020 1:40 pm
Rebel wrote: Sun Jun 21, 2020 12:31 pm Added a handicap factor of 100, fun with the known names of the past and/or former rating list leaders and world champions.

Progress has been amazing...

http://rebel13.nl/rebel13/stockfish-han ... tches.html
I guess the thought of giving knight odds against oldie version did already occur to you?
Would be fun also.
If you do that, it's important to set Stockfish Contempt to its maximum value of 100. There is also the question of how to get variety: do you generate an opening book, and if so, how? And should it be just the b1 knight, or alternate between the b1 and g1 knights? Another way to get variety is to use four threads for Stockfish 11, which is also reasonable since old engines might not even have MP support. I think it is likely that Komodo, KomodoMCTS, Lc0 70xxx, and even Stockfish NNUE will all be stronger than Stockfish 11 at giving knight odds, although that's not certain.
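A minimal sketch of the engine setup described above, for anyone scripting the match. It assumes the engine exposes UCI options named Contempt and Threads (Stockfish 11 does); the helper name uci_setup is made up for illustration, and the lines would still have to be written to the engine's stdin by whatever match runner is used.

```python
def uci_setup(contempt=100, threads=4):
    """Return the UCI lines to send before a game (option names are assumed)."""
    if not -100 <= contempt <= 100:
        raise ValueError("contempt outside the assumed range -100..100")
    return [
        "uci",
        f"setoption name Contempt value {contempt}",
        f"setoption name Threads value {threads}",
        "isready",
    ]

# Print the commands for the suggested settings: Contempt 100, four threads.
for line in uci_setup():
    print(line)
```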
Komodo rules!
Rebel
Posts: 6991
Joined: Thu Aug 18, 2011 12:04 pm

Re: Stockfish Handicap Matches

Post by Rebel »

chrisw wrote: Sun Jun 21, 2020 1:40 pm I guess the thought of giving knight odds against oldie version did already occur to you?
Made a first setup. 10 positions.

Code: Select all

rnbqkbnr/pppppppp/8/8/3PP3/2N2N2/PPP2PPP/R1BQKB1R w KQkq - id "4 tempi";
rnbqkbnr/pppppppp/8/8/2BPP3/2N2N2/PPP2PPP/R1BQK2R w KQkq - id "5 tempi";
rnbqkbnr/pppppppp/8/8/2BPPB2/2N2N2/PPP2PPP/R2QK2R w KQkq - id "6 tempi";
rnbqkbnr/pppp1ppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - id "minus pawn e7";
rnbqkbnr/ppp1pppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - id "minus pawn d7";
rnbqkbnr/ppp2ppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - id "minus pawn e7 and d7";
rnbqkb1r/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - id "minus Ng8";
r1bqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - id "minus Nb8";
rnbqk1nr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - id "minus Bf8";
rn1qkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - id "minus Bc8";
Oldies should at least get 4 points from the knight and bishop odds.
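A quick way to sanity-check that each position carries the handicap its label claims is to count material straight from the piece-placement field. This is only a sketch: the piece values are the conventional 1/3/3/5/9 (an assumption, not anything the engines use), and material_balance is a hypothetical helper name.

```python
# Conventional piece values in pawns; an assumption for illustration only.
PIECE_VALUES = {"p": 1, "n": 3, "b": 3, "r": 5, "q": 9}

def material_balance(epd):
    """White material minus Black material, read from the first EPD field."""
    placement = epd.split()[0]
    balance = 0
    for ch in placement:
        value = PIECE_VALUES.get(ch.lower())
        if value is None:
            continue  # digits, slashes and kings carry no material value here
        balance += value if ch.isupper() else -value
    return balance

# A tempo-odds position is materially level; a piece-odds one is not.
print(material_balance("rnbqkbnr/pppppppp/8/8/3PP3/2N2N2/PPP2PPP/R1BQKB1R w KQkq -"))
print(material_balance("rnbqkb1r/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq -"))
```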

Results at 40m/15s

Code: Select all

No. Engine             1     2     3     4     5     6     7     8  Score  Games   Perc   Moves
-----------------------------------------------------------------------------------------------
 1 Stockfish 11    xxxxx   1.5   2.0   2.5   3.5   7.5   8.0   8.5   33.5 /   70 (47.86%)  64.1  
 2 Komodo 14         8.5 xxxxx   0.0   0.0   0.0   0.0   0.0   0.0    8.5 /   10 (85.00%)  58.1  
 3 Ethereal 12       8.0   0.0 xxxxx   0.0   0.0   0.0   0.0   0.0    8.0 /   10 (80.00%)  56.5  
 4 rofChade_2.3      7.5   0.0   0.0 xxxxx   0.0   0.0   0.0   0.0    7.5 /   10 (75.00%)  67.6  
 5 Laser_1.7         6.5   0.0   0.0   0.0 xxxxx   0.0   0.0   0.0    6.5 /   10 (65.00%)  84.4  
 6 Rybka_1.0         2.5   0.0   0.0   0.0   0.0 xxxxx   0.0   0.0    2.5 /   10 (25.00%)  69.6  
 7 Benjamin          2.0   0.0   0.0   0.0   0.0   0.0 xxxxx   0.0    2.0 /   10 (20.00%)  58.0  
 8 Fruit_2.1         1.5   0.0   0.0   0.0   0.0   0.0   0.0 xxxxx    1.5 /   10 (15.00%)  54.3  
1. I can use 10 more interesting positions in which White has a clear advantage.

2. Will increase the time control to CCRL 40m/15m, or so.
90% of coding is debugging, the other 10% is writing bugs.
chrisw
Posts: 4313
Joined: Tue Apr 03, 2012 4:28 pm

Re: Stockfish Handicap Matches

Post by chrisw »

Rebel wrote: Sun Jun 21, 2020 9:21 pm
chrisw wrote: Sun Jun 21, 2020 1:40 pm I guess the thought of giving knight odds against oldie version did already occur to you?
Made a first setup. 10 positions.

I’m tempted to generate a few tens of thousands of test positions. How many would be enough?
chrisw
Posts: 4313
Joined: Tue Apr 03, 2012 4:28 pm

Re: Stockfish Handicap Matches

Post by chrisw »

Rebel wrote: Sun Jun 21, 2020 9:21 pm
chrisw wrote: Sun Jun 21, 2020 1:40 pm I guess the thought of giving knight odds against oldie version did already occur to you?
Made a first setup. 10 positions.

With one white knight removed, generated brute force 4-deep, 179096 epds, of which 85159 are unique.

I'll run them through SF11 at 10ms each and find the average eval and the standard deviation. From that I should be able to pluck out several thousand positions where neither side did anything stupid (like throwing away material within 4 ply), and that should work as a large knight odds test suite ...
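The two bookkeeping steps described here, dropping transpositions and summarising the evals, can be sketched as follows. This is not chrisw's actual code: it assumes the positions are plain EPD/FEN lines and that the engine evals have already been collected as centipawn integers; unique_epds and eval_stats are made-up names.

```python
import statistics

def unique_epds(lines):
    """Keep the first line for each distinct position.

    Two lines count as the same position when their first four fields
    (placement, side to move, castling rights, en passant square) match,
    so move counters or trailing opcodes do not create false duplicates.
    """
    seen = set()
    kept = []
    for line in lines:
        key = tuple(line.split()[:4])
        if key not in seen:
            seen.add(key)
            kept.append(line)
    return kept

def eval_stats(evals_cp):
    """Mean and population standard deviation of centipawn evals."""
    return statistics.fmean(evals_cp), statistics.pstdev(evals_cp)
```

The population standard deviation is used because the whole generated set is being described, not a sample of it.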
Rebel
Posts: 6991
Joined: Thu Aug 18, 2011 12:04 pm

Re: Stockfish Handicap Matches

Post by Rebel »

chrisw wrote: Sun Jun 21, 2020 9:43 pm I’m tempted to generate a few tens of thousands of test positions. How many would be enough?
I prefer 10 special positions, positions like:

8/4kpbn/p1p3p1/Pp2p2p/1P2Pn2/N1P1BP2/5P1P/5BK1 w - - bm Nxb5; id "Karpov - Hansen";
1.Nxb5 instantly wins. Since not every engine finds it quickly enough, they should at least play 1.c4 with good winning chances.

A job for Larry?

:D
90% of coding is debugging, the other 10% is writing bugs.
Rebel
Posts: 6991
Joined: Thu Aug 18, 2011 12:04 pm

Re: Stockfish Handicap Matches

Post by Rebel »

chrisw wrote: Sun Jun 21, 2020 10:14 pm
With one white knight removed, generated brute force 4-deep, 179096 epds, of which 85159 are unique.

I'll run them by SF11 at 10ms, and find the average eval and the standard deviation, from that should be able to pluck out several thousand positions where neither side did anything stupid (like threw away material in 4-ply) and that should work as a large knight odds test suite ...
Takes too much cpu time. I have decided to test at 40/240, 6 seconds average.
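For reference, the average time per move implied by the time controls mentioned in this thread works out as below. The sketch assumes a plain "N moves in T seconds" control with no increment, and seconds_per_move is just an illustrative name.

```python
def seconds_per_move(moves, seconds):
    """Average thinking time per move for an 'N moves in T seconds' control."""
    return seconds / moves

print(seconds_per_move(40, 15))       # 40 moves in 15 seconds
print(seconds_per_move(40, 240))      # 40 moves in 240 seconds
print(seconds_per_move(40, 15 * 60))  # 40 moves in 15 minutes
```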
90% of coding is debugging, the other 10% is writing bugs.
chrisw
Posts: 4313
Joined: Tue Apr 03, 2012 4:28 pm

Re: Stockfish Handicap Matches

Post by chrisw »

Rebel wrote: Sun Jun 21, 2020 10:32 pm
Takes too much cpu time. I have decided to test at 40/240, 6 seconds average.
OK, just throw away what you don't need.

Average eval after four plies from startpos minus knight is -300 centipawns (which is to be expected), and 66% of those positions are within +/-100 centipawns of that. I'll try culling at +/-10 centipawns and publish the list. Later ...
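The culling step can be sketched like this. It assumes the positions are held as (epd, eval_cp) pairs and that the window is measured around the mean of the full set, which is how the post reads; cull_near_mean is a hypothetical helper, not the code actually used.

```python
def cull_near_mean(scored, window_cp=10):
    """Keep (epd, eval_cp) pairs whose eval lies within window_cp of the mean.

    The mean is taken over every position passed in, before any culling.
    """
    if not scored:
        return []
    mean = sum(cp for _, cp in scored) / len(scored)
    return [(epd, cp) for epd, cp in scored if abs(cp - mean) <= window_cp]
```

A tighter window gives a more uniform handicap across the suite at the cost of fewer surviving positions.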
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Stockfish Handicap Matches

Post by lkaufman »

Rebel wrote: Sun Jun 21, 2020 10:29 pm
chrisw wrote: Sun Jun 21, 2020 9:43 pm I’m tempted to generate a few tens of thousands of test positions. How many would be enough?
I prefer 10 special positions, positions like:

8/4kpbn/p1p3p1/Pp2p2p/1P2Pn2/N1P1BP2/5P1P/5BK1 w - - bm Nxb5; id "Karpov - Hansen";
1.Nxb5 instantly wins. Since not every engine finds it quickly enough they should at least play 1.c4 with good winning chances.

A job for Larry?

:D
I don't see much point in creating artificial problems for this purpose; your current method is much better. But the more interesting way to do it, in my opinion, is to take chrisw's knight odds opening book and run gauntlets of top engines like SF and Komodo vs. several relatively weak engines, whichever engines are about the right strength to score 30-70% at knight odds. Knight odds (either White knight removed) is a clearly defined handicap of nearly constant magnitude, which makes it ideal for seeing how much improvement there has been and which top engine is better at giving the handicap. By the way, knight odds is always used rather than bishop odds in chess because with one bishop removed the game changes much more: you try to put pawns on particular colors, and it doesn't feel like chess anymore. Knights are interchangeable, bishops are not. Also, knight odds means the odds giver has White; otherwise it is called "knight and move" odds.
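To put the 30-70% band in perspective, a score can be converted to an Elo difference with the usual logistic formula. This is a sketch of that standard conversion only; it says nothing about how rating lists handle handicap games, and elo_diff is an illustrative name.

```python
import math

def elo_diff(score):
    """Elo difference implied by an expected score strictly between 0 and 1."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)

# The edges of the 30-70% band sit roughly 147 Elo either side of an even match.
for s in (0.30, 0.50, 0.70):
    print(f"{s:.0%} -> {elo_diff(s):+.0f} Elo")
```

Outside that band the curve steepens quickly, so a handful of games tells you little about the real gap, which is one reason to pick opponents that land inside it.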
Komodo rules!
chrisw
Posts: 4313
Joined: Tue Apr 03, 2012 4:28 pm

Re: Stockfish Handicap Matches

Post by chrisw »

lkaufman wrote: Sun Jun 21, 2020 11:08 pm
I don't see much point in creating artificial problems for this purpose, your current method is much better. But the more interesting way to do it in my opinion is to take chrisw's knight odds opening book, and then run gauntlets of top engines like SF and Komodo vs. several relatively weak engines, whatever engines are about the right strength to score 30-70% at knight odds. Knight odds (either White knight removed) is a clearly defined handicap of a nearly constant magnitude, which makes it ideal for the purpose of seeing how much improvement there has been and which top engine is better at giving the handicap. By the way, knight odds is always used rather than bishop in chess because with one bishop removed the game changes much more, you try to put pawns on particular colors, it doesn't feel like chess anymore. Knights are interchangeable, bishops are not. Also knight odds means odds giver has White, otherwise it is called "knight and move" odds.
Done: 5600 EPDs from the start position minus the b1 knight. I played out all four-ply combinations, culled all duplicates, then culled all positions where SF11's evaluation was more than +/-10 centipawns away from -300 centipawns (the SF11 average score for all EPDs), which leaves 5600 EPDs.

Link: https://github.com/ChrisWhittington/Che ... t-odds.epd

Will upload for no knight at g1 tomorrow am.

Small randomised sample below

Code: Select all

rnbqkbnr/pppp2pp/5p2/4p3/4P2P/8/PPPP1PP1/RNBQKB1R w KQkq - 0 3
rnbqk1nr/pppp1ppp/3bp3/8/3P4/8/PPP1PPPP/RNBQKBR1 w Qkq - 2 3
rnbqkb1r/pppppp1p/6pn/8/3P2P1/8/PPP1PP1P/RNBQKB1R w KQkq - 0 3
rnbqkbnr/p1p1pppp/1p1p4/8/8/P1N5/1PPPPPPP/R1BQKB1R w KQkq - 0 3
rnbqkb1r/ppppnppp/8/4p3/8/2P3P1/PP1PPP1P/RNBQKB1R w KQkq - 1 3
rnbqkbnr/p1ppppp1/8/1p5p/8/1QP5/PP1PPPPP/RNB1KB1R w KQkq - 0 3
rnbqkbnr/pp1ppp1p/2p5/6p1/8/N4P2/PPPPP1PP/R1BQKB1R w KQkq - 0 3
rnbqkbnr/p1ppppp1/7p/1p6/1P6/P7/2PPPPPP/RNBQKB1R w KQkq - 0 3
rnbqkbnr/1p1ppppp/p7/2p5/2P5/4P3/PP1P1PPP/RNBQKB1R w KQkq - 0 3
rnbqkbnr/pppp1p1p/4p3/6p1/P7/3P4/1PP1PPPP/RNBQKB1R w KQkq - 0 3
rnbqkb1r/1ppppppp/p6n/8/1P6/2N5/P1PPPPPP/R1BQKB1R w KQkq - 1 3
r1bqkbnr/ppppp1pp/2n5/5p2/4P3/3P4/PPP2PPP/RNBQKB1R w KQkq - 0 3
rnbqkbnr/pppppp2/6p1/7p/3P4/2N5/PPP1PPPP/R1BQKB1R w KQkq - 0 3
r1bqkbnr/pppppp1p/n5p1/8/8/P6P/1PPPPPP1/RNBQKB1R w KQkq - 1 3
r1bqkbnr/ppppp1pp/2n2p2/8/8/P1N5/1PPPPPPP/R1BQKB1R w KQkq - 1 3
rnbqkb1r/pp1ppppp/7n/2p5/3P4/8/PPPBPPPP/RN1QKB1R w KQkq - 0 3
r1bqkbnr/pppppppp/8/n3P3/8/8/PPPP1PPP/RNBQKB1R w KQkq - 1 3
rnbqkb1r/pppppppp/8/6P1/4n3/8/PPPPPP1P/RNBQKB1R w KQkq - 1 3
rnbqkbr1/pppppppp/5n2/8/1P2P3/8/P1PP1PPP/RNBQKB1R w KQq - 1 3
rnbq1bnr/pppkpppp/8/3p4/8/NP6/P1PPPPPP/R1BQKB1R w KQ - 1 3
r1bqkbnr/ppppp1pp/n4p2/8/8/1P1P4/P1P1PPPP/RNBQKB1R w KQkq - 0 3
rnbqkbnr/1pppppp1/p6p/8/3P1B2/8/PPP1PPPP/RN1QKB1R w KQkq - 0 3
rnbqkb1r/pppppp1p/7n/6p1/5P2/8/PPPPP1PP/RNBQKBR1 w Qkq - 1 3
r1bqkbnr/pp1ppppp/n1p5/8/3P4/P7/1PP1PPPP/RNBQKB1R w KQkq - 1 3
r1bqkbnr/ppppp1pp/2n2p2/8/4P3/6P1/PPPP1P1P/RNBQKB1R w KQkq - 1 3
r1bqkbnr/ppppppp1/n6p/2P5/8/8/PP1PPPPP/RNBQKB1R w KQkq - 1 3
rnbqkbn1/pppppppr/7p/8/7P/1P6/P1PPPPP1/RNBQKB1R w KQq - 1 3
rnbqkbnr/ppp1p1pp/3p4/5p2/6P1/2P5/PP1PPP1P/RNBQKB1R w KQkq - 0 3
rnbqkb1r/pppp1ppp/4p2n/8/8/4P3/PPPPBPPP/RNBQK2R w KQkq - 2 3
rnbqkbnr/pppppp1p/8/8/6p1/P5P1/1PPPPP1P/RNBQKB1R w KQkq - 0 3
1nbqkbnr/1ppppppp/r7/p7/8/NP6/P1PPPPPP/R1BQKB1R w KQk - 2 3
r1bqkb1r/pppppppp/2n4n/8/8/2NP4/PPP1PPPP/R1BQKB1R w KQkq - 3 3
rnbqkbr1/pppppppp/5n2/8/8/P6P/1PPPPPP1/RNBQKB1R w KQq - 1 3
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Stockfish Handicap Matches

Post by lkaufman »

chrisw wrote: Sun Jun 21, 2020 11:20 pm
Done 5600 EPDs off the start position minus b1 knight, played out all four ply combinations, culled all duplicates, culled all positions where SF11 evaluated more than +/-10 centipawns away from 300 centipawns (SF11 average score for all epds), and am now left with 5600 EPDs.

Thanks, but something seems very wrong here. You say it's based on a Stockfish average score of -300 centipawns for the positions, but the Stockfish 11 evaluation of a knight odds position is way worse than -300, so it doesn't sound possible to me that the positions could average only 300 centipawns down. Maybe if I check some of the positions I'll get a clue as to what the problem might be.
Komodo rules!