Uri Blass wrote: ↑Thu Sep 23, 2021 1:04 am
2) As for the reason for the different results, I think the main difference, apart from the time control, is simply that
you use different opening positions and not simply the opening position without a queen or without another piece.
I do not know whether different opening positions help the weaker side or not, but from my point of view these are not normal odds games.
An opening book for the odds giver, to avoid repeating the same moves again and again, makes sense, but in this case
I think that only the odds giver should get an opening book, while the opponent engine should not get one.
Of course using an odds opening book is not the same as playing from the initial odds position, but it is not dramatically different based on my testing. However, a lot depends on how the book is trimmed to the desired size (in this case one hundred positions). I always select the middle N positions from the Chris W book, as they are ordered from smallest to largest advantage. Random choice would work too. But if you just use the first 100 positions from the list, that would effectively reduce the handicap by something like a pawn, maybe 300 elo or so. So it's a question for Ed: how did you select the 100 positions from the entire list? If they were chosen by some unbiased method, then the use of the book could not explain the disparity of results; it would more likely be due to a GUI issue when no base time is specified. Ed's results seem right to me, consistent with my own.
I am using the first 100 positions from Chris's tool.
Maybe he already randomized them; the original book from him was in order of eval, but you may have something later. Maybe I'll check when I have more time.
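The three ways of trimming the book mentioned above (first N, middle N, random N) can be sketched in a few lines of Python. This is only an illustration, assuming the book is already loaded as a list sorted by evaluation; the function name `trim_book` and its parameters are made up for the example and are not part of Chris's tool.

```python
import random


def trim_book(positions, n, method="middle", seed=0):
    """Return n positions from a list that is assumed to be sorted by eval.

    method is one of "first", "middle" or "random" (hypothetical names).
    """
    if n > len(positions):
        raise ValueError("book has fewer than n positions")
    if method == "first":
        # Takes one end of the sorted list, so the sample is biased.
        return positions[:n]
    if method == "middle":
        # Centre the window on the middle of the sorted list.
        start = (len(positions) - n) // 2
        return positions[start:start + n]
    if method == "random":
        # Unbiased sample; a fixed seed keeps the choice reproducible.
        return random.Random(seed).sample(positions, n)
    raise ValueError(f"unknown method: {method}")


# Stand-in for a book of 1000 positions sorted by evaluation.
book = list(range(1000))
middle_100 = trim_book(book, 100, "middle")
```

With a sorted book, "middle" and "random" both land near the average handicap, while "first" always picks the positions at one extreme.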
QUEEN odds match Stockfish 14 vs a pool of 2500 elo rated engines
Time control : 40/40
Games : 600
Results from file all.pgn:
No. Name             Win  Draw  Loss  Unf.  Score  Games      %
---------------------------------------------------------------
  1 Stockfish 14    +300    =1  -299    *0  300.5    600  50.1%
  2 CT800 1.43       +50    =0   -50    *0   50.0    100  50.0%
  3 Foxsee 7.20.1    +50    =0   -50    *0   50.0    100  50.0%
  4 Loki 3.5.0       +50    =0   -50    *0   50.0    100  50.0%
  5 Marvin 2.0       +50    =0   -50    *0   50.0    100  50.0%
  6 Nalwald 1.8.1    +50    =0   -50    *0   50.0    100  50.0%
  7 Monolith 0.3     +49    =1   -50    *0   49.5    100  49.5%
Total Games: 600
White Wins: 0 (0.0%)
Black Wins: 599 (99.8%)
Draws: 1 (0.2%)
Unfinished: 0 (0.0%)
Estimated ratings for this elo 2500 pool
  # PLAYER         : RATING  POINTS  PLAYED  (%)
  1 Stockfish 14   : 2500.5   300.5     600   50
  2 CT800 1.43     : 2500.5    50.0     100   50
  3 Foxsee 7.20.1  : 2500.5    50.0     100   50
  4 Nalwald 1.8.1  : 2500.5    50.0     100   50
  5 Marvin 2.0     : 2500.5    50.0     100   50
  6 Loki 3.5.0     : 2500.5    50.0     100   50
  7 Monolith 0.3   : 2497.0    49.5     100   50
Komodo : 51.8%
Stockfish : 50.1%
Stockfish closing in
Next, 2300 engines.
Having 50 wins and 50 losses out of 100 for almost every engine against Stockfish at queen odds seems wrong.
I would expect to see more than 1 draw in 600 games if the score is near 50%, and I would expect some difference between the results of the weak engines.
Good catch, Uri!
While creating the 2500 elo pool, the -noswap parameter was dropped, meaning that Komodo and Stockfish also played with the black pieces.
So the whole 2500 cycle has to be redone and is running again.
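The symptom is visible in the totals: Black won 599 of 600 games, so the result followed the colour rather than the engine. A check along these lines could catch it before the ratings are computed. This is a rough sketch, assuming the games have already been read from the PGN into (white, black, result) tuples; the function names are hypothetical.

```python
from collections import Counter


def color_split(games, player):
    """Count how many games `player` had White and how many Black."""
    counts = Counter()
    for white, black, _result in games:
        if white == player:
            counts["white"] += 1
        elif black == player:
            counts["black"] += 1
    return counts


def played_both_colors(games, player):
    """True if the player appears on both sides, i.e. sides were swapped."""
    counts = color_split(games, player)
    return counts["white"] > 0 and counts["black"] > 0


# Two made-up records showing the swapped-sides pattern:
# the side with the extra queen (Black here) wins both games.
games = [
    ("Stockfish 14", "CT800 1.43", "0-1"),
    ("CT800 1.43", "Stockfish 14", "0-1"),
]
if played_both_colors(games, "Stockfish 14"):
    print("Odds giver played both colours; check the side-swapping setting.")
```

In an odds match the odds giver should only ever appear on one side, so a non-zero count for both colours is enough to flag the run.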
rnbqkb1r/p2ppppp/5n2/1ppP4/2P5/8/PP2PPPP/RNBQKB1R w KQkq - 0 4; v=-309
r1bqkbnr/pppp1pp1/2n4p/4p3/2B1P3/5N2/PPPP1PPP/R1BQK2R w KQkq - 0 4; v=-329
rnbqk2r/ppppppbp/5np1/8/2PP4/2N5/PP2PPPP/R1BQKB1R w KQkq - 2 4; v=-329
r1bqk1nr/pppp1ppp/2n5/2b1p3/2B1P3/5N2/PPPP1PPP/R1BQK2R w KQkq - 4 4; v=-331
rnbqkb1r/pp1ppp1p/5np1/2p5/2PP4/2N5/PP2PPPP/R1BQKB1R w KQkq - 0 4; v=-333
rnbqk1nr/pp1pppbp/2p3p1/8/2P1P3/2N5/PP1P1PPP/R1BQKB1R w KQkq - 2 4; v=-337
rnbqkb1r/ppp1pp1p/3p1np1/8/3PP3/2N5/PPP2PPP/R1BQKB1R w KQkq - 0 4; v=-341
rnbqkb1r/pp2pppp/3p1n2/2pP4/2P5/8/PP2PPPP/RNBQKB1R w KQkq - 0 4; v=-350
rnbqkbnr/pp2pp1p/2pp2p1/8/3PP3/5N2/PPP2PPP/R1BQKB1R w KQkq - 0 4; v=-352
rnbqkbnr/pp2pp1p/2pp2p1/8/3PP3/2N5/PPP2PPP/R1BQKB1R w KQkq - 0 4; v=-354
r1bqk1nr/pppp1ppp/2n5/1Bb1p3/4P3/5N2/PPPP1PPP/R1BQK2R w KQkq - 4 4; v=-359
rnbqk2r/pppp1ppp/5n2/2b1p3/2B1P3/2P5/PP1P1PPP/RNBQK2R w KQkq - 1 4; v=-361
rnbqkbnr/ppp3pp/4p3/3p1p2/3P4/6P1/PPP1PPBP/RNBQK2R w KQkq - 0 4; v=-363
rnbqkb1r/ppp1pppp/3p4/3nP3/3P4/8/PPP2PPP/RNBQKB1R w KQkq - 0 4; v=-364
rnbqkb1r/pp1ppp1p/5np1/2p5/2PP4/5P2/PP2P1PP/RNBQKB1R w KQkq - 0 4; v=-366
r1bqkb1r/pppp1ppp/2n2n2/1B2p3/4P3/5N2/PPPP1PPP/R1BQK2R w KQkq - 4 4; v=-368
rnbqkbnr/pp2pp1p/3p2p1/2p5/4P3/2P2N2/PP1P1PPP/R1BQKB1R w KQkq - 0 4; v=-369
rnbqkbnr/pp3ppp/4p3/2pp4/2PP4/2N5/PP2PPPP/R1BQKB1R w KQkq - 0 4; v=-370
rnbqkb1r/ppppp2p/5np1/5p2/2PP4/6P1/PP2PP1P/R1BQKBNR w KQkq - 0 4; v=-370
rn1qkbnr/pbpp1ppp/1p2p3/8/3PP3/5N2/PPP2PPP/R1BQKB1R w KQkq - 1 4; v=-371
rnbqk1nr/ppp1ppbp/3p2p1/8/2PPP3/8/PP3PPP/RNBQKB1R w KQkq - 0 4; v=-373
rnbqkb1r/pppp2pp/4pn2/5p2/2PP4/2N5/PP2PPPP/R1BQKB1R w KQkq - 2 4; v=-374
rnbqk1nr/ppp1ppbp/3p2p1/8/3PP3/2N5/PPP2PPP/R1BQKB1R w KQkq - 2 4; v=-374
rnbqkb1r/pppp2pp/4pn2/5p2/3P4/5NP1/PPP1PP1P/R1BQKB1R w KQkq - 1 4; v=-375
rnbqkb1r/ppppp2p/5np1/5p2/3P4/6P1/PPP1PPBP/R1BQK1NR w KQkq - 0 4; v=-375
rnbqkbnr/pp1p2pp/2p1p3/5p2/3P4/6P1/PPP1PPBP/R1BQK1NR w KQkq - 0 4; v=-376
rnbqkb1r/pp1p1ppp/4pn2/2pP4/2P5/8/PP2PPPP/RNBQKB1R w KQkq - 0 4; v=-376
r1bqkbnr/ppp2ppp/2np4/4p3/2P5/6P1/PP1PPPBP/RNBQK2R w KQkq - 0 4; v=-377
rnbqkbnr/pp2pppp/8/2p5/2Pp4/4PN2/PP1P1PPP/R1BQKB1R w KQkq - 0 4; v=-377
r1bqkb1r/pppnpppp/3p1n2/8/2PP4/6P1/PP2PP1P/RNBQKB1R w KQkq - 1 4; v=-378
rnbqkb1r/pp1ppppp/8/2p5/3PnB2/8/PPP1PPPP/RN1QKB1R w KQkq - 0 4; v=-381
rnbqkb1r/pp1p1ppp/2p1pn2/8/2PP4/2N5/PP2PPPP/R1BQKB1R w KQkq - 0 4; v=-382
r1bqkbnr/pp1p1ppp/2n5/2p1p3/2P5/2N3P1/PP1PPP1P/R1BQKB1R w KQkq - 1 4; v=-382
rnbqkb1r/ppp1pppp/3p4/3nP3/3P4/8/PPP2PPP/R1BQKBNR w KQkq - 0 4; v=-382
rnbqk1nr/pp1pppbp/2p3p1/8/3PP3/5N2/PPP2PPP/R1BQKB1R w KQkq - 1 4; v=-383
rnbqk1nr/ppp1ppbp/3p2p1/8/2PPP3/8/PP3PPP/R1BQKBNR w KQkq - 0 4; v=-384
r1bqkbnr/pp1p1ppp/2n1p3/1Bp5/4P3/5N2/PPPP1PPP/R1BQK2R w KQkq - 0 4; v=-386
rnbqkb1r/ppp2ppp/3ppn2/8/3P4/6P1/PPP1PPBP/RNBQK2R w KQkq - 0 4; v=-387
rn1qkbnr/pb1ppppp/1p6/2p5/4P3/6P1/PPPP1PBP/RNBQK2R w KQkq - 2 4; v=-387
rnbqkb1r/ppp2ppp/4pn2/3p4/2PP4/2N5/PP2PPPP/R1BQKB1R w KQkq - 2 4; v=-387
rn1qkb1r/ppp1pppp/5n2/3p4/3P1Bb1/5N2/PPP1PPPP/R2QKB1R w KQkq - 4 4; v=-388
rnbqkb1r/pppp2pp/4pn2/5p2/2PP4/6P1/PP2PP1P/RNBQKB1R w KQkq - 0 4; v=-388
rnbqkbnr/ppp3pp/4p3/3p1p2/3P4/6P1/PPP1PPBP/R1BQK1NR w KQkq - 0 4; v=-388
rnbqkbnr/1pp2ppp/4p3/p2p4/3PP3/5N2/PPP2PPP/R1BQKB1R w KQkq - 0 4; v=-388
r1bqkb1r/pppnpppp/3p1n2/8/2PP4/2N5/PP2PPPP/R1BQKB1R w KQkq - 2 4; v=-390
r1bqkb1r/pppp1ppp/2n2n2/4p3/2P5/2N3P1/PP1PPP1P/R1BQKB1R w KQkq - 1 4; v=-391
rnbqkb1r/ppp1pp1p/5np1/3p4/2PP4/2N5/PP2PPPP/R1BQKB1R w KQkq - 0 4; v=-391
rnbqkb1r/pp1ppp1p/2p2np1/8/2PP4/6P1/PP2PP1P/RNBQKB1R w KQkq - 0 4; v=-391
rnbqk2r/ppppppbp/5np1/8/2PP4/6P1/PP2PP1P/RNBQKB1R w KQkq - 1 4; v=-391
rnbqkb1r/pppp2pp/4pn2/5p2/3P4/6P1/PPP1PPBP/RNBQK2R w KQkq - 0 4; v=-392
rnbqkb1r/pppp2pp/4pn2/5p2/2PP4/6P1/PP2PP1P/R1BQKBNR w KQkq - 0 4; v=-392
rnbqk1nr/ppp2ppp/4p3/3p4/1bPP4/4P3/PP3PPP/RNBQKB1R w KQkq - 1 4; v=-392
rnbqkb1r/ppp2ppp/3ppn2/8/3P4/6P1/PPP1PPBP/R1BQK1NR w KQkq - 0 4; v=-392
rnbqkb1r/pp2pppp/2p2n2/3p4/2PP4/5N2/PP2PPPP/R1BQKB1R w KQkq - 1 4; v=-393
rnbqkb1r/p1pp1ppp/1p2pn2/8/2PP4/6P1/PP2PP1P/RNBQKB1R w KQkq - 0 4; v=-393
r1bqkb1r/pppp1ppp/2n1pn2/8/2PP4/5N2/PP2PPPP/R1BQKB1R w KQkq - 2 4; v=-394
rn1qkbnr/pp2pppp/2p5/3pPb2/3P4/8/PPP2PPP/R1BQKBNR w KQkq - 1 4; v=-394
rnbqkbnr/pp1p2pp/2p1p3/5p2/3P4/6P1/PPP1PPBP/RNBQK2R w KQkq - 0 4; v=-394
rn1qkb1r/ppp1pppp/5n2/3p4/6b1/5NP1/PPPPPPBP/R1BQK2R w KQkq - 3 4; v=-395
r1bqkb1r/pppnpppp/3p1n2/8/2PP4/6P1/PP2PP1P/R1BQKBNR w KQkq - 1 4; v=-395
rnbqkb1r/pppp2pp/4pn2/5p2/3P4/6P1/PPP1PPBP/R1BQK1NR w KQkq - 0 4; v=-395
rnbqkb1r/ppppp2p/5np1/5p2/3P4/6P1/PPP1PPBP/RNBQK2R w KQkq - 0 4; v=-395
rnbqkb1r/pp1ppp1p/5np1/2p5/2PP4/6P1/PP2PP1P/RNBQKB1R w KQkq - 0 4; v=-395
rnbqkb1r/pp2pppp/2p2n2/3p4/2PP4/5N2/PP2PPPP/R1BQKB1R w KQkq - 0 4; v=-396
r1bqkb1r/pppp1ppp/2n1pn2/8/2PP4/5N2/PP2PPPP/R1BQKB1R w KQkq - 0 4; v=-397
rnbqkbnr/1p1p1ppp/p3p3/2p5/3PP3/2P5/PP3PPP/R1BQKBNR w KQkq - 0 4; v=-398
rnbqkbnr/pp2pp1p/2p3p1/3p4/3PP3/2N5/PPP2PPP/R1BQKB1R w KQkq - 0 4; v=-398
rnbqkb1r/pp1p1ppp/4pn2/2pP4/2P5/8/PP2PPPP/R1BQKBNR w KQkq - 0 4; v=-398
rnbqk2r/ppppppbp/5np1/8/2P5/2N3P1/PP1PPP1P/R1BQKB1R w KQkq - 1 4; v=-399
rn1qkb1r/ppp1pppp/5n2/3p4/3P2b1/4PN2/PPP2PPP/R1BQKB1R w KQkq - 1 4; v=-399
rnbqkb1r/ppp1p1pp/3p1n2/5p2/2P5/6P1/PP1PPPBP/RNBQK2R w KQkq - 0 4; v=-400
rnbqk2r/ppppppbp/5np1/8/2P1P3/5N2/PP1P1PPP/R1BQKB1R w KQkq - 1 4; v=-400
rnbqkb1r/p1pp1ppp/1p2pn2/8/2PP4/6P1/PP2PP1P/R1BQKBNR w KQkq - 0 4; v=-400
rnbqkb1r/ppp2ppp/4pn2/3p4/2P5/6P1/PP1PPPBP/RNBQK2R w KQkq - 2 4; v=-401
rnbqkb1r/ppp2ppp/4pn2/3p4/2PP4/6P1/PP2PP1P/RNBQKB1R w KQkq - 0 4; v=-401
rnbqkb1r/ppp2ppp/4pn2/3p4/3P4/5NP1/PPP1PP1P/R1BQKB1R w KQkq - 1 4; v=-401
r1bqkbnr/1pp1pppp/p1n5/1B1p4/8/4PN2/PPPP1PPP/R1BQK2R w KQkq - 0 4; v=-401
rnbqkbnr/pp3ppp/2p1p3/3p4/2PP4/4P3/PP3PPP/RNBQKB1R w KQkq - 0 4; v=-401
r1bqkbnr/ppp2ppp/2np4/4p3/2P5/2N3P1/PP1PPP1P/R1BQKB1R w KQkq - 0 4; v=-402
rnbqkb1r/ppppp2p/5np1/5p2/2PP4/6P1/PP2PP1P/RNBQKB1R w KQkq - 0 4; v=-402
rnbqkb1r/ppp1p1pp/3p1n2/5p2/2P5/6P1/PP1PPPBP/R1BQK1NR w KQkq - 0 4; v=-403
rnbqkb1r/ppp2ppp/4pn2/3p4/3PP3/8/PPPN1PPP/R1BQKB1R w KQkq - 2 4; v=-403
rnbqkbnr/1p1p1ppp/p3p3/2p5/3P4/4PN2/PPP2PPP/R1BQKB1R w KQkq - 0 4; v=-403
rnbqkb1r/pp2pppp/3p1n2/2pP4/2P5/8/PP2PPPP/R1BQKBNR w KQkq - 0 4; v=-403
r1bqkbnr/pp2pppp/2np4/1Bp5/4P3/5N2/PPPP1PPP/R1BQK2R w KQkq - 0 4; v=-404
rnbqk1nr/pp1pppbp/6p1/2p5/4P3/6P1/PPPP1PBP/RNBQK2R w KQkq - 2 4; v=-404
r1bqkb1r/pppp1ppp/2n2n2/4p3/2B1P3/5N2/PPPP1PPP/R1BQK2R w KQkq - 4 4; v=-405
rnbqkb1r/p2ppppp/1p3n2/2p5/2P5/5NP1/PP1PPP1P/R1BQKB1R w KQkq - 0 4; v=-405
rnbqkb1r/pp2pppp/2p2n2/3p4/2PP4/2N5/PP2PPPP/R1BQKB1R w KQkq - 2 4; v=-405
rnbqk1nr/pp1pbppp/4p3/2p5/4P3/3P1N2/PPP2PPP/R1BQKB1R w KQkq - 2 4; v=-406
r1bqkbnr/pppp1p1p/2n3p1/4p3/2P5/2N3P1/PP1PPP1P/R1BQKB1R w KQkq - 0 4; v=-406
r1bqkbnr/ppp2ppp/2np4/4p3/2P5/6P1/PP1PPPBP/R1BQK1NR w KQkq - 0 4; v=-406
rnbqkb1r/ppp2ppp/4pn2/3p4/2P1P3/2N5/PP1P1PPP/R1BQKB1R w KQkq - 0 4; v=-407
rnbqkb1r/pp1p1ppp/2p2n2/4p3/2P5/6P1/PP1PPPBP/RNBQK2R w KQkq - 0 4; v=-407
rnbqk1nr/ppp2ppp/4p3/3p4/1bPP4/4P3/PP3PPP/R1BQKBNR w KQkq - 1 4; v=-407
rnbqkb1r/pp2pppp/2p2n2/3p4/2PP4/4P3/PP3PPP/RNBQKB1R w KQkq - 1 4; v=-409
r1bqkbnr/pp1ppp1p/2n3p1/2p5/2P5/2N3P1/PP1PPP1P/R1BQKB1R w KQkq - 0 4; v=-410
rnbqkb1r/ppp2ppp/4pn2/3p4/2P5/6P1/PP1PPPBP/R1BQK1NR w KQkq - 2 4; v=-410
r1bqkbnr/pp1p1ppp/2n1p3/2p5/4P3/3P2P1/PPP2P1P/RNBQKB1R w KQkq - 1 4; v=-410
rnbqk1nr/ppp2ppp/4p3/3p4/1b1PP3/2N5/PPP2PPP/R1BQKB1R w KQkq - 2 4; v=-411
First 100.
So I think that this means that your test, while quite valid, is a test of slightly less than the indicated handicaps, maybe half a pawn less or so on average. So we should expect that at pure knight, rook, or queen odds Komodo and Stockfish might perform 100-150 elo worse. That's why I use the middle of the list, as I'm trying to predict results vs. humans at the odds. But for comparing Komodo with Stockfish, it shouldn't matter.
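To put a number on how far a given slice of the book sits from the rest, the `v=` field at the end of each line can be averaged. A minimal sketch, assuming `v` is an evaluation in centipawns (the list does not say so explicitly) and that every line ends with `; v=<integer>` as in the positions above; the helper names are made up.

```python
import re
from statistics import mean

# Matches the trailing "; v=-309" field of a book line.
EVAL_RE = re.compile(r";\s*v=(-?\d+)\s*$")


def parse_eval(line):
    """Extract the integer after 'v=' from one book line."""
    match = EVAL_RE.search(line)
    if match is None:
        raise ValueError(f"no v= field in line: {line!r}")
    return int(match.group(1))


def mean_eval(lines):
    """Average v= value over a slice of the book."""
    return mean(parse_eval(line) for line in lines)


# Three lines copied from the list above (first, second and last).
sample = [
    "rnbqkb1r/p2ppppp/5n2/1ppP4/2P5/8/PP2PPPP/RNBQKB1R w KQkq - 0 4; v=-309",
    "r1bqkbnr/pppp1pp1/2n4p/4p3/2B1P3/5N2/PPPP1PPP/R1BQK2R w KQkq - 0 4; v=-329",
    "rnbqk1nr/ppp2ppp/4p3/3p4/1b1PP3/2N5/PPP2PPP/R1BQKB1R w KQkq - 2 4; v=-411",
]
print(mean_eval(sample))
```

Running `mean_eval` on the first 100 lines and on the middle 100 lines of the full book would show directly how much lighter the handicap is in the first slice.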
I did two tests of 50 games each at rook odds, the first with Contempt = 150 and the second with Contempt = 125. It turned out that Komodo Dragon2 did better versus FoxSee v7.26em with Contempt 125; I never expected this to happen.
Uri, when I first saw this result I jumped out of my seat, because at queen odds the most that Dragon2 or Stockfish14 can get an even score against is a pool of engines rated around 1750, or humans rated around 1650, since most humans know how to trade when they are ahead in material.
Knight odds        Pool 2700  Pool 2500  Pool 2300
Komodo Dragon 2        55.6%      73.8%      89.9%
Stockfish 14           28.5%      47.2%      70.1%

Bishop odds        Pool 2700  Pool 2500  Pool 2300
Komodo Dragon 2        47.1%      67.6%      81.8%
Stockfish 14           14.5%      31.3%      51.2%

Rook odds          Pool 2700  Pool 2500  Pool 2300
Komodo Dragon 2        25.5%      52.9%      73.2%
Stockfish 14           18.0%      41.1%      64.0%

Queen odds         Pool 2700  Pool 2500  Pool 2300
Komodo Dragon 2         1.0%       3.6%       9.4%
Stockfish 14            0.0%       0.2%       0.5%
Komodo wins at every odds. I am pretty sure that Komodo playing GMs at odds has resulted in program changes for when it is down in material; Larry might comment on that.
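For anyone who wants to turn the score percentages in these tables into approximate elo differences against the pool, the usual logistic formula can be used. This is just a sketch of the standard conversion, not the method the rating tool in this thread uses, and it is undefined for 0% or 100% scores; `elo_diff` is a name chosen for the example.

```python
import math


def elo_diff(score_pct):
    """Elo difference implied by a score percentage (logistic model)."""
    p = score_pct / 100.0
    if not 0.0 < p < 1.0:
        raise ValueError("score must be strictly between 0 and 100 percent")
    return -400.0 * math.log10(1.0 / p - 1.0)


# Knight odds against the 2500 pool, taken from the table above.
for name, pct in [("Komodo Dragon 2", 73.8), ("Stockfish 14", 47.2)]:
    print(f"{name}: {elo_diff(pct):+.0f} elo relative to the pool")
```

Scores very close to 0% or 100%, like the queen odds row, give huge and unreliable differences, so those are better read as "far out of range" than as a precise number.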