Knight odds        Pool 2700   Pool 2500   Pool 2300
Komodo Dragon 2      55.6%       73.8%       89.9%
Stockfish 14         28.5%       47.2%       70.1%

Bishop odds        Pool 2700   Pool 2500   Pool 2300
Komodo Dragon 2      47.1%       67.6%       81.8%
Stockfish 14         14.5%       31.3%       51.2%

Rook odds          Pool 2700   Pool 2500   Pool 2300
Komodo Dragon 2      25.5%       52.9%       73.2%
Stockfish 14         18.0%       41.1%       64.0%

Queen odds         Pool 2700   Pool 2500   Pool 2300
Komodo Dragon 2       1.0%        3.6%        9.4%
Stockfish 14          0.0%        0.2%        0.5%

Komodo outscores Stockfish at every handicap. I am pretty sure that Komodo's games against GMs at odds have led to program changes for play when down in material; Larry might comment on that one.
1. Play queen-odds matches against 2000 / 1500 / 1000 Elo pools until Komodo and/or Stockfish finally start to win (>50%).
2. As above, but go beyond the 2700 engines and also test 2800, 2900, and 3000 Elo pools.
3. Invite a third engine and repeat the 2700, 2500, 2300 Elo cycle. Suggest an engine that does better than SF14, but make a reasonable case for it.
4. Suggest something else interesting.
5. Stop, it's enough.
Pick your preference.
I think inviting a third engine is best.
I believe many engines will do better than Stockfish 14, at least at queen odds, including Wasp and RubiChess.
It may also be interesting to test Komodo Dragon 2.5 (with the contempt setting Larry suggests) and Stockfish 13 (with the maximum possible contempt).
A different version of Stockfish might be worth testing, but I am not sure Stockfish 13 is the best choice for queen odds: I found that its evaluation is also stupid in this case, so it is better to test different engines.
I suggest testing the strongest engine other than Stockfish and Dragon whose evaluation changes in the right direction when I search deeper at queen odds.
With Wasp I can see that the evaluation at depth d+10 is always worse than the evaluation at depth d, and this is the reason I like it.
I dislike engines whose evaluation does not change regardless of search depth, so it is better not to choose them as candidates for queen odds: they do not know how Black can improve the position and seem to understand nothing.
FEN: rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNB1KBNR w KQkq - 0 1
RubiChess 2.2 is also a good candidate for queen odds because it does not show the stupid behaviour of Stockfish: I see the evaluation go down when I search deeper. (I used 7 cores; in my testing with Stockfish under the same conditions I do not see the evaluation go down.)
Yes, we do small things to improve odds play, and we include some odds positions in training the nets. I think Stockfish does strange things that hurt odds play. We can send you the new Dragon 2.5; it should be even better than Dragon 2 at odds play. It is 100 Elo stronger in FRC blitz play!
Thank you for your kind offer, I accept of course! I like Uri's idea to pit Dragon 2.5 against SF13 with equal contempt. But what would be the best contempt value?
And of course Dragon 2.5 will be tested for the GRL on my other PC.
90% of coding is debugging, the other 10% is writing bugs.
For Dragon I would recommend Contempt 100 for knight odds, 125 for rook odds, and 175 for queen odds. But I think the Stockfish versions that had Contempt limited it to 100, so if you want to use the same value for both engines, it has to be 100 for all handicaps. The definition of Contempt isn't the same in the two engines, but I think it is similar enough for your purposes.
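The recommendation above can be written down as a small lookup, shown here only as a sketch. The helper names are hypothetical, the option name "Contempt" is assumed to be what both engines expose, and the cap of 100 is the limit recalled in the post rather than a checked value. No value was given for bishop odds, so none is included.

```python
# Values quoted in the post above; bishop odds was not mentioned.
RECOMMENDED_DRAGON_CONTEMPT = {"knight": 100, "rook": 125, "queen": 175}

# Cap recalled in the post for Stockfish versions that had Contempt (assumption).
STOCKFISH_CONTEMPT_CAP = 100


def contempt_for(handicap: str, same_for_both: bool) -> int:
    """Return the Contempt value to use for a given handicap.

    If both engines must share one value, it is limited by the Stockfish cap.
    """
    value = RECOMMENDED_DRAGON_CONTEMPT[handicap]
    return min(value, STOCKFISH_CONTEMPT_CAP) if same_for_both else value


def uci_setoption(value: int, option_name: str = "Contempt") -> str:
    """Format a standard UCI setoption command (option name is an assumption)."""
    return f"setoption name {option_name} value {value}"


if __name__ == "__main__":
    for handicap in RECOMMENDED_DRAGON_CONTEMPT:
        print(handicap, uci_setoption(contempt_for(handicap, same_for_both=True)))
```

With `same_for_both=True` every handicap collapses to 100, which matches the "100 for all handicaps" remark.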
So here are the approximate Dragon performance ratings against the closest rating pool (the one where the score is closest to 50%): knight odds 2739, bishop odds 2680, rook odds 2520, queen odds 1906. I expected the knight odds rating to be about 200 above the rook odds rating; it was 219 above. The queen odds performance was a bit higher than I expected. The bishop vs. knight difference was a bit more than I expected for the opening position, where the large number of pawns should help the knights somewhat.

If we subtract 120 Elo, my estimate for the difference between using the first 100 positions on the list and using average/middle positions or pure odds without a book, then a fair opponent for Dragon at this time control is still about 2620 at knight odds, about 2560 at bishop odds, 2400 at rook odds, and a bit below 1800 at queen odds.

In general, I think that strong engines play blitz at roughly the same strength as humans of the same rating play Rapid (comparing CCRL blitz to human FIDE ratings), but these are bullet games, not blitz games, so perhaps another 150 Elo or so needs to be deducted to predict human performance at Rapid. That gives about 2470 for human knight odds, 2410 for human bishop odds, 2250 for rook odds, and about 1635 for queen odds. The knight odds figure agrees almost perfectly with our results in 17 Rapid human games; the rook and queen odds figures seem a bit too high.
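The performance ratings quoted above are consistent with the standard logistic Elo formula applied to the table scores; that this is exactly how they were computed is an assumption. The sketch below reproduces the arithmetic, including the 120 and 150 Elo deductions from the post. All function and constant names are made up for illustration.

```python
import math

# (pool rating, Dragon 2 score in %) for the pool closest to 50%, from the table.
CLOSEST_POOL = {
    "knight": (2700, 55.6),
    "bishop": (2700, 47.1),
    "rook": (2500, 52.9),
    "queen": (2300, 9.4),
}

BOOK_DEDUCTION = 120    # first 100 positions vs. middle positions / no book
BULLET_DEDUCTION = 150  # bullet engine games vs. human Rapid


def performance_rating(pool_rating: float, score_pct: float) -> float:
    """Standard logistic Elo performance: pool + 400 * log10(p / (1 - p))."""
    p = score_pct / 100.0
    return pool_rating + 400.0 * math.log10(p / (1.0 - p))


def estimates(handicap: str) -> tuple[int, int, int]:
    """Return (performance, fair engine opponent, predicted human Rapid rating)."""
    pool, pct = CLOSEST_POOL[handicap]
    perf = performance_rating(pool, pct)
    fair_engine = perf - BOOK_DEDUCTION
    human_rapid = fair_engine - BULLET_DEDUCTION
    return round(perf), round(fair_engine), round(human_rapid)


if __name__ == "__main__":
    for handicap in CLOSEST_POOL:
        print(handicap, estimates(handicap))
```

The first value in each tuple matches the performance figures in the post; the other two land within a point or two of the rounded numbers given there.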
I think it's reasonable to assume that if we run the 2700 pool at 40/120 (so a factor of 3 more time), the results will favor the 2700 engines a bit.
lkaufman wrote: ↑Fri Sep 24, 2021 5:36 pm I would recommend for Dragon Contempt 100 for knight odds, 125 for rook odds, and 175 for queen odds. But I think Stockfish versions that had Contempt limited it to 100, so if you want to use the same value then it has to be 100 for all handicaps. The definition of Contempt isn't the same in the two engines, but I think it is similar enough for your purposes.
Started the first match with Dragon 2.5 and a contempt of 100 vs. the 2700 pool.
I assume the 2700 pool is the one you are most interested in?
Yes, for knight and bishop odds anyway. Ideally we should pick a pool where we score close to 50%. Dragon 2.5 and Contempt 100 should both help, but probably not dramatically, since "better" chess is not necessarily better at giving big odds. If you switched to the middle positions from the ChrisW list, this would probably lower the performance by more than the new version and Contempt would raise it, but perhaps you prefer consistency.