lkaufman wrote: Can you make a general statement as to whether longer time controls favored higher king safety and passed scores or lower ones? The one example you give suggests that longer tc favors higher king safety weights.
cdani wrote: For king safety it seems clearer that ltc favors higher king safety. For passed pawns I found some contradictory results, but maybe because they were badly tuned before.

We also found that king safety values should be higher for ltc, but the effect was just a few elo, nothing like 50.
To Larry Kaufman
Moderator: Ras
-
- Posts: 6257
- Joined: Sun Jan 10, 2010 6:15 am
- Location: Maryland USA
- Full name: Larry Kaufman
Re: To Larry Kaufman
Komodo rules!
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: To Larry Kaufman
Laskos wrote: Code: Select all
(Avg game length = 1.422 sec)
Settings = Gauntlet/32MB/1ms per move/M 900cp for 5 moves, D 150 moves/EPD:2moves_v1.epd(32000)
Time = 2509 sec elapsed, 0 sec remaining
1. Komodo 9.3 KS=30 5122.5/10000 4315-4070-1615 (L: m=4045 t=0 i=0 a=25) (D: r=709 i=237 f=523 s=8 a=138) (tpm=10.9 d=6.03 nps=1746716)
Jesse Gersenson wrote: Where are you getting 3ms? It says "Avg game length = 1.422 sec". Are the games averaging 237 moves per game? Assuming 120 ply per game it'd be 11.85ms per move.
I am not familiar with engine testing, but it seems that an engine using 10x its allotted time makes for an invalid test.
Is there a guideline for the minimum time per move, on one core, for testing?

Some dexterity is required in testing here, with issues about overhead and such; I will just show that the tests were perfectly fine. In Cutechess I tested at fixed depth, consuming roughly the same time per game as the fixed-time matches you don't like. I often prefer fixed time to fixed depth for several reasons, like being closer to how engines are usually tested and having more randomized games.
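The per-move arithmetic in the question above can be checked with a minimal Python sketch. The 120-ply game length is the assumption stated in the question, not a measured value; the function name is illustrative, and treating `tpm` as milliseconds is also an assumption.

```python
def ms_per_ply(avg_game_seconds: float, plies_per_game: float) -> float:
    """Average wall-clock time per ply, in milliseconds."""
    return avg_game_seconds * 1000.0 / plies_per_game

# Average game length from the quoted output; 120 plies per game is assumed.
print(f"{ms_per_ply(1.422, 120):.2f} ms per ply")   # 11.85 ms per ply

# Conversely, 3 ms per ply would imply this many full moves per game.
implied_moves = 1.422 * 1000.0 / 3.0 / 2.0
print(f"{implied_moves:.0f} moves per game")        # 237 moves per game

# Reported time per move (tpm, assumed to be in ms) relative to the 1 ms setting.
print(f"{10.9 / 1.0:.1f}x the nominal time per move")
```

Either way the engines spend roughly ten times the nominal 1 ms per move, which is the overhead issue raised in the question.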
For ultra-ultra-fast games -- fixed depth=5, with a result almost identical to the 1 ms per move run in LittleBlitzer:
Code: Select all
Score of Komodo 9.3 KS=30 vs Komodo 9.3 KS=50: 3896 - 3647 - 2457 [0.512] 10000
ELO difference: 9
Finished match
Code: Select all
Score of Komodo 9.3 KS=30 vs Komodo 9.3 KS=50: 1367 - 1527 - 2106 [0.484] 5000
ELO difference: -11
Finished match
Also, to show that KS=50 is close to optimal at 30 ms per move or depth=10, after I compared it to KS=30, I compared KS=50 to KS=70:
depth=10
Code: Select all
Score of Komodo 9.3 KS=50 vs Komodo 9.3 KS=70: 311 - 241 - 448 [0.535] 1000
ELO difference: 24
Finished match
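The "ELO difference" lines above follow from the win/loss/draw counts. Here is a minimal Python sketch, assuming the usual logistic Elo model; the error margin uses a normal approximation and is only a rough guide, and all function names are illustrative rather than taken from Cutechess.

```python
import math

def score_to_elo(score: float) -> float:
    """Elo difference implied by a score fraction under the logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_from_wld(wins: int, losses: int, draws: int) -> tuple[float, float]:
    """Return (score fraction, Elo difference) for a match result."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    return score, score_to_elo(score)

def elo_margin_95(wins: int, losses: int, draws: int) -> float:
    """Approximate 95% half-width in Elo, from the per-game score variance."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / games
    half_width = 1.96 * math.sqrt(var / games)
    return (score_to_elo(score + half_width) - score_to_elo(score - half_width)) / 2.0

# The three matches quoted above, as (wins, losses, draws) for the first engine.
matches = {
    "KS=30 vs KS=50, 10000 games": (3896, 3647, 2457),
    "KS=30 vs KS=50, 5000 games": (1367, 1527, 2106),
    "KS=50 vs KS=70, 1000 games": (311, 241, 448),
}
for name, wld in matches.items():
    score, elo = elo_from_wld(*wld)
    # Elo rounds to +9, -11 and +24, matching the quoted output.
    print(f"{name}: score {score:.3f}, Elo {elo:+.0f} +/- {elo_margin_95(*wld):.0f}")
```

The margins matter when reading these results: with only 1000 games the uncertainty is much wider than with 10000.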
-
- Posts: 1187
- Joined: Wed Jan 06, 2010 3:11 pm
Re: To Larry Kaufman
lkaufman wrote: We would very much like to improve Komodo's blitz/bullet chess, since that's the one area where Stockfish seems to have a small edge. But we don't yet know why Stockfish is stronger at bullet level play, so it's hard to fix this except by generally improving Komodo. We can say it's because our better eval is a bit slower, but we're not that much slower than Stockfish, so there is something else going on that we would love to identify and fix. One clue: we have never been able to make "probcut" work for us, although it seems to work fine in stockfish. No idea why this is so.

Dear Larry,
The CEGT 40/20 or CCRL 40/40 list is not a blitz chess list
53,6 % and 54 % for SF7 vs K9.3 on 4 cores on these lists is a small margin at LTC chess
http://www.computerchess.org.uk/ccrl/40 ... 4-bit_4CPU
http://www.husvankempen.de/nunn/40_40%2 ... ons/2.html
-
- Posts: 2204
- Joined: Sat Jan 18, 2014 10:24 am
- Location: Andorra
Re: To Larry Kaufman
lkaufman wrote: We also found that king safety values should be higher for ltc, but the effect was just a few elo, nothing like 50.

I was referring to the added effect of tuning all the parameters for a given time control, i.e., testing until one finds the optimal values for that time control. A lot of work.

But as most people do not tune for a given tc, the result is that some parameters are better for longer or maybe shorter tc, and the engine ends up at some midpoint, most likely wherever its author's testing method tends to take it.
Daniel José -
http://www.andscacs.com

-
- Posts: 1346
- Joined: Sat Apr 19, 2014 1:47 pm
Re: To Larry Kaufman
beram wrote: Dear Larry,
The CEGT 40/20 or CCRL 40/40 list is not a blitz chess list
53,6 % and 54 % for SF7 vs K9.3 on 4 cores on these lists is a small margin at LTC chess
http://www.computerchess.org.uk/ccrl/40 ... 4-bit_4CPU
http://www.husvankempen.de/nunn/40_40%2 ... ons/2.html

You're always worried about Komodo being less good than Stockfish. Of course Komodo 9.3 or 9.2 is less good than Stockfish 7 under these conditions. There are many reasons; the most important is that Stockfish 7 is more recent.
Just wait, as usual, for the next Komodo to be better than Stockfish 7 on these rating lists; you know it's going to happen soon.
And in the meantime, Komodo is still the best under TCEC conditions for sure.
-
- Posts: 6257
- Joined: Sun Jan 10, 2010 6:15 am
- Location: Maryland USA
- Full name: Larry Kaufman
Re: To Larry Kaufman
beram wrote: Dear Larry,
The CEGT 40/20 or CCRL 40/40 list is not a blitz chess list
53,6 % and 54 % for SF7 vs K9.3 on 4 cores on these lists is a small margin at LTC chess
http://www.computerchess.org.uk/ccrl/40 ... 4-bit_4CPU
http://www.husvankempen.de/nunn/40_40%2 ... ons/2.html

The overall ratings for top SF and top Komodo on these two lists are essentially tied. The overall ratings are far more important statistically than the individual match result. For whatever reason, Komodo always seems to do better in the rating lists relative to SF than in direct matches. Perhaps it's just some stylistic thing, or perhaps it's related to contempt, or both. Anyway it's up to each person to decide whether ratings against a variety of opponents or the results of direct matches are more important.
Komodo rules!
-
- Posts: 1187
- Joined: Wed Jan 06, 2010 3:11 pm
Re: To Larry Kaufman
lkaufman wrote: The overall ratings for top SF and top Komodo on these two lists are essentially tied. The overall ratings are far more important statistically than the individual match result. For whatever reason, Komodo always seems to do better in the rating lists relative to SF than in direct matches. Perhaps it's just some stylistic thing, or perhaps it's related to contempt, or both. Anyway it's up to each person to decide whether ratings against a variety of opponents or the results of direct matches are more important.

The overall ratings are indeed essentially tied. But the individual match results of Stockfish are better on all these lists!
So saying that Stockfish only has a small edge in blitz and bullet is denying the truth or twisting it.
-
- Posts: 6257
- Joined: Sun Jan 10, 2010 6:15 am
- Location: Maryland USA
- Full name: Larry Kaufman
Re: To Larry Kaufman
beram wrote: The overall ratings are indeed essentially tied. But the individual match results of Stockfish are better on all these lists!
So saying that Stockfish only has a small edge in blitz and bullet is denying the truth or twisting it.

In blitz, looking at both 4 cpu and 1 cpu results for both CEGT and CCRL at 40/4', Stockfish has roughly a fifteen point lead. I would call this a small edge. Going by direct results the gap is indeed larger. Some of this may go away if it is tested without contempt, since the contempt setting is optimized for a range of opponents well below Komodo level, not for a roughly equal opponent. In any case the fifteen elo gap in blitz becomes zero at the intermediate tc, showing the trend that Komodo gains with more time.
Komodo rules!
-
- Posts: 2071
- Joined: Thu May 04, 2006 3:40 am
- Location: Dune
Re: To Larry Kaufman
lkaufman wrote: In blitz, looking at both 4 cpu and 1 cpu results for both CEGT and CCRL at 40/4', Stockfish has roughly a fifteen point lead. I would call this a small edge. Going by direct results the gap is indeed larger. Some of this may go away if it is tested without contempt, since the contempt setting is optimized for a range of opponents well below Komodo level, not for a roughly equal opponent. In any case the fifteen elo gap in blitz becomes zero at the intermediate tc, showing the trend that Komodo gains with more time.

Agreed. At CEGT 40/20, if Komodo 9.2 or Komodo 9.3 were tested with 0 contempt, I think it would win by a small margin against Stockfish 7.
In the blitz CEGT testing you can see that Komodo 9.2 with 0 contempt (12CPU) scores 4% higher against Stockfish 7 (12CPU) than it does with the default contempt of 15. At the moment the 0 contempt version is rated 13 elo higher than the default version. It seems to me that contempt is not, at least at the moment, doing Komodo any favors on the rating lists.
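For readers unfamiliar with the term, contempt is commonly implemented as a shift in the score an engine assigns to drawn positions. The Python sketch below illustrates that general idea only; it is not Komodo's actual implementation, which the thread does not describe. The function name and the centipawn unit are assumptions.

```python
def draw_score(contempt_cp: int, engine_to_move: bool) -> int:
    """Score of a drawn position from the side to move's point of view.

    With positive contempt the engine treats a draw as slightly bad for
    itself, so it avoids repetitions unless it stands worse by more than
    the contempt value. Units are assumed to be centipawns.
    """
    return -contempt_cp if engine_to_move else contempt_cp

# Default contempt of 15 (value taken from the post above): a draw looks
# like -15 to the engine and +15 when it is the opponent's turn.
print(draw_score(15, engine_to_move=True))    # -15
print(draw_score(15, engine_to_move=False))   # 15

# With 0 contempt a draw is simply scored as 0 for both sides.
print(draw_score(0, engine_to_move=True))     # 0
```

Under this model, positive contempt helps against weaker opponents but can cost points against a roughly equal one, because the engine declines draws in positions where it is slightly worse.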
-
- Posts: 1187
- Joined: Wed Jan 06, 2010 3:11 pm
Re: To Larry Kaufman
lkaufman wrote: In blitz, looking at both 4 cpu and 1 cpu results for both CEGT and CCRL at 40/4', Stockfish has roughly a fifteen point lead. I would call this a small edge. Going by direct results the gap is indeed larger. Some of this may go away if it is tested without contempt, since the contempt setting is optimized for a range of opponents well below Komodo level, not for a roughly equal opponent. In any case the fifteen elo gap in blitz becomes zero at the intermediate tc, showing the trend that Komodo gains with more time.

Stockfish rules over Komodo in all the individual match results at CEGT and CCRL, from blitz to 40/20 or 40/40.
With or without contempt, Komodo comes second.
That is just a fact; don't twist it.