To Larry Kaufman

Leto
Posts: 2028
Joined: Thu May 04, 2006 1:40 am
Location: Dune

Re: To Larry Kaufman

Post by Leto » Sat Feb 06, 2016 8:22 pm

lkaufman wrote:
We would very much like to improve Komodo's blitz/bullet chess, since that's the one area where Stockfish seems to have a small edge. But we don't yet know why Stockfish is stronger at bullet level play, so it's hard to fix this except by generally improving Komodo. We can say it's because our better eval is a bit slower, but we're not that much slower than Stockfish, so there is something else going on that we would love to identify and fix. One clue: we have never been able to make "probcut" work for us, although it seems to work fine in Stockfish. No idea why this is so.

beram wrote:
Dear Larry,
The CEGT 40/20 or CCRL 40/40 list is not a blitz chess list.
53.6% and 54% for SF7 vs K9.3 on 4 cores on these lists is a small margin at LTC chess.

http://www.computerchess.org.uk/ccrl/40 ... 4-bit_4CPU
http://www.husvankempen.de/nunn/40_40%2 ... ons/2.html

lkaufman wrote:
The overall ratings for top SF and top Komodo on these two lists are essentially tied. The overall ratings are far more important statistically than the individual match result. For whatever reason, Komodo always seems to do better in the rating lists relative to SF than in direct matches. Perhaps it's just some stylistic thing, or perhaps it's related to contempt, or both. Anyway it's up to each person to decide whether ratings against a variety of opponents or the results of direct matches are more important.

beram wrote:
The overall ratings indeed are essentially tied. But the individual match results of Stockfish are better on all these lists!
So saying that Stockfish only has a small edge in blitz and bullet is denying the truth or twisting it.

lkaufman wrote:
In blitz, looking at both 4 cpu and 1 cpu results for both CEGT and CCRL at 40/4', Stockfish has roughly a fifteen point lead. I would call this a small edge. Going by direct results the gap is indeed larger. Some of this may go away if it is tested without contempt, since the contempt setting is optimized for a range of opponents well below Komodo level, not for a roughly equal opponent. In any case the fifteen Elo gap in blitz becomes zero at the intermediate tc, showing the trend that Komodo gains with more time.

beram wrote:
Stockfish rules over Komodo in all the individual match results at CEGT and CCRL from blitz to 40/20 or 40/40.
With or without contempt, Komodo comes second.
That is just a fact; don't twist it.

In CEGT 40/20 Komodo 9.2 4CPU scored 48.8% against Stockfish 7. Considering that in CEGT 40/4 the 0 contempt version of K9.2 12CPU scored 4% higher against Stockfish 7 12CPU than the default version of K9.2 12CPU, if K9.2 4CPU were tested with 0 contempt in the CEGT 40/20 list and it also scored 4% higher, it would score almost 53% against Stockfish 7 4CPU.

beram
Posts: 1187
Joined: Wed Jan 06, 2010 2:11 pm

Re: To Larry Kaufman

Post by beram » Tue Feb 09, 2016 7:03 pm

lkaufman wrote:
In blitz, looking at both 4 cpu and 1 cpu results for both CEGT and CCRL at 40/4', Stockfish has roughly a fifteen point lead. I would call this a small edge. Going by direct results the gap is indeed larger. Some of this may go away if it is tested without contempt, since the contempt setting is optimized for a range of opponents well below Komodo level, not for a roughly equal opponent. In any case the fifteen Elo gap in blitz becomes zero at the intermediate tc, showing the trend that Komodo gains with more time.

In my testing of Stockfish against Komodo over the past half year, I have not found any measurably better performance for Komodo with contempt=0 whatsoever.
Nor could I find the 'gain by Komodo at longer time controls' that you have mentioned so many times.
See for instance these latest results from testing a recent SF 290116 against Komodo 9.3 (with contempt=0!):
100 games at tc 3m+2s blitz gave 53.5% for SF, and 100 games at 15m+10s gave 55.5%.

http://www.talkchess.com/forum/viewtopi ... 15&t=58791

SF 290116 - Komodo 9.3(ct=0) Blitz 3m+2 i5 4200M @2500Mhz, 2 cpu (Fritzmark 5,1)
1 Stockfish 290116 64 BMI2 +24 +18/=71/-11 53.50% 53.5/100
2 Komodo 9.3 64-bit -24 +11/=71/-18 46.50% 46.5/100

SF 290116 - Komodo 9.3(ct=0), Rapid 15m 10s
1 Stockfish 290116 64 BMI2 +38 +21/=69/-10 55.50% 55.5/100
2 Komodo 9.3 64-bit -38 +10/=69/-21 44.50% 44.5/100
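
As a rough cross-check of the two result tables above, here is a minimal Python sketch that recomputes the score and Elo difference from the win/draw/loss counts and adds an approximate 95% interval. The logistic Elo formula and the normal approximation are assumptions (the thread does not say how the GUI derived its +24 and +38), and the function names are purely illustrative.

```python
import math


def elo_from_score(score):
    """Elo difference implied by a score fraction (logistic model, assumed)."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def match_summary(wins, draws, losses, z=1.96):
    """Return (score, elo, elo_low, elo_high) for one match.

    The interval uses a normal approximation on the mean game result,
    which is only a rough guide for samples as small as 100 games.
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result, where a game yields 1, 0.5 or 0.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    se = math.sqrt(var / n)
    low, high = score - z * se, score + z * se
    return score, elo_from_score(score), elo_from_score(low), elo_from_score(high)


# Win/draw/loss counts from Stockfish's side, copied from the tables above.
matches = {
    "Blitz 3m+2s": (18, 71, 11),
    "Rapid 15m+10s": (21, 69, 10),
}

for name, (w, d, l) in matches.items():
    score, elo, lo, hi = match_summary(w, d, l)
    print(f"{name}: {score:.1%}, Elo {elo:+.0f}, approx. 95% interval {lo:+.0f} to {hi:+.0f}")
```

Under these assumptions the point estimates reproduce the +24 and +38 shown in the tables, while each interval spans several tens of Elo and the two intervals overlap heavily.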

JJJ
Posts: 1285
Joined: Sat Apr 19, 2014 11:47 am

Re: To Larry Kaufman

Post by JJJ » Tue Feb 09, 2016 7:49 pm

Yeah Bram, all your tests show Stockfish as better, yet it is losing at TCEC.

And nobody cares about 100-game tests. Everybody knows that Stockfish dev is better at short time controls on 4 CPUs, even at 15 min + 3.


Now try 1h + 15 sec on 8 cores or more and we'll talk again.

And by the way, don't forget that every time, Komodo catches up with Stockfish with its next update.

But yeah, you love running tests to see Stockfish increasing its edge. You really enjoy showing everyone and saying "hey, Stockfish is scoring 52, 54, now 59%!"

And every time there is a new Komodo update, it is back to a 50% score for Stockfish vs. Komodo at short time controls in your tests.

But what's your point here? Just saying "Larry isn't aware of what he is talking about"?

But you're not being fair at all. Maybe contempt doesn't change anything between Komodo and Stockfish. Or maybe it is worth about 5 Elo, and of course with your 100-game tests you'll never know.

And some tests have already shown that Komodo needs a TC longer than 15 min + 3 to be better than Stockfish on 4 cores or fewer. Did you look at those tests too?


So, what's your point, trying to explain "Hey, Stockfish is better than Komodo"?

You'll see the result at the next TCEC again, I guess. But on your machine, think what you want. The best program is not necessarily the one that can beat the "second". Look at the overall performance, like Larry said.
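
To put a number on the sample-size point above, here is a back-of-the-envelope Python sketch. It estimates how many games are needed before a 95% interval around the match score no longer includes 50%, for a given true Elo gap. The 70% draw rate is an assumption (roughly what the 100-game matches earlier in this thread show), as are the logistic Elo model, the normal approximation, and the function names.

```python
import math


def expected_score(elo_diff):
    """Expected score for the stronger side (logistic Elo model, assumed)."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def games_needed(elo_diff, draw_rate, z=1.96):
    """Games needed so that z standard errors equal the score edge over 50%.

    This is a rough resolution estimate, not a full power analysis.
    """
    score = expected_score(elo_diff)
    # Split the non-draw games so that the expected score comes out right.
    win = score - draw_rate / 2.0
    loss = 1.0 - draw_rate - win
    # Per-game variance of the result (1, 0.5 or 0).
    var = (win * (1.0 - score) ** 2
           + draw_rate * (0.5 - score) ** 2
           + loss * score ** 2)
    margin = score - 0.5
    return math.ceil(var * (z / margin) ** 2)


# Assumed draw rate of 70%; the Elo gaps are illustrative values.
for gap in (5, 15, 38):
    print(f"{gap:>2} Elo gap: about {games_needed(gap, 0.70)} games")
```

With these assumptions a 5 Elo difference needs several thousand games, which is why a 100-game match cannot confirm or rule out an effect of that size.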

beram
Posts: 1187
Joined: Wed Jan 06, 2010 2:11 pm

Re: To Larry Kaufman

Post by beram » Tue Feb 09, 2016 8:03 pm

JJJ wrote:Yeah Bram, all your tests show Stockfish as better, yet it is losing at TCEC.

:-) Well, at least Komodo scores 53% against Stockfish 6 at 40/120....
http://www.husvankempen.de/nunn/40120ne ... ion/1.html

JJJ
Posts: 1285
Joined: Sat Apr 19, 2014 11:47 am

Re: To Larry Kaufman

Post by JJJ » Tue Feb 09, 2016 8:15 pm

Too small sample.

But if you really want to go that way, just remember the last TCEC result, when it was the last Komodo dev versus the last Stockfish dev.

beram
Posts: 1187
Joined: Wed Jan 06, 2010 2:11 pm

Re: To Larry Kaufman

Post by beram » Tue Feb 09, 2016 9:24 pm

JJJ wrote:Too small sample.

But if you really want to go that way, just remember the last TCEC result, when it was the last Komodo dev versus the last Stockfish dev.
Or this http://www.talkchess.com/forum/viewtopi ... 51&t=58692

JJJ
Posts: 1285
Joined: Sat Apr 19, 2014 11:47 am

Re: To Larry Kaufman

Post by JJJ » Tue Feb 09, 2016 9:40 pm

Yes. At that time control Stockfish got a great result there.

So, I guess we'll all have to wait until the next Komodo release then.

carldaman
Posts: 1687
Joined: Sat Jun 02, 2012 12:13 am

Re: To Larry Kaufman

Post by carldaman » Wed Feb 10, 2016 12:23 am

JJJ wrote:Too small sample.

But if you really want to go that way, just remember the last TCEC result, when it was the last Komodo dev versus the last Stockfish dev.

The last TCEC result was also a 100 game sample. :wink:

However, wasn't there a test someone ran that showed that Komodo begins to scale better at TCs longer than 30 min/game ? I think we need more such longer TC tests, too.

CL

Leto
Posts: 2028
Joined: Thu May 04, 2006 1:40 am
Location: Dune

Re: To Larry Kaufman

Post by Leto » Wed Feb 10, 2016 7:42 pm

JJJ wrote:Yes. At that time control Stockfish got a great result there.

So, I guess we'll all have to wait until the next Komodo release then.

But still not at TCEC level. TCEC 8 saw an average depth of about 35 during the middlegame phase, I believe. The test by Andreas Strangmuller on his 16-core Opteron only averaged a depth of 30 during the middlegame phase.

leavenfish
Posts: 268
Joined: Mon Sep 02, 2013 6:23 am

Re: To Larry Kaufman

Post by leavenfish » Sat Feb 13, 2016 9:04 pm

For simple evaluation of a position (not a game play situation vs another computer) with plenty of time to evaluate, what would be the best contempt? 0?
