TCEC 10

syzygy
Posts: 5566
Joined: Tue Feb 28, 2012 11:56 pm

Re: TCEC 10

Post by syzygy »

shrapnel wrote:
Milos wrote:Moreover, quoting a once-in-a-blue-moon event as an example against solid statistics is really the basic fallacy that people who don't know much about statistics so often commit.
+1.
I must say I agree.
Some lawyers I know could prove with their arguments that Night is actually Day and Day is actually Night!
Ronald de Man seems to fall in the same category, juggling figures around to prove anything.
He may be a great Programmer or whatever, but seems to lack basic Common Sense.
It's perfectly obvious to the average Joe watching TCEC that Houdini is playing much more strongly and brilliantly than Komodo; whether that is because of Contempt or any other reason is another matter.
If the average Joe carefully reads what I wrote, he will see that I never suggested that Houdini is not stronger than Komodo.

Someone suggested that Komodo's supposed NUMA bug cannot be entirely responsible for the "trashing" Komodo is receiving. I responded to point out that with a bit of bad luck, even an equally strong engine may well end up losing the first five decisive games.

So the Komodo from stages 1 and 2, possibly affected by a bug that slows it down by 20% and costs it about 16 Elo, can easily lose the first 5 decisive games against the Houdini from stages 1 and 2. There is no real reason to suspect that Komodo 1970.00 is weaker than the previous version (beyond the possible slowdown).

It is clear that Houdini has so far greatly outperformed Komodo. It is leading 5-0 with 22 draws.
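
The "bit of bad luck" argument above can be checked with a few lines of arithmetic. The sketch below assumes two equally strong engines and independent games, so each decisive game is a fair coin flip; draws are ignored because they do not change who wins the decisive games. The function name is made up for illustration.

```python
from fractions import Fraction

def prob_first_k_decisive_one_sided(k: int, p_win: Fraction = Fraction(1, 2)) -> dict:
    """Chance that the first k decisive games all go to one engine.

    Assumes independent games; p_win is the probability that engine A
    wins any given decisive game (1/2 for equal strength).
    """
    a_sweeps = p_win ** k          # engine A wins all k decisive games
    b_sweeps = (1 - p_win) ** k    # engine B wins all k decisive games
    return {"specific_engine": b_sweeps, "either_engine": a_sweeps + b_sweeps}

result = prob_first_k_decisive_one_sided(5)
print(result["specific_engine"])  # 1/32
print(result["either_engine"])    # 1/16
```

Under these assumptions, a 5-0 start in decisive games is roughly a 3% event for a named engine and a 6% event for either engine: unusual, but far from impossible.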
corres
Posts: 3657
Joined: Wed Nov 18, 2015 11:41 am
Location: hungary

Re: TCEC 10

Post by corres »

If the speed gain came from a higher CPU clock frequency, you would be right. But in the case of TCEC, the gain in speed and playing strength (Elo) came from a higher number of cores. It has been shown that when you increase the number of cores, the relative gain in speed is larger than the relative gain in playing strength.
syzygy
Posts: 5566
Joined: Tue Feb 28, 2012 11:56 pm

Re: TCEC 10

Post by syzygy »

corres wrote:If the speed gain came from a higher CPU clock frequency, you would be right. But in the case of TCEC, the gain in speed and playing strength (Elo) came from a higher number of cores. It has been shown that when you increase the number of cores, the relative gain in speed is larger than the relative gain in playing strength.
Yes, but I was referring to Komodo's alleged NUMA bug that lowers its nps by 20% without changing the number of cores. (I do not know if Mark has really confirmed that there is a problem in the latest version, but it is clear that Komodo's reported nps has decreased.)
Michel
Posts: 2272
Joined: Mon Sep 29, 2008 1:50 am

Re: TCEC 10

Post by Michel »

Sorry, it is you who doesn't understand statistics. The type of calculation you do is only valid after a test has finished. Otherwise you could stop a test as soon as LOS >= 95%, and everyone here knows that that does not work.
Milos wrote:
syzygy wrote:In a match with two equally strong engines, there is a 1 in 16 probability that the first 5 wins are by the same engine. Are you seriously claiming that there is no such thing as a 1 in 16 event?
I am suggesting you don't understand much about statistics. The distribution is trinomial, not binomial; draw probability plays a serious role. +1=99-0 is a few orders of magnitude more reliable a statistic than +1=0-0. Therefore, your simplified comparison with a coin toss is just a straw man. The chance that K is not worse than H is the cdf for x>2.5sigma, which is around 1%, not 6.25% as your oversimplified "calculation" suggests.
But I was not even suggesting that they are equally strong. I just pointed out that there is no reason to suspect that the version of Komodo now playing is seriously bugged, i.e. beyond having lower nps than the earlier versions.
20% lower nps at the TCEC TC (an obvious sore-loser excuse, btw) and on hardware that strong is worth at best 10 Elo. Houdini's advantage is clearly over 10 Elo in the worst case (best for Komodo).
I recently ran a test that had one engine lead the other 10-1-19 before it got trashed. It just got lucky in those first 30 games. Such things happen all the time.
Again, you are just mixing apples and oranges. The draw probability is not even remotely similar to the one in TCEC. In addition, TC plays a role. It's a proven fact that extremely long TCs reduce error bars compared to extremely short ones. Your test is most probably from bullet. So again, totally incomparable. Moreover, quoting a once-in-a-blue-moon event as an example against solid statistics is really the basic fallacy that people who don't know much about statistics so often commit.
Ideas=science. Simplification=engineering.
Without ideas there is nothing to simplify.
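
The remark above about stopping a test once LOS reaches 95% can be illustrated with a small simulation. This is a sketch under stated assumptions, not anyone's actual test setup: both engines are equally strong, only decisive games are simulated (draws do not enter this LOS formula), and LOS uses the common normal approximation based on wins minus losses. The function names, trial count and game count are arbitrary choices for illustration.

```python
import math
import random

def los(wins: int, losses: int) -> float:
    """Likelihood of superiority under the usual normal approximation."""
    if wins + losses == 0:
        return 0.5
    z = (wins - losses) / math.sqrt(wins + losses)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def false_positive_rates(trials=2000, decisive_games=200, threshold=0.95, seed=1):
    """Compare peeking after every game with checking once at the end."""
    rng = random.Random(seed)
    stopped_early = 0
    passed_at_end = 0
    for _ in range(trials):
        wins = losses = 0
        crossed = False
        for _ in range(decisive_games):
            if rng.random() < 0.5:   # equal strength: fair coin per decisive game
                wins += 1
            else:
                losses += 1
            if los(wins, losses) >= threshold:
                crossed = True       # a peeking tester would have stopped here
        stopped_early += crossed
        passed_at_end += los(wins, losses) >= threshold
    return stopped_early / trials, passed_at_end / trials

peek_rate, fixed_rate = false_positive_rates()
print(f"peeking: {peek_rate:.3f}, fixed length: {fixed_rate:.3f}")
```

Even though neither engine is stronger, the peeking tester declares a winner far more often than the fixed-length tester, because three wins in a row at the very start already push this LOS estimate above 95%.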
corres
Posts: 3657
Joined: Wed Nov 18, 2015 11:41 am
Location: hungary

Re: TCEC 10

Post by corres »

You assume a gain of 50 Elo. I think that is too high under TCEC conditions. Based on the above, a gain of ~20 Elo seems more realistic to me.
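
The disagreement over 50 versus ~20 Elo can be turned into a rough estimate of what a 20% slowdown costs. The sketch below assumes those figures mean Elo gained per doubling of speed, which the thread does not state explicitly, and it assumes Elo scales linearly with log2 of speed, which is only a rule of thumb. The function name is hypothetical.

```python
import math

def elo_loss_from_slowdown(slowdown: float, elo_per_doubling: float) -> float:
    """Elo lost when speed drops by the given fraction (0.20 = 20% slower).

    Assumes Elo changes linearly with log2 of speed; elo_per_doubling is
    an input to the estimate, not a measured value.
    """
    speed_ratio = 1.0 - slowdown
    return -elo_per_doubling * math.log2(speed_ratio)

for per_doubling in (50, 20):
    print(per_doubling, round(elo_loss_from_slowdown(0.20, per_doubling), 1))
```

With 50 Elo per doubling this gives about 16 Elo, which matches the figure quoted earlier in the thread; with 20 Elo per doubling it gives about 6 Elo.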
Isaac
Posts: 265
Joined: Sat Feb 22, 2014 8:37 pm

Re: TCEC 10

Post by Isaac »

Next season I hope TCEC will pick Asmfish instead of Stockfish.
Choosing Stockfish is worse than the 20% Komodo slowdown due to a NUMA bug.
Not only because it is intentional, but also because the slowdown is greater than 20%, I believe. Someone please correct me if I'm wrong.
Nordlandia
Posts: 2821
Joined: Fri Sep 25, 2015 9:38 pm
Location: Sortland, Norway

Re: TCEC 10

Post by Nordlandia »

Isaac wrote:Next season I hope TCEC will pick Asmfish instead of Stockfish.
Choosing Stockfish is worse than the 20% Komodo slowdown due to a NUMA bug.
Not only because it is intentional, but also because the slowdown is greater than 20%, I believe. Someone please correct me if I'm wrong.
TCEC does not allow Stockfish derivatives such as:
  * Asmfish
  * Brainfish
  * McBrain
  * DON
  * Sting
  * SugaR
Geonerd
Posts: 79
Joined: Fri Mar 10, 2017 1:44 am

Re: TCEC 10

Post by Geonerd »

[profanity], this is a pedantic, argumentative bunch! :(

Have any of the people lecturing the world on "correct statistics" even looked at the bleeping games???
IMO, Houdini has a clear 'pull' over Komodo in the middle-game, and is generally the one with winning chances. Combine that with the current score and - yea - I'd call it a 'thrashing.'
AdminX
Posts: 6340
Joined: Mon Mar 13, 2006 2:34 pm
Location: Acworth, GA

Re: TCEC 10

Post by AdminX »

Geonerd wrote:[profanity], this is a pedantic, argumentative bunch! :(

Have any of the people lecturing the world on "correct statistics" even looked at the bleeping games???
IMO, Houdini has a clear 'pull' over Komodo in the middle-game, and is generally the one with winning chances. Combine that with the current score and - yea - I'd call it a 'thrashing.'
Well, it ain't over till it's over. However, I do agree with you about the current state of affairs. Of the 5 wins so far, game number 12 has been my favorite.
"Good decisions come from experience, and experience comes from bad decisions."
__________________________________________________________________
Ted Summers
Eelco de Groot
Posts: 4567
Joined: Sun Mar 12, 2006 2:40 am

Re: TCEC 10

Post by Eelco de Groot »

It's the search gap. Gettit? Out of this search gap comes all the naive speculation and nonsense that gets written. The program has every style and no style, it has no consistency to play against, only materialism, you can't learn from it, tomorrow it will be different (found another mine in the search gap), only the difference is just a reflection of - whoops, trod on another mine. What can you do with such a program? Use the take-back key and try again? - and imagine this helps you improve or learn?

Now, I claim this search gap has no meaning or understanding possibilities for a human. That a human can't relate his heuristics to it. That you can't extract the knowledge out of it and represent it to a human. That you can't even extract the knowledge out of it and represent it to yourself. You can't get heuristics from it. So I call it counting beans - useless for us humans.

Now, take a knowledge program, you can play it and see the play style. You can try and work out what it does and why. There'll be a reason, based on human chess heuristics. The game has plan, and flow, and doesn't consist of hidden minefields. It won't grind you down by search, it will try speculative ideas which it might, or might not, be able to get to work. You can see the speculative ideas, and try them yourself. I think you can, as a human, relate to this type of program. If you know the programmer, maybe you can see patterns into the program that come from him, and so on. I think these types of programs are infused with some force, in so far as any chunk of silicon can be.

I hate materialists.

Chris Whittington
Copied, with courtesy, from Adam Hair's blog.
Haven't looked at the bleeping games yet, sorry. Not that I'm calling Houdini a beancounter; I just thought of this 'echo from the past'. When there's a speed difference you get things like that.
Debugging is twice as hard as writing the code in the first
place. Therefore, if you write the code as cleverly as possible, you
are, by definition, not smart enough to debug it.
-- Brian W. Kernighan