nTCEC simulation

Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: nTCEC simulation

Post by Laskos »

Milos wrote:
Laskos wrote:1) I assumed on reasonable grounds that Komodo is 10 points above Houdini at this TC and hardware (before Don did it).
2) If Komodo reaches the Superfinal, it will have only a 53% or so chance of meeting Houdini; it could well meet weaker opposition, so Komodo's chances in the Superfinal will improve.
1) Your reasonable ground is just horse manure. Just changing the default contempt of H3 in matches against Komodo and SF improves H3's score by 30+ Elo.
You assume that for more than a year Houdini didn't improve at all???
And that myth about Komodo scaling well and H3 scaling badly is just ridiculous. There is not a single shred of evidence for it except Don's shameful advertising of his product on this forum.

2) If Komodo reaches the final, Houdini's chance to reach the final is certainly not 53% but higher. You should maybe go back to basics and review stuff like a priori and a posteriori probabilities (or just go and try to figure out basic stuff like the Monty Hall problem) ;).
Silly, it's all there, look up at numbers, number of games and how the simulations are done, and shut up.
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: nTCEC simulation

Post by Milos »

Laskos wrote:
Milos wrote:
Laskos wrote:1) I assumed on reasonable grounds that Komodo is 10 points above Houdini at this TC and hardware (before Don did it).
2) If Komodo reaches the Superfinal, it will have only a 53% or so chance of meeting Houdini; it could well meet weaker opposition, so Komodo's chances in the Superfinal will improve.
1) Your reasonable ground is just horse manure. Just changing the default contempt of H3 in matches against Komodo and SF improves H3's score by 30+ Elo.
You assume that for more than a year Houdini didn't improve at all???
And that myth about Komodo scaling well and H3 scaling badly is just ridiculous. There is not a single shred of evidence for it except Don's shameful advertising of his product on this forum.

2) If Komodo reaches the final, Houdini's chance to reach the final is certainly not 53% but higher. You should maybe go back to basics and review stuff like a priori and a posteriori probabilities (or just go and try to figure out basic stuff like the Monty Hall problem) ;).
Silly, it's all there, look up at numbers, number of games and how the simulations are done, and shut up.
Quote the 2SD error bars for 48 games with a 70% draw rate; can't be that difficult, right? ;)
Kai, I like you, but you are like a kid who didn't learn his lesson and just got caught by the teacher :lol:.
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: nTCEC simulation

Post by Don »

Milos wrote:
Don wrote:
Laskos wrote:
Milos wrote:
Laskos wrote:I can give my simulations for Stage 4 and the Superfinal, after 45 games in Stage 3:

To qualify for the Superfinal:

Komodo:   62%
Houdini:  53%
SF:       27%
Rybka:    15%
Critter:  15%
Bouquet:  14%
Gull:     11%
Hiarcs:    1%
Naum:      1%
Junior:    0%
To win nTCEC:

Komodo:   47%
Houdini:  30%
SF:       12%
Rybka:     3%
Critter:   3%
Bouquet:   3%
Gull:      2%
Hiarcs:    0%
Naum:      0%
Junior:    0%
From your results if Komodo gets into the final it has 76% chance of winning it. That means it is on average 180Elo stronger than other engines. What a load of crap...
Look silly, in 48 games even 20 points difference will reflect in high percentage of win. 53% for 20 points difference is for ONE game, silly.
Milos is criticizing things he doesn't even understand. Your odds of winning a single game if you are 180 ELO stronger is about 74% - but winning a 48 game match if you are 180 stronger would be almost certain. This guy needs to go back to school or at least leave the statistics to people who know what they are talking about.
Quote me first the 2SD error bars for 48 games with a 70% draw rate.
Don, if anyone here is ignorant about statistics, it is you. It is a well-known thing on this forum.
This is what you said:

From your results if Komodo gets into the final it has 76% chance of winning it. That means it is on average 180Elo stronger than other engines.

Now that sentence implies that you either have no clue about what you are talking about, or else you just made an honest mistake. Which is it?

Do you honestly believe that if Komodo were 180 ELO stronger than any other engine it would only have a 76% chance of winning a 48-game superfinal? After having said that, you are just not credible.

You are not credible anyway because you came into this loaded up with venom and hatred. I don't know what is wrong with you but please just go away and leave us alone.
Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.
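
Editorial note: the "about 74%" figure for a 180 Elo gap quoted in this post can be checked with the standard logistic Elo formula. The sketch below is only an illustration; the function name `expected_score` is hypothetical, and the result is an expected score per game (a draw counts as half a point), not the probability of winning a 48-game match.

```python
def expected_score(elo_diff: float) -> float:
    """Expected score per game (win = 1, draw = 0.5, loss = 0)
    for the stronger side under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


if __name__ == "__main__":
    # Elo gaps mentioned in the thread: 10, 20 and 180 points.
    for diff in (10, 20, 180):
        pct = 100.0 * expected_score(diff)
        print(f"{diff:4d} Elo -> {pct:.1f}% expected score per game")
```

A per-game expectation like this says little about the outcome of a long match, which is the distinction the posters are arguing about.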
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: nTCEC simulation

Post by Milos »

Don wrote:
Milos wrote:
Don wrote:
Laskos wrote:
Milos wrote:
Laskos wrote:I can give my simulations for Stage 4 and the Superfinal, after 45 games in Stage 3:

To qualify for the Superfinal:

Komodo:   62%
Houdini:  53%
SF:       27%
Rybka:    15%
Critter:  15%
Bouquet:  14%
Gull:     11%
Hiarcs:    1%
Naum:      1%
Junior:    0%
To win nTCEC:

Komodo:   47%
Houdini:  30%
SF:       12%
Rybka:     3%
Critter:   3%
Bouquet:   3%
Gull:      2%
Hiarcs:    0%
Naum:      0%
Junior:    0%
From your results if Komodo gets into the final it has 76% chance of winning it. That means it is on average 180Elo stronger than other engines. What a load of crap...
Look silly, in 48 games even 20 points difference will reflect in high percentage of win. 53% for 20 points difference is for ONE game, silly.
Milos is criticizing things he doesn't even understand. Your odds of winning a single game if you are 180 ELO stronger is about 74% - but winning a 48 game match if you are 180 stronger would be almost certain. This guy needs to go back to school or at least leave the statistics to people who know what they are talking about.
Quote me first the 2SD error bars for 48 games with a 70% draw rate.
Don, if anyone here is ignorant about statistics, it is you. It is a well-known thing on this forum.
This is what you said:

From your results if Komodo gets into the final it has 76% chance of winning it. That means it is on average 180Elo stronger than other engines.

Now that sentence implies that you either have no clue about what you are talking about, or else you just made an honest mistake. Which is it?

Do you honestly believe that if Komodo were 180 ELO stronger than any other engine it would only have a 76% chance of winning a 48-game superfinal? After having said that, you are just not credible.

You are not credible anyway because you came into this loaded up with venom and hatred. I don't know what is wrong with you but please just go away and leave us alone.
There is an important phrase in my sentence: "on average".
Maybe you should try to first learn to read instead of just shouting ;).
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: nTCEC simulation

Post by Laskos »

Milos wrote:
Laskos wrote:
Milos wrote:
Laskos wrote:1) I assumed on reasonable grounds that Komodo is 10 points above Houdini at this TC and hardware (before Don did it).
2) If Komodo reaches the Superfinal, it will have only a 53% or so chance of meeting Houdini; it could well meet weaker opposition, so Komodo's chances in the Superfinal will improve.
1) Your reasonable ground is just horse manure. Just changing the default contempt of H3 in matches against Komodo and SF improves H3's score by 30+ Elo.
You assume that for more than a year Houdini didn't improve at all???
And that myth about Komodo scaling well and H3 scaling badly is just ridiculous. There is not a single shred of evidence for it except Don's shameful advertising of his product on this forum.

2) If Komodo reaches the final, Houdini's chance to reach the final is certainly not 53% but higher. You should maybe go back to basics and review stuff like a priori and a posteriori probabilities (or just go and try to figure out basic stuff like the Monty Hall problem) ;).
Silly, it's all there, look up at numbers, number of games and how the simulations are done, and shut up.
Quote the 2SD error bars for 48 games with a 70% draw rate; can't be that difficult, right? ;)
Kai, I like you, but you are like a kid who didn't learn his lesson and just got caught by the teacher :lol:.
Look again, silly. If you cannot do it in your head, take a pencil: 1SD error bars are about 30 points in 48 games with 70% draws. Komodo is expected to meet in the Superfinal an opponent that is on average 20-30 points weaker than itself. That is 2/3 to 1 SD, meaning a 70% to 84% chance to win, and there is your 76%. Consistent enough? Now, shut up and stop posting your absurdities in this thread.
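
Editorial note: the pencil-and-paper estimate in this post can be sketched as follows. This is a rough illustration under stated assumptions, not the method behind the simulations: it assumes independent games, an overall score near 50%, a linearised Elo curve, and a normal approximation that ignores tied matches. The names `elo_sd` and `normal_cdf` are hypothetical. Under these assumptions the 1 SD figure comes out a little under 30 Elo.

```python
import math


def elo_sd(games: int, draw_rate: float, score: float = 0.5) -> float:
    """Approximate 1-SD error bar of a match result, expressed in Elo."""
    win = score - draw_rate / 2.0          # per-game win probability
    loss = 1.0 - draw_rate - win           # per-game loss probability
    if win < 0.0 or loss < 0.0:
        raise ValueError("draw_rate is incompatible with the given score")
    # Variance of one game's score (1, 0.5 or 0 points).
    var_game = win * 1.0 + draw_rate * 0.25 - score ** 2
    sd_score = math.sqrt(var_game / games)  # SD of the average score
    # Slope of the logistic Elo curve at this score (Elo per unit of score).
    slope = 400.0 / (math.log(10.0) * score * (1.0 - score))
    return sd_score * slope


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


if __name__ == "__main__":
    sd = elo_sd(games=48, draw_rate=0.70)
    print(f"1 SD for 48 games at 70% draws: about {sd:.1f} Elo")
    # An edge of 2/3 SD to 1 SD, as in the post.
    for z in (2.0 / 3.0, 1.0):
        print(f"edge of {z:.2f} SD -> {100.0 * normal_cdf(z):.0f}% (normal approx.)")
```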
Ajedrecista
Posts: 2217
Joined: Wed Jul 13, 2011 9:04 pm
Location: Madrid, Spain.

Re: nTCEC simulation.

Post by Ajedrecista »

Hello Don:
Don wrote:
Laskos wrote:
Milos wrote:
Laskos wrote:I can give my simulations for Stage 4 and the Superfinal, after 45 games in Stage 3:

To qualify for the Superfinal:

Komodo:   62%
Houdini:  53%
SF:       27%
Rybka:    15%
Critter:  15%
Bouquet:  14%
Gull:     11%
Hiarcs:    1%
Naum:      1%
Junior:    0%
To win nTCEC:

Komodo:   47%
Houdini:  30%
SF:       12%
Rybka:     3%
Critter:   3%
Bouquet:   3%
Gull:      2%
Hiarcs:    0%
Naum:      0%
Junior:    0%
From your results if Komodo gets into the final it has 76% chance of winning it. That means it is on average 180Elo stronger than other engines. What a load of crap...
Look silly, in 48 games even 20 points difference will reflect in high percentage of win. 53% for 20 points difference is for ONE game, silly.
Milos is criticizing things he doesn't even understand. Your odds of winning a single game if you are 180 ELO stronger is about 74% - but winning a 48 game match if you are 180 stronger would be almost certain. This guy needs to go back to school or at least leave the statistics to people who know what they are talking about.
Firstly and most important: I wish you very good luck with your illness.

I was going to suggest that it would be interesting to give some kind of Elo bonus to the white side given the small number of games, but I have just read that you give 40 Elo (the same figure I had in mind!).

@Milos: I think that a trinomial distribution makes sense here. An intuitive example: take two equal engines (i.e. the same engine under two different names, just to tell them apart), each with a 50% expected score per game (colours aside, since they are compensated by playing one game with each colour). The chance of each engine winning this two-game mini-match is not 50% but less, because the match can be drawn (two draws, or one win for each executable).

Some time ago I wrote a Windows tool for trinomial distributions. You can try it at the link in my signature (the tool is called 'Probabilities_in_a_trinomial_distribution' and it is valid for matches between two engines). You can play with Elo differences and draw ratios and you will see that, for example, a 20 Elo difference with a 60% draw ratio gives the stronger engine much more than a 52.9% chance of winning a 48-game match. As an important note, I assumed an equal number of games with white and black for each engine (that is, I did not program different Elo gaps by colour beyond the initially specified Elo difference, which is a simplification).

It is true that guessing Elo differences is hard with a very low number of games (even with many thousands of games, the 95% confidence error bars are usually more than a couple of Elo, which could change the result of the simulations or of the trinomial calculations a lot): we all know that in this case the error bars are huge. This is where I understand that Don and even Kai have to apply assumptions that may be more or less correct. Anyway, there is always the option of disagreeing publicly once and forgetting it! ;)

@Moderators: there is no intention to start an absurd war. I only try to be constructive.

Regards from Spain.

Ajedrecista.
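
Editorial note: the two-game mini-match example in this post can be enumerated directly. The sketch below is only an illustration; the 60% draw rate is an assumption borrowed from the tool example in the same post, and the function name `mini_match` is hypothetical.

```python
import math
from itertools import product


def mini_match(draw_rate: float, games: int = 2):
    """Return (P(A wins match), P(match tied), P(B wins match))
    for two equally strong engines A and B."""
    win = loss = (1.0 - draw_rate) / 2.0
    # Per-game result from A's point of view: +1 win, 0 draw, -1 loss.
    results = {1: win, 0: draw_rate, -1: loss}
    p_a = p_tie = p_b = 0.0
    for sequence in product(results, repeat=games):
        prob = math.prod(results[r] for r in sequence)
        margin = sum(sequence)             # wins minus losses for A
        if margin > 0:
            p_a += prob
        elif margin < 0:
            p_b += prob
        else:
            p_tie += prob
    return p_a, p_tie, p_b


if __name__ == "__main__":
    p_a, p_tie, p_b = mini_match(draw_rate=0.60)
    print(f"A wins: {p_a:.2%}  tied: {p_tie:.2%}  B wins: {p_b:.2%}")
```

Each engine's chance of winning the mini-match stays below 50% because part of the probability mass goes to the tied match.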
Joerg Oster
Posts: 994
Joined: Fri Mar 10, 2006 4:29 pm
Location: Germany
Full name: Jörg Oster

Re: nTCEC simulation

Post by Joerg Oster »

Don wrote:
Joerg Oster wrote:
Milos wrote:
Don wrote:Like any simulation I had to make certain assumptions, some of them perhaps rather arbitrary. For example the ELO ratings are based on the long time control rating lists with TCEC results from this season folded in, which give Komodo a 25 ELO advantage over Houdini. I reduced Komodo to only 10 ELO over Houdini, purely based on intuition. I have a hard time believing it is 25 ELO over Houdini even though it's improved over Komodo 6 and it's at a time control ideal for Komodo.
Lol, really have to laugh at this.
You have a Komodo advantage over Houdini of 25 Elo (or 10 Elo, or whatever). Are you aware that your 2-sigma error bars are at least 100 Elo? :lol:
How meaningful is your result? :lol:

You sound like the ignorant crowd in the TCEC chat that bases its predictions on the last 1-3 games.
There is no evidence dev Komodo is stronger than even H3. There is no evidence Komodo is stronger than SF.
The chance of SF not qualifying for the next stage is less than 20%. The chance of Komodo being in the superfinal is at most 33%. Its chance of winning the superfinal is at most 50%. Overall, the chance of Komodo winning it all is less than 18%.
How meaningful is your comment?
It is not a rating list, but a simulation.

Komodo has done a great job so far in nTCEC, unlike Stockfish, which has been a bit unlucky.
Komodo will be in the super final. No doubt.
It's not a sure thing by any means but I would like to see it there.
Me too!
And Stockfish, of course. :D
Jörg Oster
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: nTCEC simulation

Post by Milos »

Laskos wrote:
Milos wrote:
Laskos wrote:
Milos wrote:
Laskos wrote:1) I assumed on reasonable grounds that Komodo is 10 points above Houdini at this TC and hardware (before Don did it).
2) If Komodo reaches the Superfinal, it will have only a 53% or so chance of meeting Houdini; it could well meet weaker opposition, so Komodo's chances in the Superfinal will improve.
1) Your reasonable ground is just horse manure. Just changing the default contempt of H3 in matches against Komodo and SF improves H3's score by 30+ Elo.
You assume that for more than a year Houdini didn't improve at all???
And that myth about Komodo scaling well and H3 scaling badly is just ridiculous. There is not a single shred of evidence for it except Don's shameful advertising of his product on this forum.

2) If Komodo reaches the final, Houdini's chance to reach the final is certainly not 53% but higher. You should maybe go back to basics and review stuff like a priori and a posteriori probabilities (or just go and try to figure out basic stuff like the Monty Hall problem) ;).
Silly, it's all there, look up at numbers, number of games and how the simulations are done, and shut up.
Quote the 2SD error bars for 48 games with a 70% draw rate; can't be that difficult, right? ;)
Kai, I like you, but you are like a kid who didn't learn his lesson and just got caught by the teacher :lol:.
Look again, silly. If you cannot do it in your head, take a pencil: 1SD error bars are about 30 points in 48 games with 70% draws. Komodo is expected to meet in the Superfinal an opponent that is on average 20-30 points weaker than itself. That is 2/3 to 1 SD, meaning a 70% to 84% chance to win, and there is your 76%. Consistent enough? Now, shut up and stop posting your absurdities in this thread.
Finally.
1SD is indeed around 30 Elo, and this is exactly the difference between Komodo and Houdini/SF in the final (for the other engines LoS is above 95%, i.e. Komodo is 60, 100 or more Elo stronger) according to your predictions. A 72% winning chance for Komodo against Houdini/SF (when you remove the other engines) is 72% LoS, which means slightly less than 4 more wins in a 48-game match. With 70% draws this is 9+/34=/5- or 29 Elo.
Now let me cite you:
I too take a lot of informed guesses, by the way, the same 10 ELO points at this TC and hardware for Komodo above Houdini.
So now somehow 10 Elo magically became 30 :lol: :lol:
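
Editorial note: the step from the 9+/34=/5- record to "29 Elo" in this post is the usual conversion from match score to Elo difference. A minimal sketch, assuming the standard logistic Elo model; the function name `elo_from_record` is hypothetical.

```python
import math


def elo_from_record(wins: int, draws: int, losses: int) -> float:
    """Elo difference implied by a win/draw/loss record (logistic model)."""
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games   # fraction of points scored
    if not 0.0 < score < 1.0:
        raise ValueError("Elo difference is undefined for a 0% or 100% score")
    return 400.0 * math.log10(score / (1.0 - score))


if __name__ == "__main__":
    # Record quoted in the post: 9 wins, 34 draws, 5 losses in 48 games.
    print(f"{elo_from_record(9, 34, 5):.0f} Elo")
```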
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: nTCEC simulation.

Post by Milos »

Ajedrecista wrote:Some time ago I wrote a Windows tool for trinomial distributions. You can try it at the link in my signature (the tool is called 'Probabilities_in_a_trinomial_distribution' and it is valid for matches between two engines). You can play with Elo differences and draw ratios and you will see that, for example, a 20 Elo difference with a 60% draw ratio gives the stronger engine much more than a 52.9% chance of winning a 48-game match. As an important note, I assumed an equal number of games with white and black for each engine (that is, I did not program different Elo gaps by colour beyond the initially specified Elo difference, which is a simplification).
Winning chances for 2 opponents are LoS, 52.9% LoS means slightly less than 1 win in 48 games which is score 8+/33=/7- i.e. 21Elo.
For LoS you don't actually need trinomial distribution since LoS is almost independent of number of draws ;).
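
Editorial note: for readers unfamiliar with the term, likelihood of superiority (LoS) is commonly computed from an observed result with the normal approximation below. Draws do not appear in it at all, which is the sense in which LoS does not depend on the number of draws. This is a sketch of that common approximation, with the hypothetical function name `los`; it is not necessarily the calculation behind the figures quoted in this post.

```python
import math


def los(wins: int, losses: int) -> float:
    """Likelihood of superiority from decisive games only
    (normal approximation); draws are ignored by construction."""
    decisive = wins + losses
    if decisive == 0:
        return 0.5                         # no decisive games, no information
    z = (wins - losses) / math.sqrt(decisive)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


if __name__ == "__main__":
    # Win/loss counts taken from the records mentioned in the thread.
    for wins, losses in ((8, 7), (9, 5)):
        print(f"+{wins} -{losses}: LoS = {100.0 * los(wins, losses):.1f}%")
```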
Ajedrecista
Posts: 2217
Joined: Wed Jul 13, 2011 9:04 pm
Location: Madrid, Spain.

Re: nTCEC simulation.

Post by Ajedrecista »

Hello Milos:
Milos wrote:
Ajedrecista wrote:Some time ago I wrote a Windows tool for trinomial distributions. You can try it at the link in my signature (the tool is called 'Probabilities_in_a_trinomial_distribution' and it is valid for matches between two engines). You can play with Elo differences and draw ratios and you will see that, for example, a 20 Elo difference with a 60% draw ratio gives the stronger engine much more than a 52.9% chance of winning a 48-game match. As an important note, I assumed an equal number of games with white and black for each engine (that is, I did not program different Elo gaps by colour beyond the initially specified Elo difference, which is a simplification).
Winning chances for 2 opponents are LoS, 52.9% LoS means slightly less than 1 win in 48 games which is score 8+/33=/7- i.e. 21Elo.
For LoS you don't actually need trinomial distribution since LoS is almost independent of number of draws ;).
I was not talking about LOS. I got 0.529 = 52.9% by rounding 1/[1 + 10^(-20/400)]. I meant that in this case the stronger engine has an expected score of 52.9% of the points in one game, but the probability that the stronger engine wins the 48-game match of my example (that is, scores 24.5 points or more) is much more than 52.9%. For a 20 Elo difference and a 60% draw ratio, I obtain the following results with my tool (I hope that other people can verify my numbers):

Probability that the strongest player wins the match ~  69.8204 %
                         Probability of a tied match ~   7.4696 %
  Probability that the weakest player wins the match ~  22.7099 %
But it looks like we are not going to agree on the numbers, so we would just be wasting our time, Milos.

Regards from Spain.

Ajedrecista.
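
Editorial note: readers who want to verify the figures in the post above can do so without the Windows tool. The sketch below is an independent illustration, not the tool's actual code: it assumes independent games, the logistic Elo model, a fixed draw ratio and no colour effect, and it builds the exact distribution of wins minus losses game by game. The function name `match_probabilities` is hypothetical. For a 20 Elo difference, a 60% draw ratio and 48 games it should land close to the three percentages quoted in the post.

```python
def match_probabilities(elo_diff: float, draw_rate: float, games: int):
    """Return (P(stronger wins match), P(tied match), P(weaker wins match))."""
    score = 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))  # expected score per game
    win = score - draw_rate / 2.0
    loss = 1.0 - draw_rate - win
    if win < 0.0 or loss < 0.0:
        raise ValueError("draw_rate is too high for this Elo difference")

    # dist maps (wins - losses) to its probability after the games played so far.
    dist = {0: 1.0}
    for _ in range(games):
        nxt = {}
        for margin, p in dist.items():
            for step, q in ((1, win), (0, draw_rate), (-1, loss)):
                nxt[margin + step] = nxt.get(margin + step, 0.0) + p * q
        dist = nxt

    p_strong = sum(p for m, p in dist.items() if m > 0)
    p_tie = dist.get(0, 0.0)
    p_weak = sum(p for m, p in dist.items() if m < 0)
    return p_strong, p_tie, p_weak


if __name__ == "__main__":
    p_strong, p_tie, p_weak = match_probabilities(20.0, 0.60, 48)
    print(f"stronger wins: {100 * p_strong:.4f} %")
    print(f"tied match:    {100 * p_tie:.4f} %")
    print(f"weaker wins:   {100 * p_weak:.4f} %")
```

Tracking only the margin (wins minus losses) keeps the computation exact while avoiding a sum over every win/draw/loss combination.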