Silly, it's all there: look up at the numbers, the number of games and how the simulations are done, and shut up.
Milos wrote:
1) Your reasonable ground is just horse manure. Just changing the default contempt of H3 in matches against Komodo and SF improves the H3 score by 30+ Elo.
Laskos wrote:
1) I assumed on reasonable grounds that Komodo is 10 points above Houdini at this TC and hardware (before Don did it).
2) If Komodo reaches the Superfinal, it will have only a 53% or so chance of meeting Houdini; therefore, it could well meet weaker opposition, and Komodo's chances will improve in the Superfinal.
You assume that for more than a year Houdini didn't improve at all???
And that myth about Komodo and good scaling and H3 and bad scaling is just ridiculous. There is not a single shred of evidence for it except Don's shameful advertising of his product on this forum.
2) If Komodo reaches the final, Houdini's chance of reaching the final is certainly not 53% but higher. You should maybe go back to basics and revise stuff like a priori and a posteriori probabilities (or just go and try to figure out basic stuff like the Monty Hall problem).
nTCEC simulation
Moderator: Ras
-
Laskos
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: nTCEC simulation
-
Milos
- Posts: 4190
- Joined: Wed Nov 25, 2009 1:47 am
Re: nTCEC simulation
Quote 2SD error bars from 48 games with a 70% draw rate; can't be that difficult, right?
Laskos wrote:
Silly, it's all there: look up at the numbers, the number of games and how the simulations are done, and shut up.
Milos wrote:
1) Your reasonable ground is just horse manure. Just changing the default contempt of H3 in matches against Komodo and SF improves the H3 score by 30+ Elo.
Laskos wrote:
1) I assumed on reasonable grounds that Komodo is 10 points above Houdini at this TC and hardware (before Don did it).
2) If Komodo reaches the Superfinal, it will have only a 53% or so chance of meeting Houdini; therefore, it could well meet weaker opposition, and Komodo's chances will improve in the Superfinal.
You assume that for more than a year Houdini didn't improve at all???
And that myth about Komodo and good scaling and H3 and bad scaling is just ridiculous. There is not a single shred of evidence for it except Don's shameful advertising of his product on this forum.
2) If Komodo reaches the final, Houdini's chance of reaching the final is certainly not 53% but higher. You should maybe go back to basics and revise stuff like a priori and a posteriori probabilities (or just go and try to figure out basic stuff like the Monty Hall problem).
Kai, I like you, but you are like a kid who didn't learn the lesson and the teacher just caught him.
-
Don
- Posts: 5106
- Joined: Tue Apr 29, 2008 4:27 pm
Re: nTCEC simulation
Milos wrote:
Quote me first 2SD error bars for 48 games with a 70% draw rate.
Don wrote:
Milos is criticizing things he doesn't even understand. Your odds of winning a single game if you are 180 ELO stronger are about 74%, but winning a 48-game match if you are 180 stronger would be almost certain. This guy needs to go back to school or at least leave the statistics to people who know what they are talking about.
Laskos wrote:
Look silly, in 48 games even a 20-point difference will be reflected in a high percentage of wins. 53% for a 20-point difference is for ONE game, silly.
Milos wrote:
From your results if Komodo gets into the final it has 76% chance of winning it. That means it is on average 180Elo stronger than other engines. What a load of crap...
Laskos wrote:
I can give my simulations for Stage 4 and the Superfinal, after 45 games in Stage 3:
To qualify for the Superfinal:
Code: Select all
Komodo: 62%
Houdini: 53%
SF: 27%
Rybka: 15%
Critter: 15%
Bouquet: 14%
Gull: 11%
Hiarcs: 1%
Naum: 1%
Junior: 0%
To win nTCEC:
Code: Select all
Komodo: 47%
Houdini: 30%
SF: 12%
Rybka: 3%
Critter: 3%
Bouquet: 3%
Gull: 2%
Hiarcs: 0%
Naum: 0%
Junior: 0%
Don if anyone is here ignorant about statistics it is you. It is a well-known thing in this forum.
This is what you said:
"From your results if Komodo gets into the final it has 76% chance of winning it. That means it is on average 180Elo stronger than other engines."
Now that sentence implies that you either have no clue about what you are talking about, or else you just made an honest mistake. Which is it?
Do you honestly believe that if Komodo was 180 ELO stronger than any other engine it would only have a 76% chance of winning a 48-game superfinal? After having said that you are just not credible.
You are not credible anyway because you came into this loaded up with venom and hatred. I don't know what is wrong with you but please just go away and leave us alone.
Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.
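For readers who want to check the single-game figures traded above, here is a minimal sketch of the standard logistic Elo formula. The function names are illustrative only and do not come from any poster's code. Note that the formula gives an expected score (the fraction of points per game, with draws counting half), which is not the same thing as the probability of winning a game or a match.

```python
import math


def expected_score(elo_diff: float) -> float:
    """Expected score per game for the side that is elo_diff points stronger."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def elo_from_score(score: float) -> float:
    """Inverse of expected_score: the Elo difference implied by a score fraction."""
    return 400.0 * math.log10(score / (1.0 - score))


if __name__ == "__main__":
    for diff in (10, 20, 180):
        print(f"{diff:>3} Elo -> {expected_score(diff):.1%} expected score per game")
```

A 20 Elo edge comes out at roughly 53% per game and a 180 Elo edge at roughly 74%, which is why a per-game percentage cannot be read directly as a match-winning probability.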
-
Milos
- Posts: 4190
- Joined: Wed Nov 25, 2009 1:47 am
Re: nTCEC simulation
There is an important word in my sentence: "on average".
Don wrote:
Milos wrote:
Quote me first 2SD error bars for 48 games with a 70% draw rate.
Don wrote:
Milos is criticizing things he doesn't even understand. Your odds of winning a single game if you are 180 ELO stronger are about 74%, but winning a 48-game match if you are 180 stronger would be almost certain. This guy needs to go back to school or at least leave the statistics to people who know what they are talking about.
Laskos wrote:
Look silly, in 48 games even a 20-point difference will be reflected in a high percentage of wins. 53% for a 20-point difference is for ONE game, silly.
Milos wrote:
From your results if Komodo gets into the final it has 76% chance of winning it. That means it is on average 180Elo stronger than other engines. What a load of crap...
Laskos wrote:
I can give my simulations for Stage 4 and the Superfinal, after 45 games in Stage 3:
To qualify for the Superfinal:
Code: Select all
Komodo: 62%
Houdini: 53%
SF: 27%
Rybka: 15%
Critter: 15%
Bouquet: 14%
Gull: 11%
Hiarcs: 1%
Naum: 1%
Junior: 0%
To win nTCEC:
Code: Select all
Komodo: 47%
Houdini: 30%
SF: 12%
Rybka: 3%
Critter: 3%
Bouquet: 3%
Gull: 2%
Hiarcs: 0%
Naum: 0%
Junior: 0%
Don if anyone is here ignorant about statistics it is you. It is a well-known thing in this forum.
This is what you said:
"From your results if Komodo gets into the final it has 76% chance of winning it. That means it is on average 180Elo stronger than other engines."
Now that sentence implies that you either have no clue about what you are talking about, or else you just made an honest mistake. Which is it?
Do you honestly believe that if Komodo was 180 ELO stronger than any other engine it would only have a 76% chance of winning a 48-game superfinal? After having said that you are just not credible.
You are not credible anyway because you came into this loaded up with venom and hatred. I don't know what is wrong with you but please just go away and leave us alone.
Maybe you should try to first learn to read instead of just shouting.
-
Laskos
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: nTCEC simulation
Look again silly, if you cannot do it in your head, take a pencil: 1SD error bars are about 30 points in 48 games with 70% draws. Komodo is expected to meet in the Superfinal an opponent which is 20-30 points weaker on average than itself. That is 2/3 to 1 SD, meaning 70% to 84% to win, and here comes your 76%. Got consistency? Now, shut up and stop posting your absurdities in this thread.
Milos wrote:
Quote 2SD error bars from 48 games with a 70% draw rate; can't be that difficult, right?
Laskos wrote:
Silly, it's all there: look up at the numbers, the number of games and how the simulations are done, and shut up.
Milos wrote:
1) Your reasonable ground is just horse manure. Just changing the default contempt of H3 in matches against Komodo and SF improves the H3 score by 30+ Elo.
Laskos wrote:
1) I assumed on reasonable grounds that Komodo is 10 points above Houdini at this TC and hardware (before Don did it).
2) If Komodo reaches the Superfinal, it will have only a 53% or so chance of meeting Houdini; therefore, it could well meet weaker opposition, and Komodo's chances will improve in the Superfinal.
You assume that for more than a year Houdini didn't improve at all???
And that myth about Komodo and good scaling and H3 and bad scaling is just ridiculous. There is not a single shred of evidence for it except Don's shameful advertising of his product on this forum.
2) If Komodo reaches the final, Houdini's chance of reaching the final is certainly not 53% but higher. You should maybe go back to basics and revise stuff like a priori and a posteriori probabilities (or just go and try to figure out basic stuff like the Monty Hall problem).
Kai, I like you, but you are like a kid who didn't learn the lesson and the teacher just caught him.
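The pencil calculation in the post above can be sketched in a few lines. This is only an illustration under stated assumptions, not the code behind anyone's simulation: it treats the two engines as roughly equal when computing the per-game variance, converts score to Elo with the slope of the logistic curve at 50%, and uses a normal approximation that ignores the possibility of a tied match. All function names are made up for the example.

```python
import math


def score_sd_per_game(win: float, draw: float) -> float:
    """Standard deviation of a single game's score (win = 1, draw = 0.5, loss = 0)."""
    mean = win + 0.5 * draw
    variance = win * 1.0 + draw * 0.25 - mean ** 2
    return math.sqrt(variance)


def elo_per_unit_score(score: float) -> float:
    """Slope d(Elo)/d(score) of the logistic Elo curve at a given score fraction."""
    return 400.0 / (math.log(10.0) * score * (1.0 - score))


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


games, draw_rate = 48, 0.70
win_rate = (1.0 - draw_rate) / 2.0  # assumption: near-equal engines

se_score = score_sd_per_game(win_rate, draw_rate) / math.sqrt(games)
se_elo = se_score * elo_per_unit_score(0.5)

if __name__ == "__main__":
    print(f"1 SD over {games} games: about {se_elo:.1f} Elo")
    for z in (2.0 / 3.0, 1.0):
        print(f"edge of {z:.2f} SD -> finishes ahead with probability ~{normal_cdf(z):.0%}")
```

Under these assumptions one standard deviation comes out a little under 30 Elo, and an edge of 2/3 to 1 SD corresponds to roughly 75% to 84% in the normal approximation, the same ballpark as the range quoted in the post.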
-
Ajedrecista
- Posts: 2217
- Joined: Wed Jul 13, 2011 9:04 pm
- Location: Madrid, Spain.
Re: nTCEC simulation.
Hello Don:
Firstly and most important: I wish you very good luck with your illness.
Don wrote:
Milos is criticizing things he doesn't even understand. Your odds of winning a single game if you are 180 ELO stronger are about 74%, but winning a 48-game match if you are 180 stronger would be almost certain. This guy needs to go back to school or at least leave the statistics to people who know what they are talking about.
Laskos wrote:
Look silly, in 48 games even a 20-point difference will be reflected in a high percentage of wins. 53% for a 20-point difference is for ONE game, silly.
Milos wrote:
From your results if Komodo gets into the final it has 76% chance of winning it. That means it is on average 180Elo stronger than other engines. What a load of crap...
Laskos wrote:
I can give my simulations for Stage 4 and the Superfinal, after 45 games in Stage 3:
To qualify for the Superfinal:
Code: Select all
Komodo: 62%
Houdini: 53%
SF: 27%
Rybka: 15%
Critter: 15%
Bouquet: 14%
Gull: 11%
Hiarcs: 1%
Naum: 1%
Junior: 0%
To win nTCEC:
Code: Select all
Komodo: 47%
Houdini: 30%
SF: 12%
Rybka: 3%
Critter: 3%
Bouquet: 3%
Gull: 2%
Hiarcs: 0%
Naum: 0%
Junior: 0%
I was going to suggest that it would be interesting to give some kind of Elo bonus to the white side given the small number of games, but I have just read that you give 40 Elo (the same that I had in mind!).
@Milos: I think that a trinomial distribution makes sense here. Just an intuitive example: two equal engines (i.e. the same engine under different names, to tell them apart) will each have a 50% chance in each game (colour aside, which is compensated by playing one game with each colour). The chance of each engine winning this mini-match of two games is not 50% but less, because they can draw the match (two draws or one win by each executable).
Some time ago I wrote a Windows tool for trinomial distributions. You can try it at the link in my signature (the tool is called 'Probabilities_in_a_trinomial_distribution' and it is valid for matches between two engines). You can play with Elo differences and draw ratios and you will see that, for example, a 20 Elo difference with a 60% draw ratio gives the stronger engine much more than a 52.9% chance of winning a 48-game match. As an important note, I assumed an equal number of games with white and black for each engine (that is, I did not program different Elo gaps by colour other than the initially specified Elo difference, which is a simplification).
It is true that guessing Elo differences is hard with a very low number of games (even with many thousands of games the 95% confidence error bars are usually more than a couple of Elo, which could change the result of the simulations or of the trinomial calculations a lot): we all know that in this case the error bars are huge. This is where I understand that Don and even Kai have to apply assumptions that may be more or less correct. Anyway, there is always the option of disagreeing publicly once and forgetting it!
@Moderators: there is no intention to start an absurd war. I only try to be constructive.
Regards from Spain.
Ajedrecista.
-
Joerg Oster
- Posts: 994
- Joined: Fri Mar 10, 2006 4:29 pm
- Location: Germany
- Full name: Jörg Oster
Re: nTCEC simulation
Me too!
Don wrote:
It's not a sure thing by any means but I would like to see it there.
Joerg Oster wrote:
How meaningful is your comment?
Milos wrote:
Lol, really have to laugh at this.
Don wrote:
Like any simulation I had to make certain assumptions, some of them perhaps rather arbitrary. For example the ELO ratings are based on the long time control rating lists with TCEC results from this season folded in, which give Komodo a 25 ELO advantage over Houdini. I reduced Komodo to only 10 ELO over Houdini, purely based on intuition. I have a hard time believing it is 25 ELO over Houdini even though it's improved over Komodo 6 and it's at a time control ideal for Komodo.
You have a Komodo advantage over Houdini of 25 Elo (or 10 Elo, or whatever). Are you aware that your 2 sigma error bars are at least 100 Elo?
How meaningful is your result?
You sound like the ignorant crowd in the TCEC chat that bases its predictions on the last 1-3 games.
There is no evidence dev Komodo is stronger than even H3. There is no evidence Komodo is stronger than SF.
The chance for SF not to qualify for the next stage is less than 20%. The chance for Komodo to be in the super final is at most 33%. To win the super final is at most 50%. Overall, the chance for Komodo to win it all is less than 18%.
It is not a rating list, but a simulation.
Komodo does a great job so far in nTCEC. Unlike Stockfish, which is playing a bit unfortunate.
Komodo will be in the super final. No doubt.
And Stockfish, of course.
Jörg Oster
-
Milos
- Posts: 4190
- Joined: Wed Nov 25, 2009 1:47 am
Re: nTCEC simulation
Finally.
Laskos wrote:
Look again silly, if you cannot do it in your head, take a pencil: 1SD error bars are about 30 points in 48 games with 70% draws. Komodo is expected to meet in the Superfinal an opponent which is 20-30 points weaker on average than itself. That is 2/3 to 1 SD, meaning 70% to 84% to win, and here comes your 76%. Got consistency? Now, shut up and stop posting your absurdities in this thread.
Milos wrote:
Quote 2SD error bars from 48 games with a 70% draw rate; can't be that difficult, right?
Laskos wrote:
Silly, it's all there: look up at the numbers, the number of games and how the simulations are done, and shut up.
Milos wrote:
1) Your reasonable ground is just horse manure. Just changing the default contempt of H3 in matches against Komodo and SF improves the H3 score by 30+ Elo.
Laskos wrote:
1) I assumed on reasonable grounds that Komodo is 10 points above Houdini at this TC and hardware (before Don did it).
2) If Komodo reaches the Superfinal, it will have only a 53% or so chance of meeting Houdini; therefore, it could well meet weaker opposition, and Komodo's chances will improve in the Superfinal.
You assume that for more than a year Houdini didn't improve at all???
And that myth about Komodo and good scaling and H3 and bad scaling is just ridiculous. There is not a single shred of evidence for it except Don's shameful advertising of his product on this forum.
2) If Komodo reaches the final, Houdini's chance of reaching the final is certainly not 53% but higher. You should maybe go back to basics and revise stuff like a priori and a posteriori probabilities (or just go and try to figure out basic stuff like the Monty Hall problem).
Kai, I like you, but you are like a kid who didn't learn the lesson and the teacher just caught him.
1SD is indeed around 30Elo and this is exactly the difference between Komodo and Houdini/SF in the final (for other engines LoS is above 95% i.e. Komodo is 60, 100 and more Elo stronger) according to your predictions. 72% of winning chances for Komodo against Houdini/SF (when you remove other engines) is 72% LoS which means slightly less than 4 more wins in 48 games match. With 70% draws this is 9+/34=/5- or 29Elo.
Now let me cite you:
"I too take a lot of informed guesses, by the way, the same 10 ELO points at this TC and hardware for Komodo above Houdini."
So now somehow 10 Elo magically became 30.
-
Milos
- Posts: 4190
- Joined: Wed Nov 25, 2009 1:47 am
Re: nTCEC simulation.
Winning chances for 2 opponents are LoS; 52.9% LoS means slightly less than 1 win in 48 games, which is a score of 8+/33=/7-, i.e. 21Elo.
Ajedrecista wrote:
Some time ago I wrote a Windows tool for trinomial distributions. You can try it at the link in my signature (the tool is called 'Probabilities_in_a_trinomial_distribution' and it is valid for matches between two engines). You can play with Elo differences and draw ratios and you will see that, for example, a 20 Elo difference with a 60% draw ratio gives the stronger engine much more than a 52.9% chance of winning a 48-game match. As an important note, I assumed an equal number of games with white and black for each engine (that is, I did not program different Elo gaps by colour other than the initially specified Elo difference, which is a simplification).
For LoS you don't actually need a trinomial distribution, since LoS is almost independent of the number of draws.
-
Ajedrecista
- Posts: 2217
- Joined: Wed Jul 13, 2011 9:04 pm
- Location: Madrid, Spain.
Re: nTCEC simulation.
Hello Milos:
Milos wrote:
Winning chances for 2 opponents are LoS; 52.9% LoS means slightly less than 1 win in 48 games, which is a score of 8+/33=/7-, i.e. 21Elo.
Ajedrecista wrote:
Some time ago I wrote a Windows tool for trinomial distributions. You can try it at the link in my signature (the tool is called 'Probabilities_in_a_trinomial_distribution' and it is valid for matches between two engines). You can play with Elo differences and draw ratios and you will see that, for example, a 20 Elo difference with a 60% draw ratio gives the stronger engine much more than a 52.9% chance of winning a 48-game match. As an important note, I assumed an equal number of games with white and black for each engine (that is, I did not program different Elo gaps by colour other than the initially specified Elo difference, which is a simplification).
For LoS you don't actually need a trinomial distribution, since LoS is almost independent of the number of draws.
I was not talking about LOS. I got 0.529 = 52.9% from rounding 1/[1 + 10^(-20/400)]. I wanted to say that in this case the stronger engine is expected to score 52.9% of the points in one game, but the probability that the stronger engine wins the 48-game match of my example (I mean, scores 24.5 points or more) is much more than 52.9%. For a 20 Elo difference and a 60% draw ratio, I obtain the following results with my tool (I hope that other people can verify my numbers):
Code: Select all
Probability that the strongest player wins the match ~ 69.8204 %
Probability of a tied match ~ 7.4696 %
Probability that the weakest player wins the match ~ 22.7099 %
But it looks like we are not going to agree on the numbers, so we would be wasting our time, Milos.
Regards from Spain.
Ajedrecista.
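Anyone who wants to verify the numbers above without the Windows tool can try a short exact calculation. The sketch below is not the tool's source code; it assumes the usual logistic Elo formula for the expected score, a fixed draw ratio, and no colour effect, and it sums the trinomial probabilities over every possible win/draw/loss split. The function name and its parameters are hypothetical.

```python
from math import comb


def match_probabilities(elo_diff: float, draw_ratio: float, games: int):
    """Return (P stronger wins match, P tied match, P weaker wins match)."""
    score = 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))  # expected score per game
    win = score - draw_ratio / 2.0                      # per-game win probability
    loss = 1.0 - win - draw_ratio                       # per-game loss probability
    if win < 0.0 or loss < 0.0:
        raise ValueError("draw ratio is too high for this Elo difference")

    p_win = p_tie = p_loss = 0.0
    for w in range(games + 1):
        for l in range(games - w + 1):
            d = games - w - l
            # Trinomial probability of exactly w wins, l losses and d draws.
            p = comb(games, w) * comb(games - w, l) * win**w * loss**l * draw_ratio**d
            if w > l:
                p_win += p
            elif w == l:
                p_tie += p
            else:
                p_loss += p
    return p_win, p_tie, p_loss


if __name__ == "__main__":
    p_win, p_tie, p_loss = match_probabilities(20, 0.60, 48)
    print(f"stronger wins: {p_win:.4%}  tied: {p_tie:.4%}  weaker wins: {p_loss:.4%}")
```

Under these assumptions the result should land close to the figures quoted above. Calling `match_probabilities(0, 0.60, 2)` also illustrates the mini-match example: two equal engines each win a two-game match only 28% of the time, because 44% of such matches end tied.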