nTCEC simulation
-
Laskos
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: nTCEC simulation
Laskos wrote:
Look again silly, if you cannot do it in your head, take a pencil: 1 SD error bars are about 30 points in 48 games with 70% draws. Komodo is expected to meet in the Superfinal an opponent which is 20-30 points weaker on average than itself. That is 2/3 to 1 SD, meaning 70% to 84% to win, and here comes your 76%. Got consistency? Now, shut up and stop posting your absurdities in this thread.
Milos wrote:
Finally.
1 SD is indeed around 30 Elo, and this is exactly the difference between Komodo and Houdini/SF in the final (for other engines LoS is above 95%, i.e. Komodo is 60, 100 and more Elo stronger) according to your predictions. 72% winning chances for Komodo against Houdini/SF (when you remove other engines) is 72% LoS, which means slightly less than 4 more wins in a 48-game match. With 70% draws this is 9+/34=/5- or 29 Elo.
Now let me cite you:
"I too take a lot of informed guesses, by the way, the same 10 ELO points at this TC and hardware for Komodo above Houdini."
So now somehow 10 Elo magically became 30.

Did you take the pencil? After giving you simulations and explanations, you still insist on your dumb remarks? For SF I took 20 ELO behind Komodo, and if you were smarter, you would have observed that the combined chances of still weaker engines (by 50 or more points) qualifying for the Superfinal are non-negligible, and that brings the average to 20-30, or, to please you, 20-25 points, close to 0.8 SD, hence your 76% (which you mistakenly assumed to mean a 180 ELO point difference in 48 games). Are you going to hijack this thread with your silly remarks? Observe how you started by claiming 100-point uncertainties, and now you pick on a 5-point "discrepancy" with your wrong assumptions.
Last edited by Laskos on Mon Oct 28, 2013 8:38 pm, edited 2 times in total.
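The "1 SD is about 30 points in 48 games with 70% draws" figure can be checked with a few lines of arithmetic. The sketch below assumes a logistic Elo curve, a fixed 70% draw rate, and a normal approximation for the match score; the function names are made up for illustration and are not taken from anyone's simulation in this thread.

```python
import math

def elo_standard_error(games, draw_rate, score=0.5):
    """Approximate 1 SD of a measured Elo difference after `games` games."""
    win = score - draw_rate / 2.0
    loss = 1.0 - draw_rate - win
    # Variance of the per-game result (1, 0.5 or 0 points) around its mean.
    var_game = (win * (1.0 - score) ** 2
                + draw_rate * (0.5 - score) ** 2
                + loss * score ** 2)
    se_score = math.sqrt(var_game / games)
    # Slope of the logistic Elo curve: Elo points per unit of score fraction.
    slope = 400.0 / (math.log(10.0) * score * (1.0 - score))
    return se_score * slope

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

se = elo_standard_error(games=48, draw_rate=0.70)
print(f"1 SD over 48 games: about {se:.1f} Elo")
for gap in (20, 25, 30):
    z = gap / se
    print(f"{gap} Elo gap = {z:.2f} SD, chance of outscoring the opponent about {normal_cdf(z):.0%}")
```

The normal approximation ignores drawn matches, so the percentages are only a rough guide to the 70% to 84% range discussed above.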
-
Don
- Posts: 5106
- Joined: Tue Apr 29, 2008 4:27 pm
Re: nTCEC simulation
Here is an update after the 48th round Bouquet 1.8b vs Gull 2.3 draw:
Name        Win Odds    Stage 4
---------   --------    -------
Komodo        52.627     99.038
Houdini       29.747     96.995
Bouquet        7.994     89.065
Critter        3.986     75.050
Rybka          2.705     69.402
Hiarcs         1.378     51.050
Gull           0.786     51.962
Stockfish      0.663     48.579
Naum           0.114     18.317
Junior         0.000      0.542
Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.
-
Milos
- Posts: 4190
- Joined: Wed Nov 25, 2009 1:47 am
Re: nTCEC simulation.
Ajedrecista wrote:
I wanted to say that in this case the stronger engine has a probability of score 52.9% of the points of one game.

I don't understand this sentence. Between two engines there is only a win/draw/loss probability for a single game, and this is a trinomial distribution, as you correctly noted. What a probability of score in % of points in one game is, is beyond my understanding.
-
Don
- Posts: 5106
- Joined: Tue Apr 29, 2008 4:27 pm
Re: nTCEC simulation
In my case I tried to make the simulation as accurate as I possibly could, based on the information we have. I reduced Komodo's ELO as an engineering decision from the values given to me by Miguel and Adam's calculation; since it is the version I am most interested in, I wanted the result to be conservative. See this post for part of the reason I feel that Komodo is indeed at least slightly superior:
http://talkchess.com/forum/viewtopic.php?t=49829
Note that the version of Komodo playing in TCEC is NOT Komodo 6 but an improved Komodo. Its impressive performance in this season's TCEC (even beating Houdini and Stockfish) is not the primary factor here, since it is based on only a handful of games.
The rating compression I applied (80%) is an attempt to make my simulation more accurate and reflect the reality of very long time control games: the relative difference between programs generally shrinks with time. I even gave a 40 ELO advantage to white to reflect the fact that white has a much easier go at it.
I also tried to accurately measure the high draw ratios of long time control games and the fact that as the ELO difference goes up, the chances of a draw decrease.
How good is the simulation? I have no idea. A lot of this was guesswork and supposition but I did it for fun and I think it probably has a lot of relevance, at least in the big picture.
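The ingredients described above (compressed rating gaps, a White bonus, and a draw rate that falls as the gap grows) can be combined in a small Monte Carlo sketch of a single 48-game match. This is not Don's simulator: the exponential draw model, its `base_draw` and `draw_decay` values, and the example ratings are all hypothetical placeholders, and only the 80% compression and the 40 ELO White advantage come from the post.

```python
import math
import random

def game_probs(white_elo_edge, base_draw=0.70, draw_decay=400.0):
    """Return (white win, draw, white loss) probabilities for one game."""
    score = 1.0 / (1.0 + 10.0 ** (-white_elo_edge / 400.0))
    # Hypothetical draw model: draws become rarer as the Elo gap widens.
    draw = base_draw * math.exp(-abs(white_elo_edge) / draw_decay)
    # Cap the draw rate so that win and loss probabilities stay non-negative.
    draw = min(draw, 2.0 * min(score, 1.0 - score))
    win = score - draw / 2.0
    return win, draw, 1.0 - draw - win

def play_match(rating_a, rating_b, rng, games=48, compression=0.80, white_adv=40.0):
    """Simulate one match with alternating colours; return engine A's points."""
    gap = compression * (rating_a - rating_b)   # compressed rating difference
    probs_a_white = game_probs(gap + white_adv)
    probs_b_white = game_probs(-gap + white_adv)
    points_a = 0.0
    for g in range(games):
        a_is_white = g % 2 == 0
        win, draw, _ = probs_a_white if a_is_white else probs_b_white
        r = rng.random()
        white_points = 1.0 if r < win else (0.5 if r < win + draw else 0.0)
        points_a += white_points if a_is_white else 1.0 - white_points
    return points_a

def estimate_match_win(rating_a, rating_b, trials=20000, seed=1, games=48):
    """Fraction of simulated matches in which engine A scores more than half."""
    rng = random.Random(seed)
    wins = sum(play_match(rating_a, rating_b, rng, games) > games / 2.0
               for _ in range(trials))
    return wins / trials

# Hypothetical ratings: engine A nominally 25 Elo above engine B.
print(estimate_match_win(3100.0, 3075.0))
```

A full tournament simulation would repeat this over every pairing and stage; the single-match version only shows how the three adjustments interact.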
-
Ajedrecista
- Posts: 2214
- Joined: Wed Jul 13, 2011 9:04 pm
- Location: Madrid, Spain.
Re: nTCEC simulation.
Hello again:
Ajedrecista wrote:
I wanted to say that in this case the stronger engine has a probability of score 52.9% of the points of one game.
Milos wrote:
I don't understand this sentence. Between two engines there is only a win/draw/loss probability for a single game, and this is a trinomial distribution, as you correctly noted. What a probability of score in % of points in one game is, is beyond my understanding.

I did not write in the correct terms. What I really mean is that the stronger engine is expected to score circa 52.9% (plus/minus uncertainties) of the points of the match.
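For reference, the 52.9% expected score corresponds to a 20 Elo advantage under the usual logistic Elo formula. A minimal sketch, assuming that formula is the one intended here (the function name is illustrative):

```python
def expected_score(elo_diff):
    """Expected fraction of points for the stronger side (logistic Elo model)."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

share = expected_score(20)
print(f"Expected score: {share:.1%}")
print(f"Expected points in a 48-game match: {48 * share:.1f} of 48")
```

This is an expectation of points, not the probability of winning the match, which is a separate and larger number.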
Sadly, this thread has degenerated very quickly from the intention of the original post, which was to provide indicative probabilities of certain events. With a lot of luck this paragraph will not be quoted, although I expect just the opposite. I do not want to spend more time answering more quotes when there is no intention of bringing solutions but only obstacles.
Moderation team, feel free to delete this post if you think it is the best thing to maintain peace in this forum.
Regards from Spain.
Ajedrecista.
-
Adam Hair
- Posts: 3226
- Joined: Wed May 06, 2009 10:31 pm
- Location: Fuquay-Varina, North Carolina
Re: nTCEC simulation.
Ajedrecista wrote:
I wanted to say that in this case the stronger engine has a probability of score 52.9% of the points of one game.
Milos wrote:
I don't understand this sentence. Between two engines there is only a win/draw/loss probability for a single game, and this is a trinomial distribution, as you correctly noted. What a probability of score in % of points in one game is, is beyond my understanding.

But you do understand that an engine with a 20 Elo advantage will have a greater than 52.9% chance of winning a 48 game match, right? And that an engine that has a 76% chance of winning a 48 game match need not be 180 Elo, on average, stronger than the other engines, correct?
-
Milos
- Posts: 4190
- Joined: Wed Nov 25, 2009 1:47 am
Re: nTCEC simulation.
Adam Hair wrote:
But you do understand that an engine with a 20 Elo advantage will have a greater than 52.9% chance of winning a 48 game match, right? And that an engine that has a 76% chance of winning a 48 game match need not be 180 Elo, on average, stronger than the other engines, correct?

If the rating uncertainty is 100 Elo, the difference between the engines is 20 Elo, and 48 games are played with a 70% draw rate, what is the probability, with 95% certainty, that the engine with the higher Elo will win the match?
If you answer me this question (without help from Miguel) we can talk further.
However, I'm pretty sure that you have no clue what I'm talking about, and can't answer this simple question.
-
Laskos
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: nTCEC simulation
I did similar things: rating compression of 75% compared to CCRL 40/40, and I assumed a bit better scaling of Komodo and SF vs. Houdini, of Rybka vs. Critter and Bouquet, and of Hiarcs vs. Naum and Junior. I also gathered all the info from this forum, including the link you gave me. I don't have a draw model or white advantage, so my estimates are rougher, but it seems that our simulations agree pretty well, coming from completely unrelated approaches. Stage 3 is only about 18 games per engine, and we see that luck is still important; even SF has a sizable chance of not qualifying. Stage 4 is 30 games each and the Superfinal 48 games, so we will see how our assumptions work. It's fun, because we really don't know what will happen; in the past two seasons we had a clear favourite.
-
Adam Hair
- Posts: 3226
- Joined: Wed May 06, 2009 10:31 pm
- Location: Fuquay-Varina, North Carolina
Re: nTCEC simulation.
Milos wrote:
If the rating uncertainty is 100 Elo, the difference between the engines is 20 Elo, and 48 games are played with a 70% draw rate, what is the probability, with 95% certainty, that the engine with the higher Elo will win the match?
If you answer me this question (without help from Miguel) we can talk further.
However, I'm pretty sure that you have no clue what I'm talking about, and can't answer this simple question.

Who do you think you and I are, Johann and Jacob Bernoulli?
To be honest, Milos, I do not know how to compute the probability that the higher rated player wins the match when the ratings difference is subject to uncertainty. Feel free to enlighten me. Given your reticence to show actual computations and the errors that I have seen you make in the past (such as a mistake in a simple application of Bayes' theorem), I have some doubt about whether you know how to.
However, if the ratings difference is truly 20 Elo, then the probability that the higher rated player will win is 72% to 73%.
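The fixed-difference case can be computed exactly from the trinomial model rather than simulated. The sketch below assumes a logistic Elo curve and a constant 70% draw rate, with each game independent; the function names are illustrative only. It tracks the distribution of (wins minus losses) game by game.

```python
def expected_score(elo_diff):
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

def match_outcome_probs(elo_diff, games=48, draw_rate=0.70):
    """Return (stronger side wins match, match tied, weaker side wins match)."""
    score = expected_score(elo_diff)
    p_win = score - draw_rate / 2.0       # per-game win probability
    p_loss = 1.0 - draw_rate - p_win      # per-game loss probability
    # dist[i] is the probability that (wins - losses) equals i - games.
    dist = [0.0] * (2 * games + 1)
    dist[games] = 1.0
    for _ in range(games):
        nxt = [0.0] * (2 * games + 1)
        for i, p in enumerate(dist):
            if p == 0.0:
                continue
            nxt[i] += p * draw_rate
            nxt[i + 1] += p * p_win
            nxt[i - 1] += p * p_loss
        dist = nxt
    return sum(dist[games + 1:]), dist[games], sum(dist[:games])

stronger, tied, weaker = match_outcome_probs(20)
print(f"stronger wins: {stronger:.3f}, tied: {tied:.3f}, weaker wins: {weaker:.3f}")
```

Under these assumptions the outright-win probability should come out close to the 72% to 73% quoted above, with the remainder split between a tied match and an upset. Handling an uncertain rating difference would require averaging this result over a distribution of possible differences, which is not attempted here.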
-
Don
- Posts: 5106
- Joined: Tue Apr 29, 2008 4:27 pm
Re: nTCEC simulation
Here is the sim after Houdini beats Rybka in round 50:
Name        Win Odds    Stage 4
---------   --------    -------
Komodo        44.791     99.113
Houdini       44.269     99.284
Bouquet        6.320     90.212
Critter        2.001     74.127
Hiarcs         1.160     60.497
Gull           0.551     54.432
Stockfish      0.518     51.339
Rybka          0.290     50.616
Naum           0.099     19.771
Junior         0.000      0.609