CCCC Rapid Rumble results simulator

Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: CCCC Rapid Rumble results simulator

Post by Laskos »

Werewolf wrote: Sun Sep 09, 2018 1:44 pm
Laskos wrote: Sun Sep 09, 2018 1:39 pm
RTX 2080Ti will probably have the full fp16 support, so it will be much faster than 1080Ti, and 2 of these will probably be even more than what CCCC shows for Lc0 (it doesn't scale well to 4 GPUs).
2080 Ti will be much faster than 1080 Ti for the reasons you give.
But I don't think 2x 2080 Ti will be faster than 2 x Titan V (or V100) according to the specs.

The Titan V will still be about 10% faster.
Well, even if that is correct, it is also probable that the 32-core AMD Threadripper 2990WX is a bit slower than the 46-core machine used in CCCC. I just wanted to show that the conditions are not very unbalanced price-wise if one is on a limited budget but has 2-3 months of patience.
Last edited by Laskos on Sun Sep 09, 2018 1:53 pm, edited 1 time in total.
Uri Blass
Posts: 10282
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: CCCC Rapid Rumble results simulator

Post by Uri Blass »

JJJ wrote: Sun Sep 09, 2018 11:59 am It is easy to predict even with a short tournament, because each engine is ranked with a good margin over the next, except Komodo / Houdini. So Stockfish is number 1 with a good margin, Lc0 number 4 with a good margin, then Fire, then Shredder, so the probability of getting the ranking in the right order is pretty high. Higher than with your simulator, which needs to take the elo into account much more than the starting ranking of a tournament.
It is easy to predict only that Stockfish, Houdini and Komodo are going to be the top 3 CPU programs, because there are a lot of rating lists at different time controls where they are the top 3.

Lc0 does not appear in many rating lists at different time controls, and nothing is easy to predict for her.
chrisw
Posts: 4317
Joined: Tue Apr 03, 2012 4:28 pm

Re: CCCC Rapid Rumble results simulator

Post by chrisw »

Uri Blass wrote: Sun Sep 09, 2018 1:52 pm
It is easy to predict only that Stockfish, Houdini and Komodo are going to be the top 3 CPU programs, because there are a lot of rating lists at different time controls where they are the top 3.

Lc0 does not appear in many rating lists at different time controls, and nothing is easy to predict for her.
Which is a good reason to stress results-so-far in a prediction sim. There are programs whose given initial ratings don't reflect their current reality: they are unrated, or a different version was rated, or the rating was obtained on different hardware, and so on. The best shot at guessing their current reality is their current tournament performance.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: CCCC Rapid Rumble results simulator

Post by Laskos »

chrisw wrote: Sun Sep 09, 2018 11:24 am 290 rounds so far

"broken" - Milos
"LOL" - Laskos
"really inaccurate" - JJJ
"ridiculous" - Milos

Engine      Tourn Init   1st  2nd  3rd  4th  5th  6th  7th  ...
Houdini     3452  3400   0.51 0.31 0.16 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Stockfish   3451  3439   0.34 0.40 0.23 0.02 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Komodo      3436  3404   0.14 0.28 0.51 0.05 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Shredder    3346  3287   0.00 0.00 0.04 0.34 0.29 0.20 0.09 0.03 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Lc0         3346  3300   0.00 0.00 0.03 0.32 0.28 0.22 0.10 0.04 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Fire        3348  3326   0.00 0.00 0.02 0.22 0.28 0.28 0.13 0.05 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Booot       3314  3276   0.00 0.00 0.00 0.04 0.09 0.18 0.35 0.22 0.09 0.02 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Ethereal    3304  3283   0.00 0.00 0.00 0.01 0.04 0.09 0.22 0.37 0.18 0.06 0.02 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Andscacs    3268  3244   0.00 0.00 0.00 0.00 0.01 0.03 0.08 0.21 0.41 0.16 0.06 0.03 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Fritz       3231  3200   0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.04 0.12 0.26 0.23 0.15 0.09 0.05 0.03 0.02 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Xiphos      3218  3179   0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.03 0.10 0.25 0.23 0.16 0.10 0.06 0.04 0.02 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Texel       3204  3144   0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.03 0.10 0.16 0.19 0.17 0.13 0.09 0.06 0.03 0.02 0.01 0.00 0.00 0.00 0.00 0.00
Pedone      3194  3090   0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.02 0.07 0.12 0.16 0.18 0.15 0.12 0.08 0.05 0.03 0.01 0.00 0.00 0.00 0.00 0.00
Gull        3200  3184   0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.04 0.07 0.11 0.14 0.16 0.16 0.12 0.09 0.05 0.02 0.01 0.00 0.00 0.00 0.00
Vajolet     3177  3101   0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.03 0.06 0.10 0.15 0.17 0.16 0.13 0.09 0.05 0.03 0.01 0.00 0.00 0.00 0.00
Fizbo       3185  3259   0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.02 0.04 0.07 0.10 0.14 0.18 0.16 0.13 0.08 0.04 0.02 0.00 0.00 0.00
Laser       3167  3226   0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.03 0.05 0.07 0.11 0.15 0.18 0.18 0.12 0.07 0.03 0.00 0.00 0.00
Arasan      3152  3123   0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.02 0.04 0.06 0.10 0.14 0.19 0.19 0.14 0.08 0.04 0.00 0.00 0.00
Nemorino    3123  3099   0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.01 0.02 0.04 0.07 0.12 0.19 0.25 0.28 0.01 0.00 0.00
Wasp        3112  3041   0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.01 0.03 0.06 0.11 0.21 0.27 0.30 0.01 0.00 0.00
Ivanhoe     3116  3115   0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.02 0.03 0.06 0.11 0.19 0.26 0.30 0.01 0.00 0.00
Senpai      3028  3112   0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.02 0.76 0.17 0.04
Nirvana     2998  3186   0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.15 0.53 0.32
Crafty      2961  3013   0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.07 0.29 0.64
Yes, your table after 19/46 rounds was pretty ridiculous. Now, after 24/46 rounds, it looks more reasonable, probably not because your simulations have become more reasonable, but mostly because the current standings have started to look more reasonable. Sure, with more games played, even bad simulations will approach some level of confidence, as flukes in the standings start to dampen and fewer games remain to change things significantly. The problem, as with your previous simulations for TCEC, is that your predictions oscillate wildly early in the tournament, following the standings, while just by "gut feeling" I had 50% for SF, 30% for Houdini and 18% for Komodo to finish first after 19/46 games played in CCCC, and I have pretty much the same numbers now, after 24/46 games played. It's pretty normal to have some stability in the first part of the tournament, as the CCRL prior is very strong, and only later do real-life results start to show prominently, especially towards the end, when too few games remain for the prior to play a significant role. Towards the end I can have wild oscillations too, but remember the TCEC simulations, where early on you had 3% for Lc0 while I had 20%, and Lc0 later had a real 20% chance to win one of its last games against a not very strong engine (its last game or the one before it), but it drew it.
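A back-of-the-envelope way to see why fewer remaining games leave less room for the standings to change: the spread of an engine's remaining score grows only with the square root of the games left. The sketch below assumes independent games, a constant draw rate of 0.5 and an expected score of 0.5 per game; these numbers and the function name `remaining_score_sd` are illustrative assumptions, not taken from any actual simulator.

```python
import math

def remaining_score_sd(games_left, draw_rate=0.5, expected=0.5):
    """Standard deviation of the points still to be scored.

    Each game scores 1, 0.5 or 0. draw_rate and expected are assumed
    constants; games are treated as independent.
    """
    p_win = expected - draw_rate / 2.0
    p_loss = 1.0 - p_win - draw_rate
    if p_win < 0 or p_loss < 0:
        raise ValueError("draw_rate too high for this expected score")
    # Var(X) = E[X^2] - E[X]^2 for one game
    second_moment = p_win * 1.0 + draw_rate * 0.25
    var_per_game = second_moment - expected ** 2
    return math.sqrt(games_left * var_per_game)

if __name__ == "__main__":
    # 46 games per engine: 27 remain after round 19, 22 after round 24.
    for left in (27, 22, 5):
        print(left, round(remaining_score_sd(left), 2))
```

Comparing the value for 27 games left with the value for 5 games left shows how much less the final score can still move near the end.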
chrisw
Posts: 4317
Joined: Tue Apr 03, 2012 4:28 pm

Re: CCCC Rapid Rumble results simulator

Post by chrisw »

Laskos wrote: Sun Sep 09, 2018 2:50 pm
Yes, your table after 19/46 rounds was pretty ridiculous. Now, after 24/46 rounds, it looks more reasonable, probably not because your simulations have become more reasonable, but mostly because the current standings have started to look more reasonable. Sure, with more games played, even bad simulations will approach some level of confidence, as flukes in the standings start to dampen and fewer games remain to change things significantly. The problem, as with your previous simulations for TCEC, is that your predictions oscillate wildly early in the tournament, following the standings, while just by "gut feeling" I had 50% for SF, 30% for Houdini and 18% for Komodo to finish first after 19/46 games played in CCCC, and I have pretty much the same numbers now, after 24/46 games played. It's pretty normal to have some stability in the first part of the tournament, as the CCRL prior is very strong, and only later do real-life results start to show prominently, especially towards the end, when too few games remain for the prior to play a significant role. Towards the end I can have wild oscillations too, but remember the TCEC simulations, where early on you had 3% for Lc0 while I had 20%, and Lc0 later had a real 20% chance to win one of its last games against a not very strong engine (its last game or the one before it), but it drew it.
Well, you are what is known as "wrong". A simulation is a simulation. Let's spell it out:

Simulation table = [Results so far] + Weighted function of (elo-randomised results to come).

Results so far are fixed.
Elo-randomised results to come are dependent on elo differences, pretty much according to formula.
Elos are fixed at the start (and quite probably unreliable); how elo changes during the tournament is open to interpretation.
Weighting function is open to interpretation.

All there is to play with is the weighting function and the tournament elos.
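A minimal Python sketch of that recipe, under stated assumptions: the standard logistic Elo expectation, a fixed draw rate, random tie-breaks, and no weighting function or in-tournament elo adjustment at all (those are exactly the parts open to interpretation). The names `expected_score`, `play_game` and `simulate` are hypothetical, and the points in the usage example are made up.

```python
import random

def expected_score(elo_a, elo_b):
    """Standard logistic Elo expectation for A against B."""
    return 1.0 / (1.0 + 10.0 ** ((elo_b - elo_a) / 400.0))

def play_game(elo_a, elo_b, draw_rate, rng):
    """Sample one result for A: 1.0, 0.5 or 0.0.

    draw_rate is an assumed constant, capped so that the win and loss
    probabilities stay non-negative.
    """
    e = expected_score(elo_a, elo_b)
    p_draw = min(draw_rate, 2.0 * e, 2.0 * (1.0 - e))
    p_win = e - p_draw / 2.0
    r = rng.random()
    if r < p_win:
        return 1.0
    if r < p_win + p_draw:
        return 0.5
    return 0.0

def simulate(points, elos, remaining, draw_rate=0.5, runs=10000, seed=1):
    """Return {engine: [P(1st), P(2nd), ...]} estimated by Monte Carlo."""
    rng = random.Random(seed)
    names = list(points)
    counts = {n: [0] * len(names) for n in names}
    for _ in range(runs):
        total = dict(points)            # results so far are fixed
        for a, b in remaining:          # results to come are elo-randomised
            s = play_game(elos[a], elos[b], draw_rate, rng)
            total[a] += s
            total[b] += 1.0 - s
        # Sort by points; equal scores are ordered at random.
        order = sorted(names, key=lambda n: (total[n], rng.random()), reverse=True)
        for place, n in enumerate(order):
            counts[n][place] += 1
    return {n: [c / runs for c in counts[n]] for n in names}

if __name__ == "__main__":
    # Ratings are the Init column of the table; the points are invented.
    elos = {"Stockfish": 3439, "Komodo": 3404, "Houdini": 3400}
    points = {"Stockfish": 14.0, "Komodo": 14.5, "Houdini": 15.0}
    names = list(elos)
    remaining = [(a, b) for a in names for b in names if a != b] * 5
    for name, probs in simulate(points, elos, remaining).items():
        print(name, [round(p, 2) for p in probs])
```

Any weighting of results so far against the initial ratings would enter through the `elos` passed in, which is where the different opinions in this thread diverge.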
Since neither you nor anyone else has sufficient data, and the actual tournament result is going to be anecdotal, one of many possible outcomes, your attack adjectives, and all the other attack adjectives, are actually JFS.

There is no right, nor wrong, nor correct, nor accurate. It's a prediction about the future using quite unreliable data in the first place. And unlike, say, weather forecasts, where there are plenty of results to test models against and the input data is reliable, you have zero results to test against. And you never will have more than one anecdote at the end.

Why am I wasting my time, I ask myself.
JJJ
Posts: 1346
Joined: Sat Apr 19, 2014 1:47 pm

Re: CCCC Rapid Rumble results simulator

Post by JJJ »

Uri Blass wrote: Sun Sep 09, 2018 1:52 pm
It is easy to predict only that Stockfish, Houdini and Komodo are going to be the top 3 CPU programs, because there are a lot of rating lists at different time controls where they are the top 3.

Lc0 does not appear in many rating lists at different time controls, and nothing is easy to predict for her.
https://docs.google.com/spreadsheets/d/ ... 1707964751
There are many tests showing some Leela nets to be stronger than Fire and close to Stockfish 8. You can see some here: https://groups.google.com/forum/#!forum/lczero
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: CCCC Rapid Rumble results simulator

Post by Laskos »

chrisw wrote: Sun Sep 09, 2018 3:37 pm
Well, you are what is known as "wrong". A simulation is a simulation. Let's spell it out:

Simulation table = [Results so far] + Weighted function of (elo-randomised results to come).

Results so far are fixed.
Elo-randomised results to come are dependent on elo differences, pretty much according to formula.
Elos are fixed at the start (and quite probably unreliable); how elo changes during the tournament is open to interpretation.
Weighting function is open to interpretation.

All there is to play with is the weighting function and the tournament elos.
Since neither you nor anyone else has sufficient data, and the actual tournament result is going to be anecdotal, one of many possible outcomes, your attack adjectives, and all the other attack adjectives, are actually JFS.

There is no right, nor wrong, nor correct, nor accurate. It's a prediction about the future using quite unreliable data in the first place. And unlike, say, weather forecasts, where there are plenty of results to test models against and the input data is reliable, you have zero results to test against. And you never will have more than one anecdote at the end.

Why am I wasting my time, I ask myself.
It seems quite complicated, and I don't see the fruits of this computation if even you don't stand by your results. What is clear is that we have no good prior for Lc0. So your result, obtained with hardly any clear prior for the engines, is probably as good as any other only for Lc0.
Werewolf
Posts: 1796
Joined: Thu Sep 18, 2008 10:24 pm

Re: CCCC Rapid Rumble results simulator

Post by Werewolf »

Laskos wrote: Sun Sep 09, 2018 1:51 pm
Well, even if that is correct, it is also probable that the 32-core AMD Threadripper 2990WX is a bit slower than the 46-core machine used in CCCC. I just wanted to show that the conditions are not very unbalanced price-wise if one is on a limited budget but has 2-3 months of patience.
Well, that is certainly true. The Titans are overpriced and the V100s... don't get me started: a complete rip-off.

To me it seems odd that Nvidia included FP16 in the new gaming cards, because it eats right into the segment of their higher-end cards. Perhaps a new Titan is around the corner to pull ahead again...
chrisw
Posts: 4317
Joined: Tue Apr 03, 2012 4:28 pm

Re: CCCC Rapid Rumble results simulator

Post by chrisw »

Laskos wrote: Sun Sep 09, 2018 4:30 pm
It seems quite complicated, and I don't see the fruits of this computation if even you don't stand by your results. What is clear is that we have no good prior for Lc0. So your result, obtained with hardly any clear prior for the engines, is probably as good as any other only for Lc0.
We don't have "no good prior", we have an estimate. If it's badly out, then the tournament elo adjustment will compensate. In actuality, the Lc0 prior at 3300 doesn't seem too far off when ranked against the others.
The priors are by definition noisy (engine code changes, different hardware, insufficient test games, and so on). Tournament games, by definition, reduce the noise, so my algorithm factors them in, supposedly too fast for all these "critics". Well, we'll see.
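One simple way to picture "factoring in" tournament games is a weighted blend of the prior rating and the performance rating so far, with the weight on the tournament growing with the number of games played. This is only a sketch under assumptions, not the simulator's actual adjustment: `prior_games` is a hypothetical knob that treats the prior as if it were worth that many games, and the function names are invented for illustration.

```python
import math

def performance_rating(score, games, avg_opp_elo):
    """Elo at which the expected score equals the achieved score fraction."""
    frac = min(max(score / games, 0.01), 0.99)  # clamp to avoid infinities
    return avg_opp_elo - 400.0 * math.log10(1.0 / frac - 1.0)

def blended_elo(prior_elo, score, games, avg_opp_elo, prior_games=30):
    """Blend the prior with tournament performance.

    prior_games (hypothetical) says how many games the prior is worth:
    a small value trusts the tournament quickly, a large one slowly.
    """
    if games == 0:
        return prior_elo
    perf = performance_rating(score, games, avg_opp_elo)
    w = games / (games + prior_games)
    return (1.0 - w) * prior_elo + w * perf

if __name__ == "__main__":
    # Prior of 3300 as in the table; score and opposition are made up.
    for prior_games in (10, 30, 100):
        print(prior_games, round(blended_elo(3300, 14.0, 24, 3250, prior_games)))
```

How fast the blend should move is the point of disagreement: a small `prior_games` reacts quickly to the standings, a large one stays close to the rating-list prior for most of the tournament.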
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: CCCC Rapid Rumble results simulator

Post by Milos »

Laskos wrote: Sun Sep 09, 2018 1:39 pm RTX 2080Ti will probably have the full fp16 support, so it will be much faster than 1080Ti for Lc0, and 2 of these will probably be even more than what CCCC shows for Lc0 (it doesn't scale well to 4 GPUs).
I don't know the source of that nonsense, but 2080Ti doesn't have and will not have FP16 support. That's a fact.