> Ovyron wrote: ↑Sun Jul 28, 2019 11:51 am
>> Dann Corbit wrote: ↑Sun Jul 28, 2019 2:58 am
>> However, some engines have opponents that really clobber them (better than they should) and other engines that they clobber (better than they should) and so the ranking you get would contain data from just the one cross tie. So what did we really gain?
> Still way better than what we have now.
>> Dann Corbit wrote: ↑Sun Jul 28, 2019 2:58 am
>> We also already know relative strengths, and if the goal is better data I do not think that there are any real short cuts.
> We think we know relative strengths, but it hasn't been tested, so how do we know?
> I was genuinely confused by all this at first: with 40/4 showing the higher ratings, I wrongly assumed 40/4 meant "40 minutes for 4 moves." Any change could only make things better.
Oh, why so much mystification? Calibrate both lists to the same engine's fixed rating, an engine well connected in the pool of engines and having many games. I am not sure, maybe CCRL already does that. Then add some 200 Elo points to the longer TC 40/40 list compared to the 40/4 list. Then the ratings can roughly be compared in some absolute CCRL Elo points valid for both the 40/4 and 40/40 lists. Rough method, but pretty solid.
future of top engines:how much more elo?
Moderator: Ras
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: future of top engines:how much more elo?
-
- Posts: 4558
- Joined: Tue Jul 03, 2007 4:30 am
Re: future of top engines:how much more elo?
This is the key point: we need to figure out what this number is, and after doing so, we could even mix the lists and add [40/40]/[40/4] to the engines' monikers, so they can be compared directly on a list, just like 1CPU and 4CPU can be compared on a list.
(caveat: 40/40 ratings seem really overrated, especially as it's equivalent to blitz on some machines, so I'd rather have 200 subtracted from the 40/4 one)
But, as the laziest solution, just adding the rating points to the 40/40 list (without any further testing) would be much better than what we have now. So that's an easy fix.
Your beliefs create your reality, so be careful what you wish for.
-
- Posts: 1482
- Joined: Mon Apr 23, 2018 7:54 am
Re: future of top engines:how much more elo?
Yes, if numbers to translate between different lists aren't figured out, people just assume numbers, which might be totally wrong.
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: future of top engines:how much more elo?
I am very sorry you have huge difficulties translating a 10x time factor into an average Elo shift across the engines from 40/4 to 40/40 in CCRL conditions. I seem to have fewer difficulties, and I estimate it to be 200 +/- 30 Elo points. The sole serious problem is that the scale of differences between the two lists might not be quite a factor of 1.0, but say 0.9. Then the translation from one list to the other might look like Elo2 - 2800 = 0.9 x (Elo1 - 2800) + 200, for example. Sure, still a rough result, but a very easy translation. Playing hundreds of thousands of games to mix the time controls is basically building a new rating list, a huge endeavor and almost an idiotic one.
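For illustration, the example translation above can be sketched in a few lines of Python. The function names are hypothetical, and the defaults (scale 0.9, offset 200, pivot 2800) are just the example figures from this post, not measured values.

```python
def to_40_40(elo_40_4, scale=0.9, offset=200.0, pivot=2800.0):
    """Map a 40/4 rating (Elo1) to the 40/40 scale (Elo2) using
    Elo2 - pivot = scale * (Elo1 - pivot) + offset."""
    return pivot + scale * (elo_40_4 - pivot) + offset


def to_40_4(elo_40_40, scale=0.9, offset=200.0, pivot=2800.0):
    """Inverse of to_40_40: recover the 40/4 rating from a 40/40 rating."""
    return pivot + (elo_40_40 - pivot - offset) / scale


if __name__ == "__main__":
    # Illustrative inputs only, not actual CCRL ratings.
    for elo1 in (2600, 2800, 3000):
        print(elo1, "->", round(to_40_40(elo1), 1))
```

With a scale of 1.0 this reduces to simply adding the offset, which is the "lazy" calibration discussed earlier in the thread.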
-
- Posts: 4558
- Joined: Tue Jul 03, 2007 4:30 am
Re: future of top engines:how much more elo?
We shouldn't care about scale factors, but move quality, if there's some 40/40 engine that plays at the same strength as one on the 40/4, then it'd be able to appear on the 40/40 list without disrupting anything (no more disruption than increasing the 40/40 engine that plays at that strength anyway).
What I'm saying is that 40/40 allows engines with 1CPU to play engines with 4CPU with no problems, we're not restricting engines to only play others with the same CPU and then have 1CPU and 4CPU lists that can't be compared (where 1CPU engines appear with higher rating than 4CPU...) It's the same thing with time control so it'd make sense to have a single list that shows ratings for 40/40 and 40/4 where they can be compared.
Even if it's only done with 1 40/20 engine and we assume a scale factor of 1 (which could be wrong), and it ends with 200 elo difference (so Laskos can say "you just wasted testing time! told you so!"), using it to compare the rating lists would be much better than what we have now.
But don't remain static just because the best solution would be idiotic to implement, the simplest solution (calibrating to -200) that improves the situation is worth implementing.
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: future of top engines:how much more elo?
> Ovyron wrote: ↑Wed Jul 31, 2019 4:52 pm
> We shouldn't care about scale factors, but move quality, if there's some 40/40 engine that plays at the same strength as one on the 40/4, then it'd be able to appear on the 40/40 list without disrupting anything (no more disruption than increasing the 40/40 engine that plays at that strength anyway).
> What I'm saying is that 40/40 allows engines with 1CPU to play engines with 4CPU with no problems, we're not restricting engines to only play others with the same CPU and then have 1CPU and 4CPU lists that can't be compared (where 1CPU engines appear with higher rating than 4CPU...) It's the same thing with time control so it'd make sense to have a single list that shows ratings for 40/40 and 40/4 where they can be compared.
> Even if it's only done with 1 40/20 engine and we assume a scale factor of 1 (which could be wrong), and it ends with 200 elo difference (so Laskos can say "you just wasted testing time! told you so!"), using it to compare the rating lists would be much better than what we have now.
> But don't remain static just because the best solution would be idiotic to implement, the simplest solution (calibrating to -200) that improves the situation is worth implementing.
I don't understand what you say. I am on the phone now and will be for some time. On a computer, one can do a linear regression in 10 minutes by, say, picking 20 engines behaving regularly, assuming that each regular engine is about 200 Elo points stronger at 40/40 than at 40/4.
One will get an Elo2 = a*Elo1 + b relationship between the two lists, and with that linear map one can easily build a common list. One would probably also get some 20 Elo points of additional methodological error, but keeping in mind that we usually have 10-20 Elo point margins on both lists anyway, that's not such a grave issue. I prefer doing 10 minutes of work to years of pretty redundant tests.
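A minimal sketch of the regression idea, assuming ordinary least squares on engines that appear on both lists. The helper name `fit_line` is hypothetical and the four rating pairs are invented for illustration (the post suggests about 20 engines); they are not CCRL data.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = a * xs + b; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


# Made-up ratings of the same engines on both lists (NOT real CCRL numbers).
elo_40_4 = [2600, 2800, 3000, 3200]    # Elo1, 40/4 list
elo_40_40 = [2820, 3000, 3180, 3360]   # Elo2, 40/40 list

a, b = fit_line(elo_40_4, elo_40_40)


def common_rating(elo1):
    """Place a 40/4 rating on the 40/40 scale using the fitted line."""
    return a * elo1 + b


if __name__ == "__main__":
    print(f"Elo2 = {a:.3f} * Elo1 + {b:.1f}")
```

On real data the residuals of the fit would give a rough idea of the extra methodological error mentioned above.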
-
- Posts: 10803
- Joined: Thu Mar 09, 2006 12:37 am
- Location: Tel-Aviv Israel
Re: future of top engines:how much more elo?
The point is that the assumption of a 200 +/- 20 Elo difference between 40/40 and 40/4 is not something that we know to be proved by games at the relevant time control.
I do not know if it is 200 Elo or 150 Elo or 250 Elo.
Maybe people already played games to find the difference to be 95% certain it is 180-220 Elo, but I do not know about them.
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: future of top engines:how much more elo?
> Uri Blass wrote: ↑Wed Jul 31, 2019 7:03 pm
> The point is that the assumption of 200 elo+-20 elo difference between 40/40 and 40/4 is not something that we know to be proved by games at the relevant time control.
> I do not know if it is 200 elo or 150 elo or 250 elo.
> Maybe people already played games to find the difference to be 95% certain it is 180-220 elo but I do not know about them.
Probably 200 +/- 30 or so. No great mystery, there is a plethora of studies on that even on this forum, and tests of Andreas and others discussed here. If people have difficulties remembering anything related to numbers, I don't think they need rating lists.
70 first doubling
60 second doubling
55 third doubling
15 for 1.25 final factor
==============
about 200 Elo points for a factor of 10 from CCRL 40/4 to 40/40. Give or take 20, at most 30 Elo points.
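The arithmetic in that breakdown can be checked with a short sketch: three doublings plus a final 1.25x step multiply out to the 10x time factor, and the per-step gains sum to the quoted total. The per-step Elo figures are the estimates from this post, not measurements made here.

```python
import math

# (time factor, estimated Elo gain) per step, as listed in the post.
steps = [
    (2.0, 70),    # first doubling
    (2.0, 60),    # second doubling
    (2.0, 55),    # third doubling
    (1.25, 15),   # final factor
]

total_factor = math.prod(factor for factor, _ in steps)
total_elo = sum(gain for _, gain in steps)

if __name__ == "__main__":
    print(total_factor)  # 10.0
    print(total_elo)     # 200
```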
-
- Posts: 4558
- Joined: Tue Jul 03, 2007 4:30 am
Re: future of top engines:how much more elo?
I guess all these discussions are useless, the rating lists are built from volunteer work and what those volunteers want to test (that's why Stockfish 9 tops the 40/4 list...)
What we'd need to do is convince one of those testers to do the engine 40/40 vs. 40/4 test thing, hopefully one of them is reading...
-
- Posts: 919
- Joined: Sat May 31, 2014 8:28 am
Re: future of top engines:how much more elo?
> Ovyron wrote: ↑Wed Jul 31, 2019 10:57 pm
> I guess all these discussions are useless, the rating lists are built from volunteer work and what those volunteers want to test (that's why Stockfish 9 tops the 40/4 list...)
> What we'd need to do is convincing one of those testers to do the engine 40/40 vs. 40/4 test thing, hopefully one of them is reading...
Like Kai said, it's not worth their efforts. Just do the math and be content that it's that easy. If you want more precision you'll have to run the tests yourself!
Regards,
Zenmastur
Only 2 defining forces have ever offered to die for you.....Jesus Christ and the American Soldier. One died for your soul, the other for your freedom.