But people still do, even though they know they shouldn't. Not the two ratings lists Ovyron mentioned, but e.g. human & computer ratings lists.
future of top engines:how much more elo?
Moderator: Ras
-
- Posts: 12778
- Joined: Wed Mar 08, 2006 8:57 pm
- Location: Redmond, WA USA
Re: future of top engines:how much more elo?
As long as you don't use the absolute numbers as a reference, comparing lists isn't silly.
There may be some minor differences due to scaling implementations, but for the most part, a carefully prepared ranking list will look a lot like another ranking list at a different time control or thread count.
Naturally, due to excellent or awful implementations of threading there can be some differences. But I guess that they are pretty rare.
IOW, if engine X is stronger than engine Y on list A, then chances are pretty good that it is also stronger on list B. I assume, of course, that error bars are small enough that the rankings are not randomized to some degree and there is substantial LOS.
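For readers unfamiliar with LOS (likelihood of superiority), here is a minimal sketch of one common approximation. It assumes a normal approximation over decisive games only, with draws ignored. The function name is made up for illustration and is not taken from any rating tool.

```python
import math


def likelihood_of_superiority(wins: int, losses: int) -> float:
    """Approximate probability that the first engine is the stronger one.

    Assumption: normal approximation on decisive games only; draws are
    ignored because they do not favor either side.
    """
    decisive = wins + losses
    if decisive == 0:
        # No decisive games: no evidence either way.
        return 0.5
    return 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * decisive)))


if __name__ == "__main__":
    # Hypothetical match result, not taken from any real list.
    print(round(likelihood_of_superiority(60, 40), 3))
```

With few decisive games the value stays near 0.5, which is the "rankings are randomized to some degree" situation mentioned above.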
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
-
- Posts: 44204
- Joined: Sun Feb 26, 2006 10:52 am
- Location: Auckland, NZ
Re: future of top engines:how much more elo?
Correct.
The 40/40 and 40/4 lists are constructed from separate databases.
However, within each list, the 1CPU and 4CPU ratings can be compared.
gbanksnz at gmail.com
-
- Posts: 10805
- Joined: Thu Mar 09, 2006 12:37 am
- Location: Tel-Aviv Israel
Re: future of top engines:how much more elo?
People want a rating to measure playing strength, in which case ratings at a short time control should be lower than ratings at a long time control, because the level of play is lower.
Comparing CCRL 40/40 and 40/4 shows that this is obviously not the case, and it could be changed by playing games at unequal time controls.
-
- Posts: 4559
- Joined: Tue Jul 03, 2007 4:30 am
Re: future of top engines:how much more elo?
The whole point is that we should be able to compare them, like, what is the answer to this question:
What is the weakest engine that needs a time control of 40/40 to reach the strength of 40/4 Stockfish 9?
That's an interesting question, and nobody knows. If the 40/4 list was correctly calibrated, we could answer this, and any other such question, at a glance.
So it looks like a flaw in their calibration, and it's easy to fix.
Your beliefs create your reality, so be careful what you wish for.
-
- Posts: 12778
- Joined: Wed Mar 08, 2006 8:57 pm
- Location: Redmond, WA USA
Re: future of top engines:how much more elo?
Aside from playing a few hundred thousand games with some engines at the slow time control versus engines running the fast time control, I would be curious to know what your easy fix is.
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
-
- Posts: 4559
- Joined: Tue Jul 03, 2007 4:30 am
Re: future of top engines:how much more elo?
One should be enough, namely, the top one of the 40/4 list.
At the very least, playing 1000 games between that top engine and the ones from the 40/40 list, getting a rating, and calibrating the entire 40/40 list to that rating would be better than what we have now. Much, much better.
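As a rough sketch of the arithmetic behind this idea: the standard logistic Elo formula turns a match score into a rating difference, and that difference gives a single offset for shifting one list onto the other's scale. The function names and all numbers below are hypothetical, and the performance rating uses the simple "average opponent rating plus difference" approximation rather than whatever method CCRL actually uses.

```python
import math


def elo_diff_from_score(score: float) -> float:
    """Elo difference implied by a score fraction (standard logistic model)."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)


def calibration_offset(anchor_rating_fast: float,
                       opponent_ratings_slow: list[float],
                       anchor_score: float) -> float:
    """Offset to add to every slow-list rating so both lists share one scale.

    Assumption: the anchor engine plays at the fast time control against
    slow-list engines playing at the slow time control, and its performance
    is approximated as average opponent rating plus the implied difference.
    """
    avg_opponent = sum(opponent_ratings_slow) / len(opponent_ratings_slow)
    performance_on_slow_scale = avg_opponent + elo_diff_from_score(anchor_score)
    return anchor_rating_fast - performance_on_slow_scale


if __name__ == "__main__":
    # Hypothetical numbers for illustration only.
    print(calibration_offset(3400.0, [3500.0, 3600.0], 0.5))
```

A single anchor gives one offset for the whole list, so any quirk in how that one engine scores against particular opponents goes straight into the calibration.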
Your beliefs create your reality, so be careful what you wish for.
-
- Posts: 12778
- Joined: Wed Mar 08, 2006 8:57 pm
- Location: Redmond, WA USA
Re: future of top engines:how much more elo?
Ovyron wrote: ↑Sat Jul 27, 2019 11:14 am
One should be enough, namely, the top one of the 40/4 list.
At the very least, playing 1000 games between that top engine and the ones from the 40/40 list, getting a rating, and calibrating the entire 40/40 list to that rating would be better than what we have now. Much, much better.

I don't think that approach works. If you play 1000 games (for instance) with SF at 40/20 and using 40/4 time control for the second engine, it would link the lists.
However, some engines have opponents that really clobber them (better than they should) and other engines that they clobber (better than they should), so the ranking you get would contain data from just the one cross tie.
We also already know relative strengths, and if the goal is better data I do not think that there are any real shortcuts. So what did we really gain?
I actually think it would add to the confusion, too.
I guess that people who saw SF with 4 threads rated at 3400 Elo would be surprised to see the same engine rated 175 Elo lower due to the other time control setting.
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
-
- Posts: 4559
- Joined: Tue Jul 03, 2007 4:30 am
Re: future of top engines:how much more elo?
Dann Corbit wrote: ↑Sun Jul 28, 2019 2:58 am
However, some engines have opponents that really clobber them (better than they should) and other engines that they clobber (better than they should), so the ranking you get would contain data from just the one cross tie.

Still way better than what we have now.

Dann Corbit wrote: ↑Sun Jul 28, 2019 2:58 am
We also already know relative strengths, and if the goal is better data I do not think that there are any real shortcuts. So what did we really gain?

We think we know relative strengths, but it hasn't been tested, so how do we know?
I was genuinely confused by all this at first: with 40/4 showing the higher rating, I wrongly assumed 40/4 meant "40 minutes for 4 moves." Any change could only make things better.
Your beliefs create your reality, so be careful what you wish for.
-
- Posts: 1482
- Joined: Mon Apr 23, 2018 7:54 am
Re: future of top engines:how much more elo?
Ovyron wrote: ↑Sun Jul 28, 2019 11:51 am
Dann Corbit wrote: ↑Sun Jul 28, 2019 2:58 am
We also already know relative strengths, and if the goal is better data I do not think that there are any real shortcuts. So what did we really gain?
We think we know relative strengths, but it hasn't been tested, so how do we know?

Ideally, we'd have data to let us match (by time or nodes handicap) the playing strength of any two engines we want to play against each other, without just guessing. That seems way too big a job, though.