future of top engines:how much more elo?

Discussion of anything and everything relating to chess playing software and machines.

Moderator: Ras

jp
Posts: 1482
Joined: Mon Apr 23, 2018 7:54 am

Re: future of top engines:how much more elo?

Post by jp »

carldaman wrote: Wed Jul 24, 2019 10:41 pm They are separate rating lists, meaning you can't compare those two ratings.
But people still do, even though they know they shouldn't. Not the two rating lists Ovyron mentioned, but e.g. human & computer rating lists.
Dann Corbit
Posts: 12778
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: future of top engines:how much more elo?

Post by Dann Corbit »

As long as you don't use the absolute numbers as a reference, comparing lists isn't silly.
There may be some minor differences due to scaling implementations, but for the most part, a carefully prepared ranking list will look a lot like another ranking list at a different time control or thread count.

Naturally, due to excellent or awful implementations of threading there can be some differences. But I guess that they are pretty rare.

IOW, if engine X is stronger than engine Y on list A, then chances are pretty good that it is also stronger on list B. I assume, of course, that the error bars are small enough that the rankings are not randomized to some degree, and that there is substantial LOS (likelihood of superiority).
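
For reference, LOS between two engines is often estimated from their head-to-head wins and losses with a normal approximation. The sketch below is one common form of that estimate, not necessarily what any particular rating list computes; the function name is hypothetical, and draws are ignored by assumption.

```python
import math

def likelihood_of_superiority(wins: int, losses: int) -> float:
    """Approximate probability that the first engine is the stronger one.

    Assumption: normal approximation on decisive games only (draws ignored).
    """
    decisive = wins + losses
    if decisive == 0:
        # No decisive games: no evidence either way.
        return 0.5
    return 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * decisive)))
```

With 60 wins against 40 losses this gives a LOS a little under 98%, while an even score gives exactly 50%.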
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
Graham Banks
Posts: 44204
Joined: Sun Feb 26, 2006 10:52 am
Location: Auckland, NZ

Re: future of top engines:how much more elo?

Post by Graham Banks »

carldaman wrote: Wed Jul 24, 2019 10:41 pm They are separate rating lists, meaning you can't compare those two ratings.

Unless you were joking, of course. :|
Correct.
The 40/40 and 40/4 lists are constructed from separate databases.

However, within each list, the 1CPU and 4CPU ratings can be compared.
gbanksnz at gmail.com
Uri Blass
Posts: 10805
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: future of top engines:how much more elo?

Post by Uri Blass »

carldaman wrote: Wed Jul 24, 2019 10:41 pm They are separate rating lists, meaning you can't compare those two ratings.

Unless you were joking, of course. :|

The idea is that people want ratings to measure playing strength, in which case a rating at a short time control should be lower than a rating at a long time control, because the level of play is lower.

Obviously that is not the case when you compare CCRL 40/40 and 40/4, and it is possible to change that by playing games at unequal time controls.
Ovyron
Posts: 4559
Joined: Tue Jul 03, 2007 4:30 am

Re: future of top engines:how much more elo?

Post by Ovyron »

carldaman wrote: Wed Jul 24, 2019 10:41 pm They are separate rating lists, meaning you can't compare those two ratings.
The whole point is that we should be able to compare them. For example, what is the answer to this question:

What is the weakest engine that needs a time control of 40/40 to reach the strength of 40/4 Stockfish 9?

That's an interesting question, and nobody knows. If the 40/4 list were correctly calibrated, we could answer this, and any other such question, at a glance.

So it looks like a flaw in their calibration, and it's easy to fix.
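
To make the "at a glance" point concrete: if both lists sat on one common scale, the question above would reduce to a simple lookup. A minimal sketch, assuming such a shared scale exists; the function name is hypothetical, and the engine names and ratings in the usage note are made-up placeholders, not CCRL data.

```python
from typing import Dict, Optional

def weakest_engine_reaching(target: float, slow_list: Dict[str, float]) -> Optional[str]:
    """Return the lowest-rated engine on the slow list whose rating is >= target.

    Assumption: `target` (e.g. a 40/4 rating) and `slow_list` (e.g. 40/40
    ratings) are expressed on the same, already calibrated scale.
    """
    candidates = {name: r for name, r in slow_list.items() if r >= target}
    if not candidates:
        return None  # nobody on the slow list reaches the target
    return min(candidates, key=candidates.get)
```

For example, with placeholder ratings `{"EngineA": 3000, "EngineB": 3200, "EngineC": 3350}` and a target of 3150, the lookup returns `"EngineB"`.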
Your beliefs create your reality, so be careful what you wish for.
Dann Corbit
Posts: 12778
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: future of top engines:how much more elo?

Post by Dann Corbit »

Ovyron wrote: Fri Jul 26, 2019 5:47 am
So it looks like a flaw in their calibration, and it's easy to fix.
Aside from playing a few hundred thousand games with some engines at the slow time control versus engines running the fast time control, I would be curious to know what your easy fix is.
Ovyron
Posts: 4559
Joined: Tue Jul 03, 2007 4:30 am

Re: future of top engines:how much more elo?

Post by Ovyron »

One should be enough, namely, the top one of the 40/4 list.

At the very least, playing 1000 games between that top engine and the ones from 40/40, getting a rating, and calibrating the entire 40/40 list to that rating would be better than what we have now. Much, much better.
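
A rough sketch of that single-anchor calibration, under simplifying assumptions: the anchor's performance is estimated from its overall score against the average rating of its 40/40 opponents using the standard logistic Elo curve, and the whole 40/40 list is then shifted by one constant. The function names are hypothetical, and real rating tools fit all games jointly rather than using this average-opponent shortcut.

```python
import math
from typing import Dict, List

def elo_diff_from_score(score: float) -> float:
    """Elo difference implied by a score fraction under the logistic Elo model."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)

def link_offset(anchor_fast_rating: float,
                opponent_slow_ratings: List[float],
                anchor_score: float) -> float:
    """Constant to add to every slow-list rating so that the anchor's
    performance against slow-list opponents equals its fast-list rating.

    Assumption: performance = average opponent rating + implied Elo difference.
    """
    avg_opp = sum(opponent_slow_ratings) / len(opponent_slow_ratings)
    performance = avg_opp + elo_diff_from_score(anchor_score)
    return anchor_fast_rating - performance

def shift_list(ratings: Dict[str, float], offset: float) -> Dict[str, float]:
    """Apply the same offset to every engine on a list."""
    return {name: r + offset for name, r in ratings.items()}
```

Because everything hangs on one engine's results, any matchup-specific quirks of that engine carry straight into the offset.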
Dann Corbit
Posts: 12778
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: future of top engines:how much more elo?

Post by Dann Corbit »

Ovyron wrote: Sat Jul 27, 2019 11:14 am One should be enough, namely, the top one of the 40/4 list.

At the very least, playing 1000 games between that top engine and the ones from 40/40, getting a rating, and calibrating the entire 40/40 list to that rating would be better than what we have now. Much, much better.
I don't think that approach works. If you play 1000 games (for instance) with SF at 40/20 and the second engine at 40/4, it would link the lists.

However, some engines have opponents that really clobber them (better than they should) and other engines that they clobber (better than they should), so the ranking you get would contain data from just the one cross tie.

We also already know relative strengths, and if the goal is better data, I do not think that there are any real shortcuts. So what did we really gain?

I actually think it would also add to the confusion.
I guess that people who saw SF with 4 threads rated at 3400 Elo would be surprised to see the same engine rated 175 Elo lower due to the other time control setting.
Ovyron
Posts: 4559
Joined: Tue Jul 03, 2007 4:30 am

Re: future of top engines:how much more elo?

Post by Ovyron »

Dann Corbit wrote: Sun Jul 28, 2019 2:58 am However, some engines have opponents that really clobber them (better than they should) and other engines that they clobber (better than they should), so the ranking you get would contain data from just the one cross tie.
Still way better than what we have now.
Dann Corbit wrote: Sun Jul 28, 2019 2:58 am We also already know relative strengths, and if the goal is better data, I do not think that there are any real shortcuts. So what did we really gain?
We think we know relative strengths, but it hasn't been tested, so how do we know?
Dann Corbit wrote: Sun Jul 28, 2019 2:58 am I actually think it would also add to the confusion.
I was genuinely confused by all this at first: with 40/4 showing a higher rating, I wrongly assumed 40/4 meant "40 minutes for 4 moves." Any change could only make things better.
jp
Posts: 1482
Joined: Mon Apr 23, 2018 7:54 am

Re: future of top engines:how much more elo?

Post by jp »

Ovyron wrote: Sun Jul 28, 2019 11:51 am
Dann Corbit wrote: Sun Jul 28, 2019 2:58 am We also already know relative strengths, and if the goal is better data, I do not think that there are any real shortcuts. So what did we really gain?
We think we know relative strengths, but it hasn't been tested, so how do we know?
Ideally, we'd have data to let us match (by time or nodes handicap) the playing strength of any two engines we want to play against each other, without just guessing. That seems way too big a job, though.