Ordo vs. Bayeselo


Michel
Posts: 2292
Joined: Mon Sep 29, 2008 1:50 am

Re: Ordo vs. Bayeselo

Post by Michel »

This discussion throws a lot of unrelated things together. Let me summarize what is relevant.

Statement of facts

(1) Adam's experiments show that both Ordo and BayesElo are good score predictors. This means that their underlying models (versions of the "elo model") are somewhat sound.

(2) So, in statistical terminology, both Ordo and BayesElo produce estimators for Elo-like ratings (which by (1) we can assume to exist).

(3) The only way to say which estimator is better is to compare their variances. This has not been done so the discussion in this long thread is actually void.

(4) BayesElo uses maximum likelihood estimation (MLE), which in theory produces the most efficient estimators if the model used is correct. Since the true model is unknown, this advantage of BayesElo may be void.


Conclusion: Unless somebody comes up with more data with regard to (3) there is nothing more to be said.
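
To make point (3) concrete, here is a minimal sketch of how the variance of a rating estimator could be measured by simulation. It is not Ordo or BayesElo: the estimator below is a hypothetical stand-in that simply inverts the 400 logistic on the observed score of a two-player match without draws, and names such as simulate_estimates are made up for illustration. The same loop could in principle be run around any real rating tool.

```python
import math
import random
import statistics

def elo_from_score(score):
    """Invert the usual 400 logistic: rating difference implied by a score fraction."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def simulate_estimates(true_diff, games, trials, seed=0):
    """Repeat a simulated match many times and collect the estimated difference.

    Assumption: decisive games only (no draws), and the true model is the logistic.
    """
    rng = random.Random(seed)
    p = 1.0 / (1.0 + 10.0 ** (-true_diff / 400.0))
    estimates = []
    for _ in range(trials):
        wins = sum(rng.random() < p for _ in range(games))
        # Keep the score strictly inside (0, 1) so the inversion stays finite.
        wins = min(max(wins, 0.5), games - 0.5)
        estimates.append(elo_from_score(wins / games))
    return estimates

estimates = simulate_estimates(true_diff=100.0, games=500, trials=200)
print("mean estimate:", statistics.mean(estimates))
print("variance     :", statistics.variance(estimates))
```

Comparing two estimators would then mean running both on the same simulated matches and comparing the two variances (and biases) at equal numbers of games.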
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Ordo vs. Bayeselo

Post by Laskos »

Michel wrote:
I have raised a legitimate concern I have with Bayeselo,
Adam:

I hope you have read my earlier mail on this, but let me reiterate.

Your concern is not valid (and there is no need for a "solution").

The scaled ratings produced by BayesElo do not fit the BayesElo model (which is your concern). This is normal. They are designed to fit the logistic model.

The unscaled ratings do fit the BayesElo model (as you observed in your experiments) but they are inflated with respect to the usual ratings. This is normal too since they measure something different. One could say they are not expressed in elos but in bayeselos. With this terminology the scale parameter converts bayeselos to elos.
I have the same understanding, but if it's all clear to you, could you say what "scale" I have to set in this case:
Ratings
A 2200
B 2400
C 2500
...
to have a prediction that B scores 75% against A? Ordo does this; how can I do the same with Bayeselo? It seems that neither the default nor "scale=1" gives this prediction.
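
For reference, the prediction asked about here is the usual 400 logistic. A small sketch of the arithmetic, using the three ratings from the example above (the function name expected_score is just an illustrative choice):

```python
def expected_score(diff, spread=400.0):
    """Expected score for a player rated `diff` points above the opponent,
    under the usual logistic Elo curve."""
    return 1.0 / (1.0 + 10.0 ** (-diff / spread))

ratings = {"A": 2200, "B": 2400, "C": 2500}

b_vs_a = expected_score(ratings["B"] - ratings["A"])
c_vs_b = expected_score(ratings["C"] - ratings["B"])
print(f"B vs A: {b_vs_a:.3f}")
print(f"C vs B: {c_vs_b:.3f}")
```

A 200-point gap comes out at roughly 76%, which is the "75%" figure quoted in the question.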
Daniel Shawul
Posts: 4186
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: Ordo vs. Bayeselo

Post by Daniel Shawul »

Kai, I think you simply don't want to accept the need for scaling. Here is my post trying to convince CCRL to go back to using the calculated scale. The difference in magnitude between Elostat and unscaled bayeselo ratings can be very big, so you will not be able to compare them without the scale. But using the calculated scale, they matched within one elo.
I ran a test of the effect of the scale and how it helps comparison with elostat, using scct data. The first run is elostat (i.e. the one within bayeselo). Then comes bayeselo with the calculated scale (which turned out to be 0.7), and finally the third one with scale = 1 as you use it now. Note that I didn't even need to calculate ratings again, because scale is a 'post-processing' parameter, much like offset. The ratings are magnified by 1/0.7 = 1.4 times; that is, a difference of 100 elo becomes 140 elo. Clearly list 1 and list 2 are comparable, while the third one has magnified values. Provide this example to the CCRL team and ask them if that is what they want. In my opinion it was good before, i.e. using the calculated scale (default bayeselo), but changing it to scale=1 has caused problems for no apparent advantage.

Summary:

Example comparison: Gull and Vitruvius
Elostat: 55 - 7 = 48 elo
Bayeselo default: 46 - (-3) = 49 elo
Bayeselo (scale = 1), as used right now in CCRL: 67 - (-4) = 71 elo

Clearly elostat and default bayeselo are comparable (a difference of ~49 elo between the two engines), but scale = 1 gives 71 elo. That is 1.4 x 50 = 70 elo, as I predicted.
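
The arithmetic in this example can be checked in a few lines. This is only a sketch, assuming the scale is a pure multiplicative post-processing factor as described above; the helper name rescale is made up for illustration and is not a BayesElo command.

```python
def rescale(diff, old_scale, new_scale):
    """Convert a rating difference reported at one scale to another scale.

    Assumption: scaled difference = scale * unscaled difference.
    """
    return diff * new_scale / old_scale

calculated_scale = 0.7        # value reported for the scct data
default_diff = 46 - (-3)      # Gull minus Vitruvius, default (calculated) scale

predicted_scale1_diff = rescale(default_diff, calculated_scale, 1.0)
observed_scale1_diff = 67 - (-4)

print(f"predicted at scale=1: {predicted_scale1_diff:.1f}")
print(f"observed at scale=1 : {observed_scale1_diff}")
```

The predicted and observed differences agree to within about one elo, which is consistent with rounding in the published lists.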
Modern Times
Posts: 3707
Joined: Thu Jun 07, 2012 11:02 pm

Re: Ordo vs. Bayeselo

Post by Modern Times »

Daniel Shawul wrote:Kai, I think you simply don't want to accept the need for scaling. Here is my post trying to convince CCRL to go back to using calculated scale.
CCRL are dropping "scale 1" and going back to default scaling.
Daniel Shawul
Posts: 4186
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: Ordo vs. Bayeselo

Post by Daniel Shawul »

Modern Times wrote:
Daniel Shawul wrote:Kai, I think you simply don't want to accept the need for scaling. Here is my post trying to convince CCRL to go back to using calculated scale.
CCRL are dropping "scale 1" and going back to default scaling.
Thanks Ray for confirming that.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Ordo vs. Bayeselo

Post by Laskos »

Daniel Shawul wrote:Kai, I think you simply don't want to accept the need for scaling. Here is my post trying to convince CCRL to go back to using the calculated scale. The difference in magnitude between Elostat and unscaled bayeselo ratings can be very big, so you will not be able to compare them without the scale. But using the calculated scale, they matched within one elo.
I ran a test of the effect of the scale and how it helps comparison with elostat, using scct data. The first run is elostat (i.e. the one within bayeselo). Then comes bayeselo with the calculated scale (which turned out to be 0.7), and finally the third one with scale = 1 as you use it now. Note that I didn't even need to calculate ratings again, because scale is a 'post-processing' parameter, much like offset. The ratings are magnified by 1/0.7 = 1.4 times; that is, a difference of 100 elo becomes 140 elo. Clearly list 1 and list 2 are comparable, while the third one has magnified values. Provide this example to the CCRL team and ask them if that is what they want. In my opinion it was good before, i.e. using the calculated scale (default bayeselo), but changing it to scale=1 has caused problems for no apparent advantage.

Summary:

Example comparison: Gull and Vitruvius
Elostat: 55 - 7 = 48 elo
Bayeselo default: 46 - (-3) = 49 elo
Bayeselo (scale = 1), as used right now in CCRL: 67 - (-4) = 71 elo

Clearly elostat and default bayeselo are comparable (a difference of ~49 elo between the two engines), but scale = 1 gives 71 elo. That is 1.4 x 50 = 70 elo, as I predicted.
Daniel, I don't care about EloStat, and I know EloStat gives wrong predictions. I don't want Bayeselo to adjust its scale to be in line with EloStat, but as Remi said it here and wrote on his site

2005.12.18:
New scale command to scale ratings. By default, maximum-likelihood ratings are now scaled down so that they look more like Elostat/SSDF ratings.


There must be a scale for which Bayeselo gives correct predictions according to the usual 400 logistic. Neither the default nor "scale=1" (if Larry is correct) seem to work.

Kai
Michel
Posts: 2292
Joined: Mon Sep 29, 2008 1:50 am

Re: Ordo vs. Bayeselo

Post by Michel »

Neither the default nor "scale=1" (if Larry is correct) seem to work.
Kai: if you reread the thread then you will see that no evidence has been produced that the default scale does not work.

Please stop stating things which have no basis in facts.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Ordo vs. Bayeselo

Post by Laskos »

Michel wrote:
Neither the default nor "scale=1" (if Larry is correct) seem to work.
Kai: if you reread the thread then you will see that no evidence has been produced that the default scale does not work.

Please stop stating things which have no basis in facts.
Adam's plots show that. Have you seen them? That was the problem to which Remi answered in that thread.
http://www.talkchess.com/forum/viewtopi ... o+bayeselo
Michel
Posts: 2292
Joined: Mon Sep 29, 2008 1:50 am

Re: Ordo vs. Bayeselo

Post by Michel »

Adam's plots show that. Have you seen them? That was the problem to which Remi answered in that thread.
http://www.talkchess.com/forum/viewtopi ... o+bayeselo
Thanks for the link. But I have already answered Adam. Please reread that. There is no "problem".
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Ordo vs. Bayeselo

Post by Laskos »

Michel wrote:
Adam's plots show that. Have you seen them? That was the problem to which Remi answered in that thread.
http://www.talkchess.com/forum/viewtopi ... o+bayeselo
Thanks for the link. But I have already answered Adam. Please reread that. There is no "problem".
I see a problem with the "default" ratings. A 200-point difference in Bayeselo default ratings does not predict a 75% performance. I am a bit puzzled reading your and Daniel's statements that this is irrelevant and that adjusting to the wrong EloStat is more important than giving 400 logistic predictions.

I will state: testing groups using EloStat and "default" Bayeselo (which is adjusted to match EloStat) give ratings compressed by some 10-30% compared to the usual 400 logistic. Ordo gives correct predictions, in accordance with the usual logistic.
Bayeselo can use the "scale" factor to give correct predictions, but people don't know how to use it.

Kai
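
The disagreement above can be illustrated with a small sketch of a draw-aware model of the BayesElo type next to the plain logistic. Several things here are assumptions rather than facts from the thread: the colour advantage is left out, the elo_draw value is hypothetical, and slope_matching_scale is one natural way to define a scale (match the steepness of the two curves at a zero rating difference), not a confirmed description of what BayesElo computes internally.

```python
def logistic(x):
    """Usual 400 logistic: expected score for a rating difference x."""
    return 1.0 / (1.0 + 10.0 ** (-x / 400.0))

def draw_model_expected(delta, elo_draw):
    """Expected score under a draw-aware model of the BayesElo type.

    delta is the unscaled difference ("bayeselos"); elo_draw is the draw
    parameter. Colour advantage is omitted for simplicity.
    """
    p_win = logistic(delta - elo_draw)
    p_loss = logistic(-delta - elo_draw)
    p_draw = 1.0 - p_win - p_loss
    return p_win + 0.5 * p_draw

def slope_matching_scale(elo_draw):
    """Scale that makes the draw-aware curve as steep as the logistic at delta = 0.

    Assumption: illustrative definition only.
    """
    x = 10.0 ** (elo_draw / 400.0)
    return 4.0 * x / (1.0 + x) ** 2

elo_draw = 200.0       # hypothetical draw parameter, not taken from the thread
scale = slope_matching_scale(elo_draw)
scaled_diff = 200.0    # a difference as it would appear in a scaled list

model_score = draw_model_expected(scaled_diff / scale, elo_draw)
logistic_score = logistic(scaled_diff)
print(f"scale={scale:.3f} model={model_score:.3f} logistic={logistic_score:.3f}")
```

Under these assumptions the two curves agree near a zero difference by construction and can drift apart for larger gaps, so whether a 200-point scaled difference lands on the logistic prediction depends on the draw parameter and on how the scale is chosen.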