Ordo v0.5 (Release)

Discussion of chess software programming and technical issues.

Moderators: hgm, Dann Corbit, Harvey Williamson

michiguel
Posts: 6401
Joined: Thu Mar 09, 2006 8:30 pm
Location: Chicago, Illinois, USA

Re: Thank you very much!

Post by michiguel »

Daniel Shawul wrote:
Ratings are less compressed than in BayesElo, as many people have noted since Ordo 0.2 release; I do not find anything wrong with either algorithm, they are just different, that is all! Just for comparison, here is BayesElo output of the same PGN, courtesy of Adam (download link):
Please stop making such ridiculous claims! First understand why the scaling is applied, which they both do anyway. Now Ordo has added white advantage, to the curious. They both do maximum likelihood estimates, so the algorithm is still a subset of BayesElo. It still needs to do a lot more to catch up with BayesElo. That is the fact. Draw model, prior, LOS etc. Don't make it a popularity contest by saying this is excellent, this is very good, blah blah.
For the record, Ordo is not trying to catch up with anything. It is doing things in a way that makes the author happy :-). BTW, Ordo calculates errors in a completely different way (after simulations), and it does not have that LOS parameter because it calculates the SD of the head-to-head matches. For my personal use, I find that more useful.

Miguel
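Ordo's exact procedure is not spelled out above, but the general idea of simulation-based error bars can be sketched roughly: resimulate a head-to-head match many times under a fixed result model and take the SD of the simulated scores. A minimal sketch only; the win/draw probabilities and game counts below are made-up illustration, not Ordo's actual code or defaults:

```python
import random
import statistics

def simulate_match(p_win, p_draw, n_games, rng):
    """Play out one simulated head-to-head match; return the score fraction."""
    score = 0.0
    for _ in range(n_games):
        r = rng.random()
        if r < p_win:
            score += 1.0          # win
        elif r < p_win + p_draw:
            score += 0.5          # draw
    return score / n_games

def score_sd_by_simulation(p_win, p_draw, n_games, n_sims=1000, seed=1):
    """Estimate the SD of the match score by resimulating it many times."""
    rng = random.Random(seed)
    scores = [simulate_match(p_win, p_draw, n_games, rng)
              for _ in range(n_sims)]
    return statistics.stdev(scores)

# A 200-game match with a 55% expected score (40% wins, 30% draws):
sd = score_sd_by_simulation(p_win=0.40, p_draw=0.30, n_games=200)
print(round(sd, 3))  # close to the analytic value of about 0.029
```

The simulated SD agrees with the analytic per-game variance divided by the number of games, which is the point of the method: it keeps working even when no closed-form error formula is convenient.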
Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: Thank you very much!

Post by Daniel Shawul »

michiguel wrote:
Daniel Shawul wrote:
Ratings are less compressed than in BayesElo, as many people have noted since Ordo 0.2 release; I do not find anything wrong with either algorithm, they are just different, that is all! Just for comparison, here is BayesElo output of the same PGN, courtesy of Adam (download link):
Please stop making such ridiculous claims! First understand why the scaling is applied, which they both do anyway. Now Ordo has added white advantage, to the curious. They both do maximum likelihood estimates, so the algorithm is still a subset of BayesElo. It still needs to do a lot more to catch up with BayesElo. That is the fact. Draw model, prior, LOS etc. Don't make it a popularity contest by saying this is excellent, this is very good, blah blah.
For the record, Ordo is not trying to catch up with anything. It is doing things in a way that makes the author happy :-). BTW, Ordo calculates errors in a completely different way (after simulations), and it does not have that LOS parameter because it calculates the SD of the head-to-head matches. For my personal use, I find that more useful.

Miguel
Well, it is my opinion that BayesElo is far more advanced than the rest, and YET that has been taken against it to claim it compresses ratings etc... User doesn't know how to use the tool, user blames the tool. I have nothing against additional rating tools (but I know how this would sound after the whole day I spent arguing against them). It is just that some very wrong notion (as demonstrated by Jesús's post) got into people that BayesElo compresses ratings, which has been used against it to praise others. For me it is not a popularity contest, but objective discussions, like what the algorithmic differences are between the two, would show which is better. Oh, BTW, BayesElo has simulation tools and a whole lot of other fancy things.
jshriver
Posts: 1342
Joined: Wed Mar 08, 2006 9:41 pm
Location: Morgantown, WV, USA

Re: Ordo v0.5 (Release)

Post by jshriver »

Very nice, keep up the great work!

-Josh
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Thank you very much!

Post by Laskos »

Daniel Shawul wrote:
michiguel wrote:
Daniel Shawul wrote:
Ratings are less compressed than in BayesElo, as many people have noted since Ordo 0.2 release; I do not find anything wrong with either algorithm, they are just different, that is all! Just for comparison, here is BayesElo output of the same PGN, courtesy of Adam (download link):
Please stop making such ridiculous claims! First understand why the scaling is applied, which they both do anyway. Now Ordo has added white advantage, to the curious. They both do maximum likelihood estimates, so the algorithm is still a subset of BayesElo. It still needs to do a lot more to catch up with BayesElo. That is the fact. Draw model, prior, LOS etc. Don't make it a popularity contest by saying this is excellent, this is very good, blah blah.
For the record, Ordo is not trying to catch up with anything. It is doing things in a way that makes the author happy :-). BTW, Ordo calculates errors in a completely different way (after simulations), and it does not have that LOS parameter because it calculates the SD of the head-to-head matches. For my personal use, I find that more useful.

Miguel
Well, it is my opinion that BayesElo is far more advanced than the rest, and YET that has been taken against it to claim it compresses ratings etc... User doesn't know how to use the tool, user blames the tool.
But you have to admit that the default BayesElo for many years did compress the Elo ratings (if they are Elo), and the "scale" parameter was unknown to most users. When Rémi was asked repeatedly what happens with the BayesElo rating compared to the averaged individual performances, the reply was not convincing, and only recently did he show that the "scale" parameter solves the issue (now one can roughly take the weighted average of individual performances and get a rating similar to the "scale 1" rating). So, BayesElo was a very good tool used pretty wrongly by most in building rating lists, giving them some Elos which were not really Elos.

Kai
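Kai's point about averaging individual performances can be checked against the classic logistic performance-rating formula: average opponent rating plus the Elo difference implied by the score fraction. A minimal sketch of that textbook formula, not what BayesElo or Ordo literally compute:

```python
import math

def performance_rating(opponent_ratings, score_fraction):
    """Average opponent rating plus the Elo offset implied by the score
    under the logistic curve: diff = 400 * log10(s / (1 - s))."""
    avg = sum(opponent_ratings) / len(opponent_ratings)
    s = min(max(score_fraction, 1e-6), 1.0 - 1e-6)  # guard 0% and 100% scores
    return avg + 400.0 * math.log10(s / (1.0 - s))

# 75% score against opponents averaging 2650:
perf = performance_rating([2700, 2650, 2600], 0.75)
print(round(perf))  # → 2841
```

A 75% score maps to a +191 Elo offset over the average opposition, which is the kind of individual performance whose weighted average, per Kai's observation, roughly matches a "scale 1" rating.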
Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: Thank you very much!

Post by Daniel Shawul »

But you have to admit that the default BayesElo for many years did compress the Elo ratings (if they are Elo), and the "scale" parameter was unknown to most users. When Rémi was asked repeatedly what happens with the BayesElo rating compared to the averaged individual performances, the reply was not convincing, and only recently did he show that the "scale" parameter solves the issue (now one can roughly take the weighted average of individual performances and get a rating similar to the "scale 1" rating). So, BayesElo was a very good tool used pretty wrongly by most in building rating lists, giving them some Elos which were not really Elos.

Kai
Kai, at least I feel like you are someone I can argue and discuss with, and at the end of the day part ways peacefully even if we don't agree. That is a compliment :)
On the topic at hand, I, like you, at first found the use of the scale a bit confusing, because I thought: OK, we did these long calculations only to multiply the result by a factor!? The reason it was added was that people requested it; Rémi didn't have it originally. Even after that you may need to adjust the scale manually further to get the maximum fit with what people expect (matching the slope at 0 is just a good approximation). So if mm calculated a scale of 0.8, it means using scale=1 magnifies the ratings by 1.25x. The problem is much worse with a modified draw model I have, so I thought this was going to be really difficult to convince others to use without scaling. I think we both agree more or less on the dilemma and the current situation anyway. I would be more than happy to learn about the benefits of Ordo or other tools, but I know that sounds like empty words coming from me right now...
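The 0.8-versus-1.25x arithmetic above is just a linear stretch of the rating deviations around some anchor point. A minimal sketch; the anchor choice here (the pool mean) is an assumption, and any fixed reference point works the same way:

```python
def rescale(ratings, factor, anchor=None):
    """Stretch rating deviations around an anchor; an offset would instead
    add the same constant to every rating. Note 1 / 0.8 = 1.25."""
    if anchor is None:
        anchor = sum(ratings) / len(ratings)  # assumed anchor: pool mean
    return [anchor + (r - anchor) * factor for r in ratings]

ratings = [2900, 2800, 2700]        # spread as fitted with scale 0.8
print(rescale(ratings, 1.25))       # → [2925.0, 2800.0, 2675.0]
```

The 100-point gaps become 125-point gaps: the rating differences are magnified by 1.25x while the anchor stays put, which is why a fit reported at scale 0.8 looks "compressed" next to the same fit shown at scale 1.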
Adam Hair
Posts: 3226
Joined: Wed May 06, 2009 10:31 pm
Location: Fuquay-Varina, North Carolina

Re: Thank you very much!

Post by Adam Hair »

Daniel Shawul wrote:
michiguel wrote:
Daniel Shawul wrote:
Ratings are less compressed than in BayesElo, as many people have noted since Ordo 0.2 release; I do not find anything wrong with either algorithm, they are just different, that is all! Just for comparison, here is BayesElo output of the same PGN, courtesy of Adam (download link):
Please stop making such ridiculous claims! First understand why the scaling is applied, which they both do anyway. Now Ordo has added white advantage, to the curious. They both do maximum likelihood estimates, so the algorithm is still a subset of BayesElo. It still needs to do a lot more to catch up with BayesElo. That is the fact. Draw model, prior, LOS etc. Don't make it a popularity contest by saying this is excellent, this is very good, blah blah.
For the record, Ordo is not trying to catch up with anything. It is doing things in a way that makes the author happy :-). BTW, Ordo calculates errors in a completely different way (after simulations), and it does not have that LOS parameter because it calculates the SD of the head-to-head matches. For my personal use, I find that more useful.

Miguel
Well, it is my opinion that BayesElo is far more advanced than the rest, and YET that has been taken against it to claim it compresses ratings etc... User doesn't know how to use the tool, user blames the tool. I have nothing against additional rating tools (but I know how this would sound after the whole day I spent arguing against them). It is just that some very wrong notion (as demonstrated by Jesús's post) got into people that BayesElo compresses ratings, which has been used against it to praise others. For me it is not a popularity contest, but objective discussions, like what the algorithmic differences are between the two, would show which is better. Oh, BTW, BayesElo has simulation tools and a whole lot of other fancy things.
To be fair, I was using the term "compress" when I was comparing the three rating programs. It was not until Rémi mentioned the scale parameter (which I had forgotten about :( ) that I realized my mistake and stopped using the word "compress". So, it is my fault if others read my comparisons and used the term.
Michel
Posts: 2271
Joined: Mon Sep 29, 2008 1:50 am

Re: Thank you very much!

Post by Michel »

I have no idea what name to give it,


Ok. Abstractly, your method is as follows. Given a distribution with unknown parameters, you define a number of statistics, defined on samples, equal to the number of unknown parameters.

Then you match the computed expectation values of the statistics to the observed ones for a given sample.

In your case the unknown parameters are the Elos of the players, and the statistics are the observed scores of each player.
but I think it is similar to ML.


No, it is similar to the method of moments. In the method of moments the statistics mentioned above are the moments. But for the logistic distribution, by some accident of nature, your method happens to be equivalent to ML...
In fact, please correct me since I am not a mathematician, you cannot do ML because there is no set of parameters that will guarantee a prediction (most likely outcome) that will match the results you have (that is the definition of ML if I am not wrong).
You can certainly do ML. In fact, BayesElo does it. Basically, ML picks the Elos so that the probability of the actually observed game outcomes is maximal among all possible Elos. And this is well defined, regardless of how bizarre (e.g. intransitive) these outcomes might be (this just means that the probability of the observation will be very small).

This is the frequentist approach to ML. The Bayesian version without a prior is mathematically equivalent. The Bayesian version of ML is slightly more general in the sense that it allows for the inclusion of a prior in a non-ad-hoc way (but of course the prior itself is ad hoc...).

The reason people like ML is that it is asymptotically optimal (i.e. the variance of an ML estimator approaches the theoretically minimal value for large samples). And of course it is also domain independent.

If you write things out, you see that ML estimation leads to a system of equations very similar to the one you obtain.
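That system of equations can be made concrete: under the logistic model, maximizing the log-likelihood drives each player's expected score toward their observed score. A minimal win/loss sketch by gradient ascent (draws omitted for brevity; this illustrates plain ML, not BayesElo's actual algorithm or its draw model):

```python
import math

def fit_elo_ml(games, players, iters=2000, lr=100.0):
    """Maximum-likelihood Elo fit for win/loss games under
    P(i beats j) = 1 / (1 + 10**((R_j - R_i) / 400)),
    by gradient ascent on the log-likelihood, anchored to mean 0."""
    R = {p: 0.0 for p in players}
    c = math.log(10) / 400.0
    for _ in range(iters):
        grad = {p: 0.0 for p in players}
        for winner, loser in games:
            p = 1.0 / (1.0 + 10 ** ((R[loser] - R[winner]) / 400.0))
            grad[winner] += (1.0 - p) * c   # d(log L) / dR_winner
            grad[loser] -= (1.0 - p) * c
        for name in players:
            R[name] += lr * grad[name]
        mean = sum(R.values()) / len(R)
        for name in players:                # fix the arbitrary offset
            R[name] -= mean
    return R

# A beats B 3-1, B beats C 3-1; each gap should approach 400*log10(3), about 191
games = [("A", "B")] * 3 + [("B", "A")] + [("B", "C")] * 3 + [("C", "B")]
R = fit_elo_ml(games, ["A", "B", "C"])
print(R["A"] > R["B"] > R["C"])  # → True
```

Setting each gradient to zero gives exactly the "expected score equals observed score" equations, which is why, for the logistic curve, the score-matching method above coincides with ML.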
Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: Thank you very much!

Post by Daniel Shawul »

To be fair, I was using the term "compress" when I was comparing the three rating programs. It was not until Rémi mentioned the scale parameter (which I had forgotten about) that I realized my mistake and stopped using the word "compress". So, it is my fault if others read my comparisons and used the term.
Adam, I have no doubt that you and many others understood it right after Rémi mentioned the scaling. But many, like Jesús here, seem to have missed that fact and use it to praise other tools. The fact is, scale is a factor much like offset. If I said a rating calculated with an offset of 2500 is more magnified than one calculated with 2300, it would be gigantic nonsense. Saying something similar about the scale is still nonsense, maybe not to the same degree.
Ajedrecista
Posts: 1952
Joined: Wed Jul 13, 2011 9:04 pm
Location: Madrid, Spain.

Re: Thank you very much!

Post by Ajedrecista »

Hello again:
Daniel Shawul wrote:
To be fair, I was using the term "compress" when I was comparing the three rating programs. It was not until Rémi mentioned the scale parameter (which I had forgotten about) that I realized my mistake and stopped using the word "compress". So, it is my fault if others read my comparisons and used the term.
Adam, I have no doubt that you and many others understood it right after Rémi mentioned the scaling. But many, like Jesús here, seem to have missed that fact and use it to praise other tools. The fact is, scale is a factor much like offset. If I said a rating calculated with an offset of 2500 is more magnified than one calculated with 2300, it would be gigantic nonsense. Saying something similar about the scale is still nonsense, maybe not to the same degree.
I understand the scaling issue a little better now (but only a little, so no miracles). I have not blamed BayesElo at all; in fact, I 'praise' both BayesElo and Ordo (I have not used EloStat, but I figure that it is also good). I am not an advanced user, so surely I cannot take full advantage of BayesElo. But if you take a look at the thread I opened about diminishing returns in fixed-depth testing, you will see that I only used BayesElo... I also used mm 1 1 twice, which enables the white advantage and draw Elo models IIRC (maybe the names are wrong; please correct me if so). So, not blaming BayesElo at all! I think that BayesElo is the reference rating software in computer chess, and deservedly so, because Rémi managed to build a great, complex system of models, and it has huge merit.

@Miguel: Thank you very much for your kind explanations. Good work!

Regards from Spain.

Ajedrecista.
Vinvin
Posts: 5223
Joined: Thu Mar 09, 2006 9:40 am
Full name: Vincent Lejeune

Re: Ordo v0.5 (Release)

Post by Vinvin »

A possible bug: when I use ordoprep with "-g 20", I find players with fewer than 20 games (11, 12, 13, ...) but none with fewer than 10. Is it possible that the value is divided by 2 in the code?
Thanks!