CCRL question re bayeselo

Modern Times
Posts: 3752
Joined: Thu Jun 07, 2012 11:02 pm

Re: CCRL question re bayeselo

Post by Modern Times »

lkaufman wrote: Fri Jan 05, 2024 4:29 am
Basically it comes down to which range of engines you want to get "right". The assumption of constant drawElo is just not realistic for a huge range of elo values, so Ordo is better to get the entire range right, but if you mostly care about a specific range (presumably the top portion), then using BayesElo with a fairly high DrawElo is okay. The value of 192 would probably be about right for the 2700 or 2800 to 4000 range. The 154 value might be best for a range like 2200 to 4000 or so. The current value might be right for the entire range, but surely there is far more interest in 3000 level ratings than in 1500 level ratings, given that the lower ones don't correlate as well with human ratings anyway.
Yes, at the time Adam did say that after he pulled out all of the games where the opponents were less than 50 Elo apart, he put them into individual buckets: 3100 through 3199 went into one bucket, 3000 through 3099 into another, and so on. The draw rates were different for each bucket. I'm going to do the same thing and see what I get. Norm Pollock's pgn tools enable this to be done fairly easily.
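For anyone who wants to try the bucketing without the pgn tools, here is a minimal Python sketch of the idea. It is not Adam's script and not Norm Pollock's tools. The function name `draw_rate_by_bucket`, the tuple layout, and the choice to bucket on the average rating of the two opponents are all my own assumptions. The games are assumed to be already parsed into (white rating, black rating, result) tuples.

```python
from collections import defaultdict

def draw_rate_by_bucket(games, max_gap=50, bucket_width=100):
    """Return {bucket_floor: draw_rate} for games between near-equal opponents.

    games: iterable of (white_elo, black_elo, result), where result is a
    PGN result string ("1-0", "0-1" or "1/2-1/2").
    Assumption: a game is assigned to a bucket by the average rating of the pair.
    """
    counts = defaultdict(lambda: [0, 0])  # bucket floor -> [draws, games]
    for white_elo, black_elo, result in games:
        # Keep only games where the opponents are less than max_gap Elo apart.
        if abs(white_elo - black_elo) >= max_gap:
            continue
        average = (white_elo + black_elo) / 2
        bucket = int(average // bucket_width) * bucket_width
        counts[bucket][1] += 1
        if result == "1/2-1/2":
            counts[bucket][0] += 1
    return {b: draws / total for b, (draws, total) in sorted(counts.items())}
```

Calling it on the 40/15 games would give one draw rate per 100-point bucket, which can then be compared across the rating range.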
Modern Times
Posts: 3752
Joined: Thu Jun 07, 2012 11:02 pm

Re: CCRL question re bayeselo

Post by Modern Times »

I ran the complete 40/15 database through Ordo, and it tells me:
White advantage = 36.19 +/- 0.19
Draw rate (equal opponents) = 49.59 % +/- 0.04
So if indeed a bayeselo drawElo value of 192 means half draws between equals, then that on the face of it seems the best value. Still experimenting.
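The "192 means half draws between equals" claim can be checked with a few lines of Python. This is a sketch that assumes the three-outcome model as I understand it from the BayesElo documentation: the win probability is a logistic function of the rating difference plus the White advantage minus drawElo. The function names are mine, and the inverse formula ignores the White advantage.

```python
import math

def logistic(x):
    """Expected score for a rating difference x on the usual 400-point scale."""
    return 1.0 / (1.0 + 10.0 ** (-x / 400.0))

def bayeselo_probs(delta, draw_elo, advantage=0.0):
    """(P(White wins), P(draw), P(Black wins)) for delta = white - black."""
    p_win = logistic(delta + advantage - draw_elo)
    p_loss = logistic(-delta - advantage - draw_elo)
    return p_win, 1.0 - p_win - p_loss, p_loss

def draw_elo_for_rate(draw_rate):
    """drawElo that yields draw_rate between equals, with zero White advantage."""
    return 400.0 * math.log10((1.0 + draw_rate) / (1.0 - draw_rate))

for draw_elo in (154, 192, 194):
    _, p_draw, _ = bayeselo_probs(0.0, draw_elo)
    print(draw_elo, round(p_draw, 3))   # about 0.416, 0.502, 0.507

print(round(draw_elo_for_rate(0.4959)))  # about 189
```

Under those assumptions 192 does come out at almost exactly 50 % draws between equals, and Ordo's 49.59 % maps back to a drawElo of roughly 189.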
Modern Times
Posts: 3752
Joined: Thu Jun 07, 2012 11:02 pm

Re: CCRL question re bayeselo

Post by Modern Times »

It's like going down a rabbit hole, investigating these parameters. "mm 1 1" is another possibility that we experimented with in the past, and if I'm reading it right, it ought to calculate the draw values from the database itself and use them. I'll need some more time :)
Gabor Szots
Posts: 1452
Joined: Sat Jul 21, 2018 7:43 am
Location: Budapest, Hungary
Full name: Gabor Szots

Re: CCRL question re bayeselo

Post by Gabor Szots »

If we leave BayesElo to its own devices (mm 1 1), it calculates a drawelo of 194 using the complete 40/15 database.
Gabor Szots
CCRL testing group
Modern Times
Posts: 3752
Joined: Thu Jun 07, 2012 11:02 pm

Re: CCRL question re bayeselo

Post by Modern Times »

Gabor Szots wrote: Sat Jan 06, 2024 12:16 pm If we leave BayesElo to its own devices (mm 1 1), it calculates a drawelo of 194 using the complete 40/15 database.

According to previous posts here, "mm 1 1" makes Bayeselo compute White advantage and drawElo (or eloDraw) from the database and use those values in computing the ratings, instead of the default assumed values. It seems so clear that this is exactly what you would want to happen that I wonder why it is not the default. Very curious.

"mm 1 1" makes total sense. It is better than entering hard-coded values, because those could change over time. It also seems clear to me that we should be using it. It produces an Elo range about the same as Ordo does.
Michel
Posts: 2292
Joined: Mon Sep 29, 2008 1:50 am

Re: CCRL question re bayeselo

Post by Michel »

There is something fundamentally wrong with the BayesElo draw model.

The predicted draw ratio depends only on the Elo difference, and not on the absolute Elo values. This does not correspond with reality if we are comparing engines with a wide range of Elo values.
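A small Python sketch makes the point concrete. It assumes the BayesElo-style logistic model with a single constant drawElo. The function name and the value 192 (taken from earlier in the thread) are only for illustration.

```python
def draw_probability(white_elo, black_elo, draw_elo=192.0, advantage=0.0):
    """Predicted draw probability under a constant-drawElo logistic model."""
    def logistic(x):
        return 1.0 / (1.0 + 10.0 ** (-x / 400.0))

    # Only the difference enters the formula; the absolute level cancels out.
    delta = white_elo - black_elo
    p_win = logistic(delta + advantage - draw_elo)
    p_loss = logistic(-delta - advantage - draw_elo)
    return 1.0 - p_win - p_loss

# Same 50-point gap at very different levels gives the same prediction.
print(draw_probability(3500, 3450) == draw_probability(1500, 1450))
```

So a 3500 vs 3450 pairing and a 1500 vs 1450 pairing are predicted to draw equally often, whatever value of drawElo is chosen.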
Ideas=science. Simplification=engineering.
Without ideas there is nothing to simplify.
Modern Times
Posts: 3752
Joined: Thu Jun 07, 2012 11:02 pm

Re: CCRL question re bayeselo

Post by Modern Times »

I don't think there is a "perfect" rating system, even Ordo has its critics. But to be sure, bayeselo with "mm 1 1" is better than the default "mm".
lkaufman
Posts: 6259
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: CCRL question re bayeselo

Post by lkaufman »

Modern Times wrote: Sun Jan 07, 2024 2:32 pm I don't think there is a "perfect" rating system, even Ordo has its critics. But to be sure, bayeselo with "mm 1 1" is better than the default "mm".
That seems very clear to me as well. Although Ordo (basically Elo) does have some flaws, they don't seem as serious as this incorrect assumption of constant drawelo at all levels. In any case, the result is just not compatible with Elo ratings: it will be more compact at the top and more spread out at the bottom. But switching to mm 1 1 will at least make the spread right (compared to Elo/Ordo) on average, it seems.
Komodo rules!
Michel
Posts: 2292
Joined: Mon Sep 29, 2008 1:50 am

Re: CCRL question re bayeselo

Post by Michel »

This research https://unclejerry9466728.wordpress.com/2018/12/20/172/ seems to suggest that requiring that drawElo depends linearly on the "common mode" (the average elo rating of the opponents) should give reasonably good results. This would replace drawElo by two parameters (slope and intercept).
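As a rough Python sketch of what the two-parameter version could look like: drawElo becomes a linear function of the average rating of the two opponents, and everything else stays as in the usual logistic model. The names `slope` and `intercept` and the numbers used below are placeholders for illustration only. They are not fitted to the CCRL database or taken from the linked article.

```python
def logistic(x):
    return 1.0 / (1.0 + 10.0 ** (-x / 400.0))

def draw_elo_linear(average_elo, slope, intercept):
    """drawElo as a linear function of the pair's average rating."""
    return intercept + slope * average_elo

def draw_probability(white_elo, black_elo, slope, intercept, advantage=0.0):
    """Draw probability when drawElo depends on the common mode."""
    average = (white_elo + black_elo) / 2.0
    draw_elo = draw_elo_linear(average, slope, intercept)
    delta = white_elo - black_elo
    p_win = logistic(delta + advantage - draw_elo)
    p_loss = logistic(-delta - advantage - draw_elo)
    return 1.0 - p_win - p_loss

# Placeholder parameters (NOT fitted): drawElo is 50 at 2000 and 200 at 3500.
SLOPE, INTERCEPT = 0.1, -150.0
for level in (2000, 2750, 3500):
    print(level, round(draw_probability(level, level, SLOPE, INTERCEPT), 3))
```

With slope set to 0 this reduces to the constant-drawElo model, so the current behaviour is a special case of the two-parameter one.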

I guess I would need to download the CCRL database to double check.

Another more complicated issue with BayesElo is that it does not properly account for heavily unbalanced books in case openings are replayed with reversed colors.
Ideas=science. Simplification=engineering.
Without ideas there is nothing to simplify.
lkaufman
Posts: 6259
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: CCRL question re bayeselo

Post by lkaufman »

Michel wrote: Tue Jan 09, 2024 12:07 pm This research https://unclejerry9466728.wordpress.com/2018/12/20/172/ seems to suggest that requiring that drawElo depends linearly on the "common mode" (the average elo rating of the opponents) should give reasonably good results. This would replace drawElo by two parameters (slope and intercept).

I guess I would need to download the CCRL database to double check.

Another more complicated issue with BayesElo is that it does not properly account for heavily unbalanced books in case openings are replayed with reversed colors.
If the books are heavily unbalanced, with White always the superior side, then the White advantage (the elo parameter in bayeselo for this) would be enormous. Would setting it at that enormous value make BayesElo work properly, or would it still be incorrect?
Komodo rules!