CCRL question re bayeselo

Modern Times · Post by **Modern Times** » Fri Jan 05, 2024 4:38 am

lkaufman wrote: ↑Fri Jan 05, 2024 4:29 am
Basically it comes down to which range of engines you want to get "right". The assumption of constant drawElo is just not realistic for a huge range of elo values, so Ordo is better to get the entire range right, but if you mostly care about a specific range (presumably the top portion), then using BayesElo with a fairly high DrawElo is okay. The value of 192 would probably be about right for the 2700 or 2800 to 4000 range. The 154 value might be best for a range like 2200 to 4000 or so. The current value might be right for the entire range, but surely there is far more interest in 3000 level ratings than in 1500 level ratings, given that the lower ones don't correlate as well with human ratings anyway.

Yes, at the time Adam did say that after he pulled out all of the games where opponents were less than 50 Elo apart, he then put those into individual buckets, so 3100 through 3199 went into one bucket, 3000 through 3099 into another, etc. The draw rates were different for each bucket. I'm going to do the same thing and see what I get. Norm Pollock's PGN tools enable this to be done fairly easily.
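For anyone who wants to try the bucketing without the PGN tools, here is a minimal sketch of the idea. It is not Adam's actual procedure: I am assuming games are already parsed into (white Elo, black Elo, result) tuples and that a game is bucketed by the average rating of the two opponents. The function name and the input format are made up for illustration.

```python
from collections import defaultdict

def draw_rate_by_bucket(games, max_gap=50, bucket_size=100):
    """Draw rate per rating bucket for games between closely matched opponents.

    games: iterable of (white_elo, black_elo, result), where result is
    "1-0", "0-1" or "1/2-1/2" (hypothetical, already-parsed input).
    Assumption: a game is bucketed by the average rating of the two players.
    """
    counts = defaultdict(lambda: [0, 0])  # bucket floor -> [draws, total games]
    for white_elo, black_elo, result in games:
        # Keep only games where the opponents are less than max_gap Elo apart.
        if abs(white_elo - black_elo) >= max_gap:
            continue
        average = (white_elo + black_elo) / 2
        bucket = int(average // bucket_size) * bucket_size  # e.g. 3135 -> 3100
        counts[bucket][1] += 1
        if result == "1/2-1/2":
            counts[bucket][0] += 1
    return {bucket: draws / total for bucket, (draws, total) in sorted(counts.items())}
```

Each key in the returned dict is the bottom of a bucket, so 3100 covers 3100 through 3199.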

Modern Times · Post by **Modern Times** » Sat Jan 06, 2024 7:51 am

I ran the complete 40/15 database through Ordo, and it tells me:

White advantage = 36.19 +/- 0.19
Draw rate (equal opponents) = 49.59 % +/- 0.04

So if indeed a bayeselo drawElo value of 192 means half draws between equals, then that on the face of it seems the best value. Still experimenting.
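The "192 means half draws between equals" part can be checked with a few lines. This is a sketch under assumptions: it uses the win/draw/loss model as I understand it from the BayesElo documentation (a logistic curve shifted by eloAdvantage and eloDraw), and the function names are mine. With equal ratings and zero White advantage, a 50% draw rate corresponds to a drawElo of 400·log10(3), about 190.85, so 192 is very close. A nonzero White advantage shifts the figure slightly.

```python
import math

def f(delta):
    # Logistic expected-score curve on the usual 400-point Elo scale.
    return 1.0 / (1.0 + 10.0 ** (-delta / 400.0))

def bayeselo_probs(elo_white, elo_black, advantage, draw_elo):
    """Return (P(White wins), P(draw), P(Black wins)) under the assumed model."""
    p_white = f(elo_white - elo_black + advantage - draw_elo)
    p_black = f(elo_black - elo_white - advantage - draw_elo)
    return p_white, 1.0 - p_white - p_black, p_black

def draw_elo_for_equal_draw_rate(draw_rate):
    """drawElo that gives this draw rate between equals, assuming zero advantage."""
    # With equal ratings and no advantage, P(win) = P(loss) = f(-draw_elo).
    p_win = (1.0 - draw_rate) / 2.0
    return 400.0 * math.log10((1.0 - p_win) / p_win)

# Draw probability between equal opponents for the values discussed in the thread.
for draw_elo in (154, 192):
    _, p_draw, _ = bayeselo_probs(3000, 3000, 0.0, draw_elo)
    print(draw_elo, round(p_draw, 4))
```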

Modern Times · Post by **Modern Times** » Sat Jan 06, 2024 9:19 am

This is like going down a rabbit hole. Investigating parameters, "mm 1 1" is another possibility that we experimented with in the past, and if I'm reading it right it ought to calculate draw values from the database itself and use them. I'll need some more time.

Gabor Szots · Post by **Gabor Szots** » Sat Jan 06, 2024 12:16 pm

If we leave BayesElo to its own devices (mm 1 1), it calculates a drawelo of 194 using the complete 40/15 database.

Modern Times · Post by **Modern Times** » Sat Jan 06, 2024 2:02 pm

Gabor Szots wrote: ↑Sat Jan 06, 2024 12:16 pm If we leave BayesElo to its own devices (mm 1 1), it calculates a drawelo of 194 using the complete 40/15 database.

According to previous posts here, "mm 1 1" makes Bayeselo compute White advantage and drawElo (or eloDraw) from the database and use those values in computing the ratings, instead of the default assumed values. That seems so clearly what you would want to happen that I wonder why it is not the default. Very curious.

"mm 1 1" makes total sense, better than entering hard-coded values for them because they could change over time. It also seems clear to me that we should be using it. It produces an Elo range about the same as Ordo does.
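To give a feel for what "computing the values from the database" means, here is a rough sketch. It is explicitly not the algorithm BayesElo runs for "mm 1 1", which fits the parameters together with all the ratings. It only covers the simplest case, games between equally rated opponents, where the assumed BayesElo model can be inverted in closed form from the win/draw/loss counts. The function names are mine and any counts you feed it are your own.

```python
import math

def logit10(p):
    # Inverse of the logistic curve f(x) = 1 / (1 + 10 ** (-x / 400)).
    return 400.0 * math.log10(p / (1.0 - p))

def fit_advantage_and_draw_elo(white_wins, draws, black_wins):
    """Closed-form (advantage, draw_elo) from results between EQUAL opponents.

    Assumed model for equal ratings:
        P(White wins) = f(advantage - draw_elo)
        P(Black wins) = f(-advantage - draw_elo)
    Requires at least one White win and one Black win.
    """
    total = white_wins + draws + black_wins
    a = logit10(white_wins / total)   # equals advantage - draw_elo
    b = logit10(black_wins / total)   # equals -advantage - draw_elo
    advantage = (a - b) / 2.0
    draw_elo = -(a + b) / 2.0
    return advantage, draw_elo
```

Note that the advantage it returns is on the BayesElo scale, so it is not directly comparable with the White advantage figure Ordo prints.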

Michel · Post by **Michel** » Sun Jan 07, 2024 11:01 am

There is something fundamentally wrong with the BayesElo draw model.

The predicted draw ratio depends only on the Elo difference, and not on the absolute Elo values. This does not correspond with reality if we are comparing engines with a wide range of Elo values.
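A small sketch to make the point concrete, assuming the BayesElo model as I understand it from its documentation (the function names are mine, and the default drawElo of 192 is just the value discussed earlier in the thread). Only the rating difference enters the formula, so two 1500 engines and two 3500 engines get exactly the same predicted draw ratio.

```python
def f(delta):
    # Logistic expected-score curve on the 400-point Elo scale.
    return 1.0 / (1.0 + 10.0 ** (-delta / 400.0))

def draw_probability(elo_white, elo_black, advantage=0.0, draw_elo=192.0):
    """Predicted draw probability under the assumed BayesElo model."""
    diff = elo_white - elo_black  # the ratings enter only through this difference
    p_white = f(diff + advantage - draw_elo)
    p_black = f(-diff - advantage - draw_elo)
    return 1.0 - p_white - p_black

# Same rating difference, very different absolute levels: identical prediction.
for white, black in [(1500, 1500), (3500, 3500), (1500, 1400), (3500, 3400)]:
    print(white, black, round(draw_probability(white, black), 4))
```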

Modern Times · Post by **Modern Times** » Sun Jan 07, 2024 2:32 pm

I don't think there is a "perfect" rating system, even Ordo has its critics. But to be sure, bayeselo with "mm 1 1" is better than the default "mm".

lkaufman · Post by **lkaufman** » Mon Jan 08, 2024 7:50 am

Modern Times wrote: ↑Sun Jan 07, 2024 2:32 pm I don't think there is a "perfect" rating system, even Ordo has its critics. But to be sure, bayeselo with "mm 1 1" is better than the default "mm".

That seems very clear to me as well. Although Ordo (basically Elo) does have some flaws, they don't seem as serious as this incorrect assumption of constant drawelo at all levels. In any case, BayesElo with a constant drawelo is just not compatible with Elo ratings: its scale will be more compact at the top and more spread out at the bottom. But switching to mm 1 1 will at least make the spread right (compared to Elo/Ordo) on average, it seems.

Michel · Post by **Michel** » Tue Jan 09, 2024 12:07 pm

This research https://unclejerry9466728.wordpress.com/2018/12/20/172/ seems to suggest that requiring that drawElo depends linearly on the "common mode" (the average elo rating of the opponents) should give reasonably good results. This would replace drawElo by two parameters (slope and intercept).
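A sketch of what the two-parameter version could look like, assuming the rest of the BayesElo model stays unchanged. The function names are mine, and the slope and intercept in the example loop are made-up numbers for illustration only, not fitted to CCRL or any other data.

```python
def f(delta):
    # Logistic expected-score curve on the 400-point Elo scale.
    return 1.0 / (1.0 + 10.0 ** (-delta / 400.0))

def linear_draw_elo(elo_white, elo_black, slope, intercept):
    """drawElo as a linear function of the common mode (average rating)."""
    common_mode = (elo_white + elo_black) / 2.0
    return intercept + slope * common_mode

def draw_probability_linear(elo_white, elo_black, slope, intercept, advantage=0.0):
    """Draw probability with a level-dependent drawElo instead of a constant."""
    draw_elo = linear_draw_elo(elo_white, elo_black, slope, intercept)
    p_white = f(elo_white - elo_black + advantage - draw_elo)
    p_black = f(elo_black - elo_white - advantage - draw_elo)
    return 1.0 - p_white - p_black

# Illustrative parameters only (hypothetical, not estimated from any database).
SLOPE, INTERCEPT = 0.05, 30.0
for level in (2000, 3000, 3500):
    print(level, round(draw_probability_linear(level, level, SLOPE, INTERCEPT), 4))
```

With a positive slope the predicted draw ratio between equals rises with playing strength, which a single constant drawElo cannot reproduce.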

I guess I would need to download the CCRL database to double check.

Another more complicated issue with BayesElo is that it does not properly account for heavily unbalanced books in case openings are replayed with reversed colors.

lkaufman · Post by **lkaufman** » Tue Jan 09, 2024 6:46 pm

Michel wrote: ↑Tue Jan 09, 2024 12:07 pm This research https://unclejerry9466728.wordpress.com/2018/12/20/172/ seems to suggest that requiring that drawElo depends linearly on the "common mode" (the average elo rating of the opponents) should give reasonably good results. This would replace drawElo by two parameters (slope and intercept).

I guess I would need to download the CCRL database to double check.

Another more complicated issue with BayesElo is that it does not properly account for heavily unbalanced books in case openings are replayed with reversed colors.

If the books are heavily unbalanced, with White always the superior side, then the White advantage (the elo parameter in bayeselo for this) would be enormous. Would setting it at that enormous value make BayesElo work properly, or would it still be incorrect?
