lkaufman wrote: Tue Jan 09, 2024 6:46 pm
If the books are heavily unbalanced, with White always the superior side, then the White advantage (the elo parameter in bayeselo for this) would be enormous. Would setting it at that enormous value make BayesElo work properly, or would it still be incorrect?
I updated the Chess324 list with "mm 1 1"; the effect there was to compress the ratings slightly versus before, and to reduce the error margins.
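For anyone following along who has not used the tool: in bayeselo, the two arguments to mm tell it to also estimate the eloAdvantage and eloDraw parameters from the games, instead of leaving them at their built-in defaults (around 32.8 and 97.3, if I remember them correctly). A typical session looks roughly like this; the file name is a placeholder and the annotations after # are not part of the input:

```
readpgn chess324_games.pgn   # load the games (placeholder file name)
elo                          # enter Elo-estimation mode
mm 1 1                       # maximize likelihood over ratings, eloAdvantage and eloDraw
exactdist                    # compute the error bars
ratings                      # print the rating list
```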
Michel wrote: Tue Jan 09, 2024 12:07 pm
This research https://unclejerry9466728.wordpress.com/2018/12/20/172/ seems to suggest that requiring drawElo to depend linearly on the "common mode" (the average Elo rating of the opponents) should give reasonably good results. This would replace drawElo with two parameters (a slope and an intercept).
I guess I would need to download the CCRL database to double-check.
Another, more complicated, issue with BayesElo is that it does not properly account for heavily unbalanced books when openings are replayed with reversed colors.
If the books are heavily unbalanced, with White always the superior side, then the White advantage (the elo parameter in bayeselo for this) would be enormous. Would setting it at that enormous value make BayesElo work properly, or would it still be incorrect?
If White (or Black) is always the superior side, then using the whiteAdvantage parameter is correct.
In general it is theoretically incorrect, but I do not really know what the effect of this is. The only case I know is that of two engines: there the Elo values are correct, but the error bars are too large.
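To make the parameters being discussed concrete, here is a small Python sketch of the win/draw/loss model bayeselo uses, as I understand it from Coulom's documentation (one eloAdvantage for the White pieces and one eloDraw, both in Elo units); the helper names and the numbers at the bottom are just for illustration:

```python
def logistic(x):
    """Expected score for an Elo difference x (the standard logistic curve)."""
    return 1.0 / (1.0 + 10.0 ** (-x / 400.0))

def wdl_probs(elo_white, elo_black, elo_advantage, elo_draw):
    """Win/draw/loss probabilities from White's side in the bayeselo-style
    parameterization (a sketch from memory, not the tool's actual code)."""
    delta = elo_white - elo_black
    p_white_win = logistic(delta + elo_advantage - elo_draw)
    p_black_win = logistic(-delta - elo_advantage - elo_draw)
    p_draw = 1.0 - p_white_win - p_black_win
    return p_white_win, p_draw, p_black_win

# Two equal engines with a book whose openings all give White a huge edge:
# folding that edge into elo_advantage keeps the model consistent, and over
# a reversed-color game pair both engines still expect 50%.
print(wdl_probs(3000, 3000, elo_advantage=200, elo_draw=100))
```

If I read the objection correctly, the trouble in the general case is that eloAdvantage is a single constant attached to the White pieces, while a heavily unbalanced book gives each opening its own bias, not necessarily of the same size or even favouring the same side; one constant can only stand in for the average of those biases, which is why the likelihood is misspecified even when, as in the two-engine case, the fitted Elo values come out right and only the error bars suffer.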
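On the quoted suggestion that drawElo should depend linearly on the "common mode": in terms of the sketch above, that just means replacing the eloDraw constant with an intercept and a slope applied to the opponents' average rating. The parameter values below are made up; they would have to be fitted, e.g. to the CCRL data mentioned in the quote:

```python
def logistic(x):
    """Same helper as above, repeated so this snippet runs on its own."""
    return 1.0 / (1.0 + 10.0 ** (-x / 400.0))

def draw_elo_linear(mean_elo, intercept, slope):
    """drawElo as a linear function of the average rating of the two
    opponents (the 'common mode'), per the quoted suggestion."""
    return intercept + slope * mean_elo

def wdl_probs_variable_draw(elo_white, elo_black, elo_advantage,
                            intercept, slope):
    """Same WDL model as in the previous sketch, but with drawElo growing
    with the strength of the pairing instead of being a global constant."""
    elo_draw = draw_elo_linear((elo_white + elo_black) / 2.0, intercept, slope)
    delta = elo_white - elo_black
    p_white_win = logistic(delta + elo_advantage - elo_draw)
    p_black_win = logistic(-delta - elo_advantage - elo_draw)
    return p_white_win, 1.0 - p_white_win - p_black_win, p_black_win

# Hypothetical fitted values: drawElo ~100 at Elo 2000, rising to ~250 at 3500.
print(wdl_probs_variable_draw(3400, 3350, elo_advantage=30,
                              intercept=-100.0, slope=0.1))
```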
lkaufman wrote: Tue Jan 09, 2024 6:46 pm
If the books are heavily unbalanced, with White always the superior side, then the White advantage (the elo parameter in bayeselo for this) would be enormous. Would setting it at that enormous value make BayesElo work properly, or would it still be incorrect?
I updated the Chess324 list with "mm 1 1"; the effect there was to compress the ratings slightly versus before, and to reduce the error margins.
For this dataset, I believe the important change is that the White advantage parameter is now determined from the data by "mm 1 1"; the draw percentage in that data is fairly normal, so I think this makes sense. But if you did the same thing for the 40/15 list, where balanced/normal opening books are mostly used, I would expect a noticeable expansion of the ratings, since there the draw percentage would be the main issue. If you ran only the data for the top engines, say those rated over 3000, the effect should be even more dramatic.
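A quick way to see why fitting drawElo matters most for a very drawish list: in the bayeselo-style model, the larger eloDraw is, the closer the expected score for a given rating gap sits to 50%, so the same observed results have to be explained by a wider rating spread. A small, self-contained illustration (again a sketch of the model, not bayeselo's actual code):

```python
def expected_score(delta, elo_draw, elo_advantage=0.0):
    """Expected score of the stronger side for a rating gap `delta` (Elo)
    in the bayeselo-style model sketched earlier in the thread."""
    def f(x):
        return 1.0 / (1.0 + 10.0 ** (-x / 400.0))
    p_win = f(delta + elo_advantage - elo_draw)
    p_loss = f(-delta - elo_advantage - elo_draw)
    return p_win + 0.5 * (1.0 - p_win - p_loss)

# The same 100-Elo gap yields a score closer to 50% as eloDraw grows,
# so the same measured score implies a larger gap: fitting eloDraw to a
# very drawish dataset (top engines, balanced books) stretches the scale.
for elo_draw in (100.0, 200.0, 300.0):
    print(elo_draw, round(expected_score(100.0, elo_draw), 3))
```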