By the way, I have been unsure about how the prior is applied. Is each engine assumed to have x draws against every engine in the database or is it assumed to have x draws against each engine it has played against?
Well it is a "prior". This means it is fixed before there are any "observations" (games).
So the only possibility is the first one it seems to me.
I resorted to using the CCRL 40/4 pure database instead of the entire database. Bayeselo would not converge in a reasonable amount of time when using small priors for the entire database.
Decreasing the prior had a relatively small effect on the ratings. I believe this is because the average engine plays ~30 games against 15 to 20 opponents (in the case of the pure database), so the set is well-connected.
With Brutus RND set at 0 Elo, here is the Elo rating for Houdini 3 4CPU for various priors:
By the way, here is a description of how the prior is applied from the Winboard forum:
Remi wrote:
HGM wrote:
In my experience with drawing statistical conclusions from noisy data, maximum-likelihood methods are far superior to using the data as if it were exact. Long ago, before computers were around for the layman, I used a poor man's solution to the problem of calculating ratings from a small data set without any prior knowledge of the players' strengths.
I simply replaced a result (s points from n encounters) by (s+1)/(n+2). This is the Bayesian expectation for the probability of a coin flip, if the a-priori assumption was that any probability was equally likely. Even though this does not solve the average-opponent problem, it enormously improved the ease with which a self-consistent set of ratings could be determined. Taking the results at face value made the ratings diverge because of players who won or lost nearly all their games.
What you describe is precisely what bayeselo does. Maybe you had understood this already, but I cannot tell from your post. This is equivalent to adding two virtual draws between a pair of players. The "prior" command in bayeselo lets the user control the number of virtual draws. When a player has had many opponents, these virtual draws are split between the opponents; that is to say, 2 virtual games are not added for every pair of players, but fractional virtual games when there are many opponents. When the number of games is small, this technique does indeed improve the quality of result predictions significantly.
So it seems I was wrong about how the prior is applied :-( Sorry about that. I will have to check how Remi's description fits into Bayesian statistics.
For a large database the prior as described in Remi's post should indeed have a minor influence, which your tests confirm. Thanks again!
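To make the quoted description concrete, here is a minimal Python sketch (my own illustration, not bayeselo's actual code; the function names and example numbers are assumptions) of HGM's (s+1)/(n+2) estimate and of splitting a virtual-draw budget across a player's opponents:

def smoothed_score(points, games):
    # HGM's rule: expected score after adding two virtual draws
    # (one extra point in two extra games).
    return (points + 1.0) / (games + 2.0)

def virtual_draws_per_pairing(prior_draws, num_opponents):
    # Bayeselo-style splitting: the total virtual-draw budget (the value set
    # with the "prior" command) is divided over a player's opponents instead
    # of adding two full draws per pairing.
    return prior_draws / num_opponents

# A player who scored 28 points out of 30 games against one opponent:
print(smoothed_score(28, 30))              # 0.906..., instead of the raw 0.933...

# With a prior of 2 virtual draws and 15 opponents, each pairing only
# receives about 0.13 of a virtual draw:
print(virtual_draws_per_pairing(2.0, 15))  # 0.1333...

Under this reading, a prior of 2 virtual draws spread over 15 to 20 opponents adds only a fraction of a draw per pairing next to ~30 real games, which would be consistent with the small effect observed above.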
I just want to mention that the way the prior is currently done in bayeselo is not the best. When comparing different draw models, I had to completely disable it to avoid any errors coming from it. Remi and I agreed that a better way is to include a virtual player against which the virtual draws are assigned, instead of virtual draws against every opponent a player has played. I was too lazy to implement this. Edit: Actually I did, as the second part of the code below shows. The (n+1)-th player is that virtual player. I also added a prior for elostat mode.
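For illustration only, here is a rough Python sketch of that virtual-player scheme. This is not the code the post refers to; the data layout and names are my own assumptions.

def add_virtual_player_prior(games, players, prior_draws=2.0):
    # games: list of (player_a, player_b, score_a, num_games) tuples.
    # Instead of spreading virtual draws over each player's real opponents,
    # append one shared virtual opponent and give every real player
    # prior_draws drawn games against it.
    VIRTUAL = "virtual_anchor"  # the (n+1)-th player
    augmented = list(games)
    for p in players:
        # An all-draw result: half a point per virtual game.
        augmented.append((p, VIRTUAL, prior_draws / 2.0, prior_draws))
    return augmented

# Example: two real players with one real result between them.
real_games = [("EngineA", "EngineB", 7.5, 10)]
print(add_virtual_player_prior(real_games, ["EngineA", "EngineB"]))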