Feed bayeselo with pure game results without PGN

 Posts: 202
 Joined: Mon Sep 12, 2011 9:27 pm
 Location: Moscow, Russia
Feed bayeselo with pure game results without PGN
Hello all!
Do you have any ideas on how to quickly feed bayeselo an array of game results? In my tuning framework I'm going to play several million games per day, so storing them in PGN and then feeding them to elostat would be too slow. Of course, it's possible to tweak the elostat source to do it, but maybe someone has already done it?
The Force Be With You!

 Posts: 3502
 Joined: Tue Mar 14, 2006 10:34 am
 Location: Ethiopia
Re: Feed bayeselo with pure game results without PGN
Why not feed it a minimal PGN file with the players and the result? cutechess-cli should be able to produce minimal PGN.
Note that you can have multiple players in the PGN, each playing either white or black (bayeselo takes home advantage into consideration), with a W/D/L result, so it is not as straightforward as feeding it an array.
I had at one point implemented such a thing in bayeselo, using std::map for the players and results to bypass PGN reading, but I don't have the source code now.
Daniel

 Posts: 202
 Joined: Mon Sep 12, 2011 9:27 pm
 Location: Moscow, Russia
Re: Feed bayeselo with pure game results without PGN
Daniel Shawul wrote: Why not feed it a minimal PGN file with the players and the result? cutechess-cli should be able to produce minimal PGN.
Because it's too slow to parse a PGN with 10 million games, even a minimal one, especially when you need to do it after every batch of games per CPU core.
Moreover, it's hard to even store this PGN. It's better to feed just the number of wins/draws/losses (e.g. 3000/5000/2500) for each pair of engine versions instead of parsing 10500 minimal PGN game headers.
Daniel Shawul wrote: Note that you can have multiple players in the PGN, each playing either white or black (bayeselo takes home advantage into consideration), with a W/D/L result, so it is not as straightforward as feeding it an array.
It's easy to store W/D/L for white and black separately.
Daniel Shawul wrote: I had at one point implemented such a thing in bayeselo, using std::map for the players and results to bypass PGN reading, but I don't have the source code now.
That's sad :)
The Force Be With You!

 Posts: 3502
 Joined: Tue Mar 14, 2006 10:34 am
 Location: Ethiopia
Re: Feed bayeselo with pure game results without PGN
Sergei S. Markoff wrote: It's easy to store W/D/L for white and black separately.
Nope, you need to feed it the names of the players as strings. You are just thinking of a situation with only two players (as in the case of tuning).
What you need to feed bayeselo, on the other hand, for each game result is:
"white player's name"
"black player's name"
"result"
You can feed in dummy player names but bayeselo expects player names anyway.
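A minimal sketch of that per-game input, assuming a hypothetical `ResultTable` type and `add_game` helper (neither is part of bayeselo): each game is recorded under its (white, black) name pair, so the colour information survives and the games themselves need not be stored.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical helper, not part of bayeselo:
// W/D/L counts from White's point of view.
struct Wdl {
    long wins = 0, draws = 0, losses = 0;
    long games() const { return wins + draws + losses; }
};

// Key is (white name, black name); the order matters because
// bayeselo models an advantage for the white player.
using ResultTable = std::map<std::pair<std::string, std::string>, Wdl>;

// result: 1 = White won, 0 = draw, -1 = Black won.
inline void add_game(ResultTable& table, const std::string& white,
                     const std::string& black, int result) {
    assert(result >= -1 && result <= 1);
    Wdl& entry = table[std::make_pair(white, black)];
    if (result == 1) ++entry.wins;
    else if (result == 0) ++entry.draws;
    else ++entry.losses;
}
```

With dummy names such as "v1" and "v2", the table holds at most two entries per pair of versions (one per colour assignment), no matter how many games were played.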
I have the Bopo source code, derived from bayeselo with two additional draw models, with the feature you need, if you are interested.
Daniel
P.S.: If you have only two players and their results, it may be better to just compute the Elo from the winning percentage using the logistic formula. Scorpio actually does this when tuning its evaluation.
Code you need
Code:
    // ELO_MODEL, ELO_DRAW, ELO_DRAW_SLOPE_PHASE, material and MAX_MATERIAL
    // are defined elsewhere in the engine.
    #include <cmath>

    // Conversions between score, gamma and Elo
    static inline double score_to_elo(double p) {
        return 400.0 * log10(p / (1 - p));
    }
    static inline double gamma_to_elo(double g) {
        return 400.0 * log10(g);
    }
    static inline double elo_to_gamma(double eloDelta) {
        return pow(10.0,eloDelta / 400.0);
    }
    // Both curves decrease with their argument, so callers pass -eloDelta
    static inline double logistic(double eloDelta) {
        return 1 / (1 + pow(10.0,eloDelta / 400.0));
    }
    static inline double gaussian(double eloDelta) {
        return (1 + erf(-eloDelta / 400.0)) / 2;
    }
    // Win/loss/draw probabilities; model 0 = logistic, 1 = second draw model, else gaussian
    static double win_prob(double eloDelta, int eloH, int eloD) {
        if(ELO_MODEL == 0) {
            return logistic(-eloDelta - eloH + eloD);
        } else if(ELO_MODEL == 1) {
            double thetaD = elo_to_gamma(eloD);
            double f = thetaD * sqrt(logistic(eloDelta + eloH) * logistic(-eloDelta - eloH));
            return logistic(-eloDelta - eloH) / (1 + f);
        } else {
            return gaussian(-eloDelta - eloH + eloD);
        }
    }
    static double loss_prob(double eloDelta, int eloH, int eloD) {
        if(ELO_MODEL == 0) {
            return logistic(eloDelta + eloH + eloD);
        } else if(ELO_MODEL == 1) {
            double thetaD = elo_to_gamma(eloD);
            double f = thetaD * sqrt(logistic(eloDelta + eloH) * logistic(-eloDelta - eloH));
            return logistic(eloDelta + eloH) / (1 + f);
        } else {
            return gaussian(eloDelta + eloH + eloD);
        }
    }
    static double draw_prob(double eloDelta, int eloH, int eloD) {
        return 1 - win_prob(eloDelta,eloH,eloD) - loss_prob(eloDelta,eloH,eloD);
    }
    // Scale factor computed from the slope (df) of the selected model
    static double get_scale(double eloD, double eloH) {
        const double K = log(10)/400.0;
        double df;
        if(ELO_MODEL == 0) {
            double f = 1.0 / (1 + exp(K*(eloD - eloH)));
            df = f * (1 - f) * K;
        } else if(ELO_MODEL == 1) {
            double dg = elo_to_gamma(eloD) - 1;
            double f = 1.0 / (1 + exp(K*(eloD - eloH)));
            double dfx = f * (1 - f);
            double dx = dg * sqrt(dfx);
            double b = 1 + dx;
            double c = (dg * f * (1 - 2 * f)) / (2 * sqrt(dfx));
            df = ((b - c) / (b * b)) * dfx * K;
        } else if(ELO_MODEL == 2) {
            const double pi = 3.14159265359;
            double x = (eloD - eloH)/400.0;
            df = exp(-x*x) / (400.0 * sqrt(pi));
        }
        return (4.0 / K) * df;
    }
    // result: 1 = win, -1 = loss, anything else = draw; se is the static evaluation
    double get_log_likelihood(int result, double se) {
        double factor_m = double(material) / MAX_MATERIAL;
        int eloH = 0; //we have stm bonus
        int eloD = ELO_DRAW + factor_m * ELO_DRAW_SLOPE_PHASE;
        double scale = get_scale(eloD,eloH);
        se = se / scale;
        if(result == 1)
            return log(win_prob(se,eloH,eloD));
        else if(result == -1)
            return log(loss_prob(se,eloH,eloD));
        else
            return log(draw_prob(se,eloH,eloD));
    }
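For the two-player case mentioned in the P.S., a minimal self-contained sketch of going from W/D/L counts to an Elo difference with the logistic formula. The names `score_fraction` and `elo_from_wdl` are made up for illustration; the sketch ignores colour and the draw model and reports no error margin.

```cpp
#include <cassert>
#include <cmath>

// Score fraction from one side's point of view; a draw counts as half a point.
inline double score_fraction(long wins, long draws, long losses) {
    const long games = wins + draws + losses;
    assert(games > 0);
    return (wins + 0.5 * draws) / games;
}

// Inverse of the logistic expectation curve: Elo difference for a score p in (0, 1).
inline double score_to_elo(double p) {
    assert(p > 0.0 && p < 1.0);
    return 400.0 * std::log10(p / (1.0 - p));
}

// Hypothetical convenience wrapper: Elo difference straight from the counts.
inline double elo_from_wdl(long wins, long draws, long losses) {
    return score_to_elo(score_fraction(wins, draws, losses));
}
```

For the 3000/5000/2500 example given earlier in the thread, the score is 5500/10500, so p / (1 - p) = 1.1 and `elo_from_wdl(3000, 5000, 2500)` comes out at roughly +16.6 Elo.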
Last edited by Daniel Shawul on Thu Mar 08, 2018 6:08 pm, edited 1 time in total.
 Guenther
 Posts: 2462
 Joined: Wed Oct 01, 2008 4:33 am
 Location: Regensburg, Germany
 Full name: Guenther Simon
Re: Feed bayeselo with pure game results without PGN
Sergei S. Markoff wrote: That's sad :)
If you use Ordo, you can use the additional ordoprep tool, which strips all moves and most headers from the PGN file.
It leaves only the players' names and the result, which can then be processed further with Ordo.
(AFAIK most people use Ordo for rating calculation nowadays anyway. It is well documented and open source.)
Of course, using ordoprep speeds up calculations a lot when working with huge PGN files.
https://github.com/michiguel/Ordo
https://sites.google.com/site/gaviotachessengine/ordo
Strangely I cannot find a newer version of ordoprep than the one given
in the last link. Yet I have a newer one on my HD. If you need it I can send it.
Edit:
Reading Daniel's last post, which came in between, it seems you could even use ordoprep and then still process the resulting file with bayeselo.
The ordoprep output will be like this:
Code:
[White "Ace 01"]
[Black "MicroChess 1976"]
[Result "0-1"]
0-1
Guenther Simon
http://rwbcchess.de/chronology.htm

 Posts: 202
 Joined: Mon Sep 12, 2011 9:27 pm
 Location: Moscow, Russia
Re: Feed bayeselo with pure game results without PGN
Thank you!
But I have multiple versions, so I need to fit their ratings.
My framework is based on a genetic approach. The previous version was based on playing against a base version, but there are two problems: 1) if you can use the games that siblings play against each other, you save 50% of the time; 2) when you play only against the base version there is a problem of overfitting: some of your siblings can be successful against the base version but not against a broad range of opponents.
The Force Be With You!

 Posts: 202
 Joined: Mon Sep 12, 2011 9:27 pm
 Location: Moscow, Russia
Re: Feed bayeselo with pure game results without PGN
Thank you!
I'm going to make a tool to feed results in the most compact form, for example:
engine1: {engine2: {w: {100/100/101}, b: {95/100/105}}, engine3: {w: 101/100/100, b: {96104}}
engine2: ...
etc
So instead of multiple games or game headers you will have just the results. If you have more than 10000 engines and several million games, it seems to be the only way to store and process this data.
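A rough sketch of why per-pair counts are enough for the fit, assuming the logistic model (ELO_MODEL == 0 in the code posted above) as I understand bayeselo to use it. The `PairResult` record and `log_likelihood` function are hypothetical names, not bayeselo's API. Each game contributes log(p) to the likelihood, so identical games collapse into count * log(p).

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical compact record: results of one ordered pairing.
struct PairResult {
    int white, black;          // indices into the ratings vector
    long wins, draws, losses;  // counted from White's point of view
};

// Logistic curve: f(x) = 1 / (1 + 10^(-x/400)).
inline double f(double x) { return 1.0 / (1.0 + std::pow(10.0, -x / 400.0)); }

// Log-likelihood of all results given ratings, White advantage and draw Elo.
inline double log_likelihood(const std::vector<PairResult>& results,
                             const std::vector<double>& elo,
                             double eloAdvantage, double eloDraw) {
    assert(eloDraw > 0.0);  // keeps the draw probability strictly positive
    double sum = 0.0;
    for (const PairResult& r : results) {
        const double delta = elo[r.white] - elo[r.black] + eloAdvantage;
        const double pWin  = f(delta - eloDraw);
        const double pLoss = f(-delta - eloDraw);
        const double pDraw = 1.0 - pWin - pLoss;
        // Only the counts enter the sum; individual games are never needed.
        sum += r.wins * std::log(pWin)
             + r.draws * std::log(pDraw)
             + r.losses * std::log(pLoss);
    }
    return sum;
}
```

The cost of one likelihood evaluation then depends on the number of distinct (white, black) pairs rather than on the number of games played.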
The Force Be With You!