Fast LOS estimation

Discussion of chess software programming and technical issues.

Moderator: Ras

gladius
Posts: 568
Joined: Tue Dec 12, 2006 10:10 am
Full name: Gary Linscott

Fast LOS estimation

Post by gladius »

I'm currently using Remi's excellent WhoIsBest (http://www.talkchess.com/forum/viewtopi ... 82&t=30624) for quick estimation of LOS. However, it's O(n^2), and when implemented in python, can be a bit slow for large numbers of games.

This is also for self-testing, so a relatively high draw rate. Any suggestions on a faster algorithm? I can't easily plug in bayeselo, or would definitely be using that :).
User avatar
Ajedrecista
Posts: 2178
Joined: Wed Jul 13, 2011 9:04 pm
Location: Madrid, Spain.

Re: Fast LOS estimation.

Post by Ajedrecista »

Hello Gary:
gladius wrote:I'm currently using Remi's excellent WhoIsBest (http://www.talkchess.com/forum/viewtopi ... 82&t=30624) for quick estimation of LOS. However, it's O(n^2), and when implemented in python, can be a bit slow for large numbers of games.

This is also for self-testing, so a relatively high draw rate. Any suggestions on a faster algorithm? I can't easily plug in bayeselo, or would definitely be using that :).
In my Fortran programme LOS_and_Elo_uncertainties_calculator (it can be found here) I calculate LOS in two different ways: one with an approach of mean and sample standard deviation of a normal distribution (I count draws), and the other one with an equation that Rémi proposed here (the last equation), where draws do not count:

Image

I have to solve integrals in each way and I use a very good approximation using the composite Simpson's rule.

I do not know a reference when you refer to a large number of games (10000, 20000, 30000, ...? Including draws or without including them?). Just for the record, I have an old Intel Pentium D930 (of year 2006) at 3 GHz: my programme is single core, 32-bit; in some cases of wins + loses = 16000, it calculates error bars for a given confidence level (including the calculation of z score) and LOS with the two ways I said before in something less than 0.5 seconds in my PC. If wins + loses > 16000 then I disable the calculation of LOS with Rémi's method (the programme itself refuses to do this calculation with wins + loses ~ 16500, probably due to huge number of elements I use, so I had to established that limit) and the programme is much faster: an example with +120100 =360000 -119900 (600000 games with a draw ratio of 60%) took 36 ms more less (LOS ~ 65.85%). This value should be a great approximation due to the central limit theorem. Anyway, I am not specially gifted in Statistics, so please take my results with care.

I do not know if it is useful to you. Anyway, you can download my tools and try LOS_andElo_uncertainties_calculator (by the way, it is open source of Fortran 95). I hope I will not fool you with nonsense.

Regards from Spain.

Ajedrecista.
User avatar
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Fast LOS estimation

Post by Laskos »

gladius wrote:I'm currently using Remi's excellent WhoIsBest (http://www.talkchess.com/forum/viewtopi ... 82&t=30624) for quick estimation of LOS. However, it's O(n^2), and when implemented in python, can be a bit slow for large numbers of games.

This is also for self-testing, so a relatively high draw rate. Any suggestions on a faster algorithm? I can't easily plug in bayeselo, or would definitely be using that :).
I am using
LOS = (Sum_over_i [Binomial[wins + losses, i], {i from 0 to wins - 1}] + 0.5*Binomial[wins + losses, wins]) / 2^(wins + losses)

Rule of thumb for LOS of 95%: (wins-losses)/sqrt(wins+losses)=1.75
For 99% 2.20 or so instead of 1.75.
User avatar
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Fast LOS estimation

Post by Laskos »

Laskos wrote:
gladius wrote:I'm currently using Remi's excellent WhoIsBest (http://www.talkchess.com/forum/viewtopi ... 82&t=30624) for quick estimation of LOS. However, it's O(n^2), and when implemented in python, can be a bit slow for large numbers of games.

This is also for self-testing, so a relatively high draw rate. Any suggestions on a faster algorithm? I can't easily plug in bayeselo, or would definitely be using that :).
I am using
LOS = (Sum_over_i [Binomial[wins + losses, i], {i from 0 to wins - 1}] + 0.5*Binomial[wins + losses, wins]) / 2^(wins + losses)

Rule of thumb for LOS of 95%: (wins-losses)/sqrt(wins+losses)=1.75
For 99% 2.20 or so instead of 1.75.
Another rule of thumb:

LOS = Erf{0.85*(wins-losses)/sqrt(wins+losses)}
gladius
Posts: 568
Joined: Tue Dec 12, 2006 10:10 am
Full name: Gary Linscott

Re: Fast LOS estimation

Post by gladius »

Laskos wrote:
Laskos wrote:
gladius wrote:I'm currently using Remi's excellent WhoIsBest (http://www.talkchess.com/forum/viewtopi ... 82&t=30624) for quick estimation of LOS. However, it's O(n^2), and when implemented in python, can be a bit slow for large numbers of games.

This is also for self-testing, so a relatively high draw rate. Any suggestions on a faster algorithm? I can't easily plug in bayeselo, or would definitely be using that :).
I am using
LOS = (Sum_over_i [Binomial[wins + losses, i], {i from 0 to wins - 1}] + 0.5*Binomial[wins + losses, wins]) / 2^(wins + losses)

Rule of thumb for LOS of 95%: (wins-losses)/sqrt(wins+losses)=1.75
For 99% 2.20 or so instead of 1.75.
Another rule of thumb:

LOS = Erf{0.85*(wins-losses)/sqrt(wins+losses)}
Ah, perfect! erf is built into python, so that's nice and fast :).

Thanks!
gladius
Posts: 568
Joined: Tue Dec 12, 2006 10:10 am
Full name: Gary Linscott

Re: Fast LOS estimation.

Post by gladius »

Ajedrecista wrote:Hello Gary:
gladius wrote:I'm currently using Remi's excellent WhoIsBest (http://www.talkchess.com/forum/viewtopi ... 82&t=30624) for quick estimation of LOS. However, it's O(n^2), and when implemented in python, can be a bit slow for large numbers of games.

This is also for self-testing, so a relatively high draw rate. Any suggestions on a faster algorithm? I can't easily plug in bayeselo, or would definitely be using that :).
In my Fortran programme LOS_and_Elo_uncertainties_calculator (it can be found here) I calculate LOS in two different ways: one with an approach of mean and sample standard deviation of a normal distribution (I count draws), and the other one with an equation that Rémi proposed here (the last equation), where draws do not count:

Image

I have to solve integrals in each way and I use a very good approximation using the composite Simpson's rule.

I do not know a reference when you refer to a large number of games (10000, 20000, 30000, ...? Including draws or without including them?). Just for the record, I have an old Intel Pentium D930 (of year 2006) at 3 GHz: my programme is single core, 32-bit; in some cases of wins + loses = 16000, it calculates error bars for a given confidence level (including the calculation of z score) and LOS with the two ways I said before in something less than 0.5 seconds in my PC. If wins + loses > 16000 then I disable the calculation of LOS with Rémi's method (the programme itself refuses to do this calculation with wins + loses ~ 16500, probably due to huge number of elements I use, so I had to established that limit) and the programme is much faster: an example with +120100 =360000 -119900 (600000 games with a draw ratio of 60%) took 36 ms more less (LOS ~ 65.85%). This value should be a great approximation due to the central limit theorem. Anyway, I am not specially gifted in Statistics, so please take my results with care.

I do not know if it is useful to you. Anyway, you can download my tools and try LOS_andElo_uncertainties_calculator (by the way, it is open source of Fortran 95). I hope I will not fool you with nonsense.

Regards from Spain.

Ajedrecista.
Very interesting, thanks Jesus. I will definitely take a look at the LOS calculations including draws.
User avatar
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Fast LOS estimation

Post by Laskos »

gladius wrote:
Laskos wrote:
Laskos wrote:
gladius wrote:I'm currently using Remi's excellent WhoIsBest (http://www.talkchess.com/forum/viewtopi ... 82&t=30624) for quick estimation of LOS. However, it's O(n^2), and when implemented in python, can be a bit slow for large numbers of games.

This is also for self-testing, so a relatively high draw rate. Any suggestions on a faster algorithm? I can't easily plug in bayeselo, or would definitely be using that :).
I am using
LOS = (Sum_over_i [Binomial[wins + losses, i], {i from 0 to wins - 1}] + 0.5*Binomial[wins + losses, wins]) / 2^(wins + losses)

Rule of thumb for LOS of 95%: (wins-losses)/sqrt(wins+losses)=1.75
For 99% 2.20 or so instead of 1.75.
Another rule of thumb:

LOS = Erf{0.85*(wins-losses)/sqrt(wins+losses)}
Ah, perfect! erf is built into python, so that's nice and fast :).

Thanks!
Correction (ans a pretty accurate one):

LOS=0.5*[1+ Erf{0.707*(wins-losses)/sqrt(wins+losses)}]

It works pretty well.
gladius
Posts: 568
Joined: Tue Dec 12, 2006 10:10 am
Full name: Gary Linscott

Re: Fast LOS estimation

Post by gladius »

Laskos wrote:Correction (ans a pretty accurate one):

LOS=0.5*[1+ Erf{0.707*(wins-losses)/sqrt(wins+losses)}]

It works pretty well.
Great, thanks :). Yes, I had noticed it was giving results between -1..1 so I had applied the 0.5 * 1 + erf() fix already.

Why 0.707 though? For sqrt(2)/2?
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Fast LOS estimation

Post by bob »

gladius wrote:I'm currently using Remi's excellent WhoIsBest (http://www.talkchess.com/forum/viewtopi ... 82&t=30624) for quick estimation of LOS. However, it's O(n^2), and when implemented in python, can be a bit slow for large numbers of games.

This is also for self-testing, so a relatively high draw rate. Any suggestions on a faster algorithm? I can't easily plug in bayeselo, or would definitely be using that :).
BayesElo ought to be easy enough. You can execute it and send the output to a file, then read the file in to your python program and do with the info as you see fit. I do this in my cluster testing, although I simply use a c-shell script generally.
gladius
Posts: 568
Joined: Tue Dec 12, 2006 10:10 am
Full name: Gary Linscott

Re: Fast LOS estimation

Post by gladius »

bob wrote:
gladius wrote:I'm currently using Remi's excellent WhoIsBest (http://www.talkchess.com/forum/viewtopi ... 82&t=30624) for quick estimation of LOS. However, it's O(n^2), and when implemented in python, can be a bit slow for large numbers of games.

This is also for self-testing, so a relatively high draw rate. Any suggestions on a faster algorithm? I can't easily plug in bayeselo, or would definitely be using that :).
BayesElo ought to be easy enough. You can execute it and send the output to a file, then read the file in to your python program and do with the info as you see fit. I do this in my cluster testing, although I simply use a c-shell script generally.
This is for presenting output on a website, so I don't really want to be running an external program if I can avoid it. Also, I don't have a PGN readily available.

Anyhow, Kai's formula works great :).