Fast LOS estimation

gladius · Post by **gladius** » Mon Feb 18, 2013 7:00 pm

I'm currently using Remi's excellent WhoIsBest (http://www.talkchess.com/forum/viewtopi ... 82&t=30624) for quick estimation of LOS. However, it's O(n^2), and when implemented in python, can be a bit slow for large numbers of games.

This is also for self-testing, so a relatively high draw rate. Any suggestions on a faster algorithm? I can't easily plug in bayeselo, or would definitely be using that

.

Ajedrecista · Post by **Ajedrecista** » Mon Feb 18, 2013 7:47 pm

Hello Gary:

gladius wrote:I'm currently using Remi's excellent WhoIsBest (http://www.talkchess.com/forum/viewtopi ... 82&t=30624) for quick estimation of LOS. However, it's O(n^2), and when implemented in python, can be a bit slow for large numbers of games.

This is also for self-testing, so a relatively high draw rate. Any suggestions on a faster algorithm? I can't easily plug in bayeselo, or would definitely be using that .

In my Fortran programme LOS_and_Elo_uncertainties_calculator (it can be found here) I calculate LOS in two different ways: one with an approach of mean and sample standard deviation of a normal distribution (I count draws), and the other one with an equation that Rémi proposed here (the last equation), where draws do not count:

I have to solve integrals in each way and I use a very good approximation using the composite Simpson's rule.

I do not know a reference when you refer to a large number of games (10000, 20000, 30000, ...? Including draws or without including them?). Just for the record, I have an old Intel Pentium D930 (of year 2006) at 3 GHz: my programme is single core, 32-bit; in some cases of wins + loses = 16000, it calculates error bars for a given confidence level (including the calculation of z score) and LOS with the two ways I said before in something less than 0.5 seconds in my PC. If wins + loses > 16000 then I disable the calculation of LOS with Rémi's method (the programme itself refuses to do this calculation with wins + loses ~ 16500, probably due to huge number of elements I use, so I had to established that limit) and the programme is much faster: an example with +120100 =360000 -119900 (600000 games with a draw ratio of 60%) took 36 ms more less (LOS ~ 65.85%). This value should be a great approximation due to the central limit theorem. Anyway, I am not specially gifted in Statistics, so please take my results with care.

I do not know if it is useful to you. Anyway, you can download my tools and try LOS_andElo_uncertainties_calculator (by the way, it is open source of Fortran 95). I hope I will not fool you with nonsense.

Regards from Spain.

Ajedrecista.

Laskos · Post by **Laskos** » Mon Feb 18, 2013 9:12 pm

gladius wrote:I'm currently using Remi's excellent WhoIsBest (http://www.talkchess.com/forum/viewtopi ... 82&t=30624) for quick estimation of LOS. However, it's O(n^2), and when implemented in python, can be a bit slow for large numbers of games.

This is also for self-testing, so a relatively high draw rate. Any suggestions on a faster algorithm? I can't easily plug in bayeselo, or would definitely be using that .

I am using
LOS = (Sum_over_i [Binomial[wins + losses, i], {i from 0 to wins - 1}] + 0.5*Binomial[wins + losses, wins]) / 2^(wins + losses)

Rule of thumb for LOS of 95%: (wins-losses)/sqrt(wins+losses)=1.75
For 99% 2.20 or so instead of 1.75.

Laskos · Post by **Laskos** » Mon Feb 18, 2013 9:45 pm

Laskos wrote:
gladius wrote:I'm currently using Remi's excellent WhoIsBest (http://www.talkchess.com/forum/viewtopi ... 82&t=30624) for quick estimation of LOS. However, it's O(n^2), and when implemented in python, can be a bit slow for large numbers of games.

This is also for self-testing, so a relatively high draw rate. Any suggestions on a faster algorithm? I can't easily plug in bayeselo, or would definitely be using that .
I am using
LOS = (Sum_over_i [Binomial[wins + losses, i], {i from 0 to wins - 1}] + 0.5*Binomial[wins + losses, wins]) / 2^(wins + losses)

Rule of thumb for LOS of 95%: (wins-losses)/sqrt(wins+losses)=1.75
For 99% 2.20 or so instead of 1.75.

Another rule of thumb:

LOS = Erf{0.85*(wins-losses)/sqrt(wins+losses)}

gladius · Post by **gladius** » Mon Feb 18, 2013 11:00 pm

Laskos wrote:
Laskos wrote:
gladius wrote:I'm currently using Remi's excellent WhoIsBest (http://www.talkchess.com/forum/viewtopi ... 82&t=30624) for quick estimation of LOS. However, it's O(n^2), and when implemented in python, can be a bit slow for large numbers of games.

This is also for self-testing, so a relatively high draw rate. Any suggestions on a faster algorithm? I can't easily plug in bayeselo, or would definitely be using that .
I am using
LOS = (Sum_over_i [Binomial[wins + losses, i], {i from 0 to wins - 1}] + 0.5*Binomial[wins + losses, wins]) / 2^(wins + losses)

Rule of thumb for LOS of 95%: (wins-losses)/sqrt(wins+losses)=1.75
For 99% 2.20 or so instead of 1.75.
Another rule of thumb:

LOS = Erf{0.85*(wins-losses)/sqrt(wins+losses)}

Ah, perfect! erf is built into python, so that's nice and fast

.

Thanks!

gladius · Post by **gladius** » Mon Feb 18, 2013 11:13 pm

Ajedrecista wrote:Hello Gary:

gladius wrote:I'm currently using Remi's excellent WhoIsBest (http://www.talkchess.com/forum/viewtopi ... 82&t=30624) for quick estimation of LOS. However, it's O(n^2), and when implemented in python, can be a bit slow for large numbers of games.

This is also for self-testing, so a relatively high draw rate. Any suggestions on a faster algorithm? I can't easily plug in bayeselo, or would definitely be using that .
In my Fortran programme LOS_and_Elo_uncertainties_calculator (it can be found here) I calculate LOS in two different ways: one with an approach of mean and sample standard deviation of a normal distribution (I count draws), and the other one with an equation that Rémi proposed here (the last equation), where draws do not count:

I have to solve integrals in each way and I use a very good approximation using the composite Simpson's rule.

I do not know a reference when you refer to a large number of games (10000, 20000, 30000, ...? Including draws or without including them?). Just for the record, I have an old Intel Pentium D930 (of year 2006) at 3 GHz: my programme is single core, 32-bit; in some cases of wins + loses = 16000, it calculates error bars for a given confidence level (including the calculation of z score) and LOS with the two ways I said before in something less than 0.5 seconds in my PC. If wins + loses > 16000 then I disable the calculation of LOS with Rémi's method (the programme itself refuses to do this calculation with wins + loses ~ 16500, probably due to huge number of elements I use, so I had to established that limit) and the programme is much faster: an example with +120100 =360000 -119900 (600000 games with a draw ratio of 60%) took 36 ms more less (LOS ~ 65.85%). This value should be a great approximation due to the central limit theorem. Anyway, I am not specially gifted in Statistics, so please take my results with care.

I do not know if it is useful to you. Anyway, you can download my tools and try LOS_andElo_uncertainties_calculator (by the way, it is open source of Fortran 95). I hope I will not fool you with nonsense.

Regards from Spain.

Ajedrecista.

Very interesting, thanks Jesus. I will definitely take a look at the LOS calculations including draws.

Laskos · Post by **Laskos** » Mon Feb 18, 2013 11:29 pm

gladius wrote:
Laskos wrote:
Laskos wrote:
gladius wrote:I'm currently using Remi's excellent WhoIsBest (http://www.talkchess.com/forum/viewtopi ... 82&t=30624) for quick estimation of LOS. However, it's O(n^2), and when implemented in python, can be a bit slow for large numbers of games.

This is also for self-testing, so a relatively high draw rate. Any suggestions on a faster algorithm? I can't easily plug in bayeselo, or would definitely be using that .
I am using
LOS = (Sum_over_i [Binomial[wins + losses, i], {i from 0 to wins - 1}] + 0.5*Binomial[wins + losses, wins]) / 2^(wins + losses)

Rule of thumb for LOS of 95%: (wins-losses)/sqrt(wins+losses)=1.75
For 99% 2.20 or so instead of 1.75.
Another rule of thumb:

LOS = Erf{0.85*(wins-losses)/sqrt(wins+losses)}
Ah, perfect! erf is built into python, so that's nice and fast .

Thanks!

Correction (ans a pretty accurate one):

LOS=0.5*[1+ Erf{0.707*(wins-losses)/sqrt(wins+losses)}]

It works pretty well.

gladius · Post by **gladius** » Mon Feb 18, 2013 11:50 pm

Laskos wrote:Correction (ans a pretty accurate one):

LOS=0.5*[1+ Erf{0.707*(wins-losses)/sqrt(wins+losses)}]

It works pretty well.

Great, thanks

. Yes, I had noticed it was giving results between -1..1 so I had applied the 0.5 * 1 + erf() fix already.

Why 0.707 though? For sqrt(2)/2?

bob · Post by **bob** » Tue Feb 19, 2013 12:09 am

gladius wrote:I'm currently using Remi's excellent WhoIsBest (http://www.talkchess.com/forum/viewtopi ... 82&t=30624) for quick estimation of LOS. However, it's O(n^2), and when implemented in python, can be a bit slow for large numbers of games.

This is also for self-testing, so a relatively high draw rate. Any suggestions on a faster algorithm? I can't easily plug in bayeselo, or would definitely be using that .

BayesElo ought to be easy enough. You can execute it and send the output to a file, then read the file in to your python program and do with the info as you see fit. I do this in my cluster testing, although I simply use a c-shell script generally.

gladius · Post by **gladius** » Tue Feb 19, 2013 12:13 am

bob wrote:
gladius wrote:I'm currently using Remi's excellent WhoIsBest (http://www.talkchess.com/forum/viewtopi ... 82&t=30624) for quick estimation of LOS. However, it's O(n^2), and when implemented in python, can be a bit slow for large numbers of games.

This is also for self-testing, so a relatively high draw rate. Any suggestions on a faster algorithm? I can't easily plug in bayeselo, or would definitely be using that .
BayesElo ought to be easy enough. You can execute it and send the output to a file, then read the file in to your python program and do with the info as you see fit. I do this in my cluster testing, although I simply use a c-shell script generally.

This is for presenting output on a website, so I don't really want to be running an external program if I can avoid it. Also, I don't have a PGN readily available.

Anyhow, Kai's formula works great

.

Fast LOS estimation

Fast LOS estimation

Re: Fast LOS estimation.

Re: Fast LOS estimation

Re: Fast LOS estimation

Re: Fast LOS estimation

Re: Fast LOS estimation.

Re: Fast LOS estimation

Re: Fast LOS estimation

Re: Fast LOS estimation

Re: Fast LOS estimation