Error bar calculations - help!


asanjuan
Posts: 214
Joined: Thu Sep 01, 2011 5:38 pm
Location: Seville, Spain

Re: Error bar calculations - help!

Post by asanjuan »

I always thought that the relevant point is that you choose the poorest individuals to die,
and choose among the best to reproduce.
This way we improve the average strength of the population.
Well, there are tons of different GAs. It depends on how you want your population to be, but the idea of creating new generations is the basis.
Basically, what we do in a GA is:

- Create a population of N individuals. Each individual is a set of integer parameters, or a sequence of bits.
- Evaluate the population. Give points to each individual to find out "who are the best" with what we call a "fitness function" (in our case, a tournament).
- Reproduce the best individuals with others chosen randomly until you get a set of N new individuals. There are lots of variations: passing the best on unchanged (this is elitism), using a "roulette" method... there is a lot of literature.
- In a small percentage of cases, change something in one of the individuals, which produces a mutation.
- Repeat this until you get bored.

At the end, you will get a set of parameters that "maximizes" (with a high probability, because you will never be sure) the points gained for your fitness function.
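The steps above can be sketched in a few lines of Python. This is only a toy illustration, not Rhetoric's tuner: the function name `evolve`, the parameter range, the mutation rate and the toy fitness function are all assumptions made for the example, and a real chess fitness function would be a tournament or a test suite.

```python
import random

def evolve(fitness, n_params, pop_size=20, generations=50,
           mutation_rate=0.05, low=0, high=100, seed=0):
    """Toy genetic algorithm over lists of integer parameters (illustrative only)."""
    rng = random.Random(seed)
    # Create a population of N individuals, each a list of integers.
    population = [[rng.randint(low, high) for _ in range(n_params)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Evaluate: rank the individuals with the fitness function, best first.
        ranked = sorted(population, key=fitness, reverse=True)
        best_half = ranked[:pop_size // 2]
        # Elitism: the single best individual is passed on unchanged.
        children = [list(ranked[0])]
        while len(children) < pop_size:
            # Reproduce one of the best with any individual chosen at random.
            a = rng.choice(best_half)
            b = rng.choice(population)
            # Uniform crossover: each parameter comes from one of the two parents.
            child = [rng.choice(pair) for pair in zip(a, b)]
            # Mutation: occasionally replace a parameter with a random value.
            child = [rng.randint(low, high) if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = children
    return max(population, key=fitness)

# Hypothetical fitness: the closer to a hidden target vector, the better.
target = [10, 30, 50, 90]

def toy_fitness(individual):
    return -sum(abs(g - t) for g, t in zip(individual, target))

best = evolve(toy_fitness, n_params=4)
print(best, toy_fitness(best))
```

Because the best individual is always copied into the next generation, the best fitness in this sketch can never get worse from one generation to the next; without elitism that guarantee disappears.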

For reproduction, there are several methods: the best known are crossover and uniform crossover, but you can create your own.

For Rhetoric I created a reproduction function that, given two parents A and B, produced 3 children: one that looks more like parent A (75%), another more similar to B, and a third that was (A+B)/2. My fitness function was the number of hits in an EPD test using a depth-1 search plus quiescence (a test with 20,000 positions extracted from games played by GM Anatoly Karpov). I called this test "The Karpov test".
I'm proud to say that Rhetoric gets a hit rate of 33.9% in this test, which is very high for a 1-ply search, and is higher than the 32% reached in the literature.
My tests (not very exact, I must say) suggest that each time you gain 1% in hit rate on this test, you gain 100 Elo in 4-ply tournaments. But this is not an established fact, only my hypothesis, not an exact figure. This is one of the reasons I need to know how to measure the real Elo gain.
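As a rough illustration of the three-child reproduction described above, here is one possible reading in Python. It is an assumption that "75%" means each parameter is taken from parent A with probability 0.75; it could equally mean a weighted blend of the two parents. The function name `reproduce` and the parameter values are made up for the example.

```python
import random

def reproduce(a, b, bias=0.75, rng=random):
    """Return three children of the integer-parameter parents a and b."""
    # Child 1: each parameter comes from A with probability `bias`, else from B.
    like_a = [x if rng.random() < bias else y for x, y in zip(a, b)]
    # Child 2: the mirror image, favouring B with the same probability.
    like_b = [y if rng.random() < bias else x for x, y in zip(a, b)]
    # Child 3: the parameter-wise average (A + B) / 2, using integer division.
    average = [(x + y) // 2 for x, y in zip(a, b)]
    return like_a, like_b, average

# Hypothetical parameter sets (for instance, piece values in centipawns).
parent_a = [100, 300, 500, 900]
parent_b = [120, 320, 480, 1000]
for child in reproduce(parent_a, parent_b):
    print(child)
```

The averaged child is the only one that can carry parameter values present in neither parent, so it explores between the two parents rather than just recombining them.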
Is this true? Or is only the correct choice of the best individuals relevant?
Absolutely NOT. Mathematically you are doing a search in an N-dimensional space, looking for a local or global maximum, so the information in the "DNA" of every individual is important. If you discard every "bad" individual, you can lose important information (directions in your search space), e.g. the value of a queen.
This is why it is important to reproduce the best with the rest at random, and to introduce mutations to recover lost parameters.

But I'm not an expert. This was my first GA!!
Still learning how to play chess...
knights move in "L" shape, right?
Ajedrecista
Posts: 1975
Joined: Wed Jul 13, 2011 9:04 pm
Location: Madrid, Spain.

Another programme for Windows: Minimum_score_for_no_regression

Post by Ajedrecista »

Hello again:
The second link provided by Fermín gave me an idea yesterday: to write another simple programme in Fortran 95, using the same formula for the standard deviation as in Elo_uncertainties_calculator, but now with a different purpose: finding the minimum score where:

Code: Select all

mu - k*sigma = 0.5 (In the case of a match between two engines).
Here mu is the score of an engine, sigma is the standard deviation of that score over the match, and k is the number of standard deviations corresponding to the chosen confidence level (~ 1.96 for 95% confidence, ~ 2.5758 for 99% confidence and so on).

I have reproduced Joseph Ciarrochi's table to see whether my results are similar to his... my results are more convincing with a high number of games than with a low number:

Code: Select all

Draw ratio = 32%; the minimum scores are rounded to the nearest 0.01%:

Games     Cutoff = 5%     Cutoff = 1%     Cutoff = 0.1%
   10        71.72%          76.04%           N/A
   20        66.55%          70.58%          74.44%
   30        63.89%          67.55%          71.23%
   40        62.20%          65.55%          69.03%
   50        61.01%          64.11%          67.40%
   75        59.10%          61.75%          64.64%
  100        57.93%          60.28%          62.89%
  150        56.52%          58.49%          60.70%
  200        55.66%          57.39%          59.34%
  300        54.64%          56.06%          57.70%
  500        53.60%          54.72%          56.00%
 1000        52.55%          53.35%          54.27%
N/A means not available, because the mathematical model I use in this programme fails for such a low number of games (I also think that the cutoff is very small given only ten games). Here: cutoff = 1 - (confidence level).

I used Newton's method to solve the equation given in the first code box. I learnt this method three years ago and have not used it since; fortunately, I remembered the basics: I took pencil and paper and got the correct formula in two or three minutes without hurrying... well, I still have a relatively good memory for some things. I now believe that my implementation in the programme is correct, since my results are similar to Ciarrochi's.
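For readers who prefer Python, the Newton iteration can be sketched as follows. This is a re-expression under assumptions, not the programme itself: it takes the draw ratio as a fraction rather than a percentage, and the function names `sigma` and `minimum_score` are mine. The standard deviation formula and the eleven iterations starting from 0.5 follow the Fortran listing further down.

```python
import math

def sigma(mu, n, draw_ratio):
    """Standard deviation of the match score; draw_ratio is a fraction (0.32 = 32%)."""
    return math.sqrt((mu * (1.0 - mu) - draw_ratio / 4.0) / n)

def minimum_score(n, draw_ratio, k, iterations=11):
    """Solve mu - k*sigma(mu) = 0.5 for mu with Newton's method, starting at 0.5."""
    if n <= 0 or k <= 0 or not 0.0 <= draw_ratio < 1.0:
        raise ValueError("need n > 0, k > 0 and 0 <= draw_ratio < 1")
    mu = 0.5
    for _ in range(iterations):
        s = sigma(mu, n, draw_ratio)
        f = mu - k * s - 0.5
        # Derivative of f, using d(sigma)/d(mu) = (1 - 2*mu) / (2 * n * sigma).
        df = 1.0 - k * (1.0 - 2.0 * mu) / (2.0 * n * s)
        mu -= f / df
        # With a fraction D of draws, the score must lie between D/2 and 1 - D/2.
        if not draw_ratio / 2.0 <= mu <= 1.0 - draw_ratio / 2.0:
            raise ValueError("Newton iterate left the valid score range")
    return mu

print(100.0 * minimum_score(1000, 0.32, 1.96))
print(100.0 * minimum_score(2000, 0.40, 1.96))
```

In this sketch the case of 10 games with k = 3.2905 and a 32% draw ratio raises the error on the very first step: starting from 0.5, the iterate jumps to about 0.929, above the upper bound of 0.84, which is consistent with the N/A entry in the table.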

Using Minimum_score_for_no_regression (I know that I am the king of short names... :P) is very simple: double-click the executable and follow the instructions. Here is an example:

Code: Select all

 Minimum_score_for_no_regression, © 2012.

 Calculation of the minimum score for no regression in a match between two engines:

 Write down the number of games of the match (it must be a positive integer):

2000

 Write down the draw ratio (in percentage):

40

 Write down k (for making confidence intervals of mu +/- k*sigma in a normal distribution); k must be positive:

1.96

 Minimum score for no regresion:     51.695781943756155    %

 End of the calculations.

 Thanks for using Minimum_score_for_no_regression. Press Enter to exit.
The most annoying thing is k: one has to use a table for the normal distribution. I have used k ~ 1.96 for a cutoff of 5%, k ~ 2.5758 for a cutoff of 1% and k ~ 3.2905 for a cutoff of 0.1%.
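If Python is at hand, the table lookup can be avoided with the standard library. This is a small sketch that assumes the cutoff is split over both tails of the normal distribution, which is the convention that matches the three k values quoted above; `k_for_cutoff` is a name made up for the example.

```python
from statistics import NormalDist

def k_for_cutoff(cutoff):
    """Two-sided normal critical value for a given cutoff (0.05 means 5%)."""
    return NormalDist().inv_cdf(1.0 - cutoff / 2.0)

for cutoff in (0.05, 0.01, 0.001):
    print(f"cutoff = {cutoff:.1%}: k = {k_for_cutoff(cutoff):.4f}")
```

The resulting k can be typed directly into the programme when it asks for it.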

With the model I use, all those minimum scores vary with the draw ratio (as Ciarrochi stated). I expected the variations to be a little larger than they really are:

Code: Select all

n = 1000 games, cutoff = 5%; minimum scores are rounded to the nearest 0.01%:

Draw ratio = 25% ---> 52.68 %
Draw ratio = 30% ---> 52.59 %
Draw ratio = 35% ---> 52.49 %
Draw ratio = 40% ---> 52.40 %
Draw ratio = 45% ---> 52.29 %
Draw ratio = 50% ---> 52.19 %
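A side note on why the variation is so mild. With the standard deviation used in the listing further down, the equation mu - k*sigma = 0.5 can also be solved by hand. In the sketch below, D is the draw ratio as a fraction and x = mu - 1/2; both symbols are my notation, not the programme's.

```latex
\begin{aligned}
\sigma^2 &= \frac{\mu(1-\mu) - D/4}{n},
\qquad x = \mu - \tfrac{1}{2} \;\Rightarrow\; \mu(1-\mu) = \tfrac{1}{4} - x^2 \\
x^2 &= k^2\sigma^2 = \frac{k^2}{n}\left(\frac{1-D}{4} - x^2\right) \\
x^2\left(1 + \frac{k^2}{n}\right) &= \frac{k^2\,(1-D)}{4n} \\
\mu &= \frac{1}{2} + \frac{k}{2}\sqrt{\frac{1-D}{n+k^2}}
\end{aligned}
```

For n = 1000 and k = 1.96 this gives about 52.68% at D = 0.25 and about 52.19% at D = 0.50, in agreement with the table. The draw ratio enters only through the factor sqrt(1 - D), so doubling it from 25% to 50% shrinks the margin above 50% by less than a fifth.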
My goal with this programme is to give more freedom in choosing the number of games, the draw ratio and the cutoff. I hope that this programme will be somewhat useful to someone:

Minimum_scores_for_no_regression.rar (0.59 MB)

Here is the code, where you can see all the drawbacks of this tiny programme:

Code: Select all

program No_regression

implicit none

integer, parameter :: wp = kind(1d0)  ! Double precision, matching the d0 literals used below.
real(wp) :: mu(0:11), sigma(0:10), draw_ratio, k, f_de_mu(0:10), derivada(0:10), d_mu(0:10)
integer :: i, n

write(*,*)
write(*,*) 'Minimum_score_for_no_regression, © 2012.'
write(*,*)
write(*,*) 'Calculation of the minimum score for no regression in a match between two engines:'
write(*,*)
write(*,*) 'Write down the number of games of the match (it must be a positive integer):'
write(*,*)
read(*,*) n
write(*,*)

if (n <= 0) then
  write(*,*) 'Incorrect number of games.'
  write(*,*)
  write(*,*) 'Please close and try again. Press Enter to exit.'
  read(*,'()')
  stop
end if

write(*,*) 'Write down the draw ratio (in percentage):'
write(*,*)
read(*,*) draw_ratio
write(*,*)

if ((draw_ratio < 0d0) .or. (draw_ratio > 1d2))  then
  write(*,*) 'Incorrect draw ratio.'
  write(*,*)
  write(*,*) 'Please close and try again. Press Enter to exit.'
  read(*,'()')
  stop
end if

if (draw_ratio == 1d2) then
  write(*,*) 'The mathematical model used in Minimum_score_for_no_regression does not support a draw ratio of 100%'
  write(*,*)
  write(*,*) 'Please close and try again. Press Enter to exit.'
  read(*,'()')
  stop
end if

write(*,*) 'Write down k (for making confidence intervals of mu +/- k*sigma in a normal distribution); k must be positive:'
write(*,*)
read(*,*) k
write(*,*)

if (k <= 0d0)  then
  write(*,*) 'Incorrect value for k.'
  write(*,*)
  write(*,*) 'Please close and try again. Press Enter to exit.'
  read(*,'()')
  stop
end if

mu(0) = 5d-1  ! Starting value for the Newton iterations.

do i = 0, 10  ! Newton's method for solving a nonlinear equation in one unknown.
  sigma(i) = sqrt((mu(i)*(1d0 - mu(i)) - 2.5d-3*draw_ratio)/n)
  f_de_mu(i) = mu(i) - k*sigma(i) - 5d-1
  derivada(i) = 1d0 - 5d-1*(1d0 - 2d0*mu(i))*k/sqrt(n*(mu(i)*(1d0 - mu(i)) - 2.5d-3*draw_ratio))
  d_mu(i) = -f_de_mu(i)/derivada(i)
  mu(i+1) = mu(i) + d_mu(i)
  ! The score must lie between these two values.
  if ((mu(i+1) < 5d-3*draw_ratio) .or. (mu(i+1) > 1d0 - 5d-3*draw_ratio)) then
    write(*,*) 'The mathematical model used in Minimum_score_for_no_regression fails for at least one of these two reasons:'
    write(*,*)
    write(*,*) 'a) It does not support such a high draw ratio.'
    write(*,*) 'b) It does not support such a low number of games.'
    write(*,*)
    write(*,*) 'Please close and try again. Press Enter to exit.'
    read(*,'()')
    stop
  end if
end do

write(*,*) 'Minimum score for no regresion:', 1d2*mu(11), '%'
write(*,*)
write(*,*) 'End of the calculations.'
write(*,*)
write(*,*) 'Thanks for using Minimum_score_for_no_regression. Press Enter to exit.'
read(*,'()')

end program No_regression
Please treat the given results with care.

Regarding Elo_uncertainties_calculator, I have counted at least four downloads, which is a total success for me! Thank you very much.

Regards from Spain.

Ajedrecista.