Error margin estimation when no ELO info available

Kempelen · Post by **Kempelen** » Thu Apr 16, 2009 10:38 am

Hi,

I need a little help with error margin estimation. Right now I use this table to estimate the error margin between tournament results:

Nº Games   ELO error (approximate)
--------   ----------
96             +-60
200            +-42
400            +-30
800            +-20
1600           +-15
3200           +-11

I don't remember where I got this table.

The problem I have now is that, due to various circumstances, I don't know the ELO of the opponents, and I don't even know the ELO of the engine I want to test.

If I run a gauntlet tournament and then repeat the same tournament, what error margin in score points could I expect? What I am asking for is the last column of the following table, which I have filled in only as an example:

Nº Games      Score 1st. tourn.    Score 2nd. tourn.
----------    -------------------  --------------------
96                   50 points        +/- 50 points 
200                  50 points        +/- 25 points
400                  50 points        +/- 12 points
800                  50 points        +/- 6 points   
1600                 50 points        +/- 3 points
3200                 50 points        +/- 1 point

Thanks,

FS

pedrox · Post by **pedrox** » Thu Apr 16, 2009 12:34 pm

Hi Fermin,

You might find the following table useful.

CONVERSION TABLE FROM RATING DIFFERENCE (Dif) TO EXPECTED SCORE (Pe)

Dif     Pe          Dif      Pe          Dif      Pe          Dif       Pe
0-3     .50 .50     92-98    .63 .37     198-206  .76 .24     345-357   .89 .11
4-10    .51 .49     99-106   .64 .36     207-215  .77 .23     358-374   .90 .10
11-17   .52 .48     107-113  .65 .35     216-225  .78 .22     375-391   .91 .09
18-25   .53 .47     114-121  .66 .34     226-235  .79 .21     392-411   .92 .08
26-32   .54 .46     122-129  .67 .33     236-245  .80 .20     412-432   .93 .07
33-39   .55 .45     130-137  .68 .32     246-256  .81 .19     433-456   .94 .06
40-46   .56 .44     138-145  .69 .31     257-267  .82 .18     457-484   .95 .05
47-53   .57 .43     146-153  .70 .30     268-278  .83 .17     485-517   .96 .04
54-61   .58 .42     154-162  .71 .29     279-290  .84 .16     518-559   .97 .03
62-68   .59 .41     163-170  .72 .28     291-302  .85 .15     560-619   .98 .02
69-76   .60 .40     171-179  .73 .27     303-315  .86 .14     620-735   .99 .01
77-83   .61 .39     180-188  .74 .26     316-328  .87 .13     over 735  1.0 .00
84-91   .62 .38     189-197  .75 .25     329-344  .88 .12

Example.

With 96 games, the error can be 60 ELO points. Looking this value up in the table (row 54-61), we see that one player would be expected to score 58% and the other 42%.
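If you would rather compute the conversion than look it up, here is a minimal Python sketch. It assumes the common logistic Elo curve, which is an approximation and not necessarily the formula the table above was generated from, so values can differ from the table by about a percentage point. The function names `expected_score` and `elo_diff_from_score` are made up for this illustration.

```python
import math


def expected_score(elo_diff: float) -> float:
    """Expected score of the higher-rated side (logistic Elo approximation)."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def elo_diff_from_score(score: float) -> float:
    """Inverse conversion: ELO difference implied by a score strictly between 0 and 1."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)


# A 60-point difference gives roughly 0.585, close to the .58 in the table row 54-61.
print(round(expected_score(60), 3))

# Going the other way: a 58% score corresponds to a difference in the mid-50s.
print(round(elo_diff_from_score(0.58), 1))
```

The inverse function is handy when you only have scores, as in the original question, and want to express a score gap as an approximate ELO gap.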

Pedro

Kempelen · Post by **Kempelen** » Thu Apr 16, 2009 12:56 pm

I see the point. So, when repeating a tournament, the score of the second tournament will vary from the first within the following error margin (in percentage points):

Nº Games      ELO Error    Score Points Error
----------    ----------   --------------------
96            +-60         +/- 8% 
200           +-42         +/- 6%
400           +-30         +/- 4%
800           +-20         +/- 3%
1600          +-15         +/- 2%
3200          +-11         +/- 1%

Sven · Post by **Sven** » Thu Apr 16, 2009 3:37 pm

Isn't this also dependent on the strength difference between the two opponents, or in case no ELO strength is available a priori, on the closeness of the scores?

Sven

MattieShoes · Post by **MattieShoes** » Mon May 04, 2009 7:34 pm

Dr. Muller recently gave a rule of thumb for comparing gauntlet results....

For 95% error bars, around the score, the error bars would be
~78.4% / sqrt(games)

For 96 games, that'd be around 8%
So if the engine scores 50%, you're 95% sure it should be between 42% and 58%

For comparing two gauntlet results...
sqrt(errA^2 + errB^2) will give how far apart they need to be. In this case, it'd be about 11% apart, so 44% for one and 56% for the other would be good enough.

...

Which would be around 42 points different I think.

For fast games though, ~86%/sqrt(games) would probably be better, since draws are less frequent in fast games.
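The arithmetic above can be sketched in a few lines of Python. The helper names `score_error_95` and `min_gap_95` are hypothetical, and the factors 0.784 and 0.86 are simply the rule-of-thumb constants quoted in this post. As a side observation, 0.784 equals 1.96 × 0.4, which would correspond to a per-game standard deviation of 0.4, but that reading is an assumption, not something stated here.

```python
import math


def score_error_95(games: int, factor: float = 0.784) -> float:
    """Rule-of-thumb 95% error bar on the score fraction: factor / sqrt(games)."""
    return factor / math.sqrt(games)


def min_gap_95(err_a: float, err_b: float) -> float:
    """Gap needed between two gauntlet scores: sqrt(errA^2 + errB^2)."""
    return math.hypot(err_a, err_b)


err = score_error_95(96)      # about 0.08, i.e. +/- 8%
gap = min_gap_95(err, err)    # about 0.11, i.e. scores must differ by ~11%
print(f"error bar: {err:.3f}, required gap: {gap:.3f}")

# Same 96 games with the factor suggested for fast games.
print(f"fast games: {score_error_95(96, factor=0.86):.3f}")
```

Running it for the game counts in the first post (96 to 3200) gives the score-error column directly, without needing any ELO ratings.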

Laskos · Post by **Laskos** » Tue May 05, 2009 12:45 am

Here you are talking about two error bars, and 0.784 and 0.86 are not percentages, just factors. For one error bar (68% confidence) the approximate formula (for normal distribution) is

sqrt(score*(1-score) - drawFraction/4) / sqrt(nrOfGames)

Two error bars are roughly 95% confidence. With very drawish matches or very large strength differences one has to be careful even with this formula, as the distribution will be far from normal. But, generally, it is a very good rule of thumb formula.
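As a rough illustration, the formula above translates to Python as follows. The function name `score_std_error` and the example draw rate of 36% are assumptions chosen for the demonstration, not values from this thread. The formula follows from the per-game variance of a win/draw/loss result, and the sketch inherits the normal-approximation caveats mentioned above.

```python
import math


def score_std_error(score: float, draw_fraction: float, games: int) -> float:
    """One error bar (about 68% confidence) on the score fraction.

    Implements sqrt(score*(1-score) - drawFraction/4) / sqrt(nrOfGames).
    """
    variance_per_game = score * (1.0 - score) - draw_fraction / 4.0
    if variance_per_game < 0.0:
        # The score and draw fraction cannot both occur in the same match.
        raise ValueError("inconsistent score and draw fraction")
    return math.sqrt(variance_per_game / games)


# Hypothetical example: 96 games, 50% score, 36% draws.
one_bar = score_std_error(0.5, 0.36, 96)
print(f"one bar: {one_bar:.4f}, two bars: {2 * one_bar:.4f}")
```

With these example inputs, two error bars come out at about 8%, in line with the ~78.4%/sqrt(games) rule of thumb for 96 games. A lower draw fraction or a score closer to 50% widens the bars, which is why the margin depends on the results themselves and not only on the number of games.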

Kai
