What is New in Toga II 1.3x4?

mclane · Post by **mclane** » Wed Apr 11, 2007 9:22 pm

when the blueberry setting stabilized as weaker than the default, i stopped testing.

Terry McCracken · Post by **Terry McCracken** » Wed Apr 11, 2007 9:26 pm

mclane wrote: when the results showed that the gap between the 2 programs was too big, i threw out the redundant older versions to get more data from the more important ones.
the behaviour in those tournaments is so consistent that you can reproduce it at any time.

It's not much stronger, Thorsten; it was a mistake to discard it.

Terry McCracken · Post by **Terry McCracken** » Wed Apr 11, 2007 9:27 pm

mclane wrote: when the blueberry setting stabilized as weaker than the default, i stopped testing.

How many games did you test this setting with?

Eelco de Groot · Post by **Eelco de Groot** » Wed Apr 11, 2007 9:42 pm

mclane wrote: when the results showed that the gap between the 2 programs was too big, i threw out the redundant older versions to get more data from the more important ones.
the behaviour in those tournaments is so consistent that you can reproduce it at any time.

Okay Thorsten, I take it you refer mainly to tournament c here?

But in tournament c, does that mean that Toga 1.2.1a played no more than 20 games, scoring 12½/20, with Toga 1.3x4 scoring 15/20? http://f50.parsimony.net/forum200336/messages/8014.htm
I do think that was a bit early to conclude that Toga 1.2.1a's result in that tournament was redundant... And if you can really reproduce this exactly at any time, then I just don't understand the statistics of Elo ratings at all anymore. I'm not saying it's not an accurate result, but there is still supposed to be a lot of statistical noise in it, and the result could easily have been the other way round, even if Toga 1.3x4 really is stronger.
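
To put rough numbers on the statistical noise in a 20-game sample, here is a minimal Python sketch. It assumes the standard logistic Elo formula, treats the games as independent, and uses a normal approximation that ignores draws (which makes the interval somewhat too wide). The helper names `elo_from_score` and `score_interval` are illustrative and not taken from any rating tool, and the Elo values are relative to the average opponent in the field.

```python
import math

def elo_from_score(score_fraction):
    """Elo difference implied by a score fraction under the logistic Elo model."""
    return -400.0 * math.log10(1.0 / score_fraction - 1.0)

def score_interval(points, games, z=1.96):
    """Approximate 95% interval on the score fraction.

    Assumption: every game is an independent win/loss trial. Draws reduce the
    real variance, so this interval errs on the wide side.
    """
    p = points / games
    half_width = z * math.sqrt(p * (1.0 - p) / games)
    # Clamp so the Elo formula never sees 0 or 1.
    return max(p - half_width, 0.001), min(p + half_width, 0.999)

for name, points, games in [("Toga II 1.2.1a", 12.5, 20), ("Toga II 1.3x4", 15.0, 20)]:
    low, high = score_interval(points, games)
    print(f"{name}: {elo_from_score(points / games):+.0f} Elo "
          f"(roughly {elo_from_score(low):+.0f} to {elo_from_score(high):+.0f})")
```

Under these assumptions the two 20-game intervals overlap substantially, so a 2½-point gap over 20 games does not by itself separate the two versions.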

I still don't understand the result of Tournament k3...

Regards, Eelco

Terry McCracken · Post by **Terry McCracken** » Wed Apr 11, 2007 9:44 pm

Eelco de Groot wrote:
mclane wrote: when the results showed that the gap between the 2 programs was too big, i threw out the redundant older versions to get more data from the more important ones.
the behaviour in those tournaments is so consistent that you can reproduce it at any time.
Okay Thorsten, I take it you refer mainly to tournament c here?

But in tournament c, does that mean that Toga 1.2.1a played no more than 20 games, scoring 12½/20, with Toga 1.3x4 scoring 15/20? http://f50.parsimony.net/forum200336/messages/8014.htm
I do think that was a bit early to conclude that Toga 1.2.1a's result in that tournament was redundant... And if you can really reproduce this exactly at any time, then I just don't understand the statistics of Elo ratings at all anymore. I'm not saying it's not an accurate result, but there is still supposed to be a lot of statistical noise in it, and the result could easily have been the other way round, even if Toga 1.3x4 really is stronger.

I still don't understand the result of Tournament k3...

Regards, Eelco

Well said, Eelco!

Best,
Terry

mclane · Post by **mclane** » Wed Apr 11, 2007 10:34 pm

no, it was not and is not too early. it can be reproduced at any time.

e.g. here

Code:

    Engine                     Points RyToHiPxFrLoSpPyGlthCoToNoPr    S-B
01: Rybka v2.3.lk.x64          8,5/9  ·     1 1 1 1 =   1   1 1 1   30,25
02: TogaII1.3x4 [default]      7,5/9    ·   = = 1 1 1 1     1 = 1   26,00
03: Hiarcs11.1UCI              7,5/9      · =     0 1 1 1 1 1 1 1   19,00
04: program x                  6,0/9  0 = = · = =     1   1   1 1   20,00
05: Fruit-061115a              6,0/9  0 =   = ·     1 = 1 =   1 1   18,75
06: LoopMP 12.32               5,0/9  0 0   =   ·   = = 1 1 = 1     13,50
07: Spike1.2                   4,5/8  0 0 1       · = = = 1 1       16,25
08: program y                  4,5/9  = 0 0   0 = = ·   1 1 1       15,50
09: Glaurung121-EM64T          3,0/8    0 0 0 = = =   ·       = 1    9,25
10: theBaron1.8.1Uci           3,0/9  0   0   0 0 = 0   · 1 =   1    6,00
11: Colossus2006f              2,0/9      0 0 = 0 0 0   0 · =   1    4,75
12: TogaII1.2.1a               1,5/8  0 0 0     = 0 0   = = ·        5,00
13: Now0704                    1,0/8  0 = 0 0 0 0     =       · 0    5,25
14: ProDeo1.2                  1,0/9  0 0 0 0 0       0 0 0   1 ·    1,00

61 of 182 games played

here

Code:

    Engine                     Points HiRyToGlLoChNaToRuNoBaChCoPrZaNo    S-B
01: Hiarcs11.1UCI              13,5/15 · = 1 1 1 = 1 = 1 1 1 1 1 1 1 1   92,00
02: Rybka v2.3.lk.w32          11,0/15 = · 0 1 = 0 0 1 1 1 1 1 1 1 1 1   69,00
03: TogaII1.3x4 [default]      10,5/15 0 1 · 1 0 1 1 = 0 1 = 1 = 1 1 1   68,50
04: Glaurung1.2.1-32bit        10,0/15 0 0 0 · = = 1 1 1 1 = 1 = 1 1 1   60,25
05: LoopMP 12.32               9,5/15  0 = 1 = · = = = = = 1 = 1 1 = 1   61,50
06: program x                  9,5/15  = 1 0 = = · 0 0 0 1 1 1 1 1 1 1   58,00
07: Naum 2.0                   8,5/15  0 1 0 0 = 1 · 1 = = 1 0 1 0 1 1   54,75
08: TogaII 1.2.1a              8,5/15  = 0 = 0 = 1 0 · 1 = = = = 1 1 1   53,75
09: Ruffian2.1.0               8,5/15  0 0 1 0 = 1 = 0 · = = 1 = 1 1 1   51,00
10: Now0704                    6,5/15  0 0 0 0 = 0 = = = · 1 1 0 1 1 =   36,75
11: Baron1.8.1.Uci             6,0/15  0 0 = = 0 0 0 = = 0 · 1 1 0 1 1   32,25
12: program y                  6,0/15  0 0 0 0 = 0 1 = 0 0 0 · 1 1 1 1   29,50
13: Colossus2006f              4,5/15  0 0 = = 0 0 0 = = 1 0 0 · = 0 1   28,00
14: ProDeo 1.2                 4,5/15  0 0 0 0 0 0 1 0 0 0 1 0 = · 1 1   19,75
15: Zarkov 4.5t                2,5/15  0 0 0 0 = 0 0 0 0 0 0 0 1 0 · 1    9,75
16: Now0604                    0,5/15  0 0 0 0 0 0 0 0 0 = 0 0 0 0 0 ·    3,25

120 of 240 games played

etc.

the faster the machine, or the more computation time, the bigger the gap.
togaII 1.2.1a is overtaken by togaII 1.3x4

mclane · Post by **mclane** » Wed Apr 11, 2007 10:43 pm

And if you can really exactly reproduce this any time, then I just don't understand the statistics of Elo ratings anymore at all.

?
what is the problem ?

program A is stronger than B.
you make a tournament. and A gets more points than B.

you play another tournament.

A gets more points than B.

you play another tournament.

A gets more points than B.

what is the thing you don't understand ?

In ALL my tournaments toga1.3 gets more points than 1.2.1a.
NO exception. of course the gap between the results differs, depending on the rounds, the time control, the opponents and the hardware.
but the difference between the programs is big enough that in all tournaments the better program is able to get a higher result.

that's what my results show.
if i had seen ONE exception, i would not be so sure.
if i had seen 2 exceptions, i would not be sure.
but there was none so far.
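
One way to weigh the "no exception" argument is a simple sign test: if the two versions were actually equal in strength, each tournament would be roughly a coin flip as to which one finishes ahead. The sketch below is only illustrative. It assumes the tournaments are independent and ignores ties, the function name `prob_ahead_every_time` is made up for this example, and the tournament counts are hypothetical because the thread does not say how many were played.

```python
def prob_ahead_every_time(num_tournaments, p_ahead=0.5):
    """Chance that one engine finishes ahead in every tournament.

    Assumption: tournaments are independent and, if the engines were equally
    strong, each would finish ahead with probability p_ahead = 0.5 (ties ignored).
    """
    return p_ahead ** num_tournaments

# Hypothetical tournament counts; the thread does not state the real number.
for n in (3, 5, 8):
    print(f"{n} tournaments: {prob_ahead_every_time(n):.4f}")
```

The more tournaments without an exception, the less plausible it is that the versions are equal, but this says nothing about how large the gap is.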

since both programs use performance.bin as their opening book, it cannot be a matter of the book either.

please - which thing is the thing you don't understand ?

mclane · Post by **mclane** » Wed Apr 11, 2007 10:59 pm

i have some more games for the tournament:

Code:

    Engine                     Points RyToHiPxFrSpLoPyGlTothCoNoPr    S-B
01: Rybka v2.3.lk.x64          10,0/11 ·   = 1 1 1 1 =   1 1 1 1 1   45,25
02: TogaII1.3x4 [default]      9,0/11    ·   = = 1 1 1 1 1 1 = = 1   39,00
03: Hiarcs11.1UCI              9,0/11  =   · = 1 0   1 1 1 1 1 1 1   36,00
04: Program x                  8,0/11  0 = = · = 1 =   1 1   1 1 1   34,00
05: Fruit-061115a              6,5/11  0 = 0 = ·   = 1 =   1 = 1 1   25,75
06: Spike1.2                   6,5/11  0 0 1 0   ·   = = 1 = 1 1 1   23,00
07: LoopMP 12.32               6,5/11  0 0   = =   · = = = 1 1 1 1   21,25
08: Program y                  5,5/10  = 0 0   0 = = · 1 1 1 1       24,50
09: Glaurung121-EM64T          4,5/11    0 0 0 = = = 0 · = 1   = 1   15,75
10: TogaII1.2.1a               3,0/11  0 0 0 0   0 = 0 = · = = 1      9,25
11: theBaron1.8.1Uci           3,0/11  0 0 0   0 = 0 0 0 = · 1   1    8,25
12: Colossus2006f              2,5/11  0 = 0 0 = 0 0 0   = 0 ·   1   10,25
13: Now0704                    1,0/10  0 = 0 0 0 0 0   = 0     · 0    6,75
14: ProDeo1.2                  1,0/11  0 0 0 0 0 0 0   0   0 0 1 ·    1,00

76 of 182 games played

togaII1.3x4 is in the leading group of programs with 9 out of 11.
togaII1.2.1a has 3 out of 11.

for me the important question is not why this is a pattern/behaviour i get in all of my tournaments on all machines, for me the major question is why the elo-list guys cannot reproduce this.

Graham Banks · Post by **Graham Banks** » Wed Apr 11, 2007 11:21 pm

mclane wrote: for me the important question is not why this is a pattern/behaviour i get in all of my tournaments on all machines, for me the major question is why the elo-list guys cannot reproduce this.

Are you sure that you have Toga II 1.2.1a set up correctly, Thorsten?

Both CCRL and CEGT have played thousands of games with Toga II 1.2.1a at a range of time controls and your results seem to be out of character.

Of course, it could just be bad luck.

Regards, Graham.

mclane · Post by **mclane** » Wed Apr 11, 2007 11:26 pm

of course. you forget that this result can be reproduced on several of my machines.
different hardware. different arena versions. different time controls. IMO it is not the result of toga1.2.1a that is out of order; the program plays as it has always played. over time it was overtaken by other programs. togaII1.3x4 is astonishing since it is fighting almost in the top group, with hiarcs and rybka, among the programs ranked 1-3.

i did all this testing a month or so ago. later i tested blueberry and found it weaker than default.
i stopped testing 1.2.1a.
IMO it does not make much sense to keep testing it when a later, stronger version can replace it.

only because some people asked i did some more tournaments with both versions participating. with the "usual" result.
IMO the gap between the 2 toga versions is more than 10 or 20 elo.

while toga1.2.1a is more in the region of hiarcs x50 hyp.
toga1.3x4 is more in the region of hiarcs11.1.

this means togaII1.3x4 is at least 40 Elo points stronger than toga1.2.1a.
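
For a sense of scale, the gaps mentioned here (10, 20 and 40 Elo) can be converted into expected scores with the standard logistic Elo formula. This is a minimal sketch, not the method used by any particular rating list; the function name `expected_score` is illustrative.

```python
def expected_score(elo_diff):
    """Expected score fraction for the stronger side under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

# Rating gaps discussed in the post above.
for diff in (10, 20, 40):
    share = expected_score(diff)
    print(f"{diff:+d} Elo: about {100 * share:.1f} points per 100 games")
```

A 40-point edge corresponds to a score of roughly 56% in a direct match, which is a small margin to detect in tournaments of 10 to 20 games per engine.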
