What is New in Toga II 1.3x4?

Eelco de Groot
Posts: 4596
Joined: Sun Mar 12, 2006 2:40 am

Re: Some more data to the fire :-)

Post by Eelco de Groot »

mclane wrote: when the results showed that the gap between the 2 programs was too big, i throw out the redundant older versions to get more data from the more important ones.
the behaviour in those tournaments is so sure that you can reproduce it at any time.
Okay Thorsten, I take it you refer mainly to tournament c here?

But in tournament c does that mean that Toga 1.2.1a did play no more than 20 games, scoring 12½/20 and Toga 1.3x4 scoring 15/20? http://f50.parsimony.net/forum200336/messages/8014.htm
I do think that was a bit early then to conclude that Toga 1.2.1a played a redundant result in that tournament... And if you can really exactly reproduce this any time, then I just don't understand the statistics of Elo ratings anymore at all. I'm not saying it's not an accurate result, but there is still supposed to be a lot of statistical noise in that, and the result easily could have been the other way round, even if Toga 1.3x4 is really stronger.
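
To put a rough number on that noise, here is a minimal sketch. It assumes every game is an independent trial and uses 0.25 as a worst-case per-game variance (evenly matched, no draws); it also treats the two scores as independent, which is not strictly true for two engines in the same tournament. The function name is only illustrative.

```python
import math

def score_standard_error(games: int, per_game_variance: float = 0.25) -> float:
    """Standard error, in points, of a total score over `games` independent games.

    0.25 is the worst-case per-game variance (evenly matched, no draws);
    draws make the real figure smaller.
    """
    return math.sqrt(games * per_game_variance)

se_one = score_standard_error(20)      # noise on a single 20-game score
se_gap = math.sqrt(2) * se_one         # noise on the gap between two such scores (assumed independent)
gap = 15.0 - 12.5                      # scores quoted above for tournament c

print(f"gap = {gap:.1f} points, standard error of the gap = {se_gap:.2f} points")
```

Under these assumptions the 2.5-point gap is smaller than one standard error (about 3.2 points), so 20 games alone would not separate the two versions.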

I still don't understand the result of Tournament k3...

Regards, Eelco
Terry McCracken

Re: Some more data to the fire :-)

Post by Terry McCracken »

Eelco de Groot wrote:
mclane wrote: when the results showed that the gap between the 2 programs was too big, i throw out the redundant older versions to get more data from the more important ones.
the behaviour in those tournaments is so sure that you can reproduce it at any time.
Okay Thorsten, I take it you refer mainly to tournament c here?

But in tournament c does that mean that Toga 1.2.1a did play no more than 20 games, scoring 12½/20 and Toga 1.3x4 scoring 15/20? http://f50.parsimony.net/forum200336/messages/8014.htm
I do think that was a bit early then to conclude that Toga 1.2.1a played a redundant result in that tournament... And if you can really exactly reproduce this any time, then I just don't understand the statistics of Elo ratings anymore at all. I'm not saying it's not an accurate result, but there is still supposed to be a lot of statistical noise in that, and the result easily could have been the other way round, even if Toga 1.3x4 is really stronger.

I still don't understand the result of Tournament k3...

Regards, Eelco
Well said, Eelco!

Best,
Terry
mclane
Posts: 18775
Joined: Thu Mar 09, 2006 6:40 pm
Location: US of Europe, germany
Full name: Thorsten Czub

Re: Some more data to the fire :-)

Post by mclane »

no it was and is not early. it can be reproduced at any time.

e.g. here


    Engine                     Points RyToHiPxFrLoSpPyGlthCoToNoPr    S-B
01: Rybka v2.3.lk.x64          8,5/9  ·     1 1 1 1 =   1   1 1 1   30,25 
02: TogaII1.3x4 [default]      7,5/9    ·   = = 1 1 1 1     1 = 1   26,00 
03: Hiarcs11.1UCI              7,5/9      · =     0 1 1 1 1 1 1 1   19,00 
04: program x                  6,0/9  0 = = · = =     1   1   1 1   20,00 
05: Fruit-061115a              6,0/9  0 =   = ·     1 = 1 =   1 1   18,75 
06: LoopMP 12.32               5,0/9  0 0   =   ·   = = 1 1 = 1     13,50 
07: Spike1.2                   4,5/8  0 0 1       · = = = 1 1       16,25 
08: program y                  4,5/9  = 0 0   0 = = ·   1 1 1       15,50 
09: Glaurung121-EM64T          3,0/8    0 0 0 = = =   ·       = 1    9,25 
10: theBaron1.8.1Uci           3,0/9  0   0   0 0 = 0   · 1 =   1    6,00 
11: Colossus2006f              2,0/9      0 0 = 0 0 0   0 · =   1    4,75 
12: TogaII1.2.1a               1,5/8  0 0 0     = 0 0   = = ·        5,00 
13: Now0704                    1,0/8  0 = 0 0 0 0     =       · 0    5,25 
14: ProDeo1.2                  1,0/9  0 0 0 0 0       0 0 0   1 ·    1,00 

61 of 182 games played
here


    Engine                     Points HiRyToGlLoChNaToRuNoBaChCoPrZaNo    S-B
01: Hiarcs11.1UCI              13,5/15 · = 1 1 1 = 1 = 1 1 1 1 1 1 1 1   92,00 
02: Rybka v2.3.lk.w32          11,0/15 = · 0 1 = 0 0 1 1 1 1 1 1 1 1 1   69,00 
03: TogaII1.3x4 [default]      10,5/15 0 1 · 1 0 1 1 = 0 1 = 1 = 1 1 1   68,50 
04: Glaurung1.2.1-32bit        10,0/15 0 0 0 · = = 1 1 1 1 = 1 = 1 1 1   60,25 
05: LoopMP 12.32               9,5/15  0 = 1 = · = = = = = 1 = 1 1 = 1   61,50 
06: program x                  9,5/15  = 1 0 = = · 0 0 0 1 1 1 1 1 1 1   58,00 
07: Naum 2.0                   8,5/15  0 1 0 0 = 1 · 1 = = 1 0 1 0 1 1   54,75 
08: TogaII 1.2.1a              8,5/15  = 0 = 0 = 1 0 · 1 = = = = 1 1 1   53,75 
09: Ruffian2.1.0               8,5/15  0 0 1 0 = 1 = 0 · = = 1 = 1 1 1   51,00 
10: Now0704                    6,5/15  0 0 0 0 = 0 = = = · 1 1 0 1 1 =   36,75 
11: Baron1.8.1.Uci             6,0/15  0 0 = = 0 0 0 = = 0 · 1 1 0 1 1   32,25 
12: program y                  6,0/15  0 0 0 0 = 0 1 = 0 0 0 · 1 1 1 1   29,50 
13: Colossus2006f              4,5/15  0 0 = = 0 0 0 = = 1 0 0 · = 0 1   28,00 
14: ProDeo 1.2                 4,5/15  0 0 0 0 0 0 1 0 0 0 1 0 = · 1 1   19,75 
15: Zarkov 4.5t                2,5/15  0 0 0 0 = 0 0 0 0 0 0 0 1 0 · 1    9,75 
16: Now0604                    0,5/15  0 0 0 0 0 0 0 0 0 = 0 0 0 0 0 ·    3,25 

120 of 240 games played
etc.

the faster the machine, or the more computation time, the bigger the gap.
togaII 1.2.1a is overtaken by togaII 1.3x4
mclane

Re: Some more data to the fire :-)

Post by mclane »

And if you can really exactly reproduce this any time, then I just don't understand the statistics of Elo ratings anymore at all.
?
what is the problem ?

program A is stronger than B.
you make a tournament. and A gets more points than B.

you play another tournament.

A gets more points than B.

you play another tournament.

A gets more points than B.

what is the thing you don't understand ?

In ALL my tournaments toga1.3 gets more points than 1.2.1a.
NO exception. of course the gap between the results is different. depending on the rounds, the time control, the opponents and the hardware.
but the difference between the programs is big enough that in all tournaments the better program is capable to get a higher result.

thats what my results show.
if i would have had ONE exception, i would not be so sure.
if i would have had 2 exceptions, i would be not sure.
but there was none so far.
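
The "no exception so far" argument can be read as a simple sign test. A minimal sketch, assuming the tournaments are independent and ignoring ties; the tournament counts below are hypothetical, since the thread does not say how many were played:

```python
def prob_same_leader_every_time(tournaments: int) -> float:
    """One-sided sign test: chance that A finishes ahead of B in all
    `tournaments` independent events if both were equally strong
    (ties ignored)."""
    return 0.5 ** tournaments

for n in (3, 5, 8):   # hypothetical counts, not taken from the thread
    print(n, prob_same_leader_every_time(n))
```

A small probability here only says that A is probably the stronger one. It says nothing about the size of the gap, and tournaments that share machine, book and settings are not fully independent.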

since both programs use performance.bin as book, it can also not
be a matter of the book.

please - which thing is the thing you don't understand ?
mclane

Re: Some more data to the fire :-)

Post by mclane »

i have some more games for tournament:


    Engine                     Points RyToHiPxFrSpLoPyGlTothCoNoPr    S-B
01: Rybka v2.3.lk.x64          10,0/11 ·   = 1 1 1 1 =   1 1 1 1 1   45,25
02: TogaII1.3x4 [default]      9,0/11    ·   = = 1 1 1 1 1 1 = = 1   39,00
03: Hiarcs11.1UCI              9,0/11  =   · = 1 0   1 1 1 1 1 1 1   36,00
04: Program x                  8,0/11  0 = = · = 1 =   1 1   1 1 1   34,00
05: Fruit-061115a              6,5/11  0 = 0 = ·   = 1 =   1 = 1 1   25,75
06: Spike1.2                   6,5/11  0 0 1 0   ·   = = 1 = 1 1 1   23,00
07: LoopMP 12.32               6,5/11  0 0   = =   · = = = 1 1 1 1   21,25
08: Program y                  5,5/10  = 0 0   0 = = · 1 1 1 1       24,50
09: Glaurung121-EM64T          4,5/11    0 0 0 = = = 0 · = 1   = 1   15,75
10: TogaII1.2.1a               3,0/11  0 0 0 0   0 = 0 = · = = 1      9,25
11: theBaron1.8.1Uci           3,0/11  0 0 0   0 = 0 0 0 = · 1   1    8,25
12: Colossus2006f              2,5/11  0 = 0 0 = 0 0 0   = 0 ·   1   10,25
13: Now0704                    1,0/10  0 = 0 0 0 0 0   = 0     · 0    6,75
14: ProDeo1.2                  1,0/11  0 0 0 0 0 0 0   0   0 0 1 ·    1,00

76 of 182 games played
togaII1.3x4 in the leading group of programs with 9 from 11.
togaII1.2.1a 3 from 11.

for me the important question is not why this is a pattern/behaviour i get in all of my tournaments on all machines, for me the major question is why the elo-list guys cannot reproduce this.
Graham Banks
Posts: 41990
Joined: Sun Feb 26, 2006 10:52 am
Location: Auckland, NZ

Re: Some more data to the fire :-)

Post by Graham Banks »

mclane wrote: for me the important question is not why this is a pattern/behaviour i get in all of my tournaments on all machines, for me the major question is why the elo-list guys cannot reproduce this.
Are you sure that you have Toga II 1.2.1a set up correctly Thorsten?

Both CCRL and CEGT have played thousands of games with Toga II 1.2.1a at a range of time controls and your results seem to be out of character.

Of course, could just be bad luck.

Regards, Graham.
mclane

Re: Some more data to the fire :-)

Post by mclane »

of course. you forget that this result can be reproduced on several of my machines.
different hardware. different arena versions. different time controls. IMO the result of toga1.2.1a is not out of order; the program plays as it has always played. over time it was overtaken by other programs. togaII1.3x4 is astonishing since it is fighting almost in the top group, with hiarcs and rybka in the group of programs ranked 1-3.

i did all these testings 1 month or so ago. later i tested blueberry and found it weaker than default.
i stopped testing 1.2.1a.
it makes IMO not much sense to test when a later version and stronger program can replace it.

only because some people asked i did some more tournaments with both versions participating. with the "usual" result.
IMO the gap between the 2 toga versions is more than 10 or 20 elo.

while toga1.2.1a is more in the region of hiarcs x50 hyp.
toga1.3x4 is more in the region of hiarcs11.1.

this means at least 40 ELO points (that togaII1.3x4 is stronger than toga1.2.1a.)
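
For scale, here is what such Elo gaps mean per game under the standard logistic Elo formula. This is a sketch for a direct head-to-head only; the function name is illustrative.

```python
def expected_score(elo_diff: float) -> float:
    """Expected score per game (0..1) for the side rated `elo_diff` points higher,
    using the standard logistic Elo formula."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

for diff in (10, 20, 40):               # the gaps mentioned in the post
    print(f"{diff:>3} Elo -> {expected_score(diff):.3f} expected score per game")
```

A 40-point gap corresponds to an expected score of roughly 56% against the weaker version.
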
Last edited by mclane on Thu Apr 12, 2007 12:01 am, edited 3 times in total.
Shaun
Posts: 323
Joined: Wed Mar 08, 2006 9:55 pm
Location: Brighton - UK

Re: What is New in Toga II 1.3x4?

Post by Shaun »

Here is a view of the CCRL results just looking at Toga II 1.2.1a and Toga II 1.3x4 - I find this type of view very useful for side-by-side comparison.

CCRL 40/4

CCRL 40/12

The opponent overlap is not as good as I would like at 40/12 but I will be running more games.

Shaun
Denis P. Mendoza
Posts: 415
Joined: Fri Dec 15, 2006 9:46 pm
Location: Philippines

Re: What is New in Toga II 1.3x4?

Post by Denis P. Mendoza »

Considering the improved history and futility pruning, as well as other evaluation functions, Toga 1.3x4 (default) has improved slightly compared to Toga 1.2.1a (default). CEGT and CCRL results are just enough to prove the difference.

But I just remembered a test made by Gajendra Singh last year with Toga 1.2.1a BH: http://www.superchessengine.com/october.htm
HT=75 improved performance of Toga 1.2.1a BH. Even increasing HT to 80 is ok (but not more than this). In my experience, the Bryan Hoffman compile seems to perform better than the original (Toga 1.2.1a standard) released at UCIengines site.

I haven't done engine tests for quite some time due to defects on my PC. But I was lucky that my neighbor spared me some time to test how well the Toga 1.2.1a BH (HT=75) fares against the Toga 1.3x4 (original release). Since it's just a normal sparring of two brothers, and the PC is not mine, no opening books or egbb were used. I just set a 30-game, 40/4" time control for the bout. The result was:

Toga 1.2.1a BH (HT=75) 21.5/30 16 wins, 11 draws, 3 losses
Toga 1.3x4 (original) 8.5/30 3 wins, 11 draws, 16 losses

We just can't say 1.2.1a is just lucky!
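
A match result like this can be turned into an Elo difference with a rough error margin. The sketch below assumes independent games, the logistic Elo model and a normal approximation (z = 1.96 for a roughly 95% interval); the function names are illustrative, and the approximation breaks down if the interval reaches a score of 0 or 1.

```python
import math

def elo_from_score(p: float) -> float:
    """Elo difference implied by a score fraction p (0 < p < 1), logistic model."""
    return 400.0 * math.log10(p / (1.0 - p))

def match_elo_interval(wins: int, draws: int, losses: int, z: float = 1.96):
    """Return (elo, low, high) for a head-to-head match.

    Uses the observed per-game variance of results (1, 0.5, 0) and a
    normal approximation; z = 1.96 gives a roughly 95% interval.
    """
    n = wins + draws + losses
    mean = (wins + 0.5 * draws) / n
    var = (wins * (1.0 - mean) ** 2
           + draws * (0.5 - mean) ** 2
           + losses * (0.0 - mean) ** 2) / n
    se = math.sqrt(var / n)
    return (elo_from_score(mean),
            elo_from_score(mean - z * se),
            elo_from_score(mean + z * se))

elo, low, high = match_elo_interval(16, 11, 3)   # Toga 1.2.1a BH result above
print(f"Elo difference: {elo:+.0f} (about {low:+.0f} to {high:+.0f})")
```

For this match the whole interval stays above zero, but it is wide for only 30 games. Since no opening book was used, some games may have followed the same lines, in which case they are not independent and the real margin is wider still.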
ernest
Posts: 2042
Joined: Wed Mar 08, 2006 8:30 pm

Re: What is New in Toga II 1.3x4?

Post by ernest »

Shaun wrote:CCRL 40/4

CCRL 40/12
Shaun
Just for the record, what (if any) EGBBs were used with 1.3x4?
Same question for you, Thorsten.