Would you happen to know how CEGT would handle the same scenario as above?
And thanks again in advance-
george
Toga II 3.0 released
-
- Posts: 4567
- Joined: Sun Mar 12, 2006 2:40 am
- Full name:
Re: Toga II 3.0 released
Eelco de Groot wrote: I see that Teemu Pudas (Vempele) writes in the same thread that he adapted the code so that would imply it is already in one of the CMLX versions. I did not remember that... He probably placed the code better, my version was placed somewhere as one of the 'win' recognizers/situations early in eval.cpp but I don't know how Teemu did it.
It arrived not exactly later that day, but one year later there it was: Toga CMLX 1.4.5e. Teemu's Toga CMLX 1.4.5e version is hopefully still downloadable from this 2009 thread on Rybka forum.
Eelco
Debugging is twice as hard as writing the code in the first
place. Therefore, if you write the code as cleverly as possible, you
are, by definition, not smart enough to debug it.
-- Brian W. Kernighan
-
- Posts: 8087
- Joined: Thu Mar 09, 2006 9:51 pm
- Location: Near the Intel Plant in the Land of Manana
- Full name: Timothy Frohlick
Re: Toga II 3.0 released
Toga II 3.0 is going head to head with Stockfish 2.3 on all my dual processor machines. Each has two wins and one draw in a total of five games. Very small sample size, but if 1,000,000 players get the same results then that is statistically significant.
I am impressed.
TJF
PS I haven't installed Houdini 3 yet.. I am waiting for my eight core machine with 32 GB of RAM to arrive.
A New Way Comes Upon Earth.
God is an infinitely variable Constant.
Man marks his ground with ideologies.
Galaxies are the dreidels of God.
War is a punishment for implacability.
Peace flows from forgiveness of sins.
-
- Posts: 1539
- Joined: Thu Mar 09, 2006 2:02 pm
Re: Toga II 3.0 released
geots wrote: Ok- my mistake. I guess Ingo is running Z Mexico 64bit.
Yes I do run the 64bit version. If someone would bother to read my conditions it becomes very clear. Unfortunately you are not the only one who doesn't care and just looks at the rank, the rating and, if in a good mood, the time control. That other conditions are at least as important seems to be ignored by many ... at least according to the emails I get from my web site ...
geots wrote: ... And don't listen to anyone who says 200 games mean nothing and you need over 1000. Stats are what they are, but if under 1000 meant nothing, you could throw out about 50% or more of CCRLs ratings.
Interesting thought! But not what I meant. I meant 150 (or 200) games against ONE opponent. That IS irrelevant!
This is an excerpt from the latest Toga run:
Toga II 3.0 32b - Zappa Mexico II (2703) 63.0 - 87.0 42.00% Perf=2647
Toga II 3.0 32b - Chiron 1.5 (2845) 57.5 - 92.5 38.33% Perf=2763
That is a difference of 116 Elo* - which can't be explained by statistics! This is simply the difference that occurs because a playing style suits one opponent better than another. So:
1. You can't draw conclusions out of 150 games against ONE opponent
2. You only get a valid average result if you run against a higher number of opponents. (which makes lists with many times the same opponent and just a few others doubtful)
Of course, if you are just interested in how the engine performs against one particular engine you might consider the result interesting ...
Bye
Ingo
*some engines even have a bigger performance gap than 116 Elo between opponents
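The Perf column in the excerpt above can be approximated with the standard logistic Elo formula. The Python sketch below is only an illustration: the function name `performance_rating` is made up here, and the rating list may compute Perf with a slightly different method, so the last digit can differ.

```python
import math

def performance_rating(opponent_rating: float, points: float, games: int) -> float:
    """Performance rating from a match score, using the logistic Elo model.

    Assumption: Perf = opponent rating + 400 * log10(p / (1 - p)),
    where p is the fraction of points scored. A rating list may use
    a different curve, so small deviations are expected.
    """
    p = points / games
    if not 0.0 < p < 1.0:
        raise ValueError("score must be strictly between 0% and 100%")
    return opponent_rating + 400.0 * math.log10(p / (1.0 - p))

# The two Toga II 3.0 32b matches quoted above (150 games each).
vs_zappa = performance_rating(2703, 63.0, 150)
vs_chiron = performance_rating(2845, 57.5, 150)

print(f"vs Zappa Mexico II: {vs_zappa:.0f}")
print(f"vs Chiron 1.5:      {vs_chiron:.0f}")
print(f"gap:                {vs_chiron - vs_zappa:.0f} Elo")
```

With these inputs the sketch lands within about one Elo point of the Perf values in the excerpt, so the 116 Elo gap is simply the difference between the two performance figures.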
-
- Posts: 269
- Joined: Wed Oct 24, 2012 2:07 am
Re: Toga II 3.0 released
Eelco de Groot wrote:
Eelco de Groot wrote: I see that Teemu Pudas (Vempele) writes in the same thread that he adapted the code so that would imply it is already in one of the CMLX versions. I did not remember that... He probably placed the code better, my version was placed somewhere as one of the 'win' recognizers/situations early in eval.cpp but I don't know how Teemu did it.
It arrived not exactly later that day, but one year later there it was: Toga CMLX 1.4.5e. Teemu's Toga CMLX 1.4.5e version is hopefully still downloadable from this 2009 thread on Rybka forum.
Eelco
Ok thanks Eelco! I'll have a look when I start to work on Toga again.
Jerry
-
- Posts: 4790
- Joined: Sat Mar 11, 2006 12:42 am
Re: Toga II 3.0 released
IWB wrote:
geots wrote: Ok- my mistake. I guess Ingo is running Z Mexico 64bit.
Yes I do run the 64bit version. If someone would bother to read my conditions it becomes very clear. Unfortunately you are not the only one who doesn't care and just looks at the rank, the rating and, if in a good mood, the time control. That other conditions are at least as important seems to be ignored by many ... at least according to the emails I get from my web site ...
geots wrote: ... And don't listen to anyone who says 200 games mean nothing and you need over 1000. Stats are what they are, but if under 1000 meant nothing, you could throw out about 50% or more of CCRLs ratings.
Interesting thought! But not what I meant. I meant 150 (or 200) games against ONE opponent. That IS irrelevant!
This is an excerpt from the latest Toga run:
Toga II 3.0 32b - Zappa Mexico II (2703) 63.0 - 87.0 42.00% Perf=2647
Toga II 3.0 32b - Chiron 1.5 (2845) 57.5 - 92.5 38.33% Perf=2763
That is a difference of 116 Elo* - which can't be explained by statistics! This is simply the difference that occurs because a playing style suits one opponent better than another. So:
1. You can't draw conclusions out of 150 games against ONE opponent
2. You only get a valid average result if you run against a higher number of opponents. (which makes lists with many times the same opponent and just a few others doubtful)
Of course, if you are just interested in how the engine performs against one particular engine you might consider the result interesting ...
Bye
Ingo
*some engines even have a bigger performance gap than 116 Elo between opponents
You have no clue what I looked at and care about or don't care about. FYI I saw 32b, but I mistakenly was putting it with the engine on the right. Have you ever considered that in your info you have "32b" highlighted in red. A lot of people might very well think all they have to see is the red and when they don't might assume it is 64. I would highlight it in both places- or in neither- if it were me. But it isn't me, so whatever.
So if I can't draw conclusions from 150 games against ONE opponent, then I don't suppose I can draw conclusions from 50 games against ONE opponent. Bullshit. I have a 50 game match where a lower tier Ivanhoe version beat Fritz 13 (wins and losses only) 29-2. I have another two 50 game matches where Strelka 5.6 beat the strongest Ivanhoe in existence at the moment, 32-4 and 31-6. If I were you, I would add a caveat to your statement. The results are true- if I sounded a bit sarcastic- that is probably true as well, as I'm not at the moment in the mood for testing lessons. But no harm- no foul.
Best,
-
- Posts: 4790
- Joined: Sat Mar 11, 2006 12:42 am
Please IGNORE Both "Help" Threads- Sorry!
I needed to go ahead and run the matches- I will live with 40/3.
Best,
-
- Posts: 10311
- Joined: Thu Mar 09, 2006 12:37 am
- Location: Tel-Aviv Israel
Re: Toga II 3.0 released
IWB wrote:
geots wrote: Ok- my mistake. I guess Ingo is running Z Mexico 64bit.
Yes I do run the 64bit version. If someone would bother to read my conditions it becomes very clear. Unfortunately you are not the only one who doesn't care and just looks at the rank, the rating and, if in a good mood, the time control. That other conditions are at least as important seems to be ignored by many ... at least according to the emails I get from my web site ...
geots wrote: ... And don't listen to anyone who says 200 games mean nothing and you need over 1000. Stats are what they are, but if under 1000 meant nothing, you could throw out about 50% or more of CCRLs ratings.
Interesting thought! But not what I meant. I meant 150 (or 200) games against ONE opponent. That IS irrelevant!
This is an excerpt from the latest Toga run:
Toga II 3.0 32b - Zappa Mexico II (2703) 63.0 - 87.0 42.00% Perf=2647
Toga II 3.0 32b - Chiron 1.5 (2845) 57.5 - 92.5 38.33% Perf=2763
That is a difference of 116 Elo* - which can't be explained by statistics! This is simply the difference that occurs because a playing style suits one opponent better than another. So:
1. You can't draw conclusions out of 150 games against ONE opponent
2. You only get a valid average result if you run against a higher number of opponents. (which makes lists with many times the same opponent and just a few others doubtful)
Of course, if you are just interested in how the engine performs against one particular engine you might consider the result interesting ...
Bye
Ingo
*some engines even have a bigger performance gap than 116 Elo between opponents
I think that it is possible to get conclusions based on 150 games against a single opponent and the question is what are the conclusions.
A program may perform better against one opponent because of playing style, but there is a limit to it.
If I see, for example, a result like 120-30, I can be practically sure that the winner is better.
If I see, for example, a result like 90-60, I cannot be practically sure that the winner is better, but I can be practically sure that the winner is not more than 100 Elo worse than the loser.
These conclusions are based on common sense and previous experience and I was relatively careful here.
I believe that there is no single practical case where A scores even 70% against B in 150 games while A scores worse than B against C.
Note that C should have a similar rating to the average of A and B (not more than 100 Elo difference).
If you think that I am wrong then I would like to see a single case including the names of the programs A, B, C so everybody can reproduce the results (with possible small differences).
Note that I do not claim that it is impossible to build programs A, B, C where it happens, but only that it practically does not happen.
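How much of a 150-game result could be plain sampling noise can be estimated roughly. The sketch below is an assumption-laden illustration, not anyone's actual testing method: it treats every game as an independent win or loss (draws only shrink the variance, so this overstates the noise), uses a normal approximation for a 95% interval, and converts scores to Elo with the logistic formula. It says nothing about style effects between particular engines. The function names are hypothetical.

```python
import math

def elo_from_score(p: float) -> float:
    """Elo difference implied by a score fraction p (logistic model)."""
    return 400.0 * math.log10(p / (1.0 - p))

def elo_interval(points: float, games: int, z: float = 1.96):
    """Approximate 95% interval (in Elo) for a match result.

    Assumptions: independent games, binomial variance p*(1-p)/n
    (an upper bound when draws occur), normal approximation.
    """
    p = points / games
    se = math.sqrt(p * (1.0 - p) / games)
    # Clamp so that the logistic conversion stays defined.
    lo = max(p - z * se, 1e-6)
    hi = min(p + z * se, 1.0 - 1e-6)
    return elo_from_score(lo), elo_from_score(p), elo_from_score(hi)

# The two example results from the post above, out of 150 games.
for points in (120, 90):
    low, mid, high = elo_interval(points, 150)
    print(f"{points}-{150 - points}: {mid:+.0f} Elo (about {low:+.0f} to {high:+.0f})")
```

The interval only covers sampling noise under the stated assumptions; whether one engine's style happens to suit a particular opponent is a separate question that this kind of estimate cannot answer.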
-
- Posts: 4790
- Joined: Sat Mar 11, 2006 12:42 am
Re: Toga II 3.0 released
Uri Blass wrote:
IWB wrote:
geots wrote: Ok- my mistake. I guess Ingo is running Z Mexico 64bit.
Yes I do run the 64bit version. If someone would bother to read my conditions it becomes very clear. Unfortunately you are not the only one who doesn't care and just looks at the rank, the rating and, if in a good mood, the time control. That other conditions are at least as important seems to be ignored by many ... at least according to the emails I get from my web site ...
geots wrote: ... And don't listen to anyone who says 200 games mean nothing and you need over 1000. Stats are what they are, but if under 1000 meant nothing, you could throw out about 50% or more of CCRLs ratings.
Interesting thought! But not what I meant. I meant 150 (or 200) games against ONE opponent. That IS irrelevant!
This is an excerpt from the latest Toga run:
Toga II 3.0 32b - Zappa Mexico II (2703) 63.0 - 87.0 42.00% Perf=2647
Toga II 3.0 32b - Chiron 1.5 (2845) 57.5 - 92.5 38.33% Perf=2763
That is a difference of 116 Elo* - which can't be explained by statistics! This is simply the difference that occurs because a playing style suits one opponent better than another. So:
1. You can't draw conclusions out of 150 games against ONE opponent
2. You only get a valid average result if you run against a higher number of opponents. (which makes lists with many times the same opponent and just a few others doubtful)
Of course, if you are just interested in how the engine performs against one particular engine you might consider the result interesting ...
Bye
Ingo
*some engines even have a bigger performance gap than 116 Elo between opponents
I think that it is possible to get conclusions based on 150 games against a single opponent and the question is what are the conclusions.
A program may perform better against one opponent because of playing style, but there is a limit to it.
If I see, for example, a result like 120-30, I can be practically sure that the winner is better.
If I see, for example, a result like 90-60, I cannot be practically sure that the winner is better, but I can be practically sure that the winner is not more than 100 Elo worse than the loser.
These conclusions are based on common sense and previous experience and I was relatively careful here.
I believe that there is no single practical case where A scores even 70% against B in 150 games while A scores worse than B against C.
Note that C should have a similar rating to the average of A and B (not more than 100 Elo difference).
If you think that I am wrong then I would like to see a single case including the names of the programs A, B, C so everybody can reproduce the results (with possible small differences).
Note that I do not claim that it is impossible to build programs A, B, C where it happens, but only that it practically does not happen.
You are NOT wrong, my friend. I could not have said it any better!
Best,
george
-
- Posts: 1539
- Joined: Thu Mar 09, 2006 2:02 pm
Re: Toga II 3.0 released
Uri Blass wrote:
I think that it is possible to get conclusions based on 150 games against a single opponent and the question is what are the conclusions.
A program may perform better against one opponent because of playing style, but there is a limit to it.
If I see, for example, a result like 120-30, I can be practically sure that the winner is better.
If I see, for example, a result like 90-60, I cannot be practically sure that the winner is better, but I can be practically sure that the winner is not more than 100 Elo worse than the loser.
These conclusions are based on common sense and previous experience and I was relatively careful here.
I believe that there is no single practical case where A scores even 70% against B in 150 games while A scores worse than B against C.
Note that C should have a similar rating to the average of A and B (not more than 100 Elo difference).
If you think that I am wrong then I would like to see a single case including the names of the programs A, B, C so everybody can reproduce the results (with possible small differences).
Note that I do not claim that it is impossible to build programs A, B, C where it happens, but only that it practically does not happen.
I have no idea why people are so nitpicky. Your example of 70% is about 150 Elo difference (or 105 to 45 in a 150 game match) while in the given case we have a 50 Elo difference.
So yes, there are cases where you can draw conclusions,
1. Where you know the result in advance and
2. Where you have differences which are huge.
The Toga example shows quite nicely that a shown 50 Elo difference is in reality 0, and that goes in both directions. Where Toga underperformed roughly 70 Elo against Zappa, it overperformed ~40 Elo vs Chiron. If you guys are happy with a 150 Elo (70%) accuracy, then fine, you can draw conclusions; for myself this is not good enough.
Best
Ingo
EDIT: All the data of the IPON is online, you can search for any best/worst performance there, and I am sure there are examples with more than 116 Elo ... (Toga will be updated in a few hours)
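The conversion between score percentage and Elo difference used in the post above can be checked with the logistic expectation formula. This is a minimal sketch, assuming that formula; `expected_score` is an illustrative name, not something taken from the IPON tooling.

```python
def expected_score(elo_diff: float) -> float:
    """Expected score fraction for a player rated elo_diff above the opponent."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

GAMES = 150  # match length discussed in the thread

for diff in (50, 150):
    p = expected_score(diff)
    print(f"+{diff} Elo -> {p:.1%}, about {p * GAMES:.1f} - {(1 - p) * GAMES:.1f} in {GAMES} games")
```

Under this formula a 150 Elo edge corresponds to roughly 70%, in line with the 105 to 45 figure mentioned above.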