Home On The Range

Posted: Wed Dec 26, 2007 8:36 pm
by Graham Banks
HOME ON THE RANGE

Dual Athlon XP2000+
Deep Shredder 11 GUI
128mb hash each
3-4-5 piece tablebases
Ponder off
Xmas2640b.bkt
40 moves in 68 minutes repeating (adapted for the CCRL)
4 cycles (44 rounds)


Participants

Parrot 07.07.22
Alf 1.09
Matheus 2.3
Mustang 4.97
Lime 62
Adam 3.1
BigLion 2.23x
OBender 3.1.0
Monarch 1.7
Marvin 1.3.0
Smash 1.0.3
Clueless 1.4


The tournament can be followed and games downloaded after every couple of rounds from here:
http://kirr.homeunix.org/chess/discussi ... f=7&t=2913

Standings after Round 6 of 44

Posted: Sun Dec 30, 2007 9:44 pm
by Graham Banks
HOME ON THE RANGE

Dual Athlon XP2000+
Deep Shredder 11 GUI
128mb hash each
3-4-5 piece tablebases
Ponder off
Xmas2640b.bkt
40 moves in 68 minutes repeating (adapted for the CCRL)
4 cycles (44 rounds)


Standings after Round 6

4.5 - Alf 1.09
4.5 - BigLion 2.23x
4.0 - Monarch 1.7
3.5 - Parrot 07.07.22
3.5 - OBender 3.1.0
3.0 - Lime 62
2.5 - Matheus 2.3
2.5 - Adam 3.1
2.5 - Clueless 1.4
2.0 - Marvin 1.3.0
2.0 - Smash 1.0.3
1.5 - Mustang 4.97


The tournament can be followed and games downloaded after every couple of rounds from here:
http://kirr.homeunix.org/chess/discussi ... f=7&t=2913

Re: Standings after Round 6 of 44

Posted: Sun Dec 30, 2007 9:56 pm
by Dr.Wael Deeb
So Smash is a weak engine after all playing around 1800 Elo and even a little bit less.... 8-)

Re: Standings after Round 6 of 44

Posted: Sun Dec 30, 2007 10:13 pm
by Graham Banks
Dr.Wael Deeb wrote:So Smash is a weak engine after all playing around 1800 Elo and even a little bit less.... 8-)
Not based on the 180 games or so that we have played! :P
Early days in this tourney of course.

Standings after Round 12 of 44

Posted: Mon Jan 07, 2008 8:49 am
by Graham Banks
HOME ON THE RANGE

Dual Athlon XP2000+
Deep Shredder 11 GUI
128mb hash each
3-4-5 piece tablebases
Ponder off
Xmas2640b.bkt
40 moves in 68 minutes repeating (adapted for the CCRL)
4 cycles (44 rounds)


Standings after Round 12

9.5 - Alf 1.09
8.5 - Monarch 1.7
8.0 - BigLion 2.23x
7.5 - Parrot 07.07.22
6.5 - Lime 62
6.0 - Adam 3.1
5.5 - Matheus 2.3
5.0 - OBender 3.1.0
5.0 - Smash 1.0.3
4.0 - Mustang 4.97
4.0 - Clueless 1.4
2.5 - Marvin 1.3.0


The tournament can be followed and games downloaded after every couple of rounds from here:
http://kirr.homeunix.org/chess/discussi ... f=7&t=2913

Re: Standings after Round 6 of 44

Posted: Mon Jan 07, 2008 9:26 am
by hgm
Dr.Wael Deeb wrote:So Smash is a weak engine after all playing around 1800 Elo and even a little bit less.... 8-)
If you think you can base a meaningful rating on just 3 won games, you just don't know how to determine a rating.

It does not matter if you have a score of 3 out of 60, 3 out of 1000, 3 out of 100,000. In all cases the rating will be crap, and the probability that the engine would have scored 2 or 4 points instead of 3 is hardly any lower than the probability that it scored 3. In fact the standard error in the rating would be far greater with 3 out of 100,000 games than it would be for 3 out of 10 games, as the same statistical error in score would map to a much larger uncertainty in Elo because of the (apparently) increased strength difference between the tested engine and its opponents.

So remember: an Elo based on N points can at most be as accurate as any Elo based on ~2N games. So for 3 points that is only 6 effective games. Most of the tail of your rating list is just based on 6 games, and all the other games were a meaningless exercise. This is why your ratings are crap.
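
As a rough illustration of why the number of points matters more than the number of games, the sketch below estimates the Elo difference and its standard error for 3 points out of 60, 1000 and 100,000 games. It assumes the usual logistic Elo formula, a simple win/loss (binomial) model with no draws, and a first-order error approximation, which is crude for counts this small. The function names are made up for this example.

```python
import math

# Elo per unit of log-odds in the logistic Elo model (about 173.7).
ELO_PER_LOGIT = 400 / math.log(10)


def elo_diff(points, games):
    """Elo difference implied by a score fraction (logistic model)."""
    p = points / games
    return -400 * math.log10(1 / p - 1)


def elo_std_error(points, games):
    """First-order standard error of elo_diff.

    Assumption: every game is an independent win or loss (no draws),
    so the score fraction has variance p * (1 - p) / games.
    """
    p = points / games
    return ELO_PER_LOGIT / math.sqrt(games * p * (1 - p))


if __name__ == "__main__":
    for games in (60, 1000, 100_000):
        print(games, round(elo_diff(3, games)), round(elo_std_error(3, games)))
```

Under these assumptions the error comes out at roughly 100 Elo in all three cases, because it depends almost entirely on the 3 points scored and hardly at all on how many games were played.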

Re: Standings after Round 6 of 44

Posted: Mon Jan 07, 2008 12:14 pm
by Dr.Wael Deeb
hgm wrote:
Dr.Wael Deeb wrote:So Smash is a weak engine after all playing around 1800 Elo and even a little bit less.... 8-)
If you think you can base a meaningful rating on just 3 won games, you just don't know how to determine a rating.

It does not matter if you have a score of 3 out of 60, 3 out of 1000, 3 out of 100,000. In all cases the rating will be crap, and the probability that the engine would have scored 2 or 4 points instead of 3 is hardly any lower than the probability that it scored 3. In fact the standard error in the rating would be far greater with 3 out of 100,000 games than it would be for 3 out of 10 games, as the same statistical error in score would map to a much larger uncertainty in Elo because of the (apparently) increased strength difference between the tested engine and its opponents.

So remember: an Elo based on N points can at most be as accurate as any Elo based on ~2N games. So for 3 points that is only 6 effective games. Most of the tail of your rating list is just based on 6 games, and all the other games were a meaningless exercise. This is why your ratings are crap.
All the results of the engines under the 2000 barrier have been deleted and will be restarted....
And you are right, the tail of my rating list is not accurate, as I pay attention to the higher divisions. Those are another story, and have been running for more than 5 years now....

Re: Standings after Round 6 of 44

Posted: Mon Jan 07, 2008 12:37 pm
by hgm
Well, there is no reason to delete any results. Just supplement them with more results of the weak engines amongst each other, so that they get more wins and a score closer to 50%. The results you already have will remain helpful to position them, as a group, relative to the stronger engines in an accurate way.

Re: Standings after Round 6 of 44

Posted: Mon Jan 07, 2008 1:40 pm
by Dr.Wael Deeb
hgm wrote:Well, there is no reason to delete any results. Just supplement them with more results of the weak engines amongst each other, so that they get more wins and a score closer to 50%. The results you already have will remain helpful to position them, as a group, relative to the stronger engines in an accurate way.
I see, but 6 effective games is far too few, and I've started a new series of tournaments on slower hardware, as I won't use my new Core 2 Duo machine for engines like Cassandre, Soldat, etc....

Re: Standings after Round 6 of 44

Posted: Mon Jan 07, 2008 7:52 pm
by Graham Banks
hgm wrote:This is why your ratings are crap.
That's a bit harsh on Wael. :(