Tony's first rating list (2007)

Tony Thomas

Re: Tony's first rating list (2007)

Post by Tony Thomas »

Matthias Gemuh wrote: Well done, Tony!
I always like such tournaments that include engines of all strengths.
Only the Elo calculation program sucks quite a bit: the Elo gradient is too steep, underrating weak engines.

Best,
Matthias.
I should try to figure out how to use Bayesian Elo. I have used it in the past, but I didn't know how to set a start rating, so the ratings came out as + or - 120 or something similar. BigLion is probably a bit stronger than 1924; I would say it is probably rated between 2050 and 2150, but at least the ranking is roughly right, which is all I care about. Do you expect any improvements from your new engines? Is BigLion still your strongest?

A side note: I won't be testing updated versions for a while; I am trying to integrate new engines into my list. Results for Mediocre are coming soon.
Vempele

Re: Tony's first rating list (2007)

Post by Vempele »

Tony Thomas wrote:
I should try to figure out how to use Bayesian Elo. I have used it in the past, but I didn't know how to set a start rating, so the ratings came out as + or - 120 or something similar.
offset 2800 Fruit. Or just 'offset 2800' to get high ratings. :P
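
For anyone wondering why an offset is needed at all: game results only determine rating differences, so a list can be shifted by any constant without changing who is ahead of whom. Below is a minimal Python sketch of that idea, assuming the offset either pins the list average or pins one named engine to the given value. It is not BayesElo's code; the function name and the sample ratings are made up for illustration.

```python
def apply_offset(ratings, target, anchor=None):
    """Shift every rating by the same constant.

    With anchor=None the average of the list lands on `target`;
    otherwise the named engine lands on `target`.
    Rating differences are unchanged either way.
    """
    if anchor is None:
        reference = sum(ratings.values()) / len(ratings)
    else:
        reference = ratings[anchor]
    shift = target - reference
    return {name: rating + shift for name, rating in ratings.items()}


# Hypothetical relative ratings centred on zero (not taken from any real list)
relative = {"Fruit": 120.0, "EngineA": 0.0, "EngineB": -120.0}

anchored = apply_offset(relative, 2800, anchor="Fruit")  # Fruit pinned to 2800
centred = apply_offset(relative, 2800)                   # list average pinned to 2800
```

In both cases Fruit stays 240 points ahead of EngineB; only the absolute numbers move.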
Tony Thomas

Re: Tony's first rating list (2007)

Post by Tony Thomas »

Vempele wrote:
Tony Thomas wrote:
I should try to figure out how to use Bayesian Elo. I have used it in the past, but I didn't know how to set a start rating, so the ratings came out as + or - 120 or something similar.
offset 2800 Fruit. Or just 'offset 2800' to get high ratings. :P
I did try that, but the ratings still came out the same. Maybe I was doing something wrong back then; I will try again when I make another rating list. Can you offer any help with removing double entries and weaker versions of the same engine? Is there any way to sort the list without having to use command-line tools? I won't mind using a command-line tool if it is available for free.
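
If a small free script is acceptable, the cleanup could look roughly like the Python sketch below: it drops exact duplicates, keeps only the highest-rated version of each engine, and sorts by rating. The rule for deciding which entries belong to the same engine (everything before the first version-like token) is only an assumption and will need adjusting for names such as BugChess2_V1_3; the engine names and ratings in the example are hypothetical.

```python
def engine_family(name):
    """Heuristic family name: the tokens before the first version-like token.

    A token counts as version-like if it contains a digit or is a bare
    marker such as "v". This is an assumption, not a general rule.
    """
    family = []
    for token in name.split():
        is_marker = token.lower() in ("v", "ver", "version")
        has_digit = any(ch.isdigit() for ch in token)
        if family and (is_marker or has_digit):
            break
        family.append(token)
    return " ".join(family).lower()


def strongest_per_family(entries):
    """Keep the best-rated entry per family, sorted from strongest to weakest."""
    best = {}
    for name, rating in entries:
        key = engine_family(name)
        if key not in best or rating > best[key][1]:
            best[key] = (name, rating)
    return sorted(best.values(), key=lambda entry: entry[1], reverse=True)


# Hypothetical list with a double entry and two versions of one engine
entries = [
    ("Alpha 1.0", 2100),
    ("Alpha 1.1", 2150),
    ("Beta v 0.3", 2283),
    ("Beta v 0.3", 2283),
    ("Gamma", 1900),
]
cleaned = strongest_per_family(entries)
for name, rating in cleaned:
    print(f"{name:<20} {rating}")
```

Grouping by a derived family name handles both problems at once, since a double entry is just two identical versions of the same engine.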
CRoberson
Posts: 2094
Joined: Mon Mar 13, 2006 2:31 am
Location: North Carolina, USA

Re: Tony's first rating list (2007)

Post by CRoberson »

Tony Thomas wrote:
Charles, what did you use to scan the list? ....
My eyes.
Tony Thomas

Re: Tony's first rating list (2007)

Post by Tony Thomas »

CRoberson wrote:
Tony Thomas wrote:
Charles, what did you use to scan the list? ....
My eyes.
I guess mine fooled me; I must have thought that I was seeing the same engine.
Tony Thomas

Mediocre gets a warm welcome

Post by Tony Thomas »

I introduced Mediocre into my rating list. It gets a first rating of 2283, which makes it rank #101 out of the 190 engines that I have tested. Here are the results. I still have a list of 100 engines or so that I have never played with... that's life, I say. Note that I am assuming the ratings of all the other engines stay the same until I sit down and produce another list in another 6 months, which isn't the case in real life. However, I don't feel like manually weeding out a list of 299 engines every day.

Code:

191(101) Mediocre v 0.332 JA       : 2283   62 (+ 34,=  4,- 24), 58.1 %

Ayito 0.2.994                 :   2 (+  0,=  0,-  2),  0.0 %
Horizon 4.3                   :   2 (+  2,=  0,-  0), 100.0 %
Latista 1.5                   :   2 (+  1,=  0,-  1), 50.0 %
Deuterium 06.06.23.04         :   2 (+  1,=  0,-  1), 50.0 %
Gaviota                       :   2 (+  1,=  0,-  1), 50.0 %
Delphil 1.6b                  :   2 (+  1,=  0,-  1), 50.0 %
BugChess2_V1_3                :   2 (+  0,=  1,-  1), 25.0 %
BlackBishop 0.9.7i            :   2 (+  1,=  0,-  1), 50.0 %
Natwarlal                     :   2 (+  1,=  0,-  1), 50.0 %
Atlas220uci                   :   2 (+  0,=  0,-  2),  0.0 %
NanoSzachy 2.5                :   2 (+  1,=  0,-  1), 50.0 %
Gnuchess4TM                   :   2 (+  1,=  0,-  1), 50.0 %
Dorky                         :   2 (+  1,=  1,-  0), 75.0 %
Sage 2.2a                     :   2 (+  2,=  0,-  0), 100.0 %
Wing 2.0a                     :   2 (+  1,=  0,-  1), 50.0 %
Aice 0.98                     :   2 (+  1,=  0,-  1), 50.0 %
Adam 3.0                      :   2 (+  2,=  0,-  0), 100.0 %
Alarm0931                     :   2 (+  1,=  1,-  0), 75.0 %
CyberPagno                    :   2 (+  2,=  0,-  0), 100.0 %
NagaSkaki 4.0 orig            :   2 (+  1,=  0,-  1), 50.0 %
LittleThought-0.95            :   2 (+  1,=  0,-  1), 50.0 %
Monarch 1.7                   :   2 (+  1,=  0,-  1), 50.0 %
Uragano_3D 0.87               :   2 (+  1,=  1,-  0), 75.0 %
AliChess 4.06                 :   2 (+  2,=  0,-  0), 100.0 %
Fortress                      :   2 (+  1,=  0,-  1), 50.0 %
Neurosis 2.2                  :   2 (+  1,=  0,-  1), 50.0 %
Zeus 1.28                     :   2 (+  1,=  0,-  1), 50.0 %
Buzz 0.05                     :   2 (+  1,=  0,-  1), 50.0 %
Now 0.1y beta 2               :   2 (+  1,=  0,-  1), 50.0 %
Popochin 2.9                  :   2 (+  2,=  0,-  0), 100.0 %
Homer 2.0                     :   2 (+  1,=  0,-  1), 50.0 %
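
For reference, the percentages in the listing count a win as one point and a draw as half a point. The Python sketch below reproduces the 58.1 % on the summary line and also shows the rating difference that such a score would imply under the standard logistic Elo curve. That curve is an assumption here: the thread does not say which formula the rating program actually uses, and the function names are made up for illustration.

```python
import math


def score_percent(wins, draws, losses):
    """Score as a percentage: a win counts 1 point, a draw 0.5."""
    games = wins + draws + losses
    return 100.0 * (wins + 0.5 * draws) / games


def elo_difference(score_fraction):
    """Rating difference implied by a score under the logistic Elo curve.

    Undefined for 0 % and 100 % scores, so those are rejected.
    """
    if not 0.0 < score_fraction < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return 400.0 * math.log10(score_fraction / (1.0 - score_fraction))


# Summary line of the listing: +34, =4, -24 over 62 games
pct = score_percent(34, 4, 24)
print(round(pct, 1))  # 58.1
print(round(elo_difference(pct / 100.0)))  # points above the opponents' level
```

The 2-game results against individual opponents are far too small to feed into elo_difference on their own, and the 0 % and 100 % rows cannot be converted at all.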
Matthias Gemuh
Posts: 3245
Joined: Thu Mar 09, 2006 9:10 am

Re: Tony's first rating list (2007)

Post by Matthias Gemuh »

Tony Thomas wrote: I should try to figure out how to use Bayesian Elo. I have used it in the past, but I didn't know how to set a start rating, so the ratings came out as + or - 120 or something similar. BigLion is probably a bit stronger than 1924; I would say it is probably rated between 2050 and 2150, but at least the ranking is roughly right, which is all I care about. Do you expect any improvements from your new engines? Is BigLion still your strongest?

Although I know of some bugs that frequently kill BigLion, I am too busy/lazy to fix them now. First I must improve ChessGUI and then ArcBishop/Chancellor, before working on BigLion/Taktix.

Best,
Matthias.
My engine was quite strong till I added knowledge to it.
http://www.chess.hylogic.de
Tony Thomas

Re: Tony's first rating list (2007)

Post by Tony Thomas »

Matthias Gemuh wrote:
Tony Thomas wrote: I should try to figure out how to use Bayesian Elo. I have used it in the past, but I didn't know how to set a start rating, so the ratings came out as + or - 120 or something similar. BigLion is probably a bit stronger than 1924; I would say it is probably rated between 2050 and 2150, but at least the ranking is roughly right, which is all I care about. Do you expect any improvements from your new engines? Is BigLion still your strongest?

Although I know of some bugs that frequently kill BigLion, I am too busy/lazy to fix them now. First I must improve ChessGUI and then ArcBishop/Chancellor, before working on BigLion/Taktix.

Best,
Matthias.
Do you think that ArcBishop will catch up with BigLion anytime soon? Would you expect much of a strength improvement if I were to use BigLion 2.23x?
Matthias Gemuh
Posts: 3245
Joined: Thu Mar 09, 2006 9:10 am

Re: Tony's first rating list (2007)

Post by Matthias Gemuh »

Tony Thomas wrote: Do you think that ArcBishop will catch up with BigLion anytime soon? Would you expect much of a strength improvement if I were to use BigLion 2.23x?

I must first optimize my home-made 80-bit bitboard class, then I'll know if ArcBishop (encumbered by several chess variants) can catch up with BigLion anytime soon. 2.23x is not better than 2.23w.

Matthias.
Tony Thomas

Re: Tony's first rating list (2007)

Post by Tony Thomas »

Matthias Gemuh wrote:
Tony Thomas wrote: Do you think that ArcBishop will catch up with BigLion anytime soon? Would you expect much of a strength improvement if I were to use BigLion 2.23x?

I must first optimize my home-made 80-bit bitboard class, then I'll know if ArcBishop (encumbered by several chess variants) can catch up with BigLion anytime soon. 2.23x is not better than 2.23w.

Matthias.
I hope that you make an updated Lion so it can stay Big. I have tried ChessGUI; it is pretty decent, not at the level of Arena yet, but it's getting there slowly but surely.