Unified eval tournament?

cyberfish · Post by **cyberfish** » Thu Jan 22, 2009 5:17 am

I was reading the chessprogramming wiki, and read about the "simplified eval" -
http://chessprogramming.wikispaces.com/ ... n+function

and Tomasz Michniewski's idea of doing a "unified eval" test tournament, where every participant would use the same basic eval detailed on that page.

I find the idea really interesting, and it seems to make sense, too.

Did that tournament actually happen? I can find no documentation of it on Google.

If not, is anyone interested in doing something like that?

The simplified eval is really simple, and should not take more than 15 minutes or so for engine authors to implement.

Tord Romstad · Post by **Tord Romstad** » Thu Jan 22, 2009 8:07 am

cyberfish wrote:I was reading the chessprogramming wiki, and read about the "simplified eval" -
http://chessprogramming.wikispaces.com/ ... n+function

and Tomasz Michniewski's idea of doing a "unified eval" test tournament, where every participant would use the same basic eval detailed on that page.

I find the idea really interesting, and it seems to make sense, too.

Did that tournament actually happen? I can find no documentation of it on Google.

It did happen, but unfortunately only six programs participated. You can see the results and download the games here.

The tournament was the first ever appearance of Glaurung 2, which I had started working on about one month earlier. Unlike the other programs, Glaurung was playing without a book, which explains the numerous weird opening lines.

If not, is anyone interested in doing something like that?

I would certainly be interested.

Tord

cyberfish · Post by **cyberfish** » Thu Jan 22, 2009 8:18 am

Ah thanks!

We just need to get a few more people now...

I just implemented the simplified eval in my engine, and in ~2-second (limited-depth) games, it's 52-72 Elo points weaker.

Rank Name                      Elo    +    - games score oppo. draws 
   1 Brainless 09-20 64-bit     31    5    5  6536   58%   -31    9% 
   2 Brainless 09-SIM 64-bit   -31    5    5  6536   42%    31    9%

I was worried about what I would do if the "simplified" eval turned out to be better than my own. Fortunately that didn't happen.

PK · Post by PK » Thu Jan 22, 2009 8:43 am

Actually, the set of pcsq tables produced by Tomasz Michniewski is an impressive achievement: it allows a program to play a decent game, though it hates flank openings and displays ludicrously high scores.

Perhaps the latter can be helped by constructing tables where

score[sq] = ( score[sq] * 4 ) / 5
or even score[sq] = ( score[sq] * 3 ) / 5

Can anyone test it?
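
The scaling proposed above can be sketched in a few lines. This is only an illustration: the row of values is a hypothetical placeholder rather than one of the wiki tables, and the helper names are made up for this example. The one detail worth watching is rounding: in C, `( score[sq] * 4 ) / 5` truncates toward zero for negative entries, whereas Python's `//` rounds down, so the sketch truncates explicitly to mimic the C behaviour.

```python
def scale_toward_zero(value, num, den):
    """Integer (value * num) / den, truncating toward zero as C's `/` does."""
    product = value * num
    magnitude = abs(product) // den
    return magnitude if product >= 0 else -magnitude


def scale_table(table, num, den):
    """Return a new piece-square table with every entry scaled by num/den."""
    return [scale_toward_zero(v, num, den) for v in table]


# Hypothetical slice of a table in centipawns (NOT taken from the wiki page).
example_row = [-50, -30, -10, 0, 5, 10, 20, 50]

four_fifths = scale_table(example_row, 4, 5)
three_fifths = scale_table(example_row, 3, 5)

print(four_fifths)
print(three_fifths)
```

Only the positional tables are scaled here; whether the material values should be left untouched is a choice the post does not spell out.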

I'd like to participate as well (the early version of Hopeless lost the first UFO tournament quite badly), but I'm not likely to do *any* programming before March, since I'm busy with a grant for cataloguing pre-1600 manuscripts in Kraków, and I'm far away from any MSVC compiler.

regards

Pawel Koziol

cyberfish · Post by **cyberfish** » Thu Jan 22, 2009 8:51 am

Just multiplying all the tables by 4/5 or 3/5? I can certainly test that.

Uri Blass · Post by **Uri Blass** » Thu Jan 22, 2009 10:22 am

cyberfish wrote:I was reading the chessprogramming wiki, and read about the "simplified eval" -
http://chessprogramming.wikispaces.com/ ... n+function

and Tomasz Michniewski's idea of doing a "unified eval" test tournament, where every participant would use the same basic eval detailed on that page.

I find the idea really interesting, and it seems to make sense, too.

Did that tournament actually happen? I can find no documentation of it on Google.

If not, is anyone interested in doing something like that?

The simplified eval is really simple, and should not take more than 15 minutes or so for engine authors to implement.

I think it may also be interesting to do a simplified search tournament.
The idea is that everybody uses the same simple search function, but people are free to change the evaluation.

Uri

hgm · Post by **hgm** » Thu Jan 22, 2009 10:37 am

This 'simplified' eval is in some respects even more complicated than micro-Max' eval, as micro-Max does not use piece-square tables, but only a single centralization table shared by Pawns, minor pieces and King. (Rooks and Queen move neutrally over the board.) This table is

 0  5 12 15 16 15 12  5
 6 11 18 21 22 21 18 11
10 15 22 25 26 25 22 15
12 17 24 27 28 27 24 17
12 17 24 27 28 27 24 17
10 15 22 25 26 25 22 15
 6 11 18 21 22 21 18 11
 0  5 12 15 16 15 12  5

Pawns get a bonus for being pushed, and for reaching 6th and 7th rank, though, which could have been implemented as a piece-square table. But the bonuses for Pawns on the 6th and 7th rank proposed on the Wiki page seem way too low. What micro-Max uses, translated to the Wiki piece values, would be more like 100 on 6th rank and 180 on the 7th, rather than ~20 and 50.

Aggressive handling of Pawns is the main source of strength for micro-Max, and allows it to regularly beat engines that outsearch it and out-evaluate it in all other respects. This includes a very keen sense for when to start pushing its Pawns, which is implemented by making the Pawn-push bonus dependent on the amount of non-pawn material on the board. The evaluation proposed in the Wiki does allow for the principle of using a different piece-square table in a different game stage (used for King), and I think it would greatly add to the strength if it also did this for Pawns. The currently proposed PST is good for the middle game, but the effective freezing of Pawns on a2, b2 and g2, h2 is really very detrimental in the end-game. Moves like b2-b4 should get +10 there (which seems even low in case this was a passer), not -10.
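
One possible way to make a pawn term depend on the remaining non-pawn material is a linear blend between a middle-game value and an end-game value. The sketch below is an assumption-laden illustration, not micro-Max's actual code: `MAX_NON_PAWN_MATERIAL`, `endgame_weight` and `blend` are hypothetical names, and the material total is a placeholder. The -10 and +10 figures are the b2-b4 example from the post above.

```python
# Assumed total of both sides' non-pawn material at the start, in centipawns.
# This is a placeholder; the right number depends on the piece values in use.
MAX_NON_PAWN_MATERIAL = 6400


def endgame_weight(non_pawn_material):
    """0.0 with all pieces on the board, 1.0 when only kings and pawns remain."""
    clamped = max(0, min(non_pawn_material, MAX_NON_PAWN_MATERIAL))
    return 1.0 - clamped / MAX_NON_PAWN_MATERIAL


def blend(middlegame_value, endgame_value, non_pawn_material):
    """Linearly interpolate a score term between its two game-stage values."""
    w = endgame_weight(non_pawn_material)
    return round((1.0 - w) * middlegame_value + w * endgame_value)


# The b2-b4 push: -10 in the middle game, +10 in the end-game.
full_board = blend(-10, 10, MAX_NON_PAWN_MATERIAL)
half_board = blend(-10, 10, MAX_NON_PAWN_MATERIAL // 2)
bare_kings = blend(-10, 10, 0)

print(full_board, half_board, bare_kings)
```

A blend like this avoids a sudden jump in the score when the game crosses a fixed "end-game" threshold, at the cost of needing two values per term.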

Tord Romstad · Post by **Tord Romstad** » Thu Jan 22, 2009 12:15 pm

cyberfish wrote:Ah thanks!

We just need to get a few more people now...

I just implemented the simplified eval in my engine, and in ~2-second (limited-depth) games, it's 52-72 Elo points weaker.

That's far less than I would have thought. What does your evaluation contain, apart from material and piece square tables?

I just finished a quick Silver match between the normal version of my program and an otherwise identical version with the evaluation function replaced by Tomasz Michniewski's piece square table evaluation:

Glaurung 090122: 86.5 (+81,=11,-8)
Glaurung UFO 090122: 13.5 (+8,=11,-81)

Tord
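
To put the two results in this thread on the same scale, a match score can be converted to an approximate Elo gap with the standard logistic Elo formula. This is a rough sketch: rating tools may use a different model (for example one that treats draws separately), so the figures need not match their output exactly. The function names are made up for this example.

```python
import math


def match_score(wins, draws, losses):
    """Fraction of points scored, counting a draw as half a point."""
    games = wins + draws + losses
    return (wins + 0.5 * draws) / games


def elo_difference(score_fraction):
    """Elo gap implied by a score fraction under the logistic Elo model."""
    if not 0.0 < score_fraction < 1.0:
        raise ValueError("score fraction must be strictly between 0 and 1")
    return 400.0 * math.log10(score_fraction / (1.0 - score_fraction))


# Glaurung 090122 vs. Glaurung UFO 090122: +81, =11, -8.
glaurung_score = match_score(81, 11, 8)
glaurung_gap = elo_difference(glaurung_score)

# Brainless 09-20 vs. Brainless 09-SIM: 58% (rounded) over 6536 games.
brainless_gap = elo_difference(0.58)

print(round(glaurung_gap), round(brainless_gap))
```

Under this model the Glaurung result corresponds to a gap of more than 300 Elo, while the rounded 58% score comes out in the mid-50s, inside the 52-72 range quoted earlier.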

Uri Blass · Post by **Uri Blass** » Thu Jan 22, 2009 12:20 pm

hgm wrote:This 'simplified' eval is in some respects even more complicated than micro-Max' eval, as micro-Max does not use piece-square tables, but only a single centralization table shared by Pawns, minor pieces and King. (Rooks and Queen move neutrally over the board.) This table is
 0  5 12 15 16 15 12  5
 6 11 18 21 22 21 18 11
10 15 22 25 26 25 22 15
12 17 24 27 28 27 24 17
12 17 24 27 28 27 24 17
10 15 22 25 26 25 22 15
 6 11 18 21 22 21 18 11
 0  5 12 15 16 15 12  5
Pawns get a bonus for being pushed, and for reaching 6th and 7th rank, though, which could have been implemented as a piece-square table. But the bonuses for Pawns on the 6th and 7th rank proposed on the Wiki page seem way too low. What micro-Max uses, translated to the Wiki piece values, would be more like 100 on 6th rank and 180 on the 7th, rather than ~20 and 50.

Aggressive handling of Pawns is the main source of strength for micro-Max, and allows it to regularly beat engines that outsearch it and out-evaluate it in all other respects. This includes a very keen sense for when to start pushing its Pawns, which is implemented by making the Pawn-push bonus dependent on the amount of non-pawn material on the board. The evaluation proposed in the Wiki does allow for the principle of using a different piece-square table in a different game stage (used for King), and I think it would greatly add to the strength if it also did this for Pawns. The currently proposed PST is good for the middle game, but the effective freezing of Pawns on a2, b2 and g2, h2 is really very detrimental in the end-game. Moves like b2-b4 should get +10 there (which seems even low in case this was a passer), not -10.

I think your simple evaluation for micro-Max is not simple in terms of the time needed to implement it in a new program, because the evaluation depends not only on the position on the board but also on what happened earlier.

Uri

Stan Arts · Post by **Stan Arts** » Thu Jan 22, 2009 12:23 pm

Sure, entertaining concept and I'd be interested.

Such a tournament could be played online (I guess most authors have an ICC account) or simply in one's own basement.

I'll upload my version with it somewhere once I'm done.

Stan
