Tuning for rating lists ?

michiguel
Posts: 6401
Joined: Thu Mar 09, 2006 8:30 pm
Location: Chicago, Illinois, USA

Re: Tuning for rating lists ?

Post by michiguel »

sje wrote:
Don wrote:UCI as well as XBOARD already have a provision for reporting this to the programs.
Other than supplying a value to a PGN tag, Symbolic ignores rating data from an ICS.

"Play the board, not the man (or machine)."

Does this cost points? Maybe at times, but I don't care. Sometimes it may gain points. I suspect that over time, things are even.
Throwing away information goes against the idea of making an artificially intelligent entity. Ideally, programs should be given the name of the opponent and the tournament results up to that point. The program could then decide whether a draw is acceptable because it is enough to qualify for the next round, change its style, use previous opponent-specific learning, etc. But we are far from that.

Miguel
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Tuning for rating lists ?

Post by Don »

michiguel wrote:
sje wrote:
Don wrote:UCI as well as XBOARD already have a provision for reporting this to the programs.
Other than supplying a value to a PGN tag, Symbolic ignores rating data from an ICS.

"Play the board, not the man (or machine)."

Does this cost points? Maybe at times, but I don't care. Sometimes it may gain points. I suspect that over time, things are even.
Throwing away information goes against the idea of making an artificially intelligent entity. Ideally, programs should be given the name of the opponent and the tournament results up to that point. The program could then decide whether a draw is acceptable because it is enough to qualify for the next round, change its style, use previous opponent-specific learning, etc. But we are far from that.

Miguel
As a former tournament player, the model that seems most appropriate to me is the one that every tournament or match player uses: the Elo rating of your opponent plus any personal knowledge you might have about them. If you are playing some random player in a one-off game it doesn't matter. But the most important knowledge is the Elo rating of your opponent, and this is something that humans HAVE and make strong use of when playing games.

It's trickier with computers because the rating varies with version, hardware and other factors - but one could take a comprehensive list such as CEGT and use it as a reference point - it's about as close to official as you can get while still including a LOT of programs.

In a massive round-robin auto-test situation you could of course keep the programs' ratings updated based on the results - but it really makes little difference if programs are within 50-100 Elo of each other.
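
For reference, the usual Elo model turns a rating gap into an expected score, which gives a feel for what a 50-100 Elo difference means in practice. This is only a minimal sketch of the standard logistic formula; the function name is illustrative and not taken from any engine or rating list.

```python
def expected_score(own_elo: float, opp_elo: float) -> float:
    """Standard Elo expected score (between 0 and 1) for the player rated own_elo."""
    return 1.0 / (1.0 + 10.0 ** ((opp_elo - own_elo) / 400.0))


if __name__ == "__main__":
    # Print the expected score for a few rating gaps in the range discussed above.
    for gap in (0, 50, 100):
        print(gap, round(expected_score(2800 + gap, 2800), 3))
```

Under this model a 50 Elo edge is worth roughly 57% and a 100 Elo edge roughly 64%.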
Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Opponent modeling

Post by sje »

Taking the higher view, a contempt factor is perhaps the simplest instance of the much more general topic of opponent modeling. And maybe it does gain some rating points in the long run.

But the use of a contempt factor is quite limited when compared to the opponent modeling techniques seen in human play and indeed in some prior cases of program play. Knowing an opponent's identity can also mean knowing the opponent's opening repertoire, style of play, time utilization, etc. All of this could be used to affect search in a sufficiently sophisticated program.

In some program tournaments in the Old Days, authors manually adjusted a program's opening book for particular machine opponents, sometimes with devastating effect. There have also been cases where a human opponent wiped out a program by re-playing, without thought, known winning variations. Is this good sportsmanship? Is it that much different from using a contempt factor?
Eelco de Groot
Posts: 4567
Joined: Sun Mar 12, 2006 2:40 am

Re: Opponent modeling

Post by Eelco de Groot »

Wasn't there a big row, maybe even back in the time of RGCC, over MChess, which learned and used information from the long matches played for the SSDF list? I think book learning is allowed by them, but MChess went a bit further than that. This was deemed against the spirit of the rating lists. Years later there were arguments with Chris, Marty Hirsch having long since left the scene, claiming that Hirsch's commercial interests had been irreparably damaged by this controversy.

Well anyway, you could have contempt learning, especially in short time control matches that go over a lot of games, without needing any info supplied by PGN tags. It would be artificial intelligence, and should be applauded as such, but it is no longer pure computer chess alone. What Don proposes is also, on the face of it, a more transparent form, because of the PGN info that has to be supplied for it to work. But how do you check that the program really plays by the rules? You can't verify that without opening your sources.

So I suggest the ICCA meetings should take a stance on this, and on whether the program itself could be disqualified too if, in its mad desire to fulfill its master's winning wishes, it no longer obeys the rules set by either the ICCA or its internal code of honour: it starts putting pieces captured by the opponent back on the board when Jaap isn't looking, fakes a crash in bad positions to gain time to think, starts banging on the digital clock in time trouble, or starts blowing smoke from its overheated circuits in the face of the non-smoking opponent programmers, etc.

Eelco
Debugging is twice as hard as writing the code in the first
place. Therefore, if you write the code as cleverly as possible, you
are, by definition, not smart enough to debug it.
-- Brian W. Kernighan
BubbaTough
Posts: 1154
Joined: Fri Jun 23, 2006 5:18 am

Re: Tuning for rating lists ?

Post by BubbaTough »

mcostalba wrote:There is this magical parameter called "contempt" that has an interesting property: if enabled, it lets your engine become strong against the weakest and weak against the strongest. Yes, it is not a very ethical one :-) Leaving aside the technical merit of this: I have serious doubts that just tweaking the draw score is enough to enable a more aggressive and risky style of play, but that is another topic.

The topic here is this: assuming that contempt does the trick, in rating lists with many engines weaker than yours (that is, when your engine is in the top half of the list) contempt can, more or less, artificially push you up. The side effect is that your engine becomes weaker against the strongest, so if you want, for instance, to participate in a tournament with elimination rounds up to a final, it is perhaps wise to disable the contempt factor.

Personally I'd prefer to be strong with the strongest and...merciful :-) with the weakest; IOW I prefer that the engine does well in tournaments and in one-to-one direct matches, even if this means giving up some points in the rating lists.
I totally agree. It's good to have contempt, and to use it as appropriate, but setting the default contempt to non-zero feels wrong to me, as its sole purpose is to mislead rating lists about strength. Ideally, testers would find a method for letting engines know the strength of the opponent so that the engines could use contempt appropriately. Otherwise, I would prefer testers to set all engine contempts to 0, independent of what the default settings are.
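
One existing channel for this is the `UCI_Opponent` option in the UCI protocol, whose value has the form `<title|none> <elo|none> <computer|human> <name>`. Below is a rough sketch of how an engine might parse that value and derive a contempt setting from the rating gap. The `Opponent` type, the function names, the scaling (`cp_per_100_elo`) and the clamp (`limit`) are all made-up illustration values, not taken from any real engine.

```python
from typing import NamedTuple, Optional


class Opponent(NamedTuple):
    title: Optional[str]
    elo: Optional[int]
    is_computer: bool
    name: str


def parse_uci_opponent(value: str) -> Opponent:
    """Parse a UCI_Opponent value: '<title|none> <elo|none> <computer|human> <name>'."""
    parts = value.split(maxsplit=3)
    if len(parts) < 4:
        raise ValueError(f"malformed UCI_Opponent value: {value!r}")
    title, elo, kind, name = parts
    return Opponent(
        title=None if title == "none" else title,
        elo=None if elo == "none" else int(elo),
        is_computer=(kind == "computer"),
        name=name,
    )


def contempt_for(own_elo: int, opp: Opponent,
                 cp_per_100_elo: int = 10, limit: int = 50) -> int:
    """Hypothetical mapping from rating gap to contempt in centipawns.

    Positive when the opponent is weaker, negative when stronger,
    and 0 when the opponent's rating is unknown.
    """
    if opp.elo is None:
        return 0
    raw = round((own_elo - opp.elo) * cp_per_100_elo / 100)
    return max(-limit, min(limit, raw))


if __name__ == "__main__":
    opp = parse_uci_opponent("GM 2800 human Gary Kasparov")
    print(opp, contempt_for(3000, opp))
```

With an unknown rating the sketch falls back to contempt 0, which matches the "set everything to 0" default suggested above.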

-Sam
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Tuning for rating lists ?

Post by Don »

BubbaTough wrote:
mcostalba wrote:There is this magical parameter called "contempt" that has an interesting property: if enabled, it lets your engine become strong against the weakest and weak against the strongest. Yes, it is not a very ethical one :-) Leaving aside the technical merit of this: I have serious doubts that just tweaking the draw score is enough to enable a more aggressive and risky style of play, but that is another topic.

The topic here is this: assuming that contempt does the trick, in rating lists with many engines weaker than yours (that is, when your engine is in the top half of the list) contempt can, more or less, artificially push you up. The side effect is that your engine becomes weaker against the strongest, so if you want, for instance, to participate in a tournament with elimination rounds up to a final, it is perhaps wise to disable the contempt factor.

Personally I'd prefer to be strong with the strongest and...merciful :-) with the weakest; IOW I prefer that the engine does well in tournaments and in one-to-one direct matches, even if this means giving up some points in the rating lists.
I totally agree. It's good to have contempt, and to use it as appropriate, but setting the default contempt to non-zero feels wrong to me, as its sole purpose is to mislead rating lists about strength. Ideally, testers would find a method for letting engines know the strength of the opponent so that the engines could use contempt appropriately. Otherwise, I would prefer testers to set all engine contempts to 0, independent of what the default settings are.

-Sam
We have contempt in Komodo and it is certainly not there to mislead anyone - in fact it's visible and configurable. Our primary purpose in having it is simply to encourage Komodo to play for the win no matter who it is playing as long as the score is close.

I don't think most people want to see a program playing immediately for a draw because the score is -0.01. We use a conservative value of 7 for contempt, which isn't even enough to compensate for the white advantage, but at least Komodo won't play for a draw just because its score is slightly down, such as -0.01. I don't care WHO it's playing, I don't really want to see a draw here.
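
To make the arithmetic concrete, here is a toy sketch of the common way contempt is applied: a drawn position is scored as slightly bad for the engine instead of as 0. This is not Komodo's actual code. The function names are hypothetical, and it assumes the value 7 is in centipawns, so that -0.01 pawns corresponds to -1.

```python
def draw_score(contempt_cp: int, engine_to_move: bool) -> int:
    """Score of a drawn position from the side to move's point of view (centipawns).

    With positive contempt a draw counts as slightly bad for the engine
    and therefore slightly good for its opponent.
    """
    return -contempt_cp if engine_to_move else contempt_cp


def prefers_draw(best_playing_on_cp: int, contempt_cp: int) -> bool:
    """True if, at the root, the drawing line scores higher than playing on."""
    return draw_score(contempt_cp, engine_to_move=True) > best_playing_on_cp


if __name__ == "__main__":
    # Engine is at -0.01 (i.e. -1 centipawn) if it plays on.
    print(prefers_draw(-1, contempt_cp=0))   # no contempt
    print(prefers_draw(-1, contempt_cp=7))   # contempt of 7
    print(prefers_draw(-10, contempt_cp=7))  # clearly worse position
```

In this sketch, with contempt 0 the engine takes the draw at -0.01, with contempt 7 it plays on, and it only heads for the draw once playing on scores worse than -0.07.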

In my opinion ALL programs should have this by default - we want the programs to play out the games. It's not about trying to trick our way out of trouble.