Arena

Rebel · Post by **Rebel** » Wed Nov 16, 2011 4:07 pm

Can you in Arena (while playing an eng-eng-match) as an engine force the interface to abort the current game and adjudicate the game in the PGN output with an "*" ?

It would speed-up certain types of tuning, examples:

1. Testing King-Safety, once the ladies left terminate the game;
2. Testing Rook-eval, no rooks on the board terminate the game;
3. etc. etc.

It would also give you automatically the number of games the element you are testing played a role and those that are not which ain't bad information.

Current score needs to fall in a reasonable margin of course as you don't want to "*" a game with a score above 3.xx.
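The engine-side test described above could be sketched roughly as follows. This is only an illustration: the names `piece_counts` and `should_abort` and the 300-centipawn default (taken from the "3.xx" remark) are assumptions, and how the engine would then tell the GUI to abort is a separate question.

```python
def piece_counts(fen: str) -> dict:
    """Count pieces by letter, using only the board field of a FEN string."""
    board = fen.split()[0]
    counts = {}
    for ch in board:
        if ch.isalpha():
            counts[ch] = counts.get(ch, 0) + 1
    return counts


def should_abort(fen: str, score_cp: int, piece: str = "q",
                 margin_cp: int = 300) -> bool:
    """Return True when neither side has the tested piece type left
    and the current score is still within the margin (hypothetical rule)."""
    counts = piece_counts(fen)
    gone = (counts.get(piece.lower(), 0) == 0
            and counts.get(piece.upper(), 0) == 0)
    return gone and abs(score_cp) <= margin_cp
```

For a king-safety test one would call `should_abort(fen, score)` with the default piece `"q"`; for a rook-eval test, `should_abort(fen, score, piece="r")`.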

bob · Post by **bob** » Wed Nov 16, 2011 5:02 pm

Rebel wrote:Can you in Arena (while playing an eng-eng-match) as an engine force the interface to abort the current game and adjudicate the game in the PGN output with an "*" ?

It would speed-up certain types of tuning, examples:

1. Testing King-Safety, once the ladies left terminate the game;
2. Testing Rook-eval, no rooks on the board terminate the game;
3. etc. etc.

It would also give you automatically the number of games the element you are testing played a role and those that are not which ain't bad information.

Current score needs to fall in a reasonable margin of course as you don't want to "*" a game with a score above 3.xx.

Why would you want to do that? Not count a game when queens come off? Suppose you were winning (or losing) because of queen evaluations BEFORE the queens came off?

hgm · Post by **hgm** » Wed Nov 16, 2011 5:38 pm

Rebel wrote:Can you in Arena (while playing an eng-eng-match) as an engine force the interface to abort the current game and adjudicate the game in the PGN output with an "*" ?

Neither WB nor UCI protocol supports commands for that. So it would only be possible for a GUI to do this if it implemented commands outside the protocol. I doubt if Arena would have that.

Long ago I was requested to implement something in WinBoard that happens to do exactly what you want. As I saw no general application, I implemented this as an undocumented feature:

Whenever an engine sends setboard <FEN> to the GUI, the current game is aborted (with result 'unfinished'), and a new human-engine game is started, which has the FEN as initial position, and where the human has the move. I guess that in match mode a new eng-eng game would automatically start after that, aborting the human-engine game before any moves were made, so that you would never know this game ever existed.

Of course it would be rather trivial to make WinBoard respond to a command

* {no rooks}

just as it responds now to the commands

1-0 {checkmate}
0-1 {resigns}
1/2-1/2 {repetition draw}

etc. That is the power of open source!
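To illustrate how small such a change is, here is a rough sketch of a recogniser for result lines of this shape, extended with the proposed `*` result. It is not WinBoard's actual code; the name `parse_result` and the regular expression are assumptions made for the example.

```python
import re

# Result token, optionally followed by a {comment}; '*' is the proposed addition.
RESULT_RE = re.compile(r"^(1-0|0-1|1/2-1/2|\*)\s*(?:\{(.*)\})?\s*$")


def parse_result(line: str):
    """Return (result, comment) for a result line, or None for any other line."""
    m = RESULT_RE.match(line.strip())
    if m is None:
        return None
    return m.group(1), m.group(2) or ""
```

With this, `parse_result("* {no rooks}")` yields the result `*` and the comment `no rooks`, which the GUI could write straight to the PGN.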

Rebel · Post by **Rebel** » Wed Nov 16, 2011 6:24 pm

bob wrote:
Rebel wrote:Can you in Arena (while playing an eng-eng-match) as an engine force the interface to abort the current game and adjudicate the game in the PGN output with an "*" ?

It would speed-up certain types of tuning, examples:

1. Testing King-Safety, once the ladies left terminate the game;
2. Testing Rook-eval, no rooks on the board terminate the game;
3. etc. etc.

It would also give you automatically the number of games the element you are testing played a role and those that are not which ain't bad information.

Current score needs to fall in a reasonable margin of course as you don't want to "*" a game with a score above 3.xx.
Why would you want to do that? Not count a game when queens come off? Suppose you were winning (or losing) because of queen evaluations BEFORE the queens came off?

The bold.

bob · Post by **bob** » Wed Nov 16, 2011 6:37 pm

Rebel wrote:
bob wrote:
Rebel wrote:Can you in Arena (while playing an eng-eng-match) as an engine force the interface to abort the current game and adjudicate the game in the PGN output with an "*" ?

It would speed-up certain types of tuning, examples:

1. Testing King-Safety, once the ladies left terminate the game;
2. Testing Rook-eval, no rooks on the board terminate the game;
3. etc. etc.

It would also give you automatically the number of games the element you are testing played a role and those that are not which ain't bad information.

Current score needs to fall in a reasonable margin of course as you don't want to "*" a game with a score above 3.xx.
Why would you want to do that? Not count a game when queens come off? Suppose you were winning (or losing) because of queen evaluations BEFORE the queens came off?
The bold.

Makes no sense to me. +1.0 can be winning. +0.5 can be winning. I don't see how you can reasonably exclude a game when queens come off, regardless of the score. Something got you to that point. That "something" might be the eval changes you are measuring. Eventually +.5 can become +5.0 and beyond, if you don't bail out.

Rebel · Post by **Rebel** » Wed Nov 16, 2011 6:53 pm

I will try something like that with Arena HGM.

But there is a dark side, programmers can cheat with official releases, when behind a pawn (or so) force the next game.

Ouch....

Rebel · Post by **Rebel** » Wed Nov 16, 2011 7:02 pm

bob wrote:
Rebel wrote:
bob wrote:
Rebel wrote:Can you in Arena (while playing an eng-eng-match) as an engine force the interface to abort the current game and adjudicate the game in the PGN output with an "*" ?

It would speed-up certain types of tuning, examples:

1. Testing King-Safety, once the ladies left terminate the game;
2. Testing Rook-eval, no rooks on the board terminate the game;
3. etc. etc.

It would also give you automatically the number of games the element you are testing played a role and those that are not which ain't bad information.

Current score needs to fall in a reasonable margin of course as you don't want to "*" a game with a score above 3.xx.
Why would you want to do that? Not count a game when queens come off? Suppose you were winning (or losing) because of queen evaluations BEFORE the queens came off?
The bold.
Makes no sense to me. +1.0 can be winning. +0.5 can be winning. I don't see how you can reasonably exclude a game when queens come off, regardless of the score. Something got you to that point. That "something" might be the eval changes you are measuring. Eventually +.5 can become +5.0 and beyond, if you don't bail out.

Of course a reasonable margin would be 0.25

But even a higher margin would work because in self-play the knife statistically cuts both ways and every randomness is flattened by the volume of games. The goal is to shorten time.

hgm · Post by **hgm** » Wed Nov 16, 2011 11:18 pm

Rebel wrote:I will try something like that with Arena HGM.

But there is a dark side, programmers can cheat with official releases, when behind a pawn (or so) force the next game.

Ouch....

Indeed. So my current policy is to react to such engine->GUI commands only under GUI settings you would not use for official matches. E.g. I added a 'setup' command, which an engine can use to define board format and initial position of a variant that is unknown to the GUI. But it is only accepted when legality testing is off, so that engines cannot cheat by giving their opponent pawn odds when you run with legality testing on.

If I were to implement the '*' command for terminating a game as unfinished, I would make it subject to the setting of 'verify engine claims', and would make it result in a forfeit when such verification is switched on, just as it now forfeits engines that claim a win for themselves through the 1-0 command in positions that are not a checkmate.
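The policy for the hypothetical '*' command could be sketched like this. The function name, the `claimant` argument and the comment strings are all invented for the illustration; only the rule itself (forfeit when claims are verified, unfinished otherwise) comes from the description above.

```python
def handle_unfinished_claim(verify_engine_claims: bool, claimant: str):
    """Decide the PGN result when an engine sends '*'.

    claimant is "white" or "black", the side whose engine sent the command.
    Returns a (result, comment) pair.
    """
    if verify_engine_claims:
        # Verification on: the '*' claim cannot be verified, so the claimant forfeits.
        result = "0-1" if claimant == "white" else "1-0"
        return result, "forfeit: unverifiable claim"
    # Verification off (test setups): record the game as unfinished.
    return "*", "terminated by engine"
```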

Evert · Post by **Evert** » Thu Nov 17, 2011 7:06 am

hgm wrote:E.g. I added a 'setup' command, which an engine can use to define board format and initial position of a variant that is unknown to the GUI. But it is only accepted when legality testing is off

Aha!
That is the information I had been missing.
Is there an update anywhere describing the communication protocol for 4.5? Right now the information I have is spread out over a couple of forum posts.

hgm · Post by **hgm** » Thu Nov 17, 2011 9:44 am

No, this is another undocumented and 'unstable' feature, and will probably remain so until I am sure how to best handle it (i.e. what information exactly to put in it, and when to accept it). Currently the formats supported in WinBoard 4.5.x are

setup <FEN>
setup (<pieceToCharTable>) <FEN>

But the WinBoard Alien edition already supports the additional format

setup (<pieceToCharTable>) <BoardSize> <FEN>

where <BoardSize> is something like 10x8+7 to specify a 10x8 board with holdings for 7 pieces. The command is accepted from the first engine only (and then loaded into the second, to allow for shuffle games where the two engines might not agree on the initial setup), when legality testing is off. (In WinBoard 4.6.x I accept it in variant fairy even with legality testing on, but the results are not yet entirely satisfactory.) When the user started from a position he set up himself, the FEN part is ignored. (A bug I corrected only recently is that the pieceToCharTable in that case was also ignored.)
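As an illustration of the three formats, a parser for them might look like the sketch below. It is not taken from WinBoard; `parse_setup` is a made-up name, and two things are assumed: that the pieceToCharTable contains no closing parenthesis, and that the two numbers in 10x8 are files and ranks in that order.

```python
import re

SETUP_RE = re.compile(
    r"^setup\s+"
    r"(?:\((?P<table>[^)]*)\)\s+)?"                                   # optional (pieceToCharTable)
    r"(?:(?P<files>\d+)x(?P<ranks>\d+)\+(?P<holdings>\d+)\s+)?"       # optional board size, e.g. 10x8+7
    r"(?P<fen>\S.*)$"                                                 # rest of the line is the FEN
)


def parse_setup(line: str):
    """Parse a 'setup' line into its parts; return None if it does not fit."""
    m = SETUP_RE.match(line.strip())
    if m is None:
        return None
    # The listed formats only allow a board size after a pieceToCharTable.
    if m["files"] is not None and m["table"] is None:
        return None
    size = None
    if m["files"] is not None:
        size = (int(m["files"]), int(m["ranks"]), int(m["holdings"]))
    return {"table": m["table"], "board_size": size, "fen": m["fen"]}
```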

I am still in doubt if it should also be made possible to transfer info on the rules, e.g. define how the pieces move. It could be useful to allow that, so that any variant can be played with legality-testing on (and give better SAN). But it might be better to create a separate command for that, like

piece <ID> <gait>

where ID is the single-letter code as defined in the pieceToCharTable, and <gait> is a representation of the moves that piece can make, like

piece A {1,1}* {1,2}
piece H {1,0;2,1}
piece P (1,0)m (1,1)c (1,-1)c (1,0;2,0)vm
piece C {1,0}m* {1,0}hc*

Note that the Alien edition defines other protocol extensions as well, all designed to allow total control of the GUI by the engine. WB protocol was already pretty good at that, e.g. leaving the engine in charge of legality checking through proper use of the Illegal Move error message and result-claim commands. But some features, like marking the target squares of a move, were not possible without the GUI knowing the piece moves. To remedy that, we added commands that inform the engine when a user picks up a piece, puts it down again, or hovers it over a capture square (lift <square>, put <square> and hover <square>), enabling the engine to send highlight <colorFEN> commands to cause highlighting of the indicated squares. Nebiyu is the only engine currently using that.
