Repetition detection structure.

Discussion of chess software programming and technical issues.

Moderator: Ras

gladius
Posts: 568
Joined: Tue Dec 12, 2006 10:10 am
Full name: Gary Linscott

Re: Repetition detection structure.

Post by gladius »

bob wrote:to follow up, would you get to play more games if you stopped each one after move 30 and declared a winner if both programs liked the same side?
You'd play many more in the same amount of time of course, but I don't think stopping after move 30 is quite the same as stopping when both programs agree there is an 8 pawn advantage :).

It's just the degree of uncertainty in the game results that is changing in the 30 moves, vs 8 pawns adjudication. But I think everyone would agree that adjudicating after 30 moves if both programs agree is close to random, while an 8 pawn advantage has a much, much higher probability of reflecting the game result. There will be some draws missed, and even the occasional win, but between good programs, it won't happen very often.
hgm
Posts: 28353
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Repetition detection structure.

Post by hgm »

This is a very questionable statement. Uri selected 4 games for an independent reason (just because they were long, and contained a long sequence of reversible moves), and it turns out 2 of them were adjudicated wrong. That is 50%. I see no reasons why perpetuals would be more likely after a long sequence of reversible moves (if you adjudicate after 3-5 moves with the same score), so there is no reason why this 50% should not be representative of the total population of games.

And how much time would it really save you to terminate games that are truly won at +8? 10%?

If you really want to save time, you should adjudicate games draw after 40 reversible moves! (You can do this in WinBoard_F! 8-) ) It saves more time, and has a larger probability of being correct.
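
A minimal sketch of such a draw adjudication, for illustration only. It assumes that "40 reversible moves" means 40 full moves (80 plies) and that a ply counts as irreversible when it is a capture or a pawn move; the `Ply` type and the function names are hypothetical and are not taken from WinBoard_F.

```python
from dataclasses import dataclass


@dataclass
class Ply:
    """One half-move, reduced to the two flags the rule needs (hypothetical type)."""
    is_capture: bool
    is_pawn_move: bool


def reversible_plies(history):
    """Count the consecutive reversible plies at the end of the game so far."""
    count = 0
    for ply in reversed(history):
        if ply.is_capture or ply.is_pawn_move:
            break  # an irreversible ply resets the counter
        count += 1
    return count


def adjudicate_draw(history, move_limit=40):
    """Return True when the game should be adjudicated as a draw.

    Assumption: the limit is given in full moves, so it is doubled to plies.
    """
    return reversible_plies(history) >= 2 * move_limit
```

The counter is the same quantity the 50-move rule uses, so a GUI that already tracks it for draw claims only needs a lower threshold.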
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Repetition detection structure.

Post by bob »

gladius wrote:
bob wrote:to follow up, would you get to play more games if you stopped each one after move 30 and declared a winner if both programs liked the same side?
You'd play many more in the same amount of time of course, but I don't think stopping after move 30 is quite the same as stopping when both programs agree there is an 8 pawn advantage :).

It's just the degree of uncertainty in the game results that is changing in the 30 moves, vs 8 pawns adjudication. But I think everyone would agree that adjudicating after 30 moves if both programs agree is close to random, while an 8 pawn advantage has a much, much higher probability of reflecting the game result. There will be some draws missed, and even the occasional win, but between good programs, it won't happen very often.
It is an arbitrary stopping point nonetheless. And in the game in question, I don't believe it was +8...
Uri Blass
Posts: 10790
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Repetition detection structure.

Post by Uri Blass »

hgm wrote:This is a very questionable statement. Uri selected 4 games for an independent reason (just because they were long, and contained a long sequence of reversible moves), and it turns out 2 of them were adjudicated wrong. That is 50%. I see no reasons why perpetuals would be more likely after a long sequence of reversible moves (if you adjudicate after 3-5 moves with the same score), so there is no reason why this 50% should not be representative of the total population of games.

And how much time would it really save you to terminate games that are truly won at +8? 10%?

If you really want to save time, you should adjudicate games draw after 40 reversible moves! (You can do this in WinBoard_F! 8-) ) It saves more time, and has a larger probability of being correct.
I think that this probability is higher for weak programs.

In the case of the perpetual check, stronger programs can see the draw
at a 40/40 time control, so they are not going to evaluate the position as a big advantage.

The second case was not a drawn position but a case of a program that does not know how to win; a stronger program would probably know how to win.

50% is also not the probability for Joker.
Here are the first 10 positions that were adjudicated for Joker1.14; probably all of them were adjudicated with the correct result.

[d]2k5/2P5/3pN3/pp1P4/1p6/3K2r1/P7/7R w - - 0 107
[d]8/3P1k1p/5R2/2pp2p1/8/6P1/5P1K/8 b - - 0 46
[d]5r2/8/5KP1/5R2/3k2P1/8/8/8 w - - 0 79
[d]8/8/5k2/8/1K6/8/8/N7 b - - 0 130 (joker did not see the draw)
[d]8/8/p7/2R5/4Pqpk/7p/4K3/2R5 w - - 0 58
[d]R7/8/4k3/4b3/8/B1pK4/4p3/r7 w - - 0 112
[d]2k5/7p/6p1/p1p1Pp2/1p2pP2/4P1P1/3K3P/8 w - - 0 42
[d]5RRr/4k2P/3n4/p7/8/2b5/6K1/1B6 b - - 0 78
[d]8/5pk1/6rp/4r2p/3q4/8/3n2PQ/3RB2K w - - 0 55
[d]7r/5R1P/8/8/8/6K1/4p3/5k2 b - - 0 90
Uri Blass
Posts: 10790
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Repetition detection structure.

Post by Uri Blass »

bob wrote:
gladius wrote:
bob wrote:to follow up, would you get to play more games if you stopped each one after move 30 and declared a winner if both programs liked the same side?
You'd play many more in the same amount of time of course, but I don't think stopping after move 30 is quite the same as stopping when both programs agree there is an 8 pawn advantage :).

It's just the degree of uncertainty in the game results that is changing in the 30 moves, vs 8 pawns adjudication. But I think everyone would agree that adjudicating after 30 moves if both programs agree is close to random, while an 8 pawn advantage has a much, much higher probability of reflecting the game result. There will be some draws missed, and even the occasional win, but between good programs, it won't happen very often.
It is an arbitrary stopping point nonetheless. And in the game in question, I don't believe it was +8...
It is an arbitrary stopping point, but I have never seen a game adjudicated as a win with less than +4 from both sides.

In the garbochess game it was +4.68 by garbochess and +4.91 by Joker,
and I guess that the GUI decided to adjudicate because of the small increase in the evaluation relative to previous moves.

Evaluations of moves 98-102 from that game

98 4.62 4.68
99 4.65 4.70
100 4.62 4.91
101 4.65 4.91
102 4.68

The GUI probably decided that there was enough progress in the score to adjudicate it as a win.
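
The actual rule used by the GUI is not stated in this thread. As a rough sketch of one plausible rule, the function below adjudicates a win when both engines report at least a given advantage for the same side over several consecutive moves. The threshold of 4.5 pawns, the three-move window, and the function name are assumptions chosen only so that the scores listed above trigger it; the scores are taken as pairs in the order given in the post and are assumed to be from the same side's point of view.

```python
def should_adjudicate_win(scores, threshold=4.5, moves_required=3):
    """Decide whether to stop the game and award a win.

    scores: list of (eval_a, eval_b) pairs in pawns, one pair per move,
            both normalised to the same side's point of view (assumption).
    """
    if len(scores) < moves_required:
        return False
    recent = scores[-moves_required:]
    # Both engines must agree on the advantage for every recent move.
    return all(min(a, b) >= threshold for a, b in recent)


# Moves 98-101 from the post (move 102 has only one score, so it is left out).
scores_98_to_101 = [(4.62, 4.68), (4.65, 4.70), (4.62, 4.91), (4.65, 4.91)]
print(should_adjudicate_win(scores_98_to_101))
print(should_adjudicate_win(scores_98_to_101, threshold=8.0))
```

With a +8 threshold the same game would have been played on, which is the difference between the two positions argued in this thread.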

Uri
hgm
Posts: 28353
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Repetition detection structure.

Post by hgm »

Be that as it may, there still is no good case for adjudicating games before they are finished. If you want to save testing time, playing out the game at high speed would serve that purpose just as well.

I don't think you would have to worry much about disadvantaging engines that are not good at blitz. For one, different engines do not change their relative strength that much at faster time controls, and second, even if they lose a couple of hundred rating points compared to the opponent, they still should easily be able to win from a +8 situation. (Based on the Rybka self-play experiment, pawn odds gives a 74% score, i.e. ~170 rating points, so with 8 times that advantage a few hundred rating points don't matter that much.)
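
The rating arithmetic in the parenthesis can be checked with the standard logistic Elo formula. This is only a sketch: the ~170 figure quoted depends on the rating model used (the logistic formula below gives a somewhat higher number for a 74% score), and multiplying the pawn-odds advantage by 8 is the rough extrapolation made in the post, not an established result. The function names and the 300-point handicap are made up for illustration.

```python
import math


def elo_diff(score):
    """Rating difference implied by an expected score (logistic Elo model)."""
    return 400.0 * math.log10(score / (1.0 - score))


def expected_score(diff):
    """Expected score for a given rating advantage (logistic Elo model)."""
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))


# Rating equivalent of a 74% score under this model.
pawn_odds = elo_diff(0.74)

# Extrapolation from the post: eight pawns treated as 8 x ~170 points,
# minus a hypothetical 300-point handicap for an engine that is weak at blitz.
blitz_handicap = 300
print(round(pawn_odds), expected_score(8 * 170 - blitz_handicap))
```

Under these assumptions the expected score stays well above 99%, which is the point being made: a few hundred rating points barely matter against an advantage of that size.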
Uri Blass
Posts: 10790
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Repetition detection structure.

Post by Uri Blass »

hgm wrote:Be that as it may, there still is no good case for adjudicating games before they are finished. If you want to save testing time, playing out the game at high speed would serve that purpose just as well.

I don't think you would have to worry much about disadvantaging engines that are not good at blitz. For one, different engines do not change their relative strength that much at faster time controls, and second, even if they lose a couple of hundred rating points compared to the opponent, they still should easily be able to win from a +8 situation. (Based on the Rybka self-play experiment, pawn odds gives a 74% score, i.e. ~170 rating points, so with 8 times that advantage a few hundred rating points don't matter that much.)
You are probably right for the strong engines.
The main problem with the weak engines is that
some of them simply often lose on time at blitz, or cause stupid repetitions when they only reach depth 1 in time trouble.

Uri
hgm
Posts: 28353
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Repetition detection structure.

Post by hgm »

I don't think that any of the engines that CCRL tests would fall in the category you refer to (that would only reach depth 1 in a 1-min game). That might be different for WBEC. But engines that lose on time in 40/5 games are not even admitted in WBEC. So you could always finish the game at 40/5, and still reap 87.5% of the time savings you were after, which for practical purposes should be good enough.

I think you are worrying now about hypothetical problems that are not likely to occur in practice at all.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Repetition detection structure.

Post by bob »

hgm wrote:
bob wrote:Note that was apparently a FIDE rule, which is OK by me. But if a program says "this is a draw" and it isn't, the game should continue unless the program refuses and loses on time.
Yes, it was a FIDE rule, and by that FIDE rule all computer programs would forfeit their game even before it would start, as they have no hands and do not greet their opponent in any alternative way. That just illustrates how silly it is to require that Human rules apply without any adaptation to computers.

Furthermore note that the engines under WinBoard_F (if the user sets this option) do not forfeit because they say "this is a draw", but because they say "I won't play on" in a position that is not won or drawn by any rule. And if they don't mean that, they'd just better not say it. This is also completely similar to Human Chess, where there is also no way back if you say "I resign", and later claim that you meant something else... Even if you think this is undesirable (which I don't), in practice there is simply no way to accommodate engines that substitute arbitrary commands for arbitrary other commands, by having a GUI second-guess what they really mean. Because where would it stop? You might as well have the GUI say, when a program plays 1. a2a3 "Well, but this is such a poor move, it cannot possibly mean that. Let's assume it is just bad pronunciation, and that it means 1. e2e3"... If you don't stick to the protocol, the consequences are entirely for you. That is true if you number the board files differently, and it is true when you decide 'resign' means castling. So why should it be any different when you say "this game is definitely over" when you mean you want to play on?
Sorry, but a computer playing in a FIDE event does have to shake hands with the opponent before the game starts. It is done by the human operator. The official rules have always been modified to allow computers to play, since computers obviously don't have hands to make moves or hit the clock, etc. USCF has computers using almost the same rules as those used by blind players, who can have an "assistant" that makes the move, hits the clock, and tells the player when a move has been made, etc...

But functional rules can be handled properly, things like draw offers and such...
Uri Blass
Posts: 10790
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Repetition detection structure.

Post by Uri Blass »

bob wrote:
hgm wrote:
bob wrote:Note that was apparently a FIDE rule, which is OK by me. But if a program says "this is a draw" and it isn't, the game should continue unless the program refuses and loses on time.
Yes, it was a FIDE rule, and by that FIDE rule all computer programs would forfeit their game even before it would start, as they have no hands and do not greet their opponent in any alternative way. That just illustrates how silly it is to require that Human rules apply without any adaptation to computers.

Furthermore note that the engines under WinBoard_F (if the user sets this option) do not forfeit because they say "this is a draw", but because they say "I won't play on" in a position that is not won or drawn by any rule. And if they don't mean that, they'd just better not say it. This is also completely similar to Human Chess, where there is also no way back if you say "I resign", and later claim that you meant something else... Even if you think this is undesirable (which I don't), in practice there is simply no way to accommodate engines that substitute arbitrary commands for arbitrary other commands, by having a GUI second-guess what they really mean. Because where would it stop? You might as well have the GUI say, when a program plays 1. a2a3 "Well, but this is such a poor move, it cannot possibly mean that. Let's assume it is just bad pronunciation, and that it means 1. e2e3"... If you don't stick to the protocol, the consequences are entirely for you. That is true if you number the board files differently, and it is true when you decide 'resign' means castling. So why should it be any different when you say "this game is definitely over" when you mean you want to play on?
Sorry, but a computer playing in a FIDE event does have to shake hands with the opponent before the game starts. It is done by the human operator. The official rules have always been modified to allow computers to play, since computers obviously don't have hands to make moves or hit the clock, etc. USCF has computers using almost the same rules as those used by blind players, who can have an "assistant" that makes the move, hits the clock, and tells the player when a move has been made, etc...

But functional rules can be handled properly, things like draw offers and such...
There is a reason to have different functional rules for computers and humans in a few cases, such as draw claims (offering draws is no problem).
The reason is that there are mistakes that non-buggy programs do not make.

A wrong draw claim under the 50-move rule or for repetition is one of them.
Engines usually also agree about insufficient-material draw claims
(K vs K draw,
KN vs K draw, KB vs K draw, KB vs KB draw when the bishops are on the same color).
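
The insufficient-material cases listed here are simple enough that an interface can verify them directly. The following is a minimal sketch, assuming the position is available as a FEN string; the function name is hypothetical, this is not WinBoard's code, and it covers only the four cases named above.

```python
def insufficient_material(fen):
    """True for K vs K, KN vs K, KB vs K, and KB vs KB with same-colored bishops."""
    placement = fen.split()[0]
    others = []  # (piece letter, square color) for every piece that is not a king
    for rank_index, rank in enumerate(placement.split("/")):
        file_index = 0
        for ch in rank:
            if ch.isdigit():
                file_index += int(ch)  # a digit is a run of empty squares
                continue
            if ch.lower() != "k":
                # Squares with equal parity of rank + file share a color.
                others.append((ch, (rank_index + file_index) % 2))
            file_index += 1
    if not others:
        return True  # K vs K
    if len(others) == 1:
        return others[0][0].lower() in ("n", "b")  # KN vs K or KB vs K
    if len(others) == 2:
        (p1, c1), (p2, c2) = others
        # One bishop per side, both on the same square color.
        return {p1, p2} == {"B", "b"} and c1 == c2
    return False


# The KN vs K position from earlier in the thread.
print(insufficient_material("8/8/5k2/8/1K6/8/8/N7 b - - 0 130"))
```

A draw claim in any other position would then need the 50-move counter or the repetition history to be checked instead.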

Most engines are not smart enough to make correct draw claims in other practical cases, and if an author's program is smart enough to do so, he may send his code for deciding when to claim draws to H.G.Muller; I guess that Muller would have no problem using it in the adjudication.

The goal of the interface is to support engines of different strengths, but
not to support a small number of buggy engines at the price of spending computer time, so the simplest solution is that engines that make wrong draw claims fix their bugs and claim draws correctly.

If the interface decides to continue after a wrong draw claim, then the engine that made the wrong claim may often lose on time (because the author usually assumes that the game is not going to continue after a draw claim, so he does not care how his engine behaves in that case), and you may lose computer time relative to the case where the interface adjudicates the game based on the wrong draw claim.


Finally, I believe that H.G.Muller's code for WinBoard is free, so you are free to change it and release your own WinBoard version if you do not like adjudicating games in case of draw claims.

The previous situation of accepting wrong draw claims cannot be considered better, even from your point of view.

Detecting wrong draw claims is the first step even from your point of view, and you can add an option to continue the game after a wrong draw claim.

I guess that most testers will prefer to use H.G.Muller's version rather than yours if you do that, but testers are free to do what they like.

Uri