TCEC stage 3 , New Houdini starts with a bang

syzygy
Posts: 5713
Joined: Tue Feb 28, 2012 11:56 pm

Re: TCEC stage 3 , New Houdini starts with a bang

Post by syzygy »

Michel wrote:
I must disagree with the coin analogy. A proper analogy would be that an unbalanced coin was flipped and then replaced with a fair coin. The first flip was not correct since the parameters of the coin were not correct.
The coin analogy is correct (since the bias was very small).

If the game would also have been replayed in the case of a draw/loss for Houdini, then the decision was fair.

However a more likely scenario in case of a win for K would have been that the game would not have been replayed, the reason being that it was K that was handicapped, not Houdini.... (and it would have been hard to argue with this).
I agree with you on each and every point ;-)

I would really like to see the outcry if the handicapped side had won and the game had not been restarted. Personally I do not believe there would have been any discussion at all in that case.

Of course I may be completely wrong in thinking that the game might not have been replayed had the handicapped side won, but that there is reasonable doubt about it (for lack of a clear rule) suffices.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: TCEC stage 3 , New Houdini starts with a bang

Post by bob »

syzygy wrote:
bob wrote:There is one overriding question here. What is the purpose of playing the games? (a) to see if the human operators can set things up correctly? or (b) to see which program wins the game, using optimal settings? I've always gone for (b). And, in fact, ICGA tournament rules require this. If a game is re-started, with wrong settings, it gets backed up to the last correct point and resumed. If the wrong move is entered, the game backs up to the last correct move and continues. The goal has always been to see which program plays better at that particular instant in time, rather than whether or not an operator makes a simple configuration error.
Are you sure that also applies to wrong settings at ICGA tournaments?

For example, an operator might have inadvertently configured the engine to use too little time. Halfway through the game he discovers his error. Is the game now restarted from the point where the mistake was made (i.e. from move 1)?

It seems to me that would allow for too much abuse. I guess there must be some restrictions on the type of mistake that can be corrected.

(Of course in TCEC there is no such operator involvement, so things are normally much simpler.)
It is more difficult at an ICGA event because in general the programmer is the one running the engine. But even in that case, if something is broken, the game does back up to that point, the mistake is corrected, and the game resumes.

I have only seen one example to the contrary. In the 1983 WCCC event, we were playing BCP (Don Beal's program). He set the time control to blitz, and since he was the programmer, David (Levy) did not allow him to change the settings back later (the thinking was that he had made several very fast moves, and then using all that saved time might produce an advantage...).

For me, I prefer "let the programs decide the outcome". I played in the 1985 ACM event in Denver. We played Zarkov running on an overclocked HP chip running in the HP lab. We were in an interesting KR + pawns ending that most agreed we should win, but there was lots of discussion about whether a program could actually win the ending or not. John ran into some sort of issue where his program crashed repeatedly. Rather than claiming a win on time, we elected to let him have enough time to correct the problem (don't remember whether he moved to a different machine or what, he would have to answer that), so that we could see what happened. Turns out we were able to win it because of the "if we must play a move that leads to a draw, play the one that draws as far away as possible from the root..."

So my penchant is pretty clear, and others have done the same as well. Of course, that was back in the days of no clones, no ethically-challenged opponents... :)
Robert Pope
Posts: 567
Joined: Sat Mar 25, 2006 8:27 pm
Location: USA
Full name: Robert Pope

Re: TCEC stage 3 , New Houdini starts with a bang

Post by Robert Pope »

I can just imagine the scenario where they accidentally played Stockfish against Houdini in the game. And then people argue that the game shouldn't be replayed because Stockfish is pretty much the same strength as Komodo, so using the wrong engine only disadvantaged Komodo by 5 ELO in that game.
:)
mjlef
Posts: 1494
Joined: Thu Mar 30, 2006 2:08 pm

Re: TCEC stage 3 , New Houdini starts with a bang

Post by mjlef »

Rochester wrote:It is only sensible to play with the default settings. Then the programmer must always code them correctly. He can make the program use the settings he wants, and make a new program when the settings change.

And when the programmer goes to the movies, he can't complain when he returns. Is the movie more important? NO!

I try to code the specific things we want in the program. For example Komodo 10.1's Dynamism default is 115. But they manually set it to 110 on TCEC. Also, some things like SyzygyPath vary from machine to machine. They are allowed to change the machine between stages, but we are not told the paths to hard code in the program. So these have to be manually set.

As for the movie, I had no way of knowing when Komodo would play. I would have chosen another time to go to the movie if I knew Komodo would play in the first game. But that was not announced in advance.

Of course I can complain when I find a setting that was not correctly made, just as any other programmer. Larry and I would be the only ones who could check it since we chose them. And I dashed out an email within about one minute of finding the errors.

They have been very accurate in the past setting everything correctly. But they are only human. They corrected it quickly.
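
The manual check described above (comparing the settings a team requested against what was actually configured) could in principle be automated and published before the first game. A minimal sketch, assuming the requested and applied options are both available as name/value pairs; the function name `diff_settings` and the example values are hypothetical and are not the actual settings involved in this game:

```python
def diff_settings(requested, applied):
    """Return {option: (requested_value, applied_value)} for every mismatch.

    Both arguments are plain dicts of option name -> value. An option that
    was requested but never applied shows up with an applied value of None.
    """
    mismatches = {}
    for name, wanted in requested.items():
        actual = applied.get(name)
        if actual != wanted:
            mismatches[name] = (wanted, actual)
    return mismatches


# Illustrative values only: the team asked for Dynamism 110,
# but the engine was left at its default of 115.
requested = {"Dynamism": 110, "Contempt": 7}
applied = {"Dynamism": 115, "Contempt": 7}

for option, (wanted, actual) in diff_settings(requested, applied).items():
    print(f"{option}: requested {wanted}, applied {actual}")
```

Machine-specific options such as SyzygyPath would still have to be filled in by the organizers, so a check like this only covers the options the team can specify in advance.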
syzygy
Posts: 5713
Joined: Tue Feb 28, 2012 11:56 pm

Re: TCEC stage 3 , New Houdini starts with a bang

Post by syzygy »

Raptor wrote:
Michel wrote:The issue is the part of the above statement in bold. This would make the decision biased (in favour of Houdini). So Mark would indeed have to agree to such arrangement.

So did Houdini's author agree with the replay? It seems likely he wasn't even asked.
The issue is not about whose favour it would be in (I am saying even if TCEC re-played the game without asking Mark he would be fine).

It's about playing with settings and configurations that the programmers intended. This is not something Mark requested ad-hoc, it wasn't like 'Oh Komodo is losing, maybe Contempt 7 will do better let me ask for it'.

The settings were requested before the stage began.

In this case, if I were the affected programmer, I would request a replay of the game because I demand the settings I asked for. The organizers, within the rules, allowed me that, and failed to implement it for the game.

The point I am trying to make is, irrespective of the result the game is null and void the moment it started with wrong settings for either of the participants.

I would be equally vocal in support of a replay, no matter what engines were involved, in fact I would push for a replay if Raptor was impacted.

So Team Komodo requesting the setting change and pointing out the organizer mistake is well within their rights. And as a response I feel that the decision taken by the TCEC team is fair and just, they would do it for anyone.
If the rules had been clear, the situation would have been easy to resolve: the first game was void and must be replayed (no whats or ifs).

So far nobody has pointed out a clear rule. Apparently there was an earlier case that was replayed, but then one engine was set to search with one thread only. That game was clearly meaningless. With Komodo's settings, the game was not meaningless. At least, I have difficulty accepting that the values that were good for K10 in stage 2 were no longer reasonable for K10.1 in stage 3. Clearly the values should have been set correctly and K10.1 was disadvantaged, but the game itself was still a game.

So the rules were not clear, as far as I can tell. And there is an easy naive argument that goes "Komodo was handicapped and lost (or was behind) and so the game must be replayed". If that was the reason for the (very quick) decision, then the decision was wrong.

There is the other issue of what would have happened if the error had been detected only five weeks from now. Would all Komodo games have been replayed? If not, then the decision to replay this one game was wrong.

So consider these questions... what if Komodo had won this game? What if the mistake had been detected only five weeks from now? If there is no doubt in your mind that the games would have been replayed no matter what, then you should feel confident that the decision to replay was the only correct decision. But if there is doubt...
Just out of curiosity, what if TCEC had by mistake set Stockfish to Skill Level=0? Would you want a replay, or would you say 'that's unfair to the opponent'? Think about who it is most unfair to!
Then the game would have been meaningless and should be replayed.

I don't know where to draw the line between meaningless and meaningful. But there is no reason to ask me. Why not consult the engine authors?
lkaufman
Posts: 6255
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: TCEC stage 3 , New Houdini starts with a bang

Post by lkaufman »

syzygy wrote:
Raptor wrote:
Michel wrote:The issue is the part of the above statement in bold. This would make the decision biased (in favour of Houdini). So Mark would indeed have to agree to such arrangement.

So did Houdini's author agree with the replay? It seems likely he wasn't even asked.
The issue is not about whose favour it would be in (I am saying even if TCEC re-played the game without asking Mark he would be fine).

It's about playing with settings and configurations that the programmers intended. This is not something Mark requested ad-hoc, it wasn't like 'Oh Komodo is losing, maybe Contempt 7 will do better let me ask for it'.

The settings were requested before the stage began.

In this case, if I were the affected programmer, I would request a replay of the game because I demand the settings I asked for. The organizers, within the rules, allowed me that, and failed to implement it for the game.

The point I am trying to make is, irrespective of the result the game is null and void the moment it started with wrong settings for either of the participants.

I would be equally vocal in support of a replay, no matter what engines were involved, in fact I would push for a replay if Raptor was impacted.

So Team Komodo requesting the setting change and pointing out the organizer mistake is well within their rights. And as a response I feel that the decision taken by the TCEC team is fair and just, they would do it for anyone.
If the rules had been clear, the situation would have been easy to resolve: the first game was void and must be replayed (no whats or ifs).

So far nobody has pointed out a clear rule. Apparently there was an earlier case that was replayed, but then one engine was set to search with one thread only. That game was clearly meaningless. With Komodo's settings, the game was not meaningless. At least, I have difficulty accepting that the values that were good for K10 in stage 2 were no longer reasonable for K10.1 in stage 3. Clearly the values should have been set correctly and K10.1 was disadvantaged, but the game itself was still a game.

So the rules were not clear, as far as I can tell. And there is an easy naive argument that goes "Komodo was handicapped and lost (or was behind) and so the game must be replayed". If that was the reason for the (very quick) decision, then the decision was wrong.

There is the other issue of what would have happened if the error had been detected only five weeks from now. Would all Komodo games have been replayed? If not, then the decision to replay this one game was wrong.

So consider these questions... what if Komodo had won this game? What if the mistake had been detected only five weeks from now? If there is no doubt in your mind that the games would have been replayed no matter what, then you should feel confident that the decision to replay was the only correct decision. But if there is doubt...
Just out of curiosity, what if TCEC had by mistake set Stockfish to Skill Level=0? Would you want a replay, or would you say 'that's unfair to the opponent'? Think about who it is most unfair to!
Then the game would have been meaningless and should be replayed.

I don't know where to draw the line between meaningless and meaningful. But there is no reason to ask me. Why not consult the engine authors?
Perhaps you are not a tournament chess player, I don't know, but in real chess tournaments the general rule is that complaints must be made during, not after, the game. So it does not follow that just because an error was corrected during the game, an error must be corrected five weeks later. But TCEC should clarify when an error must be caught to force a replay. I would say something like until 24 hours after game end, because the game might be played in the middle of the night for the programmer.
Komodo rules!
Uri Blass
Posts: 10872
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: TCEC stage 3 , New Houdini starts with a bang

Post by Uri Blass »

lkaufman wrote:
syzygy wrote:
Raptor wrote:
Michel wrote:The issue is the part of the above statement in bold. This would make the decision biased (in favour of Houdini). So Mark would indeed have to agree to such arrangement.

So did Houdini's author agree with the replay? It seems likely he wasn't even asked.
The issue is not about whose favour it would be in (I am saying even if TCEC re-played the game without asking Mark he would be fine).

It's about playing with settings and configurations that the programmers intended. This is not something Mark requested ad-hoc, it wasn't like 'Oh Komodo is losing, maybe Contempt 7 will do better let me ask for it'.

The settings were requested before the stage began.

In this case, if I were the affected programmer, I would request a replay of the game because I demand the settings I asked for. The organizers, within the rules, allowed me that, and failed to implement it for the game.

The point I am trying to make is, irrespective of the result the game is null and void the moment it started with wrong settings for either of the participants.

I would be equally vocal in support of a replay, no matter what engines were involved, in fact I would push for a replay if Raptor was impacted.

So Team Komodo requesting the setting change and pointing out the organizer mistake is well within their rights. And as a response I feel that the decision taken by the TCEC team is fair and just, they would do it for anyone.
If the rules had been clear, the situation would have been easy to resolve: the first game was void and must be replayed (no whats or ifs).

So far nobody has pointed out a clear rule. Apparently there was an earlier case that was replayed, but then one engine was set to search with one thread only. That game was clearly meaningless. With Komodo's settings, the game was not meaningless. At least, I have difficulty accepting that the values that were good for K10 in stage 2 were no longer reasonable for K10.1 in stage 3. Clearly the values should have been set correctly and K10.1 was disadvantaged, but the game itself was still a game.

So the rules were not clear, as far as I can tell. And there is an easy naive argument that goes "Komodo was handicapped and lost (or was behind) and so the game must be replayed". If that was the reason for the (very quick) decision, then the decision was wrong.

There is the other issue of what would have happened if the error had been detected only five weeks from now. Would all Komodo games have been replayed? If not, then the decision to replay this one game was wrong.

So consider these questions... what if Komodo had won this game? What if the mistake had been detected only five weeks from now? If there is no doubt in your mind that the games would have been replayed no matter what, then you should feel confident that the decision to replay was the only correct decision. But if there is doubt...
Just out of curiosity, what if TCEC had by mistake set Stockfish to Skill Level=0? Would you want a replay, or would you say 'that's unfair to the opponent'? Think about who it is most unfair to!
Then the game would have been meaningless and should be replayed.

I don't know where to draw the line between meaningless and meaningful. But there is no reason to ask me. Why not consult the engine authors?
Perhaps you are not a tournament chess player, I don't know, but in real chess tournaments the general rule is that complaints must be made during, not after, the game. So it does not follow that just because an error was corrected during the game, an error must be corrected five weeks later. But TCEC should clarify when an error must be caught to force a replay. I would say something like until 24 hours after game end, because the game might be played in the middle of the night for the programmer.
In this case there is a problem: the programmer may decide whether or not to complain based on the result, and that is not fair.

I am not accusing the Komodo team of complaining only because Komodo lost, but I can imagine a programmer deciding not to complain if his program wins the first game and he estimates the loss from the wrong setting at no more than 5 Elo.

I think that in the future the correct settings should be shown before the first game, so that everybody, and not only the programmers, can complain if something is wrong.

I also think that if a wrong setting is found before the end of the stage, the games of the relevant program should be replayed.
syzygy
Posts: 5713
Joined: Tue Feb 28, 2012 11:56 pm

Re: TCEC stage 3 , New Houdini starts with a bang

Post by syzygy »

bob wrote:
syzygy wrote:
bob wrote:There is one overriding question here. What is the purpose of playing the games? (a) to see if the human operators can set things up correctly? or (b) to see which program wins the game, using optimal settings? I've always gone for (b). And, in fact, ICGA tournament rules require this. If a game is re-started, with wrong settings, it gets backed up to the last correct point and resumed. If the wrong move is entered, the game backs up to the last correct move and continues. The goal has always been to see which program plays better at that particular instant in time, rather than whether or not an operator makes a simple configuration error.
Are you sure that also applies to wrong settings at ICGA tournaments?

For example, an operator might have inadvertently configured the engine to use too little time. Halfway through the game he discovers his error. Is the game now restarted from the point where the mistake was made (i.e. from move 1)?

It seems to me that would allow for too much abuse. I guess there must be some restrictions on the type of mistake that can be corrected.

(Of course in TCEC there is no such operator involvement, so things are normally much simpler.)
It is more difficult at an ICGA event because in general the programmer is the one running the engine. But even in that case, if something is broken, the game does back up to that point, the mistake is corrected, and the game resumes.

I have only seen one example to the contrary. In the 1983 WCCC event, we were playing BCP (Don Beal's program). He set the time control to blitz, and since he was the programmer, David (Levy) did not allow him to change the settings back later (the thinking was that he had made several very fast moves, and then using all that saved time might produce an advantage...).
Unless the rules specifically state something else, I would tend to allow the settings to be changed back later (if there is no real doubt that the current setting was never intended), but I would not allow the game to be backed up "to the last correct point". That would give the side making the error the option to see how it goes and, if it doesn't go well, to have another try.

I might be more lenient if the incorrect setting evidently led to hopeless play.
For me, I prefer "let the programs decide the outcome". I played in the 1985 ACM event in Denver. We played Zarkov running on an overclocked HP chip running in the HP lab. We were in an interesting KR + pawns ending that most agreed we should win, but there was lots of discussion about whether a program could actually win the ending or not. John ran into some sort of issue where his program crashed repeatedly. Rather than claiming a win on time, we elected to let him have enough time to correct the problem (don't remember whether he moved to a different machine or what, he would have to answer that), so that we could see what happened. Turns out we were able to win it because of the "if we must play a move that leads to a draw, play the one that draws as far away as possible from the root..."

So my penchant is pretty clear, and others have done the same as well. Of course, that was back in the days of no clones, no ethically-challenged opponents... :)
In my view, if the rules were clear and left no choice to the participants, they should have been followed. (But maybe the rules only stated that you were entitled to claim the point. If you then don't claim it, that's fine of course.)

Some time ago I came across an old thread about a similar incident (but I don't think it was this 1985 incident). In that case the side that should have been awarded the point according to the clear rules requested the game to be continued. The request was allowed and the "gracious" side ended up losing the point (or maybe half a point).

In that thread, Bruce Moreland (if I am not mistaken) very convincingly argued why this was wrong. If the rules are clear, they should just be followed. Otherwise you put people under pressure because nobody wants to become known as ethically challenged. Exercising your right (in a game context) is no sign of an ethical weakness, but that is difficult to explain when others are trying to put you in a bad light. The best way to avoid such situations is for the TD to enforce the rules.

When the rules are silent, things become more difficult.
syzygy
Posts: 5713
Joined: Tue Feb 28, 2012 11:56 pm

Re: TCEC stage 3 , New Houdini starts with a bang

Post by syzygy »

lkaufman wrote:Perhaps you are not a tournament chess player, I don't know, but in real chess tournaments the general rule is that complaints must be made during, not after, the game.
I am aware of that. But I can assure you that opinions in the "replay was correct" camp vary wildly on the question whether a replay would have been correct if the error was reported only after the game had finished. Even Mark may have a different view on that than you do (but I don't remember clearly if he has expressed a view on this).

So your argument only confirms that the "replay rule", if there was one at all, is pretty vague.

I can also point out an obvious problem with the "replay if error reported during the game" rule. If that was the rule, then an engine author noticing that the engine was misconfigured would have the option to wait until the game is effectively decided and report the error before the end of the game only if he does not like the outcome. If he does like the outcome, simply wait until the end of the game and have the settings corrected for the next game.

So the rule that you are proposing is flawed. (OK, I understand from your last paragraph that this is not the rule you are proposing. Instead, you are proposing yet another rule. I do hope that you agree that rules applicable to past events should not be made up as one goes along.)

As I said, the only way the replay could be fair is if the game had been replayed "no matter what". And that rule should have been in place before the stage started.
So it does not follow that just because an error was corrected during the game, an error must be corrected five weeks later. But TCEC should clarify when an error must be caught to force a replay. I would say something like until 24 hours after game end, because the game might be played in the middle of the night for the programmer.
So with that rule, the programmer can take his time and consider whether he wants a replay (-> complain within 24 hours) or is happy with the outcome (-> complain only after 24 hours have passed). Not a good rule, even if it had been in place from the start and known to the participants.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: TCEC stage 3 , New Houdini starts with a bang

Post by bob »

syzygy wrote:
lkaufman wrote:Perhaps you are not a tournament chess player, I don't know, but in real chess tournaments the general rule is that complaints must be made during, not after, the game.
I am aware of that. But I can assure you that opinions in the "replay was correct" camp vary wildly on the question whether a replay would have been correct if the error was reported only after the game had finished. Even Mark may have a different view on that than you do (but I don't remember clearly if he has expressed a view on this).

So your argument only confirms that the "replay rule", if there was one at all, is pretty vague.

I can also point out an obvious problem with the "replay if error reported during the game" rule. If that was the rule, then an engine author noticing that the engine was misconfigured would have the option to wait until the game is effectively decided and report the error before the end of the game only if he does not like the outcome. If he does like the outcome, simply wait until the end of the game and have the settings corrected for the next game.

So the rule that you are proposing is flawed. (OK, I understand from your last paragraph that this is not the rule you are proposing. Instead, you are proposing yet another rule. I do hope that you agree that rules applicable to past events should not be made up as one goes along.)

As I said, the only way the replay could be fair is if the game had been replayed "no matter what". And that rule should have been in place before the stage started.
So it does not follow that just because an error was corrected during the game, an error must be corrected five weeks later. But TCEC should clarify when an error must be caught to force a replay. I would say something like until 24 hours after game end, because the game might be played in the middle of the night for the programmer.
So with that rule, the programmer can take his time and consider whether he wants a replay (-> complain within 24 hours) or is happy with the outcome (-> complain only after 24 hours have passed). Not a good rule, even if it had been in place from the start and known to the participants.
There is a trap here, and the rule must say "replay" or "accept as played", period. You can't give someone the option of watching a game with wrong settings and then saying nothing if they win, but protesting and forcing a replay if they lose.

I personally would change the way this is done and simply say "we will alter NO settings whatsoever. Use whatever values you want in the executable and that will be your only opportunity to set values." If you have specific values you want to change for a specific opponent, that needs to be done INTERNALLY within the engine, assuming TCEC is giving the opponent's name (xboard/winboard does this for example).

The days of "hand-setting" options are past, and there is really no good reason for doing this.
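
To illustrate what "done INTERNALLY within the engine" might look like, here is a minimal sketch of per-opponent settings chosen from the opponent's name. It assumes the name arrives through the UCI `UCI_Opponent` option, whose value has the form `[title] [elo] [computer|human] [name]`; whether TCEC actually sends this is an assumption. The table contents, option values, and function names are hypothetical:

```python
# Built-in defaults and per-opponent overrides, compiled into the engine.
# The values here are made up for illustration.
DEFAULTS = {"Contempt": 0}
PER_OPPONENT = {"Houdini": {"Contempt": 7}}


def opponent_name(uci_opponent_value):
    """Extract the name from a UCI_Opponent value.

    The value is "[title] [elo] [computer|human] [name]", and the name
    itself may contain spaces, so split at most three times.
    """
    parts = uci_opponent_value.split(maxsplit=3)
    return parts[3] if len(parts) == 4 else ""


def settings_for(uci_opponent_value):
    """Return the defaults, updated with overrides for a known opponent."""
    settings = dict(DEFAULTS)
    name = opponent_name(uci_opponent_value).lower()
    for prefix, overrides in PER_OPPONENT.items():
        # Match on a name prefix so version suffixes do not matter.
        if name.startswith(prefix.lower()):
            settings.update(overrides)
    return settings


print(settings_for("none none computer Houdini 5"))
print(settings_for("none none computer Stockfish 8"))
```

With something like this the organizers never touch an option by hand, so there is nothing for an operator to misconfigure and nothing to argue about afterwards.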