what happened to scorpio NN in TCEC?

Uri Blass
Posts: 8435
Joined: Wed Mar 08, 2006 11:37 pm
Location: Tel-Aviv Israel

what happened to scorpio NN in TCEC?

Post by Uri Blass » Sat Nov 17, 2018 7:54 am

I remember seeing it in the table a few days ago with 0 points.

It is also in the participants list in division 4
http://www.chessdom.com/tcec-season-14- ... y-to-boom/

For some reason I do not see it in the table now
https://tcec.chessdom.com/

I thought that after X games people had decided not to include it, so I wanted to see what happened in its games, but when I click on the schedule I find none of its games.

I do not understand why they simply delete history.
Even if they decided to remove it because of some problem (and I do not know what the problem is), I expected them to keep the history correct.

Note that the last time I looked at the table, I remember seeing that Scorpio had crashed twice but Pirarucu had crashed 3 times, and they did not remove it.

zenpawn
Posts: 280
Joined: Sat Aug 06, 2016 6:31 pm
Location: United States

Re: what happened to scorpio NN in TCEC?

Post by zenpawn » Sat Nov 17, 2018 12:30 pm

Its games are still in the archive http://legacy-tcec.chessdom.com/archive.php
Erin Dame
Author of RookieMonster

Branko Radovanovic
Posts: 53
Joined: Sat Sep 13, 2014 2:12 pm

Re: what happened to scorpio NN in TCEC?

Post by Branko Radovanovic » Sat Nov 17, 2018 4:09 pm

Meanwhile, the entire Div4 has been restarted, without Scorpio. The chat message is:

hi everyone i am sorry to inform you Entrance Division 4 had to be restarted due to too many inconsistencies in book orders, probably as a consequence of the many crashes among other things. this seemed to be the only option that guarantees fairness for all. so we restarted, with the one restriction: scorpio nn has proven to be too unstable and causing cutechess crashes, so it stays out. --kanchess (see !eventpgn for games)

I like TCEC but it can be very confusing and frustrating to follow. Scorpio was disqualified after 4 crashes, although the limit is supposed to be 3, and even that rule seems to be unofficial, as the Rules page says nothing about "3 strikes = DQ". Pirarucu, on the other hand, kept playing with 3 crashes because there was apparently some mixup with the logs so the first crash didn't count, adding further to the confusion. And now this - restarting from scratch, without any explanation except for chat messages. Am I supposed to sit and read everything people say in the chat until an explanation randomly comes along?

Confusion aside, the very idea of removing the engine with three crashes from competition and discarding all of its games - supposedly in the interest of fairness - is absolutely misguided, because crashes are essentially random events, just as normal wins, losses and draws are, and discarding valid games actually hurts fairness instead of improving it.

kasinp
Posts: 188
Joined: Sat Dec 02, 2006 9:47 pm
Location: Toronto

Re: what happened to scorpio NN in TCEC?

Post by kasinp » Sat Nov 17, 2018 4:46 pm

Imagine a Stockfish version with a particularly nasty bug that causes it to crash after capturing a knight on f4. This scenario will occur infrequently, but will cause SF to lose the game on the spot. A tournament in which this happens in, say, 6 games out of 30 will lead to a distorted view of the strength of other engines.

Wins and losses are not random: as the number of games grows, they reflect the relative strength of engines with increasing accuracy. OTOH, the selection of opponents against which an engine happens to crash is random.

PK

Daniel Shawul
Posts: 3661
Joined: Tue Mar 14, 2006 10:34 am
Location: Ethiopia

Re: what happened to scorpio NN in TCEC?

Post by Daniel Shawul » Sat Nov 17, 2018 5:19 pm

From what I gather it is not the fault of Scorpio but cutechess-cli.

Apparently cutechess-cli only waits 10 seconds for an engine to load, but loading Scorpio's neural networks may take up to 30 seconds.
It is not a problem for winboard if an engine takes an hour to initialize, because the winboard protocol says this:
done (integer, no default)
If you set done=1 during the initial two-second timeout after xboard sends you the "xboard" command, the timeout will end and xboard will not look for any more feature commands before starting normal operation. If you set done=0, the initial timeout is increased to one hour; in this case, you must set done=1 before xboard will enter normal operation.
The xboard protocol provides a way to counter this by doing:
"feature done=0" ... then time-taking operation ... "feature done=1"
So the engine can take up to 1 hour initializing its stuff and there shouldn't be a problem.
I implemented that and expect it to work in every GUI, but I guess cutechess-cli just resumes normal operation after waiting only 10 seconds....
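
For illustration, the engine side of that handshake might look roughly like the following. This is a minimal Python sketch and not Scorpio's actual code: `load_network`, `handshake`, and the engine name `SketchEngine` are hypothetical, and the `done=0` / `done=1` syntax follows the protocol text quoted above.

```python
import io
import sys
import time


def load_network(seconds):
    """Hypothetical stand-in for a slow start-up step, e.g. loading NN weights."""
    time.sleep(seconds)


def handshake(out=sys.stdout, load_seconds=0.0):
    """Write the feature lines an engine could send during start-up."""
    # done=0 asks the GUI to extend its initial timeout (one hour per the quoted text).
    out.write("feature done=0\n")
    out.flush()

    # The time-consuming initialization happens while the GUI is still waiting.
    load_network(load_seconds)

    # Any other feature commands go here, before done=1.
    out.write('feature myname="SketchEngine"\n')

    # done=1 tells the GUI that no more features follow and play may start.
    out.write("feature done=1\n")
    out.flush()


# Demo: capture the handshake in memory instead of talking to a real GUI.
buf = io.StringIO()
handshake(buf)
print(buf.getvalue(), end="")
```

The point of the ordering is that nothing slow happens before `feature done=0` has been written and flushed, so a GUI that honors the protocol never sees the short default timeout expire.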

I am fine with Scorpio getting out of the tournament due to its hangs, but it should not be implied that Scorpio is the cause of the tournament being restarted, especially when I did things the right way and their GUI (cutechess-cli) happens not to implement winboard correctly.

Daniel

Branko Radovanovic
Posts: 53
Joined: Sat Sep 13, 2014 2:12 pm

Re: what happened to scorpio NN in TCEC?

Post by Branko Radovanovic » Sat Nov 17, 2018 5:27 pm

kasinp wrote:
Sat Nov 17, 2018 4:46 pm
Imagine a Stockfish version with a particularly nasty bug that causes it to crash after capturing a knight on f4. This scenario will occur infrequently, but will cause SF to lose the game on the spot. A tournament in which this happens in, say 6 games out of 30, will lead to a distorted view of the strength of other engines.
Are you saying that engines that won points from SF's crashes will falsely appear stronger relative to engines that did not happen to win points in the same way? That is true but, as you noted, we don't know a priori which engines will fall into each group.
kasinp wrote:
Sat Nov 17, 2018 4:46 pm
Wins and losses are not random as with the growing number of games they reflect the relative strength of engines with increasing accuracy. OTOH the selection of opponents against which an engine happens to crash is random.
What I meant is that individual wins, losses and draws are random (not flip-of-the-perfect-coin random, of course, but distributed in accordance with relative strength). If "fairness" of a certain ruleset means "maximizing the prior probability that a stronger engine gets promoted over the weaker one", then removing valid game outcomes will inevitably work against fairness. Using your example, removing 24 actual or potential "valid" outcomes as well as 6 "tainted" outcomes will have a net negative effect on measuring the relative strength of competitors. (That is my hypothesis anyway - I believe it could be proven with a Monte Carlo simulation, for example.)
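
A Monte Carlo check of this kind could be set up along the following lines. This is only a sketch under strong assumptions, none of which come from TCEC: a logistic (Elo) result model, a fixed draw rate of 0.5, made-up ratings, a crash that always costs the crashing engine the game, and "fairness" measured as how often the top-rated non-crashing engine finishes first among the non-crashing engines. All function names and parameters are hypothetical.

```python
import random


def expected_score(r_a, r_b):
    """Logistic (Elo) expected score of a player rated r_a against r_b."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def play(r_a, r_b, draw_rate, rng):
    """Return the first player's score (1, 0.5 or 0) for one game."""
    e = expected_score(r_a, r_b)
    d = min(draw_rate, 2 * min(e, 1 - e))  # keep win/loss probabilities non-negative
    u = rng.random()
    if u < e - d / 2:
        return 1.0
    if u < e + d / 2:
        return 0.5
    return 0.0


def promotion_rate(ratings, crasher, crash_prob, keep_crasher,
                   games_per_pair, trials, seed=0):
    """Fraction of trials in which the top-rated non-crashing engine finishes
    first among the non-crashing engines (ties broken at random)."""
    rng = random.Random(seed)
    others = [i for i in range(len(ratings)) if i != crasher]
    best = max(others, key=lambda i: ratings[i])
    hits = 0
    for _trial in range(trials):
        score = [0.0] * len(ratings)
        for i in range(len(ratings)):
            for j in range(i + 1, len(ratings)):
                if not keep_crasher and crasher in (i, j):
                    continue  # policy: all of the crasher's games are discarded
                for _game in range(games_per_pair):
                    if crasher in (i, j) and rng.random() < crash_prob:
                        s = 0.0 if i == crasher else 1.0  # crash = loss on the spot
                    else:
                        s = play(ratings[i], ratings[j], 0.5, rng)
                    score[i] += s
                    score[j] += 1.0 - s
        top = max(score[i] for i in others)
        leaders = [i for i in others if score[i] == top]
        hits += rng.choice(leaders) == best
    return hits / trials


# Hypothetical field: index 4 is the engine that crashes in 20% of its games.
ratings = [3000, 2980, 2960, 2940, 2900]
keep = promotion_rate(ratings, 4, 0.2, True, 2, 2000)
drop = promotion_rate(ratings, 4, 0.2, False, 2, 2000)
print(f"best engine finishes first: keep={keep:.3f} discard={drop:.3f}")
```

Whatever the two numbers turn out to be, they only speak to this toy model; changing the ratings, the crash probability, or the fairness measure could change which policy looks better.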

Daniel Shawul
Posts: 3661
Joined: Tue Mar 14, 2006 10:34 am
Location: Ethiopia

Re: what happened to scorpio NN in TCEC?

Post by Daniel Shawul » Sat Nov 17, 2018 5:41 pm

I am not even sure this is the case at all. It played 80 blitz games without a problem, so if NN loading taking too long were a problem, it would have caused way too many hangs there...

Anyway one would assume cutechess-cli probably implemented the xboard protocol correctly.

Edit:
Indeed cutechess implements things correctly, like I suspected. It waits for a "feature done=1" before initializing.

	else if (name == "done")
	{
		write("accepted done", Unbuffered);
		m_initTimer->stop();
		
		if (val == "1")
			initialize();
		return;
	}
The only explanation for me is that it is not strong enough for Div4, so let's blame it on its hangs and then say it was causing cutechess-cli to hang or whatever...

syzygy
Posts: 4425
Joined: Tue Feb 28, 2012 10:56 pm

Re: what happened to scorpio NN in TCEC?

Post by syzygy » Sat Nov 17, 2018 6:58 pm

Branko Radovanovic wrote:
Sat Nov 17, 2018 4:09 pm
Confusion aside, the very idea of removing the engine with three crashes from competition and discarding all of its games - supposedly in the interest of fairness - is absolutely misguided, because crashes are essentially random events, just as normal wins, losses and draws are, and discarding valid games actually hurts fairness instead of improving it.
Crashes are random events with a probability distribution that is entirely unrelated to that of normal wins, losses and draws. The inclusion of games of a randomly crashing engine clearly hurts fairness. Discarding all the games of such an engine seems quite reasonable. (Of course the decision to disqualify an engine should be made on the basis of predetermined criteria.)

syzygy
Posts: 4425
Joined: Tue Feb 28, 2012 10:56 pm

Re: what happened to scorpio NN in TCEC?

Post by syzygy » Sat Nov 17, 2018 7:11 pm

Branko Radovanovic wrote:
Sat Nov 17, 2018 5:27 pm
If "fairness" of a certain ruleset means "maximizing the prior probability that a stronger engine gets promoted over the weaker one", then removing valid game outcomes will inevitably work against fairness. Using your example, removing 24 actual or potential "valid" outcomes as well as 6 "tainted" outcomes will have the net negative effect in measuring the relative strength of competitors. (That is my hypothesis anyway - I believe it could be proven with a Monte Carlo simulation, for example.)
It seems clear to me that there is no net negative effect. Removing all the games of the randomly crashing engine simply results in a normal tournament with one fewer participant.

You could argue the more participants the better, but that does not apply if the extra participant introduces severe noise.

edit:
OK, thinking about it again I think I now see your point. Your point is probably that the random crashes are equivalent to game-losing blunders, so an engine that crashes is not different from an engine that randomly blunders away the game.

I suppose it is hard to argue against that.
Still, it is intuitively clear to me that such unpredictable engines are not desirable in a tournament. In the long run (as the number of games approaches infinity), the randomly crashing engine clearly will not affect the relative ranking of the remaining engines (or at least not more than the addition of any regular engine to a tournament could). But an unpredictably crashing engine makes the results more volatile, so that more games are needed. Just like you need more games at STC than at LTC.

Suppose engines A and B are equally strong.
If A and B always draw, then any tournament will give accurate results.
If A and B never draw, then it is highly likely that a tournament will suggest that one is stronger than the other, even though they are equally strong.

So the higher the draw ratio, the "better" (not always better for spectators, but still).
Random crashes artificially reduce that draw ratio (and they will not make spectators happy).
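
The draw-ratio argument can be made concrete with a small calculation. The sketch below assumes independent games between two equally strong engines, each game scoring 1, 0.5 or 0 with probabilities (1 - d)/2, d and (1 - d)/2 for a draw rate d; the function name `score_std` is made up for this example.

```python
import math


def score_std(n_games, draw_rate):
    """Standard deviation of A's total score over n_games against an equal B."""
    # With mean 0.5 per game, the per-game variance is
    # (1 - d)/2 * 0.25 + d * 0 + (1 - d)/2 * 0.25 = (1 - d) / 4.
    return math.sqrt(n_games * (1.0 - draw_rate)) / 2.0


for d in (0.0, 0.5, 0.9, 1.0):
    print(f"draw rate {d:.1f}: std of score over 100 games = {score_std(100, d):.2f}")
```

Under these assumptions the spread of the final score shrinks as the draw rate rises, and anything that converts draws into decisive results, such as a crash, pushes it back up.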
Last edited by syzygy on Sat Nov 17, 2018 7:29 pm, edited 3 times in total.

Branko Radovanovic
Posts: 53
Joined: Sat Sep 13, 2014 2:12 pm

Re: what happened to scorpio NN in TCEC?

Post by Branko Radovanovic » Sat Nov 17, 2018 7:19 pm

syzygy wrote:
Sat Nov 17, 2018 6:58 pm
The inclusion of games of a randomly crashing engine clearly hurts fairness.
How can one hurt fairness by including non-crashing games of a crashing engine? On the contrary, it is their removal that hurts fairness. If it's unfair for an engine to receive a point just because its opponent crashed, it is then equally unfair to deduct a fully earned point against such an engine from a game in which it didn't crash. If the choices are discard all or discard none, discarding all only seems fairer.
