Page 1 of 2

CuteChessGUI v1.1.0: 'cancellation stall' abort.

Posted: Tue Feb 04, 2020 5:36 pm
by brinan
In my matches between Stockfish and lc0 (Leela Chess Zero), have suffered several aborted games due to Black's cancellation stall. Cute Chess message has always referred to lc0.

The result of going through logs of both Cute Chess and lc0 is a question as follows:
Is there any reason why Cute Chess would not send to lc0 the command

Code: Select all

>lc0.net.62164(0): ucinewgame
Note there had already been the interchange:

Code: Select all

>lc0.net.62164(0): isready
<lc0.net.62164(0): readyok
Or, Cute Chess does send ucinewgame but does not follow up with position startpos.
In this case, the receipt of ucinewgame is acknowledged in lc0's log. But then abrupt end:

Code: Select all

0118 21:51:29.797194 2552 c:\projects\lc0\src\chess\uciloop.cc:132] >> isready
0118 21:51:29.797319 2552 c:\projects\lc0\src\chess\uciloop.cc:219] << readyok
0118 21:51:44.840614 2552 c:\projects\lc0\src\chess\uciloop.cc:132] >> ucinewgame


============= Log started. =============
0118 21:52:00.761432 1036 c:\projects\lc0\src\main.cc:40] Lc0 started.
Any ideas gratefully received. Restarting individual games manually spoils the flow of weekend stream.

Re: CuteChessGUI v1.1.0: 'cancellation stall' abort.

Posted: Tue Feb 04, 2020 6:00 pm
by crem
brinan wrote: Tue Feb 04, 2020 5:36 pm In my matches between Stockfish and lc0 (Leela Chess Zero), have suffered several aborted games due to Black's cancellation stall. Cute Chess message has always referred to lc0.

The result of going through logs of both Cute Chess and lc0 is a question as follows:
Is there any reason why Cute Chess would not send to lc0 the command

Code: Select all

>lc0.net.62164(0): ucinewgame
Note there had already been the interchange:

Code: Select all

>lc0.net.62164(0): isready
<lc0.net.62164(0): readyok
Or, Cute Chess does send ucinewgame but does not follow up with position startpos.
In this case, the receipt of ucinewgame is acknowledged in lc0's log. But then abrupt end:

Code: Select all

0118 21:51:29.797194 2552 c:\projects\lc0\src\chess\uciloop.cc:132] >> isready
0118 21:51:29.797319 2552 c:\projects\lc0\src\chess\uciloop.cc:219] << readyok
0118 21:51:44.840614 2552 c:\projects\lc0\src\chess\uciloop.cc:132] >> ucinewgame


============= Log started. =============
0118 21:52:00.761432 1036 c:\projects\lc0\src\main.cc:40] Lc0 started.
Any ideas gratefully received. Restarting individual games manually spoils the flow of weekend stream.
Did you have ponder enabled by any chance? (that would increase a likelihood that it's a Lc0 bug).

Re: CuteChessGUI v1.1.0: 'cancellation stall' abort.

Posted: Tue Feb 04, 2020 6:28 pm
by Guenther
brinan wrote: Tue Feb 04, 2020 5:36 pm In my matches between Stockfish and lc0 (Leela Chess Zero), have suffered several aborted games due to Black's cancellation stall. Cute Chess message has always referred to lc0.

The result of going through logs of both Cute Chess and lc0 is a question as follows:
Is there any reason why Cute Chess would not send to lc0 the command

Code: Select all

>lc0.net.62164(0): ucinewgame
Note there had already been the interchange:

Code: Select all

>lc0.net.62164(0): isready
<lc0.net.62164(0): readyok
Or, Cute Chess does send ucinewgame but does not follow up with position startpos.
In this case, the receipt of ucinewgame is acknowledged in lc0's log. But then abrupt end:

Code: Select all

0118 21:51:29.797194 2552 c:\projects\lc0\src\chess\uciloop.cc:132] >> isready
0118 21:51:29.797319 2552 c:\projects\lc0\src\chess\uciloop.cc:219] << readyok
0118 21:51:44.840614 2552 c:\projects\lc0\src\chess\uciloop.cc:132] >> ucinewgame


============= Log started. =============
0118 21:52:00.761432 1036 c:\projects\lc0\src\main.cc:40] Lc0 started.
Any ideas gratefully received. Restarting individual games manually spoils the flow of weekend stream.
This is a problem with LC0 and the default timings in CuteChess.

Your log shows that LC0 would start logging around 30s after 'isready' and
between readyok and ucinewgame there is slightly more than 15s which is enough
for timing it out.

https://github.com/cutechess/cutechess/ ... ne.cpp#L72

You can change the src in lib/chessengine.cpp and recompile CuteChess.

Re: CuteChessGUI v1.1.0: 'cancellation stall' abort.

Posted: Wed Feb 05, 2020 12:23 pm
by brinan
crem
I do have Ponder enabled. Sense you think problem lies with lc0.

Any chance setting a memory limit would solve problem?
Have idea that after 3 games, memory management becomes an issue.
NNCacheSize set to 20,480,000 (6.68GB). System RAM 16GB.
---

Guenther
Thanks for link to ChessEngine code.

See the wait intervals:
Ping 15 seconds, Quit 5, Idle 15, and Start 35.

When the first abort happened, Cute Chess GUI's Wait [between games] setting was 3 seconds.
Restored setting back to default of 5, hoping this would help. It did not.
Finally, tried 15 seconds. No good.

In the light of this new information, could you be more specific as to which interval setting should be changed, and to what?
---

Going through some lc0 logs, it never responds to ucinewgame. A response from lc0 follows: setoption name Ponder, startpos sequence, and finally isready.

If the fault is with lc0, what is it not doing?
If the fault is with Cute Chess, why would it not send isready, or if it has, not send startpos?

Still not clear who's waiting for who.

Very grateful for the responses.

Re: CuteChessGUI v1.1.0: 'cancellation stall' abort.

Posted: Wed Feb 05, 2020 6:00 pm
by Guenther
brinan wrote: Wed Feb 05, 2020 12:23 pm crem
I do have Ponder enabled. Sense you think problem lies with lc0.

Any chance setting a memory limit would solve problem?
Have idea that after 3 games, memory management becomes an issue.
NNCacheSize set to 20,480,000 (6.68GB). System RAM 16GB.
---

Guenther
Thanks for link to ChessEngine code.

See the wait intervals:
Ping 15 seconds, Quit 5, Idle 15, and Start 35.

When the first abort happened, Cute Chess GUI's Wait [between games] setting was 3 seconds.
Restored setting back to default of 5, hoping this would help. It did not.
Finally, tried 15 seconds. No good.

In the light of this new information, could you be more specific as to which interval setting should be changed, and to what?
---
Try ping and idle 30 and start 45 or 60, no need to change the quit value.
(I never tested with ponder on, but with slow gpus (or very big nets) we tested those higher values and they were sufficient)

If ponder in LC0 is the final problem, I hope crem will tell you more...

Re: CuteChessGUI v1.1.0: 'cancellation stall' abort.

Posted: Wed Feb 05, 2020 6:58 pm
by brinan
Guenther
If ponder in LC0 is the final problem, I hope crem will tell you more...

Thanks for intervals.

Chatted to crem on Lc0 Discord.
Link was given to Ponder issue 944 at lco GitHub.
As a test, going to turn off Ponder for both engines in this weekend's match on my stream.

Re: CuteChessGUI v1.1.0: 'cancellation stall' abort.

Posted: Fri Feb 07, 2020 1:37 pm
by MikeB
brinan wrote: Tue Feb 04, 2020 5:36 pm In my matches between Stockfish and lc0 (Leela Chess Zero), have suffered several aborted games due to Black's cancellation stall. Cute Chess message has always referred to lc0.

The result of going through logs of both Cute Chess and lc0 is a question as follows:
Is there any reason why Cute Chess would not send to lc0 the command

Code: Select all

>lc0.net.62164(0): ucinewgame
Note there had already been the interchange:

Code: Select all

>lc0.net.62164(0): isready
<lc0.net.62164(0): readyok
Or, Cute Chess does send ucinewgame but does not follow up with position startpos.
In this case, the receipt of ucinewgame is acknowledged in lc0's log. But then abrupt end:

Code: Select all

0118 21:51:29.797194 2552 c:\projects\lc0\src\chess\uciloop.cc:132] >> isready
0118 21:51:29.797319 2552 c:\projects\lc0\src\chess\uciloop.cc:219] << readyok
0118 21:51:44.840614 2552 c:\projects\lc0\src\chess\uciloop.cc:132] >> ucinewgame


============= Log started. =============
0118 21:52:00.761432 1036 c:\projects\lc0\src\main.cc:40] Lc0 started.
Any ideas gratefully received. Restarting individual games manually spoils the flow of weekend stream.
try increasing the time between the games to 5 seconds under the tournament settings, I'm using version 1.0 - no issues here, but the cutechess default setting for time between games ( ~0.1 second) is problematic for lc0, ponder off of course

Re: CuteChessGUI v1.1.0: 'cancellation stall' abort.

Posted: Fri Feb 07, 2020 10:57 pm
by brinan
MikeB
try increasing the time between the games


Tried that. From 3 seconds, to 5 seconds, to 15 seconds.


the cutechess default setting for time between games ( ~0.1 second) is problematic for lc0

Cute Chess GUI 1.1.0 has 5 seconds as its default Wait value.
Perhaps you are referring to the client version.

Re: CuteChessGUI v1.1.0: 'cancellation stall' abort.

Posted: Fri Feb 07, 2020 11:22 pm
by crem
brinan wrote: Fri Feb 07, 2020 10:57 pm MikeB
try increasing the time between the games


Tried that. From 3 seconds, to 5 seconds, to 15 seconds.


the cutechess default setting for time between games ( ~0.1 second) is problematic for lc0

Cute Chess GUI 1.1.0 has 5 seconds as its default Wait value.
Perhaps you are referring to the client version.
Could you send more complete Lc0 logs, at least a few moves before the hang? I tried to reproduce that but cannot, it would be useful to know what happened before the hang.

Re: CuteChessGUI v1.1.0: 'cancellation stall' abort.

Posted: Sat Feb 08, 2020 12:24 am
by brinan
crem
Could you send more complete Lc0 logs


I'll upload via Discord. Chat with you there.