c-chess-cli

Ras
Posts: 1620
Joined: Tue Aug 30, 2016 6:19 pm
Full name: Rasmus Althoff

Re: c-chess-cli

Post by Ras » Mon Nov 02, 2020 3:22 pm

Two more suggestions:

1) If an engine is unresponsive, c-chess-cli exits without killing the engine processes which keep on lingering as zombies. Child process kill could be added upon exit.

2) For tournaments, it would be useful to have some final output of the tournament table. Right now, this requires loading the PGN into some other software, or not using the tournament mode and running the encounters separately. Optionally giving a result file name and outputting that as CSV table (just a text file) would be nice. Using a semicolon as separator could work around locales that use the comma as decimal point.
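The first suggestion could look roughly like the following on a POSIX system. This is a minimal sketch, not c-chess-cli code: `spawn_sleeper` and `kill_and_reap` are hypothetical names, and the sleeping child only stands in for an engine that never reads its stdin.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for an unresponsive engine: a child that
 * blocks forever and never reads its stdin. Returns the child's PID
 * in the parent, or -1 if fork() failed. */
pid_t spawn_sleeper(void)
{
    pid_t pid = fork();
    if (pid == 0) {
        for (;;)
            pause();            /* child: sleep until a signal arrives */
    }
    return pid;
}

/* Kill one child and wait for it, so nothing is left behind on exit.
 * Returns the terminating signal number, 0 if the child had already
 * exited normally, or -1 on error. */
int kill_and_reap(pid_t pid)
{
    int status;

    assert(pid > 0);            /* kill() with pid <= 0 targets process groups */
    if (kill(pid, SIGKILL) != 0)
        return -1;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

A tournament manager would call something like `kill_and_reap` for every engine PID it still tracks before exiting. SIGKILL cannot be caught or ignored, so this works even if the engine is stuck.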
Rasmus Althoff
https://www.ct800.net

lucasart
Posts: 3168
Joined: Mon May 31, 2010 11:29 am
Full name: lucasart

Re: c-chess-cli

Post by lucasart » Tue Nov 03, 2020 1:01 am

Ras wrote:
Mon Nov 02, 2020 3:22 pm
Two more suggestions:

1) If an engine is unresponsive, c-chess-cli exits without killing the engine processes which keep on lingering as zombies. Child process kill could be added upon exit.

2) For tournaments, it would be useful to have some final output of the tournament table. Right now, this requires loading the PGN into some other software, or not using the tournament mode and running the encounters separately. Optionally giving a result file name and outputting that as CSV table (just a text file) would be nice. Using a semicolon as separator could work around locales that use the comma as decimal point.
1) I was hoping to handle this portably and simply by relying on EOF, which is the idiomatic way with Unix pipes. When the engine gets an EOF reading from stdin, it should exit (or crash if the programmer was not careful, which amounts to the same). And vice versa (child dies, parent can't read from broken pipe, exits).
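On the engine side, the EOF convention amounts to checking the return value of the read call. A minimal sketch, assuming a line-based protocol such as UCI; `command_loop` is a hypothetical name and the command handling is left out:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Read commands line by line until the GUI sends "quit" or the input
 * ends. Returns the number of lines read, including a final "quit". */
int command_loop(FILE *in)
{
    char line[4096];
    int count = 0;

    assert(in != NULL);
    /* fgets() returns NULL on EOF or on a read error. Both mean the
     * other side is gone, so the engine should stop. */
    while (fgets(line, sizeof line, in) != NULL) {
        count++;
        if (strncmp(line, "quit", 4) == 0)
            break;
        /* A real engine would parse and execute the command here. */
    }
    return count;
}
```

An engine's `main` would call `command_loop(stdin)` and exit when it returns. This only helps if the engine actually gets back to reading stdin and if every write end of the pipe is really closed.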

But you're right. A hanging engine could be stuck forever without ever doing a read from stdin. The problem is I need a solution that catches every case, not just this one. For example a Ctrl+C, and who knows what else the operating system could throw at us.

2) Tournament table… Not sure about this. I don't want to use a broken Elo model, and I don't want to reimplement BayesElo either. The only thing I could print is pair stats, just a WLD triplet per pair. But there are N(N-1)/2 pairs in a RR!
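For scale, N(N-1)/2 gives 6 pairs for 4 engines and 45 for 10. One way to store a WLD triplet per pair is a flat array indexed by the pair. This is a sketch only: the `WLD` type and both function names are hypothetical.

```c
#include <assert.h>

/* Win/loss/draw counts for one pairing, seen from the first engine. */
typedef struct { int wins, losses, draws; } WLD;

/* Number of distinct pairings among n engines: n(n-1)/2. */
int pair_count(int n)
{
    return n * (n - 1) / 2;
}

/* Map a pairing (i, j) with 0 <= i < j < n to a slot in a flat array
 * of pair_count(n) entries, ordered (0,1), (0,2), ..., (n-2,n-1). */
int pair_index(int n, int i, int j)
{
    assert(0 <= i && i < j && j < n);
    /* Rows 0..i-1 contribute (n-1) + (n-2) + ... + (n-i) slots. */
    return i * (2 * n - i - 1) / 2 + (j - i - 1);
}
```

After each game, the result would be added to `table[pair_index(n, i, j)]`, with the two engines ordered so that `i < j`.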

I like the CSV solution. Is there a format I can use that other tools use? E.g., doesn't Ordo have such an input format?
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.

lucasart
Posts: 3168
Joined: Mon May 31, 2010 11:29 am
Full name: lucasart

Re: c-chess-cli

Post by lucasart » Tue Nov 03, 2020 5:05 am

Actually, I think 1) is purely academic. When the parent dies, both stdin and stdout in the child process are broken, and will result in termination. If an engine doesn't terminate in such conditions, the engine is at fault.

Do you have a real example where this zombie child scenario happens?
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.

AndrewGrant
Posts: 876
Joined: Tue Apr 19, 2016 4:08 am
Location: U.S.A
Full name: Andrew Grant

Re: c-chess-cli

Post by AndrewGrant » Tue Nov 03, 2020 6:33 am

lucasart wrote:
Tue Nov 03, 2020 5:05 am
Actually, I think 1) is purely academic. When the parent dies, both stdin and stdout in the child process are broken, and will result in termination. If an engine doesn't terminate in such conditions, the engine is at fault.
Agreed fully. One of the things I test for each engine I add to OpenBench is whether or not it respects the closure of stdin.

Ras
Posts: 1620
Joined: Tue Aug 30, 2016 6:19 pm
Full name: Rasmus Althoff

Re: c-chess-cli

Post by Ras » Tue Nov 03, 2020 9:20 am

lucasart wrote:
Tue Nov 03, 2020 1:01 am
I like the CSV solution. Is there a format I can use that other tools use? E.g., doesn't Ordo have such an input format?
Ordo has only PGN as input and uses CSV as output. That would be the other solution if printing tournament stats (just the table without Elo) didn't make sense for c-chess-cli.
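A table without Elo could be as plain as one semicolon-separated row per pairing. The sketch below assumes a hypothetical column layout (first engine; second engine; wins; losses; draws). It is not the format of Ordo or of any other tool, and `write_pair_row` is a made-up name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write one row for a pairing: both names, then wins, losses and draws
 * from the first engine's point of view.
 * Returns 0 on success, -1 on a bad name or a write error. */
int write_pair_row(FILE *out, const char *first, const char *second,
                   int wins, int losses, int draws)
{
    assert(out != NULL);
    /* A name containing the separator would corrupt the row. */
    if (strchr(first, ';') != NULL || strchr(second, ';') != NULL)
        return -1;
    /* Semicolons keep the file unambiguous in locales that use the
     * comma as decimal point. */
    if (fprintf(out, "%s;%s;%d;%d;%d\n",
                first, second, wins, losses, draws) < 0)
        return -1;
    return 0;
}
```

Calling `write_pair_row(out, "Raven", "Zevra", 3, 1, 2)` writes the line `Raven;Zevra;3;1;2`.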

lucasart wrote:
Tue Nov 03, 2020 5:05 am
Do you have a real example where this zombie child scenario happens?
Yes, that's how I spotted it. Take Raven 1.1, modified in chess.c line 62 to check the return value of fgets() and exit if it's 0, and match that against Zevra 2.1.2 with 8 threads (I have a 4C/8T CPU). I can't reproduce that with fewer workers; Zevra doesn't become unresponsive then.

Zevra is unresponsive after about 10-20 games, and while the Zevra processes are killed (most of the time, but not always), the Raven ones linger. The lingering engines are sleeping in waiting channel "pipe_wait" as per my system monitor, with FD 0 and 1 indicated as open files of pipe sort.

Using my engine instead of Raven has a similar effect, but I'm using read() directly on stdin (with error checking). The hanging processes of my engine are sleeping in waiting channel futex_wait_queue_me though.

It looks like the pipes aren't closed. I'm on kernel 5.4.0, but have also tried 5.8.0 - same results.
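The same check can be done without a system monitor by reading the symlinks under /proc/<pid>/fd, which on Linux point to targets such as `pipe:[12345]` for pipes. A minimal, Linux-only sketch; `count_pipe_fds` is a hypothetical helper:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Count the open file descriptors of process `pid` that refer to
 * pipes, by reading the symlinks in /proc/<pid>/fd (Linux only).
 * Returns -1 if the directory cannot be opened. */
int count_pipe_fds(pid_t pid)
{
    char dirpath[64], linkpath[384], target[256];
    struct dirent *entry;
    DIR *dir;
    int pipes = 0;

    assert(pid > 0);
    snprintf(dirpath, sizeof dirpath, "/proc/%ld/fd", (long)pid);
    dir = opendir(dirpath);
    if (dir == NULL)
        return -1;
    while ((entry = readdir(dir)) != NULL) {
        ssize_t len;

        if (entry->d_name[0] == '.')
            continue;                       /* skip "." and ".." */
        snprintf(linkpath, sizeof linkpath, "%s/%s", dirpath, entry->d_name);
        len = readlink(linkpath, target, sizeof target - 1);
        if (len < 0)
            continue;                       /* descriptor vanished meanwhile */
        target[len] = '\0';                 /* readlink() does not terminate */
        if (strncmp(target, "pipe:", 5) == 0)
            pipes++;
    }
    closedir(dir);
    return pipes;
}
```

Run against the PID of a lingering engine, a result above 2 would mean the process holds pipes other than its own stdin and stdout.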

However, matching Raven and my engine works. Matching Demolito against Zevra works without Zevra becoming unresponsive. Pretty strange.

Killing c-chess-cli with CTRL-C before Zevra is unresponsive makes all processes exit as expected.

AndrewGrant wrote:
Tue Nov 03, 2020 6:33 am
One of the things I test for each engine I add to OpenBench is whether or not it respects the closure of stdin.
What's your test case for that (Linux)?
Rasmus Althoff
https://www.ct800.net

Ras
Posts: 1620
Joined: Tue Aug 30, 2016 6:19 pm
Full name: Rasmus Althoff

Re: c-chess-cli

Post by Ras » Tue Nov 03, 2020 12:21 pm

Could the issue be a race condition between pipe() and fork()?

https://stackoverflow.com/questions/380 ... rom-thread
If a thread switch occurs and fork is called between the pipe and fork system calls, the pipe file descriptors are duplicated, causing the write/read ends to be open multiple times.
Even when matching Raven against my engine, where things work with 8 game threads in parallel, there are, for example, 31 file descriptors open in each engine process, 28 of them pipes. There shouldn't be that many pipes open: FD 0 and 1 as pipes (stdin/stdout), three FDs as files (/dev/pts/0, the EPD and the PGN file; yes, open in the engine processes), and then 26 FDs, 5 to 30, as pipes. Some engine processes have fewer pipes open, only up to FD 8, others more, up to FD 36.
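One common mitigation for inherited descriptors, not necessarily what c-chess-cli does, is to mark every pipe end close-on-exec so that exec() drops all of them except the ones explicitly duplicated onto stdin/stdout. A minimal POSIX sketch with hypothetical helper names:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Mark one descriptor close-on-exec. Returns 0 on success, -1 on error. */
static int set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) == -1 ? -1 : 0;
}

/* Create a pipe whose two ends are both close-on-exec. A later exec()
 * in any child closes them, unless the child first copied them onto
 * stdin/stdout with dup2(), which clears the flag on the copy.
 * Returns 0 on success, -1 on error. */
int make_cloexec_pipe(int fds[2])
{
    assert(fds != NULL);
    if (pipe(fds) != 0)
        return -1;
    if (set_cloexec(fds[0]) != 0 || set_cloexec(fds[1]) != 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    return 0;
}
```

In a multithreaded program there is still a window between pipe() and fcntl() in which another thread can fork, so these calls would have to be serialized with fork(), for instance under a mutex. On Linux, pipe2() with O_CLOEXEC sets the flag atomically and avoids that window.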
Rasmus Althoff
https://www.ct800.net

AndrewGrant
Posts: 876
Joined: Tue Apr 19, 2016 4:08 am
Location: U.S.A
Full name: Andrew Grant

Re: c-chess-cli

Post by AndrewGrant » Tue Nov 03, 2020 5:24 pm

Ras wrote:
Tue Nov 03, 2020 9:20 am
AndrewGrant wrote:
Tue Nov 03, 2020 6:33 am
One of the things I test for each engine I add to OpenBench is whether or not it respects the closure of stdin.
What's your test case for that (Linux)?
Well, in practice it's

1) pkilling cutechess from the command line while it's running OpenBench; and
2) Stopping a test on the OpenBench webpage, and seeing how a worker running said test reacts.

Nothing fancy.

Ras
Posts: 1620
Joined: Tue Aug 30, 2016 6:19 pm
Full name: Rasmus Althoff

Re: c-chess-cli

Post by Ras » Tue Nov 03, 2020 7:18 pm

AndrewGrant wrote:
Tue Nov 03, 2020 5:24 pm
1) pkilling cutechess from the command line while it's running OpenBench
If I do that with c-chess-cli, my engine exits correctly if there is only one concurrent game going on. Otherwise, the duplicate pipes are still open in the spawned engine processes so that reading stdin just waits forever.
Rasmus Althoff
https://www.ct800.net

AndrewGrant
Posts: 876
Joined: Tue Apr 19, 2016 4:08 am
Location: U.S.A
Full name: Andrew Grant

Re: c-chess-cli

Post by AndrewGrant » Tue Nov 03, 2020 7:36 pm

Ras wrote:
Tue Nov 03, 2020 7:18 pm
AndrewGrant wrote:
Tue Nov 03, 2020 5:24 pm
1) pkilling cutechess from the command line while it's running OpenBench
If I do that with c-chess-cli, my engine exits correctly if there is only one concurrent game going on. Otherwise, the duplicate pipes are still open in the spawned engine processes so that reading stdin just waits forever.
I'm not sure why having multiple concurrent games changes things?

Ras
Posts: 1620
Joined: Tue Aug 30, 2016 6:19 pm
Full name: Rasmus Althoff

Re: c-chess-cli

Post by Ras » Tue Nov 03, 2020 7:40 pm

AndrewGrant wrote:
Tue Nov 03, 2020 7:36 pm
I'm not sure why having multiple concurrent games changes things?
Because reading stdin only gives EOF or an error if the pipe is closed. That, however, requires it to be closed on all other ends because it's reference counted. Since the pipes are duplicated also in the spawned engine processes, killing c-chess-cli doesn't close all other pipe ends, so reading stdin just gives a blocking call with no input. Basically, the zombies keep each other alive. With only one game, i.e. two engines, the second engine dupes the first one's stdin/out, but has no one duping its stdin/out. So the second engine exits, and then also the first one. At least, that's what I think happens.
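The reference counting can be seen in a single process. In this sketch a dup() of the write end stands in for the copy inherited by another engine, and a non-blocking read is used so the demonstration cannot hang. All names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Outcome of one non-blocking read attempt. */
enum probe { PROBE_EOF, PROBE_WOULD_BLOCK, PROBE_DATA, PROBE_ERROR };

/* Try to read one byte from fd without blocking. Note that this
 * leaves O_NONBLOCK set on the descriptor. */
enum probe probe_read(int fd)
{
    char c;
    ssize_t n;
    int flags;

    assert(fd >= 0);
    flags = fcntl(fd, F_GETFL);
    if (flags == -1 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
        return PROBE_ERROR;
    n = read(fd, &c, 1);
    if (n == 0)
        return PROBE_EOF;           /* every write end has been closed */
    if (n > 0)
        return PROBE_DATA;
    return (errno == EAGAIN || errno == EWOULDBLOCK)
               ? PROBE_WOULD_BLOCK  /* a blocking read would hang here */
               : PROBE_ERROR;
}

/* Returns 1 if the pipe behaves as described: no EOF while a stray
 * copy of the write end is open, EOF once the last copy is closed. */
int stray_writer_blocks_eof(void)
{
    int fds[2], stray;
    enum probe with_stray, without_stray;

    if (pipe(fds) != 0)
        return 0;
    stray = dup(fds[1]);            /* copy held by "another engine" */
    if (stray < 0) {
        close(fds[0]);
        close(fds[1]);
        return 0;
    }
    close(fds[1]);                  /* the "parent" goes away */
    with_stray = probe_read(fds[0]);
    close(stray);                   /* the last writer goes away */
    without_stray = probe_read(fds[0]);
    close(fds[0]);
    return with_stray == PROBE_WOULD_BLOCK && without_stray == PROBE_EOF;
}
```

With a blocking descriptor, as an engine normally uses for stdin, the first probe corresponds to the read that waits forever.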
Rasmus Althoff
https://www.ct800.net
