c-chess-cli


Ras
Posts: 2487
Joined: Tue Aug 30, 2016 8:19 pm
Full name: Rasmus Althoff

Re: c-chess-cli

Post by Ras »

Two more suggestions:

1) If an engine is unresponsive, c-chess-cli exits without killing the engine processes, which then linger as zombies. Killing the child processes on exit could be added.

2) For tournaments, it would be useful to have some final output of the tournament table. Right now, this requires loading the PGN into some other software, or not using the tournament mode and running the encounters separately. Optionally giving a result file name and writing the table to it as CSV (just a text file) would be nice. Using a semicolon as separator would work around locales that use the comma as decimal point.
Rasmus Althoff
https://www.ct800.net
lucasart
Posts: 3232
Joined: Mon May 31, 2010 1:29 pm
Full name: lucasart

Post by lucasart »

Ras wrote: Mon Nov 02, 2020 4:22 pm Two more suggestions:

1) If an engine is unresponsive, c-chess-cli exits without killing the engine processes, which then linger as zombies. Killing the child processes on exit could be added.

2) For tournaments, it would be useful to have some final output of the tournament table. Right now, this requires loading the PGN into some other software, or not using the tournament mode and running the encounters separately. Optionally giving a result file name and writing the table to it as CSV (just a text file) would be nice. Using a semicolon as separator would work around locales that use the comma as decimal point.
1) I was hoping to handle this portably and simply by relying on EOF, which is the idiomatic way with Unix pipes. When the engine gets an EOF reading from stdin, it should exit (or crash if the programmer was not careful, which amounts to the same). And vice versa (child dies, parent can't read from the broken pipe, exits).

But you're right. A hanging engine could be stuck forever without ever doing a read from stdin. The problem is I need a solution that catches every case, not just this one. For example a Ctrl+C, and who knows what else the operating system could throw at us.

2) Tournament table… Not sure about this. I don't want to use a broken Elo model, and I don't want to reimplement BayesElo either. The only thing I could print is pair stats, just a WLD triplet per pair. But there are N(N-1)/2 pairs in a round robin!

I like the CSV solution. Is there a format I can use that other tools use? E.g., doesn't Ordo have such an input format?
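
To make the pair-stats idea concrete, here is a rough Python sketch that counts one WLD triplet per pair and writes it as semicolon-separated CSV, with no Elo model involved. The column names, the `pair_table` and `write_csv` helpers and the sample games are all invented for illustration; this is not an existing c-chess-cli or Ordo format.

```python
import csv
import io
from itertools import combinations

def pair_table(engines, results):
    """Return (engine_a, engine_b, wins, losses, draws) rows, one per pair.

    `results` holds (white, black, outcome) tuples, outcome being one of
    "1-0", "0-1" or "1/2-1/2". WLD is counted from engine_a's point of view.
    """
    # N engines give N(N-1)/2 unordered pairs.
    stats = {pair: [0, 0, 0] for pair in combinations(engines, 2)}
    for white, black, outcome in results:
        pair = (white, black) if (white, black) in stats else (black, white)
        if outcome == "1/2-1/2":
            stats[pair][2] += 1
        else:
            winner = white if outcome == "1-0" else black
            stats[pair][0 if winner == pair[0] else 1] += 1
    return [(a, b, *wld) for (a, b), wld in stats.items()]

def write_csv(rows, out):
    """Write the table with ';' as separator, as suggested for comma locales."""
    writer = csv.writer(out, delimiter=";", lineterminator="\n")
    writer.writerow(["engine_a", "engine_b", "wins", "losses", "draws"])
    writer.writerows(rows)

engines = ["A", "B", "C"]
games = [("A", "B", "1-0"), ("B", "A", "1-0"),
         ("A", "C", "1/2-1/2"), ("C", "B", "0-1")]
rows = pair_table(engines, games)
buf = io.StringIO()
write_csv(rows, buf)
print(buf.getvalue())
```

For the three engines above this gives three data rows, the first being `A;B;1;1;0`.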
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.

Post by lucasart »

Actually, I think 1) is purely academic. When the parent dies, both stdin and stdout in the child process are broken, which will result in termination. If an engine doesn't terminate in such conditions, the engine is at fault.
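
As an illustration of what "terminating on a broken stdin" means on the engine side, here is a small Python sketch of a command loop that stops on EOF. The function name `uci_loop` and the `handle` callback are hypothetical, and an in-memory stream stands in for the pipe from the GUI.

```python
import io

def uci_loop(stream, handle):
    """Read commands until 'quit' or EOF and return why the loop ended."""
    while True:
        line = stream.readline()
        if line == "":            # EOF: the other side closed the pipe or died
            return "eof"
        cmd = line.strip()
        if cmd == "quit":
            return "quit"
        if cmd:
            handle(cmd)

# The input ends without a "quit" command, as if the parent had gone away.
seen = []
reason = uci_loop(io.StringIO("uci\nisready\n"), seen.append)
print(reason, seen)               # prints: eof ['uci', 'isready']
```

A real engine would call this with `sys.stdin` and exit once it returns. Note that this only helps if the read actually reports EOF, which requires every copy of the pipe's write end to be closed.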

Do you have a real example where this zombie child scenario happens?
AndrewGrant
Posts: 1754
Joined: Tue Apr 19, 2016 6:08 am
Location: U.S.A
Full name: Andrew Grant

Post by AndrewGrant »

lucasart wrote: Tue Nov 03, 2020 6:05 am Actually, I think 1) is purely academic. When the parent dies, both stdin and stdout in the child process are broken, which will result in termination. If an engine doesn't terminate in such conditions, the engine is at fault.
Agreed fully. One of the things I test for each engine I add to OpenBench is whether or not it respects the closure of stdin.
#WeAreAllDraude #JusticeForDraude #RememberDraude #LeptirBigUltra
"Those who can't do, clone instead" - Eduard ( A real life friend, not this forum's Eduard )

Post by Ras »

lucasart wrote: Tue Nov 03, 2020 2:01 am I like the CSV solution. Is there a format I can use that other tools use? E.g., doesn't Ordo have such an input format?
Ordo has only PGN as input and uses CSV as output. That would be the other solution if printing tournament stats (just the table without Elo) didn't make sense for c-chess-cli.

lucasart wrote: Tue Nov 03, 2020 6:05 am Do you have a real example where this zombie child scenario happens?
Yes, that's how I spotted it. Let's take Raven 1.1, modified in chess.c line 62 to check the return value of fgets() and exit if it's NULL, and match that against Zevra 2.1.2 with 8 threads (I have a 4C/8T CPU). I can't reproduce it with fewer workers; then Zevra doesn't become unresponsive.

Zevra becomes unresponsive after about 10-20 games, and while the Zevra processes are killed (most of the time, but not always), the Raven ones linger. The lingering engines are sleeping in wait channel "pipe_wait" as per my system monitor, with FDs 0 and 1 listed as open files of type pipe.

Using my engine instead of Raven has a similar effect, but I'm using read() directly on stdin (with error checking). The hanging processes of my engine are sleeping in waiting channel futex_wait_queue_me though.

It looks like the pipes aren't closed. I'm on kernel 5.4.0, but have also tried 5.8.0 - same results.

However, matching Raven and my engine works. Matching Demolito against Zevra works without Zevra becoming unresponsive. Pretty strange.

Killing c-chess-cli with CTRL-C before Zevra is unresponsive makes all processes exit as expected.

AndrewGrant wrote: Tue Nov 03, 2020 7:33 am One of the things I test for each engine I add to OpenBench is whether or not it respects the closure of stdin.
What's your test case for that (Linux)?

Post by Ras »

Could the issue be a race condition between pipe() and fork()?

https://stackoverflow.com/questions/380 ... rom-thread
If a thread switch occurs and fork is called between the pipe and fork system calls, the pipe file descriptors are duplicated, causing the write/read ends to be open multiple times.
Even when matching Raven against my engine, where things work with 8 game threads in parallel, there are e.g. 31 file descriptors open in each engine process, 28 of them pipes. There shouldn't be that many pipes open: FDs 0 and 1 are pipes (stdin/stdout), 3 FDs are files (/dev/pts/0, the EPD and the PGN file; yes, open in the engine processes), and then the 26 FDs 5 to 30 are pipes. Some engine processes have fewer pipes open, only up to FD 8, others more, up to FD 36.
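
For anyone who wants to repeat this kind of count from inside a process rather than with a system monitor, a rough Python sketch follows. It probes descriptor numbers with `os.fstat()` and keeps the ones that are pipes. The helper name `open_pipe_fds` and the upper bound of 256 are arbitrary choices for illustration; on Linux, listing /proc/<pid>/fd shows the same information for another process.

```python
import os
import stat

def open_pipe_fds(max_fd=256):
    """Return the descriptors of this process that refer to pipes (FIFOs)."""
    pipes = []
    for fd in range(max_fd):
        try:
            mode = os.fstat(fd).st_mode
        except OSError:           # EBADF: this descriptor number is not open
            continue
        if stat.S_ISFIFO(mode):
            pipes.append(fd)
    return pipes

# Creating one pipe adds exactly two pipe descriptors: read end and write end.
before = set(open_pipe_fds())
r, w = os.pipe()
after = set(open_pipe_fds())
print(sorted(after - before))
os.close(r)
os.close(w)
```

An engine that was started cleanly would be expected to report only its stdin/stdout pipes here; anything beyond that points to descriptors inherited from the parent.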

Post by AndrewGrant »

Ras wrote: Tue Nov 03, 2020 10:20 am
AndrewGrant wrote: Tue Nov 03, 2020 7:33 am One of the things I test for each engine I add to OpenBench is whether or not it respects the closure of stdin.
What's your test case for that (Linux)?
Well, in practice it's

1) pkilling cutechess from the command line while it's running OpenBench; and
2) Stopping a test on the OpenBench webpage, and seeing how a worker running said test reacts.

Nothing fancy.

Post by Ras »

AndrewGrant wrote: Tue Nov 03, 2020 6:24 pm 1) pkilling cutechess from the command line while it's running OpenBench
If I do that with c-chess-cli, my engine exits correctly if there is only one concurrent game going on. Otherwise, the duplicate pipes are still open in the spawned engine processes so that reading stdin just waits forever.

Post by AndrewGrant »

Ras wrote: Tue Nov 03, 2020 8:18 pm
AndrewGrant wrote: Tue Nov 03, 2020 6:24 pm 1) pkilling cutechess from the command line while it's running OpenBench
If I do that with c-chess-cli, my engine exits correctly if there is only one concurrent game going on. Otherwise, the duplicate pipes are still open in the spawned engine processes so that reading stdin just waits forever.
I'm not sure why having multiple concurrent games changes things?

Post by Ras »

AndrewGrant wrote: Tue Nov 03, 2020 8:36 pm I'm not sure why having multiple concurrent games changes things?
Because reading stdin only gives an error if the pipe is closed. That, however, requires all other ends to be closed, because the pipe is reference counted. Since the pipes are also duplicated in the spawned engine processes, killing c-chess-cli doesn't close all other pipe ends, so reading stdin just blocks with no input. Basically, the zombies keep each other alive. With only one game, i.e. two engines, the second engine dupes the first one's stdin/stdout, but has no one duping its own stdin/stdout. So the second engine exits, and then the first one as well. At least, that's what I think happens.
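
The reference counting can be shown in a few lines of Python on a POSIX system. In this sketch `os.dup()` merely stands in for a copy of the write end that leaked into another engine process, and `would_not_block` is a made-up helper; it illustrates the pipe semantics only and does not reproduce the c-chess-cli setup.

```python
import os
import select

def would_not_block(fd):
    """True if a read on fd would return at once (data available or EOF)."""
    ready, _, _ = select.select([fd], [], [], 0.1)
    return bool(ready)

r, w = os.pipe()
w_dup = os.dup(w)                 # stand-in for a write end leaked elsewhere

os.close(w)                       # the original writer goes away
blocked = not would_not_block(r)  # leaked copy keeps the pipe open: no EOF yet

os.close(w_dup)                   # now the last write end is closed
eof = os.read(r, 1)               # returns b"" immediately: end of file
os.close(r)
print(blocked, eof)               # prints: True b''
```

As long as any copy of the write end exists, the reader sees neither data nor EOF and simply blocks, which matches the lingering engines described above.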