Strings of timelosses under cutechess-cli

AndrewGrant
Posts: 1960
Joined: Tue Apr 19, 2016 6:08 am
Location: U.S.A
Full name: Andrew Grant

Strings of timelosses under cutechess-cli

Post by AndrewGrant »

I have the following output from a cutechess session https://pastebin.com/raw/JpHAX8tJ

Interestingly, the first 7951 games had no problems, and neither did the next ~6000.

My #1 worry is that this is the fault of Ethereal, which I find highly suspect. Outside input would be appreciated.

Assuming it is not Ethereal, the issue must either be with Cutechess or with my actual machine.

To further confuse things, I have played 2,008,484 games on my testing framework without having a time loss. Those are played in batches of 250.

Any thoughts? Similar experience?

Thanks

EDIT: Both blocks of time losses are 30 games each. I was playing with concurrency 30, meaning 60 engines running at once. This suggests that at some point every single engine hung. Cutechess then restarted the first 30. Then the second set of 30 crashed, were restarted, and everything went on smoothly.
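For anyone wanting to check their own logs for the same pattern, here is a minimal sketch that pulls the time-loss games out of a cutechess-cli log and groups them into blocks. The line shape in the regex is an assumption based on typical cutechess-cli output ("Finished game N (A vs B): 0-1 {White loses on time}"), and the function names and the `max_gap` parameter are made up for illustration. Since games finish out of order under concurrency, game numbers within a block are allowed to differ by up to `max_gap`.

```python
import re

# Assumed line shape (typical cutechess-cli output, not verified against the pastebin):
#   Finished game 7952 (Ethereal vs Ethereal): 0-1 {White loses on time}
FINISHED = re.compile(r"^Finished game (\d+) \(.*\): \S+ \{(.*)\}")


def time_loss_games(lines):
    """Return the sorted game numbers whose result comment mentions a time loss."""
    games = []
    for line in lines:
        m = FINISHED.match(line)
        if m and "loses on time" in m.group(2):
            games.append(int(m.group(1)))
    return sorted(games)


def cluster(games, max_gap=30):
    """Group sorted game numbers into runs whose neighbours differ by at most max_gap."""
    runs = []
    for g in games:
        if runs and g - runs[-1][-1] <= max_gap:
            runs[-1].append(g)   # close enough to the previous loss: same block
        else:
            runs.append([g])     # start a new block
    return runs
```

Something like `cluster(time_loss_games(open("cutechess.log")))` (file name hypothetical) would then show whether the losses come in blocks roughly the size of the concurrency setting.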
jdart
Posts: 4410
Joined: Fri Mar 10, 2006 5:23 am
Location: http://www.arasanchess.org

Re: Strings of timelosses under cutechess-cli

Post by jdart »

I don't know. I don't use concurrency myself, I just start multiple cutechess-cli instances.

I see zero time losses with the engines I use routinely, except Nemorino 4, which regularly loses about 5-6 games out of an 8000-game match (very fast time control).

It is essential to use a high-resolution timer to avoid losses in fast games, but most engines have that now. If they are losing on time I suspect that is faulty time management logic, but it is hard to be sure, especially with very rare time losses.

(Note I use Linux for testing - I have almost no experience with cutechess-cli on other platforms).

--Jon
AndrewGrant
Posts: 1960
Joined: Tue Apr 19, 2016 6:08 am
Location: U.S.A
Full name: Andrew Grant

Re: Strings of timelosses under cutechess-cli

Post by AndrewGrant »

These were 60+.6s games with 250ms move overhead, and I have 30 million games played at 10+.1s without a single time loss....
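To put numbers on how much slack a 250ms overhead leaves at 60+.6s, here is a rough sketch. The budget formula and the `moves_to_go=30` default are purely illustrative assumptions, not Ethereal's actual time management. The elapsed time is measured on the monotonic clock, which is the kind of timer an engine should use at fast time controls.

```python
import time


def move_budget_ms(remaining_ms, increment_ms, overhead_ms, moves_to_go=30):
    """Hypothetical budget: a slice of the clock plus the increment, minus the overhead."""
    usable = max(remaining_ms - overhead_ms, 0)   # never plan to use the overhead reserve
    target = remaining_ms / moves_to_go + increment_ms - overhead_ms
    return int(max(0, min(target, usable)))


def elapsed_ms(start_ns):
    """Milliseconds since start_ns, measured on the monotonic clock."""
    return (time.monotonic_ns() - start_ns) / 1e6


start = time.monotonic_ns()
budget = move_budget_ms(60_000, 600, 250)   # first move of a 60+0.6s game -> 2350
# A search loop would stop once elapsed_ms(start) >= budget.
```

Under these assumptions even the last move with 300ms left on the clock gets a budget of 50ms, so an engine that loses on time here was most likely stalled for far longer than the overhead allows for.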

The main point here is that all the time losses occurred on top of each other...
Ras
Posts: 2703
Joined: Tue Aug 30, 2016 8:19 pm
Full name: Rasmus Althoff

Re: Strings of timelosses under cutechess-cli

Post by Ras »

The interesting part is that the engine failed to respond to ping. Looks like either the engine is hanging or the I/O has gone south.
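The ping can be reproduced outside cutechess to see whether an engine process is still answering. A minimal sketch, assuming a UCI engine where the ping is the usual "isready" / "readyok" exchange. The `Engine` class and its method names are hypothetical helpers, not part of cutechess or any library.

```python
import queue
import subprocess
import threading
import time


class Engine:
    """Hypothetical wrapper around a UCI engine process, with a background stdout reader."""

    def __init__(self, cmd):
        self.proc = subprocess.Popen(
            cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True, bufsize=1
        )
        self.lines = queue.Queue()
        threading.Thread(target=self._pump, daemon=True).start()

    def _pump(self):
        # Forward every line the engine prints so ping() can wait with a timeout.
        for line in self.proc.stdout:
            self.lines.put(line.strip())

    def ping(self, timeout_s=5.0):
        """Send 'isready' and return True only if 'readyok' arrives within timeout_s."""
        try:
            self.proc.stdin.write("isready\n")
            self.proc.stdin.flush()
        except (BrokenPipeError, OSError):
            return False   # the I/O is gone: the engine has exited or the pipe is closed
        deadline = time.monotonic() + timeout_s
        while True:
            left = deadline - time.monotonic()
            if left <= 0:
                return False
            try:
                if self.lines.get(timeout=left) == "readyok":
                    return True
            except queue.Empty:
                return False   # no answer in time: the engine is hanging or starved

    def close(self):
        self.proc.kill()
        self.proc.wait()
```

A False from `ping()` alone does not say which of the two cases it is. If the process is still alive and the pipe is intact, a hang or a starved CPU is the more likely explanation.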
Rasmus Althoff
https://www.ct800.net
AndrewGrant
Posts: 1960
Joined: Tue Apr 19, 2016 6:08 am
Location: U.S.A
Full name: Andrew Grant

Re: Strings of timelosses under cutechess-cli

Post by AndrewGrant »

It was suggested that the machine overheats and that, due to a kernel bug, the CPU idles the threads instead of downclocking them. This would cause engines to lose on time despite never crashing.

It seems like a reasonable idea, so I've upgraded the kernel and opened up the chassis to increase airflow.

The problem only happened once over the course of 12 hours, so testing is hard.
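Since the problem is so rare, one way to catch it is to log throttling events alongside the test run. A minimal sketch, assuming Linux on an Intel CPU where the kernel exposes per-core counters at `/sys/devices/system/cpu/cpuN/thermal_throttle/core_throttle_count`. Other hardware or kernels may not provide these files, in which case the function simply returns an empty dict. The function names are made up for illustration.

```python
from pathlib import Path


def read_throttle_counts(sysfs_cpu_root="/sys/devices/system/cpu"):
    """Return {cpu_name: throttle_count} for every core that exposes the counter."""
    counts = {}
    pattern = "cpu[0-9]*/thermal_throttle/core_throttle_count"
    for path in sorted(Path(sysfs_cpu_root).glob(pattern)):
        cpu = path.parent.parent.name          # e.g. "cpu0"
        try:
            counts[cpu] = int(path.read_text().strip())
        except (OSError, ValueError):
            continue                           # unreadable or malformed: skip this core
    return counts


def throttled_since(before, after):
    """Return {cpu_name: new_events} for cores whose counter increased between snapshots."""
    return {
        cpu: after[cpu] - before[cpu]
        for cpu in after
        if cpu in before and after[cpu] > before[cpu]
    }
```

Taking a snapshot every minute or so and printing `throttled_since(previous, current)` with a timestamp would show whether a block of time losses lines up with a throttling event.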