Scorpio question

bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Scorpio question

Post by bob »

Daniel:

Was looking to add some new opponents for cluster testing, and ran into something that seems bizarre when I tried to add Scorpio (2.7.7 I think).

I have a cluster with 12 cores per node. If I run a test where I play 12 independent games on one node, with one of those games played by Scorpio, all is well. But if I play more than one Scorpio game at the same time, things go wrong and the program occasionally plays a strange, illegal move. I am not using MPI for any of this, so it should not get wrapped up with your cluster search stuff, at least as far as I can see.

Is there any possibility of "communication" here, i.e. a file shared by multiple instances of Scorpio? (No book.dat file is present, and I supposedly have logging disabled.)

Any ideas?
AlvaroBegue
Posts: 932
Joined: Tue Mar 09, 2010 3:46 pm
Location: New York
Full name: Álvaro Begué (RuyDos)

Re: Scorpio question

Post by AlvaroBegue »

You can run scorpio under strace, to see what files it's opening:

> strace -e trace=file ./scorpio
Daniel Shawul
Posts: 4186
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: Scorpio question

Post by Daniel Shawul »

I think you have it covered: no log files, no books and no egbbs. Cluster code is disabled by default, so there should not be any communication other than via files.
Also, your script should not pass 'log on' to Scorpio, because command line settings override Scorpio's ini file.
I have played many gauntlets of Scorpio on nodes with 32 processors, so I can confirm that it works except for the occasional hangs.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Scorpio question

Post by bob »

Daniel Shawul wrote:I think you have it covered: no log files, no books and no egbbs. Cluster code is disabled by default, so there should not be any communication other than via files.
Also, your script should not pass 'log on' to Scorpio, because command line settings override Scorpio's ini file.
I have played many gauntlets of Scorpio on nodes with 32 processors, so I can confirm that it works except for the occasional hangs.
I played some 12-core (SMP) games last night, one game per node, with Crafty and Scorpio both using 12 cores, ponder=on, on different nodes. I tried very fast games, and two games at 40 moves in 2 hours. Zero problems. But when I run two games on the same node, something goes wrong. I added Scorpio to my gauntlet yesterday, where I had 6 programs each playing two games against Crafty simultaneously. Only the Scorpio/Crafty games go south.

The only change I had to make to my referee program was to send coordinate moves (e2e4) rather than SAN, which was trivial enough; all the other programs accepted SAN directly. I'm going to continue debugging...
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Scorpio question

Post by bob »

AlvaroBegue wrote:You can run scorpio under strace, to see what files it's opening:

> strace -e trace=file ./scorpio
Not on lightweight kernels...
AlvaroBegue
Posts: 932
Joined: Tue Mar 09, 2010 3:46 pm
Location: New York
Full name: Álvaro Begué (RuyDos)

Re: Scorpio question

Post by AlvaroBegue »

bob wrote:
AlvaroBegue wrote:You can run scorpio under strace, to see what files it's opening:

> strace -e trace=file ./scorpio
Not on lightweight kernels...
I have no experience with lightweight kernels, but presumably you can run this test on a non-lightweight kernel to see what files are being opened. The engine should open the same files regardless of the type of kernel being used.

Anyway, it was just an idea.
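
Where strace is not available, a rough substitute on Linux is to look at which files a running process holds open by reading /proc/<pid>/fd. This is a different technique from strace, not an equivalent: it is only a snapshot, so a file that is opened and closed quickly will be missed. The sketch below assumes /proc is mounted (which may not be the case on a lightweight kernel); the function names open_files and shared_files are made up for illustration.

```python
import os


def open_files(pid):
    """Return filesystem paths that process `pid` currently has open.

    Linux-specific: reads the symlinks under /proc/<pid>/fd. Device nodes
    such as /dev/null are included. Returns an empty list when the
    directory is missing or unreadable.
    """
    fd_dir = f"/proc/{pid}/fd"
    try:
        entries = os.listdir(fd_dir)
    except OSError:
        return []
    paths = set()
    for name in entries:
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # descriptor was closed between listdir and readlink
        if target.startswith("/"):  # skip pipes, sockets and anonymous inodes
            paths.add(target)
    return sorted(paths)


def shared_files(pids):
    """Return the paths that every process in `pids` has open right now."""
    sets = [set(open_files(pid)) for pid in pids]
    return sorted(set.intersection(*sets)) if sets else []
```

Calling shared_files with the process IDs of two engine instances running on the same node would list any path both of them have open at that moment, which is one way to look for an unexpected shared file.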
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: Scorpio question

Post by Sven »

Have you tried to comment out
#define LOG_FILE
in "scorpio.h" and then recompile?

I am not sure, but the function init_io() in "util.cpp" deals with creating (and under certain circumstances also removing) log files in the "./log" subfolder of the current working directory, and if two instances of Scorpio try to maintain these log files simultaneously, then I could imagine that there is potential for trouble.

Are there any files named "./log/log???.txt" left below the Scorpio working folder? Maybe 1000 such files?

Note that with the change above the "log" command seems to be unavailable on the command line according to the sources.

You might also decide to assign separate working directories to each instance of Scorpio to avoid any sharing of temporary files.
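
A minimal sketch of the separate-working-directory idea, assuming the referee is free to choose how each engine process is started. The helper name launch_isolated and the instance_N directory names are hypothetical. Any file the engine expects to find in its working directory (an ini file, for example) would have to be copied into each directory first, and the engine path should be absolute so it still resolves after the directory change.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def launch_isolated(command, base_dir, count):
    """Start `count` copies of `command`, each in its own working directory.

    Returns a list of (workdir, process) pairs. Relative paths the program
    writes to (such as "./log") then land in separate directories.
    """
    procs = []
    for i in range(count):
        workdir = Path(base_dir) / f"instance_{i}"
        workdir.mkdir(parents=True, exist_ok=True)
        proc = subprocess.Popen(
            command,
            cwd=workdir,              # per-instance working directory
            stdin=subprocess.PIPE,    # the referee would send moves here
            stdout=subprocess.PIPE,   # and read the engine's replies here
            text=True,
        )
        procs.append((workdir, proc))
    return procs


if __name__ == "__main__":
    # Stand-in for an engine: a child process that reports its working directory.
    demo = [sys.executable, "-c", "import os; print(os.getcwd())"]
    with tempfile.TemporaryDirectory() as base:
        for workdir, proc in launch_isolated(demo, base, 2):
            out, _ = proc.communicate()
            print(workdir.name, "->", out.strip())
```

If the illegal moves disappear once each instance has its own directory, that points to a shared file; if they persist, the cause is more likely something else the instances share on the node.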
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Scorpio question

Post by bob »

Sven Schüle wrote:Have you tried to comment out
#define LOG_FILE
in "scorpio.h" and then recompile?

I am not sure, but the function init_io() in "util.cpp" deals with creating (and under certain circumstances also removing) log files in the "./log" subfolder of the current working directory, and if two instances of Scorpio try to maintain these log files simultaneously, then I could imagine that there is potential for trouble.

Are there any files named "./log/log???.txt" left below the Scorpio working folder? Maybe 1000 such files?

Note that with the change above the "log" command seems to be unavailable on the command line according to the sources.

You might also decide to assign separate working directories to each instance of Scorpio to avoid any sharing of temporary files.
No log files. But even if there were, I don't see how it would cause the search to make impossible moves here and there and lose on time. I can't play book games like this with Crafty, with learning enabled, because I have had corrupted book files here and there when two instances try to update at the same time. But it doesn't influence the games.

This has been a really strange issue. I originally thought that fast games might be a problem, since with 12 games and 12 cores, there can be an occasional small lag due to an interrupt or whatever. But if I run 11 other programs and one Scorpio, there are still no problems. It only happens when I run more than one Scorpio game at the same time, and only on the same node...

Still looking...