Crafty 23.1 scaling problem on Nehalem octa


Matthias Gemuh
Posts: 3245
Joined: Thu Mar 09, 2006 9:10 am

Re: Crafty 23.1 scaling problem on Nehalem octa

Post by Matthias Gemuh »

Hugo wrote:Hello Mr. Hyatt

Crafty is running under Arena in these mt tests.


regards, Clemens Keck

How do other GUIs handle more than 6 cores?

Matthias.
My engine was quite strong till I added knowledge to it.
http://www.chess.hylogic.de
zullil
Posts: 6442
Joined: Tue Jan 09, 2007 12:31 am
Location: PA USA
Full name: Louis Zulli

Re: Crafty 23.1 scaling problem on Nehalem octa

Post by zullil »

Matthias Gemuh wrote:
Hugo wrote:Hello Mr. Hyatt

Crafty is running under Arena in these mt tests.


regards, Clemens Keck

How do other GUIs handle more than 6 cores?

Matthias.
Even better, can you run Crafty without any GUI? From a command line?
Hugo
Posts: 782
Joined: Tue Dec 01, 2009 11:10 am

Re: Crafty 23.1 scaling problem on Nehalem octa

Post by Hugo »

Hello Matthias

I also have this problem in the ChessBase GUI using a UCI adapter.
Regards, Clemens
Matthias Gemuh
Posts: 3245
Joined: Thu Mar 09, 2006 9:10 am

Re: Crafty 23.1 scaling problem on Nehalem octa

Post by Matthias Gemuh »

Hugo wrote:Hello Matthias

I also have this problem in Chessbase GUI using uci adapter.
Regards, Clemens
If you used a uci adapter under Arena too, then try without
... under Arena/Winboard/ChessGUI.

Then that option without GUI.

Matthias.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Crafty 23.1 scaling problem on Nehalem octa

Post by bob »

Matthias Gemuh wrote:
Hugo wrote:Hello Matthias

I also have this problem in Chessbase GUI using uci adapter.
Regards, Clemens
If you used a uci adapter under Arena too, then try without
... under Arena/Winboard/ChessGUI.

Then that option without GUI.

Matthias.
OK, some questions:

(1) What operating system? Windoze? If so, before you run, use Task Manager to check the system load. It should be near zero. If not, there's no point in proceeding until it drops to zero. If Linux, a "top" should show a load average near zero. Again, if not, that has to be fixed first.

(2) when you run in a command-line box (windows) or in a terminal window (linux) and start crafty and run with mt=8, what does the load average go to? It should go to 8.0 or very close to that. If it only goes to 4, something's wrong. I can think of one easy explanation, that the executable was built with -DCPUS=4, which means you can't go past that.

(3) Run with mt=8 and send me the log.nnn file (or post it here). That will tell what Crafty thinks is going on. 3.6M nps sounds pretty reasonable, and 8x that is your goal. You might not get all the way to 8x, but you should get very close, so that you see 30M or so...
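
As a rough illustration of the target in (3), the sketch below computes the ideal multi-thread speed and the parallel efficiency from a single-thread NPS figure. The function names are hypothetical and are not part of Crafty; the 3.6M value is the one quoted above, taken here to be a single-thread speed.

```python
def ideal_nps(single_thread_nps, threads):
    """Upper bound on NPS if every thread searched as fast as one thread alone."""
    return single_thread_nps * threads


def parallel_efficiency(single_thread_nps, observed_nps, threads):
    """Fraction of the ideal NPS actually reached (1.0 means perfect scaling)."""
    return observed_nps / ideal_nps(single_thread_nps, threads)


single = 3.6e6                     # NPS figure quoted in the post (assumed single-thread)
target = ideal_nps(single, 8)      # ideal figure for mt=8
print(f"ideal mt=8 speed: {target / 1e6:.1f}M nps")
```

With 3.6M per thread the ideal works out to 28.8M, which is where the "30M or so" goal comes from. Passing an observed mt=8 figure to parallel_efficiency() shows how far a run falls short of that.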
Hugo
Posts: 782
Joined: Tue Dec 01, 2009 11:10 am

Re: Crafty 23.1 scaling problem on Nehalem octa

Post by Hugo »

Hello again

Sorry for my late answer, but my job has kept me busy the last few days.

The operating system is Windows Server 2003 Enterprise, 64-bit.
System load before starting Crafty is zero.
When running Crafty from the command line, load goes to 8 (100%; I can see all 8 cores at 100% load in Task Manager).

I have two octa computers with the same OS: the W5580 and the QX9775 (Skulltrail). On Skulltrail scaling seems normal. I am getting 22,000 kN/s (about 22M nps) with that machine.

Log file content:
usage: bookpath|perspath|logpath|tbpath <path>
EPD Kit revision date: 1996.04.21
unable to open book file [./book.bin].
book is disabled
unable to open book file [./books.bin].

Initializing multiple threads.
System is SMP, not NUMA.
EGTB cache memory = 32M bytes.
pondering enabled.
playing a computer!
use 'settc' command if a game is restarted after Crafty
has been terminated for any reason.
tournament mode.
book learning disabled
book file disabled.
max threads set to 8.
SMP keep extra threads spinning when idle.
hash table memory = 1024M bytes.
pawn hash table memory = 256M bytes.


Crafty v23.1 JA (8 cpus)

White(1): go depth 20
time surplus 0.00 time limit 57.00 (+27.00) (3:00)
depth time score variation (1)
starting thread 1
starting thread 2
starting thread 3
starting thread 4
starting thread 5
starting thread 6
starting thread 7
12 0.08 0.23 1. Nf3 Nc6 2. Nc3 Nf6 3. e3 e5 4. Bb5
Bd6 5. O-O O-O 6. d3 Re8
12-> 0.10 0.23 1. Nf3 Nc6 2. Nc3 Nf6 3. e3 e5 4. Bb5
Bd6 5. O-O O-O 6. d3 Re8 (s=2)
13 0.17 0.11 1. Nf3 Nc6 2. Nc3 Nf6 3. e3 e6 4. Bc4
Bb4 5. O-O O-O 6. d3 d5 7. Bb5
13-> 0.35 0.11 1. Nf3 Nc6 2. Nc3 Nf6 3. e3 e6 4. Bc4
Bb4 5. O-O O-O 6. d3 d5 7. Bb5 (s=2)
14 0.42 0.09 1. Nf3 Nc6 2. Nc3 Nf6 3. e3 e6 4. Bc4
Bb4 5. O-O O-O 6. Bd3 Bc5 7. Ne4 Nxe4
8. Bxe4 <HT>
14-> 0.50 0.09 1. Nf3 Nc6 2. Nc3 Nf6 3. e3 e6 4. Bc4
Bb4 5. O-O O-O 6. Bd3 Bc5 7. Ne4 Nxe4
8. Bxe4 <HT>
15 1.22 0.09 1. Nf3 Nf6 2. e3 Nc6 3. Nc3 e6 4. Bc4
Bb4 5. O-O O-O 6. Bd3 Bc5 7. Ne4 Nxe4
8. Bxe4 Bd6 <HT>
15 1.89 0.20 1. Nc3 Nc6 2. e4 Nf6 3. Nf3 e5 4. d4
exd4 5. Nxd4 Bc5 6. Be3 Bxd4 7. Bxd4
O-O 8. Bc4 <HT>
15-> 2.05 0.20 1. Nc3 Nc6 2. e4 Nf6 3. Nf3 e5 4. d4
exd4 5. Nxd4 Bc5 6. Be3 Bxd4 7. Bxd4
O-O 8. Bc4 <HT> (s=3)
16 2.96 0.25 1. Nc3 Nf6 2. e4 e5 3. Nf3 Bb4 4. Bc4
Nc6 5. O-O O-O 6. Nd5 Nxd5 7. exd5
e4 8. dxc6 exf3 <HT> (s=2)
16-> 3.13 0.25 1. Nc3 Nf6 2. e4 e5 3. Nf3 Bb4 4. Bc4
Nc6 5. O-O O-O 6. Nd5 Nxd5 7. exd5
e4 8. dxc6 exf3 <HT>
17 4.02 0.25 1. Nc3 Nf6 2. e4 e5 3. Nf3 Bb4 4. Bc4
Nc6 5. O-O O-O 6. Nd5 Nxd5 7. exd5
e4 8. dxc6 exf3 9. Qxf3 dxc6
17-> 4.94 0.25 1. Nc3 Nf6 2. e4 e5 3. Nf3 Bb4 4. Bc4
Nc6 5. O-O O-O 6. Nd5 Nxd5 7. exd5
e4 8. dxc6 exf3 9. Qxf3 dxc6 (s=2)
18 8.52 0.22 1. Nc3 Nf6 2. e4 e5 3. Nf3 Bb4 4. Bc4
O-O 5. O-O d6 6. Nd5 Nxd5 7. Bxd5 c6
8. Bc4 Bg4 9. d4 Bxf3 10. gxf3
18 19.25 0.24 1. e4 e5 2. Nf3 Nc6 3. Nc3 Nf6 4. d4
exd4 5. Nxd4 Bc5 6. Nxc6 bxc6 7. Bd3
O-O 8. O-O Re8 9. Be3 Bxe3 <HT>
18-> 20.66 0.24 1. e4 e5 2. Nf3 Nc6 3. Nc3 Nf6 4. d4
exd4 5. Nxd4 Bc5 6. Nxc6 bxc6 7. Bd3
O-O 8. O-O Re8 9. Be3 Bxe3 <HT>
19 28.39 0.23 1. e4 e5 2. Nf3 Nc6 3. Nc3 Nf6 4. d4
exd4 5. Nxd4 Bb4 6. Nxc6 bxc6 7. Bd3
d5 8. exd5 Qe7+ 9. Qe2 Qxe2+ 10. Bxe2
Nxd5 11. Bd2 Nxc3 12. Bxc3 Bxc3+ 13.
bxc3
19-> 37.88 0.23 1. e4 e5 2. Nf3 Nc6 3. Nc3 Nf6 4. d4
exd4 5. Nxd4 Bb4 6. Nxc6 bxc6 7. Bd3
d5 8. exd5 Qe7+ 9. Qe2 Qxe2+ 10. Bxe2
Nxd5 11. Bd2 Nxc3 12. Bxc3 Bxc3+ 13.
bxc3 (s=2)
time=58.03 mat=0 n=561577064 fh=92% nps=9.7M
extensions=17.6M qchecks=20.8M reduced=57.0M pruned=194.8M
predicted=0 evals=261.7M 50move=0 EGTBprobes=0 hits=0
SMP-> splits=84607 aborts=10588 data=59/65536 elap=58.03
White(1): e4
time used: 58.03
time remaining (white): 0:29:01 (59 more moves)
time remaining (black): 0:30:00 (60 more moves)
if clocks are wrong, use 'clock' command to adjust them
Black(1): e5 [pondering]
time surplus 0.00 time limit 53.13 (+23.61) (2:57)
depth time score variation (18)
19 19.32 0.23 2. Nf3 Nc6 3. Nc3 Nf6 4. Bc4 Bc5 5.
O-O O-O 6. d3 d6 7. Bg5 Bg4 8. h3 Bxf3
9. Qxf3 Nd4 10. Qd1 Kh8 11. Qd2
19-> 26.40 0.23 2. Nf3 Nc6 3. Nc3 Nf6 4. Bc4 Bc5 5.
O-O O-O 6. d3 d6 7. Bg5 Bg4 8. h3 Bxf3
9. Qxf3 Nd4 10. Qd1 Kh8 11. Qd2
20 47.84 0.19 2. Nf3 Nc6 3. Nc3 Nf6 4. Bc4 Bc5 5.
O-O O-O 6. d3 d6 7. Bg5 h6 8. Be3 Bxe3
9. fxe3 Na5 10. Bb5 Be6 11. Qe2 Qe7
20 1:15 29/29? 2. Nc3 (10.1Mnps)

Maybe it would help if you connected to my machine by RDP. If you are interested, I can arrange that. Any other help would also be very kind.

Thank you, Clemens Keck
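
The summary line in the log can be cross-checked with a few lines of Python. This is only a sketch: it assumes that "time=" is elapsed seconds and "n=" is the total node count (which is consistent with the reported nps), and that the 3.6M nps figure mentioned earlier in the thread is the single-thread speed. The function name is hypothetical.

```python
import re

# Summary line copied from the log above.
SUMMARY = "time=58.03 mat=0 n=561577064 fh=92% nps=9.7M"


def nodes_per_second(summary_line):
    """Recompute NPS as nodes / seconds from a 'time=... n=...' summary line."""
    match = re.search(r"time=([\d.]+).*?\bn=(\d+)", summary_line)
    if match is None:
        raise ValueError("no time=/n= fields found")
    seconds = float(match.group(1))
    nodes = int(match.group(2))
    return nodes / seconds


nps = nodes_per_second(SUMMARY)
single_thread_nps = 3.6e6          # assumed single-thread figure from earlier in the thread
speedup = nps / single_thread_nps
print(f"{nps / 1e6:.1f}M nps, {speedup:.1f}x one thread, "
      f"{speedup / 8:.0%} of ideal for 8 threads")
```

Under those assumptions the 8-thread run is only roughly 2.7 times faster than one thread, well short of the near-8x that was expected.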
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Crafty 23.1 scaling problem on Nehalem octa

Post by bob »

Hugo wrote:Hello again

sorry for my late answer, but my job made me busy last days.

operating system is Windows server 2003 enterprise 64 bit
system load before starting crafty is zero
when running crafty in command line load goes to 8 (100%, I can see all 8 cores with 100% load in task manager)

I have two octa computers with same OS. The W5580 and the QX 9775 (skulltrail). On skulltrail scaling seems normal. I am getting 22.000KNs with that machine.

maybe it helps, when you go by RDP on my machine. When you are interested I can arrange that. Any other help would also be verry kind.

Thank you, Clemens Keck
The first thing I would do is boot the box, go into the BIOS setup and look at the hardware settings. If you see something like "logical cpu on/enabled", turn it off. You are running an old version of Windows. If you have 8 real cores but have hyper-threading on, old versions could easily schedule most of the threads on the same physical package (processor chip) while leaving physical cores on the other processor chip idle. Newer versions of Windows and Linux have a process scheduler that understands this problem and handles it correctly, starting 4 threads on each processor "package" to prevent this kind of stuff.

Whether that is your problem or not is unknown, but it is certainly a known issue. While you are at it, I would disable "turbo-boost" as well, as that causes some quirks when you test with one, then two, then 4 or 8 CPUs. With 1 or 2, it will automatically overclock a processor, but when you get all 4 cores on a chip busy, the overclocking is disabled, which produces some odd performance scaling numbers.
Hugo
Posts: 782
Joined: Tue Dec 01, 2009 11:10 am

Re: Crafty 23.1 scaling problem on Nehalem octa

Post by Hugo »

Hello

Hyper-threading was OFF. Only 8 cores/threads are shown in Task Manager.
I cannot imagine that turbo boost is the cause of this strange scaling, but I will test it soon.

Regards, Clemens Keck
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Crafty 23.1 scaling problem on Nehalem octa

Post by bob »

Hugo wrote:Hello

hyperthreading was OFF. only 8 cores/threads shown in Task manager.
I cannot imagine that turbo boost is the matter of this strange scaling. But I will test it soon.

Regards, Clemens Keck
Turbo boost won't cause this, most likely. What it should do is inflate the one-thread, two-thread, etc. speeds, but not the 8-thread speed. It will overclock when you use fewer than the full number of cores. Which means that while you should still hit 25M-30M on that thing, your single-thread NPS will be higher than it ought to be. The thing will ramp up the clock in 133 MHz increments. Cheaper Nehalems will add up to 2 of those increments. The extreme versions will add more, although I am not sure of the exact value.
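
To put a number on that inflation, here is a small sketch. It assumes NPS scales linearly with clock speed, uses the 133 MHz step from above, and takes a 3200 MHz base clock and 2 increments purely as example inputs. The function names are hypothetical.

```python
STEP_MHZ = 133  # turbo increment size given in the post


def turbo_clock_mhz(base_mhz, increments, step_mhz=STEP_MHZ):
    """Clock speed after the given number of turbo increments."""
    return base_mhz + increments * step_mhz


def inflation(base_mhz, increments, step_mhz=STEP_MHZ):
    """Factor by which a lightly loaded run is sped up over the base clock."""
    return turbo_clock_mhz(base_mhz, increments, step_mhz) / base_mhz


def corrected_speedup(multi_nps, single_nps_with_turbo, base_mhz, increments):
    """Speedup relative to a single thread running at the base clock.

    Assumes the multi-thread run got no turbo and that NPS is proportional
    to clock speed; both are simplifications.
    """
    single_at_base = single_nps_with_turbo / inflation(base_mhz, increments)
    return multi_nps / single_at_base


factor = inflation(3200, 2)        # example inputs, not measured values
print(f"single-thread NPS inflated by about {(factor - 1) * 100:.1f}%")
```

An effect of this size makes the measured speedup look a little worse than it really is, but it is far too small to explain an 8-thread run that reaches only a third of the ideal speed.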

Any quirks in the hardware? For example, you have triple-channel memory, so each bank has to be the same size and the same kind of memory (typical totals are multiples of three, like 3 gigs, 6 gigs, 12 gigs, etc.). If you don't do this, the memory will slow way down.

Another interesting test is to run a one-CPU test that takes 60 seconds. Check the NPS. Then do two one-CPU tests (two separate runs of the program) and you should see both get that same NPS. Repeat for 4 and 8. If the 8 runs slow down by 50%, something is really broken in the O/S, hardware or BIOS.
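
The same idea can be tried without the engine at all. The sketch below is an assumption-laden stand-in: instead of Crafty it starts N copies of a tiny CPU-bound Python loop as separate processes and compares the per-copy rate against the one-copy run. The names WORKER, run_copies and slowdown_table are hypothetical.

```python
import subprocess
import sys

# Tiny CPU-bound stand-in for a single-threaded engine run: counts loop
# iterations for a fixed wall-clock time and prints iterations per second.
WORKER = """
import sys, time
duration = float(sys.argv[1])
start = time.perf_counter()
count = 0
while time.perf_counter() - start < duration:
    for _ in range(10000):
        count += 1
print(count / (time.perf_counter() - start))
"""


def run_copies(copies, duration):
    """Start `copies` independent worker processes and return each one's rate."""
    procs = [
        subprocess.Popen([sys.executable, "-c", WORKER, str(duration)],
                         stdout=subprocess.PIPE, text=True)
        for _ in range(copies)
    ]
    return [float(p.communicate()[0]) for p in procs]


def slowdown_table(copy_counts=(1, 2, 4, 8), duration=1.0):
    """Return (copies, mean rate, rate relative to the first run) per run."""
    baseline = None
    rows = []
    for copies in copy_counts:
        rates = run_copies(copies, duration)
        mean_rate = sum(rates) / len(rates)
        if baseline is None:
            baseline = mean_rate
        rows.append((copies, mean_rate, mean_rate / baseline))
    return rows
```

Calling slowdown_table(duration=60.0) mirrors the 60-second runs described above. On a healthy 8-core box the last column should stay close to 1.0 for every row. A loop this small barely touches memory, so it mainly exposes scheduling, clock or throttling problems; running real one-CPU Crafty instances is the better check for memory issues.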
Gian-Carlo Pascutto
Posts: 1260
Joined: Sat Dec 13, 2008 7:00 pm

Re: Crafty 23.1 scaling problem on Nehalem octa

Post by Gian-Carlo Pascutto »

Maybe the box is overheating and throttling itself when all CPUs are enabled?