threads vs processes again

Discussion of chess software programming and technical issues.

Moderator: Ras

Zach Wegner
Posts: 1922
Joined: Thu Mar 09, 2006 12:51 am
Location: Earth

Re: threads vs processes again

Post by Zach Wegner »

bob wrote:What kernel is that based on? Not Linux, if I recall? If so, that is the problem. BSD/Solaris threads are not particularly efficient. And you might have to use pthread_attr() to make sure that logical threads and physical processes are matched up... Solaris doesn't do that by default, but NetBSD I don't know about.

BTW I can run your code on our 8-core box to test if you want...
Yes, I'm starting to think it's the OS. Pthread support for SMP boxes is relatively new; their first SMP kernel was around 2004, IIRC. The threads were all running on the same processor until I changed the PTHREAD_CONCURRENCY environment variable as described. I looked at the pthread_attr functions before, but there was nothing about thread-to-CPU matching.

It would be pretty cool to run it on 8 CPUs, though with my luck it will crash quickly. The easiest way would be to get both the threads and processes versions through CVS:

cvs -z3 -d:pserver:anonymous@zct.cvs.sourceforge.net:/cvsroot/zct co -P zct

cvs -z3 -d:pserver:anonymous@zct.cvs.sourceforge.net:/cvsroot/zct co -r zct0_3_2472_threads -P zct

Then, in each checkout, modify line 47 to set MAX_CPUS to 8. Run "make release" for each, then start the program and type "bench".

Now, as ZCT's SMP support is still incomplete, the idle time will probably be pretty big. It's usually around 10% for 4 cpus with processes, and 20-30% with threads. It also hangs occasionally, but it should get through a benchmark just fine.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: threads vs processes again

Post by bob »

I will see if I can get this to work tonight and post the results...
wgarvin
Posts: 838
Joined: Thu Jul 05, 2007 5:03 pm
Location: British Columbia, Canada

Re: threads vs processes again

Post by wgarvin »

bob wrote:
Aleks Peshkov wrote:I suspect cross-process Transposition Table creates some memory management overhead.
It is simply shared memory either way (threads or processes). Both threads have the same hash table mapped into their virtual address spaces. Ditto for processes as I used the system V shared memory approach (shmget/shmat/etc).
Are you getting a lot of context switches for some reason? Could the difference be TLB flushes? Or some sort of virtualization overhead? Ideally each thread (or process) gets its own CPU, so they shouldn't have to switch very often. Maybe Linux has some mechanism you could use to measure the number of process context switches that are actually occurring.
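For readers who have not used the System V calls bob mentions in the quote above (shmget/shmat), here is a minimal sketch of the idea. It is not code from Crafty or ZCT: the `tt_entry` struct, the `TT_ENTRIES` size and the `shared_table_demo` function are made-up names for illustration. It creates one segment, forks a child that stores an entry, and checks that the parent sees the same memory.

```c
#define _XOPEN_SOURCE 700
#include <stdint.h>
#include <stdlib.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a transposition-table entry. */
typedef struct {
    uint64_t key;
    int32_t  score;
} tt_entry;

#define TT_ENTRIES 1024  /* illustrative size, not taken from the thread */

/*
 * Create a System V segment, fork a child that stores one entry, and
 * report whether the parent can read it back.
 * Returns 1 if the write is visible, 0 if not, -1 on a system-call failure.
 */
int shared_table_demo(void)
{
    int id = shmget(IPC_PRIVATE, TT_ENTRIES * sizeof(tt_entry),
                    IPC_CREAT | 0600);
    if (id < 0)
        return -1;

    tt_entry *table = shmat(id, NULL, 0);
    if (table == (void *)-1) {
        shmctl(id, IPC_RMID, NULL);
        return -1;
    }

    /* Mark the segment for removal; it stays usable until the last detach. */
    shmctl(id, IPC_RMID, NULL);

    pid_t pid = fork();          /* the child inherits the attachment */
    if (pid < 0) {
        shmdt(table);
        return -1;
    }
    if (pid == 0) {
        table[42].key = 0xC0FFEEULL;
        table[42].score = 17;
        _exit(0);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0) {
        shmdt(table);
        return -1;
    }

    /* The parent reads the slot the child wrote: same physical pages. */
    int ok = (table[42].key == 0xC0FFEEULL && table[42].score == 17);
    shmdt(table);
    return ok;
}
```

With threads there is no setup step at all, since every thread already shares the address space. With processes each one has its own page tables that map the same pages, which is why TLB behaviour is a reasonable thing to ask about.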
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: threads vs processes again

Post by bob »

wgarvin wrote:
bob wrote:
Aleks Peshkov wrote:I suspect cross-process Transposition Table creates some memory management overhead.
It is simply shared memory either way (threads or processes). Both threads have the same hash table mapped into their virtual address spaces. Ditto for processes as I used the system V shared memory approach (shmget/shmat/etc).
Are you getting a lot of context switches for some reason? Could the difference be TLB flushes? Or some sort of virtualization overhead? Ideally each thread (or process) gets its own CPU, so they shouldn't have to switch very often. Maybe Linux has some mechanism you could use to measure the number of process context switches that are actually occurring.
Nope. This is on a cluster node that is running nothing else. I ran both programs on the same node and interleaved the runs to make sure nothing changed. One simple test is to track process/CPU pairings, and there are zero problems there. Newer Linux kernels have a horrific processor affinity. For example, I can run three processes on a machine with two CPUs, and one process will lock on to one processor, while the other two lock on to the other and get 50% of that processor each, while the first gets 100%.

But there is obviously some sort of difference. For example, with threads I only see one "process" running but using 800% of the CPUs available. With processes, I see 8 processes running, each using 100% of one processor. So something is different at a level I cannot see. It is not going to remain a mystery forever, however.

BTW, Linux does measure context switches, among other things, and the numbers are very low, although running vmstat to look at them adds to the number, obviously.
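One way to get those counts without an external tool adding its own switches is to ask from inside the program. The sketch below uses getrusage(), which on Linux and the BSDs fills in ru_nvcsw and ru_nivcsw. The `ctx_switches` struct and `read_ctx_switches` function are hypothetical names for illustration, not anything from Crafty or ZCT.

```c
#define _XOPEN_SOURCE 700
#include <sys/resource.h>

/* Context switches charged to the calling process so far. */
typedef struct {
    long voluntary;    /* gave up the CPU, e.g. blocked on I/O or a lock */
    long involuntary;  /* preempted by the scheduler */
} ctx_switches;

/* Fill *out from getrusage(); returns 0 on success, -1 on failure. */
int read_ctx_switches(ctx_switches *out)
{
    struct rusage ru;

    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;

    out->voluntary = ru.ru_nvcsw;
    out->involuntary = ru.ru_nivcsw;
    return 0;
}
```

Calling this once before and once after a benchmark run and subtracting gives the switches incurred by the run itself. One caveat when comparing the two builds: on Linux, RUSAGE_SELF sums over all threads in the process, so the threaded version reports one total, while in the process version each process reports only its own count and the totals have to be added up.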
Cardoso
Posts: 363
Joined: Thu Mar 16, 2006 7:39 pm
Location: Portugal
Full name: Alvaro Cardoso

Re: threads vs processes again

Post by Cardoso »

Sorry for the slight off topic,

My question is: will the newer Crafty 22.2 run OK with pthreads on Windows?
Have you already tested this new version on Windows?

Thanks,
Alvaro
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: threads vs processes again

Post by bob »

Cardoso wrote:Sorry for the slight off topic,

My question is: will the newer Crafty 22.2 run OK with pthreads on Windows?
Have you already tested this new version on Windows?

Thanks,
Alvaro
This will hopefully resolve all the old issues, including the smpnice=1 that was causing problems on Windows. By the time it is released, we will have tested it on several different platforms to make sure all is well...