pthread weirdness

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw, Ras

jswaff

pthread weirdness

Post by jswaff »

I'm seeing some very strange behavior with my program. On initialization, a number of threads are started that are simply going into a busy loop. When the program stops, its Cleanup() routine stops all those threads. Should be nothing to it. :)

Things work great on the first execution. For subsequent executions, the program initializes just fine, but somewhere along the way - after it enters its "command loop" (which just waits for input from scanf) it is KILLED right away.

What?!?!

Now, Prophet has been using pthreads all along for user input. A user emailed me about six months ago and told me that Prophet simply died right away on his Mac. I don't have a Mac (I use Gentoo Linux), and I was never able to reproduce this problem. I wonder if it's related...

Anyone have an idea?

--
James

Code:

void InitSearchThreads() {

	printf("initializing search threads...\n");
	for (int i=1;i<NUM_PROCESSORS;i++) {
		search_pool[i].available=0;
		search_pool[i].stop=0;
	}

	for (int i=1;i<NUM_PROCESSORS;i++) {
		printf("\t creating thread %d...\n",i);
		if (pthread_create(&search_pool[i].t,NULL,InitSearchThread,&i)) {
			printf("ERROR initializing thread %d.\n",i);
			exit(1);
		}
	}
	printf("search threads initialized.\n");
}

void StopSearchThreads() {

	for (int i=1;i<NUM_PROCESSORS;i++) {
		search_pool[i].stop=1;
		pthread_join(search_pool[i].t,NULL);
	}

}
Alessandro Scotti

Re: pthread weirdness

Post by Alessandro Scotti »

Using &i (address of a local variable, whose value is also rapidly changing) to initialize a thread looks suspicious...
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Re: pthread weirdness

Post by sje »

Some comments:

1) Why is the scanning index started at one instead of zero?

2) Is search_pool declared volatile? It should be.

3) You should check the return value for pthread_join.
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Re: pthread weirdness

Post by sje »

Alessandro Scotti wrote:Using &i (address of a local variable, whose value is also rapidly changing) to initialize a thread looks suspicious...
I missed that. That's most likely why the failure occurs.
jswaff

Re: pthread weirdness

Post by jswaff »

Alessandro Scotti wrote:Using &i (address of a local variable, whose value is also rapidly changing) to initialize a thread looks suspicious...
You are the man - that was it. Thanks for the quick response.

I appreciate Steven's comments as well...

--
James
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: pthread weirdness

Post by bob »

All I can say is that I gave up on pthreads a few years ago. I found too many inconsistencies between the various Unix and Linux implementations. CPU time was problematic, as some systems return the CPU time for a single thread while others return the accumulated time for all threads. Scheduling policies varied (some systems default to one real process and round-robin the threads on that one process, which won't use more than one CPU unless you add specific thread-scheduling policy calls). I decided to go to fork() and roll my own to get away from all the nonsense, and I have not regretted that decision.

I notice you have a loop iterating on i, and you are passing the _address_ of i to the new thread. By the time that thread executes, i will probably be at limit+1, which is almost certainly not what you want. Don't pass the address; pass the actual value. Note that the create function requires a pointer, but you can cast the value "i" to a (void *) and then, in the new thread, cast it back to an int. Then you are no longer passing a pointer to a value that will be changed many times by the time the thread has been created and gets scheduled.

Also make absolutely certain you understand the difference between these:

volatile int X;
volatile int *X;
int * volatile X;
volatile int * volatile X;

They are absolutely not the same thing, and you have to be sure that anything that can be changed in another thread is declared volatile or compiler optimization will kill you.
CRoberson
Posts: 2080
Joined: Mon Mar 13, 2006 2:31 am
Location: North Carolina, USA

Re: pthread weirdness

Post by CRoberson »

I agree with Bob. I've coded with pthreads, OpenMP and a few other libs that some groups seem stuck on. Each one has missing parts -- the developers decided not to provide that functionality because using it can cause errors.
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Re: pthread weirdness

Post by sje »

Bob is right that there are resource usage metering differences among varying pthread implementations; I've seen this between Linux and OpenBSD. Which has the bug? It depends on who you talk with.

Linux threads are essentially lightweight processes and are managed as such, so a fork only approach is going to work pretty much the same as a pthread approach. On the various BSD systems it's a different story and so a pthread deployment is likely to be more efficient than a simple fork.

Symbolic uses pthreads, but it doesn't use any advanced pthread features other than customization of per thread stack space allocation. The idea is to avoid an unnecessary triggering of platform specific idiosyncrasies.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: pthread weirdness

Post by bob »

sje wrote:Bob is right that there are resource usage metering differences among varying pthread implementations; I've seen this between Linux and OpenBSD. Which has the bug? It depends on who you talk with.

Linux threads are essentially lightweight processes and are managed as such, so a fork only approach is going to work pretty much the same as a pthread approach. On the various BSD systems it's a different story and so a pthread deployment is likely to be more efficient than a simple fork.

Symbolic uses pthreads, but it doesn't use any advanced pthread features other than customization of per thread stack space allocation. The idea is to avoid an unnecessary triggering of platform specific idiosyncrasies.
I don't see any efficiency issue at all between pthreads and fork(). Almost all O/S's today use copy-on-write making process creation very fast.

The main difference is that pthreads shares everything that is global, while with fork() you have to explicitly share things (I use the SYSV shared memory stuff since that seems to work across every unix platform I have tried).

Even when I used pthreads, all I used was pthread_create, since I never want to kill/restart threads as the cost is still expensive enough to not do it inside the tree. I never used mutex locks as the overhead is beyond bearable and it murders efficiency if things are locked/unlocked frequently.

But the main thing I hated was the bugs. fork() has to work or Unix won't work. But the tacked-on thread library may or may not work, depending on random events and/or cosmic ray exposure.
Pradu
Posts: 287
Joined: Sat Mar 11, 2006 3:19 am
Location: Atlanta, GA

Re: pthread weirdness

Post by Pradu »

bob wrote:I notice you have a loop iterating on i, and you are passing the _address_ of i to the new thread. By the time that thread executes, i will probably be at limit+1, which is almost certainly not what you want. Don't pass the address; pass the actual value. Note that the create function requires a pointer, but you can cast the value "i" to a (void *) and then, in the new thread, cast it back to an int. Then you are no longer passing a pointer to a value that will be changed many times by the time the thread has been created and gets scheduled.
I do something similar to James, but I don't quite understand the above. Can you give a quick example of why passing the pointer would be wrong? Also, I guess you have to worry about the size of int, since in a 64-bit executable pointers are 64 bits wide.
http://www.opengroup.org/pubs/online/79 ... reate.html
Also make absolutely certain you understand the difference between these:

volatile int X;
volatile int *X;
int * volatile X;
volatile int * volatile X;

They are absolutely not the same thing, and you have to be sure that anything that can be changed in another thread is declared volatile or compiler optimization will kill you.