A cautionary tale on thread safety
In the Old Days of Unix long before the pthread library and the advent of inexpensive multicore processors, there wasn't too much concern about enforcing thread safety on standard library routines. A result is that some of these older implementations lack re-entrancy and so are unsafe for use in a multithreaded program.
One of these routines is random() and Symbolic uses it in the program's random move picker. Everything was fine until I made the random game generator multithreaded. While there was no crashing, there was an introduction of bias when two or more threads shared activation time with the library routine's internal, single copy state.
On some platforms, there is a supplemental set of the random() routine family with each re-entrant routine having the name of the corresponding non re-entrant routine with a "_r" suffix attached. Unfortunately, this is not standard on both Mac OS/X and Linux.
For the present, I've fixed the problem by guarding the call to the random move picker with a spinlock. This works, but it also adds a significant performance penalty. A better solution may be to use my own re-entrant PRNG instead of random().
Moral of story: Be careful when calling any standard library routine from multithreaded code. If that routine has a persistent internal state, then use the alternative re-entrant version if one is available. Otherwise, use a mutex or spinlock, or write your own re-entrant version.
A cautionary tale on thread safety
Moderators: hgm, Rebel, chrisw
-
- Posts: 3232
- Joined: Mon May 31, 2010 1:29 pm
- Full name: lucasart
Re: A cautionary tale on thread safety
Yes, that's trivial and obvious. A whole new thread to say that?sje wrote:If that routine has a persistent internal state, then use the alternative re-entrant version if one is available. Otherwise, use a mutex or spinlock, or write your own re-entrant version.
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.
-
- Posts: 4675
- Joined: Mon Mar 13, 2006 7:43 pm
Re: A cautionary tale on thread safety
I thought it might save some time for a coder who, like me, might forget about which library routines could have a hidden persistent state.lucasart wrote:Yes, that's trivial and obvious. A whole new thread to say that?sje wrote:If that routine has a persistent internal state, then use the alternative re-entrant version if one is available. Otherwise, use a mutex or spinlock, or write your own re-entrant version.
In the case of using random(), it took me more than an hour to verify the slight bias through testing different configurations on different machines; the fix took only a minute. Much time would have been saved if random() had crashed or hanged.
As a reward for those reading this far, I'm posting the source for my Do Not Disturb timed sleep routines. These are coded to prevent a premature return caused by the kernel's behavior which wakes ALL sleeping threads when ANY of them gets a signal.
Code: Select all
void NapUsec(const Usec usec)
{
Usec now = FetchWallUsec();
const Usec wakeup = now + usec;
while (now < wakeup)
{
usleep((useconds_t) (wakeup - now));
now = FetchWallUsec();
};
}
void NapMsec(const Msec msec)
{
NapUsec(msec * MsecUsecLen);
}
void Sleep(const ui sec)
{
NapMsec(sec * MsecLen);
}
-
- Posts: 2559
- Joined: Fri Nov 26, 2010 2:00 pm
- Location: Czech Republic
- Full name: Martin Sedlak
Re: A cautionary tale on thread safety
Another example would be gmtime/localtime. There is a list of racy functions and it's also documented.
I agree with you that this is annoying but since everybody usually wraps the functionality somehow, this is handled in a single place anyway.
Also, are you aware of the fact that you can only use usleep for <1 second?
I agree with you that this is annoying but since everybody usually wraps the functionality somehow, this is handled in a single place anyway.
Also, are you aware of the fact that you can only use usleep for <1 second?
-
- Posts: 4675
- Joined: Mon Mar 13, 2006 7:43 pm
Re: A cautionary tale on thread safety
This depends on the platform. I've used it under OS/X for testing with an argument longer than one second with no problems. But in regular usage, my program's longest timed sleep is only 100 msec.mar wrote:Also, are you aware of the fact that you can only use usleep for <1 second?
Under OS/X, all the sleep functions other than nanosleep() are just wrappers for nanosleep().
Linux might complain, though.
More important is I've just learned that usleep() is officially deprecated, so that means that it's got to go.
-
- Posts: 931
- Joined: Tue Mar 09, 2010 3:46 pm
- Location: New York
- Full name: Álvaro Begué (RuyDos)
Re: A cautionary tale on thread safety
A better solution would be to use a separate PRNG in each thread, each with its own internal state. See the man page for random_r (a nonstandard glibc extension), or roll your own.sje wrote:For the present, I've fixed the problem by guarding the call to the random move picker with a spinlock. This works, but it also adds a significant performance penalty. A better solution may be to use my own re-entrant PRNG instead of random().
EDIT: Oh, maybe that's what you meant by "re-entrant PRNG".
-
- Posts: 3232
- Joined: Mon May 31, 2010 1:29 pm
- Full name: lucasart
Re: A cautionary tale on thread safety
A C++ class with private data members for the internal state, and a public member function to draw random numbers, is nothing more than syntactic sugar for a C re-entrant function (ie. that takes the state struct pointer as argument).AlvaroBegue wrote:A better solution would be to use a separate PRNG in each thread, each with its own internal state. See the man page for random_r (a nonstandard glibc extension), or roll your own.sje wrote:For the present, I've fixed the problem by guarding the call to the random move picker with a spinlock. This works, but it also adds a significant performance penalty. A better solution may be to use my own re-entrant PRNG instead of random().
EDIT: Oh, maybe that's what you meant by "re-entrant PRNG".
As you say, the important thing is that the state should not be shared between threads (each should have its own state). If it is, then some locking must be used. But that should be avoided, if possible, because it introduces a bottleneck with threads waiting on the lock...
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.
-
- Posts: 4675
- Joined: Mon Mar 13, 2006 7:43 pm
Revised sleep routines
Revised sleep routines
Code: Select all
#define NsecLen 1000000000ull
typedef unsigned long long ui64;
typedef ui64 Sec; // Seconds (always unsigned)
typedef ui64 Msec; // Milliseconds (always unsigned)
typedef ui64 Usec; // Microseconds (always unsigned)
typedef ui64 Nsec; // Nanoseconds (always unsigned)
void NapUsec(const Usec usec)
{
Usec now = FetchWallUsec();
const Usec wakeup = now + usec;
while (now < wakeup)
{
const Nsec nsec = (wakeup - now) * 1000ull;
struct timespec ts;
ts.tv_sec = nsec / NsecLen;
ts.tv_nsec = nsec % NsecLen;
nanosleep(&ts, 0);
now = FetchWallUsec();
};
}
void NapMsec(const Msec msec)
{
NapUsec(msec * 1000ull);
}
void NapSec(const Sec sec)
{
NapMsec(sec * 1000ull);
}
-
- Posts: 2559
- Joined: Fri Nov 26, 2010 2:00 pm
- Location: Czech Republic
- Full name: Martin Sedlak
Re: A cautionary tale on thread safety
I wonder why they did that. to allow you to sleep for longer time by introducing absolutely ridiculous precision?sje wrote:More important is I've just learned that usleep() is officially deprecated, so that means that it's got to go.
Not to mention that nanosleep interface is ugly, I doubt you will ever get sub msec accuracy on any OS (unless it'd use some hybrid approach that spins).
-
- Posts: 2559
- Joined: Fri Nov 26, 2010 2:00 pm
- Location: Czech Republic
- Full name: Martin Sedlak
Re: A cautionary tale on thread safety
EDIT (thanks for 15m limit): replace msec with usec