A cautionary tale on thread safety

sje · Post by **sje** » Sat Apr 25, 2015 4:40 am

A cautionary tale on thread safety

In the Old Days of Unix long before the pthread library and the advent of inexpensive multicore processors, there wasn't too much concern about enforcing thread safety on standard library routines. A result is that some of these older implementations lack re-entrancy and so are unsafe for use in a multithreaded program.

One of these routines is random() and Symbolic uses it in the program's random move picker. Everything was fine until I made the random game generator multithreaded. While there was no crashing, there was an introduction of bias when two or more threads shared activation time with the library routine's internal, single copy state.

On some platforms, there is a supplemental set of the random() routine family with each re-entrant routine having the name of the corresponding non re-entrant routine with a "_r" suffix attached. Unfortunately, this is not standard on both Mac OS/X and Linux.

For the present, I've fixed the problem by guarding the call to the random move picker with a spinlock. This works, but it also adds a significant performance penalty. A better solution may be to use my own re-entrant PRNG instead of random().

Moral of story: Be careful when calling any standard library routine from multithreaded code. If that routine has a persistent internal state, then use the alternative re-entrant version if one is available. Otherwise, use a mutex or spinlock, or write your own re-entrant version.

lucasart · Post by **lucasart** » Sat Apr 25, 2015 5:08 am

sje wrote:If that routine has a persistent internal state, then use the alternative re-entrant version if one is available. Otherwise, use a mutex or spinlock, or write your own re-entrant version.

Yes, that's trivial and obvious. A whole new thread to say that?

sje · Post by **sje** » Sat Apr 25, 2015 6:04 am

lucasart wrote:
sje wrote:If that routine has a persistent internal state, then use the alternative re-entrant version if one is available. Otherwise, use a mutex or spinlock, or write your own re-entrant version.
Yes, that's trivial and obvious. A whole new thread to say that?

I thought it might save some time for a coder who, like me, might forget about which library routines could have a hidden persistent state.

In the case of using random(), it took me more than an hour to verify the slight bias through testing different configurations on different machines; the fix took only a minute. Much time would have been saved if random() had crashed or hanged.

As a reward for those reading this far, I'm posting the source for my Do Not Disturb timed sleep routines. These are coded to prevent a premature return caused by the kernel's behavior which wakes ALL sleeping threads when ANY of them gets a signal.

Code: Select all

void NapUsec&#40;const Usec usec&#41;
&#123;
  Usec now = FetchWallUsec&#40;);
  const Usec wakeup = now + usec;

  while &#40;now < wakeup&#41;
  &#123;
    usleep&#40;&#40;useconds_t&#41; &#40;wakeup - now&#41;);
    now = FetchWallUsec&#40;);
  &#125;;
&#125;

void NapMsec&#40;const Msec msec&#41;
&#123;
  NapUsec&#40;msec * MsecUsecLen&#41;;
&#125;

void Sleep&#40;const ui sec&#41;
&#123;
  NapMsec&#40;sec * MsecLen&#41;;
&#125;

mar · Post by **mar** » Sat Apr 25, 2015 9:47 am

Another example would be gmtime/localtime. There is a list of racy functions and it's also documented.
I agree with you that this is annoying but since everybody usually wraps the functionality somehow, this is handled in a single place anyway.
Also, are you aware of the fact that you can only use usleep for <1 second?

sje · Post by **sje** » Sat Apr 25, 2015 11:38 am

mar wrote:Also, are you aware of the fact that you can only use usleep for <1 second?

This depends on the platform. I've used it under OS/X for testing with an argument longer than one second with no problems. But in regular usage, my program's longest timed sleep is only 100 msec.

Under OS/X, all the sleep functions other than nanosleep() are just wrappers for nanosleep().

Linux might complain, though.

More important is I've just learned that usleep() is officially deprecated, so that means that it's got to go.

AlvaroBegue · Post by **AlvaroBegue** » Sat Apr 25, 2015 2:23 pm

sje wrote:For the present, I've fixed the problem by guarding the call to the random move picker with a spinlock. This works, but it also adds a significant performance penalty. A better solution may be to use my own re-entrant PRNG instead of random().

A better solution would be to use a separate PRNG in each thread, each with its own internal state. See the man page for random_r (a nonstandard glibc extension), or roll your own.

EDIT: Oh, maybe that's what you meant by "re-entrant PRNG".

lucasart · Post by **lucasart** » Sat Apr 25, 2015 3:07 pm

AlvaroBegue wrote:
sje wrote:For the present, I've fixed the problem by guarding the call to the random move picker with a spinlock. This works, but it also adds a significant performance penalty. A better solution may be to use my own re-entrant PRNG instead of random().
A better solution would be to use a separate PRNG in each thread, each with its own internal state. See the man page for random_r (a nonstandard glibc extension), or roll your own.

EDIT: Oh, maybe that's what you meant by "re-entrant PRNG".

A C++ class with private data members for the internal state, and a public member function to draw random numbers, is nothing more than syntactic sugar for a C re-entrant function (ie. that takes the state struct pointer as argument).

As you say, the important thing is that the state should not be shared between threads (each should have its own state). If it is, then some locking must be used. But that should be avoided, if possible, because it introduces a bottleneck with threads waiting on the lock...

sje · Post by **sje** » Sun Apr 26, 2015 12:32 pm

Revised sleep routines

Code: Select all

#define NsecLen 1000000000ull

typedef unsigned long long ui64;

typedef ui64  Sec;      // Seconds &#40;always unsigned&#41;
typedef ui64  Msec;     // Milliseconds &#40;always unsigned&#41;
typedef ui64  Usec;     // Microseconds &#40;always unsigned&#41;
typedef ui64  Nsec;     // Nanoseconds &#40;always unsigned&#41;

void NapUsec&#40;const Usec usec&#41;
&#123;
  Usec now = FetchWallUsec&#40;);
  const Usec wakeup = now + usec;

  while &#40;now < wakeup&#41;
  &#123;
    const Nsec nsec = &#40;wakeup - now&#41; * 1000ull;
    struct timespec ts;

    ts.tv_sec = nsec / NsecLen;
    ts.tv_nsec = nsec % NsecLen;

    nanosleep&#40;&ts, 0&#41;;
    now = FetchWallUsec&#40;);
  &#125;;
&#125;

void NapMsec&#40;const Msec msec&#41;
&#123;
  NapUsec&#40;msec * 1000ull&#41;;
&#125;

void NapSec&#40;const Sec sec&#41;
&#123;
  NapMsec&#40;sec * 1000ull&#41;;
&#125;

mar · Post by **mar** » Sun Apr 26, 2015 9:56 pm

sje wrote:More important is I've just learned that usleep() is officially deprecated, so that means that it's got to go.

I wonder why they did that. to allow you to sleep for longer time by introducing absolutely ridiculous precision?
Not to mention that nanosleep interface is ugly, I doubt you will ever get sub msec accuracy on any OS (unless it'd use some hybrid approach that spins).

mar · Post by **mar** » Sun Apr 26, 2015 10:21 pm

EDIT (thanks for 15m limit): replace msec with usec