Writing bugs

Ras · Post by **Ras** » Sun Jan 13, 2019 12:30 am

mar wrote: ↑Sat Jan 12, 2019 8:46 pm(aka "not bad")

Completely irrelevant for this application, and also bear in mind that it's mod'ed down to have the desired small variation. For image and audio processing, that would be another issue, of course, especially where you don't mod down.

lucasart · Post by **lucasart** » Sun Jan 13, 2019 1:37 am

Coming back to the thread-safety problem, I think the real design mistake of C was to define rand().

Instead, a better way, is to make it re-entrant: rand(&seed). That way the ownership of the seed is pushed to the caller, and if the caller wants to make seed thread local, or chooses to use a global and have racy code, that's the caller's problem. Or maybe the caller wants to have a global seed with a lock?

For example:

Code: Select all

// SplitMix64 PRNG, by Sebastiano Vigna: http://xoroshiro.di.unimi.it/splitmix64.c
static uint64_t prng(uint64_t *state)
{
    uint64_t s = (*state += 0x9E3779B97F4A7C15ULL);
    s = (s ^ (s >> 30)) * 0xBF58476D1CE4E5B9ULL;
    s = (s ^ (s >> 27)) * 0x94D049BB133111EBULL;
    s ^= s >> 31;
    assert(s);  // We cannot have a zero key for zobrist hashing. If it happens, change the seed.
    return s;
}

Same goes for C's strtok(). Making it non re-entrant is not only a thread safety disaster, but even single threaded code with nested function calls will break.

Some of C's stdlib is just crap and should never be used in any serious program. Fortunately, these functions are trivial to replace by one's own code.

State is the root of many evils in programming. So write stateless code.

grahamj · Post by **grahamj** » Sun Jan 13, 2019 9:07 am

mar wrote: ↑Sat Jan 12, 2019 8:46 pm You mean something like this?

Michel · Post by **Michel** » Sun Jan 13, 2019 9:57 am

State is fine but it should be encapsulated in an object. Like so:

Code: Select all

rg=random_generator(seed)
r=rg.rand()

The state maintained by the rg object could be much more complicated than a single number (this would likely be the case for a cryptographically secure random generator). The rg object completely hides this state from the user.

mar · Post by **mar** » Sun Jan 13, 2019 10:03 am

Ras wrote: ↑Sun Jan 13, 2019 12:30 am Completely irrelevant for this application, and also bear in mind that it's mod'ed down to have the desired small variation. For image and audio processing, that would be another issue, of course, especially where you don't mod down.

There's no variation whatsoever, the modulo simply reduces the output to 15 bits (not useful, wasteful in fact).
Anything sized 2^n is automatically uniform distribution (if your PRNG is any good of course).

I simply generated 8-bit white noise bit by bit for visualization purposes.
Slicing the output of a bad PRNG pronounces the problem even more which is clearly shown in the images.

I still don't understand why you defend LCG so vehemently. It's bad whether you like it or not.
There're much better options (pick any KISS you want) that would satisfy 99.9% cases minus crypto
and this is what I'd prefer anytime over a bad PRNG.

The original problem was neither rng nor thread safety but simply multiprocessing instead of multithreading.
LCG/KISS don't care about data races on x86 apart from slowdown due to false sharing.
You'd still get the expected pseudo-random results with multithreading though.

lucasart · Post by **lucasart** » Sun Jan 13, 2019 10:49 am

Michel wrote: ↑Sun Jan 13, 2019 9:57 am State is fine but it should be encapsulated in an object. Like so:
Code: Select all
rg=random_generator(seed)
r=rg.rand()
The state maintained by the rg object could be much more complicated than a single number (this would likely be the case for a cryptographically secure random generator). The rg object completely hides this state from the user.

you do realise that rg.rand() is C++ syntactic sugar for C's rand(&rg) ?

In other words re-entrant function in C = method in C++

sure the state can be more than 64-bit but that does not require object orientation (which does not exist in C, and C's rand() was the topic of debate). just rand(&state), and state is of type 'struct State' which contains whatever it needs to contain.

Michel · Post by **Michel** » Sun Jan 13, 2019 11:09 am

Of course it is obvious how to emulate the correct object oriented design in C in this simple case. That was left as an exercise to the reader.

You were claiming "state is evil" (as usual a statement devoid of any nuance). I am claiming instead: "there is no problem with state as long as it is properly encapsulated".

Ras · Post by **Ras** » Sun Jan 13, 2019 3:28 pm

lucasart wrote: ↑Sun Jan 13, 2019 1:37 amInstead, a better way, is to make it re-entrant: rand(&seed).

POSIX-2001 has rand_r for this, but it was declared as obsolete in POSIX-2008 - any idea why?

mar wrote: ↑Sun Jan 13, 2019 10:03 amThere's no variation whatsoever, the modulo simply reduces the output to 15 bits

I meant in actual usage. When you want a small variation, say +/- 50 centipawns, then the usual idiom is to mod the rand() result down like in

Code: Select all

noise = rand()%101 - 50;

(not useful, wasteful in fact).

The mod down inside rand is useful in fact, and the C ISO standard suggests this for good reason. It would be nonsense to hand out the LCG 32 bit value where the LSB is regularly alternating. Also, RAND_MAX is only specified as at least 32767 anyway in the C standard because an int is only guaranteed to be at least 16 bits.

I still don't understand why you defend LCG so vehemently. It's bad whether you like it or not.

Because the notion that "it's bad" is just nonsense. It's unsuited for certain applications, and the one here just is not one of them.

Besides, the moiré image issue with MS' rand() isn't even the LGC, it's that they didn't just copy the suggestion from the ISO standard and used 214013 instead of 1103515245 as multiplier, which is too small. No surprise that MS invents something of its own and botches it up, they do that all the time. Use the one from the ISO standard, and you don't get a moiré image, see attachment.

mar · Post by **mar** » Sun Jan 13, 2019 4:08 pm

Ras wrote: ↑Sun Jan 13, 2019 3:28 pm I meant in actual usage. When you want a small variation, say +/- 50 centipawns, then the usual idiom is to mod the rand() result down like in
Code: Select all
noise = rand()%101 - 50;

Ok, not to mention that getting a random in interval by using modulo is not a good idea, you don't get uniform distribution this way unless the value you modulate with is a power of two. It gets worse the smaller the output range is, especially bad if RAND_MAX is only 32k.

The mod down inside rand is useful in fact, and the C ISO standard suggests this for good reason. It would be nonsense to hand out the LCG 32 bit value where the LSB is regularly alternating. Also, RAND_MAX is only specified as at least 32767 anyway in the C standard because an int is only guaranteed to be at least 16 bits.

Yes, legacy limits.
If it behaves this way (alternating LSBit), then it's simply a bad PRNG, sorry.
Anyone who suggests LCG as a standard way to generate PR sequences probably has no clue about PRNGs in general.

Because the notion that "it's bad" is just nonsense. It's unsuited for certain applications, and the one here just is not one of them.

Besides, the moiré image issue with MS' rand() isn't even the LGC, it's that they didn't just copy the suggestion from the ISO standard and used 214013 instead of 1103515245 as multiplier, which is too small. No surprise that MS invents something of its own and botches it up, they do that all the time. Use the one from the ISO standard, and you don't get a moiré image, see attachment.

How many times do I have to show you that LCG is bad

You claim it's nonsense, that's your strongest argument I suppose.
"nonsense" + zero data to back it up.

I wonder how you got that image but you couldn't possibly have obtained it by composing the 8-bit final grayscale value bit-by-bit (hence "cheating"

If you simply output the lowest 8 bits then I suggest you try 1024x1024 or 2048x2048 images, then you'll see for yourself.

This is what I get with the ISO multiplier, not much better (still 8 PRNG calls for each byte):

(I updated the source code)

Or if you want to prove me wrong, just take your ISO LCG and run it through some statistical test suites, like diehard or dieharder. Then come back and tell me how good LCGs are

So let's agree to disagree. You don't care about the quality of PR sequences while I do, it's probably as simple as that.

Sesse · Post by **Sesse** » Mon Jan 14, 2019 12:53 am

Michel wrote: ↑Sun Jan 13, 2019 9:57 am State is fine but it should be encapsulated in an object. Like so:
Code: Select all
rg=random_generator(seed)
r=rg.rand()
The state maintained by the rg object could be much more complicated than a single number (this would likely be the case for a cryptographically secure random generator). The rg object completely hides this state from the user.

https://en.cppreference.com/w/cpp/numeric/random

Writing bugs

Re: Writing bugs

Re: Writing bugs

Re: Writing bugs

Re: Writing bugs

Re: Writing bugs

Re: Writing bugs

Re: Writing bugs

Re: Writing bugs

Re: Writing bugs

Re: Writing bugs