
Randomness in .NET - lowleveldesign
https://lowleveldesign.org/2018/08/15/randomness-in-net/
======
redcalx
I think this is the github issue referred to:

[https://github.com/dotnet/corefx/issues/23298](https://github.com/dotnet/corefx/issues/23298)

I wrote some replacement classes that address all of the known issues, here:

[https://github.com/colgreen/Redzen/tree/master/Redzen/Random](https://github.com/colgreen/Redzen/tree/master/Redzen/Random)

[https://www.nuget.org/packages/Redzen/](https://www.nuget.org/packages/Redzen/)

~~~
ygra
> I think this is the github issue referred to

I think so, too. Because it was not only referred to, but also linked ;-)

~~~
redcalx
Heh, fair enough :) Not easy to find though eh (v. similar colour for normal
and anchor text).

~~~
bhandziuk
It was hard to see. It also isn't underlined like a normal link.

------
nestorD
Two vital pieces of information about the .NET PRNG:

\- It is not thread-safe (and might start outputting a series of zeros when
called in parallel)

\- There is a bug (acknowledged by Microsoft but not fixed for backward
compatibility reasons) in the implementation meaning that the generator has an
abnormally short period and is, overall, less random looking.
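
For the first point, the usual workaround is to give every thread its own generator instead of sharing one. A minimal sketch of that pattern in Python (not .NET code; `thread_rng` and `worker` are names made up for this illustration, and Python's own `random` module behaves differently from `System.Random` under threads):

```python
import random
import threading

# Per-thread storage: each thread sees its own "rng" attribute.
_local = threading.local()

# OS entropy source, used only to seed the per-thread generators.
_seed_source = random.SystemRandom()


def thread_rng() -> random.Random:
    """Return a generator owned by the calling thread, creating it on first use."""
    rng = getattr(_local, "rng", None)
    if rng is None:
        rng = random.Random(_seed_source.getrandbits(64))
        _local.rng = rng
    return rng


def worker(results, index):
    # No generator state is shared between threads here.
    rng = thread_rng()
    results[index] = [rng.randrange(100) for _ in range(5)]


results = [None] * 4
threads = [threading.Thread(target=worker, args=(results, i)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The point is only that no generator state is shared, so there is nothing to corrupt under concurrent calls.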

~~~
adrianN
I'm really bummed out over the fact that you can't change the way a _random_
number generator generates numbers because apparently people depend on the
exact algorithm.

~~~
OskarS
It does make some sense. Let's say you're making a game with a procedural world
(e.g. something like Minecraft) which the player explores and makes changes
to. Instead of storing the entire (potentially infinite) world, you just store
the world seed and the changes players make. In that case, if the algorithm
underlying the PRNG changes, the entire game would break.

There are enough scenarios like this that making a change to a PRNG algorithm is
a very dangerous and breaking change. People rely on the fact that, given the
same seed, you get the same sequence of values.
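
A rough sketch of that save-file idea, in Python rather than C#. Everything here is hypothetical (`generate_world`, `load_world`, the tile names, the save layout); the generator is SplitMix64 written out by hand, so the sequence for a given seed does not depend on any library's implementation:

```python
MASK64 = (1 << 64) - 1
TILES = ["grass", "water", "rock", "sand"]


def splitmix64(state):
    """Advance a SplitMix64 state; return (new_state, 64-bit output)."""
    state = (state + 0x9E3779B97F4A7C15) & MASK64
    z = state
    z = ((z ^ (z >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
    z = ((z ^ (z >> 27)) * 0x94D049BB133111EB) & MASK64
    return state, z ^ (z >> 31)


def generate_world(seed, size):
    """Rebuild the untouched world from the seed alone."""
    tiles = []
    state = seed & MASK64
    for _ in range(size):
        state, value = splitmix64(state)
        tiles.append(TILES[value % len(TILES)])
    return tiles


def load_world(save):
    """Regenerate from the seed, then replay the stored player changes."""
    world = generate_world(save["seed"], save["size"])
    for index, tile in save["changes"].items():
        world[index] = tile
    return world


# The save holds only the seed and the edits, never the full world.
save = {"seed": 12345, "size": 16, "changes": {3: "bridge"}}
world = load_world(save)
```

If `generate_world` ever produced a different sequence for the same seed, every stored change would land on the wrong terrain, which is the breakage described above.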

~~~
iainmerrick
You should use your own PRNG in that case.

I understand not wanting to change the implementation now, but users should
never have assumed it would be stable in the first place.

~~~
OskarS
I mean, yeah. You probably should. But it's entirely reasonable of a game
developer to say "I'm not an expert in random numbers, but Microsoft has lots
of smart engineers, I'm sure they did their research and provided a good
implementation".

The actual answer is that you shouldn't just provide a default "Random" class,
you should provide a more general class with a pluggable algorithm.
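
Roughly what I mean, sketched in Python. `PluggableRandom`, `make_lcg` and `os_entropy_u64` are made-up names, not an existing API; the engine is any callable returning a 64-bit integer, and the LCG constants are the ones Knuth gives for MMIX:

```python
import os
from typing import Callable

MASK64 = (1 << 64) - 1


class PluggableRandom:
    """Distribution helpers on top of any engine that yields 64-bit integers."""

    def __init__(self, next_u64: Callable[[], int]):
        self._next_u64 = next_u64

    def next_float(self) -> float:
        # Use the top 53 bits to build a float in [0, 1).
        return (self._next_u64() >> 11) / (1 << 53)

    def next_below(self, bound: int) -> int:
        # Rejection sampling, so the result has no modulo bias.
        if bound <= 0:
            raise ValueError("bound must be positive")
        limit = (1 << 64) - ((1 << 64) % bound)
        while True:
            value = self._next_u64()
            if value < limit:
                return value % bound


def make_lcg(seed: int) -> Callable[[], int]:
    """Seedable, reproducible engine (64-bit LCG, for illustration only)."""
    state = seed & MASK64

    def next_u64() -> int:
        nonlocal state
        state = (state * 6364136223846793005 + 1442695040888963407) & MASK64
        return state

    return next_u64


def os_entropy_u64() -> int:
    """Non-reproducible engine backed by the operating system."""
    return int.from_bytes(os.urandom(8), "little")


repeatable = PluggableRandom(make_lcg(42))
unpredictable = PluggableRandom(os_entropy_u64)
rolls = [repeatable.next_below(6) + 1 for _ in range(10)]
```

Callers who need a stable sequence pin the engine themselves, and the library stays free to change its default.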

~~~
iainmerrick
No, that’s not reasonable at all. You’d be assuming not only that the
implementation is exactly what you want, but also that it will be identical on
all platforms and will never change.

In practice for .NET it sounds like that’s actually correct -- the bad
implementation will never be fixed. That seems like a bad thing.

~~~
setr
What's the point of providing a seed() function if the algorithm can change
from under your feet, for _any_ given implementation? In your scenario the
_only_ way to have seed() is through a custom implementation, because _any_
implementation may have bugs or inconsistencies that may be fixed at _any_
time. And only your own implementation will stay stable and sane.

And this is true for _all_ backward-compatibility concerns: you’ll have a bug,
or a poor syntax decision, or a crappy API, that's required to be there because
of downstream concerns. If you keep breaking people’s programs to improve the
language, people will either eventually stop updating, or stop using the
language altogether, because it becomes a massive PITA to get any new
features; do it enough and people will say fuck it, you can't be trusted to
stay stable, I’ll write it myself. And eventually a library will come along
that promises stability, and you’ll be back in the same boat.

 _Stability is a feature_. And judging from how languages treat stability
today, and how one of Microsoft's major reasons for success was its almost
obscene adherence to backwards compatibility, it is an _important_ feature.

The cost is of course that these problems persist, and eventually build up
until someone forks, or a major version increments.

But there's a reason that _perfect is the enemy of good_. Breaking programs
arbitrarily to fix bugs/issues _slaughters_ downstream productivity.

~~~
iainmerrick
I think that is sometimes right and sometimes wrong. It’s not consistent
enough to elevate to a principle.

Macs were incredible for backwards-compatibility back in the 80s and 90s, as
good as PCs if not better. Games from 1985 would run happily in System 7 and
MacOS 8. It didn’t help them win against the PC.

Since the return of Steve Jobs, Apple have become increasingly aggressive
about killing off old “obsolete” hardware and software features. As a Mac or
iOS developer it can be incredibly frustrating, constantly having to jump
through new hoops just to be permitted to stay on the platform. But that
doesn’t seem to have hurt Apple’s business success in the slightest.

To answer your initial question--

 _What's the point of providing a seed() function if the algorithm can change
from under your feet, for_ any _given implementation?_

I was imagining that the algorithm would be stable across runs but permitted
to change across major library updates, say.

But I forgot there are two parts to it. One is seed(), the other is the
no-args constructor that uses the system clock but no additional randomness. Can
we at least agree that that one should be fixed? It’s hard to see how any
users even _could_ have a hard dependency on that specific implementation.
Like, code that absolutely requires independent Random objects created in the
same millisecond to have the same seed? Do you see a big risk in breaking
clients like that, for the benefit of improving randomness for everybody else?
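
To make the clock problem concrete, here is a small Python sketch. `clock_seeded` and `entropy_seeded` are hypothetical helpers and the tick value is arbitrary; the clock reading is passed in explicitly so the collision is reproducible:

```python
import os
import random


def clock_seeded(now_ms: int) -> random.Random:
    """Seed from a coarse clock reading only."""
    return random.Random(now_ms)


def entropy_seeded() -> random.Random:
    """Seed from operating-system entropy instead of the clock."""
    return random.Random(int.from_bytes(os.urandom(16), "little"))


# Two generators created within the same (hypothetical) millisecond tick.
now_ms = 1534291200000
a = clock_seeded(now_ms)
b = clock_seeded(now_ms)
same = [a.randrange(1000) for _ in range(5)] == [b.randrange(1000) for _ in range(5)]

# Two generators seeded from OS entropy at the same moment.
c = entropy_seeded()
d = entropy_seeded()
differ = [c.getrandbits(64) for _ in range(4)] != [d.getrandbits(64) for _ in range(4)]
```

Here `same` is true because both objects start from an identical seed, while the entropy-seeded pair diverges. No sane client can be relying on the first behaviour.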

------
garganzol
A useful snippet from the article:

"RNGCryptoServiceProvider is generally a safer choice when you need to
generate random bytes. Creating an instance of this class is expensive, so
it’s better to populate a 400-byte array than call the constructor 100 times
to populate a 4-byte array."

~~~
eterm
Both options there feel like the wrong solution.

Why would you call the constructor 100 times to populate 100 4-byte arrays?

It's a service, surely the best approach is to call the constructor once, then
populate 100 4-byte arrays by calling GetBytes 100 times from the same
service?

It is explicit code without the downside of the expensive constructor.

edit for clarification: It's true that getting all the data at once will still
be much faster because it'll save all the other overhead, but it's not the
constructor at fault there.
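
The three options side by side, as a Python sketch. `ByteSource` is a hypothetical stand-in for a provider with a costly constructor (it just wraps `os.urandom` and counts constructions); it is not the .NET class:

```python
import os


class ByteSource:
    """Hypothetical provider; counts how often it gets constructed."""

    constructed = 0

    def __init__(self):
        ByteSource.constructed += 1

    def get_bytes(self, count: int) -> bytes:
        return os.urandom(count)


def per_call_construction(n: int):
    # The pattern the article warns about: n constructions, n calls.
    return [ByteSource().get_bytes(4) for _ in range(n)]


def shared_instance(n: int):
    # One construction, n small calls.
    source = ByteSource()
    return [source.get_bytes(4) for _ in range(n)]


def single_bulk_read(n: int):
    # One construction, one large call, sliced into 4-byte chunks.
    data = ByteSource().get_bytes(4 * n)
    return [data[i:i + 4] for i in range(0, len(data), 4)]
```

The second and third variants pay the constructor cost once; they differ only in per-call overhead.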

------
japanuspus
The blog-post by fuglede with a detailed analysis of the implications of the
RNG-bug is well worth a read:
[https://fuglede.dk/en/blog/bias-in-net-rng/](https://fuglede.dk/en/blog/bias-in-net-rng/)

~~~
ygra
Isn't there already bias in that computation because the range for the random
numbers includes more even than odd numbers since it's the interval [0,
2147483647)?

~~~
fuglede
Author here; thanks for the interest!

First of all, you're completely right. I do actually mention this fact just
before the start of the section "An experiment". Here, the argument is that
the bias this odd/even mismatch introduces is orders of magnitude smaller
than what is introduced by the rounding errors; that is, a perfect theoretical
RNG drawing from that range would not produce nearly as biased a result (and
conversely, if you were to run the snippet in the blog post using `rng.Next(2,
int.MaxValue)` or `rng.Next(0, int.MaxValue - 2)`, you wouldn't see the same
bias, even though the ranges are still odd/even-biased to about the same
extent).
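
For a sense of scale, the counting argument for an ideal generator on that range can be worked out directly. This short Python check covers only the odd/even mismatch, not the rounding-error bias from the blog post:

```python
from fractions import Fraction

upper = 2**31 - 1          # int.MaxValue, the exclusive upper bound
total = upper              # number of integers in [0, upper)
evens = (upper + 1) // 2   # 0, 2, ..., 2147483646
odds = upper // 2          # 1, 3, ..., 2147483645

# Probability that a perfectly uniform draw from the range is even.
p_even = Fraction(evens, total)

# Excess over one half: exactly 1 / (2 * (2^31 - 1)), about 2.3e-10.
excess = p_even - Fraction(1, 2)
```

There is exactly one more even number than odd, so an ideal generator would be off from 50/50 by a couple of parts in ten billion.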

------
Insanity
I might be missing something, but the three predicted values are not in the
"nextSeed" table? Whereas the original three are.

EDIT: I feel like I miss something in the explanation. Can anyone explain how
the seed table is actually used?

~~~
lowleveldesign
The seed array is an internal state of the PRNG algorithm. It evolves over
time and PRNG uses values from this array (plus some additional parameters,
such as inext or inextp) to generate new "random" numbers. Thus, after
seeding, there is no real randomness in the non-cryptographic PRNGs. To learn
more, have a look at the Mersenne Twister, which is also quite popular and has a
nice description on Wikipedia [1].

[1]
[https://en.wikipedia.org/wiki/Mersenne_Twister](https://en.wikipedia.org/wiki/Mersenne_Twister)
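
To show the general shape of such a generator, here is a simplified Python sketch of a subtractive generator with a 55-element state array and two moving indices. The index names follow the ones mentioned above, but the seeding routine, modulus and lag are assumptions chosen for the illustration; this is not the actual .NET code:

```python
class SubtractiveGenerator:
    SIZE = 55            # length of the state array
    LAG = 24             # distance used for the second tap (assumed)
    MODULUS = 2**31 - 1

    def __init__(self, seed: int):
        # Fill the state array from the seed with a simple LCG (illustrative).
        value = seed % self.MODULUS
        self.seed_array = []
        for _ in range(self.SIZE):
            value = (value * 1103515245 + 12345) % self.MODULUS
            self.seed_array.append(value)
        self.inext = 0
        self.inextp = self.SIZE - self.LAG

    def next(self) -> int:
        # New value = oldest entry minus the entry written LAG steps ago.
        result = (self.seed_array[self.inext] - self.seed_array[self.inextp]) % self.MODULUS
        # The state evolves: the oldest entry is overwritten by the output.
        self.seed_array[self.inext] = result
        self.inext = (self.inext + 1) % self.SIZE
        self.inextp = (self.inextp + 1) % self.SIZE
        return result


gen = SubtractiveGenerator(2018)
observed = [gen.next() for _ in range(3)]

# Anyone holding a copy of the state can compute every future output.
clone = SubtractiveGenerator(0)
clone.seed_array = list(gen.seed_array)
clone.inext, clone.inextp = gen.inext, gen.inextp
predicted = [clone.next() for _ in range(3)]
actual = [gen.next() for _ in range(3)]
```

Each output is also written back into the array, so values only show up in the state once they have been generated, and `predicted` matches `actual` because nothing but the state determines what comes next.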

~~~
Insanity
Thanks for the link! I kind of figured it out over time, but the explanation
helps!

And thanks for the interesting article btw, just realised you're the author
:-)

~~~
lowleveldesign
Thanks :)

------
iainmerrick
_The algorithm used in the .NET Core is the same as in the .NET Framework_
[...] _There is a difference, however, when we use the default constructor._

“Core” and “Framework” have different implementations of the same class? Who
names these things?

~~~
yawgmoth
.NET Standard is an interface of which .NET Core and .NET Framework are
implementations. I agree that it's confusing terminology at first, but for
daily drivers of .NET languages, it should be pretty transparent.

~~~
SketchySeaBeast
It's somehow effectively doubled the number of search results for any one
topic, only half of which will work. But you're right, as a daily user you
learn quickly to treat it as another criterion to sift through to get relevant
results.

------
zbigniewc
Thank you for sharing - I will have to put some work into integrating this
with my software that is unfortunately based on a tripleton design pattern,
but it's worth the effort.

~~~
lowleveldesign
Oh, tripleton might require a special PRNG:
[http://dilbert.com/strip/2001-10-25](http://dilbert.com/strip/2001-10-25) :)

