Hacker News new | past | comments | ask | show | jobs | submit login
Pseudorandomness (seas.harvard.edu)
55 points by luu 3 months ago | hide | past | favorite | 22 comments

Semi on-topic: pseudorandomness is an interesting idea that has interesting applications in real life.

Maria Konnikova [1] mentions that poker players, even when they have an on-average winning strategy, randomize their play using the second hand on their watches. For instance, they might execute their play 3 times every 4 plays, depending what their watches tell them.

For me the deeper implication is that in an environment with stochasticity (the real world is a mix between determinism and randomness), executing the same deterministic strategy all the time is likely not going to guard you against the random events that are harmful. Randomization provides a sort of protection by diversifying your strategies.

[1] in an interview with Sean Carroll on the Mindscape podcast

In game theory, the optimal strategy is often a stochastic one. In those games, if you play deterministically there’s always a way for your opponent to gain an advantage; playing randomly from a particular distribution is strictly better for you.

cf. https://en.wikipedia.org/wiki/Strategy_(game_theory)#Mixed_s...


A good example of this is rock-paper-scissors. Imagine playing against someone with a non-random strategy.

Good ol' rock always wins.

I wonder if an example of randomization might be: put 80% of your apps in AWS, but randomly deploy the remaining 10% in Azure and 10% in GCP just so that your team has expertise in the latter two.

This may also be an example of simply tripling your effort, while maintaining about 80% uptime.

I took the class this book is based on (years ago), and it was great: after programming languages, it was the most philosophically mind-bending class I had in any technical subject.

There sure is a lot of mathematical content there; it's a graduate-level class in CS theory. The math is all there for a good reason if you want to understand precisely what's true in this area vs. what's too good to be true, or to be able to prove that it's true. But I think the key insights can be understood without much math.

The author discusses this all much better than I can in a comment, so I recommend reading the introduction and conclusion chapters. Or this other exposition by a different author, which I discovered through a reference in the intro: http://www.wisdom.weizmann.ac.il/~oded/prg-primer.html

But here's an attempt at describing just one of the key insights: the right definition of "pseudorandom" is computational. Specifically, something that generates a bunch of data is pseudorandom if you can't tell it apart from genuinely random data, within feasible resources like polynomial time.

And one reason that's the right definition is: if you have such data, now you can plug it in literally anyplace you would use random data. Because if something would go wrong by doing so... well, that'd be a way to tell the difference! :-) There is no such way, so it must behave just like using the real thing.

It's an interesting area, but a quick scan of the generators section makes it look quite inaccessible.

I actually have a pseudorandom number generator as my interview question and it's quite simple. I have a slightly simplified asci-art version of it that use essentially a variation of Wolfram's Rule 30.

Coding it is easy, requiring no algorithms or data structures, just some iteration, if statements and substring look ups. At first some are annoyed at what looks like a useless, made up toy project, but then I tell them this was actually a part of Mathematica in the early versions and we start discussing how you'd turn that output into a useful pseudorandom number generating function with a seed, acceptable performance etc. Quite a few end up happy with the problem at the end.

This is a mathematical treatment of pseudorandomness, intended to prove guarantees against specific classes of algorithms . It’s not concerned with “practical” PRGs which don’t provide guarantees about the quality of their randomness

I've looked around abit and pseudorandomness is not an accessible subject, if you have any links for something then please share. I can not judge this, at least it seems thorough, but too much math but it feels like it at least is a good starting point.

Wolfram has a lot of talks, etc but this interview seems like a decent intro: https://www.youtube.com/watch?v=VguG_y05Xe8

Let's implement Rule30, which is quite beautiful and he talks about it a lot: https://mathworld.wolfram.com/Rule30.html


We start with a simple seed and apply a set of simple rules over and over again. If you scroll past the first few lines of the output, you start to get very complicated patters. If you try other rules, you usually don't see any patterns or anything interesting, rule30 is kind of rare.

Wolfram's done some analysis and he says if you go down the central line (assuming the width keep on growing, which is not the case in my fiddle), and take each "bit" (which I represent as "-" and "X") and plot it, you'll find they're fairly uniformly distributed. Once we have these stream of uniformly distributed bits, you can convert it a string of integers or whatever you need.

For the math behind whether something is random or not, you can look up the "chi squared test", but I think the intuition is way simpler - random things are uniformly distributed. Random things look like white noise on old TVs. If you plot the frequency of their occurrence, each one should be just as likely as every other thing. In our example, if you take the output, flatten it into a string, segment it into chunks of 8 (representing a byte), then count the occurrence of each byte ("--------", "-------X", "------X-") they should have the same number of occurrences.

Here's an alternative exposition (by Goldreich) recommended in this one's introduction: http://www.wisdom.weizmann.ac.il/~oded/prg-primer.html

It ends up getting pretty mathematical too, but it seems like Goldreich takes more time up front to really get into the concepts at a philosophical level before proceeding to the math - so you might enjoy reading the introduction, if nothing else.

"Anyone who attempts to generate random numbers by deterministic means is, of course, living in a state of sin." - John Von Neumann

There are great deterministic random number generators, like: https://preshing.com/20121224/how-to-generate-a-sequence-of-...

Can be used to shuffle a billion points per second on the GPU with sufficient randomness for some use cases.

Aren't most PRNs made deterministically? LCGs can have huge cycle times and can produce statistically sound random numbers.

I had to look it up and Wikipedia has this statement a bit differently. This one makes Neumann sound funny because his Mid-Square method is a terrible PRN generator.

> Aren't most PRNs made deterministically?

All pseudorandom numbers are deterministic; that’s what the ‘pseudo-‘ prefix is meant to indicate. There are other kinds of random number generator that observe physical entropy sources instead. Perhaps the most iconic is CloudFlare’s wall of lava lamps[1].

[1] https://blog.cloudflare.com/lavarand-in-production-the-nitty...

History trivia: Cloudflare stole this from SGI, who patented the idea.


Hardly stole it, literally acknowledged in blog post where we announced this: https://blog.cloudflare.com/lavarand-in-production-the-nitty...

We're not the first ones to do this. Our LavaRand system was inspired by a similar system first proposed[1] and built by Silicon Graphics and patented[2] in 1996 (the patent has since expired).

[1] https://en.wikipedia.org/wiki/Lavarand

[2] https://patents.google.com/patent/US5732138

Calm down.

I'm hardly accusing anyone of any wrong-doing. It is a turn-of-phrase that I'm sure most english-speakers have encountered.

It has a very different connotation when you say "I'm stealing..." than it does when you say "you're stealing..." It's only jargon in first-person.

Compare saying "wow, I'm really stupid" when you make a mistake vs. saying "wow, you're really stupid" when they do.

Self-deprecation applied to others is just deprecation.

> Cloudflare stole this from SGI, who patented the idea.

Why the accusation that Cloudflare stole this? The patent expired in 2016 and Cloudflare acknowledges SGI on their page.

PRNGs are often seeded with a "true" random value, from which they generate a sequence of pseudorandom values. In that case, you actually end up with a hybrid of random and deterministic. An example of this is `/dev/urandom`.

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact