Secure Randomness in Go 1.22 (go.dev)
292 points by rsc 13 days ago | 94 comments





From the article

> Go aims to help developers write code that is secure by default. When we observe a common mistake with security consequences, we look for ways to reduce the risk of that mistake or eliminate it entirely. In this case, math/rand’s global generator was far too predictable, leading to serious problems in a variety of contexts.

> For example, when Go 1.20 deprecated math/rand’s Read, we heard from developers who discovered (thanks to tooling pointing out use of deprecated functionality) they had been using it in places where crypto/rand’s Read was definitely needed, like generating key material.

I made exactly this mistake in rclone. I refactored some code which was using the Read function from crypto/rand, and during the process the import got automatically changed (probably by goimports when mixing code which did use math/rand) to math/rand. So it changed from using a secure random number generator to a deterministic one that rclone seeded with the time of day. I didn't notice in the diffs :-( Hence

https://www.cvedetails.com/cve/CVE-2020-28924/

So this change gets a big :+1: from me.


Ouch, apologies for that. We changed goimports to prefer crypto/rand back in 2016, so I'm not entirely sure what happened during your refactoring. Perhaps code that used other math/rand-only APIs ended up in the same file. https://go-review.googlesource.com/24847

Anyway, I'm glad we're cleaning all this up!


Ever since 2016 I've done a named import for "math/rand", calling it "mathrand" or "weakrand", just to be sure I don't accidentally use it for anything secure.
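
For the curious, that convention looks roughly like this (a sketch; the alias names are just my own habit, nothing official):

    package demo

    import (
        cryptorand "crypto/rand"
        weakrand "math/rand" // deliberately ugly alias so misuse stands out in review
    )

    func newKey() ([]byte, error) {
        b := make([]byte, 32)
        if _, err := cryptorand.Read(b); err != nil { // key material: OS CSPRNG only
            return nil, err
        }
        return b, nil
    }

    func jitterMillis() int {
        return weakrand.Intn(100) // non-security use: fine to be predictable
    }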

Go keeps getting better. Thanks for your hard work!


Yes, I'm pretty sure that is what happened. I was consolidating random functions into one file which was already using math/rand, so when I moved the function which was using crypto/rand there, it picked up the math/rand import.

So not goimports' problem, but my problem for expecting goimports to magically do the right thing like it usually does!


I've also had someone report a vulnerability because they thought math/rand was being used when it wasn't. They just mixed something up with a few different files – not a big deal – but it just goes to show how confusing the entire thing is.

text/template and html/template are similar. In hindsight this package name shadowing was a bad idea.


This is one of my bigger remaining issues with day-to-day use of Go. I love how neat Go `package.Symbol` usually looks -- that objective was met -- but it has at least these problems:

* It is syntactically identical to `variable.Member` so even at a glance it's ambiguous. After shadowing occurs, diagnostics get really confused. Go could at least include more error information when this is a possible cause.

* The best package names have a lot of overlap with the best local variable names and in some cases good unexported type names [1]. Single words, almost exclusively nouns, usually singular form (prior to `slices`/`maps`), etc., so if you follow these idioms for both packages and variables you risk a lot of shadows. Who among us hasn't wanted to shadow the name `url` or `path`? (A concrete sketch of this trap follows the list.) To this day I feel I waste a lot of time trying to choose good package names without burning the namespace of good variable names.

Outside the standard library, the official gRPC library is an arguably even better example of terrible package naming: `codes`, `peer`, `status`, and `resolver` are all likely variable names too. The only properly namespaced package is `grpclog`.

* Even if you accept that lowercase is unexported and uppercase is exported, the lowercase side of this makes package-level unexported type names more likely to shadow package names. You can nest those type names inside functions (unless the function is generic), but nested types can't have methods, so they're very rarely useful. If intra-package namespacing and privacy were more refined, at least we'd have more workarounds to avoid type names shadowing package names.

* Semi-related, with packages being the only way to enforce privacy, it seems like we're encouraged to create a lot of packages if we want to enforce privacy a lot. But the more you create, the more globally unique names you have to choose, creating more pressure on that namespace shared with variables and unexported types.

* You can't nest `pkg1.pkg2.Symbol` even though you can nest `var1.var2.Member`[2]. This could get ugly fast and I don't actually want this in the language, but if it had existed then it would have been a useful tool to resolve ambiguity and separate namespaces without aliases. In any case it's another inconsistency between syntax and semantics.

* Import aliases can solve a lot of problems on the spot, but trade them for other problems. When used inconsistently, humans can get confused jumping around between files, but there's no tooling to enforce consistent use, not even a linter. Even when used consistently, tooling can still get confused, e.g. when you move code to another file that doesn't have the import yet, your tooling has to make its best guess, and even the state of the art makes a lot of mistakes. In general, having even a single import alias makes code snippets less "portable" even within a project, and even more so across projects, so in practice they should be avoided.

* Package names are tied to file organization while still also being tied to privacy. When you're perfectly happy with a project structure, this is very elegant and you forget all about it. When you want to refactor a bit, especially to avoid a circular dependency or adjust privacy control, you have to update all callers at best or are permanently limited by your exported API at worst [3]. I observe that most libraries out there prefer to have just a couple of huge exported packages, giving up privacy on their end in exchange for simplicity for users.

* Dot imports were designed in a way that ensures they never get used. You can only dot-import an entire package, not individual symbols from it, so any dot-import is a semver hazard and discouraged. It didn't have to be this way, importing individual symbols (like C++, Python, Java, Rust, and probably many more [4]) would have still reduced clutter without trading it for a semver hazard. It's a strange oversight in a language otherwise so well suited to writing future-proof code. Of my gripes in this comment, I think this is the main one I feel could be resolved by backwards-compatible language extensions, but it's also the least relevant to the original issue of name shadowing.

These are all manageable issues when you carefully structure your own projects and pick all of your own package names. Though I'm sure anyone who has worked in Go long enough has had to contribute to (or outright inherit) a project written with a very different approach that has a lot more shadowing hazards, showing how these individually simple rules can combine into serious quality-of-life issues on a project. Exported package names often become unfixable, so when you inherit a situation like that, you're going to get routine papercuts with no real way to avoid them.
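
To make the shadowing point concrete, a rough sketch (my own illustrative example) of how shadowing `url` silently changes meaning:

    package demo

    import "net/url"

    func healthURL(base string) (*url.URL, error) {
        url, err := url.Parse(base) // the local `url` shadows the package from here on
        if err != nil {
            return nil, err
        }
        // This now calls the method (*url.URL).Parse, which resolves "/health"
        // relative to base -- not the package function url.Parse. It compiles,
        // but it's a different operation, and nothing warns you.
        return url.Parse("/health")
    }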

[1] If the separator wasn't the same `.` then this wouldn't be a problem, though I admit any other separator in this position is automatically uglier.

[2] Assume the type is in the same package so the middle token can be lowercase.

[3] Rust goes a bit far in the other direction by always requiring an explicit module structure, but once you have it, it's actually decoupled from file organization and privacy. That said, my overall experience has been that managing `mod` and `use` hierarchies in Rust is still far more clunky and manual than packages in Go. The good parts are fine privacy control, separate namespaces, and consistent nesting, the bad part is needing boilerplate even in what should be simple cases.

[4] I respect that Go doesn't copy other languages and does its own thing, but when a feature like importing individual symbols is common to languages as different as Python and Rust, there might be real value in solving the same problem in Go too.


This is a great analysis that gave me a lot to think about.

I've always loved the way Go's packages work. With care, I can organize large codebases that end up feeling so clean and nice to deal with.

I usually get to pick all my own package names, so the issues you bring up are manageable for me. But you explained them well, and I can imagine the frustration of inheriting public API packages that were not carefully designed.

And this one is a perennial little naming hiccup:

> Who among us hasn't wanted to shadow the name `url` or `path`?

I still prefer Go's packages to other languages' ways of code organization, but I think you convinced me that giving packages their own separator (and not sharing '.') might have been a better design. Though strangely, as you mentioned, any other separator is uglier.


> * Dot imports were designed in a way that ensures they never get used

Good, unqualified imported names are horrible for readability.


I also noticed that when I searched for "secure password generation golang", nearly all examples use math/rand. And to make things worse, all of them initialize the seed with the current time just before generating the password.

This was discovered after I found that someone had used math/rand in our code and I was curious where they copied it from :)
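
For contrast, a minimal sketch of what those examples should be doing instead, using crypto/rand (the alphabet and length here are arbitrary choices):

    package demo

    import (
        "crypto/rand"
        "math/big"
    )

    const alphabet = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"

    // securePassword draws each character uniformly from the OS CSPRNG.
    // rand.Int returns a uniform value in [0, max), so there is no modulo bias.
    func securePassword(n int) (string, error) {
        b := make([]byte, n)
        max := big.NewInt(int64(len(alphabet)))
        for i := range b {
            idx, err := rand.Int(rand.Reader, max)
            if err != nil {
                return "", err
            }
            b[i] = alphabet[idx.Int64()]
        }
        return string(b), nil
    }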


goimports has special-cased math/rand.Read vs crypto/rand.Read from basically the beginning. But https://github.com/golang/tools/commit/0835c735343e0d8e375f0... in 2016 references a time window where it could resolve "rand.Read" as "math/rand". Maybe you were in that time window?

I think this was my fault. I refactored some code using crypto/rand into a file which was already using math/rand and failed to notice that the imports were wrong and not magically fixed by goimports like they usually are!

So goimports rocks and my code review skills suck!


It would not be so tough to provide an API call with a name like "PredictableRand".

Or SemiRand, UnsafeRand, that kinda thing. A bit more explicit naming. I feel like a lot of Go's naming is down to legacy / what people are used to, which on the one hand is good because of familiarity, but on the other opens it up to the same mistakes and confusion that have already been made in the past.

The main issue was math/rand having a `Read` utility function with the same signature as crypto/rand's, compounded by packages being namespaced but the namespacing not being preserved by importing. So a call to `rand.Read()` could be cryptographic or not; you'd have to check the import to be sure.

It's not so hard to do `import notrand "math/rand"`, which helps avoid mistakes in the meantime.

I had a similar goimports issue importing the wrong package. I've added a forbidigo linter rule to fail when certain packages are imported.

This would not have happened in C#, which uses the distinct Random (and Random.Shared) as its PRNG and RandomNumberGenerator [0] as its CSPRNG, even when mixing namespaces.

Also has corresponding analyzer rule if you want to enforce this project-wide[1].

[0] https://learn.microsoft.com/en-us/dotnet/api/system.security...

[1] https://learn.microsoft.com/en-us/dotnet/fundamentals/code-a...


If I had to guess which of Random and RandomNumberGenerator was the cryptographically secure one, I would have guessed wrong. It's not clear either way.

Maybe so, but once you know you won’t need to scroll to the top of the file to know which one you are using.

(This was posted last week by spacey at https://news.ycombinator.com/item?id=40237491 as well, but that post seems to have been incorrectly buried as a duplicate of https://news.ycombinator.com/item?id=40224864. The two go.dev blog posts are two in a series but quite different: this one is about efficient secure random number generator algorithms, while the earlier one was about Go API design.)

Russell Cox consistently produces excellent technical blogs and proposals (and work). If you want to improve the clarity of your writing and thinking, he is a great place to start.

His series on FSAs and regular expressions made me fall in love with all of that. I wasn't aware of who Russ Cox was at the time either, but that series of articles was just incredible. It's probably the highest-quality content freely available about the implementation of REs. The runners-up are various compiler-focused books, but those aren't freely available or easily searchable via the web.

he makes nice video demos too https://research.swtch.com/acme

I've been using `math/rand` instead of `crypto/rand` where `crypto/rand` was absolutely needed. This resulted in static keys being used in early versions of dnscrypt-proxy2.

The reason is that I'm using the VSCode extension that automatically adds imports. In all the source files requiring secure randomness, I carefully imported `crypto/rand` manually, but I forgot to do it in one of the files. Everything compiled and worked fine, and I didn't notice that in that specific file, the extension had silently added a `math/rand` import.

Since then, I import `crypto/rand` as `cryptorand` to avoid the wrong `rand` being automatically imported.

By the way, Zig also uses a ChaCha8-based random number generator, and for cryptographic operations, people can't supply their own generator; the secure one is always used. For testing, some functions accept an explicit seed. For constrained environments, the standard library also includes a smaller one based on the Ascon permutation and the Reverie construction.


I'm not sure what happened in your case, but it probably wasn't what you describe. We changed goimports in 2016 to prefer crypto/rand over math/rand (https://go-review.googlesource.com/24847), and that was before there was VSCode support for Go.

Filippo and I dug into this a bit more, and it is possible that VSCode auto-complete (which also adds imports for the auto-completed things) is the culprit here. Apologies if that's what happened to you. We will look into fixing that.

Now that math/rand.Read is marked deprecated, at least if it does get selected, you get a nice strikethrough rendering as well.


It is probably not related, but gopls has a known issue with its heuristic for picking imports: https://github.com/golang/go/issues/61208

Someone else mentioned this same thing. Tbh the practice of automatically adding imports seems totally crazy and defeats the whole point of segregating names into different namespaces.

This is a pet theory born of no research at all, but from my fiddling around with Go I feel like this is a necessary feature for one reason:

Go will refuse to compile if you have unused imports.

When I'd be playing around trying to get things to work, the import management thing was a lifesaver: I'd delete a line of code that was only a temporary exploration and its imports would disappear, then I'd write something similar and the tool would add the imports back again. The annoyance of having to add them back constantly, or always make sure there was some dummy "use" somewhere, or manually comment them out and back in every time, would be incredible to me.

It seems like it would be a lot better to simply warn on unused imports and maybe refuse in release mode, but Go apparently thinks religious purity in one aspect is worth the price of making standard tooling remarkably lax in another?


In general there is very little overlap. crypto/rand vs math/rand don't even overlap except for rand.Read, and that was a mistake. We also corrected the import fixer in 2016 to prefer crypto/rand, so no one should have run into this problem in a very long time. https://go-review.googlesource.com/24847

It's convenient, and probably fine when imports are unambiguous.

The problem here is that there were multiple possible imports for the same name, and in such a situation, the extension picks an arbitrary one.

It feels more like a bug (or a design issue) in the extension, which can be fixed, than a conceptual issue in automatic imports.


It doesn't pick an arbitrary one. It prefers crypto/rand, and has since 2016 (https://go-review.googlesource.com/24847).

I have a distinct memory from the first Go contributor summit where I brought this up (I have no idea what we were discussing or what it was in response to) and the attitude of every other developer at the table was "yah, but we special cased this after it caused problems with crypto/rand, math/rand so goimports is fixed now and it's fine".

And then it happened again with every IDE. And with other packages that haven't been special cased yet. And with… I dunno, I never could convince anyone but it just seems like a terrible idea to me. Please just write out your imports, it doesn't take that long. :(


> Please just write out your imports, it doesn't take that long.

I have to disagree with that; add debug fmt.Println() → add fmt import. Okay, found it, so remove debug → remove fmt import. Okay so the problem was that we need to use filepath.EvalSymlinks() → add path/filepath import. Wait, that still didn't work; let's add back that debug Println()... etc. etc.

Other people may have other dev cycles, but I hugely miss it when it's not available (I sometimes do some work on VMs for cross-platform work where this is the case).

Same with gofmt really; these days much of the time I just write:

if foo==bar{fun()}else{panic("oh noes")}

And let gofmt sort it out.

Of course I can write all the spaces and whatnot but why bother?

I do agree the whole package shadow thing is a right pain.


The "write out your imports" ship sailed with modules. Nobody wants to write out "code.internal.corporate.domain/bureaucratic/hierarchy/of/orgs/foo" when they're looking for "foo". Even GitHub-hosted modules have fairly long names.

The tools need to get better though. I'd rather they fall back to asking me than guessing when the import is ambiguous. I also think they should only ever autoimport from the stdlib or what's directly referenced in go.mod.


I mean, copy/paste is a thing, I'm not suggesting you have to always type out every single character. Just know what's being imported.

I don't see a functional difference between copy-paste and autoimport. They have more or less the same risks.

Imports should rarely be manually managed directly in source. Doing that is the epitome of tedium and most imports are just boilerplate, so your eyes will glaze over and you'll miss the finer points unless a specific file warrants deeper scrutiny.

That doesn't mean imports should never be reviewed, and again, I think the tools need a lot of improvement in this regard.


I've never copy and pasted crypto/rand and gotten math/rand.

Yeah that's a legit gripe but I wasn't really talking about the standard library which uses nice short import paths.

I've often thought about why, even in the 2020s, the default random implementation in many programming languages is an LFSR, MT, or other fast RNG.

It seems to be better to err on the side of 'people don't know if they want a PRNG or a CSPRNG' and switch the default to the latter, with an explicit choice of the former for people who know what they need :)


> It seems to be better to err on the side of 'people don't know if they want a PRNG or a CSPRNG' and switch the default to the latter, with an explicit choice of the former for people who know what they need :)

That’s exactly what we did in PHP 8.2 [1] with the new object-oriented randomness API: If you as the developer don’t make an explicit choice for the random engine to use, you’ll get the CSPRNG.

Now unfortunately the hard part is convincing folks to migrate to the new API - or even from the global Mt19937 instance using mt_rand() to the CSPRNG using random_int(), which has been available since 7.0.

[1] https://www.php.net/releases/8.2/en.php#random_extension


OpenBSD had a similar problem with people calling arc4random and getting RC4 randomness, but they just changed it to use ChaCha20 anyway and backronymed it to "a replacement call for random".

https://man.openbsd.org/arc4random.3


Recently I started using a new golang library that generated random IDs for different components of a complex data structure. My own use case had tens of thousands of components, and profiling revealed that a significant chunk of the time initializing the data structure was in crypto/rand's Read(), which on my Macbook was executing system calls. Patching the library to use math/rand's Read() instead increased performance significantly.

In addition to math/rand being faster, I was worried about exhausting the system's entropy pool for no good reason: in this case, the only possible reason to have the IDs be random would be to serialize and de-serialize the data structure, then add more components later, which I had no intention of doing.

Not sure exactly how the timing of the changes mentioned in this blog compares to my experience -- possibly I was using an older version of the library, and this would make crypto/rand basically indistinguishable from math/rand, in which case, sure, why not. :-)


Do note that "exhausting the system's entropy pool" is a myth - entropy can't run out. In the bad old days, Linux kernel developers believed the myth and /dev/random would block if the kernel thought that entropy was low, but most applications (including crypto/rand) read from the non-blocking /dev/urandom instead. See https://www.2uo.de/myths-about-urandom/ for details. So don't let that stop you from using crypto/rand.Read!

Once you somehow get 256 bits of entropy, it will be enough for the lifespan of our universe.

> this would make crypto/rand basically indistinguishable from math/rand, in which case, sure, why not. :-)

It's closer to the other way around. crypto/rand was not modified in any way, its purpose is to expose the OS's randomness source, and it does that just fine.

math/rand was modified to be harder to confuse with crypto/rand (and thus used inappropriately), as well as to provide a stronger, safer randomness source by default (the default RNG source has much larger state and should be practically impossible to predict from a sample in adversarial contexts).

> I was worried about exhausting the system's entropy pool for no good reason

No good reason indeed: there's no such thing as "exhausting the system's entropy pool", it's a linux myth which even the linux kernel developers have finally abandoned.


Others have already addressed the impossibility of exhausting the system entropy pool; however, I would add that you can buffer Read() to amortize the cost of the syscall.

Also, make sure that your patch does not introduce a security vulnerability, as math/rand output is not suitable for anything security-related.
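
A sketch of that buffering, assuming a bufio.Reader is acceptable (the 4 KiB size is arbitrary):

    package demo

    import (
        "bufio"
        "crypto/rand"
        "io"
    )

    // bufferedRand pulls from the OS CSPRNG in 4 KiB chunks, so most small
    // reads cost a memcpy instead of a syscall. Not safe for concurrent use.
    var bufferedRand = bufio.NewReaderSize(rand.Reader, 4096)

    func randomID() ([]byte, error) {
        b := make([]byte, 16)
        // bufio.Reader.Read can return short reads; io.ReadFull loops as needed.
        if _, err := io.ReadFull(bufferedRand, b); err != nil {
            return nil, err
        }
        return b, nil
    }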


One of the better arguments for using a CSPRNG (here, ChaCha8) is that they benchmark it within a factor of 2 of PCG. The state is still comparatively large (64 bytes vs 16), but not nearly as bad as something like mt19937 or the old Go PRNG. (If the CSPRNG was much much slower, which is generally true for CSPRNGs other than the reduced-round ChaCha variant, it becomes a less appealing default.)

How did you get to 64 bytes of state? Last I looked, Go's ChaCha8 implementation had 300 bytes of state. Most of that was spent on a buffer which was necessary for optimal amortized performance.

That's correct - the state is 300 bytes (36 uint64 + 3 uint32). https://go.dev/src/internal/chacha8rand/chacha8.go

Fair enough. I was just thinking of base ChaCha state without the buffering. 300B is still significantly better than mt19937 (~2.5kB) or the old Go generator (4.9kB!).

I know posting "I agree" is not generally welcomed on here, but ChaCha8 is really underappreciated as an MCMC/general simulation random number generator. It's fast, it's pretty easy on cache, and it won't bias your results, modulo future cryptanalysis.

What are the other cases where you would need the former? I can only think of fixed seeding for things that need reproducible results (e.g. tests, verifications).

I think there's another little force that pushes people towards the PRNG even when they don't need seeding: the CSPRNG API always includes an error you need to handle, in case the syscall fails or you run out of entropy.

I'm curious how often crypto/rand's Read actually fails. How much randomness do I have to read to exhaust the entropy of a modern system? I've never seen it fail over billions of requests (dd does fine too). Perhaps a Must/panic-style API default makes sense for most use cases?

Edit to add: I took a look at the secrets package in python (https://docs.python.org/3/library/secrets.html) not a single mention of how it can throw. Just doesn't happen in practice?
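
Something like this hypothetical Must-style wrapper is what I have in mind (a sketch, not a real crypto/rand API):

    package demo

    import "crypto/rand"

    // mustRandom panics instead of returning an error; in practice the read
    // from the OS randomness source essentially never fails.
    func mustRandom(n int) []byte {
        b := make([]byte, n)
        if _, err := rand.Read(b); err != nil {
            panic(err)
        }
        return b
    }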


> CSPRNG api always includes an error you need to handle; in case the sys call fails

A user-side CSPRNG — which is the point of adding a ChaCha8 PRNG to math/rand — performs no syscall outside of seeding (unless it supports reseeding).

> you run out of entropy.

Running out of entropy has never been a thing except in the fevered minds of linux kernel developers.


> Running out of entropy has never been a thing except in the fevered minds of linux kernel developers.

Linux used user input and network jitter to generate random numbers, not a pure pseudo-random number generator. For a perfectly deterministic pseudo-random number generator, entropy is only required for seeding, and even then you can avoid it if you have no problem with others reproducing the same chain of numbers.


Cryptographically-secure PRNGs are also deterministic, but as long as you have at least 256 bits of unpredictable seed, the output remains unpredictable to an attacker for practically forever.

Linux used/uses user input and network jitter as the seed to a deterministic CSPRNG. It continuously mixes in more unpredictable bits so that the CSPRNG can recover if somehow the kernel's memory gets exposed to an attacker, but this is not required if the kernel's memory remains secure.

To reiterate, running out of entropy is not a thing.


The difference between “I don’t have enough entropy” and “I have enough entropy to last until the heat death of the universe” is only a small factor.

Attacks on the RNG state or entropy source are much more of a risk. The entropy required is not a function of how much randomness you need to generate but of how much time the system has been running.


Thankfully, Go is considering making crypto/rand infallible, because as it turns out the syscalls do not actually fail in practice (it's not possible to "run out of entropy"): https://github.com/golang/go/issues/66821

> the CSPRNG API always includes an error you need to handle, in case the syscall fails or you run out of entropy.

> Perhaps a Must/panic-style API makes sense?

Yes, CSPRNG APIs should be infallible.


I like the approach of "all randomness on a system should come from a CSPRNG unless you opt out". It's the stronger of the two options: you lose a small amount of perf for a much stronger guarantee that you won't use the wrong RNG and cause a disaster. It's a shame that this is still a sharp edge developers need to think about in pretty much all languages.

In 2024 that's the right answer. But the programming world does not move as quickly as it fancies it does, so decades ago when this implicit standard was being made it would be tougher, because the performance hit would be much more noticeable.

There's a lot of stuff like that still floating around. One of my favorite examples is that the *at family of file handling functions really ought to be the default (e.g., openat [1]), and the conventional functions really ought to be pushed back into a corner. The *at functions are more secure and dodge a lot of traps that the conventional functions will push you right into. But *at functions are slightly more complicated and not what everyone is used to, so instead they are the ones pushed into the background, even though the *at functions are much more suited to 2024. I'm still waiting to see a language's standard library present them as the default API and diminish (even if it is probably impossible to "remove") the standard functions. Links to any such language I'm not aware of welcome, since of course I do not know all language standard libraries.

[1]: https://linux.die.net/man/2/openat , see also all the other "See Also" at the bottom with functions that end in "at"
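
For illustration, a rough Go sketch of the *at pattern via golang.org/x/sys/unix (Linux-flavored; the directory and file names are made up):

    package demo

    import "golang.org/x/sys/unix"

    // openUnderDir pins a directory fd first; the later open is relative to
    // that fd, so a concurrently swapped path component can't redirect it.
    func openUnderDir() (int, error) {
        dirfd, err := unix.Open("/srv/data", unix.O_DIRECTORY|unix.O_PATH, 0)
        if err != nil {
            return -1, err
        }
        defer unix.Close(dirfd)
        // O_NOFOLLOW additionally refuses a symlink at the final component.
        return unix.Openat(dirfd, "report.txt", unix.O_RDONLY|unix.O_NOFOLLOW, 0)
    }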


For those unaware, gosec (and by extension golangci-lint) will warn about uses of `math/rand`

https://github.com/securego/gosec/blob/d3b2359ae29fe344f4df5...


One of my favorite things about math/rand/v2 is that I can use it at work without a nolint directive and the subsequent PR discussions.

I'm still trying to interpret the recommendations regarding security and this new v2 option. The blog post makes statements like, "For secrets, we need something different." and then goes into detail about cryptographic randomness, ChaCha8, and how it is seeded with system randomness. It gives the impression of being very "secure". But then the package docs state:

>... but it should not be used for security-sensitive work ... This package's outputs might be easily predictable regardless of how it's seeded. For random numbers suitable for security-sensitive work, see the crypto/rand package.

If that's the case, then why hint at using math/rand/v2 "for secrets" in the blog post? Is the short version that we should all still use "crypto/rand" for anything sensitive, and all of the improvements described here are a safety net should someone inappropriately use math/rand/v2?


Correct. math/rand/v2 isn't optimal, but it's no longer an immediate catastrophic flaw to use it where crypto/rand was needed. From the article:

> It’s still better to use crypto/rand, because the operating system kernel can do a better job keeping the random values secret from various kinds of prying eyes, the kernel is continually adding new entropy to its generator, and the kernel has had more scrutiny. But accidentally using math/rand is no longer a security catastrophe.


Even in the worst-case benchmark, the new generator is only about half as fast as an insecure random number generator, and most benchmarks were much closer.

Go is striking the right balance between safety and performance for the standard library (and for the apps built on top of it). Hopefully other ecosystems follow suit.

If an application needs fast insecure random numbers, it should implement an app-internal generator itself. Having insecure randomness within easy reach is a footgun we can put away.
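
In Go, the explicit opt-in could look like this math/rand/v2 sketch (the seeds are placeholders):

    package main

    import "math/rand/v2"

    func main() {
        // Explicitly construct a fast, seeded, insecure source (PCG here), so
        // only code that asks for predictability by name gets it.
        r := rand.New(rand.NewPCG(12345, 67890))
        _ = r.IntN(100)
    }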


Except this honestly seems worse.

Encouraging people to assume the "random" primitive is cryptographically secure is just encouraging bad practice. Making math/rand/v2 cryptographically secure might solve a problem, but it's now making something which doesn't look like it promises security "okay".

`math/rand` functions in general though do not have the convention of being cryptographically secure - changing them so they are is just making a change to make bad code do the right thing, potentially masking the fact that if we're making that obvious mistake, what others are we also making?


The article explains this rationale very clearly:

> Using Go 1.20, that mistake is a serious security problem that merits a detailed investigation to understand the damage. Where were the keys used? How were the keys exposed? Were other random outputs exposed that might allow an attacker to derive the keys? And so on. Using Go 1.22, that mistake is just a mistake.

It is not masking a mistake - you should be using a CSPRNG in security-sensitive contexts, but if you screw up, the collateral damage is much lower. You should be detecting this mistake with static analysis and code audits rather than by observing the random number generator in prod.


Right, but that's the point. There are genuine and useful uses for predictable random number generators in a lot of contexts. That is, conventionally, what I'd reach to math/rand (or its equivalent in other languages) for.

Whereas `crypto/rand` is very obviously "the CSPRNG function".

I understand the motivation, but if we're trying to find bad code why not go further and just require people to put //go:I AM NOT DOING CRYPTOGRAPHY at the top of files which use math/rand or something?

It's unloading the gun but not worrying that people are still pointing it around and clicking the trigger.


I think the authors have already admitted that introducing two rand packages was a mistake, so now they're just correcting (most) programs automatically so that the existing packages become more secure, and raising awareness that math/rand should no longer be used. I think it's the best they can do in this situation.

They're also choosing to make math/rand/v2 use the cryptographic generator.

I think what you're saying is that the problem which was once easily diagnosable from an incorrect import may now become much more difficult to diagnose from an incorrect interface implementation. I think that is somewhat valid in a technical sense but a lot less likely to occur in practice, because most people will use the package-level functions which are guaranteed to use the CSPRNG.

Go 1's math/rand would more accurately be called an additive lagged Fibonacci generator. The first publication of it is due to Green, Smith, and Klem [1].

[1] https://doi.org/10.1145/320998.321006


That publication doesn't seem to mention the "lagged" part, or maybe I missed it. I am aware of https://www.leviathansecurity.com/blog/attacking-gos-lagged-... which also refers to it as a lagged Fibonacci generator.

Rob Pike and I exchanged mail with Don Mitchell (who wrote the original C version of the Go 1 generator) a few months back to see how he would describe the algorithm, and he said "As I recall Jim and I implemented Marsaglia's LFSR-like generator."

I think both descriptions (lagged Fibonacci and LFSR-like) are accurate in different ways, so either would be fine, but for the post I decided to use the original author's description.


The name itself might be due to Knuth; they were initially known as additive generators in other early literature.

Actually it wasn't Knuth; only the 1997 3rd edition contains the lagged Fibonacci name. The first instance of the name I can find is Marsaglia-Tsay in 1985 [1] (and possibly Marsaglia's 1984 "A current view of random number generators", which is impossible to find online).

[1] https://doi.org/10.1016/0024-3795(85)90192-2


Great article!

One nit.

I think statistical randomness is being conflated with pseudo-random number generators (PRNGs) here.

The wiki definition of statistical randomness is: "A numeric sequence is said to be statistically random when it contains no recognizable patterns or regularities"

Does this apply to true random number generators (TRNGs)?

You'd better hope it does, at least in the long run or "in the limit". Otherwise, it is not a TRNG.

A TRNG should generate, in the long run, "a number sequence containing no recognizable patterns or regularities".

So, I think we can say: statistical randomness does not imply a PRNG; it can apply to TRNGs as well.

I think the issue comes from the fact that there are a large number of statistical randomness TESTs for PRNGs, used to qualify PRNGs with a qualified form of statistical randomness.

So, I think "pseudo-random number generators" would have been more appropriate than "statistical randomness" to identify PRNGs.

But, it's really a small nit.

Again, great article!


> Go 1.22 makes your programs more secure without any code changes. We did this by identifying the common mistake of accidentally using math/rand instead of crypto/rand and then strengthening math/rand. This is one small step in Go’s ongoing journey to keep programs safe by default.

This is such a developer-friendly take especially for all of us who have had unfortunate run-ins with java.util.Random


> a lightly modified version of Daniel J. Bernstein’s ChaCha stream cipher. ChaCha is widely used in a 20-round form called ChaCha20, including in TLS and SSH. Jean-Philippe Aumasson’s paper “Too Much Crypto” argues persuasively that the 8-round form ChaCha8 is secure too (and it’s roughly 2.5X faster)

Call me paranoid, but my mind immediately jumps to the question of whether this paper can be trusted or whether it was planted by a TLA to intentionally weaken crypto.

I don't know Jean-Philippe, or much about them, but they seem to be both an experienced cryptographer and someone who has founded a company that is close to many government-adjacent organisations (UNHCR, banks, payment services, defence contractors—and that's just from the home page [0]) and therefore could easily have been exposed to persuasive state actors.

Does anyone know more about the security of the 8-round form and whether we should be concerned?

[0] https://www.taurushq.com/


The cited paper[0] only increases my concern:

> “But what if your adversary is NSA or Mossad? Won’t they have the computing capabilities to run a 2^80 attack?” Such a question is irrelevant. If your problem is to protect against such adversaries, the answer is probably not cryptography.

Handwaving away better cryptographic security on the basis that they'll probably get what they want some other way does not work for me. This is likely and often true, but those other methods may be more expensive, be unusable without revealing their hand, or be politically or diplomatically sensitive.

We should not give up on our security being as resistant as possible to these agencies on such a basis.

[0] https://eprint.iacr.org/2019/1492.pdf


This is a fair concern.

> Does anyone know more about the security of the 8-round form and whether we should be concerned?

This is the latest cryptanalysis I could find (see Table 2 and 3 for an overview):

https://ieeexplore.ieee.org/document/10410840

We don't even have an attack against ChaCha8. While it is likely one will appear as cryptanalysis improves, it is far less likely such an attack will ever become practical.

But obviously, not everyone from within the cryptographic community would agree with JP Aumasson either. For example, DJB had this to say 1 year and 5 months before "Too Much Crypto" first appeared on the IACR ePrint archive: https://twitter.com/hashbreaker/status/1023969586696388613.

So in conclusion; somewhat inconclusive? Going by the results so far, ChaCha8 is probably fine.


> Call me paranoid

You're being too paranoid. If you have a substantive disagreement with the content of the "Too Much Crypto" paper then we can talk about it, but to posit that Aumasson was compromised by a TLA (with no evidence) and that this paper is the result is pure conspiracy thinking.

Aumasson designed BLAKE[0], as well SipHash[1] and SPHINCS+[2](both of which he designed with DJB, btw).

[0]: https://www.blake2.net/#co
[1]: https://en.wikipedia.org/wiki/SipHash
[2]: https://sphincs.org/


> but to posit that Aumasson was compromised by a TLA(with no evidence) and that this paper is the result is pure conspiracy thinking.

Except we have some evidence that the NSA has compromised processes in exactly this way before. The OP was just asking a question and suggesting a likely and known mechanism for perfidy; he didn't actually posit that it was true.


Correct. I have no reason to believe Aumasson was compromised, but it’s certainly happened before that people in similar positions have been.

Regardless of whether there’s a third party with an ulterior motive or it’s (more likely) simply the author’s genuine opinion, the paper “Too Much Crypto” seems ok with limiting the security of cryptography to levels that may not be secure against the most advanced and well-resourced adversaries:

> “But what if your adversary is NSA or Mossad? Won’t they have the computing capabilities to run a 2^80 attack?” Such a question is irrelevant. If your problem is to protect against such adversaries, the answer is probably not cryptography.”

You may agree with that, too. But it’s quite an opinionated stance and one that I’d expect to see clearly signposted and explained in API docs, and for the more expensive and secure alternative to also be available.


There comes a point when "just asking questions" crosses a line into conspiratorial theory crafting. I can ask all kinds of crazy questions, like: "what if the world is run by a species of lizard people who live underground?". In the absence of evidence, it's pointless to debate such things. In general, people should be allergic to thinking in this way. That's not to say that conspiracies never happen, they do. However, it's on you to substantiate your insane claims, it isn't on your interlocutors to prove that they aren't true.

Complete novice question: Would it be possible to build a language that could read your source code and tell you, at compile time, which expressions have values that depend on RNG? And on cryptographically secure RNG?

So that you could annotate a variable as needing to be cryptographically secure and the language could check that, somewhere along the way, its value depends on an adequate RNG function?


Why is ChaCha8 used instead of a HW-accelerated AES cipher (e.g. AES-GCM) when that's available? Also, AES-GCM only requires 12 bytes of random IV vs ChaCha8's 32 bytes, not that that actually matters.

It's a good question. We probably could have designed something based on AES-GCM instead, but it would have had more limited impact.

ChaCha8 is still very fast even without direct hardware acceleration. The 32-bit benchmarks at the end of the post are running with no assembly at all and still running within 2X of the 64-bit SSE2-based assembly. AES-GCM with hardware is pretty fast, but AES-GCM without hardware is quite slow.

Just now I tried benchmarking ChaCha8 in 256-byte chunks compared to AES-GCM in 256-byte chunks. With HW acceleration, AES-GCM is maybe 10% faster on my Apple M3 but 20% slower on my AMD Ryzen. Same ballpark as ChaCha8 though.

On the other hand, if I disable AES hardware acceleration, that same benchmark drops by about 20X. So using AES would not have been a good idea for systems without AES hardware.

Overall, not much win to AES in the best case, and quite a loss in the worst case.


I meant that you could use the AES branch when running on systems with HW-accelerated AES and ChaCha8 otherwise. Given that the security properties of AES are better understood than ChaCha8's, any issues with ChaCha8 would have more limited scope. And since this is a cryptographic RNG, the specific implementation doesn't actually matter. The math variant would probably need to use the ChaCha8 implementation, since it can have reproducibility requirements for a given seed, although it's arguable whether that reproducibility needs to hold between totally different machines, since math/rand isn't actually defined to have that property & you're already changing it in 1.22, which indicates it's mutable.

I'm kind of surprised that it's slower on AMD Ryzen - it looks like only the Pro series has an actual co-processor. Weird decision on AMD's part to implement AES-NI without HW acceleration on some CPUs instead of just not implementing the AES-NI instruction set. That being said, AES-CBC would be even better for this purpose since the authentication guarantees aren't needed.

On my Intel machine, it's 5.7 GiB/s for AES-GCM. I don't know how you benchmarked the chacha8 version so I can't run the equivalent on my machine.


For benchmarking ChaCha8, I ran:

    go test -bench=Block internal/chacha8rand
For benchmarking AES-GCM, I edited src/crypto/cipher/benchmark_test.go:51 to add 256 to the length list, and then I ran:

    go test -bench=GCM/-128-256 crypto/cipher
    GODEBUG=cpu.aes=off go test -bench=GCM/-128-256 crypto/cipher
You're right that we could use AES where available in the places where reproducibility doesn't matter, although that's a second implementation to debug and maintain. ChaCha8 seems fine.

> I'm kind of surprised that it's slower on AMD Ryzen - it looks like only the Pro series have a an actual co-processor. Weird decision on AMD's part to implement AES-NI without HW acceleration on some CPUs instead of just not implementing the AES-NI instruction set.

I meant that AES-GCM is 20% slower than ChaCha8 on that system, not that HW-accelerated AES-GCM is 20% slower than a software implementation. On the contrary, the HW-accelerated AES-GCM is 20X faster than software on that system.


Related - a week or so ago I was playing with ASCON [1], a sponge-based cipher and hash function aimed at embedded systems. A passing thought was that it might be handy to use as a random number generator. When I read this post a couple days ago, out of curiosity I picked up the ASCON permutation and benchmarked it against this one.

It was unfortunately a bit slower: ~27 ns per 64-bit value (6-round permutation) vs ~4 ns for the included ChaCha8. I suspect it could be optimized, and run at the higher output rate (8 rounds per 128-bit output). One nice thing is that it does have a smaller state of only 40 bytes.

But - for the performance, this ChaCha8 implementation is awesome!

[1] https://ascon.iaik.tugraz.at


I recently updated my password generator module [1] to use math/rand/v2 with the ChaCha8 source. The performance improvement was quite nice and the security stays the same or better, given that I now rely on Go's algorithm for an unbiased random number instead of my own.

All in all, I leave my thanks for the excellent work done by the Go team here!

[1]: https://sr.ht/~jamesponddotco/acopw-go/
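
For anyone curious, wiring up the ChaCha8 source in math/rand/v2 looks roughly like this (a sketch, with the seed drawn from crypto/rand):

    package main

    import (
        cryptorand "crypto/rand"
        "math/rand/v2"
    )

    func main() {
        var seed [32]byte
        if _, err := cryptorand.Read(seed[:]); err != nil {
            panic(err) // seeding failure: nothing sensible to do
        }
        r := rand.New(rand.NewChaCha8(seed))
        _ = r.IntN(62) // unbiased draw in [0, 62), e.g. an index into an alphabet
    }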


This looks like a good direction.

I'm furious at C++ for removing the easy-to-use random_shuffle with no easy-to-use replacement.

I've seen several programs use C++'s new random generators wrong, ending up broken in the process of removing random_shuffle.



