> This is all fascinating, and I'm not totally dismissing that something is happening here.
Qualcomm devices have a hardware random device which, coupled with the qrngd daemon, feeds the kernel entropy pool. In normal operation of Android, I never see this pool actually get low unless running the qrngtest tool, in which case qrngd just fills it right back up.
Looking in drivers/char/random.c though, the functions called in the interrupt, input, and block device paths share an inner function, add_timer_randomness, which calls preempt_disable().
As a totally non-scientific test, I turned all of these functions into no-ops and recompiled. This way, we're ONLY relying on the hardware RNG.
There's no change in entropy available because random numbers really just aren't being used all that often.
But now, I'm seeing a modest increase in interactivity on the device. Certain things feel smoother, and there is less UI jank. There's no change in frequency scaling or power usage as proposed earlier. qrngtest passes just fine as well.
What's going on here? I'm not entirely sure. We're either all crazy, or this is tickling a subtle scheduling bug in the kernel. More investigation is needed. 
Nice to see somewhat knowledgeable people commenting on this issue and actually testing things out. It's very easy to say "meh, you haven't done a scientific test so it's probably placebo", but remember that you're being equally un-scientific in such an analysis.
When so many users are reporting improvements you (aka. Google) owe them to get to the bottom of it.
I also find it surprising that nearly no-one posts information about what phone they tested it on. Android ships in so many variants that it wouldn't surprise me if it only works in some of them.
>It's very easy to say "meh, you haven't done a scientific test so it's probably placebo", but remember that you're being equally un-scientific in such an analysis.
No, dismissing unblinded observations of phenomena the perceptions of which are notoriously suggestible (like UI responsiveness) is not "equally un-scientific" as making unblinded observations of phenomena the perceptions of which are notoriously suggestible.
Pointing out that such observations are unreliable is, in fact, a simple application of science. Grandstanding tu quoque arguments are not.
I am running it on Nexus S running ICS and there is a definite improvement, in launching apps and getting back to home. I think the improvement is mostly seen on older devices, newer devices are already fast enough.
Ah, this is interesting, since he's probably one of the few people outside of Google with the credentials to actually comment on something like this. It still seems hard to quantify "UI jank" but I'm glad he's looking into it. I had dismissed it as placebo yesterday but if he thinks it's worth looking into then maybe it is.
"i don't know why you think this is anything to do with dalvik (or libcore). neither touches /dev/random. java.util.Random is just a PRNG seeded from the clock. java.util.SecureRandom uses /dev/urandom.
"what version of Android are you reporting this against? i think there was a bug once upon a time. <checks> oh, yeah, <= gingerbread was broken. > gingerbread should be fine."
I hate to say it, but this kind of bad analysis is just endemic in the Android modding world. There's no technical leadership at the top (the AOSP people are, for the most part, silent except occasionally in bug reports like this), so people just throw whatever hack they want onto forums filled with even-less-clueful folks and things go viral.
If you tried this on a "real" Linux project you'd be laughed at. Where's the test showing latency improvement? Hell, there isn't even any identification of any code at all that reads from /dev/random (and of course it appears there is none).
It seems a little unfair, but frankly I blame Google for this mess. They refuse to run AOSP as a "project", which leaves the AOSP-derived projects a big soup of developers with no clear meritocracy. Some bits (Cyanogenmod) are really quite good. But the stuff on XDA is probably 60% useless junk.
Only a few people inside Google are allowed to see it. Their colleagues aren't allowed to sit in the same cafeteria as the Android developers, much less see the source code, until it's already done.
If it was actually Open Source, XDA developers would actually have real, original development, and features like lock screen widgets, officially added in JB 4.2 in 2012, would have been completed by the community in 2009.
Only old, already released versions of Android are Open Source. This makes it useless for real developers - who would want to fix something that might already be a non-issue as of six months ago?
As a result, the only 'community development' in Android is people who know how to tweak settings files and make ROMs with particular apps and settings built in.
All of this is true. But it's not the choice between two extremes. Google doesn't have to wall off community development in order to develop new features in secret. There's no reason they couldn't curate an AOSP project where submissions like this one (although obviously not this particular one!) could be made in a way that was truly useful to the community. It would cost money and time, and maybe they don't have that. But it's something that other FOSS players do, so it's not an impossibility.
Cyanogenmod doesn't have access to the current Android source. The current test release, 10.1, is based on Android 4.2; there is no version based on the actual current Android source code, as that is not available to anyone who isn't a Google employee on the Android team.
And? I mean, ok, we get it, the most recently committed code is not open yet, probably because Google would prefer its flagship phones not be spoiled. So what? My phone runs an OS whose code is on github, and I install new versions when I choose to do so, from nightly builds.
And all the drawbacks I cited in the original post, i.e., nobody outside the Android team bothers contributing anything, and the project has fewer developers and fewer features than it would if it were truly open.
Why would new phones be 'spoiled' if their source code was released?
> nobody outside the Android team bothers contributing anything
Do you have any figures to back this claim? Somewhere above or below a guy said he did send a patch.
> fewer developers and fewer features than it would if it were truly open
Very unlikely. Were it fully open (say, the Linux-way), it would not be pushed forward so steadily by Google, it would not have so many high-end apps (many written by Google), it would not have such a big piece of the market, it would not be the solution of choice for manufacturers, etc.
I also wish Android to be more open, and I hope we will see soon emerge a "Landroid" that will be to Android what Linux is to Unix, and probably Cyanogenmod is paving its path, but this is no reason to spit in the soup: Android is a giant step forward and without it we would be all sandwiched in a MS-Apple cross-fight.
> Why would new phones be 'spoiled' if their source code was released?
Suppose the most recent feature developed by Google on Android reveals that their next phone will have two-faced screens, wouldn't you think Google might prefer to decide when and how this new phone will be announced? As a result, we have a slightly lagging opening of the source code. It is sad, but better than nothing.
It is definitely incorrect to say "nobody outside the Android team bothers contributing anything": you can check the Android AOSP patch review website to see otherwise. However, I think it is a fair assessment to say that doing so is incredibly discouraging; while I don't have "numbers" to back it up, I definitely have tons of anecdotes on this.
On my own side, I can say this: I actually have code right now in Android; however, there is probably no chance that I will ever bother submitting another patch again.
What, in my experience, happens is that you submit a patch, and one person looks at it, thinks it is great, and gives it a +1. However, he can't merge it until someone agrees. Only, there isn't enough involvement on Google's end to get two people to agree. It isn't like someone else disagrees: you just don't get two people's opinions on the patch.
That is a patch that I submitted in January of 2009. Apparently, in May of 2010, a second person agreed, and the patch was merged. Yes: 16 months later. (There were three other patches I submitted in November of 2008 that were merged in August of 2009, but that was a "much more reasonable" 9 month gap, so... yeah ;P.)
With this patch, that same second guy had "disagreed"; well, he wanted justification for why I wanted the feature. Two months later, the patch died, as I had long moved on. Now, had that question come, I dunno, a year earlier? (So, a mere 4 months after I submitted it), maybe I'd have remembered whether or not it was required for my use case. ;P
This patch wasn't even mine, but somehow I ended up in the system as "maybe he'll be second reviewer this time!" (and no, I have no clue how or why or what the rules are).
Only, this was the most depressing of all: when I reviewed the patch, I determined that the original code actually had a buffer overflow in it... this was an important patch.
The patch, of course, had already been in the system for a while: it was submitted in April, but it wasn't until nearly two months later that I got poked to review it. But, I don't think I actually could be "second reviewer", so there just ended up being a bunch of +1s from random people until yet another month went by; it finally got merged.
Another half of the problem is the same thing people here are complaining about: that AOSP is really just "an occasionally merged external branch" to "the real code".
The result is that when you are working on a patch, you actually can't know whether you are working on code that even exists anymore in the upstream branch. Maybe the code exists, but doesn't have the bug anymore; or maybe it does, but the implementation is sufficiently different that your code no longer merges against it.
What this means is that even if someone reviewed your patch instantly, it already might not apply anymore, and if it doesn't it isn't like they can even ask you to fix it.
It used to be that Google was promising that they'd merge the internal and AOSP branches, so they'd be working in public on nearly the same codebase (closer to Chromium). However, that promise at some point dissolved so greatly that they simply closed the source for something like eight months, not even attempting to merge it.
Why? As far as I can tell (and I've stared at this a lot), it was to hobble Amazon slightly, who had just announced that they were working on a tablet based on Android. Not just "an Android tablet", but "we are going to use all of that valuable work you did, but not pay you anything for it because we don't need your first-party Google apps".
The code didn't become open again until Amazon started shipping their tablets to users. At which point, the story with the open source codebase changed a lot. Now, the promise is more "we will make certain our preferred partners get access to the codebase quickly", but otherwise the code doesn't get dropped until the product ships.
Due to this, when we received our Android 4.0 test devices at Google I/O, and I soon thereafter found a bug in libdl (broken shared object initializers), I was largely screwed. I seriously wasted a week (successfully!) figuring out what caused the bug and reverse-engineering what they changed using a disassembler, so I could file a bug.
When I managed to get the bug filed to the mailing list was about when the code finally dropped, so maybe that week was just wasted, but maybe I needed that time to find it anyway. Of course, Android 4.0 shipped with that bug. The issue was fixed in Android 4.0.1, and "luckily" almost no one upgraded to Android 4.0 until Android 4.0.2, but... frowny. :(
I don't think the complaint is overstated. The issue is the usefulness of community patches and contributions. Opening only released code makes those kinds of community involvement mostly impractical.
It isn't as if there isn't a model for community contributed patches for Google products: The Chromium projects take community patches and have a code review workflow for them. I don't know how effective this is, or how many end up in Chrome and ChromeOS. But it's not as if opening up more is not do-able.
> If it was actually Open Source, XDA developers would actually have real, original development, and features like lock screen widgets, officially added in JB 4.2 this year, would have been completed by the community in 2009.
I agree in general that if AOSP was an actual open source project things would be better but I don't think XDA would be any different. It's been full of end users for years now and most of them are easily fooled by stuff like this. CyanogenMod is developed in the open and they do have quite a few contributors, but it's not terribly ground breaking (although some of their features do get reimplemented in AOSP eventually). The fact is that the Android project is really a gigantic codebase and it takes a lot of time to get started (and even on a really fast machine, 20-40 minutes to build), and a lot of the work CyanogenMod developers do is porting to hardware that's unsupported by AOSP, not feature development.
> If you tried this on a "real" Linux project you'd be laughed at.
Not really, the Ubuntu forums are an identical cesspool. There are just too many users for developers to interact with, so they hide in places like IRC and mailing lists that drive off most of their non-technical users.
Not my point at all. If AOSP worked like, say, the kernel (or KDE, or gcc, or any other big project) this "fix" would have been submitted to and vetted by a functioning group of experts who would have immediately said "Dude, no." and it would have died silently.
No such thing exists with AOSP. Google does their development internally. External patch submissions (if acknowledged at all) end up appearing in a usable release only months after the fact.
So the poor excited hacker here (who, let's be clear, really didn't do anything wrong other than, well, being wrong) had nowhere to post a fix somewhere where it would be useful to people. So s/he threw it up on XDA with all the other junk, and we're now having this discussion.
Do "real" open source projects issue releases after every patch?
I've recently submitted a small patch to Android and it showed up in the repository immediately. While there is some development done in secret, not everything is. You can git blame and email the developers and they will probably talk to you.
(My experience is perhaps different because I have a @google.com email address, but the developers are actually nice people that do value open source, as far as I can tell.)
Maybe I'm more forgiving of the cargo cult tweakers than I should be, because that was one of the steps on the path to where I am now. Maybe my total understanding of virtual memory and other concepts would be more theoretically pure if I'd just ignored computers until learning the theory in school, but I think that, maybe, my hard-won experience makes me a more "empathetic" developer.
Yeah, users' reports of its effectiveness seem to vary wildly.
My suspicion is that it depends heavily on the CPU governor a device uses. Some will see that process running once a second, and keep the CPU at or near max frequency 100% of the time. Others will back down more quickly and you won't see much of a benefit, and still others were probably running at max frequency even before the change, so it wouldn't have much of an effect on them either.
There are in-kernel consumers of entropy (such as for aslr) that wouldn't show up as /dev/random opens. Reading urandom will deplete the available entropy, but won't block - however, anything relying on actual entropy availability will then block until entropy is repopulated. So if entropy_avail is ever hitting 0, there's some vague plausibility that you could have some slowdowns. But honestly, it doesn't seem likely.
It appears to be wrong that there's anything that reads /dev/random, yes. It is still correct that there are reports of latency reductions across the board. Whether or not they are real or placebo, or if they are real but have other causes (such as reboots) is another matter.
But it's too easy to just dismiss this. I tried it on one of my tablets too, and didn't reboot, and subjectively it feels substantially snappier - previously the UI would often freeze up long enough for me to get ANR dialogues, and the recommended solution for that tablet model is a full factory reset when it starts being persistently slow.
That does not mean that I'm sure the effect is real. Nor does it mean that it has to have anything to do with /dev/random.
I am extremely, extremely dubious of these claims. The app linked runs every second; it is far more likely that this constantly running app is keeping the CPU speed artificially high, probably at a detriment to battery life.
There doesn't appear to be any actual benchmark evidence to back up what's going on- just that things appear faster. Sometimes I really feel sorry for people that administrate these bug boards. "I have no evidence to back up what I'm saying other than gut feeling but why won't you listen to us, Google?!?"
I've been running it for a few hours now and didn't notice any significant difference on battery life, but I'll keep watching out for it. I'm now reading through the xda thread to maybe get some insight into how this was "determined"...
Looking at the patch, they just modified rngd to populate /dev/random from /dev/urandom instead of /dev/hwrng.
/dev/urandom in turn just uses the kernel's internal entropy pool, which is the same one /dev/random uses, and a PRNG to stretch the entropy when the raw entropy available is depleted.
My fear here is that the kernel's internal entropy pool is just constantly being replaced by the PRNG output from /dev/urandom, which is then being used by /dev/urandom to reseed its PRNG output, causing essentially just a cascade of a PRNG reseeding itself. In turn, /dev/urandom's PRNG was seeded by some initial entropy condition that may have been less than optimal, especially if it was just whatever default was there soon after boot time.
If /dev/hwrng was designed to be a primary source, or worse the sole source, of good entropy, then this patch may have eliminated good entropy.
Anyone with kernel-level knowledge of the Linux entropy pool care to weigh in? I don't know as much as I wish I did.
random and urandom are more or less the same... /dev/random is still a PRNG, using the same algorithms as urandom, it just maintains a record of some desirable degree of estimated entropy and just blocks until it's satisfied.
They also use separate entropy buffers, called the blocking and non-blocking pools, respectively. The latter doesn't directly seed itself from the former like you might expect.
Both use the SHA1 hash function on their buffers, presumably to prevent any practical leakage of raw data from the pool to the outside world, and both then mix this hash back in to their pool before outputting just half of it to the user.
According to https://lwn.net/Articles/525459/ filling the entropy pool with /dev/urandom is a bad idea. Of course, checking my Nexus S, /dev/hwrng just plain doesn't exist; I assume that means the kernel would normally get entropy from other sources at a slower rate.
In order to exercise any control whatsoever over radio noise, the attacker would have to have exceptional control over your surrounding environment. And if you're sampling at a high enough frequency, the attacker isn't going to be able to control the noise unless broadcasting it very near you. And even under those circumstances, there will always be some outside noise that they couldn't eliminate.
/dev/random is not "raw entropy" it goes through the exact same mixing and (SHA-1) compression stages that /dev/urandom does. It just happens to block if there's not much entropy in the pool at the start.
I understand that, my words were not precise. My point in phrasing it that way was that there is no entropy stretching. (You only get to read (about) as much entropy as has been gathered. You don't have to worry about reading 128 bytes and having them be associated with the previously read 128 bytes, because the pools that generated them are distinct.)
Linux /dev/random is based on an old design following obsolete recommendations. A modern, properly implemented cryptographically-secure RNG doesn't need to worry about entropy exhaustion after a good initial seeding.
On the contrary, there have been real security problems seen when apps unpredictably get nothing back from the RNG. Exhaustion can also be provoked by a remote attacker.
Good quality PRNGs are not that new. But for good reasons, cryptographically sensitive applications do not want to be in a position of relying purely on PRNG stretched output. Reading from urandom, you have no idea how much entropy was used to seed your output. Was it 5 bytes from last week, or 4K from 5 seconds ago? You don't know.
The PRNG may be good enough to keep successive reads from being statistically correlated and such, but that is only one concern. Crypto, random number generation, and the practical implementations thereof is much more complicated than just the quality of the primitives. Sometimes you want to know for sure how much entropy is devoted to your read. In such a case, you use random.
But for good reasons, cryptographically sensitive applications do not want to be in a position of relying purely on PRNG stretched output.
What I'm saying is: if you are worried about the quality of your PRNG or the size of its pool or the secrecy of its content then you ought to choose a PRNG you can be confident in. Saying "I don't trust my PRNGs output for more than a short time" is equivalent to saying you believe your PRNG is defective (or you don't believe in one way functions in general).
Was it 5 bytes from last week, or 4K from 5 seconds ago? You don't know.
If your kernel takes a week to scavenge 5 bytes unknown to an attacker, it's defective. You only need 100-200 bits, and the first seeding is the important one. This amount shouldn't take more than a few seconds to accumulate on all but the most quiet deterministic embedded systems.
Sometimes you want to know for sure how much entropy is devoted to your read.
Entropy is what the attacker doesn't know. The kernel can only try to estimate it based on wild-ass guesses about the properties of the attacker. Decrementing this estimate as RNG numbers are handed out is unjustifiable.
If the effect is real, it may just be that since it's precalculating numbers for /dev/urandom (which is possibly slow for the quantity of data needed, particularly on a slow ARM), that means that the PRNG is run ahead-of-time instead of on-demand. A better solution might be to keep a pool of lower-quality bits for /dev/urandom to fall back on when the entropy pool is low.
The reason why this drains entropy is because when you start a new process (for example "cat" in this case), the kernel passes 16 bytes of randomness at exec time for the benefit of userspace. This allows for fast stack randomization, without the overhead of explicitly opening and reading from /dev/urandom.
Due to the internal kernel interface which gets used, this draws from the /dev/random pool, thus decreasing entropy_avail, although it is not a blocking call. There has been some discussions that the userspace randomization should use a CRNG, or perhaps draw from the urandom pool instead.
> and every scientific mechanism shows it can't possibly speed anything up.
Does not follow from this:
> nothing is opening and using /dev/random continuously, as shown by using inotify to watch it.
The finding that nothing is opening and using /dev/random rules out the explanation the author of the app gave. It does not in any way prove that the app has no effect, even if it might be for totally different reasons.
Several people appear to be looking at ways of doing some proper tests to determine if there is a real measurable effect, so we'll presumably find out soon enough.
It is also possible that this has nothing to do with blocking but rather with processor resource depletion. The entropy-generating processes being triggered to run at just the wrong time, when the device is most likely busy, instead of being triggered when it is idle might cause slowdowns.
This thread and patch are silly, but running out of entropy can be a real performance problem sometimes. I've seen it on embedded Linux systems that I've set up to be headless servers. After turning off lots of device drivers, I noticed that SSH was incredibly slow sometimes. The problem was a lack of entropy, causing reads to /dev/random to block. Like in the article, the quick hack is to redirect /dev/urandom to /dev/random, but that isn't a real fix because of the security implications. The right fix was digging through the available devices and figuring out where entropy could be legitimately captured.
One explanation is that this application wakes up every second which keeps the CPU and GPU from sleeping or lowering frequency, which would reduce the delay introduced by whatever CPU/GPU governors are being used.
Correct me if I'm wrong, but couldn't it just be that people are seeing improvements because running this every second makes memory-heavy apps in the background leave RAM? It seems, just looking at the comments here, that symlinkers have had no good effects, while those installing the app have.
Besides the obvious thing of showing a process blocking on a read() from a source of entropy, you could also change the minor number on /dev/random to temporarily make it the same as /dev/urandom. No code needs to be re-compiled or added to the system. Just make everything restart (so it grabs the right /dev entry with the new minor number) and see if it changes.
That should tell you if it's actually a lack of entropy, or another process undoing the CPU governor, or something else entirely.
I went ahead, opened a root shell on my Gingerbread phone, and symlinked random to urandom.
No noticeable difference whatsoever.
I agree that the "speedups" reported by some users are either (1) because they had to reboot their phones, or (2) because rngd prevents the system from going into deeper sleep states which are costlier to recover from. There must be a very small number of users who truly run specific apps that drain the entropy pool, on Gingerbread, and who are helped by this hack.
This solution might not help, because urandom depletes the pool's entropy anyway when there is some available.
It might be that it is not the blocking on urandom calls that is the problem but rather depleting the entropy which triggers processes to generate more of it at just the wrong time when processors are busy rendering things.
If the entropy would be pre-generated when the device is idle then there would be more power available for rendering later.
I've been trying this on UnSense 2.1 and 3.0 for the Thunderbolt. The APK doesn't seem to do much even after enabled and applied on boot. The zip seems to work once flashed in 4ext, but then again it could also be the placebo effect. I don't know if it's an all-around improvement or if it's just the Windows XP effect: "show the desktop before Windows is finished loading to give the illusion of a faster boot time".
I have no low level experience with android but we have fought this issue on our analytical servers. One culprit was java code (hadoop) that read small amounts of entropy with buffered io essentially filling and discarding a large unneeded buffer every time it needed a small unique id. It caused all sorts of weird lags especially with ssh and was a pain to debug.
There is at least one actual android engineer who has responded to about 10 comments in the bug report.
At this point, the bug report is mostly filled with people saying "i think my phone goes faster, fix this bug", which is non-helpful.
It just occurred to me, if an entire mobile OS and its apps were written in a language like Go, this sort of bug could conceivably never happen, because with a few coding conventions, nothing would ever block.
That isn't a fix. The problem isn't blocking as such but rather latency. If for example a screen refresh and reading from 10 different network connections were all done in one thread (non-blocking) it could still take some time before the screen refresh code gets to run.
60fps gives you 16ms per frame. Correctly written Android apps (and iOS for that matter) are expected to do work on secondary threads, and keep UI only code in the UI thread. Of course many applications don't quite stick to that.
In the Android developer settings you can make it whine about work being done in the UI thread (eg storage/network activity - StrictMode). You can also make it flash a red border whenever the UI thread blocks. In Android 4.0 I used to see a lot of red border from Google's own apps. In 4.2 it happens a lot less often.
Well, no. You still have to wait for computation, you still have to wait for network... and you still have to wait for entropy generators, if you use them.
It would be possible for the parts of an application (threads) that do not rely on random numbers to be more responsive. But you do not need an OS rewrite for that. And, of course, if you are aware that you are waiting for random numbers to draw a map, you want to do something about that in the first place. :-)
We had a large system written in go start giving us random identifiers that weren't random (something like dlsfaioghoph0000000000) due to entropy depletion. On a technical level it was drawing from non blocking /dev/urandom...it wasn't blocking but it wasn't exactly failing gracefully either.