The author's previous blogpost (http://taras.glek.net/post/Laggy-phones-and-misleading-bench...) contends that 4K writes are a good proxy for phone lag, but has no evidence or measurements either, just the author's contention.
It could all well be true! And it's great the author's dug up some concrete areas the Pixel team could potentially improve at (dumb question: couldn't the Pixel be updated to use most of these FS tweaks? could an in-use Pixel FS be converted to use f2fs?). But I don't think the author has done a good job demonstrating an actual relation to UI lag or anything to do with the phone's perceived performance.
disclaimer: googler, but I do iOS things.
The 2012 Nexus 7 is a good comparison. It had slow, cheapskate flash memory, which degraded over time. But why should that cause UI jitter? As several people have pointed out, well-written apps use separate threads for UI and I/O. There were some big UI jitter problems in the OS, but there was a big effort to fix those in Jellybean, which the Nexus 7 shipped with.
The problem is that just using separate threads isn't enough to keep UI and I/O completely separate. The UI uses the GPU, and the GPU fights with the kernel over access to I/O resources. Maybe the UI thread or GPU driver has a cache miss and needs code paged in; maybe the OS is busy writing out another page.
If the Motorola phone really does have 10x lower I/O latency, that seems very significant, but we don't know if it really affects the UI without measurements.
Just setting 'nobarrier' on the file system sounds dangerous though. I'd be very wary of that unless they have really good arguments and measurements to show it's safe.
I write Android apps, and I/O reliability is a big headache. I'd put the success rate for simple file system operations at only around 99.9% -- as soon as your users number in the thousands, there are always a few users whose phones fail in bizarre ways. Most of the problems show up on Samsung phones, but that might just be because they're the most popular brand.
Apple hardware definitely ages much, much more gracefully than any Android hardware I've seen. That might be okay if everyone got a new phone every two years, but that's definitely not the case; tons of people are using old phones.
In fact, smooth but overlong animations can feel more laggy simply because the time it takes for the phone to do what you actually wanted is longer.
If 4k of I/O take 16ms or more with great regularity (and p99 seems rather frequent to me), you're going to regularly hit high multiples of that in any kind of mildly intensive workload (which writes a lot more than once, and a lot more than 4k). And at high multiples of that, no amount of off-thread I/O is going to be able to hide that latency.
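To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes the 16ms figure above applies to every synced write and that the writes happen one after another; the function names are made up for illustration.

```python
FRAME_MS = 1000 / 60  # one frame at 60 Hz, roughly 16.7 ms


def stall_ms(synced_writes: int, latency_ms: float = 16.0) -> float:
    """Total time if each synced write must finish before the next one starts."""
    return synced_writes * latency_ms


def frames_spanned(synced_writes: int, latency_ms: float = 16.0) -> float:
    """How many 60 Hz frames go by while those writes complete."""
    return stall_ms(synced_writes, latency_ms) / FRAME_MS


if __name__ == "__main__":
    for n in (1, 10, 50):
        print(f"{n:>3} writes: {stall_ms(n):6.0f} ms, {frames_spanned(n):5.1f} frames")
```

Under those assumptions even ten serialized synced writes add up to a delay a user would notice, whichever thread is doing the waiting.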
So sure: asynchronicity will make slowness a little less ugly. And that's about it.
(On the other hand, fsync is a pretty heavy-handed tool, and he doesn't show the latency for non-synced I/O - so maybe the problem is just that software is too conservative - I mean, how often do phones crash, and how bad is losing a small amount of data then?)
Phones crash a lot. Losing data can be anything from a minor nuisance (lost a few seconds of work) to a massive headache (some important file was subtly corrupted).
In practice, you almost certainly need some concept of atomic actions or transactions to deal with corruption.
The more relevant question then isn't whether phones crash and cause corruption, it's whether the additional amount of committed transactions you lose by relaxing durability requirements matters. You're always at risk of losing those in flight - does it matter if you lose the photo you just took? It depends, but I suspect the linked article's fsyncs/second rate is massively excessive. The "bad" case was after all still just 1/50th of a second, i.e. much less than the blink of an eye. My contention is that phones don't crash often enough to make sub-second durability worthwhile. Most phones I come across certainly crash less than once a day, and losing your past 1 second of actions once a day sounds acceptable to me.
In short: the benchmark is not directly useful. Perhaps it correlates with some relevant aspect of I/O performance (but that's not clear), or perhaps software is buggy and actually tries to fsync more than 50 times a second (but then: fix that), but it may simply be a meaningless benchmark.
He both produces some measurements he has made, and cites an academic paper that corroborates. It's not certain proof but it's a long way from 'no evidence'.
Can you point me to any evidence related to the title, i.e., Pixel lag is 10x worse than Moto Z?
That would require reformatting, so probably not.
(neither a Pixel fan nor a hater here)
It is like asking what is faster:
actually calculate the sin(x)
If calling fsync is critical to your application, then the Moto Z has broken your app by making fsync a no-op. If it isn't critical to your application, then why are you calling it?
If you manage application state yourself, and the phone does power off or kill your app at an unfortunate time, you might end up with a corrupted application state. If you're a big company like Facebook or Google with millions of users this will be happening multiple times per day, and produce a slew of complaints about your app.
Depending on the exact mechanism and implementation, making fsync a no-op is not necessarily bad. As long as it can (with highest priority) fsync all of its buffers upon notification of loss of power and be done before all capacitance is lost then there should be no trouble at all. Of course, this requires some coordination between the hardware, the OS and the filesystem which might be non-trivial even for a fully integrated company like Samsung so it could be it simply doesn't work at all like that.
I'd guess the most popular apps DO use a database on Android, right? So we could make some testable predictions. If the Moto Z's fsync is a no-op, then we should expect more database corruption. So if we don't see database corruption, we can presume a highly prioritized, efficient fsync on power loss notification. Battery pull vs. low battery poweroff event? I'm still thinking about controls and test methodology. Also, either Android app databases don't corrupt often, or they're good at recovery, because I haven't seen app errors (aside from incompatibility) in a long while. Then again, I have a Pixel, not a Moto Z.
I am going to reflash with these filesystem changes though; just not sure how to benchmark.
The database should never get corrupted. If it does, you have a bad fsync implementation.
It batches the fsync per transaction, so there's only one fsync for 10000 inserts in one transaction.
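A minimal sketch of that batching using Python's built-in sqlite3 module, not Android's SQLite wrapper. The table name and the helper `insert_batched` are hypothetical; the point is only that all rows go into one transaction, so the sync cost is paid at the single commit rather than per insert.

```python
import os
import sqlite3
import tempfile


def insert_batched(path: str, n: int) -> int:
    """Insert n rows inside a single transaction and return the row count."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events (id INTEGER PRIMARY KEY, payload TEXT)"
    )
    # 'with conn' commits once when the block exits (or rolls back on error),
    # so the n inserts below share one commit instead of n separate ones.
    with conn:
        conn.executemany(
            "INSERT INTO events (payload) VALUES (?)",
            ((f"row {i}",) for i in range(n)),
        )
    count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
    conn.close()
    return count


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(insert_batched(os.path.join(tmp, "demo.db"), 10000))
```

Committing after every row instead would pay the sync cost on each insert, which is the pattern that gets expensive when fsync is slow.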
> Depending on the exact mechanism and implementation, making fsync a no-op is not necessarily bad. As long as it can (with highest priority) fsync all of its buffers upon notification of loss of power and be done before all capacitance is lost then there should be no trouble at all. Of course, this requires some coordination between the hardware, the OS and the filesystem which might be non-trivial even for a fully integrated company like Samsung so it could be it simply doesn't work at all like that.
It's supposed to work if the OS crashes. Kernel panics are not incredibly uncommon.
In other words, because the Moto Z cheats.
I suppose it's understandable... nobody should ever be blocking GUI animations on fsync, much less two of them in a row, but here we are.
In the 'computer is a phone' perspective there is no sudden power loss unless the user yanks the battery out of the back. That is presumably a rare occurrence for your typical user. As a result, the big risk for phone file systems is that the phone crashes, but nearly every phone I've seen preserves memory contents as long as power is applied, so even a crashed phone can, on reboot, pick apart the previous bits in memory and reconstruct what was going on before it crashed.
Given those fairly phone-specific criteria, I could see nobarrier as a legit option.
What you're proposing is possible in theory, but no general purpose operating system I've seen actually attempts to recover page cache state from uncleared RAM upon reboot.
I've witnessed a number of phones with the "crumple zone"-alike feature of effectively ejecting the battery cover + battery out of the phone whenever you drop it.
I drop my phone all the time and I don't have a case. It's fine.
It's probably not just bad luck that the glass happened to break after the Nth time you dropped it. It was bound to happen if you kept dropping the device.
I suspect that if I didn't have a case I wouldn't drop it so often.
Do you just lose a new contact, or corrupt an essential database?
How essential are we talking about here?
What I mean is, your smartphone is a device that can get lost, stolen or destroyed any day. All the data that is physically stored in it should either be expendable or synced elsewhere.
Are people creating new contacts that often on a phone? I probably do that only a few times per week. If the phone crashed immediately following the input, I'd just ask the person for their details again.
> corrupt an essential database?
It would only corrupt application data. Android isn't completely stupid, /system is mounted read-only on every Android phone I've ever seen.
I guess it depends on how highly you depend on application data remaining consistent.
Presumably the corruption would only apply to data that was modified but had not been flushed to flash before the power cycle, so at rest data shouldn't be affected.
> I probably do that only a few times per week.
If your data is kept in a SQLite database or any other type of compound storage scheme it's entirely possible to lose data that wasn't being modified, due to corruption of the metadata that governs the layout of the file (actually I don't have experience with SQLite file format specifically, but I ruined my startup's launch a decade ago with almost identical reasoning--thousands of pressed CD-ROMs in the garbage).
Define longer-lived. You'll probably have replaced the battery several times before flash wear becomes an issue.
That said, all mobile devices, laptops included, have a "force shutdown" option activated by holding the power button for x seconds.
Or the battery is a couple of years old. Or the user lives someplace where it's cold outside.
Filesystems optimized for flash, and for battery-backed systems in general, (laptops, phones) have some history: https://en.wikipedia.org/wiki/Flash_file_system
But don't listen to me. I like TxF too.
this is the thing that confused me.. why are apps doing disk i/o on the GUI thread?
For example, on iOS, +[CLLocationManager authorizationStatus] (checking if the app is allowed to use location or not) has to read a file. Almost no app developers are aware of this.
 I've been told this by guys who worked full-time on location stuff, but I haven't verified it myself and can't find it in docs, so take it with a grain of salt
Wouldn't the cache HAVE to be flushed as it fills up at some point to maintain coherence?
If you're not cheating, you're not doing it right.
They simply do not care about optimizing the performance of the phone running their software. Otherwise they never would have chosen a garbage-collected language like Java for the platform in the first place (I understand they've ameliorated this concern recently). Or used ext4 like here.
Not sure if you're being serious or hyperbolic - but Google bought Android, Inc (an Andy Rubin startup). The team had a lot of Danger alumni - Danger being the makers of the HipTop - aka the original T-Mobile Sidekick (also running Java). I'm certain they knew a thing or two about hardware, they just made trade-offs you disagree with.
1) fsync cost. Yes, fsyncs are dangerously slow in any Android app. (SQLite for example is a common culprit. Shared Prefs are another). HOWEVER, it's possible that flushes cause reads to be queued behind them (either in the kernel or on the device itself) which is even worse because
2) Random read cost is super super important. Android mmap's literally everything and demand paging is particularly common AND horrendous as a workflow. To add insult to injury, Android does not madvise the byte code or the resources as MADV_RANDOM, so read-ahead (or read-around) kicks in and you end up paging in 16KB-32KB where you only wanted 4KB.
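For anyone unfamiliar with the hint being described, here is a minimal sketch of MADV_RANDOM on an ordinary file mapping. It is not Android's loader code; `map_for_random_access` is a hypothetical helper, and it assumes Python 3.8+ on a platform that exposes `mmap.MADV_RANDOM` (it silently skips the hint elsewhere).

```python
import mmap
import os
import tempfile


def map_for_random_access(path: str) -> mmap.mmap:
    """Map a file read-only and hint the kernel that access will be random."""
    with open(path, "rb") as f:
        # The mapping keeps its own handle, so the file object can be closed.
        mapped = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    # MADV_RANDOM asks the kernel not to read ahead around each page fault.
    if hasattr(mapped, "madvise") and hasattr(mmap, "MADV_RANDOM"):
        mapped.madvise(mmap.MADV_RANDOM)
    return mapped


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "blob.bin")
        with open(path, "wb") as out:
            out.write(os.urandom(64 * 1024))
        m = map_for_random_access(path)
        print(len(m[4096:4100]))  # touch a single page in the middle
        m.close()
```

The hint is advisory only; how much read-ahead actually happens still depends on the kernel and the device.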
Also, history has shown custom flash-based file systems on Android to be a world of pain. yaffs, jffs have some pretty atrocious bugs/quirks. I'd much rather see the world unify on common file systems, optimized for flash-like storage, rather than OEMs shipping their own in-house broken file "systems" (I'm looking at you, Samsung).
Still, it's not as tested as, say btrfs and ext4. Can't wait to see its particular quirks.
In particular, everything is compiled down to lookup tables and hash tables within each odex/oat. Your point still stands but the hit is much lower than you would think and given the slow speed of the superfluous reads, it ends up being a net positive for A LOT of cases.
The way odex files are structured, there is actually a fair bit of data sequentially organized (for example dependencies), even with the indexing. The odex format does seem to have some elements that anticipate read-ahead (e.g.: those hash tables, dependencies...).
That said, there is a real question about proper tuning of read-ahead for flash memory (like, perhaps 4k or even 0-byte read-ahead is the right thing to do in general ;-). It's not like it is hard to abuse it.
Why is that so?
There is a good question though... do you really need fsync on Android? If you don't, why are you calling it?
Has anyone experienced this issue with their Nexus 6? My phone is more than 2 years old and I have no noticeable slowdown.
The Pixel might have the slower storage option but it has no effect on usability. From what I have read, its UI performance is the best of any Android phone yet.
"The Pixels are fast — noticeably faster than Samsung's Galaxy S7. On performance alone, these are easily the best Android phones you can buy."
You can play games for awhile stripping stuff out of memory so it can live in the fast area but eventually its speed drops to approximately zero even if all you have installed is the kindle app at which point I tossed it.
There is a lot of cargo culting out there about how wiping the cache over and over supposedly helps, in the sense that the owner feels they're doing something, although all they're doing is wasting the last few R/W cycles left.
Not saying that this means all later products have this issue licked, just that the original Nexus 7 is a known awful case.
The only way I was able to make it remotely usable was to install an old version of cyanogen mod and install only bare minimum apps.
As an aside, having a phone that costs $150 is sort of amazing. It removes the fetishism of "oh god, what if this breaks?" that I had with my iPhones and Nexii. It's a new and interesting feature that isn't discussed enough.
>that fsync blog post floating around is pretty much bogus. also nobody should use nobarrier, it's not safe at all
My Chinese Android phone has a more annoying hardware lag: the delay between touching the screen and touch event processing is over 100 ms. Any drum app is unusable. And if you try to scroll something up and down fast it is easy to see how the content on the screen lags behind finger movements.
As others have pointed out, a lot of apps do I/O related work in the UI thread, mostly b/c people don't care enough not to (and sometimes it's hard). Flipping a toggle somewhere can easily cause a write to a sqlite database to need to happen.
And it's for far more than just databases.
> So the title is misleading.
I don't think the title is that misleading, it is very much and significantly faster at storage operations than the Pixel, and those matter quite a bit in your daily usage.
All that was tested was fsync(), not "storage operations", and the Moto Z was faster at fsync() because it turns off fsync.
But it's not clear why this matters to anything. Lag is usually the result of file system reads being slow (paging in code/resources), not writes being slow, so how does fsync performance matter?
That's not what was tested, that's only what is shown in the graph. The lack of fsync contributes to it.
I wrote a little fio benchmark driver to fill all available device storage with random 4k writes, print perf stats along the way
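This is not that fio driver, just a much smaller sketch of the same idea for anyone who wants to poke at it: time random 4K writes that are each followed by fsync and look at the latency percentiles. The file size, write count, and function names are arbitrary assumptions, and it relies on `os.pwrite`, so it assumes a Unix-like system.

```python
import os
import random
import tempfile
import time

BLOCK = 4096  # 4K, matching the write size discussed in the post


def time_synced_writes(path, file_blocks=256, writes=50, seed=0):
    """Return per-write latencies (ms) for random 4K writes, each followed by fsync."""
    rng = random.Random(seed)
    buf = os.urandom(BLOCK)
    latencies = []
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.ftruncate(fd, file_blocks * BLOCK)
        for _ in range(writes):
            offset = rng.randrange(file_blocks) * BLOCK
            start = time.perf_counter()
            os.pwrite(fd, buf, offset)
            os.fsync(fd)  # wait until the kernel reports the data as flushed
            latencies.append((time.perf_counter() - start) * 1000)
    finally:
        os.close(fd)
    return latencies


def percentile(values, pct):
    """Nearest-rank style percentile; good enough for eyeballing p50/p99."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, int(round(pct / 100 * (len(ordered) - 1))))
    return ordered[index]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        lat = time_synced_writes(os.path.join(tmp, "bench.bin"))
        print(f"p50={percentile(lat, 50):.2f} ms  p99={percentile(lat, 99):.2f} ms")
```

Numbers from a desktop temp directory say nothing about a phone's /data partition; on a device where the barrier is disabled, the fsync here would return much sooner without the data necessarily being on flash.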
One of the other things clearly mentioned is that b/c of Google's use of an additional FUSE filesystem they take a 30% performance hit:
This means that on the Pixel every user IO gets a round-trip back into user-space before hitting the NAND. Fuse burns more CPU and slows down IO by up to 30%.
Sure, the nobarrier trick makes a lot of difference but it's not the only thing in that post, nor was that the actual benchmark.
Google maintain the kernel implementation (sdcardfs) too... they choose to use the FUSE implementation for good reasons. Of course, most other vendors only care about benchmarks, not things like robustness / security.
> One of the other things clearly mentioned is that b/c of Google's use of an additional FUSE filesystem they take a 30% performance hit
Only on one mount point which isn't used for anything lag-related. There's no code on that mount point. There's nothing UI-critical on that mount point. It means your photos might load a bit quicker, but again given that there was no actual numbers or objective information we don't actually know how the two compared. Good job to Motorola to handle that in the kernel, but it contributes nothing to the spectacular claim that the Pixel has 10x more lag than the Moto Z.
But given this claim: "They got Moto-Z to performing close to high-end laptop SSDs." I think it's pretty obvious the author only actually looked at fsync(). Unless somehow the Moto-Z is ~5x faster than the storage in an iPhone 7 (which is the gap between an iPhone 7's storage and a high-end laptop SSD), it's obvious bullshit.
On older Android devices I have run apps that could show IO rates. And invariably when things would lag or freeze the IO would be maxed out (one offender would be Facebook and their Messenger, because after each update they would force a recompile or something).
But on the other hand, if you are paranoid about security in Android, you should either go with a Google phone (quickest updates), or something like Copperhead.
But when I broke my Nexus 5 I needed a replacement. The Nexus 6 was a bit larger than I was interested in buying and at $650 it cost a bit more than I wanted to pay for a device I was not really too keen on.
Enter the Moto X (2014 version) which was essentially a smaller version of the Nexus 6 for around $450. OS was all-but-stock Android and updates seemed to be almost as fast as Nexus devices thanks to Moto Mobile's Google ownership. Picked one up and it was one of the better phones I've owned. A little while later, I picked up a Moto360 on sale for $125 because I had some "play money" and the itch for a new gadget to play with.
The phone was great but over time updates slowed as Google sold them off. Then my watch started crapping out due to a defect and I had to deal with the new Lenovo-owned support team.
Dealing with that support department was miserable. Google has a bad rep in this department but it's nothing compared to the "new" Moto. It took me months to set up an RMA and get the issue resolved as their website constantly crapped out or failed to work properly while setting up the RMA. Their support techs were either unavailable or unable to offer assistance.
After that phone, their later offerings got rid of the near-stock Android at a good price that made the Moto X so attractive, and as things went on, they continued to move toward mediocrity.
This time around, I bit the bullet and paid "iphone money" for a Pixel. So far, other than the bland design it's been an excellent device. And while I would've loved a return to the almost-flagship-for-half-the-cost of the Nexus 4 and 5, in the end, I found that I'd rather pay an extra $200 for something I use daily for 2+ years rather than suffer the delayed updates and poor support.
Since they keep an almost pristine Android, and hardware is rock solid, I think the paranoia is not necessary. My Moto X force is really unbreakable.
Another interesting bit is that, instead of Lenovo "eating" Motorola, it seems it will move all its mobile handsets to the Moto name, leaving it quite a lot of room to develop while supporting it with lots of money.
I was really, really skeptical at the beginning and told myself not to buy from Moto under Lenovo anymore, but in the meantime my opinion has changed; they seem to be going, let's say, not in the wrong direction, especially with the Moto Z and the mods.
I wouldn't trust any Chinese company product or service, especially after the recent "cybersecurity law" that allows the government to force companies to install backdoors in their products (both local and foreign).
Perhaps all of this will change in the future, but not anytime soon. People should boycott Chinese products until that happens. It would also set an example for the US and everyone else that it's not acceptable to put backdoors in your products.
Why would Twitter or Facebook apps need to write much to persistent storage? They don't do so in a browser.
Lying on fsync doesn't seem a good option tho, even if developers abuse it.
Also, I have an insane amount of tabs open in chrome. I'm not sure what the rule is, but it seems like it keeps ~5 tabs in memory, depending on how much is in the current tab. (Tangent: Opera has a great offline feature for saving stuff before you get on the plane.)
I'm actually a little amazed that I can scroll pinterest or tumblr as well as I can.
Maybe you're talking about something under $100, since on Amazon there are tons of offers with 1GB for ~$50 and with 2GB RAM that are still under $100.
My Xiaomi Redmi Note 3 Pro cost $170 and it has 3GB RAM.
I'd also caution that the prices for US consumers on Amazon are far lower than what is experienced in smaller markets, which have much less competition and fatter telcos. Massive monopoly power at work; most tech simply never makes it to many countries at all.
I was recently in Thailand, and even in major metro areas it was obvious that most shops were owned by a cabal of monopolistic fat cats. Similar situation in many parts of the middle east. Prices for equipment which would just work with a direct import (like iphones) are ridiculous.
Also, until recently, Android had a per-process memory limit.
I know fsync() ensures data gets written to disk, but why does anyone care that it can happen so often? When a device crashes, some data (prior to the sync) may be lost, but do we really need multiple checkpoints per second to ensure only sub-second data loss?
I'd be content with a couple of minutes worth of loss even on my main PC, with its lack of battery backup. To enforce rapid syncs on a phone seems utterly pointless.
Keep the syncs for meaningful checkpoints, like buying something in an app or marking a message as sent. Multiple fsync() calls per second are a total waste.
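Roughly what I mean, as a sketch: a hypothetical `CheckpointLog` class (not any real Android API) that buffers ordinary writes and only pays for fsync when the caller marks a record as a checkpoint.

```python
import os
import tempfile


class CheckpointLog:
    """Append-only log that syncs to storage only at caller-chosen checkpoints."""

    def __init__(self, path: str):
        self._file = open(path, "ab")

    def record(self, line: str, checkpoint: bool = False) -> None:
        self._file.write(line.encode("utf-8") + b"\n")
        if checkpoint:
            # Push Python's buffer to the OS, then ask the OS to flush to storage.
            self._file.flush()
            os.fsync(self._file.fileno())

    def close(self) -> None:
        self._file.flush()
        self._file.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        log = CheckpointLog(os.path.join(tmp, "app.log"))
        log.record("scrolled feed")                        # cheap, may be lost on a crash
        log.record("purchase confirmed", checkpoint=True)  # worth paying for the sync
        log.close()
```

Everything written before a checkpoint goes out with that sync, so the only data at risk is whatever arrived since the last checkpoint.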
What you're looking for (I think) is SQLite in WAL mode, with PRAGMA SYNCHRONOUS=NORMAL. That's ACI but not D.
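A minimal sketch of setting that up with Python's built-in sqlite3 module; `open_wal` is a hypothetical helper and the file path is up to the caller. On Android the same pragmas would go through whatever SQLite wrapper the app uses.

```python
import os
import sqlite3
import tempfile


def open_wal(path: str):
    """Open a database in WAL mode with synchronous=NORMAL.

    Returns the connection and the journal mode SQLite reports back.
    """
    conn = sqlite3.connect(path)
    # journal_mode returns the mode actually in effect, so check it.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    # NORMAL syncs far less often than FULL; a power cut can lose recent commits.
    conn.execute("PRAGMA synchronous=NORMAL")
    return conn, mode


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        conn, mode = open_wal(os.path.join(tmp, "app.db"))
        print(mode, conn.execute("PRAGMA synchronous").fetchone()[0])
        conn.close()
```

The journal mode is stored in the database file, but the synchronous setting is per connection, so it has to be set every time the database is opened.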
My money is on the battery crapping out first.
Also, disable background apps unless you really need them. Also Settings -> General -> Accessibility -> Reduce Motion, Reduce Transparency. Delete cookies from Chrome/Safari. Also, Settings -> Messages -> Expires -> 5 Minutes (instead of never).
... And yeah, I wouldn't be surprised either, but I was intrigued by the prospect.
Personally I don't use those kinds of tweaks but it might be worth a try.
Whereas, if SQLite did issue I/O from a separate thread, one could easily implement an "async commit" function which guaranteed consistency and ordering but not necessarily durability (i.e. a write barrier). This would suffice I suspect for 95% of usage in applications: users will probably be OK if their phone loses the last few seconds of user input before an OS crash, so long as everything else is left intact.
EDIT: In fact, Postgresql has an option to permit exactly this behavior: https://www.postgresql.org/docs/9.6/static/wal-async-commit.... This is possible in Postgresql because, unlike SQLite, I/O does not run in the client thread.
Pretty much all apps DO have SQLite running on its own thread, or at least a background thread pool of some sort. Android gets very cranky at the developer if they don't do this - there's tons of warnings, both runtime and in the form of lint. Database access, or anything involving fsync(), is exceptionally rare on the UI thread or any user-latency-critical thread.
The performance boost in this case is mainly due to the nobarrier option, which is potentially dangerous unless you have something like a battery-backed RAID controller to act as a persistent disk cache. If you back up all your stuff, could be nice. (A similar hack I used to use to eliminate iops-related slowdown was to implement a tmpfs mount for small files that got written a lot, and just rsync them once a minute to disk)
One of the benefits of f2fs is that you can select the allocation and cleaning algorithms that it uses based on the flash chip in use, so before using it on your system, you might want to tune it to your application. https://www.kernel.org/doc/Documentation/filesystems/f2fs.tx...
I found this https://ubuntuforums.org/showthread.php?t=2326934
I'm going to study it. Any other first hand experience here on HN?
Love this write up/research! Hopefully it will teach the Pixel team a few things, or maybe they already knew but will now have the ammo to take to Product and change things!!
I know there's some kind of system to prevent network I/O on the main thread... w... why... I dare to ask the obvious, w-wwhy isn't there a warning for simple disk I/O too?
Check your running applications, the answer to sluggishness might be there.
noatime implies nodiratime. I shrug it off when I see newbies copy-pasting this into their /etc/fstab, but in a mainstream Android device??
/dev/block/dm-0 /data ext4 rw,seclabel,nosuid,nodev,noatime,noauto_da_alloc,errors=panic,data=ordered,inode_readahead_blks=8 (no nobarrier mount option)
In theory Google should be able to easily change the /data ext4 mount option, why didn't Google?
How is it that no one has made a worthy successor to the OG Turbo? That was a monster of a phone across the board.
Although, to be technical it was a Moto Z Force Droid Edition...