

How Offline Web Apps Should Work - AffableSpatula
http://blog.stateless.co/post/6246070973/how-offline-web-apps-should-work

======
jerf
I disagree entirely. Trying to make a cache anything other than a cache is a
fatal error. The purpose of a cache is to return exactly what would have been
returned if the cache wasn't there, only faster, and even minor deviations
from that principle rapidly become problematic. The gulf between disposable
data that can be optionally used for speeding things up, and canonical copies
of data (even locally canonical ones) is not a difference that should be
glossed over into one single subsystem, but a difference that should be
highlighted and called out and generally fussed over.

For that matter, the W3C should stop calling this a "cache". The offline
applications, as specified, are not functioning like cached values. They are
functioning like installed applications. If you can't dispose of a "cache"
without a thought beyond "Gosh, my app's performance is going to suffer until
this cache is repopulated", _you are asking for pain_, or, alternately, you
don't actually have a cache.
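
The disposability test described above can be sketched in a few lines. This is only an illustration, assuming a hypothetical `ReadThroughCache` class and an `origin` callable standing in for the server; neither comes from the article or the W3C proposal.

```python
from typing import Callable, Dict


class ReadThroughCache:
    """A cache in the strict sense: dropping it changes speed, not results."""

    def __init__(self, origin: Callable[[str], str]) -> None:
        self._origin = origin            # hypothetical stand-in for the server
        self._store: Dict[str, str] = {}
        self.origin_calls = 0            # counts how often we hit the origin

    def get(self, key: str) -> str:
        if key not in self._store:
            self.origin_calls += 1
            self._store[key] = self._origin(key)
        return self._store[key]

    def clear(self) -> None:
        # Safe at any time: the next get() simply goes back to the origin.
        self._store.clear()
```

If `clear()` could ever change what `get()` returns, as it would for an installed offline app with no reachable origin, the store is acting as application storage rather than as a cache.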

To the extent that you may be able to re-use HTTP mechanisms for detecting
changes, that's great, but it should be done carefully and consciously to
avoid dependencies between the cache and application storage.

Also, I don't see how this secondary proposal deals with the fact that an
application install may involve installing files that the browser has not
accessed yet, which seems a fatal error to me.

~~~
AffableSpatula
There's not a fatal error there, afaik. I didn't detail how to deal with
unaccessed files because the solution seemed trivial: when your application
boots up, make sure it prefetches all required assets (i.e. warms the cache)
by making ajax requests to each URL.
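
A minimal sketch of that warm-up step, written in Python purely to show the control flow. It assumes the application already knows its full list of asset URLs. `warm_cache` and the injected `fetch` callable are hypothetical names, with `fetch` standing in for the browser's ajax call and the returned dictionary standing in for the HTTP cache.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def warm_cache(
    urls: Iterable[str],
    fetch: Callable[[str], bytes],
) -> Tuple[Dict[str, bytes], List[str]]:
    """Request every URL once and report which ones could not be fetched.

    `fetch` returns the response body or raises OSError on a network failure.
    """
    warmed: Dict[str, bytes] = {}
    failed: List[str] = []
    for url in urls:
        try:
            warmed[url] = fetch(url)
        except OSError:
            # Keep going so one missing asset does not block the rest.
            failed.append(url)
    return warmed, failed
```

The `failed` list matters: an app that cannot prefetch everything it needs is not yet safe to use offline.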

I'm not sure about your interpretation of what does and doesn't constitute
appropriate use of a web cache, either. Do you have any examples of the kind
of problem you're foreseeing?

~~~
jerf
"I'm not sure about your interpretation of what does and doesn't constitute
appropriate use of a web cache,"

Not _web_ cache, just plain _cache_ cache. Any kind of cache. Trying to
abstract cached content and permanent content with one abstraction is a recipe
for disaster, a disaster that leaves you without the proper tools to solve it.
It's the sort of disaster that comes up a lot in software engineering where I
can not provide a single pithy code example in 20 lines, because then you can
always reply with "Well, I'd just write these 25 other lines". The problem is
deeper; you've written a semantic confusion into the base primitives of your
system and you _will_ pay for it.

It's similar to the problem with RPC; you should not and can not simply sweep
the difference between local and network communication under the rug. It works
small and is incredibly painful in the large. Unfortunately, I can not seem to
find a pithy explanation of why this is true. (Perhaps I should write one.)

An installed application is not simply a cached version of the website. Even
the W3C proposal has gone too far down that road, but the solution is not to
keep going even farther.

~~~
johnzabroski
> It's similar to the problem with RPC; you should not and can not simply
> sweep the difference between local and network communication under the rug.
> It works small and is incredibly painful in the large. Unfortunately, I can
> not seem to find a pithy explanation of why this is true. (Perhaps I should
> write one.)

I would be willing to help you write it if you'd like. I understand network
disruption, partial failure, and weak/episodic connectivity very well,
including scenarios where nodes in a network are required to remain radio-
silent for long periods of time. Command-and-control military-style systems
are much more general than the Web.

Relatedly, automatic code distribution is a sort of Holy Grail in programming
language design. It is difficult for many reasons. Even data is a huge
problem. For example, if the toolbar buttons on Google Docs suddenly die (like
they did for our marketing rep a few weeks ago), then the application can
become extremely hard to use. Web applications don't handle this well today,
and a cache doesn't really address this well, either, because it could be that
the new resource invalidates an older cache. In other words, it is a version
control and configuration management problem.

------
jorangreef
Another requirement is more reliable browser-side storage.

IndexedDB is too slow (an order of magnitude slower than SQLite) and too high-
level (with consequences for those who want to handle index migrations or
index multiple values).

WebSQL (SQLite) needs to be supported by Firefox so that the web can have two
viable storage options, not just IndexedDB. LocalStorage does not count since
it is limited to 10 MB.

Furthermore, Chrome needs to offer real permission choices to users who want
it. At the moment there is no way for an app to be granted storage quotas
unless it registers with the Chrome web store.

All the browsers need to come together on this:

1\. SQLite must be in all the browsers (at least until IDB proves itself apart
from political support from Mozilla). SQLite is probably the most widely
deployed embedded database, and IDB is a far cry from that.

2\. Users must be able to grant storage quotas to web apps directly (without
requiring that they be installed through some web store).

This will only happen if we the developers start to get active on these issues
and stop assuming that the W3C committee will get everything right without our
involvement.

~~~
paulirish
Storage quota management is coming out in Chrome 13. You'll like it.

[https://groups.google.com/a/chromium.org/group/chromium-
html...](https://groups.google.com/a/chromium.org/group/chromium-
html5/msg/5261d24266ba4366)

~~~
jorangreef
That's good news.

------
AretNCarlsen
How does the browser know the set of pages to be cached for a particular
offline app, though? Do you have to provide a static sitemap at a known URL?
But to support multiple apps deliverable from a single subdomain, you would
need to specify the sitemap's URL in some webpage, perhaps as an attribute to
some HTML element. So each sitemap would be a sort of Manifest which controls
the Cache for an app.

By the way, the app's cache obviously ought to be isolated from other apps'
caches, right? Especially if you let the user grant an enlarged cache as a
per-app permission. And the user might want to clear the miscellaneous browser
cache without clearing the app cache, so we had better give this Application
Cache a distinct name.

~~~
AffableSpatula
That's a decent point about clearing the cache - it should be distinct from
flushing assets indicated for offline use. That distinction could be drawn by
determining whether or not each asset was served with stale-if-error.
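
One way that distinction could work, sketched under the assumption that the cache keeps each response's Cache-Control header next to its URL. `is_offline_asset` and `clear_browsing_cache` are hypothetical names, not part of any browser API.

```python
from typing import Dict


def is_offline_asset(cache_control: str) -> bool:
    """True if the response was served with a stale-if-error directive."""
    for part in cache_control.split(","):
        # Directive names are case-insensitive; ignore any "=value" suffix.
        name = part.strip().lower().partition("=")[0].strip()
        if name == "stale-if-error":
            return True
    return False


def clear_browsing_cache(entries: Dict[str, str]) -> Dict[str, str]:
    """Drop ordinary entries and keep those marked for offline use.

    `entries` maps URL -> the Cache-Control header stored with the response.
    """
    return {url: cc for url, cc in entries.items() if is_offline_asset(cc)}
```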

------
newhouseb
We use a similar strategy for our HTML5-based iOS app; however, the issue is
that it might take a few seconds to realize that an error has occurred and the
resource should be served stale, which pretty obnoxiously kills any
responsiveness.

The ideal way to make this work is to allow resources to be loaded
idempotently, so that you can serve a stale response first and then, if a more
recent version comes down from the server, re-load that resource individually
(say, a feed request to your server or something). This of course requires more
than just an HTTP header, but it's a cleaner solution provided you can prove
that loading of resources is idempotent.

~~~
AffableSpatula
The "cache-control extensions for stale content" provide
stale-while-revalidate, which seems to cover that requirement.
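
A rough sketch of how a cache might combine the two stale-content directives, assuming the response's age and `max-age` are known in seconds. The `decide` function and its return labels are hypothetical and only approximate the behaviour the extensions describe.

```python
from typing import Dict, Optional


def parse_cache_control(header: str) -> Dict[str, Optional[int]]:
    """Parse a Cache-Control header into {directive: seconds or None}."""
    directives: Dict[str, Optional[int]] = {}
    for part in header.split(","):
        part = part.strip().lower()
        if not part:
            continue
        name, _, value = part.partition("=")
        value = value.strip().strip('"')
        directives[name.strip()] = int(value) if value.isdigit() else None
    return directives


def decide(header: str, age: int, origin_failed: bool = False) -> str:
    """Return what a cache may do with a stored response of the given age."""
    cc = parse_cache_control(header)
    max_age = cc.get("max-age") or 0
    if age < max_age:
        return "serve-fresh"
    staleness = age - max_age  # seconds since the response went stale
    if origin_failed:
        # stale-if-error: reuse the stale copy when the origin is unreachable.
        window = cc.get("stale-if-error") or 0
        return "serve-stale" if staleness < window else "error"
    # stale-while-revalidate: answer immediately, refresh in the background.
    window = cc.get("stale-while-revalidate") or 0
    if staleness < window:
        return "serve-stale-and-revalidate"
    return "revalidate-first"
```

The "serve-stale-and-revalidate" branch is the one that addresses the responsiveness complaint: the stale copy is returned at once instead of after a failed or slow network round trip.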

------
wslh
Is there any real possibility of running a pure offline application within the
browser? Obviously, without embedding a web browser control in your application.

------
ra
> _I believe the current proposal for offline web applications is too
> complicated, fiddly, and brittle._

This is true, but I feel that HTML5 is the problem.

Given that we can't hope for a quantum leap over HTML for quite some time, I
expect that one of the JavaScript frameworks will emerge as a powerful
application framework for web apps, offline and online.

cappuccino.org is probably a good first-generation example.

------
andrebehrens
I fail to see how a bunch of cache headers is less fiddly or significantly
different, really, than a manifest.

