
IDPF, EPUB Standardizing Body, Has Combined with W3C - rhythmvs
http://idpf.org/news/idpf-has-combined-with-w3c
======
rhythmvs
I believe it is a good thing that the EPUB standard will henceforth be further
developed by and within the same body that develops the standards on which
EPUB relies. After all, EPUB is “stripped down HTML5” anyway. As a developer
of Web-based HTML5 books, I can certainly see the benefit of being able to
reuse my static HTML and CSS ‘as-is’ and repackage it into EPUB-based books
for offline consumption by e-readers.

But there’s some fierce objection to the merger of IDPF into the W3C [1][2],
the key (?) argument being formulated as:

> “The W3C is focused on promoting the Web, but eBooks are not websites. When
> the IDPF is gone, who will advocate for readers?”

Maybe a fair point, but I don’t think I can agree. While it is indeed true
that reading long-form content like books requires sustained focus, and that
reading in a browser means linkbait is always luring you to click away into a
never-ending feed of distraction, the problem is not the underlying
technology.

[1] http://www.publishersweekly.com/pw/by-topic/digital/content-and-e-books/article/72492-overdrive-s-steve-potash-moves-to-block-idpf-merger-with-w3c.html

[2] http://futureofebooks.info/

~~~
staz
I thought that the WHATWG were actually the ones developing HTML5?

~~~
kuschku
The WHATWG is just rubber-stamping "whatever Chrome implements" (sometimes
with feedback from Firefox) as HTML5.

The actual development is entirely controlled by the browser vendors, causing
pain for everyone trying to parse HTML programmatically.
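
To illustrate the pain: any tool that wants to parse real-world HTML has to
reproduce the browsers' error recovery. A quick sketch you can run in a
browser console (output indicative):

    // Parse deliberately misnested HTML the way a browser would:
    const doc = new DOMParser().parseFromString(
      "<b>bold <p>paragraph</b> tail",  // unclosed and misnested tags
      "text/html"
    );
    console.log(doc.body.innerHTML);
    // The spec's "adoption agency algorithm" rebuilds a well-formed tree:
    // "<b>bold </b><p><b>paragraph</b> tail</p>"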

~~~
andybak
Because everything went so well when the W3C was left to their own devices...

~~~
kuschku
Certainly better than bullying cURL into accepting their idiotic URL
specification.

Especially considering that what they want would be better specified as a
parser for deriving URLs from user input, rather than a demand that every
tool interacting with URLs be able to parse malformed URLs.
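
To make that concrete: the WHATWG parser repairs input rather than rejecting
it, which is what cURL objected to. A browser-console sketch using `new
URL()` (the WHATWG parser); outputs indicative:

    new URL("http:example.com").href;
    // "http://example.com/" -- the missing "//" is silently supplied
    new URL("http://example.com\\path").href;
    // "http://example.com/path" -- the backslash is treated as a slash
    new URL("http://example.com/a b").href;
    // "http://example.com/a%20b" -- the space is quietly percent-encoded
    // An RFC 3986 parser would reject or read each of these differently.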

~~~
jcranmer
The WHATWG didn't bully cURL into adopting its URL spec; people complaining
that cURL didn't match browsers did that.

The WHATWG made a decision long ago that its standards would be descriptive
(describe how people parse it), not prescriptive (describe how people should
write it). Anyone who attempted to write a web browser would have had to
reverse-engineer how other browsers treated crap, because it was the only way
to get websites to work. If you think the definition of URL was stupid, you
should see what they had to do to support document.all: define a new concept
in JS to represent the notion of "this looks and acts like undefined but you
can actually use it as an object."
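
(For the curious, that carve-out is observable from any browser console;
results indicative:)

    typeof document.all;    // "undefined" -- the one lie typeof tells
    document.all == null;   // true
    Boolean(document.all);  // false -- so legacy "if (document.all)"
                            // IE-detection branches stay dead
    document.all.length;    // a real number -- it still works as a
    document.all[0];        // collection; [0] is the <html> element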

~~~
kuschku
And a descriptive standard is entirely useless.

The entire point of a standard is that implementors agree on a design
definition, then implement it, and it stays consistent, forever.

If you look at which standards work and which fail, you’ll quickly notice a
pattern. Prescriptive: the metric system, the A-series of paper, the entire
SI system, most open standards, etc. Descriptive: imperial / US customary
units, Letter paper, Microsoft "Open" XML, etc.

And before you complain that prescriptive standards are useless because you
can never change legacy systems: several countries have prescriptive language
regulation, with legal authorities defining how the language has to be used,
and they manage to deal with centuries of legacy data.

~~~
jcranmer
The email RFCs are prescriptive, and totally useless. I actually have
evidence, for example, that RFC 2047 is more often violated than not, and I
wonder if message/global will ever see usage as a Content-Type. Prescriptive
attempts at tackling memory models in languages have generally failed.

Also, your delineation of prescriptive/descriptive is laughable. The imperial
standard, Letter paper, and OOXML are all prescriptive standards (albeit
OOXML is a very badly written one). Prescriptive language standards aren't
necessarily well-applied--ask how many people follow the 1996 German spelling
reform, or how many use «le hashtag» instead of the "official" «le mot-dièse»
(hint: look at what the name of the Wikipedia page is).

OOXML did poorly not because it was descriptive but because it wasn't
precise. It was an XML rendering of internal Office file formats, and its
descriptions of terms were no better than internal documentation. Something
like the TNEF format is much closer to a descriptive document, since it
spends a lot of time discussing the differences between Outlook 2007, Outlook
2010, and Outlook 2013 at various steps.

~~~
kuschku
Considering I’m German, the 1996 spelling reform was exactly what I was
referring to – I’ve only found a single document this decade which wasn’t in
the new spelling.

All other documents I read have been updated in the meantime.

If an entire country can update centuries of material in a few years, why is
it so hard to update some simple websites, or, in case that’s not possible,
to ship a polyfill as an add-on?

~~~
jcranmer
If a new version of, say, Chrome were to drop the Mozilla/5.0 from its UA and
break a website because it relied on Mozilla/* in its UA detection (there are
STILL sites that do this), who would users blame? Chrome, obviously.
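
The sniffing at issue looks roughly like this (hypothetical but
representative; the UA string below is a typical modern Chrome one with
illustrative version numbers):

    // Legacy UA sniffing still found in the wild:
    if (!navigator.userAgent.includes("Mozilla/")) {
      document.write("Please use Netscape or Internet Explorer.");
    }
    // ...which is why every engine still leads with "Mozilla/5.0":
    // Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36
    //   (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36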

If you're trying to make a new web browser, it's even worse--people won't use
it if it breaks the sites they use. And the web developer would say "it works
in all the major browsers, what's wrong with it and why should I spend the
time to fix it for your shitty new browser?"

The problem is that the blame for broken sites is universally attributed to
browsers, not website developers.

~~~
kuschku
That’s why you create a new version that is specifically _not_ backwards-
compatible with the old one, ensure appropriate linters already exist for all
tools, make browsers in developer mode or beta/dev versions fail on such
sites, etc.; and then let websites opt in to the new version with a special
header.

~~~
jcranmer
That has been tried before, and that has failed. One of the things that HTML5
fixed was the DOCTYPE mess (in fact, the <!DOCTYPE html> represents the
minimal string that enabled standards mode in every browser, including IE). In
addition to standards/quirks mode (and the concurrent but separate HTML/XHTML
issues), Mozilla tried versioning JS (that was ripped out), and IE tried the
compatibility mode switches.
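
(Which mode the DOCTYPE dance selected is still observable on every page; a
console sketch, results indicative:)

    document.compatMode;
    // "CSS1Compat" -- standards mode (e.g. <!DOCTYPE html> present)
    // "BackCompat" -- quirks mode (DOCTYPE missing or unrecognized)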

It should be noted that later versions of IE eventually gave up and joined the
crowd by having its UA string pretend to be Chrome, which pretends to be
Safari, which pretends to be Firefox, which pretends to be Netscape.

------
jathu
When I was making a simple EPUB app for my own use [1], I found it surprising
that many publishers don't follow IDPF standards. Most do, but there is a
significant number that mix and match all sorts of rules.

[1] [http://jathu.me/bisheng/](http://jathu.me/bisheng/)

~~~
protomyth
Any pattern to it or anything in particular that struck you?

~~~
matwood
The problem is twofold. The first is that most authors and publishers are not
very technical. The second is that tools like InDesign created EPUBs that
would fail the IDPF epubcheck tool with errors [1]. Adobe has fixed some of
these issues in later versions, but it's expensive for a publisher to
re-export all their books.
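
(If you want to check a file yourself: epubcheck ships as a Java jar and is
run from the command line roughly like this, with an illustrative filename:)

    java -jar epubcheck.jar my-book.epub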

What this leads to is basically fixing random, one-off issues depending on
the publisher and book. I would definitely suggest not writing your own
reader and instead looking at something like Readium (and contributing if you
have time!).

[1] http://mademers.com/two-more-indesign-cs5-export-to-epub-bugs-to-report/

~~~
Finnucane
I oversee the epub production for an academic publisher, and the epub
conversions are created by our typesetter at the end of production. They do
indeed do a bit of custom programming (and sometimes manual labor) to make the
files come out decent. They're supposed to run epubcheck on every file before
delivery. I'm the guy who ends up fixing all those little random errors, so I
have to insist that the code be reasonably clean and orderly (they work to a
spec document that I prepared). Plus, accessibility is a concern: poorly
coded and disorganized files don't work well for people who rely on assistive
reading systems.

------
nivals
This is good news. Making fully baked ePubs for the first few versions of the
B&N Nook was more art than science due to some of the problems with ePubs.

------
chris_wot
I guess I've never really understood the need for ePub when PDF is now a
standard. I'm sure I'm missing something; the question is: what is that
thing?

~~~
abrowne
My thought has been more like, "I guess I've never really understood the need
for ePub when HTML is now a standard." … which is exactly what this
announcement is about.

~~~
msbarnett
Ideally, ePub would be "reflowable HTML plus baked-in things you would
obviously want in a book, made dead simple for author-editor-publisher
workflows that will never involve a programmer in a million years". Proper
footnotes that appear on the same page as the thing they're footnoting, for
example.

In practice, though, this is still a shitshow even in ePub 3: 99% of
publishers just ship a pile of endnotes, and consequently certain authors
like Terry Pratchett are a god-awful experience in ebook format. Jumping back
and forth every 30 seconds sucks.

There are workarounds that can be done with JS, and certain reader-specific
markup you can use for iBooks to at least get pop-up footnotes, but most
publishing houses don't have the technical know-how or the desire to invest
that much time in doing the right thing. There needs to be dedicated markup
and a better standard for how readers should display it whenever possible.
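
For reference, the markup iBooks recognizes is EPUB 3's own structural
semantics vocabulary. A minimal XHTML sketch (IDs and text hypothetical; the
root element also needs xmlns:epub="http://www.idpf.org/2007/ops"):

    <p>Some body text.<a epub:type="noteref" href="#fn1">1</a></p>

    <aside epub:type="footnote" id="fn1">
      <p>The note itself; readers that support the vocabulary (iBooks
         does) show noteref/footnote pairs as pop-ups instead of forcing
         a jump to the back of the book.</p>
    </aside>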

A real failure of the standardization committee, and it will likely only get
worse with a group that cares even less about books specifically. The standard
continues to be driven by people more interested in gee-whiz features than
solving longstanding problems in replicating table-stakes features of print.

~~~
Finnucane
Readers complain that ebooks cost too much, and then readers complain that
publishers don't want to spend money on technical stuff.

But some of the problem is that there's not a lot of payoff for technical
development, because only some platforms will support it. The Kindle doesn't
handle JavaScript or math at all, and has very poor support for tables, SVG
art, and video. So there's half your market gone right there.

~~~
msbarnett
> Readers complain that ebooks cost too much, and then readers complain that
> publishers don't want to spend money on technical stuff.

To be clear, I'm complaining that the standard overcomplicated certain
desirable features to the point where publishers would have to spend money on
technical stuff, when they shouldn't have to.

I fully get why publishers don't want to have to engage in fiddly, expensive
bespoke development per-title.

