

How Scribd's HTML document reader works (part 1) - snowmaker
http://coding.scribd.com/2010/05/17/facing-font-in-html/

======
jrockway
This is some solid engineering. I wish the code was freely available, so I
could finally have a sane mouseless PDF viewer application.

~~~
djcapelis
Perhaps a digression, but xpdf isn't a sane mouseless PDF viewer?

One of my favorite features is that pressing 'q' causes it to quit, solely because
I can quickly (It's blazing fast, at least on Linux) open up a pdf,
pageup/pagedown through it, and quit the application quickly, efficiently and
easily.

So it's not built with the most attractive toolkit, but it does seem to meet
your requirements, no? (Okular is close to converting me from xpdf, but it's
those extra several milliseconds during startup and the lack of a single key
to quit the app that keep me from switching.)

~~~
rtra
You can easily configure all the key shortcuts you want. I use 'q' for
quitting, 'w' for fit to width, 'p' for fit to page, 'g' for goto-line, etc...

I love its minimal GUI interface (by default, C-m for toggling menu
visibility!), highly customizable navigation panel, and its 'trim margins'
view option. I also find it faster than xpdf after startup. The only thing
that I don't like right now is the slow chm rendering.

------
smanek
Great stuff! Aren't there licensing issues involved with making arbitrary
fonts available over the web though?

I'd talked to a few font foundries in the past, and very few would let me use
their fonts with a @font-face (or, in their eyes, distribute their fonts
indiscriminately).

~~~
matthiaskramm
Understandable concern. For now let's just say that we have a blog post in the
queue (part 3/3 of the HTML5 series) that describes exactly how (and to which
degree) we store and (re-)distribute fonts.

~~~
danieldon
It would do a huge service to the web if you can get the blessing of foundries
for a method that can be used by anyone for displaying licensed fonts using
@font-face.

Even creating derivatives of fonts is often restricted, eg, the Adobe Font
Folio EULA considers derivatives under the same license as full fonts and the
Linotype EULA states "It must be ensured that the Font Software cannot be
fully _or partially_ extracted from said documents" (emphasis mine). So,
presumably, it would be potentially problematic even if you are extracting
individual characters and generating new fonts.

As I understand it, since fonts can't be copyrighted it's purely a software
licensing & "piracy" issue. If you've figured out a way we can all get around
that it would be a big deal. Maybe you are in a unique situation because you
aren't agreeing to an EULA and, instead, simply extracting characters from
fonts embedded in documents uploaded by your users. Hopefully you'll share the
details of how your attorneys have advised you about what a website's
liability is in this area using different techniques.

Honestly, on one hand I'm hesitant to comment about it and give a voice or any
encouragement to the foundries, but on the other hand it would be pretty
shitty if scribd just skirts these issues and plays fast and loose with the
situation while the rest of the web is still stuck using the same old font
stacks because they can't afford to risk the legal fees.

~~~
ars
You answered your own question.

Fonts can't be copyrighted. They didn't agree to any EULA. All they need to do
is convert the font to some other format, and they are legal.

And BTW, this line is ridiculous: "It must be ensured that the Font Software
cannot be fully or partially extracted from said documents".

It's possible to embed a subset of the characters, and that's what most
require. But to prohibit even a subset? That means you can not use the fonts
for any electronic documents whatsoever unless you convert them to bitmaps,
which means they are almost useless for PDFs.

I personally would never buy a font with such a restriction unless all I made
was print.

BTW, if you want to be legal, just download (pirate) the font file, and
convert it to some other format. You never agreed to an EULA, and fonts can't
be copyrighted (only the original file can be).

In theory they could complain about the original pirate act, but they would
have zero evidence, and any subsequent usage is legal.

~~~
gjm11
Mechanical conversion from one format to another doesn't defeat copyright
protection. If you write a book in English and I translate it into French (a
change much more fundamental and much more creative than you're proposing),
that doesn't entitle me to publish it without your permission.

The shapes of the glyphs are not protectable. The font file is considered to
be a computer program, and is protectable as such. Converting it to a
different format doesn't get around copyright any more than compiling a C
program does.

------
yarone
For the record, I must say, Matthias, you kick butt. I've used your tools
(swftools.org) for the last 7 years and found them to be extremely well built
and useful. And the "tech support" has been amazing (you just complain about a
bug or make a feature request on his mailing list and you get a fix /
enhancement within days). Incredible work. Scribd made a really, really smart
move bringing you on board to lead state-of-the-art document rendering. Please
keep up the great work.

~~~
trip
I agree :)

------
modeless
Rotating text by generating custom font files with pre-rotated characters and
putting each character in a separate absolutely positioned div element is more
"practical" than CSS transforms? I find that difficult to believe.
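
For readers unfamiliar with the approach being questioned, here is a minimal
sketch of the "one absolutely positioned div per character" idea. It is not
Scribd's code: the function name, the `font_class` CSS class (assumed to select
a font whose glyphs are already rotated), and the pixel coordinates are all
hypothetical.

```python
from html import escape


def positioned_glyphs(glyphs, font_class):
    """Emit one absolutely positioned <div> per character.

    `glyphs` is a list of (character, x, y) tuples in pixels.
    `font_class` is a hypothetical CSS class that selects a font whose
    glyphs are assumed to be pre-rotated, so no CSS transform is needed.
    """
    parts = []
    for ch, x, y in glyphs:
        parts.append(
            f'<div class="{font_class}" '
            f'style="position:absolute;left:{x}px;top:{y}px">'
            f"{escape(ch)}</div>"
        )
    return "\n".join(parts)


if __name__ == "__main__":
    print(positioned_glyphs([("H", 10, 20), ("i", 18, 24)], "f1-rot30"))
```

The trade-off is visible in the output: markup grows with every character,
but nothing in it depends on browser support for transforms.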

~~~
matthiaskramm
Well. We're talking about making this work in quite a number of different
browsers, including IE6. So yes, it is.

~~~
modeless
IE (including IE6) supports transforms, as you point out yourself. Of course
they don't do it in a standards-compliant way, but that's true for their
custom font support also. What problems did you run into?

~~~
matthiaskramm
Performance issues, among others. IE's DXImageTransform, as the name suggests,
renders an html element to an image and then transforms that entire image.
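
To make the two options under discussion concrete, the sketch below builds
both declarations for a given angle: the standard CSS `transform` and the
legacy IE `DXImageTransform.Microsoft.Matrix` filter, whose M11..M22 entries
are the usual 2x2 rotation matrix. The function name is hypothetical, and this
only illustrates the syntax, not what Scribd shipped.

```python
import math


def rotation_css(degrees):
    """Return CSS declarations that rotate an element by `degrees`."""
    theta = math.radians(degrees)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    # Standard declaration (real browsers of that era also needed
    # vendor-prefixed variants, omitted here).
    standard = f"transform: rotate({degrees:g}deg);"
    # Legacy IE filter: the four entries are the 2x2 rotation matrix.
    legacy_ie = (
        "filter: progid:DXImageTransform.Microsoft.Matrix("
        f"M11={cos_t:.6f}, M12={-sin_t:.6f}, "
        f"M21={sin_t:.6f}, M22={cos_t:.6f}, "
        "sizingMethod='auto expand');"
    )
    return {"standard": standard, "legacy_ie": legacy_ie}


if __name__ == "__main__":
    for name, rule in rotation_css(90).items():
        print(name, "->", rule)
```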

~~~
blasdel
The VML APIs that've been in IE forever are all pretty damn fast individually,
but displaying the results slows the rendering down significantly. It's like
how straightforward PHP is damn fast as a templating system, but absurdly slow
for implementing a high-level templating system in.

Besides, text selection wouldn't work.

Have y'all managed to avoid implementing a GDocs-style system for hiding text
underneath bitmaps for selection?

------
jurjenh
I wonder how the character spacing issues are resolved - for example ff and fi
generally should overlap and physical print typography uses different
characters for this. This is definitely possible in PDF (though I'm not too
sure of the details), but then how does this translate back to HTML?
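
One possible way to handle the text side of this, sketched under the
assumption that text extracted from the PDF contains Unicode ligature code
points (U+FB00 to U+FB04): map each ligature back to its component letters so
the HTML text stays searchable and selectable. Whether Scribd does this is not
stated in the thread, and the function name is hypothetical.

```python
import unicodedata

# Latin ligatures from the Unicode "Alphabetic Presentation Forms" block.
LIGATURES = {
    "\ufb00": "ff",
    "\ufb01": "fi",
    "\ufb02": "fl",
    "\ufb03": "ffi",
    "\ufb04": "ffl",
}


def expand_ligatures(text):
    """Replace ligature code points with their component letters."""
    return "".join(LIGATURES.get(ch, ch) for ch in text)


if __name__ == "__main__":
    print(expand_ligatures("o\ufb03ce \ufb01sh"))
    # Unicode compatibility normalization gives the same expansion.
    print(unicodedata.normalize("NFKC", "\ufb01"))
```

An explicit table is used instead of normalizing the whole string with NFKC
because NFKC also rewrites many unrelated characters.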

------
diziet
I was really impressed with Scribd when I saw the HTML text roll out.
That, plus the fact that the website had a copy of an out-of-print textbook
that I needed, raised my opinion of Scribd.

------
senki
The best part:

"[...]we need to store three font files: One for Internet Explorer (.eot) one
for embedded devices (.svg) and one for Firefox, Safari, Chrome et al (.ttf)."

Go IE!
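
For context, a sketch of how three font files per font are commonly wired
into a single `@font-face` rule. This follows a widely used multi-source
pattern, not necessarily the CSS Scribd generates; the function name, font
family name, and URL prefix are hypothetical.

```python
def font_face_rule(family, base_url):
    """Build an @font-face rule referencing .eot, .ttf and .svg files.

    `base_url` is the (hypothetical) URL prefix shared by the three files.
    """
    return (
        "@font-face {\n"
        f"  font-family: '{family}';\n"
        # Older IE versions only understand this plain .eot source.
        f"  src: url('{base_url}.eot');\n"
        f"  src: url('{base_url}.eot?#iefix') format('embedded-opentype'),\n"
        f"       url('{base_url}.ttf') format('truetype'),\n"
        f"       url('{base_url}.svg#{family}') format('svg');\n"
        "}\n"
    )


if __name__ == "__main__":
    print(font_face_rule("DocFont1", "/fonts/docfont1"))
```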

