
How Scribd runs 150,000,000 polygon intersections a day - matthiaskramm
http://coding.scribd.com/
======
euroclydon
I think a lot of people (me included) just thought of Scribd as a YouTube of
documents -- taking advantage of unlicensed material to juice up PageRank, and
then somehow converting that into a revenue stream. It also seemed kind of
annoying to launch a Flash player just to view a PDF in a now double-scrolling
window. However, I'm reminded, while reading this technical post, of the
Y Combinator interviewer who asked: "where's the rocket science?"
Clearly, we're seeing some smart developers tackle a tough problem, and for a
broad audience. I think they have a bright future!

~~~
mquander
I don't really see how Scribd will be able to maintain any appeal if Chrome
polishes up their in-browser PDF rendering, which basically loads
instantaneously and seems to render things fine. (The HTML transmogrification
is cool, but if PDFs are fast, why do I need it?)

~~~
euroclydon
My point was that I don't know that they have any legitimate appeal now, but
they seem to be smart, and they're building an IP portfolio in an important
space, so they might be able to switch gears or sell their IP. Heck, they
might see the writing on the wall and maybe that's why they're writing these
technical blogs, to gain exposure for their IP and technology.

------
mhd
JavaScript/HTML seems to have caught up to Display PostScript at last. Nice.

------
petervandijck
ok that's crazy. Is scribd becoming the new pdf (I mean that in a good way)?

~~~
ary
The only thing I could applaud more would be a new (and simpler) cross
platform document format (for asset encapsulation and offline viewing) that
could be transformed to and from HTML on a whim. Scribd <-> My Screen <-> My
Printer. To death with PDF.

Come to think of it HTML5/CSS3 + an open archive/compression format would suit
this purpose nicely.

~~~
tomjen3
What is your problem with PDF?

Almost any browser other than lynx has a built-in PDF reader, and PDFs can be
created by just about anything.

Finally, PDF is a format that allows you to ensure that no matter where it is
printed, it is exactly as you wanted it to be.

~~~
gruseom
The problem with pdf is that everything about it is clunky and slow. I didn't
_decide_ to groan whenever I see that something I want is trapped inside a
pdf; years of annoyance just built that up as a reflex.

Edit: actually, there is an exception, and you mentioned it: printing a pdf,
once you have it open, is almost always a good experience. Too bad it's the
thing I want to do least often.

------
CamperBob
Well, that's about as many as a modern 3D game engine executes in one second,
so, meh.

~~~
reitzensteinm
I wasn't going to say anything, but you're being upmodded like crazy, so...

It's a complete apples-to-oranges comparison. Scribd is performing quite
complex logic with an emphasis on correctness, not speed. I feel like you're
dismissing an interesting article for the sake of an amusing one liner, and
that's pretty much the antithesis of HN.

~~~
CamperBob
_I feel like you're dismissing an interesting article for the sake of an
amusing one liner, and that's pretty much the antithesis of HN._

It is a nicely-written and illustrated article, but it describes a problem
that hasn't been considered 'interesting' for decades and can, in any case, be
tackled with a caching scheme. Why would they need to run the same
intersection logic over and over, when there's only a finite number of glyphs
to render in any given document?

~~~
matthiaskramm
> but it describes a problem that hasn't been considered 'interesting' for
> decades

Actually, polygon operations on grids are still an active research area, with
the latest papers on the topic less than two weeks old:
<http://www.sci.utah.edu/socg2010agenda.html>

> Why would they need to run the same intersection logic over and over [...]?

We usually don't. It basically depends on the context in which the glyphs
appear on the page. For a standard, say, LaTeX document consisting of mostly
text and without weird graphic operations taking place around or on top of the
text, a given glyph is just processed once.

