
TeX line breaking algorithm in JavaScript (and HTML5 Canvas) - bramstein
http://www.bramstein.com/projects/typeset/
======
stralep
Why aren't current browsers using this, or some improved version of this
algorithm?

~~~
patrickg
Because a) first fit, best fit or something else is trivial to implement,
having a good total fit implementation that works in a lot of circumstances is
not trivial (I have re-implemented a simpler version of the algorithm in Ruby
including using a hyphenation library). b) you can ask the same question: why
does almost no word processor implement this algorithm? And with word
processors it is much more important! The answer is simple: Most word
processors are doing many things badly so having a total-fit h&j (hyphenation
and justification) algorithm would only make the output slightly better but
still much inferior to good book typography.
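
To illustrate how little code "first fit" needs, here is a minimal sketch. It assumes monospaced text, so widths are counted in characters; the function name `firstFit` is made up for this example and is not taken from the linked project.

```javascript
// First-fit (greedy) line breaking: keep adding words to the current line
// until the next word no longer fits, then start a new line.
// Assumption: every character has width 1 (monospace).
function firstFit(text, maxWidth) {
  const words = text.split(/\s+/).filter(Boolean);
  const lines = [];
  let current = "";
  for (const word of words) {
    const candidate = current ? current + " " + word : word;
    if (candidate.length <= maxWidth || current === "") {
      // Either the word fits, or it is alone on the line and too long:
      // place it anyway rather than looping forever.
      current = candidate;
    } else {
      lines.push(current);
      current = word;
    }
  }
  if (current) lines.push(current);
  return lines;
}

console.log(firstFit("aaa bb cc ddddd", 6));
// [ 'aaa bb', 'cc', 'ddddd' ]
```

The greedy choice never looks ahead, which is why the second line ends up nearly empty here.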

~~~
imurray
_why does almost no word processor implement this algorithm?_

In TeX the line breaking depends on the whole paragraph. I imagine it could be
visually quite distracting if the line-wrapping jumped around in a WYSIWYG
word processor as you added text to a paragraph.

Edit: ah. I see this is noted in the posted article’s To-do section:

 _Figure out how to deal with dynamic paragraphs (i.e. paragraphs being
edited) as their ratios will change during editing and thus visibly move
around._
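
A rough sketch of why the breaks can move: the code below is a much simplified total-fit breaker, not the Knuth & Plass algorithm itself (no boxes, glue, penalties or stretch ratios). It assumes monospaced text, assumes every word fits on a line, and minimises the sum of squared leftover space on every line except the last. The name `totalFit` is hypothetical.

```javascript
// Simplified total-fit line breaking via dynamic programming.
// best[i] = minimal cost of setting words[i..n-1];
// next[i] = index of the word that starts the following line.
function totalFit(words, maxWidth) {
  const n = words.length;
  const best = new Array(n + 1).fill(Infinity);
  const next = new Array(n + 1).fill(n);
  best[n] = 0;
  for (let i = n - 1; i >= 0; i--) {
    let width = -1; // cancels the space counted before the first word
    for (let j = i; j < n; j++) {
      width += words[j].length + 1;
      if (width > maxWidth) break;
      const slack = maxWidth - width;
      const lineCost = j === n - 1 ? 0 : slack * slack; // last line is free
      if (lineCost + best[j + 1] < best[i]) {
        best[i] = lineCost + best[j + 1];
        next[i] = j + 1;
      }
    }
  }
  const lines = [];
  for (let i = 0; i < n; i = next[i]) {
    lines.push(words.slice(i, next[i]).join(" "));
  }
  return lines;
}

console.log(totalFit(["aaa", "bb", "cc"], 6));
// [ 'aaa bb', 'cc' ]
console.log(totalFit(["aaa", "bb", "cc", "ddddd"], 6));
// [ 'aaa', 'bb cc', 'ddddd' ]
```

Appending one word at the end changes the first line from "aaa bb" to "aaa", which is the kind of jump an editor would show while typing.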

~~~
patrickg
What about a menu (ribbon) item called "apply very nice typography to your
document" which recalculates/reformats the paragraphs? Would this be a
solution?

~~~
yan
This is exactly why I prefer authoring documents in LaTeX vs WYSIWYG tools: I
can concentrate on semantics and not worry about representation. Writing stuff
in Word forces you to be distracted by its appearance on the page and can
drive me crazy with micro-adjustments. With LaTeX, I just write the document
and the computer does what computers are good at doing.

~~~
bodhi
> I just write the document and the computer does what computers are good at
> doing.

Which is: Interpret your instructions _very_ literally, so you still need to
go over your document carefully to fix problems like urls running off the edge
of the page?

Smart-arse joke aside, I agree with you that LaTeX is generally great, but
it's not perfect, and sometimes it is difficult to make it do what you want.

------
patrickg
This is great news. However, having a Knuth & Plass line breaking algorithm
without hyphenation is like sex without a partner. Pretty much useless. What
you gain from using a total-fit algorithm is much less than what you lose from
the lack of hyphenation. If hyphenation is implemented, this will be great!

~~~
jacobolus
Well, hyphenation isn’t too hard, since there are reasonable solutions that
can be used directly or easily ported. (There are existing javascript
hyphenation libraries floating around.)

The real problem with all of this is that everything is being done in a canvas
element, which means no text selection, no searching, etc. etc.

~~~
bramstein
Hyphenation is definitely on my to-do list. As you said, it will improve the
output of the canvas tremendously.

The canvas is just a way to show the output of the line breaking algorithm.
Earlier I had an example on my website of a dynamically created DOM paragraph
using the Knuth & Plass algorithm. That way, the rendering was done by your
browser (i.e. selection, searching, etc. were possible) while the line breaks
were handled by the algorithm. Unfortunately I haven't been able to get it
working reliably in all browsers, so I took it out before I posted this item.
It's still in my Github repository:
[http://github.com/bramstein/javascript/tree/master/src/types...](http://github.com/bramstein/javascript/tree/master/src/typeset/)
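
One way such a DOM output could look, as a sketch only and not the code in the repository above: each pre-broken line becomes a block-level span, and the leftover width is spread over the word gaps with the CSS `word-spacing` property. The function names and the caller-supplied `measure` callback are assumptions; in a browser the callback might wrap canvas `measureText`.

```javascript
function escapeHtml(s) {
  return s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// lines:       strings already broken by some line breaking algorithm
// targetWidth: paragraph width in px
// measure:     (string) => natural width in px (hypothetical, caller-supplied)
function linesToHtml(lines, targetWidth, measure) {
  return lines
    .map((line, index) => {
      const gaps = line.split(" ").length - 1;
      const isLast = index === lines.length - 1;
      // Justify every line except the last; lines with no gap stay as they are.
      let extra = 0;
      if (!isLast && gaps > 0) extra = (targetWidth - measure(line)) / gaps;
      return (
        `<span style="display:block;white-space:nowrap;` +
        `word-spacing:${extra.toFixed(2)}px">${escapeHtml(line)}</span>`
      );
    })
    .join("");
}

// Usage with a fake 10px-per-character measure:
const html = linesToHtml(["aaa bb", "cc"], 80, (s) => s.length * 10);
console.log(html);
```

Because the text stays in the DOM, selection and in-page search keep working; the cost is that measured widths must match what the browser actually renders.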

~~~
elblanco
I'm definitely following this and waiting for when you get the browser
rendering complete. This could be really great.

------
ZeroGravitas
Does the TeX algorithm reduce _rivers_ in the text, or is that just a
coincidence?

<http://en.wikipedia.org/wiki/River_(typography)>

edit: re-reading this seems to follow naturally from reducing the size of the
gaps between words, which the algorithm does.

~~~
patrickg
As you said: the Knuth/Plass algorithm does not reduce or even detect rivers.
Detecting rivers can be done in linear time, but reducing is very likely to
put the algorithm in a higher time complexity class (which is almost linear
with input length for the normal algorithm!).
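
A sketch of what linear-time detection might look like, under strong assumptions: monospaced text, already broken into lines, and a "river" defined as spaces that sit within one column of each other on at least `minLines` consecutive lines. Both the definition and the name `findRivers` are made up for this example.

```javascript
// Single pass over all characters: for every space, look at the three
// neighbouring columns in the previous line and extend the longest run.
function findRivers(lines, minLines = 3) {
  let prev = new Map(); // column -> run length ending there in the previous line
  const rivers = [];
  lines.forEach((line, row) => {
    const cur = new Map();
    for (let col = 0; col < line.length; col++) {
      if (line[col] !== " ") continue;
      const above = Math.max(
        prev.get(col - 1) || 0,
        prev.get(col) || 0,
        prev.get(col + 1) || 0
      );
      cur.set(col, above + 1);
      if (above + 1 >= minLines) rivers.push({ row, col, length: above + 1 });
    }
    prev = cur;
  });
  return rivers;
}

console.log(findRivers(["aa bb cc", "a bbb cc", "aa b ccc"]));
// [ { row: 2, col: 2, length: 3 }, { row: 2, col: 4, length: 3 } ]
```

Each character is visited once with constant work, so the pass is linear in the text length. With proportional fonts the column test would have to become an overlap test on horizontal ranges.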

------
spicyj
Impressive!

Doesn't work on the iPhone, sadly. Not sure why not, because Canvas is
supported, I think.

Might be possible to make text selectable, but looks like a huge chore:
[http://stackoverflow.com/questions/1451635/how-to-make-canva...](http://stackoverflow.com/questions/1451635/how-to-make-canvas-text-selectable/1457088)

~~~
jacobolus
I don’t think older versions of Webkit canvas do text. Not totally sure.

~~~
bramstein
I use the new Canvas Text API, which---I guess---isn't supported by the
iPhone. It would be fairly straightforward to make it use the old text API. I
chose canvas as an output format for this demonstration, the algorithm itself
knows nothing of canvas, so it could be adapted to use whatever output format
is suitable for a device.

------
grayrest
Do you have any plans for inter-character spacing as well? I know this is done
in print to reduce the word gaps in justified text but I'm not sure if the
resolution on screen would allow it.

I have a &shy; hyphenator using Liang's algorithm as a YUI3 module if you're
interested.
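
For readers who have not seen Liang's algorithm, here is a minimal sketch of the idea; it is not the YUI3 module mentioned above. The nine patterns are the small illustrative set commonly used to explain the word "hyphenation", not a real pattern file, and `makeHyphenator` is a made-up name. Digits in a pattern score the position between letters; the highest score at each position wins, and odd scores allow a break.

```javascript
const SOFT_HYPHEN = "\u00AD"; // what &shy; stands for

// "hen5at" -> ["henat", [0, 0, 0, 5, 0, 0]]
function parsePattern(pattern) {
  const letters = pattern.replace(/\d/g, "");
  const values = new Array(letters.length + 1).fill(0);
  let pos = 0;
  for (const ch of pattern) {
    if (/\d/.test(ch)) values[pos] = Number(ch);
    else pos++;
  }
  return [letters, values];
}

// leftMin/rightMin: minimum letters kept before/after a break (assumed defaults).
function makeHyphenator(patternList, leftMin = 2, rightMin = 3) {
  const patterns = new Map(patternList.map(parsePattern));
  const maxLen = Math.max(...[...patterns.keys()].map((k) => k.length));
  return function hyphenate(word) {
    const padded = "." + word.toLowerCase() + "."; // dots mark word edges
    const points = new Array(padded.length + 1).fill(0);
    for (let start = 0; start < padded.length; start++) {
      for (let len = 1; len <= maxLen && start + len <= padded.length; len++) {
        const values = patterns.get(padded.slice(start, start + len));
        if (!values) continue;
        values.forEach((v, k) => {
          points[start + k] = Math.max(points[start + k], v);
        });
      }
    }
    let out = "";
    for (let j = 0; j < word.length; j++) {
      const allowed = j >= leftMin && j <= word.length - rightMin;
      if (allowed && points[j + 1] % 2 === 1) out += SOFT_HYPHEN;
      out += word[j];
    }
    return out;
  };
}

const hyphenate = makeHyphenator([
  "hy3ph", "he2n", "hena4", "hen5at", "1na", "n2at", "1tio", "2io", "o2n",
]);
console.log(hyphenate("hyphenation").split(SOFT_HYPHEN).join("-"));
// hy-phen-ation
```

A production hyphenator would load a full language pattern set and an exception list; the lookup loop stays the same.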

~~~
lambda
If you used sub-pixel alignment, it might be worth it.

I just wish we could get 300dpi displays for our computers already, and stop
worrying about the pixel grid. But, we're going to need some better techniques
for resolution independent UIs before that happens. In particular, you will
probably actually need better hinting for vector based UI graphics, so that
the same vector graphics that can be used for high resolution displays can
also be used for lower resolution displays while still looking good.

~~~
jacobolus
It’s worth it to use some inter-letter spacing anyway, even if the spaces can
only be whole numbers of pixels, because it avoids exceptionally wide or
narrow word spaces.
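
A small arithmetic sketch of that trade-off, with made-up names and a made-up rule: letter spacing is raised one whole pixel at a time, only while the word gaps would otherwise stretch beyond a chosen limit. Here `letterGaps` is assumed to count the gaps between adjacent letters inside words.

```javascript
// slackPx:           extra width the line must absorb to be justified
// wordGaps:          number of spaces between words on the line
// letterGaps:        number of gaps between adjacent letters inside words
// maxWordStretchPx:  largest acceptable extra width per word space (assumed limit)
function splitSlack(slackPx, wordGaps, letterGaps, maxWordStretchPx) {
  let letterPx = 0;
  const wordStretch = () =>
    wordGaps > 0 ? (slackPx - letterPx * letterGaps) / wordGaps : 0;
  while (
    letterGaps > 0 &&
    wordStretch() > maxWordStretchPx &&
    (letterPx + 1) * letterGaps <= slackPx
  ) {
    letterPx++; // whole pixels only
  }
  return { letterSpacingPx: letterPx, wordStretchPx: wordStretch() };
}

console.log(splitSlack(40, 4, 30, 6));
// { letterSpacingPx: 1, wordStretchPx: 2.5 }
console.log(splitSlack(12, 4, 30, 6));
// { letterSpacingPx: 0, wordStretchPx: 3 }
```

In the first call, one pixel between letters absorbs 30 of the 40 spare pixels, so each word space grows by 2.5px instead of 10px.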

