
We debated this in 1993 on the www-talk mailing list [1]. Terry Allen (an editor at O'Reilly) wanted rendered HTML documents to follow TeX conventions, with extra space after a sentence-ending period. Marc Andreessen (still at NCSA in 1993) pointed out that browser developers couldn’t do the syntactic analysis required to distinguish periods that end a sentence from periods inside one, such as those in abbreviations. Guido van Rossum (working on Python version 1.0 at the time) weighed in on the whitespace issue and complained “it’s mostly propaganda by Knuth and Kernighan (TeX and troff) that makes people want this.” We ended up with browsers collapsing spaces between sentences for the web. Most style guides [2] seem to have settled the issue in favor of a single space but debate rages, eh?

[1] http://1997.webhistory.org/www.lists/www-talk.1993q3/index.h...

[2] https://en.wikipedia.org/wiki/Sentence_spacing_in_language_a...




A-ha! Another consequential web outcome decided in Urbana-Champaign. As an engineer and a writer, I have monitored this debate for some time. Editors of mine have usually lobbied for two spaces, while engineers I’ve worked with say, simply and reasonably, that browsers reading HTML collapse the extra space, so why bother? Now we know why. Thank you for the fascinating inside story. The traditionalist in me, however, the part that still enjoys a paper newspaper, will always prefer the luxuriousness of the double space.


One of the comments there concisely summarizes history better than modern typographers do:

> I think most books don't do this any more [have wider spaces between sentences]. Newspapers certainly don't. It's way too much trouble escaping out all the abbreviations.

If you need wider spaces between sentences you also don't want them indiscriminately after every period, so you need to exercise more care and indicate them differently in the input markup. For example, using two spaces:

    This is a sentence.  This is another.  Dr. Smith wrote in Proc. Amer. Math. Soc. about NASA.  It was good.

Or in TeX, which makes the typical case easier by using the heuristic that a period after a lowercase letter ends a sentence, you need to indicate wherever this heuristic fails (you can also use ~ to prevent a line-break):

    This is a sentence. This is another. Dr.~Smith wrote in Proc.\ Amer.\ Math.\ Soc. about NASA\null. It was good.

Having to keep track of which periods are ends of sentences is a bit inconvenient in the general case, which explains why wider inter-sentence spacing was given up first in newspapers, then low-quality mass-publishing, then finally high-quality printing as well (almost). But typographers today (see my other comment) make it seem like this has always been the case and wider spaces were just a quirk from typewriters somehow!
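For anyone who wants to poke at the heuristic, here is a rough Python sketch of TeX's "space factor" rule as I understand it from The TeXbook. The function names (`sfcode`, `wide_spaces`) are made up for illustration, and the tokenizer only knows `\null`, `\ ` and `~`, so treat it as a toy model rather than what TeX actually runs.

```python
import re

# Plain TeX's default space factor codes (\sfcode) without \frenchspacing.
# Characters with code 0 are transparent: they leave the space factor unchanged.
SFCODE = {".": 3000, "?": 3000, "!": 3000, ":": 2000, ";": 1500, ",": 1250,
          ")": 0, "'": 0, "]": 0}


def sfcode(ch):
    if ch in SFCODE:
        return SFCODE[ch]
    if ch.isascii() and ch.isupper():
        return 999   # uppercase letters
    return 1000      # everything else


def wide_spaces(source):
    """Return the words followed by a wider, sentence-ending space."""
    space_factor = 1000
    word, wide = [], []
    for tok in re.findall(r"\\null|\\ |~|.", source):
        if tok == "\\null":
            space_factor = 1000          # an (empty) box resets the space factor
        elif tok in ("\\ ", "~"):
            word = []                    # explicit space: never widened
        elif tok == " ":
            if space_factor >= 2000:     # TeX adds the font's extra space here
                wide.append("".join(word))
            word = []
        else:
            g = sfcode(tok)
            if g != 0:
                # A jump from below 1000 to above 1000 is capped at 1000,
                # which is why a period after an uppercase letter is not
                # treated as the end of a sentence.
                space_factor = 1000 if space_factor < 1000 < g else g
            word.append(tok)
    return wide
```

Feeding it the TeX line above exactly as typed reports `Soc.` as a sentence end too, since that period follows a lowercase letter and has no `\ ` after it.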


Note: You should use nonbreaking spaces (the ~) in many places where they are often not used.


When?


Wait, what's the \null for and shouldn't there be a \ after Soc.?


Yes there should be a \ after “Soc.” Proofreading the typeset output might have caught this error. :-)

The “\null” after “NASA” is to show an example where the heuristic fails in the other direction: the “.” after the uppercase letter won't be considered to end the sentence, so you need to either type `\hbox{NASA}.` or `NASA\null.` (where `\null` is an abbreviation for `\hbox{}` i.e. an empty box) or something like that — something so that the period doesn't immediately follow an uppercase letter.


I think `\relax` is a better no-op than `\null`, right?


While \relax is indeed a no-op “command”, that is precisely why it doesn't affect anything here. The way TeX works is that (roughly speaking), after expanding macros and acting on other commands (primitives) it encounters, it treats every character it encounters as another member of the current horizontal list (paragraph). Then it breaks this list into lines, etc. Whether you insert \relax or not, the horizontal list is the same: the uppercase A will be followed by the period. So you need something like \hbox{} or (equivalently) \null (something that actually gets into the horizontal list) between the “A” and the “.”.
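A toy Python model of that point, in case it helps. It is only a sketch: `horizontal_list` and `period_follows_uppercase` are hypothetical names, and the "list" here is just Python strings, nothing like TeX's real data structures.

```python
import re


def horizontal_list(source):
    """Build a toy horizontal list from a tiny subset of TeX input."""
    items = []
    for tok in re.findall(r"\\[A-Za-z]+|.", source):
        if tok == "\\relax":
            continue                # no-op: contributes nothing to the list
        if tok == "\\null":
            items.append("hbox{}")  # an empty box is a real list item
        else:
            items.append(tok)       # ordinary character
    return items


def period_follows_uppercase(source):
    """True if the first period sits directly after an uppercase letter."""
    items = horizontal_list(source)
    i = items.index(".")
    return i > 0 and len(items[i - 1]) == 1 and items[i - 1].isupper()
```

With this model, `NASA\relax.` produces the same list as `NASA.`, while `NASA\null.` puts an item between the "A" and the ".".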


So that's why that happened.

Typographically, I don't think you can compare the visual effect of extra space after a period in properly rendered, fully justified text with the effect when it's buried in the usual ragged-right default on the web.

In the early 90s most monitors didn't have the resolution to handle full justification without making it look like crap.

In theory it would have been helpful if there had been enough foresight to consider the situation >20 years later. Because those neat margins look really nice.

In practice full justification would have used significantly more memory and processor power and would have been much slower. So although it's a shame we ended up with ragged right as the default, it's only become practical to consider high quality implementations - not just budget hacks - of more polished formatting in the last few years or so.


Ragged right is not somehow inferior to full justification. It automatically creates a rhythm that keeps you engaged. There is a reason most graphic professionals prefer the look of ragged right.

It's also pretty hard to produce quality fully justified text. To keep spaces consistent you need hyphenation and advanced typographic algorithms, like those in the LaTeX microtype package, and a paragraph composer like the one in InDesign. And that still will not be enough. You will need an experienced human to make adjustments: fix hyphenation and fiddle with settings. When you see full justification done well, someone has put quite a bit of effort into making it look good.

The reason it is so hard is that if you take a roughly optimal line length of 70 characters, you have only about 10 spaces per line. Add to that a limit of 2-3 consecutive hyphenations at the ends of lines, and a few longer words will get you into pretty tight spots. It's a balancing act. Sure, increase the number of characters per line and the spacing will look even, but long lines make text much harder to read, which is the worse problem, because your eye will not be able to catch the next line.

Most often you will see justified text full of rivers or wildly different space widths, both more distracting than ragged right. For some reason some people think it's more grand and authoritative, but that is quite a dogmatic view.

The main reason it's used in books is that it saves space. In a 400-page book it can make a difference. But publishers employ a dedicated expert who does this work, going page by page so that everything is in order.


I don’t know. It sounds like you’re older and smarter than I, but whenever I see two spaces at the end of a sentence, I fix it to make it one space. It looks plain wrong to my brain.


When you're reading a several hundred page technical report, those double spaced periods are really helpful for picking sentences out at a glance. You don't tend to read those in order from start to finish. There's a lot of jumping around, highlighting things, writing notes. Having sentences be easily visually distinguishable at a glance when scanning a page is fairly helpful.

It's kind of like trying to read tons of code with or without syntax highlighting.


Double spacing between paragraphs is surprisingly helpful with smudged text on physical paper. Many linguistic conventions seem arbitrary, but are really about redundancy.

Punctuation could have replaced spaces.just like capitalization of the first letter in a new sentence,it’s helpful to make things as obvious as possible. Is that period a typo or are those supposed to be different sentences?


It basically did, except the other way around: punctuation was originally used where we would use spaces now.

https://en.wikipedia.org/wiki/Punctuation#Western_Antiquity


Two spaces is probably wrong, but for typesetting it generally looks better to have slightly bigger spaces at the end of sentences, and if you need to stretch space it generally works better to do so at the end of sentences.


And you would be right. There shouldn't be two spaces. It's an old convention from typewriter times. Nowadays a space is not fixed width. It's fluid, and it's the job of the system (browser, graphic editor) to give it the right width.

In fact, the first thing a book designer or publisher will do when they get text from an author is run it through a cleanup script to fix all the mistakes and inconsistencies, and one of those will be runs of two or more spaces.
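The space-collapsing part of such a cleanup pass can be very small. Here is a minimal Python sketch; `collapse_spaces` is a made-up name, and leaving line-leading indentation alone is my own assumption about what a publisher would want, not a description of any particular house script.

```python
import re


def collapse_spaces(text):
    """Collapse runs of two or more spaces that follow visible text.

    Spaces at the start of a line are left untouched so that indented
    blocks keep their indentation.
    """
    return re.sub(r"(?<=\S) {2,}", " ", text)
```

For example, `collapse_spaces("One.  Two.   Three.")` gives `"One. Two. Three."`.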


We should have gone with 1.5 spaces as a compromise.


Despite it being called "double spacing", TeX's inter-sentence spacing is actually only about 1.25x the inter-word spacing (before justification stretching is applied, at least). The argument is really just over whether or not the space after a sentence should be wider than the space after a word, and not how much wider it should be if so.
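To put rough numbers on it, here is a small Python sketch of the glue TeX sets between words. It assumes the cmr10 font parameters quoted in The TeXbook (rounded to five decimals) and the plain TeX space factor of 3000 after a period; other fonts will give other ratios, and `interword_glue` is just an illustrative name.

```python
# cmr10 font parameters in points (assumed values, rounded): interword
# space, its stretch and shrink, and the extra space used after sentences.
SPACE, STRETCH, SHRINK, EXTRA = 3.33333, 1.66666, 1.11111, 1.11111


def interword_glue(space_factor=1000):
    """Return (natural, stretch, shrink) in points for a given space factor."""
    f = space_factor / 1000
    # The extra space is only added once the space factor reaches 2000.
    natural = SPACE + (EXTRA if space_factor >= 2000 else 0.0)
    # Stretch grows with the space factor, shrink goes the other way.
    return natural, STRETCH * f, SHRINK / f
```

With these inputs the natural width after a period works out to roughly a third wider than an ordinary interword space, so the same ballpark as the figure above, and that is before any justification stretch is applied.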


Or a slanted space, like a mezuzah.


Only for Ashkenazim/Chasidim. Sephardi and Mizrahi hang them vertically. https://www.myjewishlearning.com/article/ask-the-expert-slan...


  Why compromise when you can do both?   With three spaces everyone wins!


This doesn't paint Python in a better light to me. Two spaces is not so that you render two spaces, it just made it easy to distinguish the end of a sentence. Calling it mostly propaganda just shows even then the Python crowd got strings of the day wrong.

Is also why two line feeds indicates end of paragraph. Works way easier than requiring container markers. Disagree? Just look at in house Java projects to see how many screw up the doc thinking they did multiple paragraphs.

And yes, that is the same thing. Books do not have extra space between paragraphs. It is a stylistic choice, period.


How much of the book spacing is to save physical space? A fair number of larger / more expensive books that I have on hand, and nearly all non-fiction (especially technical) ones, do have additional space between paragraphs.

Potentially all the paperback novels I have within reach do not, though, agreed. They also have far smaller margins, tighter spaces between lines, and thinner paper.


Not sure that matters. My point was that it was a frugal markup. You don't do two spaces because you want more space, per se. You do it to signal a period at the end of a sentence.

Similarly, we don't ever start a paragraph with spaces. Very common to indent opening paragraphs, though. Why don't we space at the first sentence? Because we already have a clear marker for that.

To that end Word is correct to flag two spaces. In all views. Because it is a wysiwyg. You are not writing a markup, you are seeing the rendered target.


If people are double-spacing on a wysiwyg, doesn't that imply it's not frugal markup?


They are using markup where they shouldn't be. If, in Word, I type italic text using stars, that just means I'm wrong in that context. Similarly, if I use escapes, such that I expect \n to add a new line, I'm just wrong. That doesn't make it less useful where it originated.

Edit: I amusingly had to try twice to get the italic text there. Double stars did not bold. I don't know how to escape the stars.


Word has never had markup though. Nor have most people ever written markup. Why would they be trying to write markup?


I would wager most folks who use Word have never been in the two-space crowd. Especially not people who learned on Word.

Just like most people that write on paper don't leave extra space after a sentence. We do indent paragraphs commonly. At least, if you are writing somewhere you want it indented, it is incumbent on you to do so. :)


Wow, that webhistory mailing list archive is quite the treasure trove...


I got to thinking more about this and posted an essay to my blog [1]. Looking back, I wonder if browser developers could have applied the algorithms that TeX uses and perhaps web pages would look slightly more elegant. We take for granted the world around us. Web browsers work a certain way; web pages display the same space between sentences as between words. We forget that many things we take for granted are the product of a social process (or “social practice” as Mao Tse-tung famously said). In this case, something you probably took for granted (and probably never even thought about) was the product of a brief discussion in the summer of 1993 among developers who each had their own beliefs about what was right and good. Reading the archives of the www-talk mailing list reveals how part of our world came to be.

[1] https://danielkehoe.com/posts/personal-history-punctuating-t...


The very next comment thread is about unreasonable load that web crawlers create. http://1997.webhistory.org/www.lists/www-talk.1993q3/0001.ht...

Great proposal from Nathan Torkington that search engines be called "Josie and the Pussycats". Somehow Larry and Sergey didn't get that memo.


Thank you for this! I've been complaining for a long time this is why it happened - browsers eat white space.


Another piece of evidence in favor of "the debate rages" (though IMO more in favor of double-spacing): if typographers were even slightly unified in this, you'd see period-double-space ligatures that render as period-single-space.

Instead, I'm not aware of any common fonts that do this.


I'm not sure extra space after a period is a convention in TeX, meaning it is added only when required by aligning text, not after every period.


Not sure if you mean whether it's convention to put an extra space in the source input or in the rendered output, but 'extra space' (as opposed to "an extra space character") is definitely rendered in the output.


AFAIK it is indeed a convention in TeX, and LaTeX. See https://tex.stackexchange.com/tags/frenchspacing/info


Since someone must, I’ll go ahead and propose we instead use a single tab after periods if it’s the end of a sentence. :)



