I learned old-school printing in high school. We had a press with a flatbed and screw. More to the point, we had lower and upper cases filled with steel type, including lots of different dashes and spaces. Our teacher was a traditionalist. He taught us such things as how to justify a line of type. Narrowing a line involved switching "thicks" (nominal space slugs) for "middles" in an artful way. Widening it involved switching thicks for two "thins", and so forth.
We had plenty of dash characters--we had hyphens of course, and figure dashes, and en dashes, and em dashes. Figure dashes are one en in width, with the vertical position of the visible element designed to match the numerals in the font. This let us set a sentence like so with subtle perfection.
Type designer John Baskerville (1706‒1775) owned a printing-works and type foundry in Birmingham.
Does anybody notice this sort of thing? Probably nobody except book designers. Does anybody care? Well, my mom said she cared when I pointed it out to her.
From that teacher I learned to respect the choices of the type designer and how to use those choices. His tradition--now mine--dictates that all these dashes be typeset closed, without leading or trailing spaces. In the OpenType era, a type designer may choose to leave a speck of white space around a dash, but it's up to the designer, not the typesetter.
Text rendering in software like web browsers and word processors is gradually catching up to the tradition. Adobe's InDesign product does a great job of all this. Recent versions of MS Word are great improvements.
It helps to know the HTML entities for these items if you'll play around with them in your web pages.
&mdash; em-dash
&ndash; en-dash
&#8210; figure dash (using an en-dash is mostly acceptable)
&#8209; non-breaking hyphen
&emsp; one em of space
&ensp; one en of space
&nbsp; the much-misused nonbreaking space (use an en space for   ktksbai)
&thinsp; thin space (0.2-0.25em)
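For anyone who wants to check which character each entity actually produces, here is a minimal Python sketch. It assumes the list above refers to the standard named entities, plus numeric references for the figure dash and the non-breaking hyphen; the `ENTITIES` list and the `describe` helper are names made up for this illustration.

```python
import html
import unicodedata

# Entity spellings assumed for the characters listed above. The figure dash
# and the non-breaking hyphen are written as numeric character references.
ENTITIES = [
    "&mdash;",
    "&ndash;",
    "&#8210;",
    "&#8209;",
    "&emsp;",
    "&ensp;",
    "&nbsp;",
    "&thinsp;",
]


def describe(entity: str) -> str:
    """Return 'entity U+XXXX UNICODE NAME' for one HTML entity."""
    char = html.unescape(entity)  # decode the entity to a single character
    return f"{entity:<9} U+{ord(char):04X}  {unicodedata.name(char)}"


for entity in ENTITIES:
    print(describe(entity))
```

Printing code points rather than the glyphs themselves sidesteps the problem that several of these characters are invisible or look alike in many fonts.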
People who use [La]TeX tend to become a bit obsessive over typographic details, as well. What do you think of that program? (Very interesting comment, by the way.)
The hardest part about using LaTeX is getting to the realisation that to replicate InDesign, you might need OpenType support that might not ship by default, as well as needing to run it more than once. Once you have latexmk.pl [1] and XeTeX [2], you can then achieve amazing results [3] with the right modules and some creativity. It's just not quite as fluid to use as InDesign for graphically-minded people.
Or you can use make, or a bash script. OpenType support has been standard for years; the preferred engine is LuaLaTeX: https://lwn.net/Articles/731581/
FWIW, as a relatively novice LaTeX (in combination with the Beamer program, largely for technical presentations) user, I can imagine why such people nit-pick over typography. The Computer Modern and its derivative Latin Modern fonts are beautiful---not to mention their inimitable flexibility.
I'll blame two of my colleagues for indirectly 'inspiring' me to spend more time than justified, tinkering with LaTeX and TikZ (it lets you produce "vector graphics from a geometric/algebraic description"). I'm not happy with the amount of time I spend with LaTeX+TikZ, but I find the results to be satisfying and very convincing. Have to diligently chip away at it to get more productive with them.
Something about LaTeX just doesn't click for me, and you'll never in regular use have any need for pure TeX. Luckily, there's ConTeXt which (unsearchable name and comparatively small support base aside) is a very enjoyable flavour and allows for some beautiful typesetting. Maybe not worth it to switch if you're already used to LaTeX, but if you've never used any of the family, I highly recommend giving one or the other a try.
If by "pure TeX" you mean Plain TeX, I use it, as well as LaTeX (I know I'm far from alone here). I agree that ConTeXt is a great project, with many advantages over LaTeX.
This comment made me smile, because it's so true :)
After my college days using LaTeX, support for ligatures arrived in browsers and I would spot those (fi, ff, ij, ft, and perhaps others) and post them on Facebook.
HYPHEN is not much used since in most fonts it's identical to ASCII ‘-’. MINUS SIGN is meant to be visually congruent with +×÷=, etc (same stroke width and appropriate alignment).
Personally I have - – — ‐ ‑ − on my usual keyboard layout.
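To see how these look-alikes differ, here is a small Python sketch that prints the code point and Unicode name of each. It assumes the six characters in the comment above are hyphen-minus, en dash, em dash, hyphen, non-breaking hyphen, and minus sign, in that order; `DASHES` and `dash_table` are illustrative names only.

```python
import unicodedata

# The six characters assumed from the comment above, written as escapes so
# they survive any font or encoding.
DASHES = "\u002d\u2013\u2014\u2010\u2011\u2212"


def dash_table(chars: str) -> list[tuple[str, str]]:
    """Pair each character's code point with its Unicode name."""
    return [(f"U+{ord(c):04X}", unicodedata.name(c)) for c in chars]


for code, name in dash_table(DASHES):
    print(code, name)
```

The ASCII character is formally named HYPHEN-MINUS, which is a fair summary of the double duty it does.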
Hyphen is also a Western name separator, as in Jean-Luc Picard. Using a hyphen where a dash belongs, because of its ease of use on a typical U.S. layout keyboard, is a common mistake. A lot of people type two hyphens to make a dash, as well, partially because of ASCII, and it was nice to update my phone and Mac recently and see that Apple is now automatically fixing that by default without making me hold down the hyphen—disrupting my typing flow.
You could add the translation before, but it’s cool to introduce a lot of folks who don’t know that to a proper dash.
The em dash is the most beautiful punctuation mark, all the more because it is rare --- and it is best kept that way. Certainly a sentence with more than two is imparsible. In fact, more than a few in a whole article is probably getting annoying. It's like the exclamation point in that way.
Once you discover the em dash, it is hard to restrain yourself. But I'm trying to come back to the comma. It does its work without fanfare, almost invisibly. The mature writer realizes his job is to tell the reader something without getting in the way.
I land on the side favoring a space on each side of the dash. Unconventionally, I prefer to ASCII-encode my em dashes with three hyphens. The traditional two aren't long enough. But two is better than one. I always have to reread sentences by someone who thinks the hyphen and the dash are the same thing.
> I land on the side favoring a space on each side of the dash.
Horrible; if you are going to set a dash open and use it for the purpose of an em-dash, use an en-dash.
Setting an em-dash open is ugly as hell.
> I prefer to ASCII-encode my em dashes with three hyphens.
Two-hyphens is conventional for an en-dash. Since a set-open en-dash serves the same purposes as an em-dash, a lot of people just use that instead of the three-hyphen conventional encoding of an em-dash.
Some people use set-open hyphens on ASCII in place of dashes, which is actually probably the best thing in a monospace presentation but ugly in proportional fonts.
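The two-hyphen and three-hyphen encodings described above are easy to convert mechanically. Here is a rough Python sketch, assuming exactly two hyphens mean an en dash and exactly three mean an em dash; the function name `ascii_to_dashes` is hypothetical, and longer runs are deliberately left alone because they are usually rules or separators.

```python
import re

EN_DASH = "\u2013"
EM_DASH = "\u2014"


def ascii_to_dashes(text: str) -> str:
    """Convert '--' to an en dash and '---' to an em dash.

    The lookarounds make each pattern match a run of exactly that length,
    so single hyphens and runs of four or more are not touched.
    """
    text = re.sub(r"(?<!-)---(?!-)", EM_DASH, text)
    text = re.sub(r"(?<!-)--(?!-)", EN_DASH, text)
    return text


print(ascii_to_dashes("pages 10--20"))
print(ascii_to_dashes("wait---what"))
```

Whether the result is then set open or closed is a separate choice; this only swaps the characters.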
>> I land on the side favoring a space on each side of the dash.
> Horrible;
I strongly disagree. A closed em-dash—like this—looks jarringly like the words either side of it are closely related, as if by an extra-long hyphen. For example, in the previous sentence, “this—looks” visually appears to me to be a single compound entity, which is ironic considering that the intention was to make them appear especially separated. The ideal spacing is a thin space on each side, but I find a full space totally acceptable.
Of course, worst of all is asymmetric spacing, like this— yuk.
Edit: I just realised a closed em-dash doesn't look too bad in these comments because there is a very slight space either side of it, but I still stand by the above. A closed em-dash is much worse in documents typeset in TeX, where the dash without manual spacing practically touches the letters either side of it.
Hackernews defaults to Verdana—a known terrible font-design. Dashes should be visibly thinner than hyphens. And the reply-box is in monospace, so I can't see the difference between the en- and the em-dash. In German we use the en-dash with leading and trailing space instead of the em-dash.
If anyone else was as surprised as I was to hear that Verdana was disliked, here's the most compelling argument I found against it:
While it was meticulously designed for on-screen legibility, these days higher resolution screens and anti-aliasing, etc. are a better solution than meddling with the letter shapes e.g., taller x-heights, bigger openings, etc.
One detractor did allow for Verdana when rendering fine-print/legalese at very small font sizes--as long as it was unlikely to be printed.
Verdana was exquisitely well designed for the special problem of on-screen readability on the kind of screens that dominated in the mid-1990s, but, yeah, it's not 1996 any more.
On most screens these days, you are probably better off, if using a font of Verdana’s vintage or older, using one optimized for print legibility.
AFAIK closed em-dashes are the more widely used version, and the fact that they are em-dashes gives away that they indicate no relation among the words they separate. Though this is kind of a bikeshedding problem where people don't differentiate between a hyphen and anything else, and sentences end up incomprehensible, or they trip you and you have to read and re-read them. Even many publishers just don't care. I'm actually content when people are kind enough to use something that's not a hyphen for indicating what em-dash does. IDK what en-dash is really used for, but apart from word pairs (like in love--hate) I'd rather not see them. My conception of dashes for daily use: hyphen for compound words or minus sign (too much trouble to distinguish), en-dash for word pairs, em-dash for marking asides in sentences (where they denote kind-of a different tone from parenthesised passages).
Today (as per this thread) I learned I've been mistaking en-dashes for em-dashes all along. With surrounding spaces, at that.
Em-dashes feel too long and it feels unnatural not having spaces in between words – there's always at least one, when using any other punctuation marks.
> Some people prefer the way a ‘space-en-dash-space’ looks. Sometimes when you use the em-dash people say, “What is that? I don’t like that big long dash thing.” Some technical writers think the n-dash is the only one to use.
> It’s not a big deal. I usually use ‘space-n-dash-space’ instead of the m-dash – just to keep everyone happy. You can see this ‘wrong-n’ method used in countless websites, magazines and papers as a replacement for the m-dash. If you use the ‘wrong-n’ method and use it consistently, it works fine and seems to keep the greatest number of people happy.
You can put a thin space before and after the em dash. This space is narrower than a normal letter space.
Example with a normal space:
The quick brown fox jumps over — and, sometimes, under — the lazy dog.
Example with a thin space:
The quick brown fox jumps over — and, sometimes, under — the lazy dog.
This Smashing Magazine article discusses spaces in more detail. At the bottom of the article is a list of different spaces and their corresponding HTML entities:
What font do you use to read hackernews? Because in Verdana those are exactly the same. Tried some other fonts in inspector but think your thin spaces could be even thinner :)
Incidentally the author of that article, Marcin Wichary, (a) has many awesome photos of historical computing and typography, among other things, almost all CC-licensed¹, and (b) co-wrote the Pac-Man Google doodle².
I don't know enough about punctuation to know whether he actually meant "em" when he wrote "en" in several places, but I'm going to assume he actually meant to refer to em-dashes some places, and en-dashes others.
I noticed few others seemed to favor it as much, but never wondered why. My friend hates the em dash, and I suppose I partly agree. She said she dislikes when people use it as commas – but parentheses are fine – which is pretty fair.
I never realised up until today (I am ashamed to admit) that the convention is for no spaces. I always offset señor em-dash.
However, I think my reasoning went thus. I put a space after ‘,’ unless the comma precedes a quotation mark. I similarly would never ever not put a space after ‘.’, nor ‘;’, nor finally monsieur ‘:’. Also to contrast with the way-one-uses-hyphens.
But an unbalanced em-dash, like so– would look preposterous so left open – is how I came to use it over time. But now that it has been brought to my attention I find that this– is not so preposterous and in a mono font this – does appear to be swimming in space. I may have to train myself–thus–though it may take some time.
If you prefer spaces around the dash, then use an en-dash. This is standard outside the US, though not universally. Most US publishers use the closed em-dash.
I put spaces around mine. I’m not sure if that’s the norm — let alone correct usage — but it tends to help readers differentiate it as a non-compound word structure.
Yeah? So? First, it does not say that one must not use spaces around em dashes. Second, so what if it did? This is English we're talking about. The main rule of English is that there are no rules. If you want a rule for English: it's to speak and write in such a way as to be understood -- who would fail to understand the whitespace around an em dash?! Not I. But missing whitespace around an em dash can be confusing -- is that a compound? did the author mistype a hyphen? Ugly. Whitespace is nice. Use it.
These conventions are rarely very strict but they do help communicate your ideas. Understanding them and understanding the impact of not using them is an important set of knowledge if you want to be clear and well understood in your writing.
It is similar to good typography in that it isn't required but that it will often make the reader enjoy or understand the work more when reading.
Obviously there have been many great writers who knew how and why they were breaking those conventions. E. E. Cummings was a notable example.
Nothing about allowed there. It’s about communication & understanding.
You are free to break whatever convention you like. It will just affect your communication clarity.
Understanding that (and how and why too) will help you be more effective.
It’s no different to saying “you can program it that way if you want but understanding the conventions & methods other people use may help you have more readable and efficient code”.
How are you failing to understand "the reader will expect certain conventions so following those will make it easier to communicate effectively with them"?
The downvotes show the grammar nazis abound. I'm with Churchill, who famously said of not ending sentences with prepositions that "that is the sort of thing up with which I shall not put!".
There are rules to making oneself well- and easily-understood generally, and to appearing to be an educated writer to other educated people. If you care about neither of those things then sure, there are no rules.
You can do it. Unless you have an editor, nobody is going to stop you. It just looks awkward and pretentious to anyone with knowledge of typographic conventions (as in pretending hard to be fancy while revealing your ignorance of standard practice.) It’s like using obscure words in an ungrammatical or unidiomatic way; again nobody but your editor is going to stop you, but anyone who knows the way those words are typically used will find it awkward.
The AP is a poor source for a typographic style guide. They are a wire service not professional typographers, and my speculation is that their conventions come from a context of typewriter manuscripts rather than properly typeset books. They eschew en dashes and use hyphens to indicate ranges (e.g. 10-20 instead of 10–20) – something no professional typographer would condone – because they are expecting someone with a typewriter to only be able to type hyphens, whereupon the choice is either one hyphen or two. So in a manuscript you get spaces around pairs of hyphens -- like so -- standing in for dashes. Their style guide is designed so that someone doing the newspaper typesetting can directly copy the typewritten manuscript over to the press without thinking about what kind of dash to use.
I use UTF‐8 except when I have reason to stick to ASCII. (One such situation is public mailing lists, where people’s misconfigured mail clients constantly butcher any non‐ASCII characters when replying.)
No editor is going to tell you not to do something that lots of people do. As the other reply says, you're being a tad pretentious over nothing much. Chill.
In professionally typeset material people use either spaced en dashes – like this – or un-spaced em dashes—like this—but spaced em dashes are unconventional and unprofessional, frowned upon by anyone who knows what they’re doing. Sometimes professionals put hair spaces or thin spaces around em dashes — like this — but on the web those often are rendered improperly, such as on this forum. Try copy/pasting this paragraph into a rich text editor and swapping between a few typefaces to see what the effect should look like.
I take it you never went through a juvenile Kerouac phase?
METHOD No periods separating sentence-structures already arbitrarily riddled by false colons and timid usually needless commas-but the vigorous space dash separating rhetorical breathing (as jazz musician drawing breath between outblown phrases)--"measured pauses which are the essentials of our speech"--"divisions of the sounds we hear"-"time and how to note it down." (William Carlos Williams)
edit: I kid, but those notes never left my adult mind and come to play to a lesser degree in my consideration in structuring letters, comments, and larger writings. I think the style has its place.
I'm incredibly happy that the em dash is gaining the attention it deserves. However, if honest—I am also a little upset that the good people of the comments section are tending towards placing white-space around each dash and the text it forms an abutment with.
Your own comment to me illustrates why some sort of space is better. In fact, I can't tell if you're being ironic or not.
Your post is difficult to read because the em dash looks the same as the hyphen later in the post.
The argument that clinches it for me is editing. If you use an em dash with most word processing programs, you will consistently run into problems with the dash and surrounding text being treated as single words. You could reprogram things to recognize em dashes in those scenarios, but the standard behavior exists because of the way the vast majority of people perceive text.
Personally, I think the solution is a thin space, or to drop em dashes in favor of en dashes or something.
I use em dashes frequently, and these issues are frustrating to me. It's also frustrating translating them to plain text.
Depends on typography traditions. In Russian, em-dash in the middle of a sentence is always typeset with short spaces. We also rarely use en-dashes (I think there are maybe two cases when it is allowed: in date ranges like 1940–1945 and as a bullet point in lists).
In my opinion — and let's face it, much of this is just a matter of opinion! — a normal space is too much. Try thin space (U+2009) for a more subtle effect. (The exact result may depend on your font, however.)
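If you want to try the thin-space version on your own text without retyping it, here is a minimal Python sketch. It assumes the goal is exactly one U+2009 on each side of every em dash, whether the dash was originally set open or closed; `thin_space_dashes` is a made-up helper name.

```python
import re

EM_DASH = "\u2014"
THIN_SPACE = "\u2009"


def thin_space_dashes(text: str) -> str:
    """Surround every em dash with a single thin space on each side.

    Ordinary spaces or thin spaces already next to the dash are replaced,
    so running the function twice gives the same result as running it once.
    """
    pattern = rf"[ {THIN_SPACE}]*{EM_DASH}[ {THIN_SPACE}]*"
    return re.sub(pattern, THIN_SPACE + EM_DASH + THIN_SPACE, text)


sample = "The quick brown fox jumps over \u2014 and, sometimes, under \u2014 the lazy dog."
print(thin_space_dashes(sample))
```

How much difference this makes depends entirely on how wide the font draws U+2009, as noted above.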
I agree; it is down to opinion. I'd like to make use of the thin-space — but it doesn't make much difference in the typeface I'm using to write this reply.
When the dash (whatever type) is used to separate parts of a sentence, in British use there will be spaces either side. If Oxford is the exception, then I'm not surprised – they're the exception to many British writing styles. Admittedly, these examples look closer to en than em length.
I try to avoid hyphen-based range notation. It's sometimes hard to tell if a single hyphen was intended to be a minus sign. Double hyphens look like ASCII em-dashes. I've never heard of the triple convention and would be confused if you neglected an inline explanation.
Here are some alternatives:
* English: from 1 to 3
* Some programming languages: 1..3
* Math: [1, 3)
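One caveat with those alternatives: they do not all denote the same set. "From 1 to 3" normally includes 3, the interval [1, 3) excludes it, and `1..3` is inclusive in some languages (Ruby, Kotlin) but exclusive in others (Rust). A small Python sketch of the two readings, with hypothetical helper names:

```python
def closed_range(first: int, last: int) -> list[int]:
    """'From first to last', with both endpoints included."""
    return list(range(first, last + 1))


def half_open_range(start: int, stop: int) -> list[int]:
    """The interval [start, stop): start is included, stop is not."""
    return list(range(start, stop))


print(closed_range(1, 3))     # [1, 2, 3]
print(half_open_range(1, 3))  # [1, 2]
```

So whichever notation replaces the hyphen, it is worth saying once whether the upper bound is included.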
I guess this is also the main purpose of the en-dash* : http://www.thepunctuationguide.com/en-dash.html . The problem there is that en-dashes can be inaccessible as in my case--typing on a phone.
* I actually originally missed your point because you forgot the hyphen between 'en' and 'dash'.
Two hyphens is plenty. Three is calling attention to it. One is pitiful.
While we're on this subject, wide spacing between sentence-ending periods and the next sentence is the Right Way (tm). In ASCII that's two spaces. We're not animals. Let's behave accordingly.
When the web was invented, we came to consensus there should be a single space after periods that close sentences in rendered HTML documents.
You can see the original discussion, on the www-talk mailing list in July 1993 [1]. In the thread "Space after Periods," Terry Allen (an editor at O'Reilly) advocated for rendering more space after a period that closes a sentence than after a period that marks an abbreviation (in keeping with TeX and troff conventions).
I proposed that, "A WWW document (which uses proportional fonts) should have the same space between sentences as between words" and cited as authority "Words into Type" and the "Chicago Manual of Style." In 1993, "WIT" and "Chicago" set standards for publishing much like RFCs do for the Internet.
Terry Allen and I engaged in some snarky backbiting, then Ken Chang of NCSA Publications said he preferred "'one space fits all' as writers of HTML really shouldn't need to know the fineries of typography" and finally Guido van Rossum complained that, "extra space after a sentence... is mostly propaganda by Knuth and Kernighan (TeX and troff)" and implored, "Let's keep HTML simple!"
If you don't like the way that browsers collapse spaces between sentences, you can blame me (and Ken Chang plus Guido van Rossum who clearly had issues with whitespace beyond Python).
> While we're on this subject, wide spacing between sentence-ending periods and the next sentence is the Right Way (tm).
This convention started with typewriters, and typesetters (e.g. my parents) will tell you that it is only considered correct in monospace fonts. This is also why HTML takes the liberty of reducing whitespace down to a single space character unless you go out of your way to use an &nbsp;.
> typesetters (e.g. my parents) will tell you that it is only considered correct in monospace fonts
People like to say that but it's never been explained to my satisfaction. Spaces are twice as wide in a monospace font. Why would you use more spaces when and only when you have a longer space?
> Why would you use more spaces when and only when you have a longer space?
I've also wondered this. Referring to this mock-up[1], I have the same monospace text with single vs double spaces after sentences, and proportional font with single vs double. The usual argument is that #2 is better than #1 (for mono), but #3 is better than #4 (for proportional). It feels like an inconsistency in the preference, and (as you pointed out) the reasoning doesn't make sense.
No, an em-width space between sentences with either half-em or third-em width space between words was typesetting convention for a long time before typewriters, and is where the triple (later double) spacing in typewritten manuscript convention came from.
Typesetting convention evolved to narrower and eventually mostly settled on equal-to-interword spacing after that typewriter convention evolved from earlier typesetting convention.
In short, wider sentence spacing goes back to the first English standards of the eighteenth century – including the first nine editions of the Chicago Manual of Style, and even the tenth still had it as an en-space – and only really started falling out of favour in the twenties. Typewriters had limited effect, but the automated typesetters that succeeded them were actually what killed it off, to simplify the programming.
I typically do still type a single space, but I take full advantage of my preferred TeX flavour's (ConTeXt) automatic conversion to wider spaces. One successful, post-education convert here!
But what of sentences that end with an abbreviation that ends in a period, like "U.S."? If it's just one space after a period, it's impossible to know that the sentence has ended? I realize in books or published articles this can be typeset in a smart way to make it distinguishable, but what about in ordinary writing?
Exactly. "I live in the U.S. I am happy." requires a bit more cognitive work to parse if spaces after sentence-ending periods are not wider.
EDIT: For completeness, here is that same text with an extra space: "I live in the U.S. I am happy." -- see how much easier that is to read? Sure, it's not that big a deal, but it's definitely kind to the reader.
Regarding your edit, it appears exactly the same :(. The browser collapses whitespace. I'm not sure if HN's comment system will allow you to double-space after a period, other than in a <pre> block:
I live in the U.S. I am happy.
I live in the U.S.  I am happy.
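For anyone curious why the double space disappears outside a <pre> block, here is a rough Python approximation of what a browser does to ordinary text. It assumes default whitespace handling, where runs of spaces, tabs, and line breaks collapse to one space while a no-break space (U+00A0) is left alone; `collapse_whitespace` is an illustrative helper, not any browser's actual algorithm.

```python
import re

NBSP = "\u00a0"


def collapse_whitespace(text: str) -> str:
    """Collapse runs of spaces, tabs, and line breaks to a single space.

    U+00A0 is not in the character class, so it survives untouched.
    """
    return re.sub(r"[ \t\r\n\f]+", " ", text)


double_spaced = "I live in the U.S.  I am happy."
print(collapse_whitespace(double_spaced))  # the extra space is gone

# Pairing a no-break space with a normal space keeps the wider gap.
kept = f"I live in the U.S.{NBSP} I am happy."
print(collapse_whitespace(kept) == kept)
```

That is the workaround mentioned earlier in the thread: the only way to get a wider sentence gap in rendered HTML text is a character the collapsing step does not touch.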
Full stops aren't generally used in abbreviations here, Mx Cryptonector.
Though there's a more recent trend to write "Nato" rather than NATO, which annoys me for the inconsistency that arises. Plenty of people pronounce VAT as one word, or PAYE, but these are usually left in capitals.
In TeX, inter‐sentence spaces are visibly wider than normal spaces, including those that come after abbreviations. And when justifying text, they stretch faster than spaces between words.
The compose key and its composition tables are a thing of beauty — you can even compose an uppercase ß (ss) these days: ẞ (SS). It's such a shame that neither Windows nor MacOS (or the mobile operating systems for that matter) provide this feature out-of-the-box.
Mac OS has supported a similar system since the classic era – for example, https://en.wikipedia.org/wiki/Compose_key leads with Compose + ~ + n = ñ, which is Option + ~ + n on Mac OS.
The newer system is simply holding down keys to see accented variants which feels very natural on touchscreen devices but is slower than keyboard shortcuts if you know them.
I'm OK with that. I'm in the U.S., and I comprehend my non-American English-speaking colleagues, but I don't try to be like them. The same should go the other way around. We're NOT going to standardize on one of the flavors of English -- variety is good.
Recognize, however, that it does annoy those of us outside America when something very straightforward — like the ö in my friend's name, or writing £ — doesn't work. Imagine moving to a hypothetical country with no letter C, and being told your name "must" be kryptonektor, or instead seeing %ryptone%tor.
Here in New Mexico ñ is very common in place names. And I’ve known people with accented names who, though they don’t insist on it, appreciate (sometimes visibly brightening) when their names are spelled accurately.
Follow-up tip: also on a Mac, go to System Preferences → Keyboard and check “Show keyboard and emoji viewers in menu bar”. Then, hit the icon just added to the menu bar and select “Show Keyboard Viewer”. With it open, hold down Option, Option+Shift, etc, to live-preview what will be typed.
For symbols not there, hit Control+Command+Space to open the emoji/unicode typer. Hit the icon in the upper right to switch to a view where you can add favorites, like those arrows above.
Some favorite shortcuts of mine besides the dashes:
- Option+; → … (proper ellipsis)
- Option+[ and Option+Shift+[ → “ and ” (admittedly, typing in an editor with smart quotes is better)
- Option+] and Option+Shift+] → ‘ and ’
- Option+= → ≠
- Option+Shift+= → ±
- Option+x → ≈
- Option+, → ≤
- Option+. → ≥
- Option+Shift+8 → ° (degree symbol, don’t confuse with similar characters)
- Option+p → π
I read something related to Perl 6 work that resonated: Unicode opens up a lot more possibilities than we’ve taken advantage of, we just don’t yet have the hardware to make it convenient. Keys labels as screens is a perennial idea; maybe once somebody does a great job of it, we’ll see those Option shortcuts frequently used.
- Greek characters from Vim’s digraphs, so p star = π, l star = λ, &c. (Looks like WinCompose’s defaults now actually include asterisk-first ones, star p → π, so I might drop my additions.)
- Emoji, :), :D, XD, :/, :S, &c.
- Improved curly quotes, whereby I super-conveniently type curly quotes all the time: ;; → ‘, :: → “, '' → ’, "" → ”. (Look at them on the keyboard to figure out the reasoning. ∷ is available as 2:.)
I contemplated, but didn’t add, Vim’s box drawing digraphs; there are a few collisions, and I only ever use box drawing characters in Vim, so I’ll just use Ctrl+K there.
Aagh! I mean... and so it comes full circle. Does no one remember when "emoticons" were sideways ASCII art? Does no one wonder why entry (8) in a list gets turned into "open paren smiley face with sunglasses?" Kids these days...
Oh, I remember it full well. But even in those comparatively early days when only :-( and :-) had specific Unicode codepoints [HN filters out emoji, imagine the appropriate codepoints yourself], I preferred to use them to :-) and :-(.
Why do I use all the fancy Unicode things? Because they’re there!
If a particular compose sequence strikes you as sensible and suitable for everyone, you can submit patches upstream as well. I believe the base tables reside in Xorg, and GTK uses these as an upstream source, but I'm not completely sure.
I did this for vowels with macrons — ā (-a), ō (-o), ū (-u), in particular because they are useful in romanised Japanese.
Once you acquire a Compose key, you never can go back. `Compose - - -` = —. I write with curly quotes, hyphens, minuses, dashes, superscripts, Greek letters, &c. all thanks to my Compose key. (Constraining myself to ASCII is actually hard, and I only use straight quotes when it’s actually what I need, when programming.)
My last laptop ran Arch Linux; but I was willing to switch to Windows for the superb hardware that is Microsoft’s Surface Book, once WSL existed, and I knew that WinCompose existed—had either of these two items not been possible, I could not have switched to Windows.
The em-dash shortcut, system-wide remapping of CAPS LOCK to CTRL from the GUI in about 15 seconds flat, a top-level modifier key (command) that's respected enough by developers that it mostly retains its "I belong to the OS and, therefore, to the user, not to you, mere program" character, and a de facto standard package manager that leaves the base system the hell alone, almost never breaks, and can install many of the binary packages I want.
Also if I went back to Linux I'd have to figure out how to remap my window controls to Spectacle's defaults, which'd probably take forever or end up being impossible in the window manager I settled on or whatever.
> a top-level modifier key (command) that's respected enough by developers that it mostly retains its "I belong to the OS and, therefore, to the user, not to you, mere program" character
Really? Every program on OSX seems to use some weird combination of CMD/SHIFT/CTRL/OPT keys but I don't know if I've ever seen a single Windows/Linux program use the Windows key.
I had to program this and other special characters—notably •, ™, and ®—as an autohotkey script on my work PC. I was going insane. I already had autohotkey going to reverse the direction of Windows’ stubborn backwards scroll anyway. (Once I used so-called “natural scrolling” for a week it became impossible to understand why anyone would choose anything else.)
No. I’d get dizzy real fast if that very natural association between movement and perception were suddenly reversed though. (Somehow that didn’t happen when I started using natural scrolling, only when trying to go back from it.)
On *nix with xkb, you can get these by selecting the 'us(mac)' layout. You'll need an AltGraph key (i.e. Option), which is usually on AltRight in Windows fashion.
You're right. It is as you describe on my 2015 MacBook Air running 10.12.6.
But on my 2009 iMac running 10.11.6, using LibreOffice, option-hyphen did not produce an en-dash, but command-hyphen did. So either LibreOffice or some configuration setting somewhere on this Mac is responsible for this nonstandard behavior.
I use an Apple keyboard with a numeric keypad on the right. From left to right on the lowest row, the keys are Control, Option (Alt), Command, Space, and so on.
I was using LibreOffice 5.1.6.2 to try this. It looks like LibreOffice has remapped the keys.
As someone who takes flak from literati friends for his em dashin', this is a vindicating find—(!)thanks, OP.
I learned to use em dashes in the military on our performance reports (sort of like an annual resume with a grade attached).
For a few (dubious) reasons, every bullet had to be precisely 1 line long, +/- 0-3 spaces at EOL (we had almost as many rules regarding these bullets as we did the other kind, it seemed).
Since we had a lot of (usually somewhat independent) points to jam into 1 line, we often had lots of separate clauses with a period, semi-colon[1], or em dash. The em dash always seemed to fill precisely the correct amount of space.
I guess the Em Dash is like the Kurwa of punctuation:
> The most common curse word in Polish is kurwa, which can mean a variety of things - damn, bitch, fuck, {insert any word}, and can even serve as a comma.
Maybe I'm weirdly old fashioned, but I try to stick to ascii whenever possible - which means whatever character is stuck in the middle of this sentence. Ditto for the ellipsis, since it's not like three trailing periods at the end of a sentence can be mistaken for anything else...
This is a great example of how limiting ASCII is. It doesn’t even contain the full set of punctuation that correct (!) English requires. It supposes that you are satisfied using ' for both ‘ and ’, and " for both “ and ”, and eliminates all the dashes for the hyphen—which has to play double duty as a minus sign.
It also eliminates the dieresis from the language (naive, not naïve) and strips all accents from loan words.
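To make the trade-off concrete, here is a rough Python sketch of what squeezing text down to ASCII throws away. The `FALLBACKS` table and the `to_ascii` helper are hypothetical names made up for this illustration, and the substitutions are just one plausible choice, not any standard.

```python
import unicodedata

# Hypothetical fallback table: typographic punctuation -> nearest ASCII stand-in.
FALLBACKS = str.maketrans({
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark / apostrophe
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2013": "-",    # en dash
    "\u2014": "--",   # em dash
    "\u2212": "-",    # minus sign
    "\u2026": "...",  # horizontal ellipsis
})

def to_ascii(text):
    """Flatten text to ASCII, losing the distinctions described above."""
    text = text.translate(FALLBACKS)
    # NFKD splits accented letters into base letter + combining mark;
    # encoding with errors="ignore" then drops the marks.
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")

print(to_ascii("na\u00efve"))
print(to_ascii("\u201cyes\u201d\u2014no"))
```

Every character on the left of the table ends up sharing an ASCII character with something else, which is exactly the double duty complained about here.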
Question is: how much charactersetting is enough, and how much is too much?
I've been kicking around thoughts on Unicode and various codepoints, and which really ought or ought not be used.
It is decidedly complex. I've got in mind a simplified coding, though, for common interchange formats. Effectively a glyphic pidgin. Presuming "glyphic" is a word.
I use an old convention for approximating em dashes in ASCII by using two hyphens, set apart from surrounding words by spaces. Typographers might shudder, but I find the meaning clear -- its presence signifies the same sort of rhetorical devices that the article talks about. The surrounding spaces serve to make the deliberateness of the punctuation more obvious.
In many rich text generators, an em dash is often made with three hyphens, but for as-is plain text, I find three hyphens excessive. In my opinion, most of the usage of the en dash -- the rival, shorter dash -- occurs in situations where you wish to connect, rather than set apart, so while typing in ASCII for the things you'd use an en dash for, a single unspaced hyphen suffices.
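If you ever want to turn that plain-text convention back into real em dashes, something like the following Python sketch would do it. The function name `ascii_dashes_to_em` is invented here, and the rules (a triple hyphen, or a double hyphen with whitespace on both sides, becomes a closed em dash) are my assumption about the convention, not what any particular rich text generator actually does.

```python
import re

EM_DASH = "\u2014"

def ascii_dashes_to_em(text):
    """Convert '---' and spaced ' -- ' to a closed em dash.

    Single hyphens and unspaced double hyphens (e.g. '--verbose')
    are deliberately left alone.
    """
    # Handle triple hyphens first so they are not half-eaten by the '--' rule.
    text = re.sub(r"\s*---\s*", EM_DASH, text)
    # A double hyphen counts only when set apart by whitespace on both sides.
    text = re.sub(r"\s+--\s+", EM_DASH, text)
    return text

print(ascii_dashes_to_em("I find the meaning clear -- its presence is deliberate."))
```

Requiring the surrounding spaces is what keeps things like command-line flags from being mangled, which is one practical argument for typing them.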
I think you're just like most everyone else - and it's not really about ASCII. There's only one easily accessible dash on the keyboard. The distinction between hyphens, en- and em- (is there a better indication of the level of minutia involved than the fact the latter two not only look nearly the same but are also pronounced almost identically?) dashes is important for typesetters and typography enthusiasts. Everyone (statistically) else just smacks the thing near the top right of their keyboard and it's fine.
I don't know the difference between em-dash and en-dash. But I'd also wager that you don't stumble and come to a grinding halt when you see a - in the middle of a sentence, like a robot trying to parse words as arithmetic.
I've never, ever once had anyone complain to me that my meaning was unclear when using a - or ' in an email, but I've seen encodings get jumbled. There's no gain for me to not stick to ascii in 99% of cases.
I used to be an em dash over-user but have calmed down quite a bit in that department over the last several years. My current idée fixe is the archaic ellipsis as (over) used by Céline . . . I just can't bring myself to go with the single character scrunchy version.
I wonder how the em-dash has avoided this. It seems to just fade into the writing and make the phrases sound like the writer wanted them to sound.
Both ellipses and semicolons carry far too much baggage with them, and make a lot of people have strong feelings about the author, occasionally positive.
I've never understood strong reactions to unusual punctuation. They serve purposes that are not well-served by anything else. Inappropriate use I can understand, but that's just like misspelling. You don't advocate for elimination of the word "loose", for example, just because it's often misused.
Ellipses, for example, are useful for conveying implied but unstated thoughts. There's nothing else that functions the same way. Semicolons serve similar unique functions.
Properly formatted (ie, in LaTeX), the real ellipsis character is the only way to go. With fixed-width fonts it's ugly, but with fixed-width fonts most of the typography is in your imagination, anyway. The hideous [dot][space][dot][space][dot][space] approach needs to die.
The problem being that when you have to use four marks to indicate an ellipsis leading to the end of a sentence, you get an odd triple followed by a space….
(That doesn't bother me, to be honest. I'm justifying my own quite arbitrary taste)
For dialog, as used by James Joyce, or in typesetting other languages such as Spanish, French and Polish, the em dash (U+2014) is not used, but the quotation dash is (U+2015). The quotation dash is a bit longer. See, https://en.wikipedia.org/wiki/Quotation_mark#Quotation_dash.
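For anyone who wants to check which code point is which, Python's standard `unicodedata` module will print the official character names. This is just a quick sketch; the `describe` helper is a made-up name. Note that Unicode's formal name for U+2015 is HORIZONTAL BAR, with "quotation dash" being the informal alias.

```python
import unicodedata

# The dash-like characters discussed in this thread, hyphen-minus through U+2015.
dashes = ["\u002d", "\u2010", "\u2012", "\u2013", "\u2014", "\u2015"]

def describe(ch):
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

for ch in dashes:
    print(describe(ch))
```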
Philip K Dick, one of my favorite authors, used the em dash a great deal. In fact, I think I first noticed it in his stories. Without having been taught how to use or read it, I immediately understood PKD's intent -- and how the dash wasn't exactly necessary for anything but flavor. A mimic of other punctuation marks is an apt description!
The em-dash is like the AK-47: it feels good in one’s own hand, but one is distressed to find it everywhere one looks. If you find yourself wielding either often, you should stop and ask yourself: “Am I the bad guy?”
Just minutes ago I read two articles demonstrating the horror of a world awash in em-dashes, both in The New York Times [0, 1]. One may empathize with the writers, if one is so inclined: they have employed em-dashes for emphasis, to insert asides, and to lend needed structure to their sentences. But I think one should not be so sympathetic, because they’ve used 19 of the damn things (!), and every usage would arguably be improved by substituting another form of punctuation or by restructuring the sentence in a way that eliminates the need for complex punctuation entirely.
The ugliest em-dash is in the second sentence of the first article:
> A new class of security vulnerability — a variety of flaws that affect almost all major microprocessor chips, and that could enable hackers to steal information from personal computers as well as cloud computing services — was announced on Wednesday.
This is an abomination. The writer has needlessly separated the sentence’s subject and predicate by two em-dashes and 28 words in two phrases, burdening the reader with the task of untangling her mess. Did neither she nor her editor care enough about the reader to re-write the sentence trivially in a natural way? Try this:
A new class of security vulnerability was announced on Wednesday, a variety of flaws that affect almost all major microprocessor chips. The flaws could enable hackers to steal information from personal computers as well as cloud computing services.
This use of a double em-dash to insert additional information mid-sentence is almost always the wrong thing to do. Here’s another example from the second article:
> Over the next few years, hundreds of millions of dollars in American deposits flowed from Swiss banking stalwarts — institutions like Credit Suisse and Julius Baer — to Bank Frey.
I propose instead:
Over the next few years, hundreds of millions of dollars in American deposits flowed to Bank Frey from Swiss banking stalwarts like Credit Suisse and Julius Baer.
If you find yourself using em-dashes like this, pause and ask yourself why you can’t render the sentence more naturally. Most times you can, whether by restructuring the sentence, by using standard appositive phrases set off by commas, or by true parentheticals (preferably at the end of the sentence, not dividing the subject and predicate). If you can’t make that work, consider separating the information you wish to convey into two or more sentences.
Here are some more examples of what not to do. From the first article:
> And since the problem is built into the hardware — billions of chips that cannot easily be replaced — fixing this class of problems may also be prohibitively expensive.
Just replace “hardware” with the more detailed phrase and a comma: “... since the problem is built into billions of chips that cannot easily be replaced, fixing ...”
From the second article:
> She would return to the United States secretly carrying just under $10,000 in cash — the cutoff for having to make a customs declaration.
Just use a fucking comma!
Now, if you’ll give me a moment to collect myself in the face of all of this grammatical turpitude, I will say that there are two usages of the em-dash in these articles that I do think sensible. Both set off information in ways that other forms of punctuation can’t (except for the venerable parentheses), and they do so in ways that don’t interrupt the logical flow of the sentence:
> A common trick involves having the microprocessor predict what the program is about to do and start doing it before it has been asked to do it — say, fetching data from memory.
> Prosecutors said all the secrecy — the nameless debit cards, the scissored bank paperwork, the shadowy phone calls — showed Mr. Buck knew what he was doing was wrong.
TLDR: if you use em-dashes often, you might be a lazy writer with a bad editor.
Midsentence injection deserves its own thread. Is it a trend, or am I just noticing it more? It's hard on the brain, because you have to hold the first part of the sentence in memory while you take in the interruption. Your example is great, the one about the security vulnerability. Your revisions are much better in all cases.
It's actually my first impulse to inject these midsentence clauses. Is it because of a short attention span, or is it a fear of someone raising an objection if I don't qualify every statement?
I do it too, when I’m being lazy. It simply requires more effort to order one’s thoughts well.
The best reference on style I know in this regard is Joseph Williams’s book Style: Toward Clarity and Grace [0]. The book explains how to analyze your own writing to lessen the mental burden on the reader, mainly by re-writing your sentences so that there is a natural logical flow from subject to predicate to object, with detail added in phrases that don’t get in the way.
My sixth grade teacher cured me of this right quick. “Pick up a book,” she said, “—go on.” I did so. I looked up at her but she just looked back down at me. I opened the book. I looked at the spacing. I closed the book, looked back up at my teacher and said, simply, “I see.”
That selfsame teacher admonished me for turning in a creative writing assignment with single sentence paragraphs in dialog: “Paragraphs consist of a topic sentence, supporting arguments, and a concluding statement,” she insisted. I looked up at her, “But... books—!” She wouldn’t have it. “When you write a 40,000 word novel you can use one sentence paragraphs.” So, 19 years later, I did.
I am an unapologetic curmudgeon who uses two spaces. Or rather, my right thumb does. I learned to type on an IBM Selectric, I use mostly mono-spaced fonts, and two spaces after a full stop makes perfect sense. I went thirty years between learning it and thinking about it, and then some point-and-drool twee bloggers made it unfashionable.
> Or rather, my right thumb does. I learned to type on an IBM Selectric, I use mostly mono-spaced fonts, and two spaces after a full stop makes perfect sense. I went thirty years between learning it and thinking about it, and then some point-and-drool twee bloggers made it unfashionable.
I actually learned to never use double-space-after-period for text that would be rendered directly (e.g., not by a typesetting toolchain that would replace it with a more appropriate but not available in ASCII wide space) in proportional fonts before the Web existed, and it seemed to be fairly widespread advice. This is absolutely not something that was originated by “point-and-drool twee bloggers”.
> I learned to type on an IBM Selectric, I use mostly mono-spaced fonts, and two spaces after a full stop makes perfect sense.
It's still considered fine to do that when using monospaced fonts, just as using the straight typewriter apostrophe (') instead of a typographic apostrophe (’) won't raise any eyebrows in that scenario.
When your goal is to make something that looks like it was typeset by someone with an awareness of typographical conventions, then understanding those can be helpful. ("Twee bloggers" didn't invent them.)
"When your goal is to make something that looks like it was typeset by someone with an awareness of typographical conventions..."
If by that you mean typesetting, I agree. Typing and typesetting are two different activities.
LaTeX, for example, will typeset both ".(one space)" and ".(two spaces)" into the same white space that is somewhat larger than one typed space and somewhat smaller than two.
> If by that you mean typesetting, I agree. Typing and typesetting are two different activities.
Exactly, and many (non-developer) folks may not have ever experienced the "two spaces after sentences" convention if they haven't sat in a typing class with old-school typewriters, and haven't been instructed on formatting monospaced type.
I got into the habit of double spacing because it does make the TeX source code easier to read (and I often do just read that, rather than the typeset document), and also because emacs has a number of sentence based commands for which it uses the double space as a sentence delimiter.
I learned the basics on Mavis Beacon Teaches Typing, which must have assumed a monospaced font because it taught the double-space-after-period style. Took years before I learned that wasn't always how you're supposed to type, and another couple years after that to fully break myself of it.
Hacker News eats the second space. Which is fine. I'm not saying that typesetters should show two spaces between sentences, rather the edict against typing two spaces between sentences is wrong. Two spaces even has a (small) amount of semantic information. i.e. "This is a full stop and not part of an acronym or radix point, etc."
It's not Hacker News eating the space; it's actually how the browser renders text. That said, I agree. What I type into the computer is not typesetting commands; it's input to the renderer, whose job is to understand what I'm telling it and lay it out appropriately. So, putting one space after a period indicates an abbreviation rather than the end of a sentence; the renderer may or may not choose to lay these out differently, but at least it has the option.
HN could if it wanted to insert an <space> following a full stop. It doesn't.
One reason being Mr. Twee I-don't-use-two-spaces-after-a-full-stop, that it's awfully hard to tell if "Mr. Twee" is a sentence end or not, or if "stop. It" is supposed to be something else.
That is, in the typewritten manuscript, the double-space following the full stop carries meaning which the typesetting system (HN's HTML encoder, here), would have to guess at.
Or, you know, there's another way. Those of us who learnt to type properly can tip it off with our fuddly double-spaces.
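Here's a small Python sketch of that argument. The two helper names are hypothetical, and the whitespace collapsing is only a crude stand-in for what a browser does to runs of spaces in ordinary HTML text. With the double space intact, finding sentence ends is a one-line split; once it has been collapsed, "Mr. Twee" and a real full stop look the same.

```python
import re

def split_on_double_space(text):
    """Split into sentences wherever ., ! or ? is followed by two or more spaces."""
    return [s for s in re.split(r"(?<=[.!?])\s{2,}", text) if s]

def collapse_whitespace(text):
    """Rough imitation of HTML rendering: any run of whitespace becomes one space."""
    return re.sub(r"\s+", " ", text)

typed = "I met Mr. Twee at the full stop.  It was fine."

print(split_on_double_space(typed))
print(split_on_double_space(collapse_whitespace(typed)))
```

The first call finds two sentences without knowing anything about abbreviations; the second gets the whole string back as one piece, so a renderer working from collapsed text would have to guess.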
I see this article quoted all the time, and every time I read it I get frustrated again.
>But I actually think aesthetics are the best argument in favor of one space over two. One space is simpler, cleaner, and more visually pleasing.
Okay, great? So your best argument is that you subjectively prefer it over two spaces.
>The diners included doctors, computer programmers, and other highly accomplished professionals. Everyone—everyone!—said it was proper to use two spaces.
Ah, but our valiant author sure was ready to put them in their place!
I find this whole article oozes condescension and faulty logic, and I can't help but get my dander up whenever I see it referenced.
> Depending on the context, the em dash can take the place of commas, parentheses, or colons—in each case to slightly different effect.” The “slightly different” part is, to me, the em dash’s appeal summarized.
For me, it's the opposite. It's like using `auto` too universally in C++: sometimes it's helpful to know the type you actually intend. Although I concede that the overuse in the article is more likely because it's discussing it and making a whimsical point. :)
Yes! I use em dashes quite often. I tire of commas and parentheses. I do prefer two short dashes together, because I enjoy ASCII. I also use short dashes as parentheses sometimes, -like this-, but this doesn't always work well as some UIs use that for strikethrough.
I love dashes. I do tend to avoid ems for parenthetical remarks – for those, I'll use spaced ens – but I do definitely use the longer for interjections and (roughly) semicolons, as the article describes. Where I break from convention is that, if they're marking an interjection or other interruption, I will put a space in on a single side of the dash— like so —to better demarcate the logical flow. Then again, I also put a thin space before exclamation points, etc. if I can (though unlike French, nothing if I just have full spaces), so I have other reasons to dismiss the complaints about style.
Thanks for that, actually! You're the first to comment on the aesthetics rather than the "incorrect" usage. My thinking is that the dashes are part of the pragmatics of the primary, wrapping phrase, and don't actually interact with the inner one—they are therefore bound to the first and separate from the second. Whether or not it's a good look, the fact that doing so much better signposts the parenthetical makes it worth it, right? And that spacing rule is actually almost the same as with ellipses, though I'm definitely with you on that taking a while to look good to me as well.
Do you think it's just being unused to that layout, or is there some deeper issue to how it looks?
There's a lot in what you say. Not crazy at all. They only looked awful for about 10 seconds hehe. But sure, using them to help show the meaning is a good idea. I've recently learnt to write a = b*c + d, spacing assisting clarity. Carry on, I support you in your chosen mission.
This is awesome :) I've been using the em-dash a lot recently, but didn't even know it had a name. It often seems more natural than using semicolons— semicolons often feel more formal and cold.
I still prefer semicolons when connecting two related thoughts, whereas it seems that em-dashes are most often used as a pause or a pseudo-comma. The brilliance of the semicolon is that you can connect two or more closely related sentences without a full stop. Em-dashes can't be used to connect multiple complete sentences. Semicolons are really more powerful than em-dashes, less pretty though they may be.
I know this because I've read Knuth's The TeXbook. I learnt so much about typography from that book. And yes I did find several places to use an em dash in my PhD thesis.