Hyphen Hate? When Amazon went to war against punctuation (graemereynolds.wordpress.com)
489 points by Jare on Dec 15, 2014 | 172 comments



Amazon is probably trying to correct publishers who copy+paste their hardcopy texts (from InDesign or wherever) into eBook format, taking with them artifacts from the print designer, like forced hyphenation. These artifacts from the print world are not useful in eBooks, since the reader is automatically breaking up lines based on the screen size, font, etc.

http://indesignsecrets.com/field-guide-to-indesign-hyphens.p...

I presume that this is what happened because a) she has a paperback & eBook edition of this book, b) this happens a lot, and c) someone complained. I would hope that Amazon would do a search in the eBook and not only look at the total hyphens, but also find a few examples of words that were probably broken up for print layout reasons. If so the author should remove the hyphens and resubmit the eBook.

If that's not the case, it's very silly. Obviously hyphens are useful and shouldn't be banned.


This would be easily rectified if they looked up the words on each side of the hyphen. If they are actual words, then that hyphen should not count. If they are just partial words, then they should. But removing a book from circulation based on an obviously shoddy algorithm should never happen. As a Kindle author myself, I'm having second thoughts about the platform. Censorship by robots isn't exactly an attractive feature.
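A minimal sketch of that dictionary check, in Python. The word list here is a tiny stand-in and `suspicious_hyphens` is a hypothetical helper name; nothing here reflects what Amazon actually runs. It flags a hyphenated token only when at least one side of a hyphen is not a known word:

```python
import re

# Tiny stand-in word list; a real check would load a full dictionary (assumption).
KNOWN_WORDS = {"red", "haired", "blood", "soaked", "two", "week", "mother", "in", "law"}

# Matches runs of letters joined by one or more hyphens, e.g. "mother-in-law".
HYPHENATED = re.compile(r"[A-Za-z]+(?:-[A-Za-z]+)+")

def suspicious_hyphens(text, known_words=KNOWN_WORDS):
    """Return hyphenated tokens where at least one part is not a known word."""
    flagged = []
    for token in HYPHENATED.findall(text):
        parts = token.lower().split("-")
        # A leftover line-break hyphen usually splits a word into non-words.
        if not all(part in known_words for part in parts):
            flagged.append(token)
    return flagged
```

With this word list, "red-haired" and "mother-in-law" pass while a token like "hyphen-ated" is flagged. Anything missing from the dictionary (surnames, invented words) would be flagged too, so the output is better treated as a list for a human to review than as a verdict.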


> If they are actual words, then that hyphen should not count.

That would have side-effects e.g. https://en.wikipedia.org/wiki/Double-barrelled_name


Of course there will be exceptions, and it wouldn't need to be perfect. It just needs to be better than counting hyphens alone, which the suggestion above inarguably is.


Not to mention fantasy/SF love of invented languages. I'd hate to be HP Lovecraft if this becomes common.

Or have to quote transliterated Arabic, for that matter.


Names are proper nouns and therefore capitalized, so the edge case is resolved until you find a proper noun with capitalization in the middle of the word with the hyphen coming before the capital letter. This seems to be limited to foreign words (German in particular) and company names where the author inserted a hyphen. https://en.wikipedia.org/wiki/CamelCase#Current_usage_in_nat...


Well then they shouldn't do it at all or rethink their algorithm altogether.


Or do the sane thing most other people do: back up an imperfect but effective algorithm with competent human review.


Check the comments after the post. That's mentioned as a possible reason and the author says that's not the case. Apparently Amazon is complaining about real hyphenated words, not line-break hyphen-ated words.

[edit: It appears, though, that the author is stylistically abusing hyphens, so while they may not be wrong, they're grating to most readers. So, should Amazon be in the business of banning books on the grounds of poor style, rather than technical grounds that are inarguable? (If the book contained the word hyphen-ated, that would be wrong unless it was dialogue and the speaker was pausing between syllables.)]


If "stylistically abusing" punctuation is worthy of blocking an ebook, then I have to assume Huck Finn and Cloud Atlas are next on the list of books to remove for their heavily punctuated attempts to represent speech patterns.


furyg3 knows that the book only has proper hyphens. Amazon took down the book because they thought it was full of line break hyphens. It wasn't, but that's what Amazon thought because they never actually looked.


So really the title of the post should be "Collateral damage when Amazon went to war against bad punctuation," which casts Amazon's motives in a different light.


I don't believe that is satisfactorily disproving this position. It is very very easy to believe that the automated system trying to prevent lazy formatting is just interpreting a large prevalence of hyphenated words as line break hyphens.

Graeme merely asserts that his book was free of those, which we knew going in. It does nothing to say why Amazon was raising the issue, and it wouldn't shock me that an automated system would be unable to tell the difference.


Thanks for making that point, this makes a lot more sense now. I really hope this is the case. (Though the author does state in a comment that they don't have any of those errors in their text.)


> she has a

Btw, the author's name appears to be "Graeme", which seems to always be male:

http://www.behindthename.com/name/graeme


This explains the strangely-located hyphens I've been seeing in a book I'm reading currently. This must be exactly how the book was submitted.


So, I looked through the preview of the author's "High Moor" which is available on Amazon.co.uk and I see things like:

"One of them picked up what looked like a dog's hind-leg and put it in a white plastic bag."

"With the two-week Easter holiday just over a month away, ..."

"The woman's white hair was hidden by a red-head scarf, ..."

"The door of one of the expensive motor-homes opened ..."

Seems to me Amazon may have a point about some of his use of hyphens.


I'm quite sure that "two-week Easter holiday" is correct in both English(US) and English(UK)[1]. The use of the hyphen in such examples is somewhat analogous to Haskell's use of "$" as a precedence operator, so we know which word modifies which.

But you're correct that the other examples are erroneous, some egregiously so. I had to read the "red-head scarf" several times before I was finally able to parse out the presumed meaning: "red headscarf".

1. Source: me. I worked for 10 years as a translator, editor, and proofreader for Lionbridge and many other multinational publishing and localization companies before moving into software development.


Yes. I wasn't clear enough. I was not implying that "two-week Easter holiday" was incorrect (although I think it would be OK to go with "two week Easter holiday"). My point was that 3 of 4 uses of hyphens were wrong.


> My point was that 3 of 4 uses of hyphens were wrong.

Wrong according to whom?

We're not talking about writers for a newspaper or magazine, where everyone is expected to adhere to a single style for the sake of uniformity.

James Joyce did a number of things "wrong" both grammatically and stylistically, but that was part of the artistic expression. It's weird, to say the least, for a bookstore to enforce style. When I walk into a physical bookstore, I expect that the bookstore has made some decisions around book placement[0] for the sake of optimizing sales, and books that have bad grammar may not be as attractive. But I don't expect that they have made decisions directly based on the grammar of the books ("we dislike the way you use punctuation, so we're going to put your book in the back of the store, or take it off the shelves altogether").

Amazon is now trying to eliminate the need for publishers, but in that process we should be clear not to conflate the role of the publishers (which includes editing) and the role of the marketplace.

[0] Incidentally, most brick & mortar bookstores are organized by genre and then either title or author name, so the extent to which they can even 'demote' books is generally limited to the ability to place books "cover out" instead of "spine out". With the exception of keeping adult-themed books in a special section and promoting certain books in a themed display (e.g. "Staff Picks for Summertime Reads"), bookstores don't really take heavy action against books based on content.


The writer is no James Joyce.


That's a really poor response.

When Joyce published his first novel as a nobody, people like you would've said something equally thoughtless about his grammar, followed up by a glib "the writer is no Arthur Conan Doyle."


James Joyce followed proper English when he published his first novel as a nobody. (And his second novel, and third.) He was also a professor of English at the time, so he wasn't really "a nobody" even then. So he may not be the best test case.


His point entirely stands despite your carelessly dismissing one part of it.


Absolutely agree. It is called 'artistic freedom' after all. Right?


Style is not the same as making a mistake.


And who should be the arbitrator of style vs. mistake?


It's easy enough to ask the person that wrote it. Why, do you think answering that is typically difficult or unclear?

And to be clear I mean asking about specific instances, so don't point back at the blog post, which seems to be a case of Amazon making a mistake. All of the hyphens quoted from Moonstruck are grammatically correct.


True, if the process just involves asking the author then I have no issues with that, and it should be easy to resolve.

I do think it is difficult for anyone other than the author to answer that question though. For me, only the "red-head scarf" example would be a mistake, but others in this thread indicated that 3 or 4 of the examples were wrong for them.


"Wrong according to whom?"

According to the customer who complained about them (who is, after all, the person with the money), and anyone else who's familiar with standard English usage.

Consciously choosing to break the rules and breaking them because you don't know any better are two different things.


> According to the customer who complained about them (who is, after all, the person with the money)

There is a time honoured tradition regarding what to do if you have money and don't like the artistic style of the author, and "demand their publisher pulls the book or forces the author to change their style to suit you" isn't it. (Not to mention all the happy customers who gave the book 4 & 5 star ratings and didn't complain, who are, after all, also people with money)


> Consciously choosing to break the rules and breaking them because you don't know any better are two different things.

And how did Amazon decide on which was the reason for his actions?


The preview of Moonstruck (the book in question) is also available through Amazon. Here are the first few hyphens I can see:

"There could be no-one there, ..."

"...into a razor-filled muzzle ..."

"... sleep, muscular, brown-furred monster."

"... through her blood-soaked fur."

"... propelled backwards in mid-leap, ..."

"... the red-haired woman ..."

"... a tall, grey-haired man ..."

"mid-thirties"

"mother-in-law"

Not my style of fiction, but most of these don't offend my sense of grammar either. I think "mid-leap" is the only one that slightly jars.


Beginners mistake. These are minus (-) not hyphen (‐) characters. </unicode-jibe>
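For anyone who wants to check which character a manuscript actually contains, here is a small Python sketch using the standard `unicodedata` module. The ASCII "-" is U+002D HYPHEN-MINUS, while the dedicated hyphen is U+2010. The set of code points and the `dash_census` name are my own choices for illustration, not anything Amazon documents:

```python
import unicodedata
from collections import Counter

# Dash-like code points to look for (an illustrative, non-exhaustive set).
DASH_LIKE = {
    "\u002d",  # HYPHEN-MINUS (the ASCII key)
    "\u2010",  # HYPHEN
    "\u2011",  # NON-BREAKING HYPHEN
    "\u2012",  # FIGURE DASH
    "\u2013",  # EN DASH
    "\u2014",  # EM DASH
    "\u2212",  # MINUS SIGN
    "\u00ad",  # SOFT HYPHEN
}

def dash_census(text):
    """Count each dash-like character in text, keyed by its Unicode name."""
    return Counter(unicodedata.name(ch) for ch in text if ch in DASH_LIKE)
```

Running it over a chapter shows at a glance whether the text mixes several look-alike characters, which is the kind of thing a screen reader or an automated check might trip over.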


You are being downvoted but apparently that is what /r/books is claiming the reason is: https://www.reddit.com/r/books/comments/2pcnv2/amazon_remove...

I guess it causes issues with bad screen readers.


Beginner's mistake. "Beginners mistake" can only be used as the start of a sentence as in "Beginners mistake azure for blue."


Some beginners mistake "Beginners mistake" as a phrase that can only be used at the start of a sentence. :)


I'd say most of that usage is unnecessary clutter. Hyphens are intended to disambiguate grouping of word pairs. I haven't heard of anything called a "dog hind" that could have a leg.

"The woman's white hair was hidden by a red-head scarf, ..."

That's incorrect, isn't it? It's reinforcing the potential confusion that the hyphen is meant to avoid. Unless she really was wearing a red scarf made from someone's head?


I don't know, in mathematics we say that we have a K-vector space when it really means K-(vector space).


To be fair, "two-week" isn't wrong... but it's in bad company.


Yes. What they were going for was "red head-scarf", but that is also questionable. I've only ever seen "head scarf" unhyphenated.


I can't remember where I came across it (possibly Gowers' Plain Words), but I like the rule that you should hyphenate to avoid ambiguity but not otherwise. E.g. is "a red head scarf" a red scarf for the head or a scarf for red heads? From context, it should be obvious, but hyphenating "head scarf" would make it explicit. (Of course, you could always avoid the problem entirely by using "headscarf" instead.)

There are some exceptions: e.g., I've read you should always write "Douglas-fir" instead of "Douglas fir" because it's not actually a fir.


If you need a hyphen to make clear what a "red head scarf" is, shouldn't you rewrite your sentence?


Well I don't know if this is right, but if I had to guess

red-head scarf might suggest that the flairs at the end of the scarf are red, and that it is just a normal scarf https://www.google.ca/search?safe=off&tbm=isch

whereas a red head-scarf would be a red... head scarf. https://www.google.ca/search?q=head+scarf&tbm=isch

Dunno. Needs more context frankly.


Yeah, all of those are pretty bad. I would have complained too if I had read this book.


There are one or two errors in usage, which are down to the editor not the writer, but the majority of them are correct (if not always necessary) in British English. His argument was based on the fact that Amazon told him to remove all of them rather than giving the book another proofread and removing the incorrect ones.


Two-week is correct.


Ah, yeah, good point. I didn't read that one.


I read that latter one as specifically meaning a scarf made out of red head


Here's the thing though - the writer should have the artistic liberty to use the punctuation they desire.


Yes, but there is a difference between intentionally using bad grammar to invoke ambiguity, and unintentionally getting in the way of a story because the writer doesn't know better.


Sad that he blew £1000 on an "editor" incapable of tidying up the hyphenation mess.


To be fair, we don't know how much worse the text was prior to bringing an editor on.


re: "red-head scarf", I'm skeptical that the artistic intention was that it was a scarf for redheads being repurposed for white hair. Though I have not read the story so that might be part of the plot for all I know.


It could be a scarf that somehow makes you look like a redhead.


Playing devil's advocate, Amazon should have the liberty to choose what to sell. Obviously, they cannot abuse, bad publicity is bad for sales, but in this case -IANANES- (I'm not a native english speaker) the examples sound bad. "red-head scarf" is -to me, IANANES- a scarf made of a red head or a scarf specially designed for redheads.


Good point, I've caught a few other things that grind my gears.

"Are there people out there so fucking mind-bogglingly stupid that the inclusion of a – between two words confuses them enough to be torn from the story and ruin the reading experience so much that they felt obliged to write to Amazon and complain?"

I think that the author insulting someone who have paid for his book is not something intelligent, nice or professional.

"Do they need to do something about the quality of the ebooks on their device? Oh yes. Absolutely no question about it."

I was going to argue freedom of speech and the ability to form ideas the way the author intended is what matters here, but the author himself doesn't seem to agree himself.


> I think that the author insulting someone who have paid for his book is not something intelligent, nice or professional.

That's fairly straight up UK english hyperbole. Calling the author stupid and unprofessional because you're unfamiliar with the way UK english works is a trifle unfair.

Edit: I'm specifically talking about being unfamiliar with that style of hyperbole here. I am not defending the hyphenation. That's why I didn't mention the hyphenation at all originally. But apparently that wasn't enough, so now I'm explicitly saying I wasn't talking about the thing I didn't mention. Hope That Helps, Have A Nice Day.


He didn't say the author was stupid, nor did he say he was unprofessional. He said the author's behavior wasn't nice. And it isn't. There's no question in any English reader's mind (US or UK) that the author's heavy-handed use of hyphenation is grammatically incorrect. The buyers of the book DID have their intelligence insulted, and on closer examination, it appears that those customers are actually correct.

I see no reason not to insist on some kind of quality here. We're not talking about poetic license or writing to reflect a dialect, we're talking about the author believing hyphens are correct for a given passage. In the vast majority of his usage, his beliefs are incorrect, and his lack of grammatical awareness (enforced by his own post) is making for a bad reading experience.


I'm still not sure the bookstore is the best place to go for grammar enforcement, rather than suggesting he hire an editor. But Amazon's foray into supporting hands-off self-publishing is at war with their need to have a minimum quality level to avoid scaring off customers. An actual working rating system would help, if anyone had managed to invent such a thing.

I'm also of the opinion that if I'm reading a book with such egregiously awful editing, I'd stop reading it, rather than ask for corrections. If the editing is bad, are the ideas any better?


I'm in a quandary: doesn't the bookstore have a right to set a quality standard (not a poetic one), and a right to protect itself from chargebacks from unhappy customers?

Is the phrase "self published" really correct for Amazon and others to use? It could be argued that Amazon is the one publishing it, and they're really letting you bypass the editors and sales reps from a print-based publishing house to get your works onto Kindles.

I really don't know. The situation makes me uncomfortable as a creator of content, certainly, but the author's railing against the company for protecting its customers and defending against liability from credit card issuers ... That just feels like someone bitching out of a sense of entitlement, and history shows (all over the world) that people who thoughtlessly feel entitled are rarely able to be convinced that they, in fact, are the ones that need to change.


I have edited my comment to make it clearer that you weren't replying to anything I actually said. Hope that helps.


I was replying to what you said. His hyperbole regarding unintelligent customers was the behavior that was called out as not nice, professional or intelligent.

His piss-poor use of hyphens (see what I did there?), however, isn't hyperbole. Hyperbole is a rhetorical device (and the greatest thing ever!), and unless he intended to evoke an emotional response in his readers by their incorrect and non-poetic application, it's just bad form; being from the UK has bugger all to do with it.


It's strange that you educate us on what hyperbole and rhetorical devices are, yet can't recognise both those things in a sentence which begins "Are there people out there...?". The author is saying the direct opposite; that people aren't that stupid.


> I was going to argue freedom of speech and the ability to form ideas the way the author intended is what matters here, but the author himself doesn't seem to agree himself.

The example he gives is of auto-generated books uploaded for the purpose of gaming algorithms for profit. I'm as much a free-speech fundamentalist as the next guy, but that is no more a free speech issue than the Nigerian governor emailing me the unique opportunity to take a cut of $80,000,000 worth of penis enlargement pills.


One of the things I like about the Kindle is the option when reading to highlight a block of text and flag it as an error, which I've previously used for obvious typos, and the occasional badly placed hyphen. Having seen this and the implication that it can result in automated removal of a book from the Kindle store I think I'm going to stop that now.


That's fair, but they hardly hinder readability. I think that's well within artistic license and should hardly be automatically (!?) suppressed from the store.


It kind of threw me that he used the word 'semi-colon' in his defense about not having too many extraneous hyphens - apparently that is an acceptable variant, but the most common spelling I've seen has always been 'semicolon'. Is this a British thing? Or does he just really like his hyphens?


None of these are bad enough to ban a book. Even though "red-head scarf" is strange, it sounds like intentional wordplay in that sentence. Remember, writing is a creative activity. It's not engineering.


Those to me seem like a perfectly acceptable use of a hyphen.


'red-head scarf' certainly isn't an acceptable use of a hyphen, unless the scarf is woven from the hair of a red-haired person.


I read that as "A scarf that red haired people wear"


That would be a legitimate way to parse that phrase, but it's far more likely that this is a typo for "red head-scarf". There isn't a particular kind of scarf worn by red-heads, AFAIK.


Admittedly, I thought of the sexual double-entendre. Muff can make a nice scarf...

But I digress.


Motorhome is not hyphenated.


I blame spellcheck. In many cases the most legible solution is a German-style mashingwordstogether, but spellcheck systems highlight those so people hyphenate instead.

For example, spellcheck has a curly red underline right now. But would it really look better as spell-check?


Spell check, no hyphen. Easiest way to figure out what needs a hyphen IMO is to google it. Otherwise, just try to avoid hyphenated words, a simpler word will often do.


I think it would be more grammatically correct to call it "spell-check" or even "spell check" instead of "spellcheck".


Spelling check... you aren't checking spells.

At least, this is what I've been told by a very picky guy...


My guess is that the majority of complaints Amazon gets about hyphenation in ebooks are due to poorly converted ebooks, where line-breaking hyphens from a print-formatted source file have accidentally made their way into the ebook. Amazon support is generally not set up to spend more than 5 seconds looking at your email, so I surmise that they've filed it into the "yeah fix your ebook hyphens" category and are not reading carefully enough to realize that they've misfiled it.


Yet somehow they didn't pull A Dance With Dragons, which had appalling hyphenation in the original release.


You're blaming them for giving different treatment to an author who's at least a million times more important to their business?


I would expect more scrutiny of blatant formatting mistakes in a high profile release.


Or, in the time-constrained-support-worker model, a high profile release just means that they trust the publisher's quality control, and won't flag it for review as quickly as self-published content (which is highly variable in quality).

It's a poor QC process, not a stylistic or editorial review.


Yes. That's probably why he's beating up on Amazon....because they're picking on the little guys while ignoring the big ones.


Nice combative metaphors, but this is just too silly to engage with.

EDIT: Downvotes, but I'm sorry, I don't WANT to explain how it makes sense to expend more time and money on customers and suppliers that are a _million_ times more valuable to one's business.


The big issue here is not the hyphens. Other than maintaining formatting for different devices, the publisher should not mess with the content after an author-approved edit.

Traditionally there was the copy editing process done by the publisher, but automating this on a the-customer-is-always-right process would probably have left a lot of hard-to-digest classics nowhere to be found.

And seriously? Running a work of literature against a dictionary for other reasons than supervised spell-checking?

Literature is about expressiveness not about serfdom to grammar or spelling.

What about Neologisms? There would be no nerds without inventive authors like Seuss. http://flavorwire.com/291507/everyday-words-that-were-invent...

---

There surely still is a need for copy editing, proper formatting and layout, which was almost never done by the author alone, and also requires a different skill set than that of a typical author. These are the services a publisher should offer, but doing this without involving the author for final approval is just false.

---

see also:

https://en.wikipedia.org/wiki/Doublespeak

https://en.wikipedia.org/wiki/Solecism

https://en.wikipedia.org/wiki/Sic and https://en.wikipedia.org/wiki/Stet

https://en.wikipedia.org/wiki/Malapropism also Cacography

https://en.wikipedia.org/wiki/Catachresis


All of your examples relate to the knowing use of, let's say, "flavorful" language.

The author's "flavor" appears to be "hyphen-happy" - do you believe his intent was to style his writing that way for dramatic or artistic effect?


A traditional publisher would have most likely turned any random author down in the first place as it was the norm throughout the history of publishing with high initial costs and a few publishing houses functioning as gatekeepers.

Then there were the so called (vanity) self-publishing houses who more or less capitalized on the rejects, while asking for a quite significant amount of money first to print and rudimentary, if at all, edit whatever the author offered. Here the gatekeeping factor to the public was how much money you had to spare to get your books printed and listed.

Nowadays with digital publishing the initial production costs are so low, that suddenly copy editing and proofreading by a person other than the author becomes a significant part of the costs, which it wasn't before, since that cost was dwarfed by the actual printing costs and was normally considered a given.

With no money to spare for third-party editors/proofreaders/layout this job is outsourced to algorithms and mechanical Turks with questionable results.

I suggest the best way to improve quality and acceptance of self-published work is to actually pay a professional third-party person to proofread and copy edit your book first in a joint effort with the author. Only after that should you submit your work to a digital publisher. If Amazon then still rejects this based on customer complaints, then we have an actual problem at hand which is not just the absence of any kind of human copy editing.

---

Substitute 'damn' every time you're inclined to write 'very;' your editor will delete it and the writing will be just as it should be. — Mark Twain

https://www.goodreads.com/search?q=editor&search_type=quotes


The quote reminds me of the 'hairy arm' idea, which I quite enjoy. A quick Google gave me this: http://www.1099.com/c/co/in/insanity006.html


"and included a handy link to the Oxford English Dictionaries definition page which described it’s usage"

AHEM



Also, should be Oxford English Dictionary's. Gotta watch that punctuation. :)


Theirs always someone nitpicking...


I love your (presumably) deliberate grammar error given the context of this tangent.


> Theirs always someone nitpicking...

https://i.imgur.com/L2Rhc.png

Based on my limited experience, people who think grammar Nazis are annoying shouldn't be suspected of being too intelligent.


Language is about communicating ideas, I couldn't care less if someone leaves out that arcane little tick or uses the wrong pronoun. In 99% of cases it makes no difference to comprehensibility.


Not to someone who can't tell the difference, no. But that's true of anything.


They derail the conversation. Of course they're annoying..


I will never mind a "Hey, you misspelled «X»". I can agree "Ha, he misspelled «Y», he's such a complete moron" is annoying/disruptive, but I don't think the answer to assholes is to stop caring about basic grammar rules. Incidentally, as an outsider I've never seen people being as careless/clueless with their language as British/American people are. A result of the education system or society, maybe.


I think it's mostly the number of common homophones and the speed at which most English speakers think and type that causes the problem. Add that to obscure language rules and you can end up with something that looks completely wrong, but isn't.

A great example is "red-head scarf"; what you read is "(red head) scarf", what it means is "red (head scarf)". Granted, it makes the wording unnecessarily complicated, but it's probably grammatically correct.


By calling yourself an outsider, I'm assuming that you don't speak British or American English? If so, you're not really in a good position to judge grammaticality.


Or possibly a result of the number of people using the language. Including those with only limited experience.


I was talking about native British and American speakers. Not to mention that most of the people I've been in contact with were (highly) educated.



> it’s usage

That’s not the only nor the biggest nit you can pick with this essay.


Let me copy/paste a reply I just posted.

"Let me shame you every time you introduce a bug/typo in your code."


The compiler I use does that already, thanks


When there are discussions about Hachette, one often sees writers saying "Amazon gives me more of the cover price, and gets my book into people's hands faster, what's the downside?"

I guess this is the downside.


As someone who has been using a Kindle for almost all my book reading in the last few years, the major publishers don't care. Doesn't matter what I read it tends to have quite a few OCR errors and other mistakes that clearly exist for the print edition (extra hyphens) or 'correcting' things that used to be hyphenated (by removing the hyphen that needs to be there).

My new Kindle has the ability to let me report typos and other content errors.

I've been using that feature A LOT. I've reported 13 content errors in the last month, and I'm not reading all that much, just a few hundred pages in that time.


Agreed. It's worse for technical books in my experience. The last few I've bought have had unreadable code samples and screwed up or just plain missing contents pages. It's got to the point where I'm going back to buying paper textbooks.


I've been using the highlighting feature on errors in order to appease my OCD. The fact that Amazon has added error reporting seems like a hopeful sign and an admission that there is a widespread problem. The question is, are they actually going to make the fixes?


I've heard of books being taken down (such as in this case) and even refunds being issued, but I wonder if it really matters.

When I asked about the feature on /r/kindle they said it had been around since the Paperwhite or so, yet books are still riddled with things that even a basic spell check would catch.

Amazon doesn't seem to be holding publishers' feet to the fire.


Legitimate query: anyone know if the author receives those reader-reported issues directly?


From reading the original article here, it seems they receive a notice of the reports, not the actual reports, which I would assume Amazon does to protect readers from retaliation (?), and after Amazon has reviewed the report and checked for themselves. At least that was the feeling I got from the original article.


Are you suggesting that Hachette has no standards and/or formatting rules for the books they publish?

Surely not.


I think he's suggesting that Hachette has sane standards and formatting rules, instead of running your book against a dictionary and counting the hyphens.


And your evidence that's what Amazon is doing would be?


From what I'm seeing here (and on the site) a customer complained about bad hyphenation. Amazon pulled up the book, and, sure enough, it has some bad hyphenation.

That's not at all the same as algorithmically rejecting a book based on hyphen count.


Use en dash instead of normal dash and hope amazon's bots don't notice.

If that fails, replace - with ·. Include preface "due to limitations of Amazon's ebook technology, this book uses the · symbol in place of - for hyphenating words".
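
For what it's worth, the swap itself would be a few lines. A minimal sketch in Python, assuming you only want to touch hyphens that sit directly between two word characters and leave spaced dashes alone. The function name `swap_hyphens` is made up for illustration, and nothing here is known to get past whatever check Amazon actually runs:

```python
import re

# A hyphen-minus with a word character on both sides, e.g. "wind-up".
# Spaced dashes such as "toy - really" are deliberately left alone.
JOINING_HYPHEN = re.compile(r"(?<=\w)-(?=\w)")


def swap_hyphens(text, replacement="\u2013"):
    """Replace joining hyphens with `replacement` (en dash by default)."""
    return JOINING_HYPHEN.sub(replacement, text)


print(swap_hyphens("a wind-up toy - really"))
print(swap_hyphens("water-bed", "\u00b7"))  # middle-dot variant
```

Note that the en dash and the middle dot are different code points from the hyphen, so search and text-to-speech may treat the words differently afterwards.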


When you have "grammar matters" in big fucking letters at the top of a blog you really need to proofread.

> So, chuckling to myself, I sent back a response pointing out that the use of a hyphen to join two words together was perfectly valid in the English language and included a handy link to the Oxford English Dictionaries definition page which described it’s usage.

POSSESSIVE ITS HAS NO APOSTROPHE.

Normally I really don't care -- apostrophe use is confusing; my grammar is lousy; I tend to be descriptivist not prescriptivist, etc.

The writer doesn't mention if these hyphens are all joining two words, or if some of them are used to split a single word across a linebreak. The arguments are different for each. For the former you point to the satisfied reviews and a bunch of style guides about use of hyphens. For the latter you mention the piss-poor typography on Kindle, and the lack of any control for authors or readers. (The setting to left-justify with ragged right margin is a hidden setting that requires a tweak to access).


The grammar mistake in the author's sentence does not matter as much as the "grammar matters" example. And there's a difference between errors and ignorance. If you type fast it's easy to not fully think through all grammar rules before the words are already typed and done. I think the author is worried that blind automation could be harming the beauty of the written word. It would be like iTunes complaining that Blues artists are using too many off-key flat notes (and they say, no, those are blue notes!). Clearly from the article, hyphens are part of this author's colorful use of language.

Obligatory: http://xkcd.com/37/

(edited for grammar :)


Apostrophe use is not confusing. Its (and yours and ours) use is pretty straightforward (if you say 'it is use' to yourself, that is) unlike many computer languages with which many folk who read HN are extremely proficient since they make their living that way.


I really, really don’t get your point. I’m utterly confused by it. This is an incidental non-issue that couldn’t be any less relevant if it wanted to. There is no possible world in which this could ever be of the tiniest relevance.


Let me shame you every time you introduce a bug/typo in your code.


Please do. Especially if I have "grammar matters" in big letters in those code headers and the code is about grammar.


Completely missing the point, Mr. Know-it-all.


Weird that you call me Mr Know it all when I have already said that my grammar is terrible.

You're missing the point - when a person is complaining about other people's misuse of grammar, and does so on a webpage that has "grammar matters" in big letters at the top, they really need to make sure that their grammar is not blatantly incorrect. Obvious errors discredit the point they're trying to make.

See also Skitt's law. http://www.telegraph.co.uk/technology/news/6408927/Internet-...


You're missing the point. The author is really talking about Amazon suppressing creativity in the written word. If you're going to take issue with the title, you should probably go with the "should be a title more in line with this theme". Clearly the title is hanging people up.


No, he is not talking about suppressing creativity at all. If he was, he wouldn't argue that his usage of hyphens was defensible as "correct" per the OED.

If it was a creative issue, he'd simply say "it is a stylistic choice, what else do you need from me to re-activate this ebook for sale?"


Yep. Amazon is suppressing books that probably shouldn't be suppressed, but it's not because of the creative aspects of them.


> We have suppressed the book because of the combined impact to customers

Reading this line, you would think the book is of a sufficiently extreme nature to trigger censoring, but seeing it applied to something as trivial as hyphenation (would that be hyphen-hate-tion...?) is the most shocking part.


And this is why we used to deal with monopolies so harshly. Alas. I wonder when our country will start working in the citizens' favour?


Huh? This has nothing to do with monopolies and everything to do with the volume of crap e-books that Amazon is wading through. Unfortunately, the OP happened to get caught up in one of Amazon's filters and their first-level CS is clueless.

Monopoly or not, this is a hard problem and one that we see in other places like Google search or Apple's app store. Gaming these systems has become big business, and the little guy ends up being a casualty in the overall war.


When they force authors to publish in non-DRM encumbered formats that all readers can read like ePub or Mobipocket?


Amazon doesn't "force" DRM on anybody. It's a checkbox in the publishing portal.


It would be nice if the author would include a sample page of where the hyphens are used. We don't know if they are just used for word breaking across lines from a poor ebook or if they are used for joining words. I went to Amazon to try to check, but it says item under review, so it's not like we can see the real situation.


I, personally, do not want Amazon to make quality determinations regarding grammar and style for books they sell. It offends me philosophically and it seems horrible just from a practical point of view. I do understand this could be reasonable if, as some commenters wrote, this is a result of trying to combat poorly implemented digitization.

The rest of these comments discussing whether the original author is using hyphens correctly seem insane to me. Writers write as they write. Read it or don't read it. Enjoy it or hate it. Ask for a refund if you think it is sold in bad faith. But forcing a writer to change it is an unacceptable invasion of expression by the infrastructure. Any short term gain in increasing "quality" is subsumed by the potential weakening of the evolution of communication and art, which relies on creators taking liberties with conventional forms.


Actually, a few years ago someone wrote a script that pasted random YouTube comments together and submitted them to Amazon's Kindle Store as an actual ebook. Amazon published them all -- including their randomly-generated titles (like "A Lot Was Been Hard") -- until the books began receiving lots of media attention. Only then did Amazon remove them, citing the possibility of an unpleasant reading experience.

http://www.beyond-black-friday.com/2012/06/27/the-legend-of-...


I read tons of books via the Kindle app on Android. I believe that hyphens DO hinder the reading experience... due to an apparent bug with the Kindle app!

I have seen countless times that, if the current page breaks on a hyphen, then the next page seems to begin at the next full sentence, thereby losing information. My normal workaround is to simply change font size temporarily to force a repagination. It is annoying.

So, maybe this is why people have complained? I can't really see any other explanation for caring whether or not an author chose to hyphenate.


Seems like most people in the thread are missing that this started with at least one actual reader complaint, and automation was only run over the book to verify. The author of the blog says there was only one complaint, but given the rather craptastic grammar examples posted and the vitriolic stance of the author, I'm not so sure.

Also worth noting that the "automation" was a spell checker, which presumably flagged examples that aren't considered properly hyphenated in the English language.

Amazon does most certainly pull books that have bad spelling or grammar based on user complaints, and I'm glad they do so. I've stumbled into poorly self-edited Kindle books many times, and have been awfully pissed I've dropped money on them. I'm happy for some quality control.

For those that think there should be a completely hands-off release process for these books, that's just horseshit. Books have been edited pretty much since the concept of "books." If an author wants to apply back for a waiver based on their Joycean command of the English language, wonderful. For a pulp writer, I don't buy it for a second.

I'm also waiting to hear about the rash of automated spell or grammar checking causing issues. Because right now what I see is one frustrated author throwing around accusations and any number of satisfied authors who actually wrote their books correctly.


Hypothetically, let's say this writer is really talented. Few writers use hyphens for maximum effect. So few readers get to experience the hyphen-joy. When they do finally encounter it, they might be confused or annoyed. To some of us, that's sad, and aggravating. The technical glitch, at least at first glance, feels like the symptom of a larger lapse in longing for literacy and appreciation for poetry.


My harsh response notwithstanding, I'm all for experimentation in literature.

I'm just not at all sure this is a technical glitch. As far as I can tell from the excerpts mentioned, the author probably does actually misuse hyphens. I'm not very prescriptive when it comes to language, but there is a point where you have to consider it to be an error instead of a mutation.

Edit:

OK, the book is back up and it's on Kindle Unlimited so I grabbed it and skimmed the first few chapters. So far, not too bad on the hyphen front, just a couple of instances I caught (water-bed jumped out). It might get worse later, of course.

But unless it's messing with screen-readers as posited above, perhaps Amazon did blindly react on this one. I've certainly seen much worse.

That said, holy crap the writing is profane. I don't think I've seen this many fucks in one place since I stopped watching C-SPAN.


I wonder if Amazon's automated system would flag something like "Riddley Walker" where, in the future, words have changed meaning and are spelled phonetically. (1) Maybe Amazon could have an editors' community similar to Stack Overflow where they can influence and override the automated curating system.

1. http://en.m.wikipedia.org/wiki/Riddley_Walker


Off topic, but has it occurred to anyone that the overuse of the term 'war on x' might impair what the word 'war' really stands for?


"Early examples of war as metaphor in US political discourse include J. Edgar Hoover's "war on crime" in the 1930s".

We've been using it as a metaphor since the 1930s. It hasn't impaired the "real" usage of the word "War" so far. I doubt it will in the future.


To be fair, the US has always been at war with Eastasia.


George Carlin more than 20 years ago.

https://www.youtube.com/watch?v=ilipDBclxRc



Well, not long after the term started (Hoover), the US Congress ceased to "declare war" when proceeding to engage in what obviously is war. Now it's just "kinetic military action" or some such, never a statement of "the USA hereby declares war on ______ ...".


That ship done sailed. Like, in the 1960s. No worries, though, because if any concept deserves impairing and besmirching, surely war does?


Has Amazon named this algorithm yet? I'd like to nominate "Proper English Diction And Nomenclature Tracker".


Amazon is rather hard-headed about their Kindle formatting.

I suppose it stems from early issues with even well-known books being horribly mis-formatted for Kindle. Ok enough hyphens for now.

At least you English writers can count your lucky stars that it is possible to publish ebooks at all on Amazon.

If you have published a regular book on Amazon in a language not deemed to be worthy of Kindle support you are SOL: https://kdp.amazon.com/help?topicId=A9FDO0A3V0119

Rather a short list, isn't it? And the list is not growing, despite sideloaded books in the unsupported languages displaying perfectly fine on Kindle.


The bigger issue is Amazon's inability to recognize the situation when the author followed up. In their defense, they probably have hundreds of thousands of self-publishing authors. And with most of them earning very little, Amazon may not even generate enough revenue from them to support a full-time staff even just to follow up on every author's question.

It's unfortunate, because authors do deserve better, and they'll understandably be upset when they're routed to a malfunctioning robot. But if the problem is too many authors and not enough support staff, maybe the real question is: does self-publishing scale?


Interestingly, the Kindle (at least, the Paperwhite I have) does not seem to hyphenate words, preferring instead to leave a ragged right edge if it's not able to justify within tolerances.

I realize this is a completely different situation than Graeme's, where he is deliberately hyphenating words in an anachronistic way, presumably for stylistic reasons. But I wonder if Amazon do have some sort of stance on hyphenation based on reader data.


Hilarious if you see how Amazon still doesn't care about hyphenation on their Kindle platforms. It's an easy job, e.g. the algorithm used by LaTeX is free to use and with Amazon's size they might even build a good database to handle all exceptions. If you consider that this all could be precompiled, it doesn't even matter that a first-gen Kindle has a lousy CPU.
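
To give an idea of how little is involved: TeX's hyphenation is Liang's pattern-based algorithm. Below is a minimal sketch in Python, assuming a toy pattern list that only covers the classic worked example "hyphenation"; a real hyphenator would load a full per-language pattern file plus an exception list. The function names and the idea of precompiling to soft hyphens (U+00AD) are my own illustration, not anything Amazon or TeX is known to ship in this form:

```python
import re

# Toy patterns, just enough for one example word. Digits mark the
# priority of a break at that position: odd allows it, even forbids it.
PATTERNS = ["hy3ph", "he2n", "hena4", "hen5at", "1na", "n2at",
            "1tio", "2io", "o2n"]


def parse_pattern(pattern):
    """Split 'hy3ph' into ('hyph', [0, 0, 3, 0, 0])."""
    letters = re.sub(r"\d", "", pattern)
    points = [0] * (len(letters) + 1)
    position = 0
    for ch in pattern:
        if ch.isdigit():
            points[position] = int(ch)
        else:
            position += 1
    return letters, points


PARSED = [parse_pattern(p) for p in PATTERNS]


def hyphenate(word, left_min=2, right_min=3):
    """Return the pieces of `word` between allowed break points."""
    dotted = "." + word.lower() + "."   # dots mark the word boundaries
    scores = [0] * (len(dotted) + 1)    # scores[j]: break before dotted[j]
    for letters, points in PARSED:
        start = dotted.find(letters)
        while start != -1:
            for offset, value in enumerate(points):
                scores[start + offset] = max(scores[start + offset], value)
            start = dotted.find(letters, start + 1)
    pieces, last = [], 0
    # Keep at least left_min letters before and right_min after a break.
    for i in range(left_min, len(word) - right_min + 1):
        if scores[i + 1] % 2 == 1:
            pieces.append(word[last:i])
            last = i
    pieces.append(word[last:])
    return pieces


print("-".join(hyphenate("hyphenation")))
# "Precompiling" could mean storing soft hyphens in the text up front:
soft = "\u00ad".join(hyphenate("hyphenation"))
```

The per-word work is a handful of substring lookups, so doing it once at conversion time and storing the break points would keep the device out of the loop entirely.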


My experience with Amazon's "rules" is that they tend to be so heavily automated that their support staff couldn't change anything if they wanted to. So my suggestion to Mr. Reynolds is to strip the hyphens and move on. Amazon is a bit of a monopoly and your only other choice is to not use their services.


Here is a wild idea... If you are going to be a publisher of books, why not QA (i.e. edit) these books the old-fashioned way? If quality books are the top priority, no algorithm will match the discernment of the human eye.

I am a developer (not a writer), but this outrages me nonetheless.


Calling someone's employee a "random fuck-wit" is probably not a successful strategy to get expedited successful results from your customer.

Amazon goes above and beyond for their customers. As a publisher or marketplace seller operating on their platform, they are your customer. Amazon's reseller support is entirely different, in my experience, from how they treat consumers -- a bit less polish because it is an internal tool is to be expected, but it goes a bit beyond that. I was initially a little unhappy with that, but since I'm also a customer, it makes sense.

They clearly could do better here in escalation path, but don't expect hand-holding. Amazon isn't exactly going to go out of business because your book isn't for sale.


Amazon is acting as the marketplace, or the conduit between producers and consumers here. In this formulation, both ends of the pipe are their customers. Yes you can make a reasonable case for quality control, but not a big one. Books are a pretty big caveat emptor when it comes to the prose contained. They always have been.

If Amazon is so concerned about formatting, then it should make some better tools to help authors fix bad formatting, not just run what seems to be simplistic regex matches on a pile of words.


Next up: E. E. Cummings books removed for lack of readability.


Amazon wants to auto-hyphenate based on the current screen / font. Hardcoding hyphens in an ebook will look terrible in most cases. I don't see a problem with Amazon's stance.


I don't think that's what the author is talking about. They mean hyphenated words such as 'wind-up'. The current screen or font doesn't make a difference for whether or not that's hyphenated. You're thinking of hyphenation for splitting a word over a line.


That's what the author thinks the problem is, but we don't know because the book isn't available and there isn't a list of the hyphenated words.

It's possible that most of those hyphens are line-break hyphens, not word-join hyphens.
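
Telling the two apart isn't hard to approximate, for what it's worth. A rough sketch in Python, assuming you have some word list to check against (the `dictionary` argument here is a stand-in; a real check would need a proper lexicon). The function name and the rule are my own guess at a reasonable heuristic, not what Amazon runs:

```python
import re

# Letters, one hyphen, letters: "wind-up", "implemen-tation".
HYPHENATED = re.compile(r"\b([A-Za-z]+)-([A-Za-z]+)\b")


def classify_hyphens(text, dictionary):
    """Split hyphenated tokens into (likely_line_break, likely_word_join).

    A token counts as a probable line-break leftover when the halves glued
    together form a known word but the halves are not both words themselves.
    Everything else, including invented words, gets the benefit of the doubt.
    """
    line_break, word_join = [], []
    for match in HYPHENATED.finditer(text):
        left, right = match.group(1).lower(), match.group(2).lower()
        both_words = left in dictionary and right in dictionary
        if (left + right) in dictionary and not both_words:
            line_break.append(match.group(0))
        else:
            word_join.append(match.group(0))
    return line_break, word_join


words = {"wind", "up", "water", "bed", "implementation"}
sample = "The wind-up toy sat on the water-bed during the implemen-tation."
print(classify_hyphens(sample, words))
```

It will still misjudge things like double-barrelled names or compounds whose halves happen to be words, so it's only good for picking examples to show a human, not for pulling a book.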


The author claimed that this wasn't the case. I don't see why I would take this as a bad faith posting. You'd think if it actually were the case, Amazon would have a better way of indicating so.


The whole thing's clearly a confusion, with the author being a nonce as he doesn't realize it's to do with line break hyphenates and Amazon customer service being nonces because they don't realize those hyphens aren't line break artefacts and they're just cut/pasting a standard response.


I didn't think a human being could be a 'nonce', or is this some new usage I don't know about?


From Wikipedia: http://en.wikipedia.org/wiki/Nonce_%28slang%29

>> In the United Kingdom, nonce is a slang word for a sex offender or child sexual abuser.


Nonce means mild idiot in my social circles. I wasn't implying they were a social offender. I wasn't even aware of its implications. Learn something every day.

I'd delete the comment but HN has got ridiculously militant about that now and I can't.


In colloquial UK usage it means 'paedophile', and is generally considered offensive.


Unless they're really making claims about the person's sexuality, I'd guess the intended word was dunce, or something similar?


I think you're thinking of hyphenation as a word-break, which should be automated based on font, screen size, etc. But hyphenation also applies to two words that have a different meaning when not hyphenated. That hyphen carries meaning, which Amazon should not interfere with.



