Nookd (villagecraftsmen.blogspot.com)
227 points by tch on June 1, 2012

I was an engineer on the Nook team. It isn't possible for the device software to change the text of books.

Most likely the author or conversion service took the Kindle edition, ran a search and replace for 'Kindle' and there you have it.


This book not being under copyright (afaik), the text is almost certainly from some guy in his garage grabbing Project Gutenberg, etc., books and formatting them for various e-readers.

He probably has some boilerplate stuff like "Formatted for Kindle by John Doe", which gets inserted into the Kindle .mobi version (Kindle being the dominant platform, he does that first). Then he does a search-and-replace, turning "Kindle" into "Nook" in order to create a version appropriate to that platform.

In fact, he could use something like Calibre to do the conversion from .mobi to .epub, which I believe allows for regular expression substitutions as part of the process. That would allow him to get whole shelves of the Gutenberg library in one simple, automated process.
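To make the failure mode concrete, here is a minimal sketch of how a blanket substitution produces "Nookd". The sample sentence and the function names are made up for illustration, and this is not a claim about what the formatter actually ran; it only shows that a case-insensitive substring replace also rewrites "kindled", while a pattern anchored on word boundaries and capitalisation leaves it alone.

```python
import re

# Hypothetical sample: one line of body text plus one line of boilerplate.
sample = "a flame kindled in her face. Formatted for Kindle by John Doe."

def naive_swap(text):
    # Case-insensitive substring replace: hits the boilerplate,
    # but also the ordinary verb "kindled" in the body text.
    return re.sub("kindle", "Nook", text, flags=re.IGNORECASE)

def careful_swap(text):
    # Only the capitalised, standalone word "Kindle" is replaced.
    return re.sub(r"\bKindle\b", "Nook", text)

print(naive_swap(sample))
print(careful_swap(sample))
```

Even the careful version would still be wrong for a book that mentions a capitalised "Kindle" in its own text, so limiting the replacement to the boilerplate file is the safer design.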

Nothing insidious here. Just somebody being a little bit careless.

The device software reads data from flash and renders it in pixels on the screen. It is absolutely possible for it to modify the text as it does so.

I would assume that eclipxe meant that the software shipped with the Nook does not allow for on-the-fly text modification. Even if it did, this is the only occurrence of text substitution that has been seen, so the fault still lies with the publisher.

Not necessarily. There are many more moving pieces to this, especially considering DRM.

Well good thing the author of this post bothered to talk to anyone before throwing this out there on the internet.

Contrary to the snark, throwing this out there on the internet may have been about the only way for the post's author to get the attention of a Nook team member. It wasn't B&N's fault (they're just retailing what they're paid to), so getting someone knowledgeable to look at the issue via nook.com may have been ... unlikely.

It was posted, it got discussed on HN, within a few hours we have a good relevant answer from an insider.

Why do all that? Just look at the file you downloaded in another viewer, see if the word is still there. Then, load another file that contains the word kindle and see if it gets changed. What's so hard about that?
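For what it's worth, the first half of that check can be scripted. A rough sketch, assuming the file is a DRM-free .epub (which is a ZIP archive of XHTML pages); a DRM-protected file would be encrypted and this would find nothing useful. The function name `count_word_in_epub` is my own, not part of any e-reader tooling.

```python
import re
import zipfile

def count_word_in_epub(epub, word):
    """Return {member name: hit count} for text members containing `word`.

    `epub` may be a file path or a file-like object.
    """
    pattern = re.compile(re.escape(word), re.IGNORECASE)
    hits = {}
    with zipfile.ZipFile(epub) as zf:
        for name in zf.namelist():
            # Only look at the page markup, not images, fonts or stylesheets.
            if not name.lower().endswith((".xhtml", ".html", ".htm")):
                continue
            text = zf.read(name).decode("utf-8", errors="replace")
            count = len(pattern.findall(text))
            if count:
                hits[name] = count
    return hits
```

If `count_word_in_epub("book.epub", "nookd")` (with a hypothetical file name) comes back non-empty, the substitution is in the file itself and the device is off the hook.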

This actually points to a risk that Barnes & Noble, Amazon and other content stores are susceptible to: users who assume that they edit their content like a traditional publisher would or should.

This was a 99c purchase of a public domain work that was probably reformatted by someone who didn't care enough to check their work. This work was then publicly criticized by a consumer who didn't care enough to check their facts, either.

There have always been similar things with books. I remember being warned away from el-cheapo Shakespeare collections, etc.

The 99-cent War and Peace I see on their site is published by a third party, Superior Format Publishing. Could they not be the ones that search-and-replaced "kindle" with "nook"?

From their site (http://superiorformatting.com/), it appears that's not the only problem they are having:

Some of Our Titles

Whoops, looks like there was a problem get the book data from Amazon. Please try again in a moment

I have the kindle version and it looks fine. And why are they paying for copyright-free material? They can get a nice eBook version from Project Gutenberg.

But remember that War and Peace was not originally written in English, and the quality of translation matters a great deal. With Russian literature especially, the quality of translation varies widely, and some of the best translations into English are still copyrighted. A bad translation might faithfully reproduce the text's literal meaning but badly mangle the literary elegance that made the book so acclaimed in the first place. I'd rather pay $14 for a good translation than spend 1100 pages regretting being cheap.

Sure, but the 99 cent edition in question almost certainly is the Project Gutenberg edition, mangled by a third-party file conversion.

(On a personal note, a few months ago I decided to read Les Miserables. I read a few chapters in one of the modern translations and the same material in the Project Gutenberg version. I concluded that IMO the Gutenberg version was clearly the superior translation.)

I agree, but I think in the case of Tolstoy, or even Dostoevsky, the best translations also have their copyrights expired, since they were usually written a few years after the works were published. I read a French version of Anna Karenina (recommended, by the way) once in hard copy, then reread the Gutenberg version, and to me the literary differences were minimal.

I can't speak for Tolstoy, but the Gutenberg version of Dostoevsky was atrocious compared to the Pevear & Volokhonsky translation. For a more modern example, Simone de Beauvoir's first translation was by an unskilled botanist who happened to know French and English. A recent translation came out that is night and day compared to the original, but it will enter public domain some fifty years later.

the quality of translation matters a great deal

Indeed. The English translation of Jules Verne's "The Mysterious Island" in Project Gutenberg is an "old" one which censors certain aspects of Captain Nemo's anti-colonial background.

Really? "The Mysterious Island" is one of my favorites, and I'd be curious to know what, exactly, is being elided in the Project Gutenberg edition. Is this something you noticed yourself, or something you read about?

I found out in Wikipedia, unfortunately after I read the book... http://en.wikipedia.org/wiki/The_Mysterious_Island#Publicati...

An excerpt from Kingston's translation: http://www.gutenberg.org/files/1268/1268-h/1268-h.htm (this is the one I read)

"Captain Nemo was an Indian, the Prince Dakkar, son of a rajah of the then independent territory of Bundelkund. His father sent him, when ten years of age, to Europe, in order that he might receive an education in all respects complete, and in the hopes that by his talents and knowledge he might one day take a leading part in raising his long degraded and heathen country to a level with the nations of Europe."

The corresponding excerpt from White's translation, which is also in Project Gutenberg http://www.gutenberg.org/dirs/etext05/8misl10h.htm (closer to the original French)

"In Captain Nemo was an Indian prince, the Prince Dakkar, the son of the rajah of the then independent territory of Bundelkund, and nephew of the hero of India, Tippo Saib. His father sent him, when ten years old, to Europe, where he received a complete education; and it was the secret intention of the rajah to have his son able some day to engage in equal combat with those whom he considered as the oppressors of his country."

And the original French

"Le capitaine Nemo était un indien, le prince Dakkar, fils d’un rajah du territoire alors indépendant du Bundelkund et neveu du héros de l’Inde, Tippo-Saïb. Son père, dès l’âge de dix ans, l’envoya en Europe, afin qu’il y reçût une éducation complète et dans la secrète intention qu’il pût lutter un jour, à armes égales, avec ceux qu’il considérait comme les oppresseurs de son pays."

I pay for public domain works all the time, often because the formatting is better or it's a specific translation I'm interested in.

For translated works, modern, copyrighted works are nearly always better.

Is the author's implication that the digital nature of ebooks makes it easier to manipulate text? Regardless of medium, isn't this always a possibility? Whether it happens at the publisher and is then printed on dead wood or otherwise -- to me it seems like one and the same?

It's rather harder for some government/publisher/web search company to modify the hardback copies on my shelf to airbrush Trotsky out of the pictures.

With an eBook all it takes is a click by somebody at Amazon

This is a good argument against DRM and e-book-related cloud services but not really against e-books generally. I own a Nook and have never bought a locked-down e-book, so the publisher really can't edit the copy I own short of breaking into my computer.

No, it doesn't take "a click". It's pathetic how you malign Amazon's good name, when they are the victims in this case.

Stupidity like this only hurts the long-term goal of making printed books something only hobbyists are interested in (like vinyl records), but what's the solution? Data integrity regulation from governments? Or will the markets take care of it?

It worries me that one day hard copy books will be viewed as a hobbyist's endeavor.

It worries you that people will read more? That people will read more varied things? That authors will get better compensation? That self-publishing will become a much more viable model for literally millions of authors?

Man you worry about some weird stuff dude.

I have literally no idea what you're arguing against, or where you came up with any of your arguments. I did not mention people not reading, nor people not reading varied things, nor compensation, nor self publishing.

So confrontational, geez.

The context of my statement was simply that I enjoy the physical nature of books. I was simply trying to express that if they were gone, or harder to come by than their digital brethren, I would be a little saddened.

Perhaps other things worry him and I don't think you should assume they are the ones you mentioned.

Your comment struck me as snarky and unfair and I have downvoted it and upvoted its parent.

Hard copies endure. Digital, not so much. Check out Jordan Mechner's recent archaeological expedition to retrieve some source code of his from 25 years ago (Prince of Persia). It was nearly lost, and that is only 25 years!

The digital world is ephemeral.

Atlanta is enjoying an exhibit ("Passages") of a very large collection of very rare Bibles. From fragments of early/original texts (ex.: Dead Sea Scrolls) to definitive works (first-run King James Bible) to remarkable renditions (illuminated works) to unusual associations (mother-of-pearl encrusted cover Bible given by (!) Yassir Arafat) to notable errors (Wicked Bible, named for the single typo "Thou shalt commit adultery"), the soon-ending exhibition is for this thread a testimony of the importance of physical copies.

Per that last example, consider that the "Nooked" War and Peace could be considered a "great typo" someday sought after by collectors - except that, being ephemera, the digital copy will either be lost or copies unverifiable due to ease of replication.

I certainly appreciate the benefits of e-books. At the same time, physical presence carries a lot of meaning beyond just content. Alas for those notable books lost in a sea of bits...

Source code is intentionally kept secret and is different from the widely-distributed version.

The book-equivalent of source code is the collection of discarded drafts and working notes of the author. These, too, are often lost, but that has little to do with whether they are digital or not.

It's expensive to properly store and maintain physical copies of things like books. I understand that archivists have some trepidation replacing a known system of storing physical books with a relatively unfamiliar and untested system of storing digital goods. But the bugs are already being worked out; if you want to make sure something lasts, make it digital and make sure lots of others can get to it.

Isn't this the sort of thing digital signatures are for?

Author can sign the text of the book, and publishers can sign the container. I don't know how future-proof this is (and it obviously isn't going to work for something like War and Peace), but it's something that can be done today.
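A rough sketch of the integrity half of that idea, using only the Python standard library. Note the assumption: this computes a SHA-256 fingerprint of the text, which is what a real signature scheme would then sign with the author's private key; the signing step itself needs a public-key library and is not shown. The sample strings and function names are illustrative.

```python
import hashlib

def fingerprint(text):
    # SHA-256 digest of the UTF-8 encoded text, as a hex string.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def is_untouched(text, expected):
    # True only if the text hashes to the fingerprint the author published.
    return fingerprint(text) == expected

# Hypothetical example: the author publishes the fingerprint of the real text.
original = "a flame kindled in her face"
published = fingerprint(original)

print(is_untouched(original, published))
print(is_untouched("a flame Nookd in her face", published))
```

A bare fingerprint only helps if readers get it from somewhere the retailer can't also edit, which is the problem the signature is meant to solve.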

