
New DRM Changes Text of eBooks to Catch Pirates - Lightning
http://torrentfreak.com/new-drm-changes-text-of-ebooks-to-catch-pirates-130616/
======
columbo
Meh.

Buy three ebooks from different accounts, render them to plain text, diff them
and take out the differences.

Why three? The assumption is 2 books would have the correct value and 1 book
would have the altered value.
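
A minimal sketch of that majority vote, assuming the three copies have already
been split into the same number of aligned sentences (a real diff would have to
do that alignment first). `majority_merge` and the sample sentences are made up
for illustration:

```python
from collections import Counter

def majority_merge(copies):
    """For each position, keep the sentence that most copies agree on."""
    merged = []
    for variants in zip(*copies):
        winner, _count = Counter(variants).most_common(1)[0]
        merged.append(winner)
    return merged

# Three purchased copies, each with a different sentence altered (or none).
copies = [
    ["It was unthinkable.", "He paid in gold.", "The end."],
    ["It was unthinkable.", "He paid with gold.", "The end."],
    ["It was not thinkable.", "He paid in gold.", "The end."],
]
print(majority_merge(copies))
```

This only works while the same spot is not altered in two of the three copies.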

Though I don't see this as DRM. I've done this before on a much smaller scale
to track things as they enter the wild. It's just security through obscurity,
the digital equivalent of a tracking collar without any of the benefits, like
guaranteed uniqueness.

It's also crazy overkill. Here's a much simpler way to do it: after each
period, randomly decide whether to add an extra space. Hardly noticeable. Spread
(again randomly) between 1024 and 2048 of these over the entire document. Count
them up... now you have a pretty decent identifier without having to do any
sort of special logic to the words themselves. To the average person it's not
noticeable, it doesn't require messing with the original document, and it can
be made unique enough to be a digital fingerprint. Of course this house of
cards falls apart when you know the secret.
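
One possible reading of the extra-space idea, as a rough sketch. It assumes the
source text only ever has a single space after a period, and it uses the
position pattern of the double spaces (not just their count) as the
identifier. The function names are invented for this example:

```python
import re

def embed(text, bits):
    """Use a double space after the i-th period when bits[i] is 1."""
    parts = text.split(". ")
    out = parts[0]
    for i, part in enumerate(parts[1:]):
        gap = ".  " if i < len(bits) and bits[i] else ". "
        out += gap + part
    return out

def extract(text):
    """Read the pattern back: 1 for a double space after a period, else 0."""
    return [1 if m.group(1) == "  " else 0
            for m in re.finditer(r"\.( {1,2})(?=\S)", text)]

def strip_mark(text):
    """What a pirate who knows the secret would do: collapse the spaces."""
    return re.sub(r"\. {2,}", ". ", text)

original = "One. Two. Three. Four."
marked = embed(original, [1, 0, 1])
```

`strip_mark` is the "house of cards" part: one regex and the fingerprint is gone.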

~~~
clicks
Not to mention... would-be pirates could just use a non-traceable method of
payment when acquiring these eBooks.

This seems to be a lot of effort for really no good.

~~~
visarga
> would-be pirates could just use a non-traceable method of payment when
> acquiring these eBooks.

what? since when do digital 'pirates' pay for ebooks?

~~~
a_bonobo
The very first element of the chain of distribution still has to acquire the
original somehow.

~~~
clicks
Exactly. A lot of leakers in the scene actually go to quite some lengths
(financially and otherwise) to be the first one to release something. It's a
whole other world where cred matters a lot, and people really will go to great
lengths to make their name known.

~~~
tekromancr
Scene culture is fascinating.

~~~
a_bonobo
There used to be a self-made, low-budget series called The Scene which showed
the internal workings of release groups:

[http://www.welcometothescene.com/faq.html](http://www.welcometothescene.com/faq.html)

I watched 2 episodes or so before it became too boring watching people chat on
AIM, but I learned quite a bit about the internals (how far these are true,
I do not know - the makers say it's fictional but the stories are authentic)

~~~
tekromancr
This is really dated, in a really good way.

~~~
a_bonobo
glad you like it :)

------
mathrawka
I have worked with e-book authors in creating systems to mitigate piracy of
their books.

Even if you know who gave out the book that the pirates were able to get, it
doesn't make them automatically guilty. In one case, the brother of a customer
was the culprit... the author was actually surprised because from the looks of
the evidence, it was someone who he trusted that "leaked" the book.

Sure, the person could have lied and blamed his brother, but the author
trusted him enough to believe the story and accepted the apology from the
brother.

Anything involving DRM starts to get hairy. I can understand authors
wanting to protect the content they worked hard to produce. But I also
understand why real customers don't like DRM. In the
end, DRM was ditched and a subscription model was created. Both the author and
the users were happy with that resolution.

~~~
kefs
Exactly this.

Just because a copy of an ebook I purchased is now floating around the
internet, does not, a pirate, make me.

------
Ellipsis753
I can see some benefits of this but I think the idea of a script automatically
going through and making changes is quite horrible. In one example they
removed a new paragraph break. That's an unacceptable modification to me.

I think that the best way to do something like this (if you want to do it at
all) would be for the original author to choose 30 small acceptable changes
themselves. The presence of these changes (or lack thereof) would then allow a
30-bit ID number to be embedded in the document, allowing each copy to be
identified separately. They could also have a built-in checksum to allow the
original ID to be recovered even if some of the changes were found. By
increasing this to 100 possible changes (50 active changes in each copy on
average) you could have a very large amount of error correction data, and
the original ID could still be recovered when a large number were found and
changed. This would prevent pirates from buying 2 or 3 copies to find the
changes, as 2 copies would only highlight half the changes on average if there
is an equal chance of the change and the original showing.
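
A rough sketch of the bit-per-change scheme, assuming a hand-picked list of
variant pairs. `VARIANTS` here has only 3 made-up entries (so a 3-bit ID rather
than 30), and the checksum / error correction part is left out entirely:

```python
# Hypothetical author-approved pairs: (original wording, alternative wording).
VARIANTS = [
    ("unsympathetic", "not sympathetic"),
    ("I'm", "I am"),
    ("gold, silver and lead", "lead, gold and silver"),
]

def mark(text, copy_id):
    """Apply variant i when bit i of copy_id is set."""
    for i, (orig, alt) in enumerate(VARIANTS):
        if (copy_id >> i) & 1:
            text = text.replace(orig, alt, 1)
    return text

def read_id(text):
    """Rebuild the ID from which alternative wordings are present."""
    copy_id = 0
    for i, (_orig, alt) in enumerate(VARIANTS):
        if alt in text:
            copy_id |= 1 << i
    return copy_id

def exposed_bits(id_a, id_b):
    """Positions two colluding buyers can see: where their IDs differ."""
    return bin(id_a ^ id_b).count("1")

text = "I'm unsympathetic to trades in gold, silver and lead."
copy_5 = mark(text, 5)
```

With random IDs, `exposed_bits` comes out at about half the positions on
average, which is the point about two copies only highlighting half the changes.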

Ultimately I do not think this will work however. A pirate can use a pre-paid
credit card to buy the e-book. In this way they cannot be tracked. So I
honestly don't think this is worth the effort.

Just my thoughts on the matter.

------
beagle3
That's not DRM. If anything, that's a textual version of "digital
watermarking"
[http://en.wikipedia.org/wiki/Digital_watermarking](http://en.wikipedia.org/wiki/Digital_watermarking)

It's old news - it's been done to find internal document leakers for years - I
believe even in typewriter days.

Thing is, it only works on a very small scale, when the people under
surveillance are unaware, and are unwilling to collaborate. When these
assumptions don't hold, it is trivial to generate a version which has all the
information yet does not correspond to any original watermarking (or worse,
replicates some watermark that is not part of the set used to generate the
version)

~~~
davorak
> It's old news - it's been done to find internal document leakers for years -
> I believe even in typewriter days.

I knew it was an old technique but I had not thought about it being that old.

If you have any idea of where you heard this I would be interested in
knowing.

~~~
shardling
I first read about it in an old Tom Clancy novel.

A short bit of googling led me to this:
[http://en.wikipedia.org/wiki/Canary_trap](http://en.wikipedia.org/wiki/Canary_trap)

So the technique was well known by 1987, which certainly qualifies as
typewriter days. Apparently it was used in fiction as early as 1972.

------
steve19
This DRM actually changes the content by rewriting the text, not unlike
Microsoft Word's grammar checker/corrector tool. I would be very surprised if
publishers and authors would go for this. Clumsily changed sentences could
ruin a good book.

I have always assumed that publishers embed an ID somewhere in the Mobi or
EPub files (they are just HTML, you could insert all sorts of things that
would never be displayed) since it is trivial to remove the DRM from them.

~~~
mikeash
The willingness of content creators to deliberately distribute an inferior
product just to catch pirates never ceases to amaze me.

~~~
pyre
'Inferior' up until now didn't describe the content, but the rest of the
experience. This would change the content itself, which I would expect most
authors to be very protective of.

~~~
mikeash
Not entirely true. It's been done in movies for a few years now, to track down
leaks from theaters:

[http://en.wikipedia.org/wiki/Coded_Anti-Piracy](http://en.wikipedia.org/wiki/Coded_Anti-Piracy)

Of course, movies and books are probably opposite ends of the spectrum when it
comes to artistic integrity.

~~~
tekromancr
They claimed that these were imperceptible. Bullshit. I see about 3 or 4 every
time I see a movie in theaters. I see it, I notice it, and the fact that they
were willing to disrupt their own content bugs me enough that I am unable to
enjoy the rest.

As far as artistic integrity goes, it makes sense that the immense cost of a
film being covered by the studio gives them quite a bit of say in the
filmmaking process, for good or ill.

------
just2n
What if someone steals a copy of my file, then uploads it to TPB? Now do I get
sued for millions?

Am I presumed guilty, since it's clear there's "evidence" to show that I was
the source of the now widely spread copy? Do I now have to prove to a judge
that I actually didn't upload that file to TPB?

Guilty until proven innocent?

~~~
andrewflnr
A new form of evidence doesn't necessarily mean the end of the "reasonable
doubt" standard (at least in the US, I don't know how other countries phrase
things).

~~~
Silhouette
Copyright in most cases and most places is still a civil matter, so the
standard in many places would be more like "balance of probabilities" rather
than "beyond reasonable doubt".

In other words, if your personally marked copy of a work was found floating
around on the Internet and you got sued for copyright infringement, you'd
probably have to convince the court that it was more likely that someone had
stolen the work from you and redistributed it than that you had redistributed
it in some other way yourself.

This isn't a completely safe position even then, because if someone has
suffered substantial damages as a result of the work being distributed (and
copyright law is _infamous_ for inflated claims of damage caused by
infringement) and you have potentially already admitted to being negligent in
protecting your copy of the work then you might just find yourself jumping
from the frying pan to the fire.

------
Rangi42
I had hoped from the headline that they would be doing something subtle, like
replacing some capital "A"s with Greek "Alpha"s, or replacing a single-
character "ü" with separate "u" and "combining umlaut". But visible changes
that mess with, not just the text formatting, but the actual words of the
text, are unacceptable.

~~~
gizmo686
Invisible changes are trivial for a pirate to undo (once they know what is
going on). With visible changes, the pirate would have a significantly more
difficult time erasing the watermark. My guess would be that, with only a
single copy, he would have to mangle a huge amount of the book's content to
hide himself, whereas the publisher would only need to alter very little.
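
To illustrate how little it takes to undo the invisible kind, here is a small
sketch that scrubs the two tricks mentioned in the parent comment. The
`LOOKALIKES` table is a one-entry stand-in made up for this example, and
blindly folding Greek letters would of course wreck a book that really
contains Greek:

```python
import unicodedata

# Hypothetical lookalike table; a real one would be much larger.
LOOKALIKES = {0x0391: "A"}  # GREEK CAPITAL LETTER ALPHA -> Latin A

def scrub(text):
    """Recompose combining sequences (NFC), then fold known lookalikes."""
    return unicodedata.normalize("NFC", text).translate(LOOKALIKES)

marked = "\u0391 u\u0308ber"  # Greek Alpha, then u + COMBINING DIAERESIS
clean = "A \u00fcber"         # Latin A, then precomposed u-umlaut
```

The two strings render the same but compare unequal until `scrub` is applied.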

------
asveikau
I have to say I started to have more faith in ebooks when I realized how easy
it is to strip DRM from Kindle purchases. I've made more purchases since then
and feel better about doing so. I like the convenience of ebooks, the way they
make my bookshelves look antiquated, but I always had in my mind the "1984"
scenario where they take away your purchases. So I strip the DRM of all my
purchases and keep a local copy, even though I only ever read them with the
Kindle app. When they take away my ability to do that, I'll stop purchasing.

------
a_bonobo
I speak German and had a look at the example document [1], and it's hard to
tell how big the changes are. The document shows the original and the
changed version and asks how annoying the changes are, on a scale of 1 to 6,
presumably to find out how much their algorithm can change before the rights-
holders complain.

Often, words are replaced by synonyms ("nicht denkbar", not thinkable, gets
replaced by "undenkbar", unthinkable), the ordering of words is switched
("zwischen Blei Gold und Silber", between lead, gold and silver, becomes
"zwischen Gold, Silber und Blei", between gold, silver and lead), just the
connection of words is changed ("Schmiergeldübergabe", transfer of bribes,
becomes "Schmiergeld-übergabe"), or the line breaks are slightly changed, for
example one word is moved to the next line.

At this stage, it's very hard to tell how much the final mechanism is going to
change the original text, but I can't see much changing of very minor elements
like punctuation.

It's still a stupid idea since many of these above elements are important to
the "vibe" and meaning of a text - the way an author orders elements, or the
specific words used by authors can be interpreted differently.

[1] [http://www.scribd.com/doc/148113352/Sidim-Changes](http://www.scribd.com/doc/148113352/Sidim-Changes)

------
stormbrew
Why on earth do the organizations that 'fight' piracy keep suffering from the
delusion that it's just random everyday people who share a copy with a friend
that are the source of pirated material on the internet?

~~~
caryhartline
Have you been to a torrent site? Are the hundreds of seeders of a single
e-book all friends?

~~~
stormbrew
I'm really not sure what you thought I was saying.

------
Duhck
This is more a case of treating the symptom(s) and not the problem(s).

Regarding DRM in general, it exists to prevent piracy, and piracy exists
because content is easier to obtain, consume, and share through these
'illegal' methods.

If prices came down, accessibility went up, and shareability were embraced,
piracy would fade into the background, and the remaining people doing it would
be unjustified in their actions.

If content creators and distributors embraced a system that benefited the
consumers, everyone would be better off, but they are years (if not decades)
behind the times.

------
ivan_ah
> the DRM shuffles some words around, inserts synonyms, changes the paragraph
> format or the punctuation. For example, the word “unsympathetic” could be
> changed to “not sympathetic,” and so forth.

This is horrible! I would rather have my book pirated than have an algorithm
mix words up.

Besides, if you wanted to do this, why couldn't you use non-printing
characters instead, like the soft hyphen U+00AD or zero width space U+200B?
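
Something like the following is what I have in mind, as a sketch only: one bit
per word, carried by a trailing zero width space. `embed`, `extract`, and the
8-bit width are arbitrary choices for the example, and it assumes the text has
at least `nbits` space-separated words and no U+200B of its own:

```python
ZWSP = "\u200b"  # ZERO WIDTH SPACE

def embed(text, copy_id, nbits=8):
    """Append ZWSP to word i when bit i of copy_id is set."""
    words = text.split(" ")
    for i in range(min(nbits, len(words))):
        if (copy_id >> i) & 1:
            words[i] += ZWSP
    return " ".join(words)

def extract(text, nbits=8):
    """Rebuild the ID from which of the first nbits words end in ZWSP."""
    words = text.split(" ")[:nbits]
    return sum(1 << i for i, w in enumerate(words) if w.endswith(ZWSP))

def strip_mark(text):
    """Removing the mark is a single replace once you know it is there."""
    return text.replace(ZWSP, "")

text = "the quick brown fox jumps over the lazy dog"
marked = embed(text, 181)
```

The words themselves are untouched, which is the attraction, but so is the
weakness: `strip_mark` erases the ID without harming the text at all.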

------
mathrawka
And another comment...

I've always wondered if companies like Apple, Google, etc do text watermarking
of company-wide confidential emails. Every time I see an e-mail in the full
text posted to Techcrunch, I imagine someone just got busted...

~~~
nsp
Elon Musk used this technique at Tesla in 2009, though it backfired hilariously
when he didn't tell the company leadership, and the general counsel sent out
his version to the entire company as well.

Hate using gawker as a source, but it's what I found first:

"Musk set out to entrap potential leakers by sending each employee a slightly
altered version of an email which he expected would get sent to the media.
Musk began the memo, "I'm a big believer in trusting employees."

By altering phrases scattered throughout the email — changing "I'm" to "I am,"
for example — a Tesla IT employee created individualized memos which would
have a detectable "fingerprint" in the text. In the memo, Musk asked everyone
to sign a new, stricter nondisclosure agreement. The agreement wasn't the
point of the email — it was just a ruse to catch the company's leakers."

[http://gawker.com/5164035/tesla-ceo-in-digital-witch-hunt](http://gawker.com/5164035/tesla-ceo-in-digital-witch-hunt)

------
slig
I remember seeing a startup doing something similar to catch employees leaking
internal docs. Can't find it now, though.

~~~
brokentone
Agreed, remember a story, but can't place it. The technical term for this
apparently is a "Canary trap"
[http://en.wikipedia.org/wiki/Canary_trap](http://en.wikipedia.org/wiki/Canary_trap)

------
cpdean
DRM where I can still email my favorite books to friends. I support this.

