
An Analysis of the Impact of Arbitrary Blockchain Content on Bitcoin [pdf] - nvarsj
https://fc18.ifca.ai/preproceedings/6.pdf
======
ddalex
So the obvious bit here is that an adversarial actor can poison the blockchain
with bits that make downloading, storing and processing the blockchain a
highly illegal act.

This may have a chilling effect on participation in the network, reducing
the number of nodes and undermining trust in the blockchain and in the
coin, because less computing power would then be needed to mutate the
blockchain.

Accordingly, a determined actor with enough resources (e.g. a nation-state) can
render bitcoin valueless and useless at will.

~~~
VMG
The data embedded in the blockchain can only be extracted using custom tools.

Custom tools can be used to extract any data from almost any other kind of
data, using steganography. You can extract illegal data from innocuous-looking
tweets, using the right tools.

~~~
arcticfox
That's a bit of a stretch. There's no reason to obfuscate anything (unless
miners are actively and intentionally censoring content) and it'd be
straightforward to make a browser.

For example, I don't think it's a defense for the owner of a website to say
"this horrible illegal content I'm sharing is only accessible via a custom
tool that downloads and decodes image data from me...such as a web browser."

~~~
VMG
You can make a browser extension extracting illegal data from innocuous images
on twitter

~~~
OscarCunningham
Only by using your own stock of illegal data.

~~~
VMG
that's not true:
[https://en.wikipedia.org/wiki/Steganography](https://en.wikipedia.org/wiki/Steganography)

~~~
mlindner
You don't seem to understand what Steganography is and how it works. Perhaps
you should read that article. You cannot produce illegal information out of
non-illegal information without having that illegal information somewhere else
to use as a key.

~~~
VMG
[https://i.imgur.com/tn6dav0.png](https://i.imgur.com/tn6dav0.png)

the "key" is a very simple algorithm, that is obviously not illegal

~~~
SAI_Peregrinus
But in that case the second image was already stored in (low significance
portions) the first. A truly innocuous image won't have such stored data.
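
A minimal sketch of the low-significance-bit hiding described above, using plain byte strings as a stand-in for pixel data. The function names and the toy cover bytes are illustrative assumptions, not taken from the linked image or from any particular tool. It shows that the hidden payload has to be written into the cover first; running the same extractor over an untouched cover only yields whatever its low bits happen to be.

```python
def embed_lsb(cover: bytes, secret: bytes) -> bytes:
    """Hide `secret` in the least significant bit of each cover byte."""
    # Flatten the secret into bits, most significant bit first.
    bits = [(byte >> shift) & 1 for byte in secret for shift in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("cover too small for secret")
    out = bytearray(cover)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit  # overwrite only the lowest bit
    return bytes(out)


def extract_lsb(stego: bytes, n_bytes: int) -> bytes:
    """Read back `n_bytes` from the least significant bits of `stego`."""
    out = bytearray()
    for i in range(n_bytes):
        value = 0
        for bit_index in range(8):
            value = (value << 1) | (stego[i * 8 + bit_index] & 1)
        out.append(value)
    return bytes(out)


cover = bytes(range(200, 256)) * 2      # 112 stand-in "pixel" bytes
stego = embed_lsb(cover, b"hidden")     # each byte changes by at most 1
recovered = extract_lsb(stego, 6)
from_clean_cover = extract_lsb(cover, 6)
```

Here `recovered` is the embedded payload, while `from_clean_cover` is just the pattern already present in the cover's low bits.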

~~~
VMG
That's not important.

My point is that you could unknowingly store illegal data, without blockchain
being involved.

Anyone sharing the data by torrenting an apparently legal file, reposting an
image, or quoting a text could redistribute potentially illegal data.

This problem is not constrained to blockchain.

~~~
dragonwriter
> My point is that you could unknowingly store illegal data, without
> blockchain being involved.

The problem with blockchain is that a full node will be _knowingly_ storing
and distributing illegal content (or at least is likely to be held to be, once
the chain is widely known to be infested with such content). The only way
to stop is to stop storing and interacting with the chain.

With a torrent, you could be unknowingly doing so, but very easily _stop_
doing so with the particular content once you discovered it (without
abandoning torrents altogether), which may or may not be a complete bar to
liability but would probably result in generally more lenient treatment in any
situation other than one where the government is targeting you and the actual
offense is pure pretext, in which case the content hardly matters.

~~~
VMG
There are many non-blockchain scenarios where there is difficult-to-remove
data. How would one deal with embedded illegal data stored in firmware? Public
records?

And which subsets of illegal data are legal?

------
oasisbob
This gets close to something else I've been pondering lately - how to deal
with immutable data structures in the realm of GDPR.

Recitals in GDPR require systems to be designed with privacy in mind.
Immutable structures like the Bitcoin blockchain or Merkle trees in other
applications would seem to be fundamentally incompatible with some GDPR
privacy requirements.

Let's say Google receives a valid right to be forgotten request for an entry
in one of their Certificate Transparency logs? Then what? I don't see how it
can be dealt with without destroying the integrity of the log.

~~~
cesarb
Some immutable data structures can cope with missing data. Merkle trees are an
example: to validate that a leaf is part of the tree, you don't need any of
the other leaves (only their direct or indirect hashes). For the Bitcoin
blockchain, it has been designed so that transactions where all outputs have
been spent can be pruned, after the spending transactions have been validated.
It wouldn't be hard to extend this to prune "illegal" transactions, even after
they've been included in valid blocks; the only consequence would be that a
node wouldn't be able to validate other transactions spending these "illegal"
transactions (so it would have to risk accepting an invalid transaction, or
rejecting a valid transaction, in both cases risking being on the wrong side
of a fork).
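
A rough sketch of the Merkle-tree point above: checking that one leaf belongs to a tree needs only that leaf plus one sibling hash per level, so the other leaves' contents can be absent. This is a simplified single-SHA-256 tree with hypothetical helper names and toy leaf values; it is not Bitcoin's exact construction.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root_and_proof(leaves: list[bytes], index: int):
    """Return (root, proof); proof is a list of (sibling_hash, sibling_is_left)."""
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd levels
        sibling = index ^ 1          # neighbour within the same pair
        proof.append((level[sibling], sibling < index))
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof


def verify(leaf: bytes, proof, root: bytes) -> bool:
    """Recompute the root from one leaf and its sibling hashes only."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


leaves = [b"tx-a", b"tx-b", b"tx-c", b"tx-d", b"tx-e"]
root, proof = merkle_root_and_proof(leaves, 2)
ok = verify(b"tx-c", proof, root)        # needs no other leaf contents
bad = verify(b"tx-x", proof, root)
```

The verifier never sees `tx-a`, `tx-b`, `tx-d` or `tx-e` themselves, only hashes derived from them, which is the property that makes pruning leaf contents possible in principle.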

~~~
chrisabrams
You would still have the history of those transaction(s) before they were
pruned though...

------
zone411
Not too long ago, I checked the number of images that were posted on the
Ethereum blockchain and I also posted a test image
([https://etherscan.io/tx/0x3ee5575306ddc235b0586984172888b47e...](https://etherscan.io/tx/0x3ee5575306ddc235b0586984172888b47e92789d672d1aa917f261395efb2495)).
There were 74 images with headers that appeared as jpg, gif, or png already
posted before my test image. It's very easy to post to the Ethereum
blockchain: just convert back and forth from file to hex using a site like
[http://tomeko.net/online_tools/hex_to_file.php?lang=en](http://tomeko.net/online_tools/hex_to_file.php?lang=en).
You might also have to calculate how much gas will be used. I opened a few
images and I saw a couple selfies but the oldest jpeg is a certain infamous
image from 1999 that I don't think anybody cares to see. I anticipated the
chance of illegal images (though it's not exactly a novel idea so I figured
somebody would notice it has happened by now if an illegal image has been on
the blockchain for a while), so I haven't opened others.

~~~
wakkaflokka
Does etherscan let you search the "Input Data" of a transaction? I can see
somebody building a script to parse the blockchain for image headers and
convert the hex to a file now that this 'idea' has hit the mainstream. Going
to be interesting to see what happens with this.

~~~
zone411
I wrote the script to parse the Ethereum blockchain from a local copy myself.
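
For illustration only, a header check of the kind discussed here might look like the sketch below. It is not zone411's script; the function name is hypothetical, and it only tests whether a transaction's hex input data begins with a well-known JPEG, PNG or GIF signature.

```python
# Leading "magic" bytes of common image formats.
MAGIC = {
    "jpg": bytes.fromhex("ffd8ff"),
    "png": bytes.fromhex("89504e470d0a1a0a"),
    "gif": b"GIF8",
}


def sniff_image_type(input_data_hex: str):
    """Return 'jpg', 'png', 'gif', or None for a transaction's hex input data."""
    hex_body = input_data_hex[2:] if input_data_hex.startswith("0x") else input_data_hex
    try:
        payload = bytes.fromhex(hex_body)
    except ValueError:
        return None  # odd length or non-hex characters
    for kind, magic in MAGIC.items():
        if payload.startswith(magic):
            return kind
    return None


png_like = sniff_image_type("0x89504e470d0a1a0a0000")
jpg_like = sniff_image_type("0xffd8ffe000104a464946")
not_image = sniff_image_type("0xa9059cbb")
```

A matching header only means the payload appears to be an image, which is consistent with the "headers that appeared as jpg, gif, or png" wording above; it says nothing about whether the rest of the data decodes.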

------
Moodles
Some interesting graffiti on the bitcoin blockchain:
[http://www.righto.com/2014/02/ascii-bernanke-wikileaks-photo...](http://www.righto.com/2014/02/ascii-bernanke-wikileaks-photographs.html)

I have always wondered this. What will a country do if someone embeds child
pornography or a picture of Mohammed or something in the blockchain? Will it
then be illegal to store the blockchain in that country? Is a link to such an
image much different to an actual image? It seems hard to ever stop this
happening with a public permissionless blockchain, pretty much by design.

~~~
zeth___
I haven't looked for obvious reasons but back in the day people were saying
that there was already child porn embedded in bitcoin:
[https://bitcointalk.org/index.php?topic=671894.0](https://bitcointalk.org/index.php?topic=671894.0)

Which is why child porn is one of the Four Horsemen of the Infocalypse:
terrorists, pedophiles, drug dealers, and money launderers.

That we are trying to apply flesh space laws to bits just goes to show how
stupid we still are:
[https://en.wikipedia.org/wiki/Illegal_prime](https://en.wikipedia.org/wiki/Illegal_prime)

Until we change the laws we have to meet the needs of digital computers
instead of the printing press we will have these ridiculous ways of attacking
useful new technology.

~~~
dpwm
I'm reading this charitably as you saying that laws that criminalise
distribution regardless of intent or knowledge need to be changed to recognise
that with the internet people can effectively commit crimes without knowledge
or intent. This is particularly relevant where distribution of material (child
porn, terrorist material) is criminalised.

This seems a really difficult area that's almost incompatible with the way we
do criminal law, because intent and knowledge are so hard to prove either way.
Even where there's plenty of circumstantial evidence of intent it is not going
to prove it either way.

You could plan to murder somebody for years, leave an evidence trail and then
run them over by complete accident. Should that be premeditated murder or
manslaughter? In the eyes of a jury it will almost certainly be premeditated
murder. You could black out at the wheel and for a while have no knowledge
that you have killed. From the philosophical perspective the lines are blurry:
Only the individual can actually know, and given the extent to which people
can delude themselves, even that isn't guaranteed.

Juries do not make decisions on reasonable doubt, and often default to balance
of probabilities. Depending on the jurisdiction, when a unanimous verdict
cannot be determined a majority one is accepted.

Let's say I have a HDD I write random data to as a block device. What are the
implications for me in twenty years if somebody creates an image file format
that can decompress some of the linear subregions of my random HDD data? I
haven't the time now to do anything but a very simple analysis of this.
Intuitively this depends on the size of the drive and the size of the
compressed file. Let's say for argument's sake it can encode a prohibited piece
of data in 10kB.

At what point of completely random storage material are you likely to have a
forbidden piece of data? Well, each terabyte contains approximately 1e12 such
linear subsequences. And we need 1e3010 such subsequences to match a forbidden
sequence. So that's 1e2998 or so TB if there is only one forbidden piece of
data. With more I think the birthday paradox kicks in. Now if we can encode
the forbidden data in 8 bytes or such then we reach the problem much sooner. I
doubt that will happen somehow.
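
The estimate above can be reproduced as a quick back-of-the-envelope calculation. The sketch assumes one candidate start offset per byte (about 1e12 per terabyte) and a single forbidden bit pattern. Note that 1e3010 corresponds to 2^10000, i.e. a 10,000-bit pattern; reading "10kB" as 10,000 bytes (80,000 bits) gives a much larger exponent, which only strengthens the conclusion.

```python
import math

FILE_BYTES = 10_000        # "10kB", read as 10,000 bytes
OFFSETS_PER_TB = 10**12    # assumption: one candidate start offset per byte


def log10_patterns(n_bits: int) -> float:
    """log10 of the number of distinct bit strings of length n_bits."""
    return n_bits * math.log10(2)


def log10_terabytes_needed(n_bits: int) -> float:
    """log10 of the terabytes of random data expected per chance match."""
    return log10_patterns(n_bits) - math.log10(OFFSETS_PER_TB)


log10_as_bits = log10_patterns(FILE_BYTES)          # 10,000-bit pattern
log10_as_bytes = log10_patterns(FILE_BYTES * 8)     # 80,000-bit pattern

log10_tb_bits = log10_terabytes_needed(FILE_BYTES)
log10_tb_bytes = log10_terabytes_needed(FILE_BYTES * 8)
log10_tb_8_bytes = log10_terabytes_needed(8 * 8)    # the 8-byte case

print(f"10,000 bits: 1e{log10_as_bits:.0f} patterns, ~1e{log10_tb_bits:.0f} TB")
print(f"10,000 bytes: 1e{log10_as_bytes:.0f} patterns, ~1e{log10_tb_bytes:.0f} TB")
print(f"8 bytes: ~1e{log10_tb_8_bytes:.1f} TB")
```

Under the same assumptions, an 8-byte pattern would be expected to turn up by chance within roughly ten million terabytes, which is why the short-encoding case is the only one that gets anywhere near practical storage sizes.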

~~~
zeth___
> I'm reading this charitably as you saying that laws that criminalise
> distribution regardless of intent or knowledge need to be changed to recognise
> that with the internet people can effectively commit crimes without knowledge
> or intent. This is particularly relevant where distribution of material (child
> porn, terrorist material) is criminalised.

No I am making the very simple case that a number is a number and you can't
make a number illegal.

All digital information is numbers and banning any of the 4 horsemen of the
infocalypse at mere possession will ensure we retard most useful technologies.

The only time that flesh space laws should apply is when flesh space actions
are taken.

An example: terrorist training material is sent on how to build a bomb. Until
a bomb is built, or a conspiracy to build a bomb is made, nothing illegal
should have happened.

~~~
Freak_NL
Wouldn't that make cyberbullying and slander even in its most extreme forms
legal? In those cases often someone is being harmed (with real, flesh space
consequences), despite the material being nothing more than numbers.

~~~
zeth___
Slander and cyberbullying are civil law, not criminal law.

~~~
zaarn
But it's just digital information, as you mentioned. It never enters flesh
space as real.

~~~
zeth___
Civil law is again very different to criminal law.

You can obviously have a contract between two people for sending digital
information between them.

That this only happens in digital space does not mean the contract can't or
shouldn't be enforced.

------
rogual
Isn't intent generally a big deal in law? If I argued that your innocent MS
Word doc was illegal because it decoded to child abuse images under a special
encoding scheme I just made up, I'd be laughed out of court, and rightly so.
It's technically true, but irrelevant. No crime was committed. Doesn't the
same principle apply here?

~~~
cstross
In the UK (and, I believe, the USA) child pornography is a "strict liability"
issue — intent is irrelevant, mere possession is itself illegal. This is how
teens get nailed for receiving unasked-for sexts from their
under-age-of-consent SO.

As with illegal drugs and, in the UK, unlicensed firearms, this means the
prosecution doesn't need to prove intent. (Mandatory — usually harsh —
sentence terms are usually part of the package with strict liability
offenses.)

~~~
dandare
I don't know enough about UK laws or laws in general, but it should be easy to
discredit such law. It only takes one rebel tech savvy teen to send his nude
sext to whatever celebrity/politician email/phone he can guess while alarming
the police - and the press to stir as much shit as possible.

Actually, pardon me for a minute, I have sci-fi short to write :).

~~~
cstross
In the UK, the police and prosecutors tend to take a common-sense approach:
the "rebel tech savvy teen" would be the one who ended up prosecuted for
possession.

(Similarly: possession of an unlicensed gun carries a stiff prison sentence.
But if someone chucks a pistol on your lawn and you without delay call the
police and ask them to take it away, you're probably safe. Picking it up and
taking it inside is another matter, however ...)

The USA is a bit different. District Attorneys being elected means they have
an incentive to bring charges against "soft" targets who'll take a plea
bargain, i.e. hapless teens and people too poor to afford a decent defense
lawyer.

------
mtgx
How do you keep the blockchain uncensorable if you start accounting for child
porn, and copyright, and all this other stuff that "needs to be taken down"?

And if you do all of those things, why even bother with the blockchain and all
of its cons in the first place? All that remains is the completely accountable
surveillance of users, and I believe that becoming a bigger surveillance
machine than even Facebook ever was wasn't the original vision for the
blockchain.

I think that unless the main blockchain projects such as Bitcoin, Ethereum,
and so on start implementing anonymity by default soon, in a few years
they won't be allowed to do it anymore.

~~~
hndamien
One can hope that the Monero market cap catches up with Bitcoin for this very
reason.

------
hapnin
When PGP was released in the early 90s, one of the arguments used against it
was "child porn". When P2P networks using DHT became viable earlier in this
century, one of the arguments used against the networks was "child porn".

The tech still remains and still functions, arguments or not.

~~~
otherwiseguy
The difference is between "the technology" and an "implementation of the
technology". Sure, P2P DHT exists. Just like blockchains exist. But a specific
blockchain, which is append-only, is trivial to make illegal by embedding
illegal content in it, making every user of it technically in possession of
illegal material. You can use a DHT w/o breaking the law. You cannot store
child porn on your computer without breaking the law.

~~~
posterboy
In some cases a link to a file is enough, which would implicate DHT, but due
to the nature of the distribution would not compromise the whole network. Cf.
Pirate Bay.

------
def-
If you are interested in this kind of paper, it was in today's morning paper:
[https://blog.acolyer.org/2018/03/19/a-quantitive-analysis-of...](https://blog.acolyer.org/2018/03/19/a-quantitive-analysis-of-the-impact-of-arbitrary-blockchain-content-on-bitcoin/)

------
m3kw9
Isn’t this more like someone throwing a stolen item in your bag without you
consenting or knowing?

------
XR0CSWV3h3kZWg
For the cases where data is encoded as the P2PK address, those tx should be
safe to prune, although downloading them to verify the block would be
necessary.

------
bryanph_
The solution is simple. You need a cryptocurrency where the primitives are
simple and predictable. Basically the same reason we don't use eval() on user
input. This might very well mean the end for bitcoin if legal parties pick up
on this.

------
jwilk
Abstract:

 _Blockchains primarily enable credible accounting of digital events, e.g.,
money transfers in cryptocurrencies. However, beyond this original purpose,
blockchains also irrevocably record arbitrary data, ranging from short
messages to pictures. This does not come without risk for users as each
participant has to locally replicate the complete blockchain, particularly
including potentially harmful content. We provide the first systematic
analysis of the benefits and threats of arbitrary blockchain content. Our
analysis shows that certain content, e.g., illegal pornography, can render the
mere possession of a blockchain illegal. Based on these insights, we conduct a
thorough quantitative and qualitative analysis of unintended content on
Bitcoin’s blockchain. Although most data originates from benign extensions to
Bitcoin’s protocol, our analysis reveals more than 1600 files on the
blockchain, over 99 % of which are texts or images. Among these files there is
clearly objectionable content such as links to child pornography, which is
distributed to all Bitcoin participants. With our analysis, we thus highlight
the importance for future blockchain designs to address the possibility of
unintended data insertion and protect blockchain users accordingly._

~~~
elmar
"illegal pornography, can render the mere possession of a blockchain illegal."

I see this as a very strong legal attack vector on full nodes and
cryptocurrencies. A way around it is probably to only allow metadata in an
encrypted form, though even then the owner can publish the view key
publicly.

A drastic solution is to just prune metadata or not allow it at all.

~~~
CamTin
You don't even need the ability to record metadata on-chain to encode
arbitrary data. An agreed-upon method of encoding it into ordinary
transactions is enough. Even if BTC-style transactions were just
inputs/outputs (they're not), you could still encode information down into the
satoshi-place of the inputs or outputs themselves. It's even worse for
something like Ethereum: essentially the whole point of that blockchain is to
encode arbitrary (executable) metadata in the form of the contracts
themselves.
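
As a toy illustration of the amounts idea, here is a made-up "agreed-upon method" in which each payload byte rides in the low-order satoshis of one output amount. The scheme, the function names and the base amount are all hypothetical; this is not a real Bitcoin protocol feature or any known tool.

```python
def encode_in_amounts(payload: bytes, base_amount: int = 10_000) -> list[int]:
    """Map each payload byte to an output amount: base_amount + byte value."""
    return [base_amount + byte for byte in payload]


def decode_from_amounts(amounts: list[int], base_amount: int = 10_000) -> bytes:
    """Recover the payload by subtracting the agreed base from each amount."""
    return bytes(amount - base_amount for amount in amounts)


amounts = encode_in_amounts(b"hi")       # two ordinary-looking output values
decoded = decode_from_amounts(amounts)
```

Nothing in the resulting amounts is distinguishable from ordinary payments unless the reader already knows the convention, which is why removing metadata fields alone would not rule out embedded data.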

~~~
rogual
You could do that with an ordinary bank account though, and call the cops on
your bank. In fact, you could do it with any service provider who logs your
activity. Simply invent an encoding scheme and encode something illegal in
your actions.

------
martindale
Title is sensationalism. Actual paper title is "A Quantitative Analysis of the
Impact of Arbitrary Blockchain Content on Bitcoin" — see also Illegal Numbers
[0].

[0]:
[https://en.wikipedia.org/wiki/Illegal_number](https://en.wikipedia.org/wiki/Illegal_number)

~~~
zeth___
The paper isn't wrong though. There are supposedly numbers that correspond to
illegal jpg images in the bitcoin block chain.

~~~
omginternets
> There are supposedly numbers that correspond to illegal jpg images in the
> bitcoin block chain.

How did they come about? Do you have a source for this?

~~~
throwawaylolx
That's precisely what the paper linked in the OP describes?

------
bsenftner
This is obvious, even to the casual observer (who cares to think beyond the
hype.) If this did not occur to you, and you are into
bitcoin/cryptocurrencies, you should get out as you're in way over your head.
(expecting down votes to the truth.)

~~~
pfortuny
Well, it depends on the design: the format of each entry might be fixed (think
of a specific blockchain for a specific task). Yes, a generalistic blockchain
is quite likely unable to avoid this.

But who knows.

~~~
userbinator
_the format of each entry might be fixed (think of a specific blockchain for a
specific task)._

As long as you can distinguish between a 0 and 1, or any two states in
general, you can store and represent arbitrary data. This basic premise is
what makes digital computers so flexible and powerful, and how things like
stenography and crypto work.

~~~
evanb
I think you mean steganography.

------
tromp
> We thus believe that future blockchain designs must proactively cope with
> objectionable content.

The MimbleWimble blockchain design [1] on the other hand doesn't leave any
room to add freely chosen data, mostly due to the lack of any form of scripts.

[1] [http://mimblewimble.cash/](http://mimblewimble.cash/)

------
randomerr
If you can trade and breed Pokemon-esque creatures in the blockchain, why not
illegal content? It's practically untraceable and you don't have to go to the
trouble of running something like an onion router to obscure your
identity.

CryptoKitties craze slows down transactions on Ethereum

[http://www.bbc.com/news/technology-42237162](http://www.bbc.com/news/technology-42237162)

------
arthurcolle
god this is reposted so many times, SOO MANY TIMES

public service advisory: read the whole btc wiki, there's a lot of great stuff
in there!

[https://en.bitcoin.it/wiki/Agent](https://en.bitcoin.it/wiki/Agent)

[https://en.bitcoin.it/wiki/Script](https://en.bitcoin.it/wiki/Script)

Links of note: [https://bitcoinmagazine.com/articles/bootstrapping-a-decentr...](https://bitcoinmagazine.com/articles/bootstrapping-a-decentralized-autonomous-corporation-part-i-1379644274/)

~~~
banachtarski
oh great cryptocurrency zealot, how must we repent for our transgressions
against the faith?

~~~
dang
Please don't make the site worse by posting like this, regardless of how
annoying another comment may be.

[https://news.ycombinator.com/newsguidelines.html](https://news.ycombinator.com/newsguidelines.html)

