
[EXPERIMENT] I attached a file to a tweet - botolo
http://pastebin.com/1pxubqfw
======
jerrya
Long before there was Dropbox, I used to claim my backup solution was to use
steganography and place my data inside porn pictures that I would upload.

Then I could just use Google to find my backups distributed all across the
net.

You might be looking at my 2007 Quicken files even now!

~~~
joering1
how?

~~~
nsmartt
I have no idea if this guy is serious, (EDIT: I submitted this before I saw
his response) but it's entirely possible to embed data in JPEG files via
base64 + modifying exif data. It isn't really viable for large files, however,
unless vastly distributed. Perl:

    
    
      use Image::ExifTool;
    
      my $img = Image::ExifTool->new;
    
      $img->ExtractInfo('image.jpg');
      # $yourdata would hold the base64-encoded payload; uncomment one tag:
      #$img->SetNewValue("UserComment", $yourdata);
      #$img->SetNewValue("Comment", $yourdata);
      $img->WriteInfo('image.jpg');

~~~
hoop
You don't need to use exif data. Many tools use the often ignored alpha
channel of each pixel. In this case, you have X x Y bytes of available storage
where X and Y are the width and height of the image.

~~~
baddox
For lossless images, you could also use the least significant bit of each
color channel of each pixel to represent your binary payload. I doubt this
would be noticeable, especially with a photograph.
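
A minimal Python sketch of the least-significant-bit idea, assuming the image has already been decoded into a flat buffer of 8-bit channel values. The names `embed_lsb` and `extract_lsb` are made up for illustration, and reading or writing an actual lossless image file is left out:

```python
def embed_lsb(channels: bytes, payload: bytes) -> bytearray:
    """Hide payload in the lowest bit of each channel value (one bit per byte)."""
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    if len(bits) > len(channels):
        raise ValueError("payload too large for this many channel values")
    out = bytearray(channels)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit  # clear the lowest bit, then set it
    return out


def extract_lsb(channels: bytes, n_bytes: int) -> bytes:
    """Read n_bytes back out of the lowest bits, most significant bit first."""
    result = bytearray()
    for i in range(n_bytes):
        value = 0
        for bit_index in range(8):
            value = (value << 1) | (channels[i * 8 + bit_index] & 1)
        result.append(value)
    return bytes(result)


# Hypothetical 4x4 RGB image: 48 channel values, so room for 6 payload bytes.
pixels = bytes(range(100, 148))
stego = embed_lsb(pixels, b"backup")
recovered = extract_lsb(stego, 6)
max_change = max(abs(a - b) for a, b in zip(pixels, stego))
```

Each channel value changes by at most 1, which is why the change is hard to see. Capacity is one bit per channel value, so three bits per RGB pixel.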

------
bkirwi
Some numbers on this:

A standard-quality .avi is about 800MB. Base64 provides 6 bits of information
per character, so that movie translates into ~8M tweets. Twitter seems to
limit users to 1k messages per day,[0] so that movie would take about 22 years
to upload.

[0] <https://support.twitter.com/articles/15364-about-twitter-limits-update-api-dm-and-following>
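
The estimate can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: it assumes "800MB" means 800 MiB, that every tweet is filled with 140 Base64 characters, and that nothing is spent on sequence numbers or other overhead.

```python
FILE_BYTES = 800 * 2**20      # "about 800MB", read here as 800 MiB (assumption)
CHARS_PER_TWEET = 140         # tweet length limit
BITS_PER_CHAR = 6             # Base64 carries 6 bits per character
TWEETS_PER_DAY = 1000         # the daily limit cited in the comment

bytes_per_tweet = CHARS_PER_TWEET * BITS_PER_CHAR // 8
tweets_needed = (FILE_BYTES + bytes_per_tweet - 1) // bytes_per_tweet  # round up
years = tweets_needed / TWEETS_PER_DAY / 365

print(f"{bytes_per_tweet} bytes per tweet")   # 105 bytes per tweet
print(f"{tweets_needed:,} tweets")            # 7,989,151 tweets
print(f"{years:.1f} years")                   # 21.9 years
```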

~~~
icebraining
You could greatly reduce the number of tweets by using an encoding that took
advantage of the full Unicode spectrum, as opposed to Base64 which just uses
the ASCII set.
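
A rough Python sketch of what such an encoding might look like, packing 14 bits into each character by mapping values onto the CJK Unified Ideographs block. The names `encode14` and `decode14` are hypothetical, and it is an assumption (not verified here) that the service counts each of these characters as one and stores them unchanged:

```python
import unicodedata

BASE = 0x4E00   # start of the CJK Unified Ideographs block
BITS = 14       # 2**14 = 16384 code points, ending at U+8DFF inside that block


def encode14(data: bytes) -> str:
    """Pack bytes into characters carrying 14 bits each (zero-padded at the end)."""
    n_bits = len(data) * 8
    pad = (-n_bits) % BITS
    value = int.from_bytes(data, "big") << pad
    n_chars = (n_bits + pad) // BITS
    return "".join(
        chr(BASE + ((value >> (BITS * (n_chars - 1 - i))) & 0x3FFF))
        for i in range(n_chars)
    )


def decode14(text: str, n_bytes: int) -> bytes:
    """Inverse of encode14; the caller must supply the original byte length."""
    value = 0
    for ch in text:
        value = (value << BITS) | (ord(ch) - BASE)
    pad = len(text) * BITS - n_bytes * 8
    return (value >> pad).to_bytes(n_bytes, "big")


payload = bytes(range(245))          # 245 bytes = 1960 bits = 140 * 14
tweet = encode14(payload)
round_trip = decode14(tweet, len(payload))
survives_nfc = unicodedata.normalize("NFC", tweet) == tweet
```

Under those assumptions a 140-character message carries 245 bytes instead of the 105 that Base64 manages.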

~~~
ZenPsycho
Wouldn't it be easier to just stick <http://> in front of a chunk of data and
let the Twitter URL shortener take care of it for you?

~~~
baddox
But then you're not really using Twitter (their main micro blogging service)
to share the file. You might as well encode your data as a lossless image,
upload to Flickr, and paste the link into Twitter.

~~~
ZenPsycho
So in order to break one arbitrary set of rules, we should strictly adhere to
this other arbitrary set of rules, otherwise what would be the difference
between doing that, and just not following any rules at all?

.... huh? This kind of reminds me of this xkcd strip about floor tiles
<http://xkcd.com/245/>

In any case, I kind of wonder why nobody has thought of PSK encoding
their DVDs and walking around town blasting them out of a boombox. I'm sure
that will make you loads of friends.

------
cyanbane
One interesting thing about this is the question of what a file is. Depending
upon how Twitter stores a tweet, does a file actually exist at their location,
or only a stream of information? Under the current DMCA, would they be asked
to remove X number of tweets in a row? If the tweets are decoded and a
copyright holder (provided there is one) notices, does he have to submit 155
claims/takedown notices? What if that information is broken up and chunked,
and some agreed-upon pattern is used, i.e. every 3rd tweet is garbage, etc.?

~~~
Animus7
This seems to kind of tread into the territory of DeCSS haikus [1]; if I can
speak (tweet) the file, isn't that expression of my rights to free speech?

[1] <http://www.cs.cmu.edu/~dst/DeCSS/Gallery/decss-haiku.txt>

~~~
icebraining
_if I can speak (tweet) the file, isn't that expression of my rights to free
speech?_

Philosophically? Possibly. Legally? No, since public performances of literary
works are speech too and they're still protected by copyright.

~~~
cyanbane
Interesting (IANAL). So what is the legal distinction, persistence? I read
recently that the MLK _"I Have a Dream"_ Speech was copy-written. I assume the
difference in persistence comes from the delta between if I go up to the front
of an auditorium and read the speech vs if I record that speech to an
auditorium and redistribute it. Twitter would be an archivable public medium;
is that where its usage as a medium becomes illegal?

~~~
icebraining
No, there's no distinction, going up to the front of an auditorium and reading
the speech is copyright infringement too. According to the law, _that_ speech
is not protected free speech.

(by the way, sorry, nitpick: copywritten → copyrighted)

------
aed
One of the more interesting chapters in the Steve Jobs book discusses the time
when Jobs finally convinced the recording industry that piracy wasn't
necessarily a problem because people want free stuff (though that is a portion
of it) but that it was simply easier than the alternatives. (In the case of
the music industry at the time, every label had their own solution and they
were all a pain to use.)

This experiment (and other humorous examples like this:
<http://datenform.de/blog/dead-drops-preview>) displays the complexity of
trying to prevent piracy by fighting it. If a critical mass of people want
something and there isn't a convenient way to get it, alternatives will arise.

Life... er... pirates will always find a way.

------
waffle_ss
Why limit yourself to Base64? Twitter supports Unicode quite well. The 140
character limit is actually counted using normalized Unicode code points[1].

[1]: <https://dev.twitter.com/docs/counting-characters>

------
papaver
The question in the end is who is responsible for the 'file.' MegaUpload was
shut down because they are being targeted as the responsible party. Most sites
like YouTube and others have convinced the necessary parties that they are not
responsible.

Once a file is broken down into multiple parts and scattered throughout, can
you be held responsible for hosting parts of files? How large does the 'part'
have to be for you to be held responsible? What happens if a file is split
into parts and posted on pastebin + github + blogs, and trackers are used to
manage and build the files again?

The only thing taking down megaupload will do is create new means to allow
sharing to occur.

~~~
wmf
_Once a file is broken down into multiple parts and scattered throughout, can
you be held responsible for hosting parts of files?_

Yes, because each part has the same color as the original.
<http://ansuz.sooke.bc.ca/entry/23>

_how large does the 'part' have to be to be held responsible?_

It doesn't matter. If the original is infringing, every bit in it is also
infringing.

_what happens if a file is split into parts and posted on pastebin + github +
blogs and trackers are used to manage and build the files again?_

Also doesn't matter.

~~~
anonymoushn
_It doesn't matter. If the original is infringing, every bit in it is also
infringing._

The following was pasted from an mp3 of Bangs by They Might Be Giants, rather
than typed:

    
    
      Q
    

This is against the law and I should go to gaol.

------
Natsu
Why not just tweet magnet links instead? Surely that's better than killing
Twitter with base64 encoded files? Especially images where you already have
twitpic and the like.

It's an interesting idea, but I don't see what you could do with it that you
can't already do better with other services.

~~~
pash
The author's motivation is to make an anti-censorship statement, not to
provide an easy means of sharing files:

> _[Piracy] has always existed and I believe it will always exist. ... Content
> providers and copyright holders should just acknowledge this and try to find
> a way to revolutionize the idea of content distribution and its business
> model._

This seems to me to be a good way to remind people (or introduce them to the
idea) that all information on the Internet, whether political speech, a copy
of _The Origin of Species_, mindless Twitter blather, or copyrighted files,
is in the end the same stuff, just a bunch of ones and zeroes.

Once you've acknowledged that fact, it's a short step to realizing that you
cannot disrupt piracy through technical means without also disrupting
communication essential to a free society. I think that's what the author's
getting at.

~~~
botolo
You totally got the point of my post and my experiment! I created the
experiment after reading one of my friends' reaction to the Filesonic decision
about stopping file sharing. He posted on his Facebook account that this was
the end of piracy. But as Pash correctly states, as long as content is
digital, it's just a bunch of ones and zeroes, and the Usenet experience
teaches us that any file can easily be translated into words and posted
anywhere, including blogging platforms.

Using Twitter was just a provocative way of analyzing the issue, given that
the experiment is also deeply interesting in light of the DMCA, as another
user suggested. I was wondering what would happen if people started
"infesting" legit websites such as Twitter or other blogging platforms with
copyright-protected content in the form of Mime64 messages, or using any other
encoding method that translates a file into text. While posting a link to a
copyright-protected file would justify a DMCA takedown notice, the problem
would be bigger if the file were hosted (in the form of Mime64 tweets)
by the blogging platform itself.

Let's make it even more complex. What if tweets or blog posts were posted
randomly (not just by one user) and links to all these tweets or posts were
collected in a small document, something similar to a .nzb file? Could
Twitter or any other blogging platform refuse a DMCA takedown notice for
messages which are harmless by themselves and which are a violation of a
copyrighted work only if collected together following the list of files
contained in the .nzb-look-a-like file (which may be hosted somewhere else)?
This would be a general counsel's nightmare.

In conclusion, the fight against piracy is the classic cat-and-mouse
game: piracy will never die, and the only way around this is to come up
with a new business model for content providers and copyright holders.

I know that Paul Graham recently launched the idea of "killing Hollywood" by
creating a new business model for content distribution. I think this idea
should be even broader and should be about "how to kill piracy" by removing
the pirates' motivations (which are, I think, sharing, avoiding premium
prices, etc.) through a new business model or method for content delivery
and fee collection for content use.

Thanks everyone for the great comments that you posted so far.

~~~
icebraining
_Could Twitter or any other blogging platform refuse a DMCA takedown notice
for messages which are harmless by themselves and which are a violation of a
copyrighted work only if collected together following the list of files
contained in the .nzb-look-a-like file (which may be hosted somewhere else)?_

Sure they could, the tweets would still have the work's colour[1].

[1]: <http://ansuz.sooke.bc.ca/entry/23>

------
Genbox
With some regular expression magic I extracted the data from Twitter, however,
the data was in the wrong order. With this command in Linux, I reversed the
order and decoded the data:

      tac twitData.txt | base64 -d -i > image.jpg

I've uploaded the image here:

<http://iqsecur.blogspot.com/2012/01/sending-files-using-twitter.html>

However, it seems there is an error in the image, I'm not sure if it is the
process itself or the image actually has an error. A reverse search resulted
in 0 results, so I'm inclined to believe the former.
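
For anyone without `tac` and `base64` at hand, roughly the same two steps can be sketched in Python. This assumes the scraped tweets are one chunk per line, newest first; `reassemble` and the sample data are made up for illustration and are not the data from the experiment:

```python
import base64


def reassemble(lines: list[str]) -> bytes:
    """Reverse newest-first chunks into posting order and decode them as one stream."""
    chunks = [line.strip() for line in reversed(lines) if line.strip()]
    return base64.b64decode("".join(chunks))


# Stand-in data: a short payload split into 8-character chunks, listed newest first.
original = b"not really a JPEG, just sample bytes"
encoded = base64.b64encode(original).decode("ascii")
posted = [encoded[i:i + 8] for i in range(0, len(encoded), 8)]
scraped = list(reversed(posted))

restored = reassemble(scraped)
```

Unlike `base64 -d -i`, this sketch only skips blank lines and does not ignore other stray characters, so a corrupted chunk will raise an error or silently decode to wrong bytes.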

~~~
garethsprice
The technique of splitting files into many base64-encoded chunks was popular
on Usenet in the 90s (another 7-bit platform with message size limits).

This was always the problem there too - 144 parts, 3 of which were
missing/corrupt and the entire transfer was rendered useless.
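
A small Python sketch of the multi-part approach, with each post carrying an `index/total:` prefix so the receiver can at least tell which parts are missing. The prefix format and the function names are assumptions for illustration, not a description of any actual Usenet convention:

```python
import base64


def split_into_posts(data: bytes, chunk_chars: int = 120) -> list[str]:
    """Base64-encode data and prefix each chunk with 'index/total:'."""
    encoded = base64.b64encode(data).decode("ascii")
    chunks = [encoded[i:i + chunk_chars] for i in range(0, len(encoded), chunk_chars)]
    return [f"{n}/{len(chunks)}:{chunk}" for n, chunk in enumerate(chunks, start=1)]


def missing_parts(posts: list[str]) -> list[int]:
    """Return the 1-based indices of parts that never arrived."""
    seen = set()
    total = 0
    for post in posts:
        header, _, _ = post.partition(":")   # Base64 never contains ':'
        index, _, count = header.partition("/")
        seen.add(int(index))
        total = int(count)
    return [n for n in range(1, total + 1) if n not in seen]


posts = split_into_posts(bytes(range(256)) * 4, chunk_chars=120)
received = [p for i, p in enumerate(posts, start=1) if i not in (3, 7)]
gaps = missing_parts(received)
```

Knowing which parts are gone does not bring them back, of course; it only tells you what to ask for again.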

~~~
Terretta
In operation, a user will select a set of files from which the redundant data
is to be made. These are known as input files and the set of them is known as
the recovery set. The user will provide these to a program which generates
file(s) that match the specification in this document. The program is known as
a PAR 2.0 Client or client for short, and the generated files are known as PAR
2.0 files or PAR files. If the files in the recovery set ever get damaged
(e.g. when they are transmitted or stored on a faulty disk) the client can
read the damaged input files, read the (possibly damaged) PAR files, and
regenerate the original input files. Of course, not all damages can be
repaired, but many can.

The redundant data in the PAR files is computed using Reed-Solomon codes.
These codes can take a set of equal-sized blocks of data and produce a number
of same-sized recovery blocks. Then, given a subset of original data blocks
and some recovery blocks, it is possible to reproduce the original data blocks.
Reed-Solomon codes can do this recovery as long as the number of missing data
blocks does not exceed the number of recovery blocks.

<http://www.par2.net/par2spec.php>
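
To make the recovery idea concrete, here is a deliberately simpler stand-in in Python: a single XOR parity block, which can rebuild exactly one missing block. This is not Reed-Solomon and not the PAR 2.0 file format (Reed-Solomon generalizes the idea to several recovery blocks), and the function names are made up for illustration:

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-sized blocks together byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def recover_missing(blocks: list, parity: bytes) -> list[bytes]:
    """Fill in the single None entry using the parity block."""
    missing = [i for i, block in enumerate(blocks) if block is None]
    if len(missing) != 1:
        raise ValueError("XOR parity can repair exactly one missing block")
    present = [block for block in blocks if block is not None]
    repaired = list(blocks)
    repaired[missing[0]] = xor_blocks(present + [parity])
    return repaired


data_blocks = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
parity = xor_blocks(data_blocks)

damaged = [b"AAAA", None, b"CCCC", b"DDDD"]   # one block lost in transit
repaired = recover_missing(damaged, parity)
```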

------
igul222
I really don't understand what the author is trying to prove here. Countless
sites let people upload and share free-form information (a few that come to
mind: Dropbox, Gmail, Facebook) in ways that would be much easier for pirates
to use than Twitter, and none of them are going to get shut down any time
soon.

Yes, shutting down Megaupload and its kin isn't going to stop piracy. But I
don't think that was ever the goal. As long as it reduces piracy by some
measurable amount, which I think it will, then the censors will have
succeeded.

~~~
botolo
I don't think this action will slow down piracy. The average user doesn't even
know the difference between Megaupload and other cyberlocker websites.
The user just (I guess) types what he wants into Google and gets some website
with links to these cyberlockers.

This act, on the contrary, could help induce other websites to correct their
conduct or could help induce potential developers to create such cyberlockers.

------
sachleen
I can't find it but I remember something that sent messages like this over
Facebook chat and then, on the other end, the software pieced it all together
to show the image that was transferred. Anyone else remember this? It was a
video so it might be on YouTube.

------
Skywing
To this extent, why not just turn Twitter into a torrent tracker? Tweet out
something like: <torrent id> + <seeder information>. You could then just
perform a tweet search for that torrent id, and you'd get all the seeders in
return.

~~~
icebraining
There's no real advantage in that: if you can use bittorrent, you can use DHT
too.

This experiment shows that unlawful file sharing would still be possible even
if the Internet connections were restricted to accessing only popular
websites, as long as they allow any kind of user content to be posted.

------
Genbox
Tweet #91 is an error from Excel. I was going to combine the messages into the
image, but I can't without the full base64 stream.

~~~
botolo
Oops, sorry about that! This is the missing tweet:
+WjH/aGaR2ZyNxHHQCgALlpcscmpdvHaolA381MAcfxflQB6w/w48T6mslwuk3UiIQrNEhJz6Y9a5PUPBup2EzI8ZRweY5BhhX9EPh39lCey8NRtCbZZjHkfaLcGVS3XJ6E/UVF4n/Y4

------
funkah
Considering the medium, couldn't the author just have pasted the base-64
encoded data on pastebin?

~~~
Animus7
I could be wrong, but I think the idea is that Twitter is immune to
censorship, so no matter what sites get taken down, file sharing can live on
in 300 baud. Or something.

...Yeah, I don't really get it either.

~~~
ertdfgcb
Well, it does say "experiment". But I think the point is summed up in this
line: "I believe that piracy will always exist". The OP is showing that data
is so malleable that no matter how many Megauploads you take down, there will
always be a way to share it, which seems obvious to us at HN, but to a layman
this might be a little more eye-opening. Or maybe he was just fucking around
over the weekend and wanted to find the most ridiculous possible way to send
data.

------
grusk
Comment on publishing platform, not content: Instead of using pastebin.com,
use <http://pen.io> (for example PAGENAME.pen.io -- no account required, and
you can edit if you have the password to the page, however you can't format
the text) or <http://hackpad.com> (account registration is quick and you can
format your text).

