

Deanonymizing Darknet Data - julianj
http://atechdad.com/Deanonymizing-Darknet-Data/

======
jcromartie
The data is not anonymous to begin with if it contains plain identifying
information (like an email address).

------
vonklaus
> Parsed hundreds of thousands

> found 37

I agree that anonymity is hard, and when you make waves it becomes harder, but
it would be tough to draw that conclusion from those stats. Interesting
nonetheless.

~~~
gwern
There would be many more than that if markets didn't (usually) strip metadata
by default.

~~~
vonklaus
Quite possibly true. I can't be asked to dig it up, but either this post or
another person speaking about your data said these sites are hand-coded and
often don't scrub data from uploaded images.

If you as an individual can't be asked to scrub a bit of metadata off of a
photo you upload to an online black market, you probably aren't qualified to
work part time at Walmart. Dodgy bit here though: it would be pretty trivial
to edit it and intentionally leak false info, and be moderately clever while
everyone thinks you are daft.

~~~
gwern
I wouldn't say _often_. Most of them, as far as I can tell, get that right
from the start, and the ones that don't fix it fairly quickly after being
told.

------
gwern
Incidentally, 37 is definitely low. When I did the same thing in early 2014, I
got >90 unique GPS coordinates.

~~~
rory096
Why the discrepancy? Just that most images on the sites aren't JPEGs? (What
else supports EXIF? TIFF?)

~~~
gwern
I believe I checked PNGs as well, and I went through all my archives up to
that point (it wasn't just Evolution which had forgotten for a while to strip
image metadata). Possibly my grep was also better at checking variant field
names? Dunno.

------
twerkmonsta
Wow. What an accomplishment. :/

------
MichaelCrawford
Most online child pornography is not hidden nor encrypted in any way. Just
search Bing; you can get the keywords from Wikipedia then follow Bing's
suggested queries.

Microsoft claims that they remove child abuse links from their index upon
notification by such organizations as the National Center for Missing and
Exploited Children, but clearly they don't.

There are some imagehosts that either have no way to locate their images from
their homepages, or whose search forms don't list child pornography in their
results; however, it's all indexed by Bing.

The videos at the filesharing services commonly have obvious or easily
brute-forced passwords. There aren't that many different passwords in
widespread use; I expect a list of the 100 most common passwords would decrypt
99% of the videos.

The passwords and lightly obfuscated links are mostly posted to dead forums.
If you operate a forum yourself, the simplest way to defeat this is to disable
posts in threads older than a month or so.

If you want to find these posts on your server, look for a large number of
occurrences of the word "password".

I have even seen this at last.fm.

------
elyrly
TIL stop allowing your pictures to have meta tags.
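
For anyone wondering what "stripping metadata" means in practice, here is a rough sketch, not what any of the markets actually run. It assumes a baseline JPEG where EXIF (including GPS tags) and XMP live in APP1 segments, and it simply drops those segments without touching the image data. The function name `strip_app1_segments` is made up for illustration; PNG and other formats keep their metadata elsewhere and are not handled.

```python
def strip_app1_segments(jpeg_bytes: bytes) -> bytes:
    """Return a copy of a JPEG with every APP1 (EXIF/XMP) segment removed.

    Sketch only: assumes each marker before the scan data is followed by a
    2-byte big-endian length, which holds for typical camera JPEGs.
    """
    if jpeg_bytes[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")

    out = bytearray(b"\xff\xd8")  # keep the start-of-image marker
    pos = 2
    while pos + 4 <= len(jpeg_bytes):
        if jpeg_bytes[pos] != 0xFF:
            raise ValueError("malformed JPEG: expected a marker")
        marker = jpeg_bytes[pos + 1]

        if marker == 0xDA:
            # Start of scan: everything from here on is compressed image
            # data plus the end-of-image marker, so copy it verbatim.
            out += jpeg_bytes[pos:]
            return bytes(out)

        # The length field counts itself (2 bytes) but not the marker.
        length = int.from_bytes(jpeg_bytes[pos + 2:pos + 4], "big")
        segment = jpeg_bytes[pos:pos + 2 + length]

        if marker != 0xE1:  # 0xE1 is APP1, where EXIF and XMP are stored
            out += segment
        pos += 2 + length

    # No scan segment found; copy whatever trailing bytes remain.
    out += jpeg_bytes[pos:]
    return bytes(out)
```

Dropping the whole segment rather than editing individual fields is the blunt option: it also removes the camera model, timestamps, and any embedded thumbnail, which is usually what you want before uploading a photo anywhere public.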

