
Stylometry only works to an extent when it comes to identifying who's who. I could use stylometry to show that this document was written by a professional screenwriter, but writing style isn't unique enough to identify one person out of billions. Gwern has an excellent post about this.



I remember a really impressive demo where someone found all the alt accounts of a HN user using stylometry (https://news.ycombinator.com/item?id=17944484). The account didn't have that many comments on it (maybe ~1000 karma).

He also claims to have used the same algorithm for identifying company insiders for trades (see parent comment in that thread for context).

I don't think he was bullshitting, even though his tool isn't public, because he has a different NLP project that works reasonably well (https://hnprofile.com/, which can find users based on their interests).


Agree 100%. I'd believe that the DHS has identified Satoshi, but I'm very skeptical that they could use mass email collection to do it. Stylometry could work, though, if you limited your search to the set of people active on niche pre-Bitcoin lists/forums where Satoshi was likely to be active under a different name.


It depends on how much data you have. When you have everyone's emails, you can rank everyone by closest match and then narrow the parameters.
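To make "rank everyone by closest match" concrete, here is a rough sketch. It assumes character trigram counts compared with cosine similarity, which is a common stylometry baseline and not necessarily what the tool in the linked thread or any agency uses. The function names (`char_ngrams`, `cosine`, `rank_candidates`) are made up for illustration.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Count overlapping character n-grams after lowercasing and collapsing whitespace."""
    text = " ".join(text.lower().split())
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count vectors (0.0 if either is empty)."""
    dot = sum(count * b[gram] for gram, count in a.items() if gram in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(unknown_text, candidates, n=3):
    """Return (name, score) pairs sorted from closest to most distant match.

    `candidates` maps an author name to a sample of that author's known writing.
    """
    target = char_ngrams(unknown_text, n)
    scores = [
        (name, cosine(target, char_ngrams(text, n)))
        for name, text in candidates.items()
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

With a small candidate pool (say, the regulars on one mailing list) the top of that ranking can mean something. With billions of candidates the top scores tend to bunch together, which is the point made upthread about style not being unique enough.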



