
The results could be horrible, but I can imagine a simple technique for hiding all those clues. Just send the text to Google Translate, translate it to an intermediate language, and then back to the original one. I can guarantee an excellent rinse and clean. Change the intermediate language and you will change the features of the final text. Of course, you risk horrible semantic changes in the final text ;)

UPDATE: fix typos.



There's a great paper that investigated this technique: https://www.eecs.berkeley.edu/~sa499/papers/adversarial_styl...

From the conclusions: "Translation with widely available machine translation services does not appear to be a viable mode of circumvention. Our evaluation did not demonstrate sufficient anonymization and the translated document has, at best, questionable grammar and quality."


Has there been any other work on anti-fingerprinting? Seems like just taking the fingerprinting algorithms and applying transforms that randomize the features they look for would go a long way.
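To make the "randomize the features" idea concrete, here is a toy sketch in Python. It assumes the classifier keys on interchangeable surface forms (contractions, spelling variants); the `VARIANTS` list and the `randomize_variants` name are made up for illustration, and nothing here is claimed to defeat a real stylometry tool, which would look at far more than this.

```python
import random
import re

# Hypothetical pairs of interchangeable forms. A real fingerprinting
# algorithm uses many more features than these.
VARIANTS = [
    ("do not", "don't"),
    ("cannot", "can't"),
    ("color", "colour"),
    ("gray", "grey"),
]


def randomize_variants(text, rng):
    """Replace every occurrence of a known form with a randomly chosen variant.

    Matches are case-insensitive, but the replacement is always lowercase,
    so capitalization at sentence starts is lost in this sketch.
    """
    for a, b in VARIANTS:
        pattern = re.compile(
            rf"\b(?:{re.escape(a)}|{re.escape(b)})\b", re.IGNORECASE
        )
        text = pattern.sub(lambda m: rng.choice((a, b)), text)
    return text


if __name__ == "__main__":
    rng = random.Random()  # unseeded, so each run can pick differently
    print(randomize_variants("i do not like grey; i can't see the color", rng))
```

The point is only that the author's habitual choice (always "don't", always "grey") stops being a stable signal; word choice, sentence length and punctuation habits are untouched.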


Took only a minute to try:

English -> Filipino (Tagalog) -> Chinese-simplified (Mandarin?) -> English

    I remember reading an article about a year ago (NSA) to identify the
    user, based on how they are written, vocabulary, spelling errors,
    grammar, language, and so on.

    It is interesting to me, because it is difficult to change the written
    and spoken word in use. It can be estimated that there are between two
    characters similar amount of help.

    Recently I can think of now is to check plagiarism (used in schools and
    universities, for example) is proprietary algorithm.

    Are there any public this algorithm? I can find out more information?
    (Academic journals?) I just DDGing wrong search terms?

I have to say, this is a great idea. Some information was lost in transit, but most of my thoughts came through (albeit broken). It's probably worse since I used multiple intermediaries and Mandarin doesn't map cleanly onto English (or vice versa) in grammar or vocabulary.

edit: A site for this exists. http://ackuna.com/badtranslator


Google Translate almost holds it together...until the very end...

Original:

  In the beginning God created the heaven and the earth. And the earth was without form, and void; and darkness was upon the face of the deep. And the Spirit of God moved upon the face of the waters.

[a half dozen languages later...]

Result:

  God created heaven and earth in the beginning. And the earth was formless and empty, and darkness was on the face of the abyss, man. Answer the Spirit of God on the water surface.


Of course, you're sending all this information to Google now. Are there any offline translators that are advanced enough to be used for something like this? I imagine most just naively map words 1:1, which wouldn't do much good here.


I was wondering the same. Oddly enough, you can use Google Translate offline on your phone; they let you download language packs. Not sure if they ever "phone home".


Google Translate to Tagalog is generally not great.

If you chain through European languages you barely lose anything. Doing English->Dutch->German->English is a good set to use.
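For anyone who wants to script a chain like that, a minimal sketch in Python follows. The `translate` argument is an assumption: any callable taking `(text, source_lang, target_lang)` will do, whether it wraps an online service or an offline engine. `tagging_translator` is a stand-in that only records each hop, not a real translator.

```python
from typing import Callable, Sequence

# Assumed translator signature: (text, source_lang, target_lang) -> text.
Translator = Callable[[str, str, str], str]


def round_trip(text: str, chain: Sequence[str], translate: Translator) -> str:
    """Pass text through each adjacent pair of languages in chain."""
    if len(chain) < 2:
        raise ValueError("chain needs at least a source and a target language")
    for src, dst in zip(chain, chain[1:]):
        text = translate(text, src, dst)
    return text


def tagging_translator(text: str, src: str, dst: str) -> str:
    # Stand-in for a real engine: it appends the hop so the flow is visible.
    return f"{text} [{src}->{dst}]"


if __name__ == "__main__":
    # English -> Dutch -> German -> English, as suggested above.
    print(round_trip("hello", ["en", "nl", "de", "en"], tagging_translator))
```

Swapping the middle of the chain (say `["en", "tl", "zh", "en"]`) is then a one-line change, which makes it easy to compare how much each route mangles the text.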



