

Lorem Ipsum: Of Good and Evil, Google and China - panarky
http://krebsonsecurity.com/2014/08/lorem-ipsum-of-good-evil-google-china/

======
lnanek2
Garbage in, garbage out. The system probably didn't have the lorem ipsum
placeholder text in its dictionary for all languages, so just mapped to
whatever its algorithms could guess. Since there is no right guess, it's
pretty random.

The rest of the conspiracy nonsense in the article is pretty silly and stupid,
honestly. There are a huge number of government documents translated into
other languages that were probably used as the training set. I have programs
of my own that rummage through SEC filings, for example. So NATO and countries
being common mappings if you pick a random one isn't odd at all.

~~~
broolstoryco
> The rest of the conspiracy nonsense in the article is pretty silly and
> stupid, honestly.

Was also surprised to read something like this on Krebs

~~~
j_s
[https://news.ycombinator.com/item?id=8152663](https://news.ycombinator.com/item?id=8152663)

Evaluate for yourself whether or not he may have fallen a few notches lately.

------
lultimouomo
Maybe it's a simplistic explanation, but I would think that this was caused by
the vast number of multi-language sites in which pages in languages other than
English have not been written yet, so selecting one of them displays the lorem
ipsum (Google Translate probably identifies these untranslated pages as Latin
even though they were supposed to be in language X).

~~~
mjburgess
The problem is the consistent politicization of each word in ways related to
intelligence and the extremely good properties of lorem ipsum text (it's
nonsense that doesn't stand out as nonsense - a holy grail of ciphers).

It's possible that this is statistical noise... but it seems particularly
plausible that it started out that way and then someone gamed it into being far
less noisy and more consistently intelligence-based.

~~~
johnlbevan2
I suspect words such as Company and China are pretty commonplace on the
internet, so the data Google learns from is likely to include a number of
these mapped to Lorem Ipsums. Sentences making sense could be down to another
part of the algorithm: Google doesn't just do word-for-word translation, but
tries to map meanings based on the context of the sentence and to ensure the
output sentence is grammatically correct in the new language. As such, the
algorithms distort the results into sentences which would be unlikely to
appear by chance, making them more appealing to us when suggesting
conspiracy.
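
To make the "learned from mismatched pages" idea concrete, here is a toy sketch. It is not how Google Translate works; it is just a co-occurrence counter, and the `pairs` corpus, `train`, and `translate` are all made up for illustration. It shows how placeholder Latin that happens to sit opposite real text can end up "translating" to whatever words were most common on the other side.

```python
from collections import Counter, defaultdict

# Hypothetical "parallel" corpus: placeholder Latin on one side, whatever
# real text happened to be on the matching page on the other side.
pairs = [
    ("lorem ipsum", "china company"),
    ("lorem ipsum dolor", "china internet"),
    ("ipsum dolor", "company internet"),
    ("lorem", "china"),
]

def train(pairs):
    """Count how often each source word co-occurs with each target word."""
    table = defaultdict(Counter)
    for src, tgt in pairs:
        for s in src.split():
            for t in tgt.split():
                table[s][t] += 1
    return table

def translate(word, table):
    """Return the most frequent co-occurring target word, or the word itself."""
    if word not in table:
        return word  # unseen words pass through unchanged
    return table[word].most_common(1)[0][0]

table = train(pairs)
print(translate("lorem", table))
```

With this corpus, "lorem" co-occurs with "china" more often than with anything else, so that is what comes out, even though no pair was ever a real translation.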

------
userbinator
Very interesting... and somewhat creepy, the phrases that are coming up. I can
confirm that "lorem" and "ipsum" don't work now, but playing around with other
pieces of lipsum still gives odd phrases like "suspendisse bibendum duis" ->
"suspend regional banking", "nostrud exercitation turpis fermentum" -> "Iraqis
saying through Arizona", and "Curabitur duis bibendum" -> "Nike's
restructuring".

An explanation I have is that the Chinese could be somehow using Google
Translate to "latinise" news stories in order to bypass censorship.

~~~
DominikR
To use a Google Service in China you have to bypass censorship, so there's
really no point in using Google Translate in the way you suggested.

------
haberman
Reminds me of this hilarious bug, where Translate would randomly add the
phrase "he now praises the iPad" to totally unrelated sentences:
[http://www.huffingtonpost.com/2013/01/05/google-bug-praise-t...](http://www.huffingtonpost.com/2013/01/05/google-bug-praise-the-ipad_n_2416474.html)

------
lvturner
A lot of the translations read like spam to me, with the mentions of
"commerce", "home business", "the company" etc. In Chinese marketing copy,
it's quite common to use "China" as part of the pitch: "China's first...",
"China's biggest..." etc.

So perhaps a less sinister explanation is Chinese spam?

------
ww520
Bad training data. Lorem Ipsum is the de facto placeholder text for so many
webpages.

~~~
yen223
That is correct. The interesting question is, why did it translate to 'China',
rather than something more banal?

~~~
blowski
It could just be a selection bias, in that we think it's interesting so it
makes the news. If it had translated to something more banal, we probably
wouldn't be discussing it on Hacker News.

------
xwintermutex
This was also mentioned in an article about the Defcon 22 contest, posted on
HN too:
[https://news.ycombinator.com/item?id=8189549](https://news.ycombinator.com/item?id=8189549).
Apparently the translation has stopped working now?

~~~
erbbysam
A short story on that as a Defcon 22 badge competitor: when we reached the
stage where we got the "Lorem ipsum" page, we first noticed that a bunch of
the lines did not follow the "Lorem ipsum" format exactly and had strange
capitalization. So we thought that the difference between the expected
"Lorem ipsum" text and this text was the clue... We eventually figured out
that if you pasted the entire block into Google Translate, something strange
would pop out (that was relevant to another hint -
[https://www.defcon.org/1057/SarangHae/](https://www.defcon.org/1057/SarangHae/)
and then was useful again with what that email address returned).

Looks like Google updated their Latin translator to completely break the
puzzle :)

page in question:
[https://www.defcon.org/1057/FissilingualElucidation/](https://www.defcon.org/1057/FissilingualElucidation/)

------
katewishing
It still works by translating from English to Latin. I found a bunch by
running a list of NSA keywords through it:
[https://i.imgur.com/UGMIPpE.png](https://i.imgur.com/UGMIPpE.png)

All of the resulting translations seem to be from a text generated by
lipsum.com.

~~~
katewishing
I tested with a conventional English word list for comparison. Here are the
Lorem hits: [http://pastebin.com/yh26U7iz](http://pastebin.com/yh26U7iz)

------
hygap
It would have been interesting to see if you tried this while logged in on a
Google account not belonging to a security researcher.

It's a remote chance, but maybe it brings up totally random results that are
then passed through your account's search bubble filter. Hence security-related
topics.

------
kps
At the time of a previous HN thread,
[https://news.ycombinator.com/item?id=5200728](https://news.ycombinator.com/item?id=5200728)
there were some amusing results from prefixes of the stock boilerplate, often
changing letter-by-letter:

    
    
      Lorem ipsu → Dummy Item 
      Lorem ipsum dolor → Welcome
      Lorem ipsum dolor s → The Pussycat Dolls
      Lorem ipsum dolor sit amet, c → This page is available
      Lorem ipsum dolor sit amet, consectetur adipisicing elit → This page is half the battle WIN!

------
fishnchips
I guess it's just a Rorschach test for the Internet.

------
sekasi
What I've heard through the grapevine is that it's used as a method to
defeat internet censorship in countries that are subject to it.

Not sure if that's true, but just passing that on.

