Google Translate uses statistical machine translation [1] seeded from a gigantic automatically curated parallel corpus of similar documents.
As "lorem ipsum" is a typographic placeholder, the filled-in versions appear to have the same document structure (HTML) and would therefore be statistically likely candidates as translatable pairs.
It's free verse, from the living, beating heart of the Internet. All those support tickets have developed a special kind of pathos:
How long before any meaningful development.
Until mandatory functional requirements to developers.
But across the country in the spotlight in the notebook.
The show was shot.
This is not the traditional text of Lorem Ipsum, save for the first sentence. The actual translation is far less exciting. The only thing I noticed is that it translates something as "train", while I don't think a word for that exists in Latin.
Actual Lorem Ipsum [0]:
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
IIRC Google Translate doesn't work as a straight "word swap" like other translation tools perhaps do. Google Translate works in a similar way to their search engine, in that common phrases and words on popular sites are compared and probable meanings are inferred. This is why the names of celebrities sometimes get translated as the names of their peers[1] and why the pseudo-Latin in Lorem Ipsum translates.
It doesn't. By definition, translation should not add meaning, just preserve the original. It is not possible to turn an arbitrary text into poetry by any mechanical means.
A poem by Ginsberg or any other good poet is the densest medium of communication. The message may be a bit ambiguous, but the mean rate of reception is still high: You tend to love it or hate it.
I was going to jump in and suggest that Google's statistical translation methods were being thrown off by lipsum being used in so many strange contexts, but it turns out "Lorem Ipsum" is a mangled, nonsensical version of the original text:
Oddly, "Vestibulum ante ipsum primis" translates to "Cisco Security", although translating each word separately gives "Manufacturing before football first". I don't remember much from my year of Latin, but if each word has a different meaning depending on what other word it's combined with, the possibilities are endless.
Because of Google's approach based strictly on statistical regularities, words can completely change translations based on context even in languages where that wouldn't normally happen, because the contexts can swing the estimates.
One funny one comes with city names, where Google sometimes mistakenly "translates" a city to a different city that happens to have frequent usage in the target language, in contexts that it must find analogous.
For example, here are some translations involving the Danish city Billund (location of Lego), which change even based on punctuation:
Billund -> Billund
Jeg er i Billund -> I am in Billund
Jeg er i Billund. -> I'm in London.
For whatever reason, intriguing place-name translations are particularly common in the Danish->English case. Brøndby is often Red Sox, Odense is Kentucky, and Hillerød is sometimes Whatfield.
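A toy sketch of how punctuation alone can flip a result like the Billund one above. It assumes a made-up phrase table with invented counts and a greedy longest-match lookup; the table, the counts, and the function names are all hypothetical, and nothing here reflects Google's actual data or system. The point is only that when the longest phrase seen in the corpus changes (here, by adding a period), a different and possibly skewed set of counts decides the output.

```python
# Hypothetical phrase table: source phrase -> {candidate translation: count}.
# Every count is invented for illustration.
PHRASE_TABLE = {
    ("billund",): {"Billund": 9},
    ("jeg", "er", "i"): {"I am in": 40, "I'm in": 35},
    ("jeg", "er", "i", "billund"): {"I am in Billund": 3},
    ("jeg", "er", "i", "billund", "."): {"I'm in London.": 5, "I am in Billund.": 2},
}


def tokenize(text):
    """Lowercase the text and split a trailing period into its own token."""
    tokens = []
    for word in text.strip().lower().split():
        if word.endswith(".") and len(word) > 1:
            tokens.extend([word[:-1], "."])
        else:
            tokens.append(word)
    return tokens


def translate(text):
    """Greedy longest-match lookup; picks the highest-count candidate."""
    tokens = tokenize(text)
    out = []
    i = 0
    while i < len(tokens):
        for j in range(len(tokens), i, -1):
            key = tuple(tokens[i:j])
            if key in PHRASE_TABLE:
                candidates = PHRASE_TABLE[key]
                out.append(max(candidates, key=candidates.get))
                i = j
                break
        else:
            # Unknown token: copy it through unchanged.
            out.append(tokens[i])
            i += 1
    return " ".join(out)


for sentence in ["Billund", "Jeg er i Billund", "Jeg er i Billund."]:
    print(sentence, "->", translate(sentence))
```

With these invented counts, the sentence with the period matches a different table entry than the one without it, so the city name changes even though the words are identical.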
Now I'm imagining an automated translation of a story from, say, the Star Trek universe to the Star Wars universe, substituting place and character names based on frequency of use in each universe.
Bad idea. Star Trek stories are horrid quality compared to Star Wars.[1] Do Star Wars to Star Trek. ;)
1: Based on my own experience. I've read at least 50% of the available Star Trek books and 75% of Star Wars. I've got maybe four Star Trek books I've liked and dozens of Star Wars ones.
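For what it's worth, the universe-to-universe substitution idea above could be sketched roughly like this, assuming you already have a token list and a set of known names for each universe. The tiny corpora, the name sets, and the helper names are all made up for illustration: the most frequent name in one corpus is simply mapped to the most frequent name in the other, and so on down the ranking.

```python
from collections import Counter


def rank_by_frequency(tokens, names):
    """Return the known names sorted from most to least frequent."""
    counts = Counter(t for t in tokens if t in names)
    return [name for name, _ in counts.most_common()]


def build_mapping(src_tokens, src_names, dst_tokens, dst_names):
    """Pair names of equal frequency rank across the two corpora."""
    src_ranked = rank_by_frequency(src_tokens, src_names)
    dst_ranked = rank_by_frequency(dst_tokens, dst_names)
    return dict(zip(src_ranked, dst_ranked))


def substitute(tokens, mapping):
    """Swap mapped names; leave every other token untouched."""
    return [mapping.get(t, t) for t in tokens]


# Invented stand-ins for real corpora.
trek = "Kirk Spock Kirk Enterprise Kirk Spock".split()
wars = "Luke Han Luke Falcon Luke Han".split()

mapping = build_mapping(
    trek, {"Kirk", "Spock", "Enterprise"},
    wars, {"Luke", "Han", "Falcon"},
)
print(" ".join(substitute("Spock boards the Enterprise".split(), mapping)))
```

Matching on rank alone ignores context entirely, so a real attempt would presumably need something closer to the contextual statistics discussed upthread.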
It's not an easter egg, at least not in the sense that it's special-cased to that one chunk of text. It worked with some arbitrary lorem ipsum I generated from lipsum.com. I think this one might actually be funnier: http://translate.google.com/#auto/en/Aliquam%20viverra%20mat....
This sort of incoherence really isn't unexpected, considering that Lorem Ipsum is not a piece of coherent text, but rather a series of sentence fragments and even fragments of words.
I commented [1] in a previous version of this [2] that you get amusing character-by-character changes typing into the text box by hand. The results are slightly different today.
I think the corpus of Latin/English translations is not large enough, because the translation of even the basest schoolboy Latin seems mangled; different declensions of the same word get different translations. 'Ancilla' [female slave, I was taught] is translated as maid, handmaiden, women, and ancillary, depending on declension?
Oh, I think I can shed some light on this! Never studied Latin but I sing a lot of church music. The very popular "Magnificat" text in Latin includes the line:
quia respexit humilitatem ancillae suae.
This is traditionally translated as:
Because he hath regarded the humility of his handmaid;
Or:
For he hath regarded the lowliness of his handmaiden.
Surely there has been some manipulation of the search results for (this non-standard version of) Lorem Ipsum to get those results from Google Translate. A more standard Lorem Ipsum text comes up with a pretty standard translation (it actually seems to revert to the original Ciceronian text which it is based on).
It is nonsense: the text was altered, with words modified, added, and removed, which makes it nonsense. This is why the Latin translator doesn't translate it. It is based on a Cicero text, though.
It's not completely nonsense, but it definitely doesn't have any coherent meaning. It started as an excerpt from Cicero but it was mangled by removing words and letters without regard for sentence structure or even for preserving words. Some of the words aren't even real Latin words.
"We will be sure to post a comment." I thought that was a sign that this was an intentionally unserious translation that an engineer snuck in as an Easter Egg. Though it's a bit more absurdist than Google's usual brand of humor.
I'm fairly certain Google already knows almost every password considering most are rubbish and Google Books has been around in one form or another since at least 2004.
Indeed! Cisco Security saved the ancient Republic from the Catiline rebellion, delivering famous orations. "Quo usque tandem abutere, Catilina, patientia nostra? Quam diu etiam furor iste tuus nos eludet? Quem ad finem sese effrenata iactabit audacia?"
[1] http://www.youtube.com/watch?v=y_PzPDRPwlA