Lorem Ipsum translated by Google (translate.google.ca)
361 points by lemieux 1395 days ago | 73 comments

Google Translate uses statistical machine translation [1] seeded from a gigantic automatically curated parallel corpus of similar documents.

As "lorem ipsum" is a typographic placeholder, the filled-in version appears to have the same document structure (HTML) as the placeholder version, so the two would be statistically likely candidates for a translatable pair.

[1] http://www.youtube.com/watch?v=y_PzPDRPwlA

It's obviously been crawling support tickets/emails with people complaining about non-forthcoming copy:

We've all been there in one form or another. Those of us that do client work anyway.

It's free verse, from the living, beating heart of the Internet. All those support tickets have developed a special kind of pathos:

    How long before any meaningful development.
    Until mandatory functional requirements to developers.
    But across the country in the spotlight in the notebook.
    The show was shot.

This is not the traditional text of Lorem Ipsum, save for the first sentence. The actual translation is far less exciting. The only thing I noticed is that it translates something as "train", while I don't think a word for that exists in Latin.

Actual Lorem Ipsum [0]:

  Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
Actual translation in Google Translate: http://bit.ly/127UkCu

[0]: http://en.wikipedia.org/wiki/Lorem_ipsum

For some reason, "aute irure" is translated as "bullet train"; however, neither of them is a Latin word.

IIRC Google Translate doesn't work as a straight "word swap" like other translation tools perhaps do. Google Translate works in a similar way to their search engine, in that common phrases and works on popular sites are compared and probable meanings are inferred. This is why the names of celebrities sometimes get translated as the names of their peers[1] and why the pseudo-Latin in Lorem Ipsum translates.

[1] http://answers.yahoo.com/question/index?qid=20100905144430AA... (I know it's a terrible source, but it's the only link I could find)

And "cillum dolore eu fugiat nulla pariatur." is "that produces no resultant online applications."


I never knew Google had a "Translate to Allen Ginsberg" option.

It doesn't. By definition, translation should not add meaning, just preserve the original. It is not possible to turn an arbitrary text into poetry by any mechanical means.

A poem by Ginsberg or any other good poet is the densest medium of communication. The message may be a bit ambiguous, but the mean rate of reception is still high: You tend to love it or hate it.

I can't tell if you're joking. If you're being serious, though, I think you took my comment too literally. :)

I was going to jump in and suggest that Google's statistical translation methods were being thrown off by lipsum being used in so many strange contexts, but it turns out "Lorem Ipsum" is a mangled nonsensical version of the original text:


Thank you. 'Lorem Ipsum' isn't Latin. Translation tools, dictionaries, or your old high school text books are not going to help you 'get it right.'

This is probably an Easter Egg.

The original is translated much better.

Oddly, "Vestibulum ante ipsum primis" translates to "Cisco Security", although taking each word separately translates to "Manufacturing before football first". I don't remember much about my Latin year, but if each word has a different meaning depending upon what other word it's combined with, the possibilities are endless.

Because of Google's approach based strictly on statistical regularities, words can completely change translations based on context even in languages where that wouldn't normally happen, because the contexts can swing the estimates.

One funny one comes with city names, where Google sometimes mistakenly "translates" a city to a different city that happens to have frequent usage in the target language, in contexts that it must find analogous.

For example, here are some translations involving the Danish city Billund (location of Lego), which change even based on punctuation:

   Billund -> Billund
   Jeg er i Billund -> I am in Billund
   Jeg er i Billund. -> I'm in London.
For whatever reason, intriguing place-name translations are particularly common in the Danish->English case. Brøndby is often Red Sox, Odense is Kentucky, and Hillerød is sometimes Whatfield.
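
To make the "contexts can swing the estimates" point concrete, here is a toy sketch of phrase-table lookup. It is not how Google Translate is actually implemented; the table `PHRASE_COUNTS`, its counts, and the `translate` helper are all invented for illustration. It only assumes that a statistical system prefers the longest source phrase it has seen and picks that phrase's most frequent target.

```python
from collections import Counter

# Hypothetical phrase table standing in for a crawled parallel corpus.
# All counts are invented for illustration only.
PHRASE_COUNTS = {
    "billund": Counter({"Billund": 9}),
    "jeg er i": Counter({"I am in": 7, "I'm in": 5}),
    # A longer phrase (including the full stop) that the toy corpus
    # happened to align with a sentence about London.
    "jeg er i billund .": Counter({"I'm in London .": 3, "I am in Billund .": 1}),
}


def translate(tokens):
    """Greedy longest-match decoding over the toy phrase table."""
    out = []
    i = 0
    while i < len(tokens):
        # Try the longest source phrase starting at position i first.
        for j in range(len(tokens), i, -1):
            phrase = " ".join(tokens[i:j]).lower()
            if phrase in PHRASE_COUNTS:
                # Pick the most frequently observed target phrase.
                best, _count = PHRASE_COUNTS[phrase].most_common(1)[0]
                out.append(best)
                i = j
                break
        else:
            # Unknown token: pass it through unchanged.
            out.append(tokens[i])
            i += 1
    return " ".join(out)


print(translate("Jeg er i Billund".split()))
print(translate("Jeg er i Billund .".split()))
```

With these made-up counts, adding the full stop makes a longer phrase match, and that phrase's most common target mentions a different city. A real system weighs far more evidence than this, but the failure mode has the same shape.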

My favorite one was the word "Amistad!" translated Spanish->English:

    Amistad! -> Friendship!
You could add more exclamation points, and they'd show up on the other side:

    Amistad!!! -> Friendship!!!
But when you reached five, you apparently hit some sort of context changeover, because:

    Amistad!!!!! -> Murder!
Sadly, it has since been fixed.

If someone kept on yelling "Friendship!" louder and louder, it might lead to murder...

Now I'm imagining an automated translation of a story from, say, the Star Trek universe to the Star Wars universe, substituting place and character names based on frequency of use in each universe.

Bad idea. Star Trek stories are horrid quality compared to Star Wars.[1] Do Star Wars to Star Trek. ;)

1: Based on my own experience. I've read at least 50% of the available Star Trek books and 75% of Star Wars. I've got maybe four Star Trek books I've liked and dozens of Star Wars ones.

Maybe he was talking about the canon (the TV episodes) of Star Trek and the movies of Star Wars?

I believe you are looking for this: http://tattuinardoelasaga.wordpress.com/

No doubt would be confused by "peru is the Spanish for turkey" as well...

Nitpick: peru is the Portuguese for turkey. In Spanish it is "pavo".

St. Martin St. Dr. Martin Luther King Dr. Red & black; Black & Decker. Blue moon, blue Nile, blue sky?

It's not an easter egg, at least not in the sense that it's special-cased to that one chunk of text. It worked with some arbitrary lorem ipsum I generated from lipsum.com . I think this one might actually be funnier: http://translate.google.com/#auto/en/Aliquam%20viverra%20mat....

    Funny lion always feasible, innovative policies hatred assured.
Seems like commentary on the fall of ancient reddit.

This sort of incoherence really isn't unexpected, considering that Lorem Ipsum is not a piece of coherent text, but rather a series of sentence fragments and even fragments of words.


The "Alpha" mouseover notes of the Latin translator

   "This language is still in early stages of development..."  
Really? I thought it had been around for a while... :)

I commented [1] in a previous version of this [2] that you get amusing character-by-character changes typing into the text box by hand. The results are slightly different today.

[1] https://news.ycombinator.com/item?id=5201472

[2] https://news.ycombinator.com/item?id=5200728

I can't decide if this is a result of Google's use of statistical machine translation, or an Easter Egg.

Probably an Easter Egg, given that it's been in use since a time prior to common use of "online"

If Google takes its statistical corpus from online, then I think the SMT still makes sense.

Oh God I really hope people start using this in place of Lorem Ipsum.

I think the corpus of Latin/English translations is not large enough, because the translation of even the basest schoolboy Latin seems mangled; different declensions of the same word get different translations. 'Ancilla' [female slave, I was taught] is translated: maid, handmaiden, women, and ancillary, depending on declension?

Oh, I think I can shed some light on this! Never studied Latin but I sing a lot of church music. The very popular "Magnificat" text in Latin includes the line:

  quia respexit humilitatem ancillae suae.
This is traditionally translated as:

  Because he hath regarded the humility of his handmaid;

  For he hath regarded the lowliness of his handmaiden.
See: http://en.wikipedia.org/wiki/Magnificat . In context, Mary is definitely not talking about being a slave, but a willing servant.

Innovative policies, hatred assured.

I really wish this would become a meme.

Surely there has been some manipulating of the search results for (this non-standard version of) Lorem Ipsum to get those results from Google Translate. A more standard Lorem Ipsum text comes up with a pretty standard translation (it actually seems to revert to the original Ciceronian text which it is based on).

It was generated on this site : http://www.lipsum.com/

Lorem ipsum is frequently misrepresented as nonsense text. It's not actually the case: http://www.straightdope.com/columns/read/2290/what-does-the-...

It is nonsense - the text was altered, with modified, added and removed words that make it nonsense. This is why the Latin translator doesn't translate it. It is based on a Cicero text, though.

It's not completely nonsense, but it definitely doesn't have any coherent meaning. It started as an excerpt from Cicero but it was mangled by removing words and letters without regard for sentence structure or even for preserving words. Some of the words aren't even real Latin words.

I wonder if this is how they came up with the dialog for the Hybrids in Battlestar Galactica.

"We will be sure to post a comment." I thought that was a sign that this was an intentionally unserious translation that an engineer snuck in as an Easter Egg. Though it's a bit more absurdist than Google's usual brand of humor.

This is a wonderful "correct horse battery staple" password generator !!

It will have poor entropy, appearances to the contrary. Don't use this method.
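
A rough way to see why: for a passphrase of words drawn uniformly and independently from a list, the entropy is the word count times log2 of the list size. The sketch below just computes that. The 2048-word list is the figure from the xkcd comic; the 64-word "effective vocabulary" for translated gibberish is purely an assumption for illustration, not a measurement of Google Translate's output.

```python
import math


def passphrase_entropy_bits(vocabulary_size: int, word_count: int) -> float:
    """Entropy in bits of word_count words drawn uniformly and
    independently from a vocabulary of vocabulary_size words."""
    if vocabulary_size < 1 or word_count < 0:
        raise ValueError("vocabulary_size must be >= 1 and word_count >= 0")
    return word_count * math.log2(vocabulary_size)


# Four words from a 2048-word list, as in the xkcd comic.
print(passphrase_entropy_bits(2048, 4))  # 44.0

# Four words from a hypothetical 64-word effective vocabulary.
print(passphrase_entropy_bits(64, 4))  # 24.0
```

If the translator's output leans on a small set of frequent words and phrases, the result looks random but sits much closer to the second number than the first, and an attacker who knows the method only has to search that smaller space.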

not anymore...

I don't mean literally with Lorem Ipsum - you can use Google Translate to transform any piece of text into gibberish. When in doubt, translate back.

no no, I know. I'm only teasing :)

Still, probably best not to feed every password to Google.

I'm fairly certain Google already knows almost every password considering most are rubbish and Google Books has been around in one form or another since at least 2004.

Also : https://www.google.com/search?q=password+list

In case you're interested in learning more, or just having a quick lorem ipsum generator: http://lipsum.com

and in combination with nknighthb's comment you get this: http://www.rikeripsum.com/

Thank you for this. Not only is it thoroughly entertaining, but the source code is fairly helpful. :)

Or go for "An all-around better Lorem Ipsum experience": http://lorem2.com/

It was generated with that site.

also, this is much more useful than you'd think:


I'm obligated to chime in: http://airbr.us/h/tipsum

This almost sent me furiously scrambling for my high school Latin text book. Almost...

The best part, "Information that is no corporate Japan."

I was about to post that exact same thing. "Information that is no corporate Japan". You can definitely say that again.

I read this like spoken word poetry. It was quite amusing.

Also, please note that "lorem" is translated as "China", and "ipsum" is translated as "football". At least for me.

It's just nonsense. The actual translation of "football" will be obvious once you see it:


Pediludium, literally "foot-game". And ipsum means "itself":


Google really needed to seed their Latin translator with a basic dictionary before letting it pick up crap on the internet.

The English translation is way better. :P

I wonder why Cisco Security is in there?

Cisco Security has been vital since ancient times, obviously there would be Latin for it.

Indeed! Cisco Security saved the ancient Republic from the Catiline rebellion, delivering famous orations. "Quo usque tandem abutere, Catilina, patientia nostra? Quam diu etiam furor iste tuus nos eludet? Quem ad finem sese effrenata iactabit audacia?"

No, wait, that was Cicero. My bad.

