I had to put the app to the test.
Printed some simple English words and tried it out.
What the App shows in REALTIME:
Comparison with Google Translate:
[EDIT: Note that translating 'Muerto Fin' to English in Google Translate does result in 'Dead End'. Can any Spanish reader clarify why Google Translate originally chose to translate 'DEAD END' to 'callejón sin sali'?]
[EDIT: So let me get this straight. Their program, running on iPhone [256MB RAM, 600 MHz ARM CPU], can take a live image, perform OCR, translate, create another image with the translated words. And all of this happens real time? Wow.]
I think it's really hard to get a 'Wow' reaction from the HN crowd. And you have hit a home run! As someone else said in this thread, you are going to make a boatload of money. Congrats!
First: This app is amazing!
But I think that this app is doing an almost word-by-word translation, without a list of common expressions or grammatical analysis.
For example, the correct translation of 'Dead End' is 'callejón sin salida', which literally means something like 'street without exit'. Translating 'Dead End' word by word, you get 'Muerto Fin', which is unintelligible.
Another example from the video is the translation of
'Lengua boliviana con una salsa picante de anchoas'.
Word by word it comes out as
'Tongue Bolivian with a sauce spicy of anchovies',
where a natural translation would be
'Bolivian Tongue with a spicy sauce of anchovies'.
Real-time OCR + word-for-word translation... maybe it's not particularly technically interesting on paper, but to watch it on a handheld device in front of your eyes is breathtaking.
Do you have a timeline for other languages? (Japanese)
But I think that the real problem is idioms and phrases, like "dead end", that have a completely different translation.
On the other hand, I think that you have more processing time than 1/10 sec. Most of the time the user will point the app at the same text for 10 or 15 seconds. I think that it is possible to show a very fast translation almost instantly and then, a few seconds later, show an improved version.
Still, amazing technology, especially since it's all client-side. Hopefully they'll have some contextual analysis in the future or enable Google Translation API use.
For example, a few years ago, some of my wife's students gave her a homework assignment about clocks and gears. When she read it, she was annoyed because the writing was incredibly bad. But later we realized that the students hadn't written it; the "homework" was a web page translated with Google Translate.
Another time, I needed an example of the differences between JPEG and JPEG2K for an internal talk. I found a photograph with a zoom of Lena's eye with a caption like "JPEG2K 1% vs JPG 1%", but the webpage was in Japanese, and I couldn't read it. So I used Google Translate to be sure that it was a comparison of the two methods at the same compression level.
So Google Translate is very good for getting an idea of what a web page means, but it is not good enough to produce a final version of a translation.
This app has some additional difficulties that GT doesn't need to solve: they need to OCR text in the wild, they have less computational power, and they have to do it in real time. It is almost incredible that they can solve these things. With a word-for-word translation, I think that you can get a good enough translation 90%, 95% or even 99% of the time, but the corner cases can be really unintelligible. The translation of "Tongue Bolivian ..." is fine, and the user can understand what it means, so it is useful. The translation of "Dead End" is something that they should improve in the next version.
I once saw this forum post that was in German, that Google translated "A: Nyet." to "A: Yes."
Since it was a question and answer post regarding information for an upcoming game, there was quite a bit of fuss over it before this translation error was discovered.
I cannot understand what heuristic would translate the complete German sentence "Nein." into "Yes.".
About the word-for-word thing: it is mostly correct to say it's word-for-word. We have just a few short phrases in there, like "por favor". If that didn't translate correctly, it would have been teh lame.
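To make the "mostly word-for-word, plus a few short phrases" idea concrete, here is a minimal sketch of a phrase-first lookup that falls back to single words. The toy dictionaries and the `translate` function are hypothetical illustrations, not Word Lens's actual data or code.

```python
# Hypothetical toy data (English -> Spanish); not the app's real dictionary.
WORDS = {"dead": "muerto", "end": "fin"}
PHRASES = {("dead", "end"): "callejón sin salida"}
MAX_PHRASE_LEN = max(len(key) for key in PHRASES)


def translate(text, phrases=PHRASES):
    """Translate phrase-first, falling back to word-for-word lookup."""
    tokens = text.lower().split()
    out = []
    i = 0
    while i < len(tokens):
        # Try the longest known phrase starting at position i first.
        for n in range(min(MAX_PHRASE_LEN, len(tokens) - i), 0, -1):
            key = tuple(tokens[i:i + n])
            if key in phrases:
                out.append(phrases[key])
                i += n
                break
        else:
            # No phrase matched: translate the single word,
            # passing unknown words through unchanged.
            out.append(WORDS.get(tokens[i], tokens[i]))
            i += 1
    return " ".join(out)


print(translate("Dead End"))              # uses the phrase entry
print(translate("Dead End", phrases={}))  # pure word-for-word
```

With the phrase table in place "Dead End" comes out as the idiom; with an empty phrase table the same input degrades to the word-for-word "muerto fin" discussed in this thread.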
Oh please, I mean, it's only one of a kind app that millions could make use of :)
Do you have an API or something? I would love to work on a Tamil to English dictionary.
Can you please email me? My email is in my profile.
Are you hiring? ;) I am a native Spanish speaker btw.
Frickin awesome app tho'
On the other hand, "muerto fin" has NO meaning in Spanish (and that's what you get when you translate word for word without context).
[Search Term] -> [Interpretation]
New York -> New York city
New York Times -> The newspaper
New York Times Square -> Famous tourist spot in NYC.
There were other examples in the article. I will try to find that article and link it up.
I think Google should buy these guys and provide them with their knowledge of 'context'. Google Translate team and these guys should talk right away.
The best examples of this, I thought, were the following passages:
Yesterday I went to the symphony. The sound was beautiful.
Yesterday I went to Long Island. The sound was beautiful.
In the first passage, sound refers to an auditory experience; in the second, it refers to a body of water. Within the sentence "The sound was beautiful", there's no way to know which to use, so one must look to surrounding information to make a guess.
But I've given a short and clear example; anyone can see the right solution. But in real-life usage it's frequently unclear.
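As a rough illustration of "looking to surrounding information", here is a minimal sketch that guesses a sense by counting cue words in the nearby text. The sense names and cue-word lists are made up for this example; real systems use far richer context, and this is not how any particular translator works.

```python
# Hypothetical cue words for each sense of an ambiguous word.
SENSES = {
    "sound": {
        "auditory experience": {"symphony", "music", "concert", "orchestra"},
        "body of water": {"island", "ferry", "coast", "harbor"},
    }
}


def guess_sense(word, context):
    """Pick the sense whose cue words overlap most with the context."""
    context_words = {w.strip(".,!?").lower() for w in context.split()}
    scores = {
        sense: len(cues & context_words)
        for sense, cues in SENSES[word].items()
    }
    best = max(scores, key=scores.get)
    # With no cue words in the context there is nothing to base a guess on.
    return best if scores[best] > 0 else None


print(guess_sense("sound", "Yesterday I went to the symphony. The sound was beautiful."))
print(guess_sense("sound", "Yesterday I went to Long Island. The sound was beautiful."))
print(guess_sense("sound", "The sound was beautiful."))
```

The third call returns `None`, which mirrors the point above: the sentence alone does not carry enough information to choose.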
Yep... solving "context" 100% is equivalent to solving A.I.!
I like how "Make right turn" becomes "Make correct turn".
i hope this app will be nice and disruptive, and we'll be looking carefully at what people expect and how they actually use it. it's a platform, and we're really excited about the directions it opens up.
thanks for the link, cheers!
Or anyone for that matter. Make a business, be king of the world. You deserve it.
With such a product, you should perhaps brand yourself more on your product than your company name. :)
Just throwing it out there.
In this case someone has already registered wordlensapp.com. We can only hope it's the Quest Visual people.
There are many trends for using domains similar to what you would desire: get<product> being the best, and <product>app being one of the other choices.
I have had very bad experiences finding an app's site if it follows the <product>app pattern. Maybe it's just bad SEO on their part, but I imagine that it'd be an awful hassle for any normal user who doesn't know the conventions.
Then it struck me that use<product> would be a very interesting domain that makes much more sense than <product>app. It also mirrors the imperative get<product>, although it may not be as popular and known to users - yet.
I think it's a shame they didn't secure a regular <product>.com domain for such a great product - but, on the other hand, I'm sure all the press will forgive them (hell, they're trending on Twitter, and 80% of my social media digest today was about the bloody thing).
I've registered usewordlens.com, usewordlens.net, and usewordlens.org and will be happy to hand it over or point it to a domain of their choice (if someone tells me how the hell to do it using name.com, because I'm a complete idiot in that regard. I guess I have to mess with some DNS).
Different registrar, different DNS and the obvious spam landing page. sigh
otaviogood or johndeweese, if you're reading this, my email is in my profile.
Agree I can't read it, pretty simple request.
Aside from that, and as a Spanish native speaker, the translated text sounds really weird to me. Even without knowing English I would be able to understand the translated text (at least the example that you gave) but it can be improved. But this is the only con that I see to your application. Probably you're already working on that, but I thought that you might like this input.
The first card says "welcome to the future" which I thought was perfect :)
This could just be a case of needing to optimize and train their recognizer to deal with a much larger set of possible characters, or it could require implementing kanji-specific OCR techniques like attempting to decompose the characters into their constituent strokes, and recognize based on classification of those strokes (orientation, position, direction).
Of course, guessing the correct word when performing word-for-word translation of hanzi is almost impossible, so even the extremely primitive product I'm thinking of is very difficult.
On the other hand, an English to Chinese translator might be more doable, and might also be more commercially profitable.
An app that converted from characters to pinyin (a much easier problem) would be gold to someone trying to learn Mandarin. I'd easily pay $50 for something with that functionality without it even making an attempt at the English. It's a much smaller market than those trying to understand Chinese signs on a vacation, but it's one with a much larger stake and interest in the result.
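A minimal sketch of the character-to-pinyin idea, assuming the characters have already been recognized. The four-entry table is a hypothetical stand-in for a real dictionary, and it ignores characters with more than one reading, which a real tool would have to handle.

```python
# Hypothetical toy table; a real one would cover thousands of characters.
PINYIN = {"你": "nǐ", "好": "hǎo", "出": "chū", "口": "kǒu"}


def to_pinyin(text):
    """Map each character to its pinyin, leaving unknown characters as-is."""
    return " ".join(PINYIN.get(ch, ch) for ch in text)


print(to_pinyin("出口"))  # the sign for "exit"
print(to_pinyin("你好"))
```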
Here's a video demo: http://www.youtube.com/watch?v=x7VTo0656Rc
I downloaded the app, bought the Spanish -> English pack, and tried it with some simple phrases in a big TextEdit window on my monitor. It flickered a bit, but I expected that from a monitor. It got them right.
Big deal, common phrases are easy -- I could just buy a phrasebook. Then I tried a random phrase from the Spanish version of "Dive into Python":
Una función, como cualquier otra cosa en Python, es un objeto.
A function, as any other thing in Python, is a object.
As soon as a French -> English pack is released I'll buy it, even if it's $100.
This is the kind of thing that would make it possible for me to move to Montreal. I love that city, but don't know French. I could learn, but it would take time and I'd be lost like a baby gazelle on the Serengeti while I learned.
This app could ease the process of moving to an entirely different country. That's amazing.
Hello, future, it's nice to see you.
It is easy to get the hang of signage too; lots of French words are spelt almost identically in English, and the language is much easier to read than it is to listen to. You should get the hang of it quickly.
On the other hand, if you have to get any kind of job that involves speaking with customers ... you are shit out of luck.
Raise the price. For the love of god. Raise the price. This is much, much more valuable than five bucks. There have been times in life where I'd gladly have paid a dollar per minute for this. And I'm a cheapskate.
Absolutely, read http://bit.ly/bdHKmm to understand why. (Sorry for the shortened URL, the URL recognition software here doesn't understand URLs with apostrophes in here, so it actually is necessary.)
Something like this should not be $5.
We have a lot more Vietnamese refugees here than we do Chinese folks. If you'll notice, the three main languages you can get government papers in (at least over in San Jose, no experience elsewhere) are English, Spanish, and Vietnamese.
Incidentally, there's other and easier untapped startup opportunities for the Asian language demographic, if anyone wants to send me a note via gmail
My in-laws are a mix so either language is good, both is better. I can promise 3 Android and 1 iPhone sale just in my home :)
The dictionary is free and the OCR is available as an in app purchase for $15.
I'm just saying this because it is not trivial to adapt an OCR engine to non-latin scripts, because the image analysis techniques are rather different.
I understand that Chinese is harder, for many reasons. It is also much harder for humans. But that is why I'd happily pay, say, $10 per day of my trip for an engine that gives reasonable hints. Maybe more. Try me and see.
Of course, once this technology is ubiquitous and the price of such a thing has fallen to nigh-zero it is going to change the world. But we have to walk before we run, and you'll need the money for R&D and legal costs.
I immediately sent the link to a dozen people, about half wrote back "How much is it?" and "How do I download it?"
Since QuestVisual.com is clearly going to sweep the entirety of the internets by this time next week, here are a few friendly conversion suggestions:
a) Change your button to "Try Word Lens FREE - Click Here"
b) The logo under your button looks like another button, just get rid of it.
Dropped you an email if you'd like the Conversion Voodoo team to bang out a new page tomorrow AM gratis that's ready for installation by noon PST :).
The signs (in Spanish) have grammar mistakes, but they are automatically translated to correct English sentences. I have a few examples:
- The third sign says "Lo traduce el texto", but that sentence doesn't make sense. It should be "Traduce el texto" or even "Se traduce el texto".
- 'Ropas opcional' sounds strange to me, although it may be accepted in some countries. It should be something like 'Ropas opcionales'.
I tried both examples in Google, Bing and Babel Fish and the results were OK, so I don't think that Word Lens translator is very accurate.
"Mr. Babbage, if you put into the machine wrong figures, will
the right answers come out?" ... "I am not able rightly to
apprehend the kind of confusion of ideas that could provoke
such a question."
The offers will come, that much is certain; I can only hope they decide to go for it and turn this into something great.
This should also be viewed as a prime example that it is not always better to push complex processing into the cloud. This would not be possible if every frame had to make a round trip to a server somewhere in real time.
Kudos to this team.
You would ideally want the OCR and text replacement together with Google’s translation algorithms, which, besides requiring a network connection at the moment, would also introduce way too much lag that would kill the experience.
(Word-for-word translation is a perfect compromise in the meantime.)
While I agree this is a killer app for the iOS platform, it doesn't yet have the polish that Jobs would demand for a live demo. A good deal of that is down to the hardware constraints of the system.
I think if you had invented a dedicated device that did nothing but run this app it would sell like hotcakes, and we've heard in this forum that people are considering buying iPod touches just to give this app to family.
Thus if you wanted to move the needle from 'wow' to 'insanely great' it would take hardware-level engineering to optimize the device, which I don't think is outside the realm of possibility.
Remember it was killer apps like Quake, Unreal, and Crysis that moved the graphics card industry for years. This app has the potential for even broader appeal, and its successors will require even more horsepower.
Oh god, I hadn't thought of that. I just naturally assumed this would be coming to Android soon.
Screw Zuckerberg, the developers of this app should be on the cover of Time magazine.
Seeing this thing in action in real life is life changing.
I was using a 4th gen iPod Touch, so lower-res camera and no flash.
Unfortunately, I was not able to get it to work as well as I'd hoped. I tested physical copies of a few things:
1. Mac OS X Snow Leopard cd case. Black sans-serif text on white background. This worked the best, with the word-reversal consistently getting "Snow" and "Leopard" reversed with little flicker. "Mac" seemed to go in and out. OS was rarely replaced, and usually along with "Mac" as though it were part of the same word.
2. C++ Programming language book cover (http://pixhost.info/pictures/631454). Dark blue serif text on white. This almost never worked. When it did recognize letters, the recognition shifted so much that the word was a constantly moving jumble.
3. Throat Coat tea bag. White serif text on Red. At any point in time, about 50% of the words were recognized and reversed.
You can take a snapshot, and each word it recognizes is highlighted, which is pretty cool.
I would definitely buy something like this that handled non-romance languages to English.
I believe the iPhone 3GS/4's autofocus makes a huge difference in clarity for close-ups.
Having said that, I'm always glad to see interesting translation being developed and getting such a positive reaction on HN. Congratulations to Word Lens on launching!
For instance, no Spanish speaker would EVER say "LO TRADUCE EL TEXTO" (@ 0:20), or "ROPAS OPCIONAL" (@ 0:50). They just picked some words in Spanish that made sense when translated, but my guess is that the average translation would be much worse.
Unless it translates `apple` as `chair` - I'll understand what the meaning is.
I just went to the local bar with my roommate and, despite it being pretty busy, I told one of the bartenders "Hey, want to see something cool?" and pointed my phone at their menu. Now, almost immediately after I said this, I realized it was kind of rude (they were busy), but she looked anyway...then she proceeded to yell at all the other bartenders to come over and check it out...and the guy sitting next to me spent about a minute going over the menu (translating from English to Spanish), wowed as well.
You're not going to get that with Angry Birds.
And I’m sure that it will also, by effectively removing all barriers to communication between different cultures and races, cause more and bloodier wars than anything else in the history of creation.
See what you've done?!
When I was a kid, I would have bet all the money I would ever earn that we'd have flying cars before the ability to do something like this with a phone. Amazing. Absolutely amazing.
I am spellbound.
EDIT: People with limited vision are often considered blind, so they could use this app.
Blind people are patient by necessity, I'm sure there are many who wouldn't mind waving around their phone scanning for text.
That doesn't rule out that text to speech with translation wouldn't be beneficial - just not in this context imho.
//Actually -- I'm not sure how limited the text-to-speech stuff is in the Kindle (as in, how many books in the store allow it). But on the couple that I've tested, it works great.
Also -- audible/various other audiobook retailers are great.
Here's one from 2006:
Here's a modern version on a cheap cell phone:
What is the ETA on French and German?
The potential for amusing pareidolias is definitely there. Maybe a tumblr for 'found translations'?
* OCR, to identify the text
* Removing the background of the text, filling it with surrounding colors
* Translating words
* Placing new words on the same area, using the same rectangle
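The four steps above can be sketched as a small per-frame pipeline. Everything here is a stand-in: the `Region` type, the toy dictionary, and the "frame" that already lists its text are assumptions made for illustration, and the output is a list of drawing operations rather than real pixels. It is not the app's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class Region:
    text: str   # text found in this part of the frame
    box: tuple  # (x, y, width, height) rectangle around the text


# Hypothetical toy dictionary, Spanish -> English.
DICTIONARY = {"salida": "exit", "cerrado": "closed"}


def run_ocr(frame):
    # Stand-in for a real OCR engine: the toy "frame" already lists its text.
    return [Region(text, box) for text, box in frame]


def translate_word(word):
    # Word-for-word lookup; unknown words pass through unchanged.
    return DICTIONARY.get(word.lower(), word)


def process_frame(frame):
    """Return the drawing operations for one camera frame."""
    ops = []
    for region in run_ocr(frame):
        # Cover the original text using the surrounding colors...
        ops.append(("fill_background", region.box))
        # ...then draw the translated word in the same rectangle.
        ops.append(("draw_text", region.box, translate_word(region.text)))
    return ops


print(process_frame([("SALIDA", (10, 20, 80, 16))]))
```

Keeping each stage as its own function makes the point of the list visible: each piece is an existing technique, and the pipeline is the combination.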
It's very, very clever, but it's not something that should be controlled by one person or company. Should any combination of existing techniques also be patentable? Where do we draw the line?
They were the first and they'll have a head start and make a lot of money. Now, be exclusive? I don't think so.
Patents are no defense against patent trolls, because (by definition) the patent trolls don't actually produce anything. There is no defense against patent trolls, except lobbying to narrow the scope and duration of technology patents.
Can you speak upon the origins of the software?
Did this stem from other projects/research (Edu, DARPA, lone disillusioned coder...)?
I presume one could use dictionaries of things other than plaintext? Say symbols, objects, patterns?
(for signage and 'custom' use)
Are you opposed to this being 'opensourced' at any time?
You don't have permission to access / on this server.
Additionally, a 403 Forbidden error was encountered while trying to use an ErrorDocument to handle the request.
Apache/2.2.16 (Unix) mod_ssl/2.2.16 OpenSSL/0.9.7a mod_fcgid/2.3.5 Phusion_Passenger/2.2.15 mod_auth_passthrough/2.1 mod_bwlimited/1.4 FrontPage/5.0.2.2635 Server at questvisual.com Port 80
http://itunes.apple.com/us/app/word-lens/id383463868 for direct iTunes link.
Huge congrats, this really is magic to me. If you asked me yesterday I'd have told you it's impossible to do.
The awesome thing is that once the platform is there, they could easily open it up to other languages and get them rolling too--though they might have to really think before diving into languages with a different character set, such as Japanese.
I admit I was just thinking about reading foreign language books and translating some parts, this app takes it much further.
At least in my limited experience, deciding what to work on is really important.
Acknowledging great ideas doesn't take away anything from their superb execution.
I did: "One that I've mentioned in almost every post here for ages, a service that I can take a photo, it does OCR and feeds the text into Google translate. Bonus points if it does source-language-detection. More bonus points if it is actually usable in a foreign country on real live things like signposts, menus, timetables, advertisements, book covers, leaflets, etc." - me, http://news.ycombinator.com/item?id=569838
Nearly two years ago, travelling in a foreign country with an iPhone, knowing Google Translate existed, knowing OCR existed, and not being able to do anything with that knowledge.
(It's so rare that I hit this kind of advance view on things I'm going to be shamelessly happy about it. NB, I'm not claiming anything here - I couldn't write it and didn't try).
The creators will likely see a payoff from this whether or not they do any of these things correctly. But the magnitude of that payoff is entirely dependent on those ifs and the difference between executing correctly and not is humongous.
This is something the US military would have paid millions and millions for in the past. Now, you see people say, "I'd even pay fifteen whole dollars for that app"
haha, oh wow.
of course, people are fantastically good at it, if trained in school. looking at /other/ people's scripts though, well, here we are. :)
Hopefully this app is a home run for them!
Holy smokes, when is the Chinese version going to be available?
Let me know if you need help with a Portuguese version, seriously.
Agree with some of the other comments that their logo and site design could use a boost.
Killer idea, love it.
Here's one thing they could do: open the platform to distribute dictionaries from third parties.
Questvisual could still sell "basic" or "standard" dictionaries for each language pair, but they would also sell competing dictionaries, that could either try to address the problem from a different angle (phrase translation vs. word translation), or be specialized dictionaries: legal, medical, etc.
They would take a cut, of course, and they would create a market that they would curate. Great wins for everyone!
The lessons I learned today from Word Lens
Assembly innovation is really cool. Every piece of Word Lens was here, but nobody made a perfect combination before today. Academic researchers will never do another Word Lens, as they are overfocused on novelty and hate just-assemble-the-pieces work.
Clever freemium business model. Word Lens is free, but you have to buy language packs. "Erase words" and "reverse words" are free demo modes to prove that the app really works. Note that you can even turn it into a subscription model with dictionary updates.
BlendBack is the heart of this invention. Word Lens goes like this: (1) detect and recognize characters, (2) translate, (3) produce text in similar colors and shape and blend it back to the picture. The last step is the most innovative and can be used beyond Word Lens. E.g. one can do "Bar Code replacer". Turn your phone on any barcode and see some picture there. Can be used as a cheap replacement for road signs and ads.
No connection required. This is extremely important. 3G is unstable. WiFi is not everywhere. 4G has not really arrived yet. When you travel, your carrier can not cover you perfectly. I can see more and more essential apps that will not require connection. "Yelp in a box" anyone?
Global appeal. This is not another geek's app. It is mom and pop's app. It is an app for every country and every village. We need to spend more time outside Silicon Valley to find needs like this one.
Science fiction inspiration. Part of the reason for the press craziness is that Word Lens matched a science fiction story (the Babel fish from "The Hitchhiker's Guide to the Galaxy"). We love seeing SF concepts turn into reality. Let's reread the old classics and implement all the other concepts from there :)
What I would do next
Brainstorm pricing. A lot of options are available: different price for the first month, bundle prices for several languages, one-time price discount (a la 23andme), subscription model, enterprise package.
Put on hold all talks to investors and potential acquirers.
Immediately start working on versions for other platforms (Android, Blackberry, Nokia, WinMo). Hire another person to do just that.
Run a contest: iPhone for the best "Word Lens in the wild" video.
I disagree. I strongly believe this is close to the 1st successful iteration of a killer app for mobile devices (the Babel Fish). One that other platforms might not have for a while. I would be surprised if Apple has not already reached out to invite them to One Infinite Loop to talk about their future plans (not acquisition, but Android).
I've known about this application for a long time, Otavio showed me an early prototype of this running on his laptop over a year ago. I'm ecstatic that they finally released it.
(I'm also happy that I was able to get their story on HN before TechCrunch :D)
The product is so cool and it seems to work so well, I have to admit that I am a bit jealous. I would have been proud to be able to say "I did this".
To the team responsible for executing the application: be proud of what you did. Great work.
I am so happy for these guys. You believed it's doable and you did it. Congratulations once again.
I would love to read the "project diary". How long did it take, what were the unexpected problems, etc.
Then I got better at it. It seemed to reverse the words and letters more or less fine.
Then I spent the $5 on the product for Spanish (could be handy). I'll happily spend another $5 on a French version.
And I know how to impress everyone tomorrow!
From what I understand, this is the only app that does that.
Times like this remind me how ridiculous consumer expectations can be. People still don't get that product development takes time and energy -- they just like to criticize with their elitist expectations only because they own an iPhone.
But yes, this app is great. Keep it up!
I don't care if it even only works 20% of the time right now. The rest is elbow grease and faster hardware - it'll get there.
It's not every day that you consciously realize - hey, the world just changed today (for the better!).
Why is it not possible for the developers to use the orientation sensor/gyroscope to accommodate this? Many times it's easier to fit a bunch of text on the camera in landscape mode than the alternative. It's also a lot more natural for me to hold my phone in landscape mode when using the camera.
Just my $0.02 and congrats on all the hubbub. Can't wait for more languages :)
So what they have is
- an OCR function that reads a few words from an (easy) image
- a (simple) translator that translates the words into a different language word for word
- and then paints the translated words back using the same font/size
...continuously and quickly !
Don't get me wrong, this is cool and all, but it would be much more useful with a single snapshot that translates better, instead of focusing on doing it "realtime" (but the video wouldn't be as cool :-) )
The idea is nice, so I expect Evernote and Google to implement this ASAP ;-)
Good luck !!!
Just one small piece of feedback on the "commercial." It's a bit confusing for the first 10-15 seconds or so. I was too busy looking at the iPhone screen and not the billboards. Can I recommend an animated version with v/o?
Edit: One more piece of feedback. The icon isn't nearly cool enough for how cool this app is. My prediction is that you guys are going to make a boatload of money, so hire a designer sooner rather than later to spiff everything up.
(Here's me desiring this a couple of years ago:
The 'traditional' method of firing up something like the Google Translate app, manually (and potentially painfully, if it's a language that uses symbols you aren't used to typing in on a regular basis) typing in the text, and maybe waiting for a server in the cloud to spit back response... is just crap compared to something like this. Even if Word Lens doesn't do proper grammatical and idiomatic translation (yet?), it still seems like it'd be super useful.
People will buy smartphones/iTouches/tablets for this purpose. Alone.
This is a simple-to-use, cheap app on an existing popular platform which makes a significant improvement on a task a lot of people want to do. You're standing there saying "What's all the fuss about Chrome and FireFox? I can browse the internet with IE6 and have done for years!".
This reaction is future shock, this is a surprisingly strong wave in the flow of future technologies and it's shaking people up.
yeah doing it so fast on a mobile device is impressive, but am I alone in thinking this isn't going to change the world?
You are looking at a product on its technical merits only.
From a product perspective, I can travel to any country in the world and read the signs on the shops, read menus, or even learn the language, since I have an always-available translator to help.
It makes the world smaller and more accessible. Those things always change the world.
If I could point my macbook at something and get the same thing, then I might be a bit fascinated...the fact that I can do this with something that fits in my pocket is what is so amazing.
A minor thing about the webpage: If I want to share it on Facebook (and I do!), it does not come up with any sort of summary or images, like it normally does for links. Now, I don't know exactly how it gets this information, but I would definitely find out if I were you.
Not sure if it's the lighting or the text I'm trying, but it does have a tendency to make words dance around insanely. The execution could be improved upon a bit.
I can certainly see a lot of potential for this. Let me know when it's embedded in my augmented reality eyeglasses!