
Project Naptha: a browser extension that enables text selection on any image - antimatter15
http://projectnaptha.com/
======
atourgates
Wow - this is amazing.

Right this very moment (well, a few moments ago when I wasn't procrastinating
on HN) I was in the midst of extracting data from a client's old website in
preparation for creating a new website.

A lot of that data is contained within images.

From a few preliminary tests, I'm hugely impressed. This seems on-par with any
other OCR software I've used, and the fact that it happens in realtime in the
browser is amazing.

I tried it on a piece of content I'd just had to type out, that was originally
in an image. Typing out the content took about 10 minutes. Copying and pasting
with Naptha, and then making some minor edits/corrections, did the same thing
in about 2 minutes.

~~~
vidarh
There's actually been a bit of research on the error rates you need to beat
for OCR to be cost-effective vs. having people re-type. I don't have the
references handy, but I believe it's generally cost effective to OCR with
error rates up to nearly 2%, and most current "consumer grade" OCR is well
below 1% error rates for scans that aren't absolutely atrociously poor
quality.

My MSc thesis was on reducing OCR error rates by pre-processing of various
forms, and while I managed to get some reduction in error rates, one of the
things I found was actually that given how low the error rates generally were
to begin with, you have a _very_ tiny budget in terms of extra processing time
before further error reduction just isn't worth it - if a human needs to check
the document for errors anyway, a "quick and dirty" scan+OCR is often far
better than even spending the time to get "as good as possible" results.
Spending even a few extra seconds per page to place the page perfectly in a
scanner, or waiting a few extra seconds for more complicated processing, can
be a net loss.

It's a perfect example of "worse is better": OCR, at least for typed text, is
good enough today that the best available solutions aren't really worthwhile
to spend resources on (for users) unless/until they give results so perfect it
doesn't need to be checked by a person afterwards.

~~~
WalterBright
It was suggested to me by a friend that to get good OCR results, run it
through the scanner/OCR twice, then diff the results. Usually one or the other
will get it right, and if you run the two results through a difference editor
like 'meld', it's quick to fix.
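
A minimal sketch of the "scan twice, then diff" idea, using Python's standard `difflib` rather than a GUI tool like meld. The function name `disagreements` and the sample strings are made up for illustration; the point is only that the two passes can be compared word by word so a human looks at the handful of places where they differ.

```python
import difflib


def disagreements(run_a: str, run_b: str) -> list[tuple[str, str]]:
    """Return (words_from_a, words_from_b) pairs where two OCR passes differ."""
    a_words, b_words = run_a.split(), run_b.split()
    matcher = difflib.SequenceMatcher(a=a_words, b=b_words, autojunk=False)
    diffs = []
    for tag, a0, a1, b0, b1 in matcher.get_opcodes():
        if tag != "equal":  # replace / insert / delete all need a human look
            diffs.append((" ".join(a_words[a0:a1]), " ".join(b_words[b0:b1])))
    return diffs


if __name__ == "__main__":
    # Hypothetical output of two passes over the same page.
    first = "The quick brown fox jumps over the 1azy dog"
    second = "The quick brovvn fox jumps over the lazy dog"
    for a, b in disagreements(first, second):
        print(f"pass 1: {a!r}   pass 2: {b!r}")
```

Each pair marks one spot to check by hand; everything else was read identically in both passes.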

~~~
vidarh
That may work for some cases, and especially with horrible OCR engines and low
quality scanners, but frankly when I did my research into this, the results
varied extremely little from run to run, and you could usually easily identify
specific artefacts in the source that tripped the engine up (rather than
problems with the quality of the scan). E.g. letters that were damaged, or had
run together, creases in the paper etc.

With really low res scanners I can imagine it could make a big difference.

~~~
Corrado
Back in the late 90's I worked for a company that did a lot of OCRing and they
ran the same image through multiple engines and then manually corrected the
results. I think they had 3 engines, all from different companies, which
processed all images and put the results into a custom format. Human beings
were then employed to manually merge and correct the final text. It worked
fairly well, especially considering the hardware/software available at the
time.
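
The multi-engine setup described here could be approximated with a simple word-level majority vote. This is only a sketch under a strong assumption (all engines emit the same number of words, so no alignment step is needed), and the function name `vote` is hypothetical; it is not a description of what that company actually ran.

```python
from collections import Counter


def vote(outputs: list[str]) -> list[tuple[str, bool]]:
    """Word-by-word majority vote; the bool marks words needing human review."""
    token_lists = [text.split() for text in outputs]
    if len({len(tokens) for tokens in token_lists}) != 1:
        raise ValueError("engines disagree on word count; align the outputs first")
    merged = []
    for candidates in zip(*token_lists):
        word, count = Counter(candidates).most_common(1)[0]
        # Flag the word unless a strict majority of engines agree on it.
        merged.append((word, count <= len(candidates) / 2))
    return merged


if __name__ == "__main__":
    # Hypothetical outputs from three engines for the same image.
    engines = ["pay 100 dollars", "pay l00 dollars", "pay 100 dollars"]
    for word, needs_review in vote(engines):
        print(word, "(review)" if needs_review else "")
```

Words where no strict majority exists are the ones a person would still have to merge manually.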

The biggest problem was stuffing too many files into an NTFS directory.
Apparently, NTFS didn't like tens of thousands of files in one directory. :)

------
trishume
Holy crap, antimatter15 does so many cool things. I keep finding things that
are really cool and then scroll down to find they are all written by him.
First Shinytouch, then Protobowl years later and now this. And he's only a
year older than me (19) so it isn't that he's had more time. Check out his
Github profile for more of his projects:
[http://github.com/antimatter15](http://github.com/antimatter15)

~~~
ErikBjare
I was amazed as well, same age as me. Now I feel challenged to execute even
more of my ideas, well done Sir!

------
jgj
> Unfortunately, your browser is not yet supported, currently only Google
> Chrome is supported.

FF 28 seems to be working fine with the "Weenie Hut Jr." version...is it just
the add-on that isn't supported?

awesome tech, btw

~~~
antimatter15
Yeah, I just haven't gotten around to packaging the whole thing as a Firefox
Addon. It's actually technically possible to run the whole thing on a normal
unprivileged webpage (in fact, that's my development environment).

~~~
SteveDeFacto
Please give us a Firefox version! I'm begging you!

~~~
Natsu
Seconded. This is one of the most incredibly useful projects I've seen in a
long time.

------
vidarh
Reminds me of Powersnap on the Amiga. Many applications did their own text
rendering without supporting cut and paste, and so this guy called Nico
Francois had the bright idea of letting you select a region of a window, and
matching the standard fonts against the window's bitmap.

Of course then it was "easy": almost all the text would have been rendered
with one of a tiny number of fonts available on the system, with little to no
distortion.
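
To illustrate why this was "easy", here is a toy sketch of exact glyph matching against a known fixed-pitch font. The 3x3 font, the `FONT` table and `read_bitmap` are all invented for the example; Powersnap's real implementation is not shown in this thread.

```python
# Hypothetical 3x3 "font": each glyph is a tuple of pixel rows ('#' = ink).
FONT = {
    ("###", "#.#", "###"): "O",
    (".#.", ".#.", ".#."): "I",
    ("#..", "#..", "###"): "L",
}
CELL_WIDTH = 3


def read_bitmap(rows: list[str]) -> str:
    """Cut a fixed-pitch bitmap into cells and look each one up in FONT."""
    text = []
    for x in range(0, len(rows[0]), CELL_WIDTH):
        cell = tuple(row[x:x + CELL_WIDTH] for row in rows)
        text.append(FONT.get(cell, "?"))  # '?' marks a cell with no exact match
    return "".join(text)


if __name__ == "__main__":
    bitmap = [
        "#...#.###",
        "#...#.#.#",
        "###.#.###",
    ]
    print(read_bitmap(bitmap))  # prints "LIO"
```

With undistorted, pixel-exact rendering a dictionary lookup is enough; scanned or anti-aliased text is what makes general OCR hard.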

~~~
Ogre
Powersnap was amazing. I seem to recall it was usually able to figure out what
font each program was using and only had to search for letters for that
specific font, and only fall back to a bigger search if that failed. I might
be misremembering, but regardless, it was essentially as fast as any
copy-paste today, in an environment where many programs weren't even written to
support it.

Even though it solved a problem we don't usually have today (this story
notwithstanding), it was still one of the most amazingly useful programs ever.

~~~
vidarh
You're probably right - the manual says it did. It'd be able to get the last
used font from the RastPort structure used to draw to the window [1].

If the window was rendered with multiple fonts that wouldn't be reliable, but I
guess it'd likely be "good enough" to avoid a wider search most of the time.

[1] Here's the RastPort struct from AROS (open source re-implementation of
AmigaOS):
[http://repo.or.cz/w/AROS.git/blob/HEAD:/compiler/include/gra...](http://repo.or.cz/w/AROS.git/blob/HEAD:/compiler/include/graphics/rastport.h)

------
pestaa
This is great news for those who have to live with disabilities.

Maybe soon I won't feel guilty for leaving my alt attributes empty.

------
leeoniya
@antimatter15, i have a project that does client-side image analysis and
decomposes document structures. it looks like your OCR code would be a great
replacement for the server-side Tesseract ocr i currently use :)

here's what the project does now with js + web workers:

[http://i.imgur.com/QvXSkY2.png](http://i.imgur.com/QvXSkY2.png)

processing time is < 1500ms in Chrome and < 2000ms in FF

the code is open source, though using it isn't yet polished. i'm working slowly
on a blog post series to detail how to use the lib(s).
[https://github.com/leeoniya/pXY.js](https://github.com/leeoniya/pXY.js)

a walkthrough of the base lib is here:
[http://o-0.me/pXY/](http://o-0.me/pXY/)

~~~
antimatter15
The OCR code is an Emscripten port of the GPL-licensed Ocrad program. I
published it on Github a few months ago,
[http://antimatter15.github.io/ocrad.js/demo.html](http://antimatter15.github.io/ocrad.js/demo.html)

But in my experience, the recognition quality isn't good enough to replace
Tesseract if you have that capability.

~~~
leeoniya
it would be very useful to maybe just use part of the code. (the part that
detects where there is text, rather than what the text is)

------
skizm
Doesn't work great. Went to reddit's advice animal page to try it out and it
doesn't seem to work with livememe (I think they have an invisible layer over
their images to try and block hot linking).

Here is a copy/paste example from imgur:

[http://i.imgur.com/sKQXx8v.jpg](http://i.imgur.com/sKQXx8v.jpg)

Top: vou SAID w[ W[R[ |[AVINĞ`ON TIM[TOAV

Bottom: TN[ FACTTNATl'M MAWING TNISM[M[ g INST[AD of DRIVING D[TERMIN[D
TN#rWASA ll[

Maybe it needs to be a certain font for better results. Still pretty cool.
Hopefully all the kinks get worked out. I would definitely find this useful.

EDIT: need to make sure the language is set to "internet meme" and it works
much better.

~~~
antimatter15
By default it uses Ocrad.js, a pure javascript OCR engine (ported via
emscripten, see
[http://antimatter15.github.io/ocrad.js/demo.html](http://antimatter15.github.io/ocrad.js/demo.html)).
But if you right click on the selection and change the language to "Internet
Meme", it should transcribe it correctly (note that this sends the selection
off to a server for remote processing- It's not the default for privacy and
scalability considerations at the moment).

~~~
skizm
Ah, much better!

Top: YOU SAID WE WERE LEAVING'ON TIME:TODAY

Bottom: THE FACT THAT I'M MAKING THIS MEME INSTEAD OF DRIVING DETERMINED THAT
WAS A LIE

Next time I'll RTFM.

~~~
abandonliberty
Mine automatically selects the appropriate language on the meme images.
Including your link. Updated already?

------
elwell
Every time I click "Allow" on "Access data on all sites" for an extension I
creep closer to my security hole paranoia threshold. If it was all in JS, who
cares? But this sends ajax to remote servers of course.

Am I alone?

~~~
erikrothoff
That is the wording that Google Chrome chose for "allow this extension to
access the DOM on any page". It sounds bad but these are the permissions an
extension needs to be able to access images and text on any page.

~~~
elwell
Yeah, or any password.

------
yaddayadda
1) Very, very flippin' cool!

2) Erase Text option menu location Using version 0.7.2, the "Erase Text"
option is displayed under the "Translate" section (certainly not where I would
ever intentionally look for it).

3) Select Text -> Right-click changes selection After selecting my text, when
I right-click the selected text often (almost always) changes. For example,
with the kitten text, I selected both paragraphs, but when I right-clicked to
go to Translate->Erase the first paragraph ceased to be highlighted. After
erasing the second paragraph I tried in vain to select _and_ erase the first
paragraph, but everytime I'd right-click the selected paragraph only a single
word would still be highlighted. I eventually tried erasing text while only
one word was highlighted and the entire first paragraph was erased.

4) I really appreciate the Security & Privacy section of the project page.

5) I would love to see a Firefox version of Project Naptha!

------
bigbugbag
I wonder how deep this project is in violation of the GPLv3.

For starter it's based on gnu ocrad [1] but fails to state a license and to
publish any source code.

[1]:
[https://www.gnu.org/software/ocrad/](https://www.gnu.org/software/ocrad/)

~~~
antimatter15
[https://github.com/antimatter15/ocrad.js](https://github.com/antimatter15/ocrad.js)

------
JoelHobson
This is simply incredible. I'm just blown away by it.

I wonder if you could get better performance when running locally by sending
the result through a spellchecker and doing some Bayesian magic on the word
choice...
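
A rough sketch of the spellchecker idea. Note that this swaps in plain string-similarity matching from Python's `difflib` in place of anything Bayesian, and the tiny `VOCABULARY` list is a placeholder; a real setup would use a proper word list and word frequencies.

```python
import difflib

# Hypothetical vocabulary; a real one would come from a dictionary file.
VOCABULARY = ["the", "fact", "that", "making", "this", "meme", "instead", "driving"]


def correct(word: str, cutoff: float = 0.6) -> str:
    """Return the closest vocabulary word, or the input if nothing is close enough."""
    matches = difflib.get_close_matches(word.lower(), VOCABULARY, n=1, cutoff=cutoff)
    return matches[0] if matches else word


if __name__ == "__main__":
    for raw in ["makjng", "instcad", "zzzz"]:
        print(raw, "->", correct(raw))
```

Words with no sufficiently similar vocabulary entry pass through unchanged, which matters for names and serial numbers.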

------
iooi
Couldn't get it to work on:
[http://graphics8.nytimes.com/adx/images/ADS/37/09/ad.370964/...](http://graphics8.nytimes.com/adx/images/ADS/37/09/ad.370964/184x90_LEFT_cm_final.jpg)

Also for:
[http://www.wsoddata.com/clients/8bec9b10/ads/300x250_static/...](http://www.wsoddata.com/clients/8bec9b10/ads/300x250_static/images/300x250_v1_bkgd.png)
It can't get the top-right text correctly

Awesome tech though

~~~
antimatter15
One of the rules for the heuristic for what images to ignore is that it needs
to have over 19,000 square px, and that first image was a bit under that.
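
As a quick check of that rule, assuming the sizes embedded in the two file names above (184x90 and 300x250) are the real pixel dimensions, which the thread does not confirm. The names `MIN_AREA_PX` and `large_enough` are made up for this sketch.

```python
MIN_AREA_PX = 19_000  # threshold quoted in the comment above


def large_enough(width: int, height: int) -> bool:
    """True when the image area exceeds the minimum, per the rule as described."""
    return width * height > MIN_AREA_PX


if __name__ == "__main__":
    print(184 * 90, large_enough(184, 90))      # 16560 False
    print(300 * 250, large_enough(300, 250))    # 75000 True
```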

------
rooted
Very slick! Does it automatically start OCRing every image, or does it wait
for a user to try to select the image text? Asking because I'm concerned about
this decreasing performance.

~~~
antimatter15
It waits until you start selecting the image text, but the text detection
starts when your cursor moves toward an image. It uses WebWorkers extensively,
so on a multicore system, the performance shouldn't be hit. I haven't noticed
an effect on battery life, but that's not out of the question.

------
tiles
This is amazing! Is there a planned open-source license or commercialization
of this?

------
SchizoDuckie
Wow. Just wow. How did I live my life before this?

Once again, such a simple implementation by somebody that grabs some
components that have been around for ages and mashes them up in a way that
makes people question why it wasn't invented before

I've got this installed and it'll probably never leave my chrome profiles.
Keep up the awesome work!

------
userbinator
I have a feeling that if you just make the OCR better, a lot of users are
going to use this for entering CAPTCHAs...

~~~
vxNsr
Doesn't seem to work on reCAPTCHA images at all.

~~~
userbinator
Like I said, needs better OCR.

------
michaelchum
I remember your 2nd place win at HackMIT, congratz again. It was THE most
useful hack by far and I'm glad you've made it a public product now, and free.
Wow, it seems like you beat industrial OCR products that have been around for years... and by
far. This is simply amazing, keep on the great work!!!

------
aalpbalkan
Certainly a cool idea but it didn't work fine on an XKCD comic:

[http://www.xkcd.com/](http://www.xkcd.com/) bottom line here is recognized
as:
"T1EN°5'lI'ONAl.1?E£ONNH\56PNCE(YHCEPlP6ﬁN(N)SURLH’PR3AO-i‘lDlsIr'£7E‘5IJ%z"

~~~
antimatter15
Randall Munroe's handwriting is a bit difficult to OCR because a lot of the
letters are smushed together close enough that it's not possible to
unambiguously segment the text into distinct letters (which is a necessary
first step in any OCR engine that I'm aware of). Maybe Google's (or
Vicarious's) magical convolutional neural net that can solve CAPTCHAs would
fare better.
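
To make the segmentation problem concrete, here is a sketch of one common, simple approach: splitting a line of text wherever a pixel column contains no ink (a vertical projection profile). This is an illustrative technique, not necessarily what Ocrad or Tesseract does internally, and `segment_columns` is a hypothetical name.

```python
def segment_columns(rows: list[str]) -> list[tuple[int, int]]:
    """Return (start, end) column ranges separated by ink-free columns."""
    width = len(rows[0])
    has_ink = [any(row[x] == "#" for row in rows) for x in range(width)]
    segments, start = [], None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x                      # a glyph begins
        elif not ink and start is not None:
            segments.append((start, x))    # a blank column ends the glyph
            start = None
    if start is not None:
        segments.append((start, width))
    return segments


if __name__ == "__main__":
    spaced = ["#.#.###", "###..#.", "#.#..#."]    # two glyphs with a gap
    touching = ["#.####", "###.#.", "#.#.#."]     # same glyphs, no gap
    print(segment_columns(spaced))    # [(0, 3), (4, 7)]
    print(segment_columns(touching))  # [(0, 6)]
```

When letters touch, the gap disappears and the two glyphs come back as a single segment, which is the failure described above.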

~~~
gorhill
> it's not possible to unambiguously segment the text into distinct letters
> (which is a necessary first step in any OCR engine that I'm aware of)

This made me realize I never saw such a thing as OWR, i.e. software that
would first try to recognize whole words, then go down to character level if
no satisfying match is found.

Found out this exists already:
[https://en.wikipedia.org/wiki/Intelligent_word_recognition](https://en.wikipedia.org/wiki/Intelligent_word_recognition)

------
jpasden
This is amazing, and it has truly revolutionary implications for learners of
scripts like Chinese, which are still truly indecipherable to learners when
embedded in images. I was really happy to see that this extension supports
both simplified and traditional Chinese. I tried it out, and while it shows
promise there, it definitely still needs a lot of work.

I posted a review on my blog here:
[http://www.sinosplice.com/life/archives/2014/04/24/can-proje...](http://www.sinosplice.com/life/archives/2014/04/24/can-project-naptha-read-chinese-text-in-images)

OP, I'd be happy to work with you on improving the recognition of Chinese
text. Just get in touch with me through my blog (linked to above).

------
m_ke
Cool, I implemented the stroke width transform for text detection about a year
ago. Nice to see someone else implementing it, but I'm pretty sure
convolutional neural nets do a better job at text localization.

------
plicense
This isn't particularly awesome, because

1\. The implementation of Stroke Width Transform is not super good. So far,
[http://libccv.org/](http://libccv.org/) has the best implementation of SWT.
But again, you can neither make the head nor the tail of that implementation.

2\. There are just too many false text regions and the text detection accuracy
is nowhere near what you can call good. A mixed use of multiple OCR engines
might give better results.

All that said, you can't take away the cleverness of the application of
detecting text. Mind == Blown, on that area.

~~~
antimatter15
I actually modeled my implementation after libccv's implementation. Part of
what libccv seems to do is to run it multiple times at different scales, which
isn't something that's very computationally feasible for a pure javascript
implementation. My implementation has a second stage color filter which
refines the SWT (this is something of a tradeoff that improves accuracy for
machine-generated text and reduces accuracy for natural scenes, and I'm under
the impression that the corpus used by SWT focuses on the latter).

Ocrad is being used as the default because it runs locally and it's small
enough that it's easy to ship with. The remote OCR engine uses Tesseract which
gets much closer to acceptable in a lot of circumstances.

But there is a lot of work which can be done to improve it. I have a friend
who constantly nags me for not having a solid test corpus to run regression
analysis/parameter tuning/science. Certainly it lacks the rigor of an academic
and scientific endeavor, but I've always imagined this as a sort of advanced
proof of concept. I think the application of transparent and automatic
computer vision deserves to be part of the interaction paradigm for the next
generation of operating systems and browsers.

------
sailfast
This looks very cool and could come in quite handy.

In case anyone from the project is monitoring - text selection did seem to
work fine for me in FireFox (ESR 24.3) despite the "Not Supported" text being
displayed.

~~~
zxexz
I think the developer just meant that he hasn't made a FF add-on yet; the
code works great for me in FF as well.

------
x0ner
Extension is awesome and while the code is messy, it has enough little jokes
to keep you amused. For those looking to access the backend OCR service, it
seems to be down right now, but will hopefully come back up soon.

Here were the API references I could find for the remote OCR:

\- GET [https://sky-lighter.appspot.com/api/read/<chunk.key>](https://sky-lighter.appspot.com/api/read/<chunk.key>)

\- GET [https://sky-lighter.appspot.com/api/lookup?url=<image.src>](https://sky-lighter.appspot.com/api/lookup?url=<image.src>)

\- POST [https://sky-lighter.appspot.com/api/translate](https://sky-lighter.appspot.com/api/translate)

Apparently the author was one of the winners of HackMIT 2013 according to some
of the comments. Couple of fun things in there if you decide to poke around in
the code. Jump into naptha-wick.js for the remote logic.

Note from the Dev
([http://challengepost.com/users/antimatter15](http://challengepost.com/users/antimatter15),
[http://antimatter15.com/wp/](http://antimatter15.com/wp/),
[https://twitter.com/antimatter15](https://twitter.com/antimatter15)):

/* It's April 16, 2014.

It's been six months since I started this project.

Just under two years after I first came up with the idea.

It's weird to think of time as something that happens, to think of code as
something that evolves. And it may be obvious to recognize that code is not
organic, that it changes only in discrete steps as dictated by some
intelligence's urging, but coupled with a faulty and mortal memory, its
gradual slopes are indistinguishable from autonomy.

Hopefully, this project is going to launch soon. It looks like there's
actually a chance that this will be able to happen.

The proximity of its launch has kind of been my own little perpetual delusion.
During the hackathon, I announced that it would be released in two weeks time.

When winter break rolled by, I had determined to finish and release before the
end of the year 2013.

This deadline rolled further way, to the end of January term, IAP as it is
known. But like all the artificial dates set earlier, it too folded against
the tides of procrastination.

I'll spare you February and March, but they too simply happened with a modicum
of dread. This brings us to the present day, which hopefully will have the
good luck to be spared from the fate of its predecessors.

After all, it is the gaseous vaporware that burns.

*/

~~~
antimatter15
Yeah, I made the mistake of setting the App Engine budget to $1.00. Turns out
that's probably not enough for a sustained run as HN's #2.

Yeah, the code is super messy, but I'd prefer if you didn't play around too
much with the remote OCR service, specifically, the translation parts because
Google Translate is pretty expensive per-use.

~~~
LeBleu
You have no donate link... if you're gonna be on big sites like HN, you might
as well have a donation link so that hopefully you break even on App Engine.

------
steren
Very impressive work. I'm not surprised to find antimatter15 behind it.

The website was not very clear if work was done client-side or not (mentioning
server calls). It turns out that server calls can be disabled and the
extension is working quite fine without. By default, I would disable this
option and offer opt-in, it is better for privacy I think.

------
StringyBob
This is great. I'd love to see this extended for natural images with whatever
algorithm Google uses for OCR in streetview -
[http://googleonlinesecurity.blogspot.co.uk/2014/04/street-vi...](http://googleonlinesecurity.blogspot.co.uk/2014/04/street-view-and-recaptcha-technology.html)

------
bananas
This is EXACTLY what I need at the moment.

I get a big problem with various people sending me screenshots with stackdumps
in. This is perfect for extracting them into the ticket bodies and it does it
perfectly (I've just done 20 with it and manually checked them!)

This is the sort of stuff that really improves people's lives by making all
data equal.

------
Craque
Please help, it looks brilliant, however, only the test page works for me.
Can't get any other pages to work. Text simply isn't selectable - cursor
remains as a pointer, not an 'I' :(

I'm using the latest version of Chrome on a modern Mac and have Naptha
properly installed and Chrome has been relaunched.

Any hints would be appreciated.

------
yeukhon
Awesome. I was actually at HackMIT. It is great to see you actually continue
working on this. As a matter of fact, I told my friends who were working on
similar idea for their senior project your project name last Fall. I emailed
you for the Microsoft reference papers :) Not sure if I should copy and paste
that.

Anyway, good luck!

------
tehaaron
This is really neat. I was playing with it on pictures of street signs and
buildings and realized that if I select some text and then do ctrl+a it tried
to select everything it thought was text...Then I used right click > translate
> reprint to see what it thought each thing was.

Here is the picture:
[http://thesuperslice.com/wp-content/uploads/2012/04/downtown...](http://thesuperslice.com/wp-content/uploads/2012/04/downtownla_timelaps5.png)

And the text outcome - found it most interesting what symbols it thought it
recognized:

lam

on-0'0

s.

Ic 0on

§-i-

I-*-

-unm

-$3.»;

o

G %T1

00-O

. o C-‘7' H ' .-.”-." «'~3;

.35

$16 O-O

‘D Q-=¢1

‘-M

km“

‘MIMI

DOW:

TLDR

D001”

'."'IIu

ff"

)0‘

\\\

,¢-.5 ,:~L.

r/J

------
RyanMcGreal
I tried it on the handwritten all-caps text on this page:
[http://xkcd.com/1271/](http://xkcd.com/1271/)

It (sort of) worked:

"I AB5ENTH|NDEDLY5ELECT RANDU1 Bl.OO<5 OFTEXTHSI READ, PND FEEL SLRONSCDUSLY
SATISFIED LHEN THE HIGHUGHTED AREA |"PKE5 H 5Yl’R1ETRICHL 5|-PPE"

------
frankosaurus
I had high hopes for this, as I sometimes need to manually transcribe serial
numbers from customers' screenshots.

However, it seems to confuse letter O and number 0. Since serial numbers are
not English words, I'm not sure how you would solve this unless you had a
lookup for commonly used web fonts.

------
bigbugbag
Seemed like an interesting project, clicked on the link, scanned the page, and
it seems to be an empty pointless web page trying to explain over pages worth
of scrolling that it allows to deal with text trapped inside images which I
already knew when I clicked the link.

Going back to the page after closing it once, I noticed written in smaller
characters that this somewhat pointless page is for a useless extension as it
is exclusively limited to the worst offender privacy wise of a web browser
that I would not touch with a stick. google chrome is the new internet
explorer to me as its main use is to download firefox.

In conclusion this looked promising but a confusing web page and browser
lock-in renders it useless and shows that it is far from doing what it claims. "...
on every image you see while browsing the web" should be "...on every image
you see while browsing the web in google chrome".

No github and no open license tells me that as a linux user of opera I'm
pretty much assured I will never see a version of this extension.

~~~
bigbugbag
Not sure why my comment is downvoted, this is worthwhile (potential) user
feedback/criticism.

Webpage is not to the point and design has some room for improvement. See
point 1 and 2 of
[http://www.webpagesthatsuck.com/biggest-mistakes-in-web-desi...](http://www.webpagesthatsuck.com/biggest-mistakes-in-web-design-1995-2015.html)

------
Omnipresent
This is extremely powerful for the end user. I've been doing a bit of OCR work
using some pre-processing methods combined with Tesseract and OpenCV. I am
curious to know how you are doing this on the fly and also as a chrome
extension. Is the processing done in JS?

------
3JPLW
The biggest thing I'd like to see is enabling in-page (control/command-f)
search. In my quick scan through the page it looks like it doesn't do that… is
that right? Are there plans to add invisible text to the DOM that control-f
can find?

~~~
antimatter15
One problem with that is that it processes images lazily. It continually
extrapolates cursor movements ~1 second into the future and processes those
relevant parts of relevant images. But it should be possible that after an
image is processed (or even eagerly by looking up previously recognized
regions from the cached OCR server), the page could be made Ctrl+F-able.
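
A sketch of what "extrapolating the cursor ~1 second into the future" might look like. The comment does not say how the prediction is made, so simple linear extrapolation from the last two samples is an assumption here, as are the names `Rect`, `predict` and `should_preprocess`.

```python
from dataclasses import dataclass


@dataclass
class Rect:
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x: float, y: float) -> bool:
        return self.left <= x <= self.right and self.top <= y <= self.bottom


def predict(prev, curr, lookahead_s: float = 1.0):
    """Linearly extrapolate from two (time_s, x, y) cursor samples."""
    (t0, x0, y0), (t1, x1, y1) = prev, curr
    dt = t1 - t0
    if dt <= 0:
        return x1, y1  # no usable velocity; assume the cursor stays put
    vx, vy = (x1 - x0) / dt, (y1 - y0) / dt
    return x1 + vx * lookahead_s, y1 + vy * lookahead_s


def should_preprocess(image: Rect, prev, curr) -> bool:
    """Start text detection if the cursor is predicted to land on the image."""
    return image.contains(*predict(prev, curr))


if __name__ == "__main__":
    image = Rect(left=25, top=10, right=100, bottom=100)
    print(should_preprocess(image, (0.0, 0, 0), (0.5, 10, 5)))  # True
```

Only images the cursor is heading toward get processed, which is why the whole page is not searchable up front.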

~~~
omegant
Or add the option to the ctrl+F pop up "search inside the image" That way you
save memory if you are not using it.

------
cornholio
I like the way this extension removes text in the image, but I would much
rather have a video delogo filter that does not suck. It would be very
useful for removing hard subtitles, station logos, screener warnings etc.

------
eddyb
The Mentalist reference, anyone?

In any case, pretty cool project, I'm a bit amazed how far we've come since
I've last played with OCRs (and defeated one bad CAPTCHA implementation, still
in use at pastebin.com it seems).

------
adem
Cool idea, definitely worth exploring the possibilities. A quick run showed me
that it often interprets the "i" as "l" whenever the gap between the line
and dot is not apparent

------
deviltreh
Now that is pretty damn cool. Will help at work when marketing people do not
copy paste email/article and just put screenshot of it and if you want to
quote something from that picture...

------
RaphiePS
Saw you demo this at the hackathon session at CPW! Really, really cool.

------
jpdlla
@antimatter15 any recommendations for optimizing Tesseract?

~~~
kylebrown
Curious about this too. Also, what's the stack providing
Tesseract-as-a-Service? According to my cursory search, Google app engine won't run Tesseract
as it's a native library, not an API. I'd like to try this on non-Latin/CJK
hardcoded subtitles, but Ocrad does Latin only.

~~~
antimatter15
I wrote a little C program that uses TessBaseAPI to extract letter locations
which gets triggered with ImageMagick's convert by a NodeJS script. The App
Engine frontend acts as a caching reverse proxy.

------
Tsagadai
I have wanted an extension to do this for so long. I even started coding my
own at one stage but hit various issues. Thank you so much for creating this.

------
jawerty
You guys should consider making an API for this. It would be awesome to have
an API that inputs images via url and outputs the text of said image.

------
darkhorn
What project won the first place in HackMIT 2013?

~~~
garrettgrimsley
A hack called "Lightboard."

[http://bostinno.streetwise.co/all-series/photos-recap-and-wi...](http://bostinno.streetwise.co/all-series/photos-recap-and-winners-of-2013-hackmit/)

[https://github.com/vincentsiao/Lightboard](https://github.com/vincentsiao/Lightboard)

------
vanderZwan
> _In a sense that’s kind of like what a human can do: we can recognize that a
> sign,_

Oh god... how does it finish! I need closure!

(PS: this is awesome)

~~~
antimatter15
Fixing that! I also have to write the entire second half of the chronology
section, but at least it looks less like I pulled a "Monty Python animator".

------
jonnynezbo
This is pretty cool, but not perfect. Upon copy & pasting the captured text,
several of the words and letters are wrong.

------
swavaldez
Everyone deserves to have this extension. It's even better if it could be a
browser's default feature. ;P

------
Aardwolf
I would like it if this would work on ANY text in webpages.

Too many webpages make it too hard to select even actual plain text.

------
krsunny
Completely agree with the "where has this been all my life" sentiments. This
is awesome, thank you.

------
username42
Just tested with a random scanned page
([http://www.hpl.hp.com/research/info_theory/ShannonWeb/fullsi...](http://www.hpl.hp.com/research/info_theory/ShannonWeb/fullsize/A\)%20Clean%20Original.gif))
the result is almost garbage. It seems as bad as most OCR software I have
encountered. This was to be expected as it is based on ocrad.

~~~
ishi
Almost garbage? This is the OCR result for the 2nd paragraph. Almost perfect,
although the last word in each line gets joined to the first one in the next
line:

"The fundamental problem of communication is that of reproducing atone point
either exactly or approximately a message selected at anotherpoint. Frequently
the messages have meamlng; that is they refer to or arecorrelated according to
some system with certain physical or conceptualentities. These semantic
aspects of communication are irrelevant to theengineering problem. The
signiﬁcant aspect is that the actual message isone selected from a set of
possible messages. The system must be designedto operate for each possible
selection, not just the one which will actuallybe chosen since this is unknown
at the time of design."
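
The joined words ("atone", "anotherpoint") look like lines being concatenated without a separator. A minimal sketch of the fix, assuming the recognizer hands back one string per line; `join_lines` is a hypothetical helper, not part of Naptha.

```python
def join_lines(lines: list[str]) -> str:
    """Join OCR'd lines with single spaces so words on adjacent lines stay apart."""
    return " ".join(line.strip() for line in lines if line.strip())


if __name__ == "__main__":
    print(join_lines(["that of reproducing at", "one point either exactly"]))
```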

~~~
entropy_
I tried it with both ocrad and tesseract modes, and indeed, the ocrad mode
produces garbage, the Tesseract mode produces a really good result but takes a
longer time doing it (mainly the time it takes to upload the entire thing and
get the result back).

That seems to make sense to me, at least. Use ocrad mode by default, if it
doesn't perform well, switch to Tesseract and you'll hopefully get a better
result.

------
nileshtrivedi
Very nifty. Although, it would have been even more awesome if it worked with
Google Books.

------
bz123
cool idea, a bit buggy yet and when i am trying to actually save images i do
get the custom extension right click bar instead of the normal chrome bar to
save the image, but i guess its still under development.

------
swah
Great idea, this should make a couple million for the creators.

------
amazd
This seems like a great addition to my side project (amazd.com)

------
darkhorn
In Firefox 31.0a1 I can copy the text only with Ctrl+C.

------
sourcex
This is Awesome and very useful! BTW did he do it ?

------
ernestipark
Awesome. How does this affect page performance?

------
valbaca
Worth it for the dozen-click easter egg.

------
seshakiran
Very nice. Will give it a try.

------
nemrow
Way cool! I am impressed.

------
Thiz
Magic.

Indistinguishable from magic.

------
atixid91
wow a step ahead! amazing extension....

------
est
bonus points to scan QR-codes

------
pinaceae
Very cool stuff, but need to satisfy my OCD:

It's spelled Naphtha
([http://en.wikipedia.org/wiki/Naphtha](http://en.wikipedia.org/wiki/Naphtha)).
And for the HN hordes - read the bottom of the linked project page, it is
supposed to be a reference to Naphtha.

:)

~~~
clarkm
Isn't that like telling Google that it's actually spelled Googol?

------
sscalia
Badass. Now support Good Browsers™ like Safari and Firefox.

~~~
wdewind
Curious, what makes you find those to be better than Chrome? I recently
switched to FF for a variety of random work reasons, and found it so much
worse than chrome (basic UI, dev tools, speed) that I switched back asap.
Maybe I'm missing something awesome about them.

~~~
ycaspirant
I use Safari, and I find it to be better than Chrome because it's easier to
sync with my iPhone and iPad, and with iCloud keychain even my passwords are
synced.

~~~
wdewind
Interesting thanks

------
jbeja
This my friends is called "Innovation".

------
batmansbelt
Now the NSA will be reading the contents of your animated GIFs.

~~~
SoftwareMaven
Do you really think the NSA hasn't had access to OCR technology until now?

~~~
batmansbelt
It was a hilarious joke.

~~~
tlrobinson
Apparently not hilarious.

~~~
batmansbelt
Tough crowd.

~~~
SoftwareMaven
In the immortal words of reddit: Woosh!

Sarcasm is hard to read on the internet. I'm usually pretty good at it, but
this one flew right past me.

------
bondolo
I can imagine quite a few blind people are creaming their jeans about now.

