
Seeing AI for iOS - kmather73
https://www.microsoft.com/en-us/seeing-ai/
======
nharada
Yes, YES, _this_ is what I'm talking about Microsoft. I'm surprised how muted
the reaction is from HN here.

On the technical side, this is a perfect example of how AI can be used
effectively, and is a (very obvious in hindsight) application of the cutting
edge in scene understanding and HCI. There are quite a few recent and advanced
techniques rolled into one product here, and although I haven't tried it out
yet it seems fairly polished from the video. A whitepaper of how this works
from the technical side would be fascinating, because even though I'm familiar
with the relevant papers it's a long jump between the papers and this product.

On the social side, I think this is a commendable effort, and a fairly low
hanging fruit to demonstrate the positive power of new ML techniques. On a
site where every other AI article is full of comments (somewhat rightfully)
despairing about the negative social aspects of AI and the associated large
scale data collection, we should be excited about a tool that exists to
improve lives most of us don't even realize need improving. This is the kind
of thing I hope can inspire more developers to take the reins on beneficial AI
applications.

~~~
ezequiel-garzon
Also on the technical side, the fact that Microsoft (Microsoft!) is releasing
an iOS-only app, even if it comes from its research department, is maybe more
shocking than its open-source releases of late. Who are you, Microsoft?!

~~~
taspeotis
It makes sense to start with iOS anyway, it's known for having great
accessibility. A lot of people with accessibility needs use iOS for that
reason.

~~~
reitanqild
Also, rumour has it that Apple punishes you in the app store if you launch on
Android first.

~~~
pmarreck
I just used my best google-fu and can find no evidence for this. If you're
going to make controversial claims like that in the future, include a link to
the evidence or you will mostly just end up spreading FUD.

~~~
reitanqild
Upvoted you. Couldn't find it either. Think it was in a checklist I read once.

~~~
pmarreck
It's missing because Apple took it down wherever it was, obviously.

/conspiracyfallacysarcasm :)

~~~
bobsam
I think @rustyshelf has removed the most salty posts, but the bulk of the
story is still somewhere in their blog:

[https://blog.shiftyjelly.com](https://blog.shiftyjelly.com)

------
mattchamb
Just as an extra piece of information for people, the person presenting the
videos in that page is Saqib Shaikh, who is a developer at Microsoft. Earlier
on HN, there was a really interesting video of him giving a talk about how he
can use the accessibility features in Visual Studio to help him code.
[https://www.youtube.com/watch?v=iWXebEeGwn0](https://www.youtube.com/watch?v=iWXebEeGwn0)

------
dcw303
US App Store only at this stage it seems. Pity, I'd like to try this.

edit: I'm wrong. it's in other stores as well, but not in the Australian app
store, which is the one that I tried.

~~~
seeingai
For the launch, it's available in the USA, Canada, India, New Zealand,
Singapore and Hong Kong. It should gradually become available in more
countries.

Anirudh from Seeing AI team

~~~
TheCoreh
Just curious, what is the reasoning behind such rollouts? Localization?

~~~
kiliankoe
I really hope not. I'm not saying localization isn't important, but holding
back an English app because of this is something I don't understand. A huge
part of the world speaks English just fine. Release first, localize later if
it's not possible right away. This is probably not the case here, but I hate
it when devs forget that the US isn't the entire world that they're releasing
their app to.

~~~
eriknstr
Also some of us don't even like to have the software localized. English is not
my mother tongue and it's not an official language where I live either but I
always set the system language to American English on my computers and
devices.

~~~
Sharlin
Localization != translation. Especially when it's a machine vision app that's
supposed to be useful _locally_.

~~~
eriknstr
You are right. Internationalization might be the word for what I'm talking
about, though some might argue that that's not correct either because they
might say that l10n is a subset of i18n. Not quite sure. But anyways, yes,
l10n in the sense that it works in my part of the world is desirable. I18n if
only taken to mean translation (and things like right-to-left and such which
arise from supporting certain languages) is what doesn't matter to me.

------
booleandilemma
It correctly identified my refrigerator and bookshelf. Color me impressed.

Things like this make articles like this one seem silly:
[https://www.madebymany.com/stories/what-if-ai-is-a-failed-dr...](https://www.madebymany.com/stories/what-if-ai-is-a-failed-dream)

~~~
ianai
It was very unjudgey about my living room too.

------
ve55
If I have a friend that is visually impaired and is using this, I have to
consent to their phone recording me and analyzing me and sending all of that
data off to who knows where.

And this is just from my perspective - someone who is not visually impaired.
For the person who is, every single thing they look at and read is going to be
recorded and used.

It's an unfortunate situation for people to be put in, and I'm sure everyone will
choose using improvements like this over not using them. As much as I would
love to see a focus on privacy for projects like this, I don't imagine it
happening any time soon, given how powerful the data involved is.

I imagine a future where AI assistants like this are commonplace, and there is
no escaping them.

~~~
visarga
In the future, most AI will happen locally, not remote, because of privacy,
latency and cost. There is a lot of research in adapting models to fit phones
(such as reducing the size of the NN by 20x-100x and still having almost all
the accuracy).
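
As a rough illustration of one way models are shrunk, here is a minimal sketch of 8-bit linear weight quantization in pure Python. The function names and the example weights are made up for illustration, and nothing here is claimed to be what Seeing AI or any particular paper does. Storing 8-bit integers instead of 32-bit floats gives about a 4x reduction on its own, so the larger figures mentioned above would presumably need this combined with other techniques such as pruning.

```python
# Hypothetical sketch of 8-bit linear weight quantization (no ML framework).
# All names and values are illustrative assumptions.

def quantize(weights):
    """Map floats to integers in [-127, 127] using one shared scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0
    return [round(w / scale) for w in weights], scale


def dequantize(codes, scale):
    """Recover approximate float weights from the integer codes."""
    return [c * scale for c in codes]


weights = [0.82, -0.31, 0.05, -1.27, 0.44]  # made-up example values
codes, scale = quantize(weights)
restored = dequantize(codes, scale)

# Rounding to the nearest code bounds the error by half a quantization step.
worst_error = max(abs(a - b) for a, b in zip(weights, restored))
```

The per-weight error is at most half of `scale`, which is why accuracy often survives this kind of compression when the weights are not spread over a huge range.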

------
engulfme
If I remember correctly, this came out of a OneWeek project - Microsoft's
company-wide weeklong hackathon. Very cool to see a final published version of
this!

~~~
seeingai
You are right, this project evolved from the week-long 2015 company-wide
Hackathon, with a participation of close to 16,000 employees, where people are
allowed to take a week off to build anything of interest. Many accessibility
projects have come out of it. Off the top of my head:

(1) Eye Controlled Wheelchair for people with ALS

(2) Color Binoculars - App for people who are colorblind

(3) Hearing AI - App including live speech recognition and sound recognition
for people with profound hearing loss

(4) Learning Tools for OneNote -
[https://www.onenote.com/learningtools](https://www.onenote.com/learningtools)

(5) Dictate - Speech-based keyboard control to type emails/documents
([http://dictate.ms](http://dictate.ms)), built originally for people with hand
dexterity issues

A few of my peers now work full time on projects originated at the hackathon.

\- Anirudh from Seeing AI team

~~~
ddlutz
Looks like this is only for iOS, any plans on coming to Android?

------
ian0
Wow. You can imagine a near future where this, a small wearable camera and an
earphone could really make a big difference to a person's daily life.

Screw Siri, that's a real AI assistant :)

~~~
jimmcslim
Apparently there is some research and development taking place into a cane
that has a sonar or some similar sensor embedded into its tip, and transmits
haptic feedback to the user via the handle.

These assistive technologies are fantastic, but I wonder whether a
vision-impaired person who adapted to life before they were available would be
wary of adopting them, on the basis that if the technology breaks or becomes
unavailable, they may have lost the skills and sharpness of other senses
needed to compensate?

------
scarface74
The text recognition from documents is amazingly primitive. It doesn't use any
type of spell checking to make a best guess at what a word is. It's straight
text recognition.

On the other hand, the "short text" feature works amazingly well to read text
it sees from the camera. It's fast and accurate when reading text even at some
non-optimal angles.

How do you get it to try to recognize items that the camera sees?

Edit:

Oops. I guess it would help if I swiped right....

~~~
ouid
Why on earth would you want spell check on text-to-speech for the blind? Spell
check is for people writing messages, not for people reading messages.

This is a terrible idea.

~~~
scarface74
I'm not talking about the feature that reads text aloud. I'm referring to the
OCR "document" feature that makes transcription errors that could easily be
fixed if it had any level of text correction.

~~~
ouid
Regardless, text correction is a tool for the writer, not a tool for the
reader. In general, text correction does one of two things: it makes an error
better, or it makes an error worse. Unless you can guarantee that it only does
the first thing, you have no idea what the total value of the operation will
be.

Additionally, presumably its capacity to make an error better is not better
than a human's capacity to make that error better. Even over very small
timescales, tens of seconds, a computer confronted with a typo will never
outperform a human confronted with the same typo.

~~~
scarface74
A simple text correction could tell the difference between the word "rnethod"
and "method" and know one is more likely an error in OCR.
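
To make that concrete, here is a minimal sketch of dictionary-backed correction of common OCR confusions. The `CONFUSIONS` table and the tiny `VOCABULARY` are illustrative assumptions, not taken from Seeing AI or any real OCR engine. It leaves known words alone and only rewrites an unknown word when exactly one repair lands in the vocabulary.

```python
# Hypothetical sketch: dictionary-backed repair of common OCR confusions.
# CONFUSIONS and VOCABULARY are made-up examples, not from a real OCR engine.

CONFUSIONS = [("rn", "m"), ("vv", "w"), ("cl", "d"), ("0", "o"), ("1", "l")]
VOCABULARY = {"method", "modern", "burn", "the", "word", "clear"}


def candidates(word):
    """Yield every string obtained by undoing one confusion at one position."""
    for wrong, right in CONFUSIONS:
        start = word.find(wrong)
        while start != -1:
            yield word[:start] + right + word[start + len(wrong):]
            start = word.find(wrong, start + 1)


def correct(word):
    """Return known words unchanged; otherwise try a single-step repair."""
    lower = word.lower()
    if lower in VOCABULARY:
        return word  # never touch a word that is already valid
    fixes = sorted({c for c in candidates(lower) if c in VOCABULARY})
    # Correct only when exactly one in-vocabulary repair exists.
    # Note: the repaired word is returned in lowercase.
    return fixes[0] if len(fixes) == 1 else word
```

With these tables, `correct("rnethod")` gives `"method"`, while `"modern"` and `"burn"` pass through untouched even though they contain "rn". A text that deliberately contains "rnethod" would be rewritten too, so this does not guarantee that corrections only ever help.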

~~~
ouid
What if it were OCR applied to this conversation?

------
EA
I am recommending this for my elderly family members with poor eyesight. This
could greatly increase their quality of life.

------
GeekyBear
One of my friends from college has limited vision and the feature to read text
aloud will be a game changing convenience.

He has a magnifier in his home, but it isn't portable and is limited to
working only with documents and images that can lie flat.

edit: After speaking with my friend, he already uses a popular app called KNFB
Reader that works very well on short text and documents, but costs $100. On
the plus side, it works on Android or iOS.

------
Finbarr
I'm pretty blown away by this. I took a picture of myself in the mirror with
the scene description feature, and it said "probably a man standing in front
of a mirror posing for the camera". I took a picture of the room in front of
me and it said "probably a living room". Think I'll be experimenting with this
for days.

~~~
extra88
"probably a man standing in front of a mirror posing for the camera"

I would hope it could get this one right, there's a massive amount of training
data to recognize it.

------
rb666
Well, it needs some work, but pretty cool nonetheless, can see where it was
going with this :)

[http://imgur.com/jxsWrEq](http://imgur.com/jxsWrEq)

------
zyanyatech
The use of AI and ML in applications is getting to the point where it can
really be used for problem solving. We built a demo app that uses this
technology in a similar way:
[https://zyanya.tech/hashtagger-android](https://zyanya.tech/hashtagger-android)

I am going to give Seeing AI a try as well, but I totally understand why a
research department would like to have a demo as an Application available for
public.

------
mechaman
Does it do hot dog or not?

------
vinitagr
This is quite an amazing technology. With new products like HoloLens and this,
I think Microsoft is finally coming around.

------
rasengan0
I love the low vision pitch, as there is a dearth of low vision resources,
particularly for those hit with age related macular degeneration. I wonder if
there are any censored items in the backend that may limit functionality --
Seeing AI won't be seeing any sex toys...

------
hackpert
This is amazing. I don't know if a lot of people here realize this, but it is
really hard to pull off this level of integration of different computer vision
components (believe me, I've tried). Microsoft has really outdone themselves
this time.

------
ClassyJacket
Not available in the Australian store. Ah, I forgot my entire country doesn't
exist.

~~~
robotresearcher
They can't recognize things upside down.

------
coworkerblues
Does this mean RIP OrCam (the other startup from Mobileye's creator, which
basically does this as a full hardware / software solution)?

[http://www.orcam.com/](http://www.orcam.com/)

------
parish
I'm impressed. Good job MS

------
leipert
Awesome technology, and today's SMBC [1] seems to be related.

[1]: [https://www.smbc-comics.com/comic/the-real-me](https://www.smbc-comics.com/comic/the-real-me)

------
NicoJuicy
Microsoft's app 'Office Lens' is the only app I use for screenshots of
documents on Android. I see part of that tech is also used in this app to take
a screenshot.

Love it

------
cilea
I think this is also cool for learning English as well. An English learner
who'd like to express what s/he sees can verify with AI's response.

------
MrJagil
The videos explaining it are really nice
[https://youtu.be/dqE1EWsEyx4](https://youtu.be/dqE1EWsEyx4)

------
arized
Looks amazing, I really need to dive into machine learning more this year...
Waiting impatiently for UK release to give it a try!

------
armandososa
Remember this?
[http://i.dailymail.co.uk/i/pix/2013/05/13/article-2323625-19...](http://i.dailymail.co.uk/i/pix/2013/05/13/article-2323625-19C04224000005DC-589_634x510.jpg)
when it was a really big deal that Microsoft agreed to help Apple and release
some software for their platform?

------
blaines
There was a demo of this on the series Bill Nye Saves The World

------
jtbayly
Unimpressed. Took a picture of a ceiling fan and it said, "probably a chair
sitting in front of a mirror." Took a pic of a dresser, and it said "a bedroom
with a wooden floor." Tried the ceiling fan again and got an equally absurd
answer.

Deleted app.

------
LeoNatan25
Whenever I take a picture with the camera button on the left, it shows a
loading indicator and the app crashes. Not a great first impression. Coming
from a company the size of Microsoft, such trivial crashes should have been
caught.

~~~
seeingai
Hi Leo,

I am guessing you are using iOS 11 Beta. We are currently working on support
for iOS 11 beta, and will shortly be updating the app. Stay tuned!

Anirudh from Seeing AI team

~~~
mandeepj
Hi Anirudh,

It's a really commendable effort from MS. An immense benefit for people who
have vision challenges.

Are you guys hiring? If yes then what are the skills you are looking for?

