
Microsoft Cognitive Services - e0m
https://www.microsoft.com/cognitive-services
======
e0m
This demo by Saqib Shaikh (a visually impaired MSFT engineer) is an incredibly
compelling use case of image annotation APIs.

I've done work with the Braille Institute in San Diego, and access to a device
like this would truly be world-changing for any of the people there.

[https://twitter.com/Microsoft/status/715234653938933761](https://twitter.com/Microsoft/status/715234653938933761)

~~~
ndarilek
He mentions some smart glasses in that video, but says it kinda quickly and I
can't ID them. Anyone know what they are? As a blind developer myself, I've
pretty much been wanting to build this kind of app for a while. Even if facial
recognition may not be good enough to identify someone accurately by name, for
instance, just hearing "face, blue eyes, blond hair, male, 80% confidence" or
something similar would give me a bunch of additional data points to identify
someone when they say hello. I don't even know what color my own girlfriend's
hair/eyes are. :P I'd love a set of these to develop my own apps with.

But this appears to blow all my ideas out of the water. I wonder how much work
they had to do to obtain clean input (i.e., how many misidentifications we
didn't see).

Edit: Ah, I guess it's these:
[http://www.pivothead.com/](http://www.pivothead.com/). SDK and dev kit
pricing not available unless I fork over my details. If I have to ask how much
it costs, I can't afford it.
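The spoken summary described above ("face, blue eyes, blond hair, male, 80% confidence") could be assembled from a face-detection response. A minimal sketch, assuming a hypothetical response shaped loosely like the Face API's output; the attribute names here are illustrative, not the real schema:

```python
import json

# Illustrative response; field names are assumptions, not the actual
# Face API schema -- check the API reference before relying on them.
sample = json.loads("""
[{"faceAttributes": {"gender": "male",
                     "hair": {"hairColor": [{"color": "blond",
                                             "confidence": 0.8}]}}}]
""")

def describe(face):
    """Turn one face record into a short spoken-style summary."""
    attrs = face["faceAttributes"]
    # Pick the hair color the service is most confident about.
    hair = max(attrs["hair"]["hairColor"], key=lambda c: c["confidence"])
    return "face, {} hair, {}, {:.0%} confidence".format(
        hair["color"], attrs["gender"], hair["confidence"])

print(describe(sample[0]))  # face, blond hair, male, 80% confidence
```

A screen reader or TTS engine could then speak that string as each face is detected.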

------
hbt
Something I'm starting to realize with all these AI APIs is how difficult it
is to evaluate and compare the results.

How do you know their algorithms are better than the competition? The
marketing buzzwords are all there. "Powerful algorithms", "never seen anything
comparable" etc.

The worst part is the algorithms might work well with certain data sets but
fail miserably with others.

~~~
vikiomega9
If you had an internal API rather than an external one, you'd have some sort
of A/B or multi-armed-bandit testing in place to check that the deployed
model is performing as expected. You still wouldn't know the theoretical best
the model could achieve without computing some sort of complexity measure,
but you could measure relative performance. I'd assume you would use a
similar automated process for external APIs; I don't see why they have to be
different per se.
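The evaluation process described above boils down to scoring each black-box service against the same labeled sample. A minimal sketch, where `api_a` and `api_b` are stub stand-ins for real external API calls:

```python
# A small labeled evaluation set; in practice this would be a held-out
# sample representative of your own data distribution.
labeled = [("img1", "cat"), ("img2", "dog"), ("img3", "cat"), ("img4", "dog")]

def api_a(item):
    # Stub for external service A's prediction.
    return {"img1": "cat", "img2": "dog", "img3": "dog", "img4": "dog"}[item]

def api_b(item):
    # Stub for external service B's prediction.
    return {"img1": "cat", "img2": "cat", "img3": "cat", "img4": "cat"}[item]

def accuracy(predict, data):
    """Fraction of labeled items the given predictor gets right."""
    return sum(predict(x) == y for x, y in data) / len(data)

print(accuracy(api_a, labeled))  # 0.75
print(accuracy(api_b, labeled))  # 0.5
```

The same harness works whether the predictor is an in-house model or a vendor API, which is the point: the comparison only depends on your labeled data, not on the vendor's marketing claims.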

------
romaniv
I have a bittersweet feeling about all these APIs. On the positive side, this
is something we can take and use, possibly saving ourselves tons of time on
some routine tasks. On the negative side, this particular flavor of AI-as-a-
service from a big company like Microsoft means that algorithms will be
reduced to a black box which you don't really understand or control in any
meaningful way.

------
travjones
Maybe this is nit-picky, but the use of "Cognitive" in the name seems odd.
This service relies on applied maths (e.g., machine learning), not
"cognition." I guess it was a marketing decision?

~~~
azakai
The APIs perform cognitive tasks, like speech to text, identifying objects,
and so forth. Yes, underlying it is applied math, but what they actually
perform are things that (for the most part) only humans are cognitively
capable of.

~~~
travjones
I disagree. Basic research in behavior analysis has shown that nonhuman
animals can perform discrimination and categorization, in vision and other
sensory modalities, among a number of trained and untrained stimuli ([0] and [1]
are just two examples). I don't think these APIs do anything that is uniquely
human or "cognitive." Nonetheless, this API still has enormous value for
developers and users. I'm not discounting that.

UPDATE:

0:
[http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1334394/pdf/jeab...](http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1334394/pdf/jeabehav00221-0041.pdf)

1: [http://kops.uni-
konstanz.de/bitstream/handle/123456789/20501...](http://kops.uni-
konstanz.de/bitstream/handle/123456789/20501/Delius%20Categorical.pdf?sequence=1&isAllowed=y)

~~~
azakai
Processing human language is definitely a human skill, and many of the image
categorization tasks would be as well (while animals have great visual
systems, they categorize differently than we do; it's not clear whether cats
see "cars", for example).

In any case, the word "cognitive" is used as in "cognitive science",

[https://en.wikipedia.org/wiki/Cognitive_science](https://en.wikipedia.org/wiki/Cognitive_science)

~~~
travjones
I guess it's kind of a semantic argument. I stand corrected. It seems the
culture accepts "cognitive" in AI according to the wikipedia link you
provided. That said, I question the degree to which the use of "cognitive" in
marketing materials contributed to this acceptance.

On the topic of "processing," it's very difficult (sometimes impossible) to
know what humans or nonhumans are doing covertly when they "process" stimuli.
Thus, I'm not sure how you can determine that "they [animals] categorize
differently than us."

A dog that is trained to sit when his owner says, "sit" must be "processing
human language" when the dog sits. Right? I know this is an oversimplified
example, but I'm just making a point that calling things "cognitive" and
referring to "processing" doesn't help explain what is really happening.
Sometimes these terms obscure a description/explanation. Skinner [0] makes
this point a lot better than me.

[0]:
[http://carboneclinic.com/portal/conferences/files/IESCUM%20D...](http://carboneclinic.com/portal/conferences/files/IESCUM%20Dec%202014%20SENT/new%20readings/Operational.pdf)

~~~
pedrosorio
I think when it comes to cognitive ability of animals versus humans we are at
the same stage as biology before Darwin - assuming hard boundaries that don't
really exist.

------
bawana
The links to the python notebooks fail.

Here in the emotion API, look for the link to the Jupyter notebook on GitHub:
[https://www.microsoft.com/cognitive-services/en-
us/emotion-a...](https://www.microsoft.com/cognitive-services/en-us/emotion-
api/documentation/getstartedwithpython)

Here in the vision API, same thing: the Jupyter notebook link is a 404.
[https://www.microsoft.com/cognitive-services/en-us/face-
api/...](https://www.microsoft.com/cognitive-services/en-us/face-
api/documentation/get-started-with-face-api/getstartedwithpython)

I know, real coders use C++, C#, or objC. But I need to understand something
before I code it.

~~~
overbrimming
Hi! Good catch. We're fixing the links in the docs. In the meantime:

[https://github.com/Microsoft/ProjectOxford-
ClientSDK/tree/ma...](https://github.com/Microsoft/ProjectOxford-
ClientSDK/tree/master/Emotion/Python)
[https://github.com/Microsoft/ProjectOxford-
ClientSDK/tree/ma...](https://github.com/Microsoft/ProjectOxford-
ClientSDK/tree/master/Face/Python)
[https://github.com/Microsoft/ProjectOxford-
ClientSDK/tree/ma...](https://github.com/Microsoft/ProjectOxford-
ClientSDK/tree/master/Vision/Python)

Uh, and "real coders" also use Python... :)
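For anyone who wants to poke at the Emotion API before working through the notebooks, here is a stdlib-only sketch of building the request. The endpoint and header names follow the Project Oxford documentation of the time, but treat them as assumptions to verify against the current reference, and substitute your own subscription key:

```python
import json
import urllib.request

def build_emotion_request(image_url, key):
    """Build (but do not send) a POST request for the Emotion API.

    Endpoint and header names are taken from the Project Oxford docs;
    verify them against the current API reference before use.
    """
    body = json.dumps({"url": image_url}).encode("utf-8")
    return urllib.request.Request(
        "https://api.projectoxford.ai/emotion/v1.0/recognize",
        data=body,
        headers={"Content-Type": "application/json",
                 "Ocp-Apim-Subscription-Key": key})

req = build_emotion_request("https://example.com/face.jpg", "YOUR_KEY")
# urllib.request.urlopen(req) would return JSON with per-face emotion scores.
```

Passing the image by URL keeps the example simple; the API also accepted raw image bytes with an `application/octet-stream` content type.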

~~~
terhechte
Hey, I've signed up for a test account, and I'm trying out the Computer Vision
API. However, the three properties I'm most interested in (which are also
featured on the example page [1]) are not being returned (or documented!) by
the actual SDK: "Description", "Tags" and "Celebrities".

Are these coming soon? I'd love to develop something with the Vision API, but
I'd rather not continue until I know if / when my requests will indeed return
these features.

Cheers & Thanks

[1] [https://www.microsoft.com/cognitive-services/en-
us/computer-...](https://www.microsoft.com/cognitive-services/en-us/computer-
vision-api)

~~~
overbrimming
These features are here now! We're fixing a broken link to an old version of
our API reference. In the next hour or so, you can check here for the version
of the documentation as of 3/30/16:

[https://dev.projectoxford.ai/docs/services/56f91f2d778daf23d...](https://dev.projectoxford.ai/docs/services/56f91f2d778daf23d8ec6739/operations/56f91f2e778daf14a499e1fa)

For the celebs category, you may also want to read:

[https://www.microsoft.com/cognitive-services/en-
us/computer-...](https://www.microsoft.com/cognitive-services/en-us/computer-
vision-api/documentation/domain-specificmodels)

Thanks for checking out the APIs!
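For reference, requesting those features comes down to query parameters on the analyze call. A sketch of building the URL, assuming `visualFeatures` takes the feature names ("Description", "Tags") and celebrities come via the `details` domain-specific-models parameter; verify the exact names against the documentation linked above:

```python
from urllib.parse import urlencode

# Endpoint and parameter names follow the Project Oxford docs of the time;
# treat them as assumptions to check against the current API reference.
BASE = "https://api.projectoxford.ai/vision/v1.0/analyze"

def analyze_url(features, details=None):
    """Build the analyze-endpoint URL for the requested visual features."""
    params = {"visualFeatures": ",".join(features)}
    if details:
        # Domain-specific models (e.g. "Celebrities") go in 'details'.
        params["details"] = ",".join(details)
    return BASE + "?" + urlencode(params)

url = analyze_url(["Description", "Tags"], details=["Celebrities"])
print(url)
```

The response would then carry `description`, `tags`, and celebrity annotations alongside the other analysis results.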

~~~
terhechte
Awesome, thanks! Works great!

------
mark_l_watson
The APIs look good. I have used Microsoft's search API for years and have had
customers also use it. I like the business model of a few free calls per day,
with a relatively low cost for more API calls. Things that are 100% free make
me nervous because they are probably more likely to go away in the future.

------
bhouston
So many of these will be replicated by free open-source libraries. Thus I am
not sure that most of these services are that defensible in terms of pricing,
if there is a premium over AWS base costs.

~~~
jxy
As time goes on, libraries may not be the fundamental issue. Hardware, on the
other hand, may become the decisive factor: an ever-larger fraction of what
you pay will go to energy consumption.

