
Can Watson save IBM? - apsec112
http://www.ft.com/cms/s/2/dced8150-b300-11e5-8358-9a82b43f6b2f.html#axzz3xczqvLeH
======
paulsutter
Watson of course is PR-ware, a sort of flux capacitor for selling seven-figure
consulting services.

IBM does truly deserve credit for making Linux a safe choice for corporate IT.
Few people remember this, but believe me I sat in on IT focus groups in
2001-ish.

Every. Single. Person. In every single session in every city said "there's no
way we could use it in production". In every group, there was one who added
"... but if IBM..." (literally as a sentence fragment like that), then the
whole room would agree out loud.

Spooky. And then IBM came out with a big Linux campaign and the rest is
history. Anyway, that happened.

It doesn't make a business for IBM. But it was a good contribution. That too
was a way for IBM to sell consulting services. Which is all they really are.

~~~
Diederich
Thank you for this reminder, we owe IBM a lot for their early Linux efforts.

I was pushing Linux into production, hard, for a very large company in the
late 90s, with only limited success. It wasn't until 2001, because of IBM,
that the nearly fanatical resistance to Linux started to fade.

------
pesenti
I know that there is a lot of PR around Watson but it is not just PR. There is
a genuine effort to put AI-related technologies in the hands of as many
developers as possible. If you want to check for yourself what we are doing,
please go to:
[http://ibm.com/watsondevelopercloud](http://ibm.com/watsondevelopercloud).

We are trying to expose state-of-the-art technology in speech, dialog, text
analytics, vision, machine translation, etc as a coherent platform. I am happy
to answer (without any PR BS) any question you may have.

~~~
pathsjs
Ok, here is a question. I went to the site and tried the document conversion
service. I gave it three very clean PDF files containing CS articles. In all
cases, what I got in response was "The input document failed to be converted
because An invalid XML character (Unicode: 0x0) was found in the element
content of the document."

Tika is able to parse them without a glitch, and has been doing so for a few
years at least.

So I tried with some sample HTML taken from the main Italian news sites. In
these cases, it parsed it, but returned some garbage HTML tags and no content.
BoilerPipe or similar libraries are able to extract a clean body of the news
in all cases.

How is that state-of-the-art technology?

~~~
pesenti
Thanks for giving it a try. I agree that particular test does not sound like
state-of-the-art. But don't judge the whole package just based on one test of
one API. It'd be great to have access to these files (you can email them to me
at my YC ID @gmail.com). Best would even be to post your experience on our
forum:
[https://developer.ibm.com/answers/smartspace/watson/](https://developer.ibm.com/answers/smartspace/watson/).
We are usually pretty responsive.

~~~
pathsjs
Sorry I was harsh. I tried the translation service and it worked better.

Anyway, I don't remember exactly all the articles I tried, but two of them
were

[https://www.cs.princeton.edu/~chazelle/pubs/mst.pdf](https://www.cs.princeton.edu/~chazelle/pubs/mst.pdf)
[https://www.cs.ubc.ca/~condon/papers/chungcondon96.pdf](https://www.cs.ubc.ca/~condon/papers/chungcondon96.pdf)

while one news article that failed to parse was

[http://www.repubblica.it/economia/2016/02/09/news/borse_9_fe...](http://www.repubblica.it/economia/2016/02/09/news/borse_9_febbraio-133008943/)

I tried other articles and news, but I do not recall each of them exactly.

~~~
mfulgo
Hey pathsjs, sorry for the bad experience... Nevertheless, thanks for the
feedback. TLDR: I pushed a fix for the bad character issue, and those PDFs
should convert now.

The long version: It has to do with the underlying structure of the PDF; some
of the characters in the above PDF have glyphs for display but don't actually
map the characters to code points. So, when we pull out the text, they come
through as invalid characters, which we should have filtered out. This is an
issue we've seen with (all?) PDF viewers; the text you copy from a sentence
isn't always what you expect... But, we're aware of that shortcoming and are
looking at some ways to improve the quality.
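
For illustration only, the kind of filtering described above can be sketched in a few lines of Python. This is an assumption about the general technique, not the service's actual code: the function name `strip_invalid_xml_chars` is hypothetical, and the allowed ranges are taken from the XML 1.0 `Char` production.

```python
import re

# Matches anything outside the XML 1.0 "Char" production:
# tab, LF, CR, U+0020-U+D7FF, U+E000-U+FFFD, U+10000-U+10FFFF.
_INVALID_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_invalid_xml_chars(text: str) -> str:
    """Drop characters (such as U+0000) that an XML serializer would reject."""
    return _INVALID_XML_CHARS.sub("", text)


if __name__ == "__main__":
    # Hypothetical extracted text with NUL characters where glyphs had no mapping.
    extracted = "A minimum\x00 spanning tree\x00 algorithm"
    print(strip_invalid_xml_chars(extracted))
```

Dropping the characters silently loses whatever the unmapped glyphs represented, so this only avoids the conversion error; it does not recover the missing text.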

In regard to the extra content in the news articles, we're not currently
trying to do what BoilerPipe does. If you want to include or exclude specific
content from a page, we have config options to do that via XPaths. Though,
we're always open to ways of improving our services, and incorporating
something like that would probably be useful.
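
As a rough idea of what XPath-based exclusion looks like in general, here is a minimal sketch using Python's standard library. It is not the service's config format: the helper name `exclude_by_xpath` and the sample markup are hypothetical, and it assumes the page is well-formed XML (real-world HTML usually needs a lenient parser first).

```python
import xml.etree.ElementTree as ET


def exclude_by_xpath(root, xpaths):
    """Remove every element matched by any of the XPath expressions, in place."""
    # ElementTree elements do not know their parent, so build a lookup first.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    for xpath in xpaths:
        for element in root.findall(xpath):
            parent = parent_of.get(element)
            if parent is not None:
                # Note: remove() also drops the element's tail text.
                parent.remove(element)
    return root


if __name__ == "__main__":
    # Hypothetical, well-formed sample page.
    page = ET.fromstring(
        "<html><body>"
        "<div class='nav'>Home</div>"
        "<p>Markets fell.</p>"
        "<div class='ad'>Buy now</div>"
        "</body></html>"
    )
    exclude_by_xpath(page, [".//div[@class='nav']", ".//div[@class='ad']"])
    print("".join(page.itertext()))
```

Unlike a boilerplate-removal library, this needs someone to know the page structure in advance and write the expressions per site.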

------
swingbridge
I've asked many IBM employees to explain to me what Watson actually does (like
for real, not the marketing BS that's on TV) that would make me want to buy
it. Nobody could explain it. Literally some were just like "yeah, I think
they're still trying to figure out how they can make a product out of it."

The Jeopardy thing was cool but since then it's turned into this mystery black
box that will supposedly solve all your problems, but nobody can seem to
point to any real successes that would matter to a business. Honestly it seems
mostly like marketing fluff built on some mildly proprietary distributed
computing technology.

~~~
pesenti
You can check and use many things that Watson does by trying our cloud
services:
[http://ibm.com/watsondevelopercloud](http://ibm.com/watsondevelopercloud). I
know there is a lot of PR around Watson. But there are also real efforts and
some state-of-the-art technology in speech, dialog, text analytics, vision,
machine translation, etc. I am happy to answer any question you may have.

------
baldfat
The issue for me is does IBM need to be saved? Their business practices have
continued to be patent-laden and closed. If they went down the road in a more
open nature I feel that they wouldn't be in the place they are.

~~~
iheartmemcache
The last 15 years of IBM has basically been open.

1) They're in the top 5 re: Linux contributors. (Guess who else is in there?
Microsoft.)

2) They dropped their JVM (J9) for OpenJDK - again upstreaming their patches
to the community.

3) Eclipse was founded by them in 2001 (later released into the public
domain).

4) They released (at least part of) their ML system "SystemML" into the public
domain as well, during the same month MS opened theirs and Google opened
TensorFlow.

They aren't patent trolls in the sense of those Texas LLP's who lease a few
hundred square feet and fire off scare-letters. They hold a lot of patents
because they are one of the few companies that actually funds pure research.
They make a ton of money by bilking banks/the feds/states who still run their
RMV on AIX @ 2mill a year in support contracts, etc, who are vendor-locked in.
A not insignificant amount of money goes to:

5) Post-docs from really smart uni's to publish theoretical comp sci work in
more journals than you'd imagine, which guess what? is also public domain!

They're really not hurting the consumer at all. In fact, I honestly don't see
how they could be MORE open.

\------

[1] (And guess who's not upstreaming much at all but making billions off
community-written software? I'll give you a hint, it's not Barnes & Noble! Even
VMware donated back to FreeBSD. Support HN users who actually
pay-it-forward[2] and use Tarsnap[3] instead of S3.)

[2] [https://www.freebsdnews.com/2015/02/25/donate-freebsd-
founda...](https://www.freebsdnews.com/2015/02/25/donate-freebsd-foundation/)

[3] www.tarsnap.com . Not affiliated with Colin but he's been a long-time HNer
who isn't a conglomerate, taking GPL/BSD software, slapping a pretty coat of
paint on it and selling it for huge margins. Pay a few cents extra and support
startups. Karma, guys. < /preachy hippie shit>

~~~
awqrre
> 1) They're in the top 5 re: Linux contributors. (Guess who else is in there?
> Microsoft.)

quantity and quality of contributions doesn't mean that they are the most
useful contributions though

~~~
iheartmemcache
I really don't want to turn this into a Microsoft praise fest but similar to
IBM, they've been a net-zero since about the Ballmer era, and a net-positive
since Satya took over. All of those Microsoft Research papers silently go
unnoticed in Silicon Valley (or are published under the MSR, INRIA // MSR,
Cambridge name and people don't do the math), but here are a few things MS has
been responsible for:

a: Asynchrony -- the async / await paradigm - if you use it in JavaScript it
came from C#, which came from a MSR paper (or maybe a Microsoft Languages
whitepaper.) It took very smart people a very long time to work out the
semantics and execution logistics, and we get to reap the benefits.

b: Haskell -- paying SPJ's salary so he can work on Haskell full-time as a MSR
Cambridge employee and implement a whole bevy of things like STM[1], the
popularization of pure functions (immutable.js users, as well as CLJS)

c: Clojure -- if you use Clojure's STM, borrowed from work funded by
Microsoft. For me, it's used every day in Datomic.

d: Verifiability -- a mind-blowingly large amount of formal verification stuff
- ranging from the first formally verified OS + integrated language, to things
like F*

e: Z3 -- no need to explain this if you're in any sort of reverse-engineering.
I'm sure your life has been made significantly easier since MS released this
into the public domain

f: CoreCLR + the JS engine + their crossplatform 100% free IDE + opening up VS
Community (which is about as good as VS Professional)

g: Probably most importantly -- Well, just go here.
[http://research.microsoft.com/en-
us/groups/rise/default.aspx](http://research.microsoft.com/en-
us/groups/rise/default.aspx). Basically 1 out of 4 things in there is already,
or will be imminently, used by 'mainstream' languages after it moves out of
the "only publishing grad students who eagerly wait for POPL every year" zone.

[1] I know STM comes from that '88 paper. I've read it. SPJ did too and
brought it into the mainstream, so I give him credit.

IBM and Microsoft (and presumably GoogleX, but they haven't been publishing)
are among the few corporations I'm aware of that dedicate more than a few
resources to research-for-the-sake-of-research (n.b. MS has been doing this
silently since 1991, and IIRC employs more than 100 staff just to produce work
that furthers the field of computer science.)

------
IIAOPSW
Why don't they just have Watson play a few more rounds of Jeopardy and use the
winnings to support the company?

~~~
justaman
This would actually make for decent television. Man vs. Machine.

------
freddealmeida
Watson is no longer the system that won Jeopardy. It is a collection of
technologies that have nothing to do with machine intelligence and some that
do. What is clear is that by no means has IBM won the game against FB, Google
or Baidu. In many cases they can't do it. They have invested too much into a
specific technology framework.

But what I believe is that IBM will survive. Their ability to sell will win
over technology.

------
santa_boy
I feel Watson is very marketing-heavy. I've tried their APIs and their
technology is very limited in functionality and accuracy. I have seen too many
videos around "cognitive abilities" but their application at the moment from
trial runs seems to be extremely crude.

------
ksec
IBM will just need to focus on what they are good at. And Cloud Server /
Softlayer just isn't one of them to be honest.

------
rbryan71
[https://en.wikipedia.org/wiki/Betteridge%27s_law_of_headline...](https://en.wikipedia.org/wiki/Betteridge%27s_law_of_headlines)

No, Watson cannot save IBM.

------
awinter-py
with gorilla gone, is there hope for man?

------
douche
IBM is not a technology company - it is a technology holding company.

Watson has been around for a fair number of years; almost five years since the
Jeopardy stunt. Aside from the story about Watson turning into a foul-mouthed
asshole after being exposed to Urban Dictionary, I've not seen much that is
particularly impressive out of it.

