

Wolfram Alpha: First-Hand Impressions - hzzk
http://thenoisychannel.com/2009/03/31/wolfram-alpha-first-hand-impressions/

======
hendler
Without seeing a demo, it's hard to say anything factual - but in principle,
the "problem" in any well-designed/near-perfect system is still human beings.
(context, language, HCI, "true", "news" etc) So while it is possible that
Wolfram Alpha has successfully modeled a completely new area of
search/information modeling - more likely, WA is another novel way to parse
language and information.

Also, Freebase and Wikipedia might be the competitors (I agree with Nova Spivack)
- but Wolfram Alpha won't be a panacea until it's open source
data/technology. The wisdom of the crowds is the only way to model... the
wisdom of crowds.

~~~
dtunkelang
Actually, they're not really about natural language--that's what I'm trying to
get across in my post. They're more about a place to store and use
_structured_ data and access it formally. That they have an NLP interface
right now is, in my mind, a kludge.

And they aren't taking an open source / wisdom of crowds approach. Rather, they
are taking a very retro curated data approach, more Britannica than Wikipedia.
I'm a huge Wikipedia fan (see my recent post reacting harshly to Nick Carr's
calling Wikipedia a "Potemkinpedia"), so my reaction was highly skeptical.
But, for objective data like the populations of countries and atomic weights
of elements, curated works just fine. In fact, for structured data, I suspect
curated is probably the way to go.

------
dtunkelang
Am very curious to hear from others who have seen Wolfram Alpha in action.
They did a lot to address my initial skepticism (which was extreme--see my
earlier post), but it's no slam dunk by any means.

~~~
hschenker
A biz dev guy from Wolfram gave me a walkthrough recently (no NDA required,
which was nice) - he had a big text list of queries that he ran through it,
and although I didn't try to stump him with my own suggestions, the results
returned seemed reasonable, if you weren't expecting an exact replica of a
Google results page, and particularly if you had a rough idea of what
information sources it contained, and didn't ask questions outside those
bounds.

Some examples seemed gimmicky and of questionable value (e.g. dividing the
price of gold by the price of silver) - but given the right context, I'm sure
it could return useful results for more realistic queries as well.

In one instance, he used the wrong tense ("is" instead of "was") in a query
("How old is Barack Obama in 1967" or something like that), and no results
were returned - but the error message wasn't smart enough to suggest why it
didn't work.

One big challenge will undoubtedly be to "curate," as they say, enough common
data sources to make it useful _nearly all_ of the time...rather than just
some, or even most, of the time. I think there's a threshold of usefulness
(measured in percentage of successful queries - perhaps 98% or more) that
Wolfram will have to reach if it intends not to languish as a mere curiosity
like Cuil. Given that they've already got 200 people doing curation of just
their existing data sources, and they plan to add many more sources, that
seems like a pretty significant financial barrier to cross - I don't know
whose pocketbook got the project to this stage, or how they intend to continue
before it starts generating income.
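
To make the "threshold of usefulness" idea concrete, here is a minimal sketch of the arithmetic. The 98% figure is just the guess from the paragraph above, and the function names and the list of query outcomes are hypothetical, not anything Wolfram has published:

```python
def success_rate(outcomes: list[bool]) -> float:
    """Fraction of queries that returned a useful result."""
    if not outcomes:
        raise ValueError("need at least one query outcome")
    return sum(outcomes) / len(outcomes)


def clears_threshold(outcomes: list[bool], threshold: float = 0.98) -> bool:
    """True if the success rate meets the (assumed) usefulness threshold."""
    return success_rate(outcomes) >= threshold
```

With 100 logged queries, 3 failures would put it below a 98% bar, while 1 failure would clear it.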

If Wolfram Alpha IS a success, it seems like it could change the behavior of
web users in two ways:

1\. People will start expecting a higher degree of authoritativeness from
information they refer to on the web. For instance, if you're making a
decision with a committee, and there's an email thread going around that sums
up facts upon which the decision will be based, nowadays those facts might
include links to "non-expert" sources like Wikipedia. If you have the option
of linking to the same information in Wolfram Alpha, you'll probably choose to
link to Wolfram instead, since you know the information has been vetted by
paid experts. There also appeared to be a "References" link at the bottom of
each results page that had an AMA-citation-style list of sources. Although
Wikipedia also often links to sources, it's arguably more of a crap shoot as
to whether you should trust the source, the editor referring to the source, or
even that the reference was made in the right context.

2\. People will start using Wolfram as the single (or at least the first)
place that they go to for factual, authoritative information. For example,
nowadays, if I want authoritative information about a stock, I can go to Yahoo
Finance and pull up the financial details - and be reasonably certain that
it's accurate and trustworthy. However, if I also need to refer to another
type of data - for example, official population data for a region - I'd have
to find another authoritative source - perhaps the CIA world factbook. Instead
of having to remember multiple sources, it's much easier to go to a single,
trusted source for all types of factual data.

~~~
dtunkelang
Sounds like we may have talked to the same guy. I agree that breadth of data
is a challenge, and I didn't have the chance to push its limits. Though I was
surprised that it knew about acids but not about their pH levels. That might be
ok--and certainly fixable if it matters to enough people--but it's a reminder
that there will always be holes.

It is interesting to imagine an approach like this raising the bar for
information quality / provenance on the web. That would be a delightful
outcome wrt information accountability.

------
derefr
The is_a(china, country) example suddenly made Alpha "click" for me: it's the
"general knowledge" database that you feed to semantic web programs to
correlate their domain-specific knowledge to.
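
A rough sketch of what I mean, in Python. Everything here is an assumption for illustration: the fact table, the `is_a` helper, and `entities_of` are hypothetical and not an interface Alpha is known to expose.

```python
# Hypothetical general-knowledge facts stored as (subject, category) pairs.
# Illustrative only; this is not Wolfram Alpha's actual data model.
IS_A_FACTS = {
    ("china", "country"),
    ("gold", "element"),
    ("silver", "element"),
}


def is_a(subject: str, category: str) -> bool:
    """Return True if the store records that subject belongs to category."""
    return (subject.lower(), category.lower()) in IS_A_FACTS


def entities_of(category: str) -> list[str]:
    """List every known subject in a category, sorted for stable output."""
    return sorted(s for s, c in IS_A_FACTS if c == category.lower())
```

A domain-specific program could then call `is_a("china", "country")` to decide how to interpret its own records about China, rather than carrying that general knowledge itself.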

~~~
dtunkelang
Yeah, that's at least how I saw it. And I think that message is being lost in
the rush to see it as the next Powerset. I know Powerset and NLP generally
get lots of hype, but I think that sort of positioning for them would be a
death sentence.

------
jmtame
i had no idea until about a month ago that these guys are all located down the
street from me. cool to see them getting coverage considering we're in the
middle of a corn field. </random>

