
Tableau acquires ClearGraph, a data analysis startup using natural language - dgudkov
https://techcrunch.com/2017/08/09/tableau-acquires-cleargraph-a-startup-that-lets-you-analyze-your-data-using-natural-language/
======
sprobertson
I've been working on a (soon to be) open source version of this; I didn't
realize there was a real business version out there. So far it works great
alongside Salesforce e.g. "Find me appointments in San Mateo today set by
Jason Jones".

~~~
cleansy
Nice, do you have any details already? Library / Language? Can't wait to see
some actual efforts in open source in this space.

~~~
sprobertson
It's based on seq2seq translation models, where instead of translating from
one human language to another, it translates a human language to a command
language: "Find appointments in San Mateo" -> "find($type=appointments,
$location=San Mateo)". The code that does that parsing (PyTorch based) is at
[https://github.com/spro/RARNN](https://github.com/spro/RARNN)

The missing piece is generalizing the training process to fit other people's
schemas; generating training data based on models, attributes, and
relationships.
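
As a rough illustration of that generation step (not the RARNN or Nalgene code; the schema, the prefix list, and the function name are made up for this sketch), training pairs could be produced by expanding templates over a schema, so each flat input string comes with its matching command string:

```python
import itertools

# Hypothetical schema and phrasings, invented for illustration only.
SCHEMA = {
    "type": ["appointments", "contacts"],
    "location": ["San Mateo", "Oakland"],
}
PREFIXES = ["find", "find me", "show me"]


def generate_pairs(schema, prefixes):
    """Expand every prefix/type/location combination into a
    (natural-language input, command-language output) pair."""
    pairs = []
    for prefix, type_, location in itertools.product(
        prefixes, schema["type"], schema["location"]
    ):
        source = f"{prefix} {type_} in {location}"
        target = f"find($type={type_}, $location={location})"
        pairs.append((source, target))
    return pairs


pairs = generate_pairs(SCHEMA, PREFIXES)
for source, target in pairs[:3]:
    print(source, "->", target)
```

Fitting someone else's schema would then mean swapping in their models and attributes; handling relationships between models is the part this sketch leaves out.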

~~~
Macuyiko
This is very interesting! How far does the 'missing generalization' impact go?
I.e. your Nalgene grammar file (nice reference, didn't know about Nalgene) is
used to generate both flat input strings and nested desired outputs, which you
use to train your network on (if I understand correctly). This file seems to
contain quite a lot of hand-written varieties like "please/plz/plox/...".
After training the network, I assume it is also capable of handling inputs not
seen in the training data? Like someone writing "please?!11"? If not, I don't
really understand why you'd train a network in the first place: you have put
in the effort to create a grammar, so you might as well use that one to do the
actual conversion to a tree, no?

Totally not trying to be negative here, just trying to understand your
workflow a bit better.

~~~
sprobertson
The words are turned into GloVe word vectors on the input side, so it is able
to handle a decent amount of variation in spelling and using synonyms. Having
synonyms defined in the Nalgene file helps the network to accept vectors in a
general region rather than an exact point in space. It also encourages the
network to learn about the grammar of the input rather than use of specific
words, so it can handle words that it hasn't seen before (good for names,
places).
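
A toy sketch of the "general region" idea, assuming made-up 3-dimensional vectors (real GloVe vectors have far more dimensions and would be loaded from the published files): words with similar meaning sit close together under cosine similarity, so an unseen synonym still lands near words the network was trained on.

```python
import math

# Toy vectors invented for illustration; these are NOT real GloVe values.
TOY_VECTORS = {
    "find": [0.9, 0.1, 0.0],
    "show": [0.8, 0.2, 0.1],
    "locate": [0.85, 0.15, 0.05],
    "appointments": [0.1, 0.9, 0.2],
    "meetings": [0.15, 0.85, 0.25],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(word, candidates):
    """Return the candidate whose vector is closest to the given word's."""
    return max(candidates, key=lambda c: cosine(TOY_VECTORS[word], TOY_VECTORS[c]))


# "locate" was never a training word here, but it sits next to "find".
print(nearest("locate", ["find", "appointments"]))
print(nearest("meetings", ["find", "appointments"]))
```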

~~~
Macuyiko
Gotcha! Thanks for the explanation!

------
bgraves
Saw a pretty sweet demo of Tableau's home-grown prototype at their annual
conference last November. It was surprisingly useful to be able to just speak
"show me all of the 3-bedroom homes in the downtown Seattle area less than
$400,000".

It was slow, but effective. I kept feeling myself wanting to click around for
the first few minutes but quickly realized I didn't need to.

I did have to speak in a way that the NLP engine could understand (e.g.
"four-hundred thousand dollars" instead of "four-hundred k") so it still feels like
I'm building a SQL query with my voice instead of just speaking an idea and
the software figures out what I mean (hard problem to solve, I know!)

~~~
tostitos1979
Is there an open source NLP engine out there? I've been trying to learn this
area and there are so many "pot holes" and wrong paths ... I've looked at
OWL/Sparql, Graph DBs, logic programming, rule based systems. I feel like I'm
dancing around the real topic and I don't know what "it" is :'(

~~~
stephengillie
.NET has some speech synthesis and listener libraries. They work pretty well;
I built a modestly-functional chat bot once with them. Not sure about the
overall .NET licensing arrangement, but I heard it was moving towards open
source.

Though they feel abandoned, and there hasn't been much recent activity around
them. Microsoft probably has all speech engineers working on Cortana instead.
(Though I'd be surprised if she's not using .NET at some level.)

~~~
mycall
> Microsoft probably has all speech engineers working on Cortana instead

Microsoft cognitive services

------
burton32
There's a lot of movement in this space at the moment. I'm aware of the
following players:

\- Veezoo www.veezoo.com

\- Wizdee www.wizdee.com

\- Kueri www.kueri.me

~~~
asavadatti
Pokemon or Data company

------
greggyb
This has been in production with Power BI + Cortana for about a year now.

~~~
dgudkov
Is it any good?

~~~
greggyb
Mostly agree with Baconner.

It does help with discovery if configured well. 'Configured well' is a fairly
high cost, so only worthwhile for a fairly simple data model that will have a
large number of consumers.

Typically, there's not much of a population of "people who are unfamiliar with
the dataset, but need to ask questions of it".

~~~
baconner
I do agree. It can work and when you've put the time into designing it to work
then it can feel fairly magical, but there's a point at which you feel like
you're almost pre-creating all the queries for users.

------
dmix
This is a great combination IMO. I spent some time researching various BI data
analytics services and I was impressed with some of the newer ones like
Tableau. This seems like essential tech for any medium-large company.

Dashboards and visualizations that can be easily composed with a natural
language interface... it makes a lot of sense. Especially when combined with
alerting services and/or chatbot-esque interfaces for automating workflows.

~~~
bgraves
Hmm - I hadn't thought about the dashboard / data viz creation side of this.
Not sure that I'd want to be sitting in my cube creating dashboards "out
loud".

 _" Okay, let's bring in 'Sales' to the rows card._

 _" Nah, I don't like that. Move 'Sales' to the columns card._

 _" Hmm, that doesn't work either. Put 'Sales' back to the rows card but add
'Profitability Indicator' to the details section._

 _" Crap. Still not working. Let's start over."_

Imagine 10 analysts in a room all talking like this :)

~~~
kornish
Well, anything you can say you can also type out.

~~~
baconner
This here. I've spent the last couple of years working on a similar data
discovery style product and after a lot of playing around with concepts I
think semi-natural language descriptions typed and also generated based on
your manual data selection can be really useful.

If I'm speaking I have to finish the whole thought and deal with excluding all
my "uuhms" and half thoughts. If I'm typing I can intellisense prompt for
relevant things. Correlate Sales with _[Discounts, ...]. I think terse natural
language descriptions of data views are really useful aside from voice.
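
A minimal sketch of that kind of prompt, assuming a hypothetical field list and a plain prefix match (a real product would presumably rank by relevance to the data model): take the fragment being typed, skip fields already used in the phrase, and offer the rest.

```python
# Hypothetical field names, invented for this sketch.
FIELDS = ["Discounts", "Profit", "Region", "Sales"]


def suggest(partial, fields=FIELDS):
    """Suggest field names that complete the last word being typed,
    skipping fields already mentioned earlier in the phrase."""
    words = partial.split(" ")
    fragment = words[-1].lower()
    used = {w.lower() for w in words[:-1]}
    return [
        f for f in fields
        if f.lower().startswith(fragment) and f.lower() not in used
    ]


print(suggest("Correlate Sales with "))
print(suggest("Correlate Sales with Di"))
```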

Incidentally nothing like trying to play around with this stuff to make you
super self-conscious about uuh how you speak.

------
claudfuen
Curious to see how the industry will play out.

Seems like more analytics and BI companies will have to incorporate NLP if
they want to compete.

Currently working on a similar project

\- www.askned.com

~~~
cleansy
You mean another player like the company you are a VP of marketing at?

~~~
claudfuen
Absolutely. It's a unique opportunity in the space, it feels like everyone is
racing to market - next 6 months will be interesting to watch.

~~~
kornish
Nice edit to your comment.

Originally, claudfuen wrote something like:

"Wonder if we'll see more acquisitions in this space, like askned.com."

