
DeepText: Facebook's text understanding engine - adwmayer
https://code.facebook.com/posts/181565595577955/introducing-deeptext-facebook-s-text-understanding-engine/
======
iamdave
Slightly contrarian, or maybe not, but is anyone else sort of fatigued with
social media trying its hardest to do it all for you? Algorithmic timelines
telling me what happened while I was gone (trust me, Twitter... I would have
gotten to it eventually), prioritizing what shows up (trust me, Facebook, I do
a very good job hiding content on my own), telling me what I ought to buy
(trust me, Amazon, if I wanted it, I'd either have bought it already or I'm
buying it later when I have the disposable funds), and now arranging
transportation when I hint at going out and about (trust me, Facebook
Messenger, I live seventeen steps away from a bus stop that carries me right
into the heart of downtown. I'm good).

I have no doubt that these features are probably loved by some, maybe even
_most_, but more and more, while I don't want to disconnect FULLY from social
media (I much love the laconic, brevity-inherent design of Twitter, and the
fact that the 140 character limit is going away has me sighing heavily), I do
sometimes find myself wishing I could opt out and take a bit more control
over the content I'm ostensibly subscribing to.

Has anyone else felt similarly, or could maybe phrase the phenomenon better
than I have?

~~~
amelius
What scares me about this is the "bubble" it creates, see [1]

[1]
[https://en.wikipedia.org/wiki/Filter_bubble](https://en.wikipedia.org/wiki/Filter_bubble)

(even mentions Facebook's news-stream)

~~~
smoe
In my opinion the bigger problem is when people are not aware that they're in
a bubble. Bubbles existed even before social media and internet searches. E.g.,
when I was a kid/teen, I got most of my information about the world from one
local newspaper and one evening news show on TV. Yes, their "newsfeed" wasn't
generated by algorithms. But still, someone decided for me what was
relevant and what was not. Nowadays I have access to many more sources and
different points of view, as long as I am aware that they exist and know it's
up to me to look them up. It's a similar thing if you're part of a strongly
opinionated subculture. It is hard to even realize how limited your horizon
might be.

So I try to embrace the bubble when it is useful for me. I appreciate that
when I type "something something python" into Google the top results are
relevant to my work and not about snakes or comedians. But when I want to
form a political opinion I use DuckDuckGo and actively look for contrary
positions.

~~~
marlag
Common man is brought up within the bubble of their parent's beliefs.
Intelligent man breaks out of that bubble, once he find it too constraining.
Unintelligent man does not. Unintelligent man also have trouble identifying
bubbles.

Earth provide opportunities for success for both men. In the long run, one of
them will dominate the other.

~~~
matt4077
Darwin speak better. Darwin know grammar.

------
nl
Well that's a pretty disappointing post: "Hey, we use our deep learning tools
on text. Look at this 12 month old paper".

Everyone in the field has read that paper. It was good work! But there are
lots of intriguing things mentioned in the post which deserve further details.

The most interesting thing to me is "more than 20 languages"!! That's pretty
nice - the paper had some early results for Chinese, but if it can perform
similarly to the English results across 19 other languages that is probably
the state-of-the-art for many of them.

------
wslh
Yesterday I mentioned vodka to my wife in Facebook Messenger and a few minutes
later she saw a vodka advertisement on her timeline. Obviously Messenger will
never include end to end encryption.

~~~
mandeepj
Are you serious? They had to decrypt the text so that your wife could read it,
and that is where they started displaying vodka ads.

edit - I meant to say that even if they had end-to-end encryption, your text
would still need to be decrypted so that the other user can read it.

~~~
guelo
End-to-end encryption means the server doesn't know the content of the
message. Only the clients, the end-points, have the decryption key.
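
To make that concrete, here is a toy sketch of the idea, not how Messenger or any real protocol works. It assumes the two clients already share a random one-time key out of band, and it uses a plain XOR pad only because it fits in the standard library. The `Server` class and `xor_bytes` helper are hypothetical names made up for this illustration.

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data with a key of at least the same length (toy one-time pad)."""
    if len(key) < len(data):
        raise ValueError("key must be at least as long as the message")
    return bytes(d ^ k for d, k in zip(data, key))


class Server:
    """Hypothetical relay: it stores and forwards bytes but never holds a key."""

    def __init__(self):
        self.seen = []

    def relay(self, ciphertext: bytes) -> bytes:
        self.seen.append(ciphertext)  # all the server ever gets to look at
        return ciphertext


message = b"want some vodka tonight?"

# Assumption: both end-points obtained this key without the server's help.
shared_key = secrets.token_bytes(len(message))

server = Server()
ciphertext = xor_bytes(message, shared_key)   # sender encrypts on the client
delivered = server.relay(ciphertext)          # server only relays ciphertext
received = xor_bytes(delivered, shared_key)   # recipient decrypts on the client

print(received.decode())
```

The decryption still happens, as the parent says, but only on the recipient's device. Anything that wants to read the plaintext, ad targeting included, would have to run on one of the end-points.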

~~~
lucb1e
And then the endpoints integrate with ad networks after they've distilled it
into topics.

I mean, my message contents are personal data. The topics "buy a bike" and
"sell a bike" are not (I mean, not by themselves), so those you can transmit
to ad companies.

Now of course you need to make the detection lightweight enough that you don't
need to talk to the server anymore, but that's just a matter of time. The
person you're responding to, who got downvoted, has a point (if this was his
point).

Edit: I sounded like I agreed with it, which is not the case. Even if it's
stored "private and securely" at Facebook, they would still be able to connect
my topics of interest with my identity and contacts, which would "technically"
comply with the concept of end to end encryption (only metadata leaks because
topics alone are metadata, which always happens because you need to route
messages)... but which still doesn't comply with my own definition of proper
privacy.

~~~
chillacy
I'm not so sure about that. If you were talking about subjects that might be
disagreeable to people in your country, maybe it's sex toys or homosexuality,
if those topics start coming up in ad networks then that's still going to be a
problem.

~~~
lucb1e
Oh, shit, I made one fatal mistake writing that post: it now sounds like I
agree with it.

This is just how marketing would sell these features to the world (end to end
encryption, and yet targeted ads), but disclosing each topic I ever discuss
with a given person is still private in my opinion.

I've edited my post's original text slightly, adding a bigger edit at the
bottom.

------
Guildpact
I wonder if this is in response to Google's open sourcing of Parsey
McParseface:
[http://googleresearch.blogspot.com/2016/05/announcing-syntax...](http://googleresearch.blogspot.com/2016/05/announcing-syntaxnet-worlds-most.html)

~~~
buskila
Or Microsoft's language parser:
[https://www.microsoft.com/cognitive-services/en-us/language-...](https://www.microsoft.com/cognitive-services/en-us/language-understanding-intelligent-service-luis)

~~~
deet
Facebook already has Wit.ai which is extremely similar to Microsoft's LUIS,
and in some ways more advanced.

------
justsaysmthng
> Understanding the various ways text is used on Facebook can help us improve
> people's experiences with our products

You mean show us more "relevant" ads? Last time I checked, Facebook's
"product" was advertising. Thanks, this is exactly what I miss in my life -
more ads for products and services I don't need.

So excited that Facebook is going to understand everything I'm talking about
privately with my "friends" and sell my identity to more ad buyers, who'll
design more subliminal ads to squeeze the last millisecond of what's left of
my attention span.

------
Xeoncross
Is this the paper they said they based this on? "Text Understanding from
Scratch": [http://arxiv.org/abs/1502.01710](http://arxiv.org/abs/1502.01710)

Here is example code for character-level text classification using
convolutional networks in Torch 7:
[https://github.com/zhangxiangxiao/Crepe](https://github.com/zhangxiangxiao/Crepe)
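
For anyone who hasn't seen the character-level approach, the input side is roughly this: each character becomes a one-hot vector over a fixed alphabet, and the text is cut or zero-padded to a fixed length before it goes into the convolutional layers. A minimal sketch of just that encoding step follows. The alphabet and `max_len` are my own small placeholders, not the values used in the paper or in Crepe, and `one_hot_encode` is a hypothetical helper name.

```python
import string

# Assumed toy alphabet: lowercase letters, digits and a few symbols.
ALPHABET = string.ascii_lowercase + string.digits + " .,!?'$"
CHAR_INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def one_hot_encode(text, max_len=32):
    """Return max_len rows, each a one-hot vector over ALPHABET.

    Characters outside the alphabet and padding positions become all-zero rows.
    """
    rows = []
    for ch in text.lower()[:max_len]:
        row = [0] * len(ALPHABET)
        idx = CHAR_INDEX.get(ch)
        if idx is not None:
            row[idx] = 1
        rows.append(row)
    while len(rows) < max_len:
        rows.append([0] * len(ALPHABET))  # zero-pad short inputs
    return rows


encoded = one_hot_encode("Bike $200", max_len=16)
print(len(encoded), len(encoded[0]))
```

The network itself (stacked 1-D convolutions and pooling over these rows) is left out here. The linked repo is the place to look for that.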

------
gmantom
So I mean this is great but are they open sourcing it?

Is there a way for us to play with it? Or are they just bragging?

------
return0
I think FB has a strategy here that plays along well with their Messenger (and
their emphasis on bots). Imagine the example they give, "I would like to sell
my old bike for $200, anyone interested?", being matched with another user who
said "I am looking for a cheap bike nearby". This plays right into Google's
territory, with the ability to organize and match people's intentions.

------
samwestdev
Everything is deep nowadays. Somebody should do a parody product called
DeepBalls

~~~
WiseWeasel
Makes me wonder what the next AI industry buzzword will be. After Big and
Deep, maybe comes Fine or Narrow, Tactful, and finally Aware.

------
crypto5
Looks like just a text classifier, which classifies text snippets by simple
predefined intents.

~~~
chlestakoff
Indeed.

------
derEitel
I would be interested to learn more about their research behind it. Will there
be any papers released? Specifically: how much data did they train on? How
many machines for how long? How did the different NN architectures perform?

I guess those things would be fine to share without revealing the inner
workings.

~~~
visarga
Results like the ones they published today have become pretty standard in
academic papers. I was wondering why there weren't any cutting edge
applications of deep learning in text, as a product. It's probably too
expensive to run the necessary hardware to use the latest and greatest ideas.

------
ilyaeck
Are they actually open-sourcing it? There doesn't seem to be a link to the code
in sight.

------
etiam
Now we're really well on the way into DeepSh*t.

When these things get just a little bit better, the Five Eyes agencies will
suddenly not have a staffing problem any more.

------
hellofunk
I feel like the more AI gets into the middle of our online interactions, then
the more AI will try to steer those interactions. There is already plenty of
evidence that our online behaviour is strongly influenced by algorithms, and I
suspect this is just the beginning. One reason why I log onto FB maybe once
every couple months; it just feels like I'm being manipulated.

~~~
mbrock
Eben Moglen:

> Facebook is strip-mining human society. Watching everyone share everything
> in their social lives and instrumenting the web to surveil everything they
> read outside the system is inherently unethical.

> But we need no more from Facebook than truth in labelling. We need no rules,
> no punishments, no guidelines. We need nothing but the truth. Facebook
> should lean in and tell its users what it does.

> It should say: "We watch you every minute that you're here. We watch every
> detail of what you do. We have wired the web with 'like' buttons that inform
> on your reading automatically."

> To every parent Facebook should say: "Your children spend hours every day
> with us. We spy upon them much more efficiently than you will ever be able
> to. And we won't tell you what we know about them."

> Only that, just the truth. That will be enough. But the crowd that runs
> Facebook, that small bunch of rich and powerful people, will never lean in
> close enough to tell you the truth.

------
mark_l_watson
It is useful to hear what architectures they are using (e.g., BRNNs) and I
appreciate the pointer to the original paper.

I was hoping for some open source code to read, or more detail on their
models. Facebook is pretty good at open sourcing things, so hopefully more
papers and open source software will be released, as far as it makes sense for
FB to do that.

EDIT: typos

------
EGreg
Are they going to open source it so that everyone else can use DeepText?

~~~
_audakel
It seems like a lot of big companies are open sourcing their deep learning
stuff. Part of me wonders if they do it because, as people get scared of AI,
the federal govt can't come after lots of small companies using the
open source AI to regulate/fine/shut them down. They could come after just
Google or FB once people start losing lots of jobs to AI, if they were the only
kids on the playground.

~~~
derEitel
They are mostly releasing their tools though. The raw diamonds are the trained
network weights, the architecture and the data itself. We rarely see releases
of those parts from private companies.

------
lucb1e
My Facebook account was for public posts and private messages only, to
force myself not to post, as if it were private, anything that Facebook might
consider public. This just made me give up on the company altogether.

Fun that you guys can train a neural net to deduce what I like better than me,
but as they said at the CCC conference, everyone has to decide for themselves
how close they are with their machines (though that was in the context of
taking your laptop to the toilet rather than leaving it unattended).

------
pookieinc
Ya know, I normally don't believe in these "Facebook is listening into your
conversations"-type voices and opinions because it seems like such an invasion
of privacy and the company wouldn't do something like _that_ without
permission. However, after a recent episode where I _know_ Facebook was
listening to my conversation[1], I'm now much more nervous based on this
machine learning news. I've become highly critical and pessimistic when
Facebook says things like "Better understanding people's interests" and "New
deep neural network architectures", not so much because of what their intent
with the data is, but how they receive that data. It's done so sneakily, this
is what I disagree with. I normally don't have a problem with Google reading
my emails and suggesting news topics on Google Now + my flights, etc., because
they ask if they can, but with Facebook, there's an air of creepiness that
they are always listening.

[1]: This past weekend, a close friend was telling me about his friend who
works at Google and the following morning, this Google person came up on my
"Friends suggestions". It was the most bizarre thing... I also confirmed with
my friend if he looked this Google guy up later on Facebook, which may have
prompted his Google friend to show up on mine, and he said he didn't touch
Facebook after our conversation.

EDIT: For further clarity.

~~~
cyphar
Why did you think Facebook wouldn't do that? In their view, any content you
provide to their site becomes their property. Why wouldn't that include
conversations? Considering the fact that everything is unencrypted once it
reaches Facebook, it's a mistake to assume good faith on the part of a company
that makes its living trying to make you spend more of your life on their
website.

~~~
pookieinc
Any content I provide to their site becomes their property is something I
understand. However using my mic without specifically asking permission is
something else. Yes, I do have their app, but even then, their app on my
iPhone does not make my iPhone their property. In all honesty, I actually
thought that this was something that phones would prohibit (I use an iPhone),
but I was definitely mistaken and surprised by it.

Having said all this, it was actually in the news[1] yesterday that Facebook
will soon be providing end to end encryption, though I doubt it'll apply to
listening in on people's conversations.

[1]:
[https://www.theguardian.com/technology/2016/may/31/facebook-...](https://www.theguardian.com/technology/2016/may/31/facebook-messenger-bot-encryption-secure-messaging)

------
cromwellian
Isn't "text understanding engine" a poor description of what is basically a
classifier?

I'd expect a text understanding engine to do more than classify, I'd expect it
to read a bunch of sentences in context and then be able to answer questions
about it:

John was walking down the stairs. John tripped and fell. John lies on the
floor.

Describe John's status: Is he hurt or in pain? Is he standing?

~~~
nl
Facebook has one of the leading teams doing deep learning question
answering. Jason Weston et al. wrote a paper introducing a machine-generated
textual dataset[1] similar to what you describe, e.g.:


    Daniel picked up the football.
    Daniel drops the football.
    Daniel got the milk.
    Daniel took the apple.
    How many objects is Daniel holding?


There have been plenty of other papers published by them and others tackling
that and similar but harder problems, e.g. [2].

[1] [http://arxiv.org/pdf/1502.05698v10.pdf](http://arxiv.org/pdf/1502.05698v10.pdf)

[2] [http://arxiv.org/abs/1511.02301](http://arxiv.org/abs/1511.02301)
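
To show what the task is asking for, here is a hand-written state tracker that answers the Daniel question above. This is only a rule-based sketch to illustrate the reasoning involved, not the learned models from those papers. The verb lists and the `count_held` helper are assumptions made up for this example, and it only handles sentences of the form "<person> <verb> the <object>."

```python
# Assumed verb lists; the real dataset uses a wider vocabulary.
PICK_UP = {"picked up", "got", "took", "grabbed"}
DROP = {"drops", "dropped", "discarded", "put down"}


def count_held(story, person):
    """Count the objects `person` is holding after reading the story in order."""
    held = set()
    for sentence in story:
        text = sentence.rstrip(".")
        if not text.startswith(person + " "):
            continue  # sentence is about someone else
        rest = text[len(person) + 1:]
        for verb in PICK_UP:
            if rest.startswith(verb + " the "):
                held.add(rest[len(verb) + 5:])      # skip "<verb> the "
        for verb in DROP:
            if rest.startswith(verb + " the "):
                held.discard(rest[len(verb) + 5:])
    return len(held)


story = [
    "Daniel picked up the football.",
    "Daniel drops the football.",
    "Daniel got the milk.",
    "Daniel took the apple.",
]
print(count_held(story, "Daniel"))  # → 2
```

The point of the dataset is that a model has to learn this kind of bookkeeping from examples instead of having it written out by hand like this.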

~~~
pimlottc
Can't Daniel just type "inventory"?

------
tmpanon1234act
I know that tech company M.O. is creeping invasiveness by acclimating users to
marginally more Orwellian incursions of their privacy over time, but are we
already at the point where something like this can gain widespread acceptance?
Seems like the willingness of the general populace to prostrate itself to our
newfangled corporate overlords is only accelerating.

------
johansch
I kind of think Google will win over Facebook in this particular war, based on
what I have seen so far.

------
atrudeau
And here I was hoping to see links to published papers or God forbid the
release of datasets for fellow researchers... Heck, even a trained model would
be nice.

Seems like a PR stunt to attract talent.

------
themgt
Perhaps we're all computer simulations whose purpose is to a/b test marketing
on real-world people based on their social media posts.

------
greenimpala
More creepy idealism from the team at Facebook.

------
uptownfunk
Would love to see a compare / contrast between FB/GOOG/MSFT's offerings in
this space.


