
Introducing FBLearner Flow: Facebook's AI Backbone - dsr12
https://code.facebook.com/posts/1072626246134461/introducing-fblearner-flow-facebook-s-ai-backbone/
======
calgoo
"When you log in to Facebook, we use the power of machine learning to provide
you with unique, personalized experiences. Machine learning models are part of
ranking and personalizing News Feed stories, filtering out offensive content,
highlighting trending topics, ranking search results, and much more."

So, everything that I hate about FB is related to the AI.

Ranking and Personalizing: Horrible ranking results, and the "personalized"
results that they push into my feed have nothing to do with me or my interests
and are not something I would ever want in my social feed. It's almost as bad
as their advertising results (which I'm guessing are based on the same AI).

Offensive Content: I could not give 2 f@@@@ about this, but I guess some
people don't like strong language or pictures (the only point that I see as
an acceptable use of the AI for now).

Trending Topics: FB, you are not Twitter, please stop trying to copy them. If
I want trending topics I'll go to Twitter. My Facebook is a way to contact
people I have not talked to in months.

Search results etc: What search??? Facebook as a search engine now? Or are you
talking about the BS "Do you know John? You should know John!" - FB: If I have
not added John, it might mean I don't want to talk to John because he is an
a@@. Until your AI learns to detect a__holes, please switch off this crap.

I know it's a rant, but I'm so tired of services trying to fill my feeds with
crap, and also of the BS about AI saving / ending the world. It might create
nice cool articles for the news, and buzzwords in the bubble out west. We can
only hope that Winter will not come this time, and that we can advance far
enough that AI becomes an actually useful product for more than spam filters
and detecting if John went to the same school as me.

* John is an invented person based on people I know.
* Swearing is blocked as I'm not sure what people prefer on here.
* Yes, I know AI is advancing a lot, and I'm a big supporter. I just don't
  like how it's used, and how the PR makes it look like it's the 7th wonder of
  the world.

~~~
visarga
You are probably a person who has been online for a long time and has seen a
lot. Your expectations are not similar to those of the majority of FB users.
They are mostly young people, or poor, or from poor countries; they only have
a cheap Android phone and no laptop, don't know what Twitter is, and are
discovering Facebook for the first time. They find uses for it without having
any expectations (like, "If I want trending topics I'll go to Twitter").

You probably use FB for one hour per month because it's the most convenient
way to catch up with your friends and family, while they spend 4 hours per day
on it because for them the internet is Facebook. FB uses statistics to become
better for them, not for you.

~~~
tmaly
I must be in the same bucket as him. All I see are either Hillary Clinton or
Donald Trump posts. It all seems very polarized to me. Back when I first got
on the internet, it was through dial-up modems. I would browse around on
gopher and everyone chatted on IRC. It felt more like a community. I think
there are still pockets of that out there; they're just buried in all the
noise.

~~~
vthallam
Exactly! I am sick of seeing only posts related to Hillary/Trump, even when
they were not the presumptive leading candidates. I have also missed some
important updates from friends because FB thinks they're not relevant to me.

~~~
scholia
Have you tried hiding the Hillary/Trump posts and interacting with the friends
whose updates you want to see?

Or you could just install Social Fixer, and control your feed that way.
However, in my experience, it's quite hard to produce a better feed than
Facebook's AI...

[http://socialfixer.com/](http://socialfixer.com/)

~~~
alttab
Seriously - just hide posts. Facebook's AI is pretty good at allowing you to
properly prune your news feed.

You can also game this system by getting others to interact with your posts
more often, which will push your posts into their feed at a higher rate.

------
Fede_V
While FB has released some excellent open source software in the past, I don't
really find blog posts about closed source software to be that interesting.

The optimist in me says that the reason this hasn't been open sourced is
because a lot of the distributed code is tied specifically to FB's
infrastructure - but I guess we'll wait and see.

~~~
forrestthewoods
Some of my most popular blog posts have been explanations as to how closed
source video game engines work. Some of the most popular conference talks are
on closed source solutions to encountered problems.

Dismissing a blog post simply because it describes closed source software is
silly.

~~~
msl09
I'm thinking he's dismissing the tool not the post?

~~~
detaro
> _I don't really find blog posts about closed source software to be that
> interesting._

seems to be mostly about the post

------
dandermotj
There's a key statement here that most will either miss or never reach. It's
in the second-to-last paragraph:

> Machine Learning Automation

Tuning models is a pain, and if you don't understand the model and all its
parameters (like most software engineers, whose job is to write software and
not build models!) it takes time and can be immensely frustrating. Randal
Olson[1] just announced TPOT[2], a Python tool that "automatically creates and
optimizes machine learning pipelines using genetic programming". This is going
to be a huge lever for engineers wanting to experiment with or implement ML
algorithms.

[1] [http://www.randalolson.com/2016/05/08/tpot-a-python-tool-for...](http://www.randalolson.com/2016/05/08/tpot-a-python-tool-for-automating-data-science/)

[2] [https://github.com/rhiever/tpot](https://github.com/rhiever/tpot)
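
To make "genetic search over model settings" concrete, here is a toy sketch in
plain Python. It is not TPOT's code or API: the search space, the parameter
names (`depth`, `rate`), the `fitness` function and the mutation scheme are all
invented for illustration, and a real tool would evaluate each candidate by
training and scoring a model.

```python
import random

# Hypothetical search space: two made-up hyperparameters.
SPACE = {"depth": list(range(1, 11)), "rate": [0.001, 0.01, 0.1, 1.0]}


def fitness(params):
    """Stand-in for an expensive model evaluation (higher is better).

    Invented for this sketch; the best point is depth=6, rate=0.1.
    """
    return -abs(params["depth"] - 6) - (0 if params["rate"] == 0.1 else 1)


def random_individual(rng):
    """Draw one value per parameter uniformly at random."""
    return {name: rng.choice(values) for name, values in SPACE.items()}


def mutate(params, rng):
    """Copy an individual and re-draw one randomly chosen parameter."""
    child = dict(params)
    name = rng.choice(list(SPACE))
    child[name] = rng.choice(SPACE[name])
    return child


def evolve(generations=30, population_size=20, seed=0):
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(population_size)]
    for _ in range(generations):
        # Keep the better half, then refill with mutated copies of survivors.
        population.sort(key=fitness, reverse=True)
        survivors = population[: population_size // 2]
        children = [mutate(rng.choice(survivors), rng) for _ in survivors]
        population = survivors + children
    return max(population, key=fitness)


best = evolve()
print(best, fitness(best))
```

Every call to `fitness` here is free; in practice it is the expensive part,
which is why the number of evaluations a search method needs matters so much.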

~~~
feral
Yes, but does it work? I cannot tell from reading the introductory materials
whether it will actually search efficiently over a large space (where it is
computationally expensive to evaluate the fitness of any individual point).

I know that folks are having some success with Gaussian process optimisation
for hyperparameter tuning (e.g.
[https://github.com/Yelp/MOE](https://github.com/Yelp/MOE)), but I would be
sceptical about the application of genetic programming to this area,
particularly over a search space as broadly defined as the one TPOT seems to
set up. Depending on how you want to think about it, genetic programming
either makes no assumptions about the objective function that could be used
to search more efficiently (as Gaussian process optimisation does), or relies
on implicit assumptions that may be unsuitable. Has anyone seen benchmarks
against other search methods? The workflow does look great though -
integrating it into scikit looks really clever.

~~~
dandermotj
I agree, and I think it's clear that TPOT and similar tools are the first
generation. Genetic algorithms might be slow/costly but the concept is there
with an integrated, implementable solution. If it's as useful as I think it
could be, there will be a flood of modules, libraries and packages developed
with efficiency, UI and domain specific improvements.

------
auvi
I wonder whether the "Flow" moniker was chosen based on Google's TensorFlow.
It is hard for me not to notice the similarity.

~~~
x0x0
A flow is a common term for pieces of an execution DAG. See e.g. Cascading or
Azkaban.

[http://docs.cascading.org/cascading/2.0/javadoc/cascading/fl...](http://docs.cascading.org/cascading/2.0/javadoc/cascading/flow/Flow.html)

[http://submitteddenied.github.io/azkaban2/documents/2.1/crea...](http://submitteddenied.github.io/azkaban2/documents/2.1/creatingflows.html)
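
As a rough illustration of what "execution DAG" means here, the sketch below
describes a flow as steps plus their dependencies and asks Python's standard
`graphlib` module (3.9+) for a valid run order. The step names are made up, and
this is not how Cascading, Azkaban or FBLearner Flow represent flows.

```python
from graphlib import TopologicalSorter

# Hypothetical flow: each step maps to the set of steps it depends on.
flow = {
    "load": set(),
    "clean": {"load"},
    "train": {"clean"},
    "report": {"clean"},
    "publish": {"train", "report"},
}

# static_order() yields every step after all of its dependencies.
order = list(TopologicalSorter(flow).static_order())
print(order)
```

Note that `train` and `report` depend only on `clean`, so a scheduler would be
free to run them at the same time.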

------
aub3bhat
It isn't Open Source. Interesting, since it is written in Python rather than
Lua/Torch, PHP or JS.

~~~
danpalmer
I'm not particularly surprised that it's written in Python.

- Lua/Torch appears to not be designed for distributed systems as much, and
  looks less mature in terms of the scientific computing side.
- PHP is clearly the wrong tool for this job - it's only really the right tool
  for websites (debatable).
- Could do Node.js, but has little/no data-science/scientific computing side.

C++ would be the only reasonable competitor to Python here I think. Python is
a pretty good combination of great for science, good for servers and easy to
pick up and use for all the engineers who will need to interact with it.

------
yeukhon
In the article they mention that they implemented a custom type system for
building the UI and for the input/output. Interesting, but I don't understand
what that means. Are we talking about data types? A custom compiler? Very,
very elegant ideas, I suppose, but I am just not familiar with the concepts...

> The body of the workflow looks like a normal Python function with calls to
> several operators, which do the real machine learning work. Despite its
> normal appearances, FBLearner Flow employs a system of futures to provide
> parallelization within the workflow, allowing steps that do not share a data
> dependency to run simultaneously.

"Futures": did they implement this with concurrent.futures (which also has a
backport for Python versions prior to 3.2)?
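
The article does not say how the futures are implemented, so the following is
only a sketch of what the described pattern looks like with the standard
library's `concurrent.futures`. The operator names and bodies are invented for
the example; nothing here is FBLearner Flow's actual API.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical operators; names and bodies are invented for this sketch.
def load_data():
    return list(range(10))


def train_model(rows):
    return sum(rows) / len(rows)  # the pretend "model" is just the mean


def compute_stats(rows):
    return {"min": min(rows), "max": max(rows)}


def evaluate(model, stats):
    return {"model": model, "range": stats["max"] - stats["min"]}


def workflow():
    with ThreadPoolExecutor(max_workers=2) as pool:
        rows = load_data()
        # These two steps depend only on `rows`, not on each other,
        # so both are submitted before either result is awaited.
        model_future = pool.submit(train_model, rows)
        stats_future = pool.submit(compute_stats, rows)
        # .result() blocks until the corresponding step has finished.
        return evaluate(model_future.result(), stats_future.result())


result = workflow()
print(result)
```

The body of `workflow` still reads like ordinary sequential Python; the only
visible difference is that intermediate values are futures until `.result()`
is called.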

~~~
igravious
Relevant part:

> Operators: Operators are the building blocks of workflows. Conceptually, you
> can think of an operator like a function within a program. In FBLearner
> Flow, operators are the smallest unit of execution and run on a single
> machine.

> Channels: Channels represent inputs and outputs, which flow between
> operators within a workflow. All channels are typed using a custom type
> system that we have defined.

I think this is related to the thoughts in the following essay:
[https://colah.github.io/posts/2015-09-NN-Types-
FP/](https://colah.github.io/posts/2015-09-NN-Types-FP/)

When you think about it, what do NNs fundamentally do (once trained)? Map
inputs to outputs, right? So they behave like functions. In type theory
functions are intensional and we explicitly construct them. With NNs we have
our ML algorithm and construct them based on training data and feedback. With
regular functions we have explicitly constructed A -> B. Let's represent NNs
as A ~> B. Let's represent channels as A||B. Say we have two operators, E ~> F
and G ~> H. Then to join them as E ~> F||G ~> H, we'd have to construct a
channel F||G.

At least this is how I imagine they mean things! Corrections from FB most
welcome!
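
One way to picture typed channels between operators is the sketch below. It is
purely a guess at the general idea, not FB's type system: the `Operator` class,
the `connect` helper and the two example operators are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Operator:
    """A unit of work with a declared input and output type."""
    name: str
    in_type: type
    out_type: type
    fn: Callable


def connect(first, second):
    """Join two operators, refusing if the channel types differ."""
    if first.out_type is not second.in_type:
        raise TypeError(
            f"cannot connect {first.name} -> {second.name}: "
            f"{first.out_type.__name__} is not {second.in_type.__name__}"
        )
    return Operator(
        name=f"{first.name}|{second.name}",
        in_type=first.in_type,
        out_type=second.out_type,
        fn=lambda value: second.fn(first.fn(value)),
    )


tokenize = Operator("tokenize", str, list, lambda text: text.split())
count = Operator("count", list, int, len)

pipeline = connect(tokenize, count)
print(pipeline.fn("a typed channel example"))

try:
    connect(count, tokenize)  # an int output cannot feed a str input
except TypeError as err:
    print(err)
```

The point of the check is that a mismatched workflow fails when it is wired
up, before any operator has run.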

------
bcherny
They should consider a better name. Flow is already taken by a different FB
team - [https://github.com/facebook/flow](https://github.com/facebook/flow)

~~~
googlryas
It's FBLearner Flow, not Flow

------
tall
Nice! This is a product that I contributed to while at FB, and it's cool
to see it getting some attention.

