
Twitter meets TensorFlow - hurrycane
https://blog.twitter.com/engineering/en_us/topics/insights/2018/twittertensorflow.html
======
llao
> Machine learning enables Twitter to drive engagement, surface content most
> relevant to our users, and promote healthier conversations.

One that wants to manipulate your mind, one that echochambers your discovery,
one that censors arbitrarily.

~~~
dirkgently
Do you remember how bad YouTube comments used to be? And that they're
significantly better lately?

That happened due to ML.

Not everyone is out there to use ML for nefarious purposes. And there are
multiple use cases for ML.

~~~
blauditore
At least for music videos, comments are still pretty shitty. It feels like
more than half of them are one of the following patterns:

- "Like if you're watching in June 2018?" [2k upvotes]

- "Most underrated artist of our time" [5k upvotes]

- "Wow, mad respect for <artist>" (written by the verified account of another
artist trying to promote their own content)

~~~
onion2k
Why are those bad comments? The first one is a bit ridiculous, but the others
are fine. They're _comments_. They're not discussion points. YouTube isn't a
forum or a social network. There's nothing compelling people to leave
intelligent or even constructive points. The comment section is for people to
note what they think. That's it.

When people say these sorts of things are bad I think it's because their
expectations and understanding of a comments section are wrong.

~~~
blauditore
It's just super repetitive, and actually good and original comments get buried
because they don't match the hivemind.

Of course, it's a question of taste what comments should be more visible. If
the community actually prefers reading the same stuff over and over again,
then the current state is probably justified, and it's just a moronic
community.

------
ralston
I always enjoy browsing the Twitter Engineering blog because of 1) the
granular detail they give, 2) the concise, organized format in which the
information is presented, and 3) their creativity. However, the first
sentence of this article "Machine learning enables Twitter to drive
engagement, surface content most relevant to our users, and promote healthier
conversations. As part of its purpose of advancing AI for Twitter in an
ethical way, Twitter Cortex is the core team responsible for facilitating
machine learning endeavors within the company" is a complete turn-off. Keep it
technical. Leave the marketing/PR out of it.

- Regular blog.twitter.com/engineering reader

------
mlthoughts2018
This is very confusing and meandering. It gives flow charts and lists of steps
that don’t map to my experience building deep learning models at scale, and
spends a strange amount of time passive aggressively dismissing Lua Torch and
extolling virtues of TensorFlow that aren’t very important.

As with all of these purported pipelining systems, I’m skeptical and happy to
let a bunch of other people deal with the headaches of making it adequately
general for a few years before I’ll even start caring about grokking it for my
use cases.

In the meantime, creating build tooling, data pretreatment tooling, and
deployment tooling is pretty valuable: it forces me to understand business
considerations and to make sure all my modeling & experimentation aren’t just
time-wasting ivory tower projects. In particular, it lets me customize
performance characteristics on a situation-to-situation basis and design the
deployed system without being constrained to a particular serving
architecture.

It also makes me very uninterested in applying to work for the Cortex team,
because even though the article talks about DeepBird v2 as a means to free ML
engineers to do more research, it seems pretty obvious that there’s a huge
surface area of maintenance and feature management for this platform. Your
job is probably going to be _less_ about research, which is scarce work that
people compete over anyway.

Possibly attractive for people who just like deep C++ platform building, which
is an internal drive not often found in people wanting to solve business
problems with ML models.

~~~
pavanky
> Possibly attractive for people who just like deep C++ platform building,
> which is an internal drive not often found in people wanting to solve
> business problems with ML models.

And the world has plenty of people who are not interested in trying to solve
business problems with ML models, but are instead interested in the
engineering side of ML. I am one of those people. My current work (at Cortex)
involves improving the time it takes to train the types of models not usually
described in research papers.

If anyone is interested in machine learning but wants to build
high-performance training/serving platforms, message me on Twitter
(@pavan_ky); Cortex is hiring.

~~~
mlthoughts2018
Personally I am very interested in both how models can help solve business
problems and how to make effectively engineered tools for machine learning.

In my work I spend a lot of time on new deep learning architectures or
experimenting with modifications or fine-tuning or ensembling.

I write a lot of container and Makefile tooling to ensure experiments are
reproducible and results have identifiers that map back to the full set of
data, software and parameters.
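A minimal sketch of that kind of identifier scheme (the helper and its names are hypothetical, not the commenter's actual tooling): hash a canonical dump of the experiment's parameters together with the code and data versions, so the same inputs always map to the same run ID and any result can be traced back to exactly what produced it.

```python
import hashlib
import json

def run_id(params: dict, code_version: str, data_version: str) -> str:
    """Derive a deterministic identifier for an experiment run.

    Hashing a canonical (sorted-key) JSON dump of the parameters together
    with the code and data versions means identical inputs always produce
    the same ID, while any change yields a different one.
    """
    payload = json.dumps(
        {"params": params, "code": code_version, "data": data_version},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]

# Same inputs -> same ID; a changed hyperparameter -> a different ID.
a = run_id({"lr": 0.01, "layers": 4}, "abc123", "v2")
b = run_id({"lr": 0.01, "layers": 4}, "abc123", "v2")
c = run_id({"lr": 0.02, "layers": 4}, "abc123", "v2")
assert a == b and a != c
```

In practice `code_version` would typically be a git commit hash and `data_version` a dataset snapshot tag.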

I also write a lot of backend server software to wrap trained models in a web
application, mostly in Python. After profiling, I use Cython to target only
those spots of the code that reveal actual performance bottlenecks, tied
directly to a specific business problem’s latency or throughput requirements.
That is, instead of taking the huge premature-optimization step of assuming a
whole system needs to be written in C++, I use profiling and case-by-case
diagnostics to know when to write something as an optimized C extension
module, callable from Python, for very specific and localized sections of
code.
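That profile-first workflow can be sketched with the standard library's `cProfile` and `pstats`; the functions below are illustrative stand-ins, not anything from the thread.

```python
import cProfile
import io
import pstats

def preprocess(rows):
    # Stand-in for a preprocessing step that dominates runtime.
    return [sum(i * i for i in range(200)) + r for r in rows]

def serve_request(rows):
    # Stand-in for a request handler wrapping a model.
    return max(preprocess(rows))

# Profile one representative request.
profiler = cProfile.Profile()
profiler.enable()
serve_request(list(range(500)))
profiler.disable()

# Rank functions by cumulative time; only the top offenders are
# candidates for a Cython / C-extension rewrite.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

Here `preprocess` would show up at the top of the cumulative-time ranking, identifying it as the one localized spot worth optimizing.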

My experience has been that there is a real lack of transparency about how
deployment will work, how performance will work, etc., when using
cookie-cutter pipeline approaches, like sklearn Pipelines, TensorFlow
Serving, Fargate, etc. You’ll always need to break some assumption of the
pipeline, layer in new diagnostics, debug latency issues, etc., on a
case-by-case basis.

99% of the time, specifying a new model or articulating an experiment is not
hard, requires little dev work, and represents only about 10% of the actual
work needed to explore a model’s appropriateness for a given problem at hand.

The rest requires very specialized control and visibility to basically perform
application-specific surgery on the pipeline, customizing and tailoring many
aspects, from how multi-region deployment should look to how optimized the web
service code should be to whether to use asynchronous workers or a queuing
service to stage and process requests, to optimizing preprocessing treatments,
to instrumenting some extra New Relic metric tracking that the pipeline isn’t
extensible enough to just specify in some config, and so on.

What’s been most important is that the deep learning engineers on the team,
who are _researchers_, are also excellent systems engineers across all those
topics and display a high degree of curiosity towards them; they absolutely
do not look at that work as “boring work” that distracts from the
experimentation they would rather do. Their value add is not driven by
spending more time experimenting — that’s virtually never the case. Their
value add is in _both_ knowing the details of the deep learning models
intimately and knowing the deep details of implementation, optimization,
deployment, and diagnostics.

In that sense, tying model development to a cookie cutter pipeline framework,
whether Spark or sklearn or something custom in-house, is something I believe
strongly is an anti-pattern.

------
michelb
Maybe they can use ML on the user feedback to figure out what features people
actually would like on the platform, instead of trying to figure out
relevancy, which no company has ever successfully done. /s

Is there a recent in-depth article somewhere about Twitter's internals? It
must be frustrating working on features very few people want to use.

I still can't edit a tweet I just posted, any video looks absolutely horrible
for the first 10 seconds, the timeline is a mess, third-party developers that
make interesting/much better clients are getting stomped on, and harassment is
running mostly unchecked on the platform. Yet I read interesting development
articles on the engineering blog. Wasted talent?

I really like the Twitter engineering blog articles, but it seems like it's
just an HR tool.

------
rotskoff
This blog post describes Twitter's move from Lua Torch to TensorFlow. I am
surprised to see it so highly ranked on the front page because there's very
little content here. Basically, they describe the sorts of data structures
they use and list a couple of advantages of TensorFlow vs. the out-of-date
Lua Torch framework.

~~~
s17n
It's probably on the front page because a lot of people are deciding between
PyTorch and TensorFlow for their company's production ML pipeline, and this is
an article about making exactly that decision. Even if they didn't present any
actual analysis, just the sentence "Twitter's ML team, which had previously
used Lua Torch, recently evaluated PyTorch and TensorFlow and ultimately
decided to standardize on TensorFlow" would be of interest to a lot of people.

------
dmitriid
This post says all you need to know about Twitter.

"Machine learning enables Twitter to drive engagement...".

The very first thing they mention is engagement. They don't care about
quality, or what users want, or to foster communication. They care about one
thing and one thing only: engagement. More clicks, more likes, more moving
around the interface.

------
mastazi
> given the migration of the Torch community from Lua to Python via PyTorch,
> and subsequent waning support for Lua Torch, Cortex began evaluating
> alternatives to Lua Torch

I'm a bit out of the loop, has PyTorch really become so much more popular than
Lua Torch, in the short time span since it launched? And is it true that most
of the original Lua Torch community switched to PyTorch?

~~~
Eridrus
PyTorch is rivaling TensorFlow in popularity now. Lua Torch is basically dead.

~~~
mastazi
Thanks, I saw many articles about PyTorch lately, but didn't realise that most
people had switched for good.

------
kieckerjan
Off topic, but the article mentions "deep learning at scale" which triggered
me. Is this use of the term "at scale" something new(ish)? As far as I know
"at scale" means something like "in the appropriate amount". Here it seems to
be shorthand for "implemented in a scalable manner". This use seems all over
the place now. Is there a native English speaker who can comment on that?

~~~
xtacy
I am not a native English speaker, but I'm quite familiar with the use of "at
(large) scale" in similar contexts.

It means that one has deployed, or is using, the tool with a non-trivial
amount of CPU and/or data (anywhere from a few hundred to a few thousand
machines). In the context of serving, "large scale" could also mean the
number of queries/second hitting the serving layer.

------
DrNuke
Brought my following down to 9 accounts last week and enjoying the flow again.
I have always used Twitter passively, only re-tweeting 3-4 interesting stories
a day. Hoping TensorFlow for healthy conversations does not spoil this sort of
24/7 breaking news from trusted sources, now. What I fear the most is having
my flow made of one tweet from sources I choose & one tweet TensorFlow
suggests to me.

~~~
wingerlang
> I have always used Twitter passively, only re-tweeting 3-4 interesting
> stories a day

That seems fairly active to me. In almost 10 years I did my first re-tweet
last year.

~~~
DrNuke
You’re right, of course, ehehe. Just avoiding expressing personal opinions,
so that no flamewars or endless arguments waste your day, is not passive
enough if you still sign up once and sign in daily. So is the next step
deactivating my account for good and following trusted sources without
logging in at all? Twitter cookies would still track my activity, though.

------
rado
What could possibly go wrong?

------
drivebyops
Did Twitter drop Scala? I wonder what the story is with ML and Scala.

~~~
random314
No, they have not. Especially not on the machine learning teams.

------
sintaxi
Twitter has become a tragedy.

------
rs86
PR sucks. It looks like PR.

