
TensorFlow, Keras and deep learning, without a PhD - blopeur
https://codelabs.developers.google.com/codelabs/cloud-tensorflow-mnist/#0
======
gas9S9zw3P9c
As a researcher in the field, I am not quite sure how I feel about this kind
of resource. I am all for making research accessible to a wider audience, and
I believe that you don't need a PhD, or any degree, to do meaningful work.

At the same time, the low barrier to entry and the hype have resulted in a
huge number of people downloading Keras, copying a bunch of code, tuning a few
parameters, and then putting their result on arXiv so they can put AI research
on their resume. This has produced so much noise and low-quality work that
it really hurts the field.

You don't need a degree, but I think you do need to spend some time to get a
deep enough understanding of what's going on under the hood, which often
includes some math and takes time. This can be made accessible and there are
plenty of good resources for that. But all these "become an AI pro by looking
at some visualizations and copying this code" tutorials are maybe hurting more
than they help, because they give the illusion of understanding when it's
actually not there. I wouldn't want people learning (solely) from this
touching my production systems, writing blogs, or putting papers on arXiv.

~~~
amelius
Well, maybe I got the wrong impression but after reading the (very accessible)
Yolo V3 paper [1], it seems to me that even the experts do little real math
and lots of guesswork, kicking a model until it starts giving results.

[1]
[https://pjreddie.com/media/files/papers/YOLOv3.pdf](https://pjreddie.com/media/files/papers/YOLOv3.pdf)

~~~
gas9S9zw3P9c
Oh yes, I don't think advanced math is required in any way. However, there is
a difference in that researchers who have worked with these models for many
years (often including having done some of the math) have a very good
intuitive understanding of these models. Once you have that, it's fine to be
driven by gut feeling, just like many engineers are driven by gut feelings
that come from tacit knowledge built through experience.

What is dangerous is reading a 30 minute blog post and getting the illusion of
having some kind of understanding, when in reality it can take years to
develop that. It's like cloning the postgres Github repository, compiling it,
running a few queries, and then saying you've built databases and being hired
to become the "database expert" at some company, spreading wrong knowledge
left and right.

That's why the popularization of these quick, immediate-reward tutorials is
dangerous. It takes time and effort to become knowledgeable at something. Of
course, many people are smart enough to know that these tutorials (or cloning
the postgres repo) are just the first step on a longer learning journey, and
in that case it's totally fine. But there are also many people who come to
think they are experts, ready to work on research or production models after
going through such things, unaware of the many things they don't know [0].

[0]
[https://upload.wikimedia.org/wikipedia/commons/thumb/4/46/Dunning%E2%80%93Kruger_Effect_01.svg/1024px-Dunning%E2%80%93Kruger_Effect_01.svg.png](https://upload.wikimedia.org/wikipedia/commons/thumb/4/46/Dunning%E2%80%93Kruger_Effect_01.svg/1024px-Dunning%E2%80%93Kruger_Effect_01.svg.png)

~~~
shrimpx
It sounds like what you're complaining about is people who put papers on
arXiv and deceptively claim to be experts. And you're kind of implying that
these blog posts are to blame, so the blog posts should be retracted because
of those dishonest academics? You're making a very confusing point.

~~~
listenallyall
Not confusing. He's saying that watching Judge Judy for a week doesn't make
you a lawyer. And when hiring, be careful because lots of people who claim to
be experts are far from it.

~~~
shrimpx
That’s valid, but he’s implying that Judge Judy should be taken off air
because some people watch it and then pretend to be lawyers. Squelching
information seems like a terrible way to eliminate a few impostors downstream.

~~~
gas9S9zw3P9c
Yes, but that's a bad example because pretending to be a lawyer is hard. A
better example would be gurus spreading nutrition recommendations that are not
wrong per se, but extremely simplified. Nutrition is a complex topic, and
individual differences make it hard to generalize. Let's say the information
is so simplified that it is likely to hurt people who blindly follow it
without doing further research. So, should this information be taken off the
air or not? I would say yes, and perhaps you would say no. Either way, I don't
think the answer is quite as clear-cut.

------
echelon
I'm doing deep learning without a data science background.

Some of my current results are:

* [https://vo.codes](https://vo.codes)

* [https://trumped.com](https://trumped.com)

The voices need better data curation and longer training, but some speakers
such as David Attenborough are quite good.

I've also built a real time streaming voice conversion system. I want to
generalize it better so that it can be an actual product. I think it could be
a killer app for Discord. Imagine talking to your friends as Ben Stein or
Ninja.

I've been watching TTS and VC evolve over the last few years, and the pace at
which things are coming along is incredible. There are now singing neural
networks that sound better than Vocaloid. If you follow researchers on GitHub
(seriously, their social features are a killer app!), you'll see model after
model get uploaded, complete with citations and results. It's super exciting,
and it's the future I hoped research would become.

If you're diving into this, I would recommend using PyTorch, _not_ TensorFlow.
PyTorch is much easier to use and has better library/language support.
TorchScript / JIT is really fantastic, too. I mean this even if you're just
poking around with someone else's model: find a PyTorch alternative if you
can. It's much easier to wrap your head around. TensorFlow is just too obtuse
for no good reason.
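To make the TorchScript point concrete, here is a minimal sketch (the model is a toy stand-in, not anything from the projects above): a module can be compiled with `torch.jit.script` into a serialized form that later runs without the original Python code.

```python
import torch
import torch.nn as nn

# Toy model standing in for a real network
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
model.eval()

# Compile to TorchScript; the result could be saved with scripted.save("model.pt")
# and loaded later (even from C++) without the defining Python code
scripted = torch.jit.script(model)

# The scripted module computes the same outputs as the original
x = torch.randn(1, 4)
with torch.no_grad():
    same = torch.allclose(model(x), scripted(x))
```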

~~~
forgingahead
Hi Brandon, nice work! Some questions to learn more, if you don't mind: are
you using Tacotron2 for the voice generation? If so, are you using a base
model before you train up new speakers, or is each speaker model trained from
scratch? How long do you run the training for normally (in both cases), and
what hardware are you running?

You mentioned elsewhere you're renting the V100s, what services have you used,
and would you recommend them?

By the way, your Trumped.com is throwing some errors in the console so the
site isn't working for me.

Keep up the good work!

~~~
echelon
> are you using Tacotron2 for the voice generation?

Nope. glowtts. Tacotron2 has higher fidelity (it's a denser network), but it's
really slow and expensive to run.

> are you using a base model before you train up new speakers

Absolutely! Transfer learning is essential to training on sparse or limited
data, and it's incredibly effective.
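The transfer-learning recipe being described can be sketched generically in PyTorch (a hypothetical toy model, not the actual glowtts setup): start from a pretrained base, freeze its weights, and fine-tune only the parts that must adapt to the new speaker.

```python
import torch.nn as nn

# Hypothetical base model standing in for a pretrained TTS network;
# in practice you would load real pretrained weights here.
base = nn.Sequential(
    nn.Linear(80, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, 80),
)

# Freeze everything, then unfreeze only the final layer for fine-tuning
for p in base.parameters():
    p.requires_grad = False
for p in base[-1].parameters():
    p.requires_grad = True

# An optimizer would then be given only the still-trainable parameters
trainable = [name for name, p in base.named_parameters() if p.requires_grad]
```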

> How long do you run the training for normally

> and what hardware are you running?

I think I explained this in my other answer? But if not, it's typically a
while. The best guiding light, though, is to frequently listen to the
inference results. Are they improving? Watch the TensorBoard graphs to see
what the loss and attention look like and make sure you're actually learning.

> By the way, your Trumped.com is throwing some errors in the console so the
> site isn't working for me.

Oh no. I realize there are some bugs I left for iPhone (simply because I don't
have one), and I really need to get those fixed. I'm not sure if this is your
case or not. Perhaps the pods with the Trump model are also experiencing
duress -- I'll need to investigate that too. I have yet to hook up monitoring
(yikes).

------
donquichotte
Tangentially related (and also using the ubiquitous MNIST dataset), Sebastian
Lague started a brilliant, but unfortunately unfinished video series on
building neural networks from scratch.

This video [1] was an absolute eye-opener for me on what classification is,
how it works, and why a non-linear activation function is required. I probably
learned more in 5 minutes of watching this than from multiple Coursera courses
on the subject.

[1]
[https://www.youtube.com/watch?v=bVQUSndDllU](https://www.youtube.com/watch?v=bVQUSndDllU)

~~~
pests
One "ah ha" about that required non-linear function I had was the fact that if
you were just passing numbers through a series of linear functions they could
by definition just be combined into one equation.
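That collapse is easy to check numerically (a small numpy sketch): two stacked linear layers with no activation in between are exactly equivalent to one linear layer.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two "layers" with no activation: y = W2 @ (W1 @ x + b1) + b2
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)
x = rng.normal(size=3)
two_layers = W2 @ (W1 @ x + b1) + b2

# ...which collapses into a single linear layer: W = W2 @ W1, b = W2 @ b1 + b2
W, b = W2 @ W1, W2 @ b1 + b2
one_layer = W @ x + b

same = np.allclose(two_layers, one_layer)
```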

------
wirrbel
The whole AI/ML thing has become so hyped up that it's probably time for me to
find another topic of interest in software engineering. It's a weird melange
nowadays where frameworks and "academic credentials" are fused together by
major tech companies, and it leaves me, someone who has deployed a dozen
classical ML models into production that are still running after a couple of
years, wondering what this is all about.

Overall, having worked with people with different backgrounds, an ML-related
PhD is usually neither correlated nor anti-correlated with having a good
understanding of the relationship between models and their applications.

I wish we could leave the frameworks and name-dropping behind and talk more
about what it takes to evaluate predictions, how to cope with biases, etc.

~~~
mattigames
Instead of saying anti-correlated, it's better to say "inversely correlated"
(or, if you mean lack of correlation, "uncorrelated").

~~~
Yajirobe
You mean "negatively correlated"?

~~~
mattigames
Yes, "negatively correlated" is more widely used (the iron law of nitpicking
strikes again).

------
secondcoming
It's a pity that most TensorFlow tutorials out there seem to deal with images.
We tried to use it for real-time data classification (data -> [yes | no]).
Every tutorial out there seems to assume you're using Python (which is
probably not an invalid assumption). Here's my 2c from trying to use
TensorFlow with C++:

a) Loading SavedModels is a pain. I had to trawl the TensorFlow repo and the
Python wrappers to see how it worked.

b) It's incredibly slow. It added ~250ms to our latency. We had to drop it.

c) It has a C++ framework that doesn't work out-of-the-box, you have to use
the C lib that wraps an old version of the C++ framework (confused? me too).

d) It's locked to C++03.

TensorFlow Lite looked to fit the bill for us, but our models weren't
convertible to it. We no longer use TensorFlow.

~~~
osipov
I don't understand why you are getting downvoted. TensorFlow 1.x barely worked
and people stuck with it because the alternatives were worse. I moved to
PyTorch as soon as I could because it is better than TensorFlow 1.x,
TensorFlow 2.x, or Keras w/TensorFlow backend.

TensorFlow is designed by committee and is more of a brand now than a machine
learning framework. Did you know there is TensorFlow Quantum?

~~~
deadliftpro
There were alternatives. IMO, Microsoft CNTK was one that was better than
TensorFlow. TensorFlow 'won' because of herd mentality and the perception
that everything Google does is cool.

------
matlo
This is a nice introduction, even though, like most ML tutorials, it goes
from 0 to 100 in 2 lessons.

A couple years ago I started studying ML, and I have a design background, so I
needed to digest all the math and concepts slowly in order to understand them
properly.

Now I think I understand most of the fundamental concepts, and I've been using
it quite a lot for creative applications and teaching, and I have to say the
best resource I've found for beginners, by far, is "Make Your Own Neural
Network" by Tariq Rashid.

It starts really from the beginning and takes you through all the steps of
building a NN from zero, with no previous knowledge required. Really good.

------
lumberjack
Since everyone is talking about hype in ML, I wish there were some hype for
good ol' conventional scientific computing. Yes, it's not so sexy: you have
to build your model yourself, and then the hard work is in finding and
verifying a suitable numerical method and finally devising a solid
implementation. It requires a vast range of different skills, anything from
pure math to low-level programming, and it is definitely not trivial work,
but it does not seem to pay that well.

~~~
LorenzGauge
So excited to start my Masters in Numerical/Scientific Computing for these
reasons!

~~~
lc0_stein
What are some good universities for something like this?

------
sibmike
I am constantly puzzled by people saying that AI is overhyped and that fresh
grads won't have enough jobs. Almost every real-life industry (retail,
logistics, construction, farming, heavy industry, mining, medicine) has only
recently started to try AI. The amount of manual and suboptimal tasks that
have to be automated and optimized is enormous. I am pretty sure there is
more than enough work for applied DSs with domain knowledge in the industries
mentioned.

~~~
andrewnc
Agreed, I feel we are just barely scratching the surface. I'm fairly involved
in the AI/ML world (research at FAANG) and the amount of hype definitely
scares me, but ML/AI isn't going anywhere soon imo.

------
mark_l_watson
This is very well done, hitting on some pain points and explaining how to work
around them.

I have devoted close to 100% of my paid working time to deep learning for the
last six years (most recently managing a deep learning team), and not only
has the technology advanced rapidly, but the online learning resources have
kept pace.

A personal issue: after seven years of not updating my Java AI book, I am
taking advantage of free time at home to do a major update. New material on
deep learning was the most difficult change because there are so many great
resources, and there is only so much you can do in one long chapter. I ended
up deciding to do just two DL4J examples and then spending most of the chapter
just on advice.

The field of deep learning is getting saturated. Recently I did a free
mentoring session with someone with a very good educational background (a PhD
from MIT), and we were talking about needing specializations and specific
skills, treating things like DL, cloud dev ops, etc. as necessary tools, but
not always enough to base a career on.

Working through great online DL material can definitely help people's
careers, but great careers are usually made by having good expertise in two
or three areas. Learn DL, but combine that with other skills and domain
knowledge.

------
robpal
I attended a conference talk by an FB AI engineer presenting her paper, with
backprop equations so obviously wrong my eyes hurt, and incorrect definitions
of objects. It did not stop her from participating (btw, it is always unclear
who did what) in state-of-the-art research in object detection.

A PhD is overrated in the deep learning context. It is more about forging the
intellectual resilience and the ability to pursue ideas for months or years
than about learning useful things/tricks/theorems.

~~~
Donthatme
> It is more about forging the intellectual resilience and ability to pursue
> ideas for months/years ...

Isn't this what a PhD generically signals?

------
dekhn
Twenty-five years ago, this would have been "LINUX, UNIX, and serving, without
a PhD", and Matt Welsh's Linux Installation and Getting Started was the intro
([https://www.mdw.la/papers/linux-getting-started.pdf](https://www.mdw.la/papers/linux-getting-started.pdf)).
I was one of many who adopted Linux early using this book (later I read the
BSD Unix design and implementation book, which I would describe as
senior-undergrad/junior-grad-student material).

Having those sorts of resources to introduce junior folks to advanced
concepts is really great to me. My experience is that I learn a lot more by
reading a good tutorial than a theory book, up until I need to do advanced
work (this is particular to my style of learning; I can read code that
implements math, but struggle to parse math symbology).

------
m0zg
Mandatory plug: do consider using PyTorch instead. It's far easier to pick up
and work with. Easy things are easy, hard things are possible.

~~~
jiggunjer
Does it train just as fast on GPUs?

~~~
comicjk
In my experience, not quite as fast for fully-tuned code, but the difference
is small - and given the same project deadline, the PyTorch version will
probably be faster.

------
halflings
The video version [1] is also pretty awesome, though its code is a bit
outdated now. It explains a lot of very practical issues that you might not
find in most academic textbooks but encounter every day in practice.

[1]
[https://www.youtube.com/watch?v=vq2nnJ4g6N0](https://www.youtube.com/watch?v=vq2nnJ4g6N0)

------
amrx101
A simple rant here. All these BIG $ companies every now and then come out
with statements that doing AI/ML is very easy and that everyone, including
their cats, should take AI/ML courses and training (preferably on their
platform), and that once that is done the job market is theirs. Reality is
far from this.

- Today AI/ML does not have the capabilities marketed by these big companies.
Incidentally, the marketing is targeted at governments, big non-tech
companies, and gullible undergraduates.

- Undergrads often take these training courses, acquire the skill set to call
these APIs, and flood the job market, where a data-entry or data-analyst job
is tagged as an AI/ML job.

- High-paying jobs in AI/ML still require a Masters or PhD or a mathematical
background.

In conclusion, the current hype around AI/ML is misguiding gullible
undergrads and governments (I don't mind the governments being cheated,
though).

~~~
s1t5
Another side of this that I've seen is people with PhDs in applied mathematics
on a research team where a PhD is a hard requirement, spending all of their
time passing dataframes around and convincing the sales team that the bar
charts in the product are in fact correct.

~~~
zurfer
Have we been working on the same team? :P The worst part (or the best part of
the joke) is that with a mathematics PhD you're very ill-equipped for a
discussion with your typical sales rep.

------
seek3r00
The title is obviously clickbait-y, but it’s fine: they’re trying to sell a
product (Google Colab).

IMO, if you’re interested in AI research or ML engineering, you already know
that, in order to avoid getting people killed, you have to understand how it
works under the hood. You’re doing yourself, your employer, and your fellow
humans a favour.

Just keep up the good work, and ignore the bullshit. If an AI winter comes,
you’ll be well prepared to migrate to another engineering role.

------
polskibus
I wonder if Google is using this resource to train their own staff without
PhDs and afterwards allowing them to work as ML engineers? That would lend
credibility to such a program. Instead, it seems more aimed at selling more
ML computing power to the masses (who won't really understand how to use it
to get meaningful results).

~~~
nxpnsv
not bloody likely...

------
Guest42
To me it seems the real goal of hyping these things so heavily is to increase
their cloud revenues.

------
DrNuke
As usual with tools, it’s the use rather than the instrument. Domain
knowledge still has advantages over generalism in applied and critical
fields. I mean, you can’t yet do medicine or materials with AI/ML without
understanding the domain, can you?

------
arunoda
This is something I did with fast.ai -
[https://deeplearningmantra.com/](https://deeplearningmantra.com/)

------
daiyanze
Is this tutorial aimed at entry level? Those graphs are quite difficult. I
guess it will take a lot of time to do homework on some fundamental
curricula.

------
totorovirus
It is totally possible and easy to use TensorFlow/Torch if you haven't
skipped linear algebra classes. A PhD is needed if you are going for a job
where you design a sophisticated model (not just adding layers, but
experimenting with activations, attention, etc.).

~~~
s1t5
A PhD isn't even a requirement for doing the more advanced stuff. Obviously
you need a lot of math and ML specific knowledge but there's no reason why you
can't have that knowledge with an undergraduate math degree (for example).
Spending 3-6 years doing research in a very narrow and possibly unrelated
branch of mathematics will give you a PhD, but the linear algebra and
multivariable calculus that you actually need for the ML stuff are covered in
a bunch of undergrad/masters courses in mathematics, computer science,
engineering, physics etc.
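To give a sense of the level of math actually used day to day, fitting a linear model by gradient descent needs little more than matrix multiplication and one partial derivative (a minimal numpy sketch with made-up data):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))          # features
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w                         # noiseless targets for the demo

# Gradient descent on mean squared error: grad = 2/n * X^T (Xw - y),
# which is the multivariable-calculus derivative of ||Xw - y||^2 / n
w = np.zeros(3)
lr = 0.1
for _ in range(500):
    grad = 2 * X.T @ (X @ w - y) / len(y)
    w -= lr * grad
```

With noiseless data and a small learning rate, `w` converges to `true_w`; the same loop structure underlies much of deep learning training.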

~~~
lonesword
I second this. I have worked with a bunch of undergrads (I am pursuing a
masters degree in CS) and they had a thorough grasp of the math and could
really contribute to the research agenda of the group. When I did my undergrad
(in 2011-15), I ended up taking a lot of electronics/hardware courses. Turns
out undergrads these days just swap them with math/machine learning courses.
Good for them.

------
yumraj
Does anyone know what this site is generated with?

