
Deep Learning with PyTorch: A 60 Minute Blitz [video] - vyuh
https://pytorch.org/tutorials/beginner/deep_learning_60min_blitz.html
======
rayalez
For anyone who's interested in learning PyTorch, here's the best video course
I was able to find:

[https://www.youtube.com/playlist?list=PLZbbT5o_s2xrfNyHZsM6u...](https://www.youtube.com/playlist?list=PLZbbT5o_s2xrfNyHZsM6ufI0iZENK9xgG)

They explain things incredibly well, videos are easy to understand, engaging,
and to the point. Highly recommend it to everyone!

I've also heard that Udacity has some good courses, but I can't vouch for
those yet.

~~~
M5x7wI3CmbEem1O
Do you have any more recommendations?

I'm an undergrad student, and I'm nervous about picking between
TensorFlow+Keras and PyTorch.

It looks like many more companies are hiring for TensorFlow, and there's a
wealth of information out there on learning ML with it. In addition, it just
got the 2.0 update.

But, PyTorch is preferred nearly every single time when I see the discussion
come up on HN and Google searches. I'm having a hard time deciding what to
dedicate my time to.

~~~
cmarschner
Abstract from the tools. They come and go. You will need to adopt a new one
every other year.

Instead, make sure to understand the math and the concepts, and then it's easy
to translate that to an implementation.

One way of doing this (though not sufficient) is to learn both tools.

Right now the pull is away from TF (increasingly convoluted API and lots of
deprecations) and towards pytorch (more support from the research community
and increasing performance in production).

------
vyuh
I posted this link but now the title has somehow changed. I do not know what
the policy on HN is. But the title saying "[video]" might give the wrong
impression that this points to a one hour long video. The link points to a
tutorial which embeds an _entirely optional_ two minute video that introduces
the main content contained in five web pages.

~~~
jdoliner
I was very confused when I clicked the link, spent a while looking for the
full video.

------
amelius
One thing I've noticed is that it's quite hard to have vibrant discussions
about DL because it is all either so simple or it is dauntingly
complicated/unpredictable. Mostly my DL conversations end up being about
frameworks. Anyone else experience this?

Also the number of DL submissions on HN seems surprisingly low given the
applicability of the technology.

~~~
solidasparagus
It's pretty easy when you're talking to people who understand the fundamentals
of deep learning, but that understanding isn't very common even on HN. I think
that's because the real-world, valuable usecases of DL are not very
accessible:

(a) DL is pretty complicated in a way that's unfamiliar to most software
engineers. You are consistently working with Tensors that have a couple more
dimensions than people are used to holding in their heads (i.e. images mean
you are typically working with 4D Tensors).

(b) You learn from academic papers, not blogs. It's a new workflow for many
software people and intimidating to some (although the papers are usually
closer to blog posts than rigorous academic papers).

(c) It's very difficult to learn deep learning on your own without it getting
pretty expensive. Advanced uses pretty much require GPUs/TPUs and that's
either a big upfront purchase or a serious per-experiment cost.

(d) Deep Learning is not a single field. It is CV, NLP, RL, speech recognition
and probably others I'm forgetting about. They overlap, but it further reduces
the number of people you can have informed discussions with because being
knowledgeable about computer vision does not mean you are able to have a
vibrant discussion about NLP.
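
To make point (a) concrete, here is a minimal sketch in plain Python (no framework needed) of what a 4D image batch looks like. The batch size and image dimensions are made-up values, and `flat_offset` is a hypothetical helper written for this illustration. The ordering assumed is (batch, channels, height, width), which is what PyTorch's convolution layers expect by default.

```python
from math import prod

# Hypothetical batch: 8 RGB images of 32x32 pixels, in (N, C, H, W) order.
shape = (8, 3, 32, 32)


def flat_offset(index, shape):
    """Return the row-major (C-order) offset of a multi-dimensional index."""
    offset = 0
    for i, dim in zip(index, shape):
        if not 0 <= i < dim:
            raise IndexError(f"index {i} out of range for dimension of size {dim}")
        offset = offset * dim + i
    return offset


# Total number of scalar values stored for the whole batch.
num_values = prod(shape)

# Pixel at row 5, column 7 of the green channel (c=1) of image number 2.
offset = flat_offset((2, 1, 5, 7), shape)
```

Every value in the batch is addressed by four indices at once, which is the part that tends to feel unfamiliar compared with the 1D and 2D arrays most software engineers work with.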

~~~
fossuser
Could you list some of the places you look to learn? Where do you find
relevant academic papers? It's hard to find information about what this
learning workflow is.

Any recommended resources?

------
abledon
Ugh, it's so easy compared to what I've been wrangling in TensorFlow.

~~~
ftufek
Keep in mind that tutorials will always make it look easy compared to
debugging actual production code. If you look through tensorflow tutorials,
they also look very easy, especially with TF2.

That said, I've experimented with pytorch and I agree that it is really nice
to work with.

Disclaimer: I work at Google and do use tensorflow, though I don't work on the
tensorflow team.

~~~
m0zg
PyTorch is 10x easier to debug than even TF2, and it's been that way all
along. TF2 is no easier to debug than the previous releases if you're not
using eager mode (which most people don't), and even in eager mode it
sometimes errors out in ways that do not offer any suggestion as to _which op_
caused the error. This is nuts. Modern architectures have hundreds, sometimes
thousands of ops. It basically boils down to flying blind and guessing, and can
easily take days of trial and error to figure out each issue. Plus, every time
you start a TF program it just sort of sits there for a minute or so before it
starts doing anything. This severely hampers productivity when debugging.

To all the folks who are just starting out: just go with PyTorch. It's
downright intuitive compared to anything Google has been able to put out so
far.

Disclosure: ex-Googler. Used TF while there (and DistBelief before it). Gave
it up as soon as PyTorch came out. Couldn't be happier.

------
spicyramen
Very good that PyTorch emerged as a serious contender to TF. While TF still
provides more production-grade tools (TFX, TensorRT, TF Serving), PyTorch
continues to evolve, and I hope we soon have a more complete ecosystem.

~~~
ibab
I really like JAX as well:
[https://github.com/google/jax](https://github.com/google/jax). It's younger
than PyTorch and TF, but feels cleaner and more expressive. It has a very nice
autodiff implementation (based on
[https://github.com/HIPS/autograd](https://github.com/HIPS/autograd)) and
performance is comparable to TF in my experience.

~~~
solidasparagus
It feels like JAX doesn't have any of the high-level APIs that PT/TF/MXNet
have, which are vital for fast prototyping of model architectures. Is that
correct?

~~~
ibab
It has stax, which is a minimal example of how to build a high level library:
[https://github.com/google/jax/blob/master/jax/experimental/s...](https://github.com/google/jax/blob/master/jax/experimental/stax.py)

It seems that the JAX developers are focusing their time on making the core
framework better and are leaving the task of building high-level APIs to the
community for now. I suspect we'll see a few high-level APIs emerge over the
next few months that explore different approaches before the community settles
on a particular one.

~~~
solidasparagus
I hope not. That's part of what makes TF so miserable - the core library
didn't provide the tooling people actually needed so the community built a ton
of different tools and it just made TF confusing to use.

------
sillysaurusx
Is there a drop-in replacement for TensorBoard? It’s probably the biggest
thing keeping me using TensorFlow. Ideally the API of the PyTorch equivalent
would be about the same too.

I answered my own comment before posting it. But in case it’s helpful to
anyone else, I’ll put the answer here: yes, TensorBoardX. Looks like it’s very
easy to use:
[https://tensorboardx.readthedocs.io/en/latest/tutorial.html](https://tensorboardx.readthedocs.io/en/latest/tutorial.html)

Anyone have thoughts on TF2.0 vs pytorch? Over on Twitter people seem to be
pretty hyped about TF2.0, but when I tried learning it it just felt... not
very fun. I need to give it a fair shot though.

~~~
mathusuthan
PyTorch supports logging to TensorBoard too. More details can be found at
[https://pytorch.org/docs/stable/tensorboard.html](https://pytorch.org/docs/stable/tensorboard.html)

------
emilfihlman
Whoever changed the title did a bad job.

------
faizshah
Anyone know somewhere that has a good overview of the various ML and DL model
types and what they are good for? I've been looking for a survey paper or book
or just a glossary of ML.

~~~
sillysaurusx
When you hear autoregressive model, think “predicting a sequence”. These are
good for text to speech since you can say “given some text, generate a
spectrogram.” GPT-2 is probably the most impressive example of autoregressive
techniques (I think).
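
As a rough illustration of "predicting a sequence", here is a toy sketch in plain Python. It is only a bigram counter, not a neural network, and the corpus and function names are made up for this example. A real model such as GPT-2 conditions on far more context than one previous token, but the generation loop has the same shape: sample the next token given what came before, append it, repeat.

```python
import random


def train_bigram(tokens):
    """Record, for each token, the tokens that followed it in the training text."""
    table = {}
    for prev, nxt in zip(tokens, tokens[1:]):
        table.setdefault(prev, []).append(nxt)
    return table


def generate(table, start, length, seed=0):
    """Autoregressive sampling: each new token is drawn given the previous one."""
    rng = random.Random(seed)
    out = [start]
    while len(out) < length:
        followers = table.get(out[-1])
        if not followers:  # no observed continuation, so stop early
            break
        out.append(rng.choice(followers))
    return out


# Hypothetical toy corpus.
corpus = "the cat sat on the mat and the cat ran".split()
table = train_bigram(corpus)
sample = generate(table, "the", 6)
```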

GANs, and especially StyleGAN, are good for generating high quality images up
to 1024x1024. These take about 5 weeks to train and $1k of GCE credits. The
dataset size is around 70k photos for FFHQ. Mode collapse is a concern, which
is when the discriminator wins the game and the generator fails to generate
anything that can fool it. StyleGAN has some built in techniques to combat
this. IMLEs recently showed that mode collapse can be solved without GANs at
all.

Hmm.. what else... I’ll update this as I think of stuff. Any questions?

EDIT: Regarding IMLE vs GAN, here are some resources:

Mode collapse solved (original claim):
[https://twitter.com/KL_Div/status/1168913453744103426](https://twitter.com/KL_Div/status/1168913453744103426)

Overview of mode collapse, why it occurs, and how to solve it with IMLE:
[https://people.eecs.berkeley.edu/~ke.li/papers/imle_slides.p...](https://people.eecs.berkeley.edu/~ke.li/papers/imle_slides.pdf)

Paper + code:
[https://people.eecs.berkeley.edu/~ke.li/projects/imle/scene_...](https://people.eecs.berkeley.edu/~ke.li/projects/imle/scene_layouts/)

Some simple code for reproducing IMLE from scratch (I haven't seen this
referenced many other places; stumbled onto it by accident):
[https://people.eecs.berkeley.edu/~ke.li/projects/imle/](https://people.eecs.berkeley.edu/~ke.li/projects/imle/)

Super resolution with IMLE:
[https://people.eecs.berkeley.edu/~ke.li/projects/imle/superr...](https://people.eecs.berkeley.edu/~ke.li/projects/imle/superres/)

For comparing images, I believe they use the standard VGG perceptual loss
metric that StyleGAN uses. (See section 3.5 of
[https://arxiv.org/pdf/1811.12373.pdf](https://arxiv.org/pdf/1811.12373.pdf))

It seems to me that the main disadvantage of IMLE is that you might not get
any latent directions that you get with StyleGAN. E.g. I'm not sure you could
"make a photograph smile" the way you can with StyleGAN. But in the paper,
they show that you can at least interpolate between two latents in much the
same way, and the interpolations look pretty solid.

~~~
mistrial9
I found a strange bifurcation recently while collecting papers on a sub-topic
of this question: China-based authors quoting other China-based authors
extensively, in English with math, of course. Meanwhile, the US and Western EU
papers seem like "it"; in other words, all the papers referenced seem like the
ones you would reference, etc. Each group is self-consistent.

~~~
cmendel
One of the incredibly unfortunate things about science out of China. It may or
may not be trustworthy, as in the data may be just straight false. I'm not
surprised that you saw that split, I'd be leery of quoting/referencing a
potentially false paper myself.

------
BillFranklin
Does PyTorch have a learn to rank module? Tensorflow released a ranking module
earlier this year, but I’d like to try out PyTorch.

~~~
geraltofrivia
Not as far as I know. It does have max-margin loss [1], which is pretty much all
you need to implement a neural ranking model, apart from data iterators, and
training loops.

[1]
([https://pytorch.org/docs/stable/nn.html?highlight=margin%20l...](https://pytorch.org/docs/stable/nn.html?highlight=margin%20loss#torch.nn.MarginRankingLoss))
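
For reference, the quantity that loss computes can be sketched in a few lines of plain Python. This follows the formula given in the `torch.nn.MarginRankingLoss` documentation, max(0, -y * (x1 - x2) + margin), averaged over pairs as in the default "mean" reduction. It is not the PyTorch implementation, and the scores below are made-up numbers.

```python
def margin_ranking_loss(scores_a, scores_b, targets, margin=0.0):
    """Mean of max(0, -y * (a - b) + margin) over all pairs.

    y = 1 means a should be ranked above b; y = -1 means the opposite.
    """
    losses = [
        max(0.0, -y * (a - b) + margin)
        for a, b, y in zip(scores_a, scores_b, targets)
    ]
    return sum(losses) / len(losses)


# Hypothetical model scores for relevant and irrelevant documents per query.
relevant = [2.0, 0.5, 1.0]
irrelevant = [1.0, 1.5, 0.75]

# Every target is 1: the relevant document should outscore the irrelevant one.
loss = margin_ranking_loss(relevant, irrelevant, [1, 1, 1], margin=0.5)
```

Pairs that are already ordered correctly by at least the margin contribute zero, so only the violating pairs drive the gradient in a real training loop.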

------
theemathas
As a chess player, "60 minute blitz" sounds very wrong.

------
__Asturias__
Does no one build their own ML algos anymore? I don't understand the need for
PyTorch and TensorFlow. I honestly thought TensorFlow was nothing but a
teaching thing for undergrads.

~~~
kiloreux
Not all of us need to build their own ML algos. Just in the same way that not
all of us need to build their sorting libraries or data structures. Some
people specialize in this in order to develop and do research, while other
software engineers just want something they can use without much hassle and
with just a superficial understanding.

~~~
morningseagulls
>Not all of us need to build their own ML algos. Just in the same way that not
all of us need to build their sorting libraries or data structures.

And yet they love to ask you to do exactly that at technical interviews...
coming up next: what ML algos you need to know to ace that interview.

