They explain things incredibly well; the videos are easy to understand, engaging, and to the point. Highly recommend it to everyone!
I've also heard that Udacity has some good courses, but I can't vouch for those yet.
I'm having to learn this framework for a course assignment, and I feel a lot better about it now than I did after going through the OP.
Thanks for sharing!
I'm an undergrad student, and I'm nervous about choosing between TensorFlow+Keras and PyTorch.
It looks like many more companies are hiring for TensorFlow, and there's a wealth of information out there on learning ML with it. In addition, it just got the 2.0 update.
But PyTorch is preferred nearly every time I see the discussion come up on HN or in Google searches. I'm having a hard time deciding what to dedicate my time to.
Instead, make sure to understand the math and the concepts, and then it's easy to translate that to an implementation.
One way of doing this (though not sufficient) is to learn both tools.
Right now the pull is away from TF (increasingly convoluted API and lots of deprecations) and towards pytorch (more support from the research community and increasing performance in production).
At the end of the course you will be able to implement almost any state-of-the-art ML solution (classification, regression, and computer vision).
Sounds too good to be true? Jeremy has that effect. The other day on a podcast they called him a saint.
It's free btw.
Also the number of DL submissions on HN seems surprisingly low given the applicability of the technology.
(a) DL is pretty complicated in a way that's unfamiliar to most software engineers. You are consistently working with tensors that have a couple more dimensions than people are used to holding in their heads (e.g. with images you are typically working with 4D tensors; see the small sketch after this list).
(b) You learn from academic papers, not blogs. It's a new workflow for many software people and intimidating to some (although the papers are usually closer to blog posts than rigorous academic papers).
(c) It's very difficult to learn deep learning on your own without it getting pretty expensive. Advanced uses pretty much require GPUs/TPUs and that's either a big upfront purchase or a serious per-experiment cost.
(d) Deep Learning is not a single field. It is CV, NLP, RL, speech recognition and probably others I'm forgetting about. They overlap, but it further reduces the number of people you can have informed discussions with because being knowledgeable about computer vision does not mean you are able to have a vibrant discussion about NLP.
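Re (a), here's a concrete picture of what "a couple more dimensions" means in practice. This is PyTorch with made-up shapes, just to illustrate:

    import torch

    # A mini-batch of 32 RGB images at 224x224 is already a 4D tensor.
    images = torch.randn(32, 3, 224, 224)   # (batch, channels, height, width)

    # A conv layer consumes and produces 4D tensors; video or sequence models
    # tack on yet another dimension.
    conv = torch.nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3)
    features = conv(images)                  # shape: (32, 16, 222, 222)
    print(features.shape)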
Any recommended resources?
These are “hands on” in the sense that you can replicate the results just by pasting in the same code. It’s kind of like a tutorial notebook in essay form.
Speaking of tutorial notebooks, pbaylies’ stylegan-encoder is quite good and you can run it on colab: https://colab.research.google.com/github/pbaylies/stylegan-e...
(Set runtime to GPU up in the menu.)
In my experience the best place to have informal ai discussions is Twitter. The community is shockingly helpful. Follow @jonathanfly, @roadrunning01, @pbaylies and whoever pops up in the stuff they post. Roadrunning in particular posts tweets of the form “here’s some research; here’s the code” often with an interactive notebook.
That said, I've experimented with pytorch and I agree that it is really nice to work with.
Disclaimer: I work at Google and do use tensorflow, though I don't work on the tensorflow team.
To all the folks who are just starting out: just go with PyTorch. It's downright intuitive compared to anything Google has been able to put out so far.
Disclosure: ex-Googler. Used TF while there (and DistBelief before it). Gave it up as soon as PyTorch came out. Couldn't be happier.
It seems that the JAX developers are focusing their time on making the core framework better and are leaving the task of building high-level APIs to the community for now.
I suspect we'll see a few high-level APIs emerge over the next few months that explore different approaches before the community settles on a particular one.
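For anyone wondering what "core framework" means here: JAX is basically numpy plus composable function transformations like grad and jit, and you wire up models as plain functions. A toy example (nothing official, just the general shape):

    import jax
    import jax.numpy as jnp

    # A model is just a function of (params, inputs); no Layer/Model classes.
    def loss(params, x, y):
        w, b = params
        pred = x @ w + b
        return jnp.mean((pred - y) ** 2)

    grad_fn = jax.jit(jax.grad(loss))    # compiled gradient w.r.t. params

    params = (jnp.ones((3, 1)), jnp.zeros(1))
    x, y = jnp.ones((8, 3)), jnp.zeros((8, 1))
    grads = grad_fn(params, x, y)        # same (w, b) structure as params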
I answered my own comment before posting it. But in case it’s helpful to anyone else, I’ll put the answer here: yes, TensorBoardX. Looks like it’s very easy to use: https://tensorboardx.readthedocs.io/en/latest/tutorial.html
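For reference, the basic usage from that tutorial boils down to something like this (tag name and values are made up):

    from tensorboardX import SummaryWriter

    writer = SummaryWriter()                  # writes event files to ./runs/ by default

    for step in range(100):
        fake_loss = 1.0 / (step + 1)          # stand-in for your real training loss
        writer.add_scalar("train/loss", fake_loss, step)

    writer.close()
    # then:  tensorboard --logdir runs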
Anyone have thoughts on TF2.0 vs pytorch? Over on Twitter people seem to be pretty hyped about TF2.0, but when I tried learning it, it just felt... not very fun. I need to give it a fair shot though.
GANs, and especially StyleGAN, are good for generating high quality images up to 1024x1024. These take about 5 weeks and roughly $1k of GCE credits to train. The dataset size is around 70k photos for FFHQ. Mode collapse is a concern: the generator collapses to producing only a narrow set of outputs instead of covering the full data distribution, so the samples lose diversity. StyleGAN has some built-in techniques to combat this. IMLE recently showed that mode collapse can be avoided without GANs at all.
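If it helps to see the generator/discriminator "game" written down, here's a bare-bones GAN training loop on toy 2D data (nothing StyleGAN-specific; all shapes and hyperparameters are invented):

    import torch
    import torch.nn as nn

    def real_batch(n=64):
        # toy "real" distribution: a 2D Gaussian blob
        return torch.randn(n, 2) * 0.5 + torch.tensor([2.0, 2.0])

    G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 2))   # latent -> point
    D = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))   # point -> logit

    opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
    opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
    bce = nn.BCEWithLogitsLoss()

    for step in range(2000):
        # discriminator: learn to tell real points from generated ones
        real, fake = real_batch(), G(torch.randn(64, 8)).detach()
        d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake), torch.zeros(64, 1))
        opt_d.zero_grad(); d_loss.backward(); opt_d.step()

        # generator: try to fool the discriminator
        g_loss = bce(D(G(torch.randn(64, 8))), torch.ones(64, 1))
        opt_g.zero_grad(); g_loss.backward(); opt_g.step()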
Hmm.. what else... I’ll update this as I think of stuff. Any questions?
EDIT: Regarding IMLE vs GAN, here are some resources (plus a minimal sketch of the IMLE idea after the list):
Mode collapse solved (original claim): https://twitter.com/KL_Div/status/1168913453744103426
Overview of mode collapse, why it occurs, and how to solve it with IMLE: https://people.eecs.berkeley.edu/~ke.li/papers/imle_slides.p...
Paper + code: https://people.eecs.berkeley.edu/~ke.li/projects/imle/scene_...
Some simple code for reproducing IMLE from scratch (I haven't seen this referenced many other places; stumbled onto it by accident): https://people.eecs.berkeley.edu/~ke.li/projects/imle/
Super resolution with IMLE: https://people.eecs.berkeley.edu/~ke.li/projects/imle/superr...
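The core IMLE idea is small enough to sketch: for each real example, sample a handful of latents, keep the generated sample nearest to it, and pull that sample toward the data point. Since every data point gets matched to some sample, no mode can simply be dropped. Toy version (sizes and architecture invented, not the paper's setup):

    import torch
    import torch.nn as nn

    data = torch.randn(500, 2) * 0.5 + torch.tensor([2.0, 2.0])   # toy "real" points
    G = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 2))
    opt = torch.optim.Adam(G.parameters(), lr=1e-3)

    for step in range(1000):
        x = data[torch.randint(len(data), (32,))]     # batch of real points
        z = torch.randn(32, 20, 8)                    # 20 candidate latents per real point
        with torch.no_grad():
            samples = G(z)                            # (32, 20, 2) candidate generations
            dists = ((samples - x[:, None, :]) ** 2).sum(-1)
            nearest = dists.argmin(dim=1)             # index of nearest sample per real point
        z_star = z[torch.arange(32), nearest]         # latents of those nearest samples
        loss = ((G(z_star) - x) ** 2).sum(-1).mean()  # pull nearest sample toward its data point
        opt.zero_grad(); loss.backward(); opt.step()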
For comparing images, I believe they use the standard VGG perceptual loss metric that StyleGAN uses. (See section 3.5 of https://arxiv.org/pdf/1811.12373.pdf)
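In case it's useful, a VGG perceptual loss is roughly "run both images through a frozen pretrained VGG and compare the feature maps." A rough torchvision sketch (the exact layers and weighting vary across papers):

    import torch.nn.functional as F
    import torchvision.models as models

    # frozen feature extractor: first few conv blocks of a pretrained VGG16
    vgg = models.vgg16(pretrained=True).features[:16].eval()
    for p in vgg.parameters():
        p.requires_grad_(False)

    def perceptual_distance(a, b):
        # a, b: image batches of shape (N, 3, H, W), already VGG-normalized
        return F.mse_loss(vgg(a), vgg(b))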
It seems to me that the main disadvantage of IMLE is that you might not get any latent directions that you get with StyleGAN. E.g. I'm not sure you could "make a photograph smile" the way you can with StyleGAN. But in the paper, they show that you can at least interpolate between two latents in much the same way, and the interpolations look pretty solid.
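Concretely, interpolation is just blending two latent codes, whereas the "make it smile" trick needs a learned direction vector on top of that, and it's that extra ingredient which may not exist for IMLE. Toy numbers only; `smile_direction` here is a hypothetical placeholder, not something from either paper:

    import numpy as np

    z1, z2 = np.random.randn(512), np.random.randn(512)   # two latent codes

    # interpolation: works for any generator that maps latents to images
    frames = [(1 - t) * z1 + t * z2 for t in np.linspace(0.0, 1.0, 10)]

    # latent-direction edit: needs a learned direction (hypothetical here)
    smile_direction = np.random.randn(512)
    edited = z1 + 2.0 * smile_direction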
A GAN does NOT need an explicit distance metric for comparing images; instead, the discriminator effectively learns the metric as it improves its ability to distinguish real images from generated/fake ones. Arguably this is the whole advantage of GANs.
So to argue that IMLE can solve mode collapse is a false equivalency.
Edit: If your algorithm is not using neural networks, then libraries like TF may or may not be a good fit; it depends on the algorithm.
Writing custom low-level code can still make sense in those cases.
Although the endpoint is likely to be a better understanding of the choices made by a mature implementation, and of the work involved in fixing up edge cases.
And yet they love to ask you to do exactly that at technical interviews... coming up next: what ML algos you need to know to ace that interview.
What I have to say is this: please don't build your own.
Some people do. It's a good challenge.
I had a lesson in writing crypto once, when I made what I thought was a good enough secret mixing procedure to encode some data I wanted to email outside of a company that didn’t allow web access. (Long time ago, circa 2000). It all looked undecipherable and I sent most of the data before I discovered that strings of binary zero were leaking my secret key. Oops, pretty stupid.
D'oh! Good point though lol.
But the parent of the comment I was replying to clearly had the former in mind, as a subsequent comment showed.
I am sure you could write stuff like Differentiable Processors or the like from scratch with numpy, but if you respect yourself and your time, you won't. Complicated architectures are orders of magnitude harder than writing feed-forward networks from scratch. For example, see the Merlin paper.
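To be fair, the feed-forward-from-scratch exercise really is small; something like this (toy data, plain numpy, hand-derived gradients) is about as far as it stays pleasant:

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(128, 4))
    y = (X.sum(axis=1, keepdims=True) > 0).astype(float)   # toy binary labels

    W1, b1 = rng.normal(size=(4, 16)) * 0.1, np.zeros(16)
    W2, b2 = rng.normal(size=(16, 1)) * 0.1, np.zeros(1)
    lr = 0.1

    for step in range(500):
        # forward pass
        h = np.maximum(X @ W1 + b1, 0.0)                    # ReLU hidden layer
        p = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))            # sigmoid output

        # backward pass, derived by hand -- already tedious for two layers
        dlogits = (p - y) / len(X)                          # grad of BCE w.r.t. logits
        dW2, db2 = h.T @ dlogits, dlogits.sum(axis=0)
        dh = dlogits @ W2.T
        dh[h <= 0] = 0.0                                    # ReLU gradient
        dW1, db1 = X.T @ dh, dh.sum(axis=0)

        W1 -= lr * dW1; b1 -= lr * db1
        W2 -= lr * dW2; b2 -= lr * db2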