
Open-Sourcing Bit: Exploring Large-Scale Pre-Training for Computer Vision - theafh
https://ai.googleblog.com/2020/05/open-sourcing-bit-exploring-large-scale.html
======
lsb
I've been going through the fast.ai course, and pre-training feels like
witchcraft: it's so spookily effective.

You can take a 15MB MobileNet model, add a layer at the end, fine-tune with
half a dozen examples of a few different image classes (in a few minutes on a
consumer-grade laptop), and recognize lots of different examples in real time
with a web app reading continuously from a webcam.
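The recipe above can be sketched in a few lines of tf.keras. This is a minimal illustration, not the exact fast.ai code: it assumes a MobileNetV2 backbone with ImageNet weights, freezes it, and bolts a new softmax head on top; the `num_classes` value and the commented-out `fit` call are placeholders for your own data.

```python
# Hedged sketch of transfer learning: freeze a pretrained MobileNetV2
# and train only a small new classification head on a handful of images.
import tensorflow as tf

def build_finetune_model(num_classes: int) -> tf.keras.Model:
    # Load the ImageNet-pretrained backbone without its original classifier.
    base = tf.keras.applications.MobileNetV2(
        input_shape=(224, 224, 3), include_top=False, weights="imagenet")
    base.trainable = False  # freeze the pretrained feature extractor

    model = tf.keras.Sequential([
        base,
        tf.keras.layers.GlobalAveragePooling2D(),
        # The one new layer "added at the end": a softmax over your classes.
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_finetune_model(num_classes=3)
# model.fit(train_images, train_labels, epochs=5)  # a few examples per class
```

Because only the final Dense layer is trainable, a handful of labeled images and a few minutes of laptop CPU time are often enough.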

The advances made in Computer Vision in the last ten years are mind blowing.

------
ricklamers
Interesting experiments. Too bad they're not releasing the JFT pretrained
models. I guess the cutoff point of what's too valuable to share has been
reached.

~~~
gwern
Google Brain has been pretty good about releasing models (especially compared
to, say, DeepMind), such as EfficientNet.

JFT is the exception. I find JFT interesting, so I pay close attention to
anything using it, and as far as I've noticed, no model trained on JFT has
ever been released, going back to at least 2015 when the dataset was much
smaller. It's always either held back, or the released model is based on
public datasets (e.g. BigGAN: the released generator was trained on ImageNet,
though the paper notes that the JFT-trained BigGAN completely avoided
divergence problems, which is very interesting). I've wondered if
legal/copyright issues block any release: there's always someone who tries to
argue that a model is a derived work, and nothing in the JFT-300M papers
mentions having licenses covering public redistribution.

~~~
jcjohns
I don't think Google has ever released models trained on JFT. But if you're
interested in large-scale vision models, you can check out these models from
Facebook trained on 940M Instagram images (several times bigger than JFT!)

[https://github.com/facebookresearch/WSL-Images](https://github.com/facebookresearch/WSL-Images)

------
londons_explore
No comments, probably because everyone is trying to fire up the demo code in
colab and trying to make "whose dick is it?" classifiers...

~~~
fxtentacle
Well, for my humble needs, the first layers of a pretrained VGG16 were already
good enough, so I have little use for yet another, even more resource-hungry
visual encoder.
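Using only the early layers of VGG16 as a cheap, generic feature encoder can be sketched like this with tf.keras. This is an illustration of the idea, not fxtentacle's actual code: the choice to cut at `block2_pool` (the end of the second conv block) is an assumption, and any earlier or later layer name from the VGG16 architecture would work the same way.

```python
# Hedged sketch: truncate a pretrained VGG16 after its second conv block
# and use the resulting sub-network as a fixed visual feature extractor.
import tensorflow as tf

# Load VGG16 without its fully connected classifier head.
vgg = tf.keras.applications.VGG16(include_top=False, weights="imagenet")

# Keep only the layers up to block2_pool (2 conv blocks, 128 channels).
encoder = tf.keras.Model(
    inputs=vgg.input,
    outputs=vgg.get_layer("block2_pool").output)
encoder.trainable = False  # use as a frozen encoder

# A 224x224 RGB input comes out as a 56x56 map of 128-dim features
# (two 2x2 max-pools halve the spatial resolution twice).
features = encoder(tf.zeros((1, 224, 224, 3)))
```

The early conv blocks capture generic edges and textures, which is often all a small downstream task needs, at a fraction of the compute of a full modern encoder.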

