
Facebook open-sources Detectron - rmason
https://research.fb.com/facebook-open-sources-detectron/
======
evmar
Also noteworthy: the Apache 2 license, which includes a patent grant (unlike
the previous Facebook licenses that have caused concern in the past).

~~~
haeffin
caffe2 (which this is built on) was switched from bsd+patents to Apache2 a
while ago too.

~~~
zitterbewegung
I think that if a project FB uses upstream is under Apache 2, or if you are a
large organization (for example, the Apache Software Foundation), you can use
your clout to force Facebook to use Apache.

~~~
traek
Caffe2 is a Facebook project, so they’re not exactly being “forced” to release
it or any downstream projects under Apache 2.

------
megous
So is this the end of Google captchas asking for where the car/sign/whatever
is? Will there be a final battle of AIs, where they will kill each other, and
the unfettered access to websites over VPN/tor wins and laughs the last laugh?

~~~
kurtisc
For the car captchas, I've found actually clicking all the boxes with part of
a car will always be a wrong answer (distinct from when it just makes you
answer twice). Instead, you have to click on the squares that you know it
thinks are cars.

This creates a twisted Turing test situation where, to prove you are a human,
you have to pretend to be a machine's idea of what a human is.

~~~
wott
I have more problems with bridges than with cars. The damn thing forces you to
select 3 bridges, except that there are only 2... So you are forced to select
what it thinks is a bridge, and reinforce its erroneous bias even more.

~~~
Chaebixi
Storefronts are difficult too; I don't think I'm ever good enough at those to
satisfy it. The most reasonable one seems to be street signs, but I think it
fails me for not flagging the _unpainted back_ of one.

------
yohann305
My overall feeling, as someone who wants to start getting into visual
recognition, is that there are a bunch of great libraries/ecosystems to choose
from and all of them have pros and cons, but I honestly don't want to make the
wrong decision and end up stuck later on. Does anyone here have any advice on
what I should use to have a camera (RPi) recognize most common objects and then
add a layer where we can teach it specific objects (e.g. putting a name on a
person or a pet)? Thank you!

~~~
2bitencryption
unless I'm mistaken, this is the very first lesson of the fast.ai course.

~~~
apendleton
The first lesson is image classification ("is this a picture of a cat or a
dog?"). Given that OP is commenting on an object detection library release,
though, I assume they're interested in object
recognition/detection/segmentation rather than just image classification.
So, more like: "what things are in this image and where are they?" or even
just "where are the dogs in this image?"

That's also covered eventually in fast.ai, but not until the second course if
memory serves.
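
To make the distinction above concrete, here is a minimal sketch of what the two kinds of output might look like. The `Detection` class, the `find` helper, and all the labels, scores, and boxes are made up for illustration; they are not the output format of Detectron, fast.ai, or any other library.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    """One detected object (hypothetical structure, for illustration only)."""
    label: str
    score: float                      # model confidence in [0, 1]
    box: Tuple[int, int, int, int]    # (x_min, y_min, x_max, y_max) in pixels


# Image classification: a single answer for the whole image.
classification = ("dog", 0.93)

# Object detection: a list of labeled, scored boxes for the same image.
detections = [
    Detection("dog", 0.91, (40, 60, 200, 220)),
    Detection("dog", 0.55, (220, 80, 300, 210)),
    Detection("cat", 0.88, (310, 90, 400, 200)),
]


def find(label: str, results: List[Detection],
         min_score: float = 0.5) -> List[Tuple[int, int, int, int]]:
    """Answer "where are the <label>s?", ignoring low-confidence boxes."""
    return [d.box for d in results
            if d.label == label and d.score >= min_score]


dog_boxes = find("dog", detections)
confident_dog_boxes = find("dog", detections, min_score=0.9)
```

Raising `min_score` trades missed objects for fewer false positives, so the threshold is a choice the caller has to make rather than something the scores settle on their own.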

------
newscracker
This looks amazing from the computing point of view (and is an achievement of
sorts), yet the confidence percentages are lower than what an average human
might be able to solve for (like in the case of CAPTCHA tests).

Meanwhile, I wonder about the human costs if systems like these are adopted
for purposes they may be ill-suited for, especially cases where their
confidence scores are ignored (or mistakenly assumed to be 100% even when
they're lower). Does anyone have reading material on this?

~~~
notyourwork
Care to give any examples of scenarios you are concerned about?

~~~
newscracker
My primary worry is law enforcement and government surveillance considering
these systems as infallible and making judgments or life-changing decisions
based on interpretations like this from computers. Computing has improved our
lives a lot, but sometimes I feel there's an air of overconfidence that
clouds our judgment.

------
franciscop
Does anyone know an alternative that works on Raspberry Pi? This states:
"Detectron operators currently do not have CPU implementation; a GPU system is
required."

Even low FPS (3-5) would be acceptable.

~~~
m_ke
You could try the TensorFlow Object Detection API with TensorFlow Lite:

[https://github.com/tensorflow/models/tree/master/research/ob...](https://github.com/tensorflow/models/tree/master/research/object_detection)

Google also recently put up their MobileNetV2 paper, which handles
segmentation:
[https://arxiv.org/abs/1801.04381](https://arxiv.org/abs/1801.04381)

~~~
deepGem
+1 for the Google object detection API. The trained model is quite large,
though: 200 MB for the ResNet-based Faster R-CNN. There are creative ways of
chunking this model to keep it small.

~~~
m_ke
I think they have a mobilenet SSD model as well.

------
drdrey
> Beyond research, a number of Facebook teams use this platform to train
> custom models for a variety of applications including augmented reality and
> community integrity.

Any idea what they mean by "community integrity"?

~~~
readams
detecting porn, presumably.

~~~
adventured
I would expect it to work on detecting the mismatching of content for types of
communities in general. For example, preventing pictures of cats or giraffes
being uploaded as a product photo on Poshmark when it's supposed to be a pair
of shoes.

That type of check should become standard in a short amount of time for all
communities that accept photos and aren't meant to be general purpose (unlike,
e.g., Imgur).

~~~
schrep
For example, Marketplace (where you can sell items on Facebook) will suggest a
category for the item if you upload a photo.

~~~
paulie_a
I can't wait for Facebook Marketplace to fail. The constant stream of useless
ads is obnoxious. I am using Facebook less because there is no way to block
that low-quality content.

------
JepZ
Does anybody know a way to run CUDA programs with the open-source driver
(nouveau)?

~~~
yorwba
CUDA needs driver support to talk to the GPU, and since it is proprietary
NVIDIA technology, the open-source driver can't support it. So either you run
the NVIDIA driver or you have to use OpenCL.

------
aaroninsf
Can someone with GPUs and love in their hearts bundle this with trained
models in a Docker container?

(serious request... I got a cluster, and something like a million pictures;
but no GPUs or time for another side project...)

~~~
burningion
I’ve started working on this. It seems the current Dockerfile for caffe2
doesn’t work out of the box because of a forced push.

Follow me on Twitter, and I’ll post it there when it’s finished. Same username
as here.

* edit: I've put a pull request in that builds the Dockerfile for the GPU for now: [https://github.com/facebookresearch/Detectron/pull/15](https://github.com/facebookresearch/Detectron/pull/15)

------
candlefather
Reminds me of this pic from Terminator
[https://s3.amazonaws.com/pbblogassets/uploads/2015/08/Termin...](https://s3.amazonaws.com/pbblogassets/uploads/2015/08/Terminator-POV.jpg)

~~~
ry_ry
Accompanied by endless banner ads for your clothes, your boots and your
motorcycle.

------
yters
Is there a class of math problem humans can solve but computers cannot? Then
we could just use these problems as a guaranteed test instead of the current
CAPTCHA arms race.

~~~
m_ke
There is no arms race. 99.9% of the time Google knows if you're a robot based
on your browser state. They make you label the images because it's a free way
to get training data.

~~~
gldalmaso
I see this argument a lot and believed it too, but I have to wonder
if that really is the case. Wouldn't Google be empowering people to really
mess up their training data?

If I'm trying to automate a system to fool their captcha, I'm probably feeding
them a lot of bad results. Or I could just be intentionally feeding them bad
data: since failing the captcha just lets me submit more and more inputs,
someone could keep doing that for as long as they like.

I don't know, maybe I'm missing something.

~~~
kabes
I believe they cross-check between different users. I believe it used to work
something like this on the old reCAPTCHA system (the one with the words or
house numbers): They show you two items. One is known to reCAPTCHA; the other
one isn't. If you enter the known one correctly, you can enter. The unknown one
is presented to other users, and when there is enough consensus amongst users
it is promoted to a known one. So it's hard to mess with the system as an
individual.
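
The scheme described above can be sketched roughly as follows. This is only an illustration of the idea as the comment describes it, not how reCAPTCHA is actually implemented; the `submit` function, the item IDs, and the `CONSENSUS_VOTES` threshold are all hypothetical.

```python
from collections import Counter

# Hypothetical threshold: matching answers needed to promote an unknown item.
CONSENSUS_VOTES = 3

known = {"img_a": "42"}           # items that already have a trusted answer
pending = {"img_b": Counter()}    # unknown items and the answers seen so far


def submit(known_id, known_answer, unknown_id, unknown_answer):
    """Return True if the user passes; record their answer for the unknown item."""
    if known.get(known_id) != known_answer:
        # Failed the control item, so the answer for the unknown item is
        # discarded. This is what limits the damage one bad actor can do.
        return False

    votes = pending.get(unknown_id)
    if votes is None:
        return True  # nothing left to learn about this item

    votes[unknown_answer] += 1
    answer, count = votes.most_common(1)[0]
    if count >= CONSENSUS_VOTES:
        # Enough users agree: promote the item to the known set.
        known[unknown_id] = answer
        del pending[unknown_id]
    return True
```

In this sketch a single user could still vote several times, so a real system would presumably also need to count distinct users rather than raw submissions.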

------
ibdf
I was just looking into trying out YOLO. Does anyone know how the two compare?

------
xiphias
I'm happy that tech companies are open sourcing basic research all the time,
and thinking a lot about what would have happened if large pharmaceutical companies
did the same thing. I'm just hopeful that with new biotech companies the
science behind curing people will get faster as well.

~~~
ejstronge
Unlike the case in tech, pharma basic research is far less important in
advancing our knowledge when compared to academia. A good example comes from
the last few blockbuster cancer therapies - CAR-T cells and checkpoint
blockade all arose in academic labs.

Also, for drugs that do make it to market, efficacy and side effect
information is published as a condition of drug approval, at least for new
drugs.

Whether basic science research papers should be behind a paywall is a wholly
separate issue, but the life science community largely shares its finished
products. Indeed, there’s even a push to share early stage data, too.

------
wazoox
This is relying on proprietary CUDA technology. This doesn't qualify as Free
Software to me.

~~~
zer0t3ch
Who said it was Free Software?

------
stmw
This is great! I do wish this were written in something other than Python.
What is the carbon footprint of all this computer vision, compute-intensive
code still being run billions of times a day in Python? Someone should
calculate...

~~~
stmw
It is funny to see this comment get "-4" already... What's so offensive? After
all, Facebook has rocksdb in C++, percona in java, and a PHP->C++ compiler, so
they clearly have both the belief and the skill in moving away from
interpreted programming languages for performance-sensitive code.

~~~
linkmotif
For some reason, people are offended by gross misunderstandings. This
framework is in Python, but it’s a Python binding that sets up code that runs
natively (I'm not even sure of the details myself; others are writing CUDA).
TensorFlow is the same way. It’s in Python, but the computations are not in
Python. As you point out, that wouldn’t make sense.

~~~
stmw
Having written Python -> C bindings before, I am quite aware of how this
works... It still has considerable overhead.

~~~
linkmotif
But these aren’t that kind of binding. Here the Python just sets things up,
and that’s the end of Python. That’s my understanding.

