
“The Unreasonable Effectiveness of Deep Learning Representations” - neuhaus
https://blog.insightdatascience.com/the-unreasonable-effectiveness-of-deep-learning-representations-4ce83fc663cf
======
kujaomega
I see a good explanation of the problem and a good walkthrough of the steps
taken. But I see a problem in the approach. When you are looking for the most
similar result, you presumably have to compute cosine similarity between the
query and all of the embeddings. If you have more than a billion embeddings
and the embeddings have 1k dimensions, that will take a lot of time. How would
you solve this problem? By clustering the embeddings?
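For concreteness, here is a rough numpy sketch (illustrative only) of the
brute-force search I mean; scoring one query against N stored embeddings is an
O(N * d) operation, which is exactly what blows up at a billion vectors:

    import numpy as np

    def top_k_cosine(query, embeddings, k=5):
        # Exact cosine similarity of one query against every stored embedding.
        q = query / np.linalg.norm(query)
        e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
        sims = e @ q                  # one score per stored vector, O(N * d)
        return np.argsort(-sims)[:k]  # indices of the k most similar embeddings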

~~~
bunderbunder
There are off-the-shelf libraries like ANNOY and nmslib that index the vectors
in a way that allows for fast (possibly approximate) nearest-neighbor
searches.
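For example, a minimal sketch with the Annoy library (pip install annoy); the
dimension, collection size, and tree count here are illustrative assumptions,
not tuned values:

    import numpy as np
    from annoy import AnnoyIndex

    dim = 1024
    index = AnnoyIndex(dim, "angular")  # angular distance tracks cosine similarity

    # Index stand-in embeddings by integer id.
    embeddings = np.random.rand(100_000, dim).astype("float32")
    for i, vec in enumerate(embeddings):
        index.add_item(i, vec)

    index.build(10)  # 10 trees; more trees -> better recall, bigger index

    # Approximate ids of the 5 nearest stored embeddings to a query vector.
    query = np.random.rand(dim).astype("float32")
    print(index.get_nns_by_vector(query, 5))

You trade a little exactness for query times that stay fast even at very large
collection sizes.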

------
JoeAltmaier
Looking at pictures and categorizing them by appearance has very limited
application. The human context can't be grasped that way, can it?

For instance, 'wedding pictures'. A cake being cut; a cute kid throwing flower
petals; a black-clad clergyman; a hand with a ring on it. Any human could
categorize a pile of pictures into those that are in the 'wedding' category,
and those that aren't. But no strategy based on weighting pixels is ever going
to get there.

Likewise, 'cute' or 'scary' or 'funny'. And on and on.

~~~
jchw
There are definitely networks in our brain that classify what we're seeing. An
uneducated guess, though, is that the networks in our brains have many, many
intermediate representations and don't go directly from image -> words, but
rather go through abstract classifiers that can then be mapped back to words.

------
robius
What I find unreasonable is doing all this without knowing what the model is
doing. It's blind, with no way to steer or correct it.

That is what feed-forward networks and backpropagation give us. So why do we
keep using them?

Then there's the statistics of it all... what are we actually modeling? 'The
real world', you say? Think again.

Data has to be changed and manipulated into i.i.d. form, or the algorithms
won't work. How does an independent set of random variables give us a model of
the actual dataset, which is itself a very limited representation of the real
world? It doesn't. It's modeling something else.

Okay, why don't we take dependence into account? Surely that would represent
the real world better. Good question! (Shirley has nothing to do with it.)

It's because there is no formal definition of dependence in statistics. Let
that sink in for a minute.

So the math needs work, statistics needs a revolution, and then we can begin
to change AI enough for it to finally start making sense. Focus on explainable
algorithms and an actual ability to validate that what models generate makes
sense and will not be unlawfully biased or have outliers that will cause harm.

There appears to be only one company that has something like this. But few
actually care.

~~~
wadkar
> So the math needs work

Finally! I thought I was alone (and stupid) for thinking like this.

Is there any literature or any meta-work that discusses the notion of
probability itself? What is expectation? What is dependence?

~~~
meikos
What do you mean by the notion of probability itself?

Probability was mastered long before computers were a thing.

~~~
akvadrako
Probability is far from clear. Very briefly, there are two main camps:

1. Bayesian probability is about degrees of belief. But belief is always
subjective, and belief about _what_, if not probability? It's circular.

2. Frequentist probability says that after _X >> 1_ runs of an experiment, an
outcome with probability _Y_ occurs in roughly a fraction _Y_ of the runs
(about _Y x X_ times). But that's only exact with an infinite number of runs,
which never happens. And what are the odds of seeing exactly _Y x 1000_ such
outcomes after 1000 runs? Again, that's circular.

My favourite way to think about probability is the multiverse kind:

3. Assuming there are an infinite number of fungible, identical worlds, if a
coin flip has a 50% chance of heads, it means observers in exactly half of the
worlds see heads. However, this isn't actually probability at all - from a
god's-eye view it's objectively certain what happens.
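To put a number on the frequentist point in (2), a quick sketch: even for a
fair coin, the chance of landing on exactly the expected 500 heads in 1000
flips is small.

    from math import comb

    n, p = 1000, 0.5
    k = int(n * p)  # the "expected" count, 500
    prob_exact = comb(n, k) * p**k * (1 - p)**(n - k)
    print(f"P(exactly {k} heads in {n} flips) = {prob_exact:.4f}")  # ~0.025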

~~~
reitanqild
> Probability is far from clear. Very briefly, there are two main camps:

Isn't this a bit like saying there are two main camps when it comes to coins:

1\. "heads"

2. and "tails"

?

At least to me it felt like the different forms of statistics were only
different techniques.

~~~
perl4ever
I don't even understand how the frequentist view is a valid alternative. It
always seemed to me like either you are honest about your priors and use
Bayesian logic to take them into account, or you sweep them under the rug.
Lying to yourself always produces bad results, is my overriding heuristic. But
I'm not good at math.

