
Background Removal can be thought of as Foreground Segmentation, inverted. That is no trivial feat; my undergraduate thesis was on segmentation, but using only “mechanical” approaches (no NNs, etc.), hence my appreciation!

But here’s something I don’t understand (and someone please correct me if I’m wrong!): I do understand that NNs are to software what FPGAs are to hardware, and the ability to pick any node and mess with it (delete it, clone it, add or remove connections, adjust link weights, swap out the activation functions, etc.) means they’re perfect for evolutionary algorithms that mutate, spawn, and cull these NNs until they solve some problem (e.g. playing Super Mario on a NES (props to Tom7), or in this case, photo background segmentation).

…now, assuming the analogy to FPGAs still holds, with NNs being an incredibly inefficient way to encode and execute steps in a data-processing pipeline (but a very efficient way to evolve that pipeline) - doesn’t it then follow that whatever process is encoded in the NN should be expressible in some more efficient form (i.e. computer program code, even if highly parallelised), and that “compiling” it down is essential for performance? And if so, why are models/systems like this being kept in NN form?

(I look forward to revisiting this post a decade from now and musing at my current misconceptions)




For many tasks that neural networks can solve, there are traditional algorithms that are more compact (lines of source code vs size of neural network parameters), but they are not always faster and often produce results of lower quality. For a fair comparison, you have to compare the quality of result together with the computation time, which is not straightforward since those are two competing goals. That being said, neural networks perform quite well for two reasons:

1. They can produce approximate solutions which are often good enough in practice and faster than exact algorithmic solutions.

2. Neural networks benefit from billions of dollars of research into how to make them run faster, so even if they technically require more FLOPs to compute, they are still faster than traditional algorithms that are not extremely well optimized. (For a feel of that gap, see the sketch just below.)
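To make point 2 concrete, here is a minimal sketch comparing a naive pure-Python matrix multiply against NumPy's BLAS-backed one. The sizes are arbitrary and exact timings vary by machine, but the gap is typically two or more orders of magnitude:

    # Same matrix product, computed naively vs. via an optimized BLAS.
    import time
    import numpy as np

    n = 256
    A = np.random.rand(n, n)
    B = np.random.rand(n, n)

    t0 = time.perf_counter()
    C = [[sum(A[i, k] * B[k, j] for k in range(n)) for j in range(n)]
         for i in range(n)]
    t1 = time.perf_counter()

    D = A @ B  # BLAS-backed: the kind of kernel NN inference reduces to
    t2 = time.perf_counter()

    print(f"naive loops: {t1 - t0:.2f}s, numpy matmul: {t2 - t1:.5f}s")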

Lastly, development time is also important. It is much easier to train a neural network on some large dataset than to come up with an algorithm that handles all kinds of edge cases. To be fair, neural networks might fail catastrophically on data they have not been trained on, but it is often possible to simply collect more training data covering those cases.

I have not discussed any methods to compress and simplify already-trained models here (model distillation, quantization, pruning, low-rank approximation, and probably many more that I've forgotten), but they all tip the scales further in favor of neural networks. (Quantization, for one, is sketched below.)
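As one example, here is a minimal sketch of post-training dynamic quantization in PyTorch. The model is a made-up toy, but the API call is real and this is roughly all it takes:

    # Shrink Linear layers to int8 weights; activations are quantized
    # on the fly at inference. Roughly 4x smaller and often faster on
    # CPU, usually at a small accuracy cost.
    import torch
    import torch.nn as nn

    model = nn.Sequential(
        nn.Linear(512, 256),
        nn.ReLU(),
        nn.Linear(256, 10),
    )

    quantized = torch.ao.quantization.quantize_dynamic(
        model, {nn.Linear}, dtype=torch.qint8
    )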


"Neural networks are the second best way of doing just about anything." ~ John Denker

It's an old quote that, although not 100% accurate anymore, still sums up my feelings quite nicely.


There is some work on converting NNs to decision trees.

https://towardsdatascience.com/neural-networks-as-decision-t...

Neural Networks are Decision Trees https://arxiv.org/abs/2210.05189

I haven't reviewed any of it; I only know of it tangentially.

https://www.semanticscholar.org/paper/Converting-A-Trained-N...

Distilling a Neural Network Into a Soft Decision Tree https://arxiv.org/abs/1711.09784

GradTree: Learning Axis-Aligned Decision Trees with Gradient Descent https://arxiv.org/abs/2305.03515
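For a rough feel of the distillation-into-a-tree idea (the general recipe, not any one paper's method), here is a hedged sketch using scikit-learn, with a toy dataset standing in for real data:

    # Train a "student" decision tree on a trained network's
    # predictions rather than on the raw labels. Data and sizes
    # are made up for illustration.
    import numpy as np
    from sklearn.neural_network import MLPClassifier
    from sklearn.tree import DecisionTreeClassifier

    rng = np.random.default_rng(0)
    X = rng.standard_normal((2000, 10))
    y = (X[:, 0] * X[:, 1] > 0).astype(int)  # toy XOR-like target

    teacher = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500).fit(X, y)

    # In practice you'd distill on lots of held-out or synthetic inputs.
    student = DecisionTreeClassifier(max_depth=6).fit(X, teacher.predict(X))
    agreement = (student.predict(X) == teacher.predict(X)).mean()
    print("teacher/student agreement:", agreement)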


NNs are, in a way, already "compiled". If all you want to do is inference (forward pass), then you mostly do a lot of matrix multiplications. It's the training pass that requires building up extra scaffolding to track gradients and such.
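To illustrate: a forward pass through a small MLP is just matrix multiplies and elementwise nonlinearities. All shapes and weights below are made up:

    # Inference as plain matmuls: no autograd, no gradient tape.
    import numpy as np

    rng = np.random.default_rng(0)
    W1, b1 = rng.standard_normal((784, 128)), np.zeros(128)
    W2, b2 = rng.standard_normal((128, 10)), np.zeros(10)

    def forward(x):
        h = np.maximum(x @ W1 + b1, 0.0)  # matmul + ReLU
        return h @ W2 + b2                # matmul + bias

    logits = forward(rng.standard_normal(784))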

It occurred to me that NNs ("AI") are indeed a bit like crypto, in the sense that both attempt to substitute compute for some human quality. Proof of Work and associated ideas try to substitute compute for trust[0]. Solving problems by feeding tons of data into a DNN is substituting compute for understanding. Specifically, for our understanding of the problem being solved.

It's neat we can just throw compute at a problem to solve it well, but we then end up with a magic black box that's even less comprehensible than the problem at hand.

It also occurs to me that stochastic gradient descent is better than evolutionary programming because it is to evolution what closed-form analytical solutions are to running a simulation of interacting bodies - if you can get away with a formula that gives you what the simulation is trying to approximate, you're better off with the formula. So in this sense, perhaps it's worth trying harder to take a step back and reverse-engineer the problems solved by DNNs, to try to gain that more theoretical understanding, because as fun as brute-forcing a solution is, analytical solutions are better.

--

[0] - Which I consider bad for reasons discussed many times before; it's not where I want to go with this comment.


Neural networks are not trained with evolutionary algorithms because evolutionary algorithms are very slow at that scale, especially with the millions or billions of parameters that NNs have. Instead, stochastic gradient descent is used for training, which is far more efficient. (A minimal SGD step is sketched below.)
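For reference, SGD on a toy least-squares problem looks like this; everything here (data, batch size, learning rate) is illustrative:

    # Minimal stochastic gradient descent: sample a mini-batch,
    # compute the gradient of the loss on it, step against it.
    import numpy as np

    rng = np.random.default_rng(1)
    X, y = rng.standard_normal((1000, 5)), rng.standard_normal(1000)
    w = np.zeros(5)
    lr = 0.01

    for step in range(200):
        i = rng.integers(0, len(X), size=32)         # mini-batch indices
        grad = 2 * X[i].T @ (X[i] @ w - y[i]) / 32   # d(mean sq. error)/dw
        w -= lr * grad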


> doesn’t it then mean that whatever process is encoded in the NN, it should both be possible to represent in some more efficient representation...?

Not if NNs are complex systems[1] whose useful behavior is emergent[2] and therefore non-reductive[3]. In fact, my belief is that if NNs and therefore also LLMs aren't these things, they can never be the basis for true AI.[4]

---

[1] https://en.wikipedia.org/wiki/Complex_system

[2] https://en.wikipedia.org/wiki/Emergence

[3] https://en.wikipedia.org/wiki/Reductionism, https://www.encyclopedia.com/humanities/encyclopedias-almana..., https://academic.oup.com/edited-volume/34519/chapter-abstrac...

[4] Though being these things doesn't guarantee that they can be the basis for true AI either. It's a minimum requirement.



