"We" are not forbidding you to open a computer, start experimenting and publishing some new method. If you're so convinced that "we" are stuck in a local maxima, you can do some of the work you are advocating instead of asking other to do it for you.
You can think chemotherapy is a local maxima for cancer treatment and hope medical research seeks out other options without having the resources to do it yourself. Not all of us have access to the tools and resources to start experimenting as casually as we wish we could.
As I understand it a local maxima means you’re at a local peak but there may be higher maximums elsewhere. As I read it, transformers are a local maximum in the sense of outperforming all other ML techniques as the AI technique that gets the closest to human intelligence.
Can you help my little brain understand the problem by elaborating?
Also you may want to chill with the personal attacks.
Not a personal attack. These posters are smarter than I am, just ribbing them about misusing the terminology.
"Maxima" is plural, "maximum" is singular. So you would say "a local maximum," or "several local maxima." Not "a local maxima" or, the one that really got me, "getting trapped in local maxima's."
While "local maximas" is wrong, I think "a local maxima" is a valid way to say "a member of the set of local maxima" regardless of the number of elements in the set. It could even be a singleton.
No, a member of the set of local maxima is a a local maximum, just like a member of the set of people is a person, because it is a definite singular.
The plural is also used for indefinite number, so “the set of local maxima” remains correct even if the set has cardinality 1, but a member of the set has definite singular number irrespective of the cardinality of the set.
You can't have one local maxima, it would be the global maxima. So by saying local maxima you're assuming the local is just a piece of a larger whole, even if that global state is otherwise undefined.
No, you can’t have one local maxima, or one global maxima, because it’s plural. You can have one local or global maximum, or two (or more) local or global maxima.
MNIST and other small and easy to train against datasets are widely available. You can try out anything you like even with a cheap laptop these days thanks to a few decades of Moore's law.
It is definitely NOT out of your reach to try any ideas you have. Kaggle and other sites exist to make it easy.
My pet project has been trying to use elixir with NEAT or HyperNEAT to try and make a spiking network, then when thats working decently drop some glial interactions I saw in a paper. It would be kinda bad at purely functional stuff, but idk seems fun. The biggest problems are time and having to do a lot of both the evolutionary stuff and the network stuff. But yeah the ubiquity of free datasets does make it easy to train.
Not to mention not everyone can be devoted to doing cancer research. Some Drs. and Nurses are necessary to you know actually treat the people who have cancer.
Are we sure there’s anything “net new” to find within the same old x86 machines, within the same old axiomatic systems of the past?
Math is a few operations applied to carving up stuff and we believe we can do that infinitely in theory. So “all math that abides our axiomatic underpinnings” is valid regardless if we “prove it” or not.
Physical space we can exist in, a middle ground of reality we evolved just so to exist in, seems to be finite; I can’t just up and move to Titan or Mars. So our computers are coupled to the same constraints of observation and understanding as us.
What about daily life will be upended reconfirming decades old experiment? How is this not living in sunk cost fallacy?
Einstein didn't say that about insanity, but... systems exist and are consistently described by particular equations at particular scales. Sure we can say everything is quantum mechanics, even classical physics can technically be translated as a series of wave functions that explain the same behaviors we observe, if we could measure it... But it's impractical, and some of the concepts we think of as fundamental to certain scales, like nucleons, didn't exist at others, like equations that describe the energy of empty space. So, it's maybe not quite a fallacy to point out that not every concept we find to be useful, like deep learning inference, encapsulate every rule at every scale that we know about down to the electrons, cogently. Because none of our theories do that, and even if they did, we couldn't measure or process all the things needed to check and see if we're even right. So we use models that differ from each other, but that emerge from each other, but only when we cross certain scale thresholds.
If you abstract far enough then yes, everything what we are doing is somehow akin to what we have done before. But that then also applies to what Einstein has done.