This depends entirely on the definition of 'steering a 2D car'. In the model used, throttle is simply proportional to the distance to the nearest wall in front of the car. This means the agent will never accelerate coming out of a corner, because it can't know it has the headroom to steer away from the wall as it's coming out.
Similarly, the model for steering inherently steers the car towards the middle of the track. I would expect the car to wobble from left to right if the road's edges are ragged, to invent its own corners if the track edges describe a 'fake' turn on a straight section, and to crash if it were to encounter a Y junction or a pit stop. The neural network agents showed smarter behavior here because they are able to capture more complex cross-dependence between different inputs.
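To make the criticism concrete, the kind of controller being described can be sketched in a few lines. This is a hypothetical reconstruction, not the author's actual code; the sensor names and coefficients are made up:

```python
# Toy sketch of the hand-tuned controller described above.
# Inputs are wall-distance sensor readings; coefficients are invented.

def steer(front, left, right, k_throttle=0.5, k_steer=1.0):
    """Return (throttle, steering) from three wall-distance sensors."""
    throttle = k_throttle * front        # slows as a wall approaches head-on
    steering = k_steer * (left - right)  # always pushed toward the middle
    return throttle, steering

# A car hugging the right wall gets steered left (positive steering):
print(steer(front=10.0, left=4.0, right=1.0))  # (5.0, 3.0)
```

Because throttle depends only on the forward distance and steering only on the left/right difference, you can see directly why it can't accelerate out of a corner or handle a junction: neither output ever combines the inputs.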
On the topic of junctions, if the track were to include them, perhaps it'd be nice if the car chose the quickest route to optimize for lap times. But maybe that stretches the problem statement too much.
> Instead of doing anything fancy, my program generates the coefficients at random to explore the space. If I wanted to generate a good driver for a course, I’d run a few thousand of these and pick the coefficients that complete the course in the shortest time.
In theory this is more random and less efficient than an evolutionary algorithm, which searches the problem space in a structured way. If the author really wanted to hammer the point home, a least squares method to one-shot the coefficients would be more convincing.
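The "run a few thousand and pick the best" loop the author describes is just blind random search. A minimal sketch (the simulator interface here is an assumption, not the author's code):

```python
import random

def random_search(simulate_lap, n_coeffs=4, n_trials=1000, seed=0):
    """Blind random search: sample coefficient sets, keep the fastest finisher.

    `simulate_lap(coeffs)` is assumed to return a lap time, or None on a crash.
    An evolutionary algorithm would instead mutate/recombine the current best
    candidates rather than sampling uniformly every trial.
    """
    rng = random.Random(seed)
    best_coeffs, best_time = None, float("inf")
    for _ in range(n_trials):
        coeffs = [rng.uniform(-1, 1) for _ in range(n_coeffs)]
        t = simulate_lap(coeffs)
        if t is not None and t < best_time:
            best_coeffs, best_time = coeffs, t
    return best_coeffs, best_time

# Stand-in "simulator" whose optimum is any coefficient set summing to 1:
best, t = random_search(lambda c: abs(sum(c) - 1.0))
```

Every sample here is independent of all previous ones, which is exactly why it explores the space less efficiently than a method that reuses information from earlier trials.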
All in all, the author doesn't make any hard claims that are false. But I would nuance the point of "neural networks are unnecessary" to "simpler models will do for simpler objectives".
Sure, YouTube itself probably built insane stuff in their engine you could never replicate with classic methods (ignoring whether the YT algo is any good).
However, if we are just talking about the Vlog of your real estate company, you should probably A/B-test whether your viewers prefer order by time or clicks and implement a decent title search bar. And kick the consultant hyping you up about ML out, now.
So my takeaway is that it's not about whether AI is never useful or about 2D steering, but about using the right tool for the right job.
And building on that, I have to give the author props for demonstrating an alternative solution to a problem which I would definitely have solved via AI.
Probably. But pretty much anything they recommend is junk, so... That's where the author may have a point. If you don't understand your AI algorithm anymore, it's hard to improve it or even realize how wrong it is. AI is generally good at steering the masses into a couple of "averaged" directions. At the individual level though, it's often crap, unless you are the perfect stereotype that the algorithms assumes you to be.
Rather than discovering something new, everyone just watches The Office and Parks and Rec., again and again. Now those theme songs make my skin fucking crawl.
This is a consequence of the metrics that are being optimized, it's not a fault of the algorithm per se.
The rare occasions I discover a new channel, it’s almost always from some source other than the algorithm: a referral from a friend, this site, another YouTuber, etc. My viewership of the same repetitive roster of videos absolutely tails off until I find something new from elsewhere.
For example, in months of being subscribed to my mechanics (who does incredibly engrossing and relaxing restorations of mechanical stuff), not once was I suggested a video from Baumgartner Restoration, an art conservator who produces videos with a similar attention to detail and high production value.
Thematically this should be an easy recommendation for YouTube to make, but evidently the content is just different enough that it scores as a false-negative. After finding the latter channel independently, my viewing time absolutely rose for a while.
In theory, YouTube ought to be able to detect and learn from this signal of non-algorithmic discovery of new content. Yet, here we are.
I suspect there is something similar going on with video/music recommendations. When a bad novel suggestion is made, the penalty is likely too high to overcome with traditional reinforcement methods (the user immediately clicks off).
The application of ML and data science in this industry is quite hilariously bad, really.
That is, that pushing the left handgrip forward, at speed, turns left and not right. Yet, at very slow speeds, like walking it, it's the opposite.
The other interesting thing is how hard it is to convince non-riders that counter-steering is a thing. They will just not believe you. Even people who've grown up riding bicycles and counter-steering unconsciously their entire childhood.
This is the most counter-intuitive physics I've ever experienced.
The really weird thing is I'd been riding pushbikes all my life and had my motorbike licence almost a year before I learned this. It wasn't taught as part of the licence training.
Super handy to know as a tool, it’s kept me from going for a closer look at the scenery on a couple of occasions.
Pretty interesting to me that we can reverse our intuition.
But they are all the rage and it is no surprise that a lot of people want to play with them.
Cynically, neural networks are easier as you don't really have to think about your model. Give some examples with some classes and you're done. Or give examples of one class and let the neural net generate new ones. Doing away with the abstraction beforehand is an enticing prospect.
This way of thinking about it leads directly to things like statistical redlining.
It's also not specific to neural networks. I take a similar approach with logistic regression. Except that I like to replace the "and you're done" step with, "and you're ready to analyze the parameters to double check that the model is doing what you hope it is." Even when linear models need some help, and I need to do a little feature engineering first, I find that the feature transformations needed to get a good result are generally obvious enough if I actually understand what data I'm using. (Which, if you're doing this at work, is a precondition of getting started, anyway. IMNSHO, doing data science in the absence of domain expertise is professional malpractice.)
There is no, "and you're done" step, outside of Kaggle competitions or school homework. Because machine learning models in production need ongoing maintenance to ensure they're still doing what you think they're doing. See, for example, https://research.google/pubs/pub43146/
NNs are just polynomial regression when given polynomial activations, and piecewise-linear regression with ReLU activations (etc.).
A NN is just a highly parameterized regression model -- for better, or worse.
I had always thought of neural nets in terms of the massive connected graph, which in my head somehow behaved like a machine.
Then I realized it's just a representation of a massive function, f: R^m -> R^n, which needs to be fitted to match inputs and outputs.
I know this is not precisely correct and glosses over many, many details - but this change in viewpoint is what finally allowed me to increase the depth of my understanding.
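That viewpoint is easy to make concrete. A toy sketch with random weights (the sizes here are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)

# A two-layer net is literally just a function f: R^3 -> R^2.
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)
W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)

def f(x):
    """f(x) = W2 @ relu(W1 @ x + b1) + b2 -- nothing but composed functions."""
    h = np.maximum(W1 @ x + b1, 0)  # hidden layer with ReLU
    return W2 @ h + b2

y = f(np.array([1.0, -2.0, 0.5]))
print(y.shape)  # (2,)
```

"Training" is then just fitting this function's 26 parameters so that its outputs match the targets; the graph picture and the function picture describe the same object.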
What are the nodes and edges?
There is a computational graph which corresponds to any mathematical function -- but it is not the NN diagram -- and not very interesting (eg., addition would be a node).
NNs are neither neural nor networks.
It is only a temporary solution - unless it works.
Just because something is taught in an ML course doesn't mean that it is ML. It is pretty common for physics classes to teach maths and for chemistry classes to teach physics for example.
So if something is taught in ML class but also in statistics class then it is statistics and not ML. If something is taught in ML class but also in a numerical methods class then it is numerical methods and not ML.
If you just replace ML with AI everywhere in this article it is going to make sense.
The article has other problems, one being the main premise.
The problem isn't to drive a car around a track (which is what the polynomials did) but rather to write a program that can figure out how to drive a car without you knowing how to solve it.
We don't find these systems intelligent because, on inspection, they aren't.
We are intelligent. Not "magically", but actually nevertheless.
Our intelligence, and that of dogs (mice, etc.), consists in the ability to operate on partial models of environments, to be dynamically responsive to them, and to skilfully respond to changes in them.
This sort of intelligence requires the environment to physically reconstitute the animal in order to non-cognitively develop skills.
It is skillful action we are interested in, and precisely what is missing in naive rule-based models of cognition.
It's why I use "AI" and "ML" interchangeably although I know it's technically incorrect - the formal definition doesn't match what people are currently thinking.
So yes, "any solution that imitates intelligent behaviour" is probably right, but with nuances with regard to what that actually means.
Edit: yeah, you can downvote this, but current AI research splits right along this line, whether it's symbolic or statistical. Some AI courses will use NNs, others will use Prolog and ASP. You can't just dismiss a whole field of research by reducing AI to statistical methods.
I know that AI researchers are usually a bit dismissive about the other area. I don't like statistics either. Reducing the whole of AI research to statistical approaches (and NNs are one of those) is disingenuous and dismisses hundreds of researchers doing important work.
You may not want to have rule-based image recognition, but if your car decides to run over somebody, I feel we better have an explanation for this behaviour based on reasoning and logic.
> If I was developing a racing game using this as the AI, I’d not just pick constants that successfully complete the track, but the ones that do it quickly.
If they wrote code that automatically picked constants that successfully completed the track quickly, (even something as simple as sorting the results by completion time), then that's reinforcement learning.
One can also argue otherwise.
I am not sure when we changed the terms, but back in the day this would happily fall under machine learning. As he mentioned, if you want a good driver you would run thousands of experiments to pick a good set of parameters.
Here is a trivial example: one of the best ways of modeling timeseries data, both in and out of sample, is to naively take the moving average. This is a rolling mean parameter estimate on n lagged values from the current timestep. Not only is this an excellent way of understanding the data (by decomposing it into seasonality, trend and residuals), it's a competitive benchmark for future values. The first step in timeseries analysis shouldn't be to reach for a neural network or even ARIMA. It should be to naively forecast forward using the mean.
You might be surprised at how difficult it is to beat that benchmark with cross-validation and no overfitting or look-ahead bias.
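A minimal version of that baseline, sketched in numpy (the window size is an arbitrary choice for illustration):

```python
import numpy as np

def rolling_mean_forecast(series, window=3):
    """Forecast each step as the mean of the previous `window` observations."""
    series = np.asarray(series, dtype=float)
    preds = np.full(len(series), np.nan)
    for t in range(window, len(series)):
        preds[t] = series[t - window:t].mean()
    return preds

y = np.array([10, 12, 11, 13, 12, 14, 13], dtype=float)
preds = rolling_mean_forecast(y, window=3)
mae = np.nanmean(np.abs(preds - y))
print(mae)  # 1.0
```

Any fancier model has to beat this number out of sample, on the same validation scheme, before it has earned its complexity.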
Sure, they are dummy regressors, but they can be so useful for proving that whatever ML model you choose is at least better than a dummy baseline. If your model can't beat it, then you need to develop a better one.
They can even be used as a place-holder model so you can develop your whole architecture surrounding it, while another teammate is iterating over more complex experiments.
You could also settle on a moving-average process as a first model in a time series, because they are easy to implement and simple to reason about.
Never under-estimate the power of an "average".
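In scikit-learn that kind of baseline is a one-liner. A sketch with made-up toy data (in practice you'd use your real train/test split):

```python
import numpy as np
from sklearn.dummy import DummyRegressor

# Toy data: the target alternates, so its mean is 1.5.
X_train = np.arange(10).reshape(-1, 1)
y_train = np.array([1.0, 2, 1, 2, 1, 2, 1, 2, 1, 2])
X_test = np.array([[10], [11]])

baseline = DummyRegressor(strategy="mean").fit(X_train, y_train)
print(baseline.predict(X_test))  # [1.5 1.5] -- ignores X entirely
```

It slots into pipelines and cross-validation exactly like a real model, which is what makes it useful as a placeholder while someone else iterates on the complex experiments.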
1. To feel smart
2. To justify our paycheck
Most of the time simple solutions like the one in the link will be more than enough, but we just can't resist the urge to implement the new paper we just found. I remember thinking about using a NN for a problem we had; after looking at it closely for two days, all I needed was a simple linear regression, and I got the extra benefit of being able to explain what was going on.
Eventually I started looking for the "elegant" solution because now that was the thing that made me feel smart. Although sometimes using brute force is the best approach. There's a balance you find once you gain enough experience... I think.
Of course, as you mention, there's always the business-political aspects - of explaining or justifying your choice to people who don't understand any of those tools, and who often want to pretend that they're part of a "smart data-driven AI" company.
Finally someone on HN is being honest. It's also a good touch that this is downvoted somewhat while replies getting mad at essentially choice of words (of the title, nonetheless) are on top.
Doing the elegant solution was the first thing I learned in grad school, it was literally a comment a professor made. In undergraduate, you feel smart for working out a long calculation. In grad school and beyond, you should feel smart by doing the least math possible and using intuition. It's not that long calculations aren't necessary, it's just that thinking a little first is important rather than just turning the hard work crank.
If you are lucky enough to be in a truly "data driven" startup then you'll most likely have fun a learn a lot.
For example, I've seen many mission-critical rules-based systems with no accuracy metrics. A data scientist is asked to beat this system using machine learning and fails miserably because the rules were developed by experts over years and the data scientist has a month and is only a few years out of grad school. However, in the process of building the model that data scientist built data pipelines for measuring the rules-based system's accuracy and fixed a small number of major data integrity issues. Unfortunately this isn't considered a win and the data scientist usually leaves the company soon after. :shrug:
Regarding recommender systems, I see many companies trying neural nets and so much other fancy ML stuff for things that, in A/B tests, are consistently outperformed by basic rules.
I understand the fun of building stuff and using the new hot stuff. And at least for many analysts and marketing people, as well as shop product owners, this is the new hot shit.
I also understand that it is much easier to get management to hand out the big bucks for something that is the new rage, as they tend to read the respective soundbites in their manager magazines.
But as said, I see it underperforming in tests nearly all the time. Not only, but especially, if you take the costs of development and maintenance into account. These systems cost more to build, more to enhance, more to run, and bring in less real business value 70-90 percent of the times I have seen them.
But they are presented to management in shiny presentations from agencies that need to sell the new hot shit to their clients to show that they are relevant. Because, as said, management thinks they need it and are often not open to agencies telling them, that the business value could be better served otherwise. Because in the end for a manager it is often times more valuable to show a fancy state of the art project to his/her higher ups than creating real business value.
I work on large scale recommender systems for an ecommerce company and in my career I’ve seen only the exact opposite.
Don’t get me wrong, sometimes simpler ML models, like clustering LSA vectors or nearest neighbors, work better than complex models like neural nets.
But I have never seen plain rule systems work better for any problem even remotely at scale. Rule systems give an illusory sense of control and understanding, yet are rife with complex interaction effects and edge cases that typically make them intractable to change.
The quality of recommendations nearly always increased from a revenue as well as perceived quality standpoint. However, it almost never had a positive impact on the profit margin that would have justified the necessary investments.
As said - from a quality standpoint most simple systems were just "good enough".
One would need to know the business case and environment. But take automotive (new cars) as an example: the goal is nearly always to get the user to request some form of contact from a physical dealership near them. For that you almost never need a perfect, fully configured recommendation.
I know of an example (a car manufacturer) where the search space of all configurable variants (including various things the car owner would never register because they are specific screw variants) is of a size where even the number of visitors to the website per year is some orders of magnitude less than the number of options.
The way to go here was to reduce the search space and the number of variants. It turned out that you can reach that goal quite fast with a few specific questions (active learning) that lead the user to vehicle variants which correspond to their interests, and this led to disproportionately high contact rates.
And yes: ML techniques were used for the analysis and reduction of the search space. For the concept people to then develop specific questions to get to these reduced attributes. But in the end the recommender now works rule-based.
I don't imply that this holds true for every scale of company/problem. And I know some counter examples - but most companies do not operate on that scale. If you are ebay, Zalando (Germany) and the likes: I would probably get different results from testing the revenue validity of the different approaches.
In fact, I’ve always found even plain cost per unit of service goes down with the introduction of more complex ML models. Their greater training complexity and compute costs are much more than amortized by improved performance and the easier ability to train and deploy new models (it’s much harder and more labor-intensive to adjust a rat’s nest of custom business rules than a black-box ML model, even in terms of transparency).
Just reduction of operating costs alone is usually a reason to favor ML solutions, even if they only achieve parity with rules systems (though usually they outperform them by a lot).
Your comment makes me feel your methodology for assessing business value and comparing with rule systems is deeply flawed and probably biased to go against ML solutions for preconceived reasons.
Wow. Nice ad hominem. Thanks a lot for that.
> Just reduction of operating costs alone is usually a reason to favor ML solutions
I have yet to see one case, in the industries I work in and with the clients I work with, where an ML solution beats simpler systems in development and operating costs (given the current real-world environment there).
And believe me I try to sell these projects to clients, as I strongly believe that in the long run they could gain something from that.
But that would also mean getting rid of a clusterfuck of different systems, different data definitions from department a to department b as well as market x to market y. Politically motivated data mangling (we do not want "central" to know everything so we do not send all data or data in the necessary format).
When you see that markets technically use the same CRM system, for example, but rename tables, drop columns, use the same dimension names for different things and so on, integrating one market into a central data lake becomes a daunting task, let alone 130 markets. And this is just CRM. Not sales. Not, given automotive, the data from retailer systems.
But this would nonetheless be the data you need for ML systems to learn from. And then there are legal issues: car dealerships are separate legal entities. They are not allowed to "just" send PII data to the central brand (at least not under European GDPR). There is also a lot of stuff central just isn't legally allowed to know, like discounts given, to name one example.
After you get all of this untangled and cleaned up (and change all necessary business processes that depend on said structures), I strongly believe ML would probably be cheaper. And lead to better results.
Don't think that I am telling my clients otherwise.
Not because they do not work, but because simpler systems can be run comparatively cheaply in environments that are very stratified and where the underlying data situation is a messed-up clusterfuck to begin with.
Believe me I really, really wonder how these companies are able to make money given what they have in terms of underlying central data quality. It is unbelievable sometimes.
Rule based systems are great if you have people with deep domain understanding developing the rules.
Unfortunately, those people are rare, so most rule-based approaches fail to perform well.
However, most recommendation systems suck unless you get someone who knows what they are doing to build them.
In terms of business value, I would be very hesitant to make strong statements like the above (in both cases, actually).
I also believe that with a good situation in underlying data quality we could be talking about massively reduced costs in getting these systems up and running - and this would tip the scale in favor of said systems.
But what I see in terms of data quality makes me sometimes just want to run as fast as I can in the other direction.
If the data isn't being logged by automated systems daily, then you probably don't have enough to make these kinds of things work.
In smaller data environments, rules are going to perform much, much better (but still require the domain expertise, which isn't cheap).
Most tools you can start by solving simple problems with, and gradually work up the complexity of the problem you’re addressing until it does “something useful”.
This is a good way to learn what you can, can’t and should use a tool for.
Deep learning is problematic though: solving trivial tasks is actually quite difficult, and often the “something useful” level of sophistication means copy-pasting someone else’s paper, tweaking it a bit, and vaguely hoping something you do makes a difference.
Being able to solve trivial problems with neural networks is really important, and useful; not because it’s a good solution, but because it means you can see what happens when you try it out.
The problem with this post isn’t the conclusion; the author is quite right. You can solve this problem in many ways, maybe NN aren’t the best for this kind of trivial problem.
...but, if you want to solve harder problems with more than random trial and error tweaking parameters, solving easy problems with the same tool is a good way to learn how.
...and we have something like 20 years of evidence that hand-crafted models don't scale effectively.
What do you mean by this? In what context?
Like, the twenty year old models are still running in credit risk and insurance, so I'm confused if you mean in ML/statistical modelling.
...or, NLP, audio & image recognition, recommendations... come on. It’s not controversial.
A lot of this is standard statistical methods, much of which are much older than twenty years.
Really, that's the part that threw me: twenty years ago everyone was going crazy for SVMs, which are still machine learning, but the features were definitely hand-crafted.
I think deep learning has been super successful with unstructured data, but for tabular data it's pretty much a wash between NNs and boosted trees or generalised additive models.
You do need some flexibility in your function approximation, but not as much as people commonly believe.
The deep learning "revolution" corresponded with an exponential growth in the time/effort/money being thrown at these problems and the amount of data available to do so.
In an alternate universe, could everyone be going crazy about kernel machines?
GPUs are definitely a big part of why this stuff has improved, as you can train much, much faster on larger datasets, which is going to improve performance.
NNs are super flexible though, and I'm not sure you'd have gotten the same level of performance out of other methods.
Interesting question though.
Attempting to implement a paper from scratch is an interesting experience: (1) implement what they describe, (2) it doesn't work, (3) ... well, that was fun. Project over.
Very different experience from trying to implement quicksort, even though the extensions to a basic neural net often aren't conceptually much more complicated.
The big difference between this and the other approach mentioned in the article is the model is a simple one that's easy to understand instead of a many layered neural network which is rather opaque.
I think the article may be better titled "You might not need neural networks."
Since a linear model is essentially a single-layer neural network with linear activation, we can't even say that. The author was using a neural network without realising it :)
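The equivalence is easy to check numerically. A toy sketch: train the "single-layer net" y = Xw by gradient descent and compare against the ordinary least-squares fit (data and learning rate here are made up for illustration):

```python
import numpy as np

# Toy data: intercept column plus one feature, targets from w_true = [1, 2].
X = np.c_[np.ones(50), np.linspace(0, 1, 50)]
y = X @ np.array([1.0, 2.0])

# Closed-form least-squares solution.
w_ols = np.linalg.lstsq(X, y, rcond=None)[0]

# "Single-layer NN with linear activation", trained by gradient descent
# on the mean squared error.
w = np.zeros(2)
for _ in range(5000):
    w -= 0.5 * X.T @ (X @ w - y) / len(y)

print(np.allclose(w, w_ols, atol=1e-4))  # True
```

Same model, two fitting procedures: gradient descent on a linear "network" just walks to the regression solution.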
Doing automatically controlled systems was still possible before computers; it just required a bit more creative use of analog components.
I think that it's important to note here that this was the set of problems that computer science researchers didn't know how to solve.
In the early days, they mostly ended up re-inventing statistical methods.
And to be fair, a neural net is just a bunch of linear models joined by a non-linearity. In that case, it's essentially stacked logistic regression, which was invented before computers.
Nobody did this before computers though.
> In that case, it's essentially stacked logistic regression, which was invented before computers.
It isn't "basically logistic regression", it is just a technique which uses logistic regressions. The full technique is ML. If you remove the ML parts it is basically just logistic regression left though.
I think this would probably be a more profitable discussion if you could define Machine Learning for me.
How does your evidence support your claim?
But machine learning predates neural networks by hundreds of years. The core mathematical basis of all machine learning coursework is linear regression and decision trees. Other models like SVMs, Bayesian models, nearest neighbor indexes, TFIDF text search, the naive Bayes classifier, etc., are basically machine learning 101, and they have many different properties regarding interpretability depending on the problem to solve.
I know this is super pedantic, but it's important to remember the roots of things, and that even things which appear new have precursors that are much older than a lot of people realise.
But perhaps using a cost function or a loss function is enough to call it machine learning. A machine just used an algorithm to learn another algorithm after all.
I do agree some optimization algorithms are rooted in other fields. I wasn’t trying to say that machine learning is the only historic field from which optimization methods were developed. I just wanted to point out it is a major historic field where some highly respected search and optimization procedures were first created, since people often overlook how old machine learning is and the vast set of modeling procedures apart from neural networks that make up the core of machine learning.
Instead it seems you falsely believe you are writing with some sarcasm that endows rhetorical flair to undercut my comment. It’s very rude and juvenile in addition to being wholly ineffective.
It is worth pointing out that this strategy can likely overfit on the data that you have used for training: when you change the track, your car may not behave as well as before. In other words, the coefficients are only good for that specific track (or tracks).
The author is still using Machine Learning, even if not with neural networks: the need for rigorous strategies for model selection doesn't disappear.
More generally, I think this approach is only suited to "static" courses, which only matter in contrived demos. Any real use of a steering algorithm requires reacting to conditions that can't be predicted a priori; e.g. if you have to avoid collisions with other cars, and one car is controlled by a human player, every run is effectively a different track and overfitting like this would not be an option.
If you have to generate new coefficients for each track, your polynomial regression reduces to a polynomial interpolation of two-dimensional points which represent the track path on a plane. Which is fine and still accomplishes the specific goal, but doesn't solve what would generally be considered the actual research problem.
But then again, I don't know if the neural network actually achieves this. It's a little unclear in the video: I don't know whether the model is able to learn from the human guiding the vehicle for n iterations, or whether the model is generated by the human guiding the vehicle for n iterations. Presumably the research goal is to develop a model which learns tracks (in this circumstance, that would be akin to the model choosing the coefficients rather than being the coefficients).
You probably don't need a neural network if your model has 5 inputs.
Manager - We need to do X, we need a ML Data Scientist.
Engineering Team - there's a simple solution we can use instead of ML.
Manager - No we need a ML Data Scientist, we need to do this right.
Time Passes, hiring a ML Data Scientist is hard, and the problem never gets solved.
Toy problems like this are still interesting because they demonstrate techniques that are applicable to bigger problems. I'm guessing the network in the car game doesn't need more than one layer and two outputs (acceleration vector), but that is beside the point.
The technique being demonstrated is the ability of GA (or maybe particle filters), to find "optimal" weights for a network given whatever simulator. This is always interesting, especially when done with graphics like this.
If he used a standard optimization method instead, convergence would be fast and the result much better. A similar problem, using splines to set force inputs for a robot that travels through a maze with barriers, optimized using start and end position and force minimization, was a lab in a course I took last year.
The question becomes one of hyperparameter search, i.e. what kind of model/function approximator is sufficient. Here the problem is easy enough that it's easy to find a sufficient simple model.
The huge networks are nice for more general problems because they tend to work moderately well for everything... in the dataset.
Each and every time I see neural networks on some "techies" blog, I wanna vomit.
My job is to be the director of machine learning at a medium-sized ecommerce company. My company uses machine learning to solve lots of problems in search & customer recommendation, image & text processing, time series forecasting, and a few “backend” support models for things like phishing / fraud detection and gaining efficiencies in customer support operations.
I am happy to answer any questions I can about why machine learning has been a continued growth and investment area for my company and how we thoroughly validate business value when deciding whether to adopt ML solutions.
We use modern neural networks in probably about 10-15% of the solutions we operate.
The actual problem is that we don't know how long it'll be until we build an AGI. Experts put the range somewhere between 10 years and never.
Building an unconstrained AGI is an existential risk, so it's important to try and narrow the confidence bands on these questions. That's one of the reasons why Musk pledged $1 billion to OpenAI.
But why? So far no-one has been able to explain this to me.
An AGI in and of itself is nothing but a brain in a vat. The exponential increase in scientific knowledge was based on the scientific method, which replaced the ancient Greek discussion-based epistemology with a cycle of observation -> hypothesis -> test -> observation -> ...
An isolated AGI cannot test its predictions about the world. Since the unconstrained space of possible explanations is bigger than the space of explanations constrained by observational evidence, there is no way an AGI can gain useful knowledge without access to external observations.
This means we are in full control of how "intelligent" (by whatever metric) an AGI can even get by restricting its access to information. But even unlimited access to (passive) information only gets you so far, as some models require data that cannot be obtained passively (i.e. they require deliberate controlled experiments).
The final nail in the coffin of dangerous AGI is interaction with the physical world. Yes, even a toddler is a terrifying menace to millions of people if I place a button right next to it that sets off a thermonuclear bomb in a city centre.
But how about we just don't do that? An AGI with limited or no physical interaction with the world (directly via robot body or indirectly through remote access) can't be any more harmful and menacing than the late Stephen Hawking.
There's no need to put AI on a leash, since there's a final naturally limiting factor: energy. Switch off the cooling system and your AGI has to throttle down lest it face fiery death.
How about simply pulling the plug of the computer or even just the network cable?
More to the point, how would an AI even learn about such a mysterious exploit if it doesn't have access to an external network in the first place? Even run-of-the-mill supercomputer centres aren't directly connected to the internet for security reasons, so why change that for a potentially dangerous computer program?
A few examples: Using speakers or microphones to transmit arbitrary data via ultrasound. Making the CPU/GPU fans vibrate in a way that sends encoded bits. Blinking the screen to emit electromagnetic waves. Transferring certain data patterns between RAM and CPU so fast that they produce oscillations, effectively converting the bus into a GSM antenna that can emit arbitrary data over a regular cellular network. Turning the fans off to change the heat signature in a way that transmits information... Or even simply blinking a light to send data through regular lightwaves?
Scientists discover new ways to exfiltrate data basically every other year; how can you be so certain you've thought of every possible way?
> A few examples: ...
All these examples require active physical interaction of the machine with the world, which simply isn't possible for a server to do.
This is the environment an AGI will likely "live" in: https://bit.ly/33w7ySX
There are no speakers, no bus oscillations to pick up (from where? by what?), and you might notice that these ominous boxes don't have anything that can pick up signals (optical, vibrational, or otherwise).
Exfiltrating data from machines without touching them is completely unrelated to the capabilities of a program running in a box like this https://bit.ly/3l2hPfw
There are no sensors that can measure the outside, and the floors these boxes sit on are heavily shielded and isolated from the outside anyway for various reasons (EM protection, physical security, etc.), so there's no potential target PC in sight.
These hyperconnected boxes are definitely (hopefully?) not where an AGI will be built.
We are doing a spectacularly lousy job of regulating what already exists.
A) In most domains, a nonlinear model is far superior to a linear model. Generally you need to at least apply a transformation (e.g. logistic) to create a nonlinear model from a linear one, because most problem spaces are nonlinear. A nonlinear model doesn't have to be particularly complex, though.
B) Machine learning (i.e. automatic tuning of parameters) is far simpler for the user than manual tuning of parameters. It's not a question of whether you "need" machine learning, but whether it will save you work. In fact, the author here is in denial: he actually says he would do machine learning, but doesn't realize that's what it is: "Instead of doing anything fancy, my program generates the coefficients at random to explore the space. If I wanted to generate a good driver for a course, I’d run a few thousand of these and pick the coefficients that complete the course in the shortest time."
This example shows why you _do_ need machine learning. The toy steering problem is a place for you to work out how the car can learn to drive with as little structure or assumption baked in as possible. Putting less a priori structure on it directly means you need models with higher capacity to figure it out.
To put this all more succinctly, “never use machine learning when business logic would do” is a statement about how commonly needed machine learning is, not business logic.
The non-NN example given here is one "for epoch in epochs" loop, to take care of picking the best coefficients, away from being an ML implementation.
I might be pedantic here, but wouldn't this then be a machine learning algorithm? Since the machine is learning the most appropriate coefficient based on some heuristic (best of X random).
Wouldn't it be better (and more honest) to say, then, that simpler ML models and learning techniques can often yield good enough results that don't justify moving to more advanced models and learning techniques?
I would argue no. Simply trying a bunch of inputs and choosing the most effective ones based on the output is like running a single round of training on a machine learning model. It's hardly machine learning by any stretch—there's no feedback loop, the machine doesn't "learn" anything.
If I build a compiler and test a thousand constants for default configuration options and choose the ones that perform the best, it would be a very bold claim to call my compiler "powered by machine learning".
What if that was automated in the build? I had assumed the quote meant that it would be, not that they'd manually run a few random cases and manually pick the best. I thought it'd be automated: given a new course, you'd run some training routine where the machine plays the course, say, 100 times, each time choosing random coefficients, and at the end it takes the ones from the pass that resulted in the quickest playthrough; those would go in some config file and become the coefficients used by the AI CPU race cars for that course.
To me this is machine learning. Especially if you crank up that 100 to 1 million or 1 billion. At that point, it's still something that only a machine could do, I couldn't realistically try 1 billion random coefficients for the ones that result in the fastest playthrough.
So in effect, I see the machine as learning which coefficients perform better for a given course. If it tried 1 billion, it learned that out of 1 billion different possible coefficients, some particular set was the best.
> there's no feedback loop, the machine doesn't "learn" anything
So that's interesting, because if my prior statement is not to be considered machine learning, my next question is: what are the criteria to go from the above to machine learning proper? It seems it could be that the learning has to involve a feedback loop, so each attempt at learning must take something away from the previous one.
I'd be okay with this definition. And now I'm thinking what's the most minimal modification I can make to meet this new definition.
What if, as it tried random coefficients, it remembered the ones it tried prior? And what if it made sure that no new attempt at a random set of coefficients had already been attempted before? This isn't super refined: there's no heuristic for which coefficients are best to try next given the ones tried prior, like what linear regression would accomplish. But it still meets the definition. It starts random on the first round, and the next round is no longer truly random, since it can't pick the previous round's coefficients again. So at least it's descending along the set of possible coefficients through a path that will eventually try all combinations.
Would this be enough to be considered machine learning?
I'm also thinking this can start to sound a lot like Evolutionary Computation. Oh boy, in all honesty, I've always been confused about the differences between stochastic, metaheuristic, and machine learning optimization techniques.
> The study of mathematical optimization delivers methods, theory and application domains to the field of machine learning.
> Machine learning (ML) is the study of computer algorithms that improve automatically through experience.
So it seems optimization provides the techniques on which most ML is based, but ML is more the idea that an algorithm improves automatically from experience. Later in the Wikipedia page, it draws a very clear distinction:
> The difference between the two fields arises from the goal of generalization: while optimization algorithms can minimize the loss on a training set, machine learning is concerned with minimizing the loss on unseen samples.
I don't even know if I agree with the statement. Polynomial regression solves basically the same problem as neural networks but performs way, way, way worse on big datasets. But nonetheless I would say that polynomial regression is machine learning too.
Also, the reinforcement learning community does a significant amount of its work with neural networks that are not trained using gradients. Gradient-free training is a highly active research area. More like 90% of neural networks are trained with gradient descent.
This should not be surprising: a core premise of data science (empiricism in general? All logic and human knowledge?) is that intuition is often wrong. Initial hypotheses/problem definitions are just intuition.
The concepts and equations he has are pretty simple but I'm really not seeing how they translate into code.
The code also isn't the easiest for me to understand. Multiple 1-3 letter or otherwise (seemingly) poorly named variables, some C jargon I'm not familiar with (mostly the -> operator), and the fact that it's a relatively large chunk of code.
It seems like the core logic loop is here: https://gist.github.com/skeeto/da7b2ac95730aa767c8faf8ec3098...
And the position and acceleration of the car are altered each iteration through randomization?
It's hard to piece together.
I think the main driving logic is on line 228. It could definitely do with some more descriptive variable names, though - "a" for angle, and "s" for sense inputs seems a bit too short.
f->w is another way to write (*f).w
But it is true that ML - in particular neural networks - often gets thrown at problems that traditional analysis and modeling could serve more efficiently, and also more predictably. The other benefit of first trying traditional analysis is that it helps the implementer of the system better understand the domain first.
Pertinently, I've heard some scientists working on medical imaging describe to me how there has been a trend away from the basic science of understanding the medical phenomena behind the image being analyzed and instead relying excessively on ML based pattern recognition to make predictions/inferences.
For example, if I want to recognize pictures of animals, with reasonable accuracy, how much space would such neural net take?
I'm not very well versed in ML, but in my mind, since ML is generally heavy on computation, it should be possible to distribute neural nets for specific applications and re-use them.
The given example in the opening paragraph describes a scenario where you don't need it.
Unless you seriously need it, and have a massive budget to match, basic boolean logic will probably get you where you want to be faster.
I wouldn't advise any solo developer to use machine learning unless that's the entire product. It's very easy to make an insanely difficult game without it.
what NNs allow is to test millions of permutations of thousands of these functions to find the one that works
if your problem is intractably hard to figure out functions for, NNs provide a larger hammer to brute-force an approximate solution.
That's what the author means. There are a lot of "traditional" ML and AI techniques that can work very well for a lot of problems.
The efficiency of ML is in cost, not overall computation. Throwing machine resources at a problem is cheaper than hiring some guru with the necessary math and CS background to solve the problem.
We've seen the same thing with frameworks/libraries. Before, it took specialized knowledge to do basic things like networking and media creation in code. Now there are existing tools that do everything for you.
The same has happened with optimization/algorithmic knowledge, and the genie is not going back in the bottle. There will definitely be specialized cases where that particular expertise is needed, but that is no longer the norm.