
Unpopular quote from my image and video processing professor - “The only problem with machine learning is that the machine does the learning and you don’t.”

While I understand that is missing a lot of nuance, it has stuck with me over the past few years as I feel like I am missing out on the cool machine learning work going on out there.

There is a ton of learning about calculus, probability, and statistics when doing machine learning, but I can’t shake the fact that at the end of the day, the output is basically a black box. As you start toying with AI you realize that the only way to learn from your architecture and results is by tuning parameters and trial and error.

Of course there are many applications that only AI can solve, which is all good and well, but I’m curious to hear from some heavy machine learning practitioners - what is exciting to you about your work?

This is a serious inquiry because I want to know if it’s worth exploring again. In the past university AI classes I took, I just got bored writing tiny programs that leveraged AI libraries to classify images, do some simple predictions etc.

ML is more like growing crops than it is about "designing stuff". Growing crops is slow, and you don't know beforehand what the result will be. However, you can still throw a lot of science at growing crops ("plant breeding" is a science), and the same holds for engineering.

Or you throw ML at growing crops, much like they throw ML at ML these days (https://ai.googleblog.com/2017/05/using-machine-learning-to-...).

This might be pedantry, but there are far sillier things to apply ML to than ML itself, despite the initial sound of the thing.

I'd highly recommend watching this podcast interview between Lex Fridman and the creator of fast.ai, released yesterday: https://youtu.be/4CTDdxfSXF0

He covers a lot of relevant and interesting topics, including how he tries to make it less of a black box when designing their courses, and also how it has the potential to confer increased rather than decreased insight into what's going on in a dataset.

I don't personally know much about ML, but I think even though there will still be probably an opaque aspect in many cases for a while to come, immense value will still be continually gleaned, as long as people are aware of the limitations. If you accept something is a black box and don't oversell it, a black box is better than no box.

All of our own brains are far more of a black box than any deep learning model, in many capacities. But we still use them daily for meat-machine learning, to great success, and can still tune the parameters a bit to improve outcomes, even if we very often don't really understand exactly what we're tuning or why it seems to cause certain effects for some brains (or why it doesn't have those effects for other brains). Consciousness may be the biggest black box of them all, but here we are all talking and making nearly non-stop use of it.

I agree it's very important to try our hardest to reach a deeper understanding, but it's kind of like psychiatry vs. neuroscience, or experimental quantum physicists who "shut up and calculate" vs. theoretical quantum physicists who actually want to know what's really going on here at the most fundamental level beyond the useful black box of quantum behavior. While we're trying to solve the hard problems of deep understanding, we can make practical use of what we have in the meantime. We need both kinds of fields and people.

If future AI architectures and ideas lead to some degree of convergence with a biological brain, I wonder if the black box problem might become amplified. Maybe one part of the solution is to not use the brain as the model to aspire to, and to eventually seek out alternative avenues to higher, and eventually general, intelligence? (Or maybe I'm completely talking out of my ass, because I'm not at all a researcher or practitioner. I'd appreciate any input from experts.)

For me, the exciting part is to understand how the "black box" can learn, and to find a representation of the data that lets this box learn.

For instance, I've been working on user profiling, and it's been a challenge to find which features, and in which representation, allow the model to learn. It's fantastic when you make a little change in a feature (for instance, use the median instead of the mean) and your model suddenly gets +5% accuracy.

The field in the real world is not as simple as making an API call. The data in the wild is really complicated. Getting value from this data is the real challenge.

"replace the mean with median"

This illustrates the black-box aspect of it. You changed something and the results are affected, but you don't know why. The median has built-in implicit filtering (unlike the mean, it's not affected by extreme outliers), so it could simply be that you needed to filter your inputs. But you won't know, because... black box.
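To make the "implicit filtering" point concrete, here is a minimal sketch. The session lengths are made-up numbers, not data from the profiling project discussed above; the only claim is that one extreme value drags the mean but barely moves the median.

```python
from statistics import mean, median

# Hypothetical per-user session lengths in minutes; the last value is an outlier.
sessions = [4, 5, 5, 6, 7, 300]

print(mean(sessions))    # pulled far upward by the single outlier
print(median(sessions))  # stays near the typical value

# Dropping the outlier by hand moves the mean close to the median, which is
# one way to check whether the median was simply acting as a filter.
trimmed = [s for s in sessions if s < 100]
print(mean(trimmed))
```

If the model improves equally with the trimmed mean, the gain probably came from outlier handling rather than anything specific to the median.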

and the results are affected but you don't know why

Well, that's just your assumption without knowing the exact problem and I think you are missing my point.

You can approach data preparation by randomly changing things, and maybe you will get some interesting results, but I promise you will fail many, many times. The other way is to know what it means to swap the mean for the median (to stay with this random example), and I promise you will find better solutions.

The idea is not just "change and test" and see what happens. The interesting part is to understand how the model uses your representation and why one is "better" than another.

"what is exciting to you about your work?" Seeing the black boxes solve problems I know I could not solve manually. If you go into industry, expect to be spending the vast majority of your time wrangling data, integrating the black boxes into other software, and apologizing repeatedly to managers because you have no idea how long it will take to make your black box work, if it ever does. Trying to make plans around AI based software is an exercise in futility.

I've worked with blackboxes (special sauce) which didn't actually do anything (useful).

Free career advice:

Don't be the boy in the parable of the emperor's clothing. You will be punished. Just smile and nod, say vaguely positive stuff, get your job references, and quickly find a new gig.

I'd suggest you look at Probabilistic programming languages: https://en.m.wikipedia.org/wiki/Probabilistic_programming

As someone who, like you, finds ML boring, I find PPLs much more fascinating, because they bring the programming back into AI. It's more like the logical evolution of logic-based rule systems, adding Bayesian probabilities to them.
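A rough sketch of the idea in plain Python, for anyone who has not seen it: you write an ordinary program that generates data, then ask which hidden values are consistent with what you observed. The coin example, the function names, and the rejection-sampling inference are all illustrative assumptions; real PPLs such as Stan or PyMC use far more efficient inference than this.

```python
import random

def model(rng):
    """Generative program: draw a coin bias, then flip the coin 10 times."""
    bias = rng.random()  # prior: uniform on [0, 1]
    heads = sum(rng.random() < bias for _ in range(10))
    return bias, heads

def infer(observed_heads, n_samples=50_000, seed=0):
    """Rejection sampling: keep only the runs that reproduce the observation."""
    rng = random.Random(seed)
    accepted = [bias
                for bias, heads in (model(rng) for _ in range(n_samples))
                if heads == observed_heads]
    return sum(accepted) / len(accepted)

posterior_mean = infer(observed_heads=8)
print(round(posterior_mean, 2))  # close to the analytic Beta(9, 3) mean of 0.75
```

The model is readable code rather than a weight matrix, which is the "programming back into AI" part.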

I’m most excited about what is now being called scientific machine learning, i.e. machine learning models explicitly structured to learn interpretable models that can be used to understand some scientific domain better. For example, I’m starting to work on using graph RNNs to study the dynamical behavior of reinforcement learning agents and how it might give us insights into psychiatric disorders.

I'm interested in reading more about this area of work. Can you share your project page, if it exists, or any foundational papers in this space?

My inspiration comes from this: https://arxiv.org/pdf/1809.06303.pdf

There's a long history of using differential equation models to model high-level brain function (see also neural mass models and neural field models) but I think there are advantages to using discrete time approximations such as neural networks in an RL setting to investigate how the dynamics (e.g. attractor states, etc) map onto behavior.

Your professor was quoting/paraphrasing Alan Perlis in reverse.

Epigram #63

63. When we write programs that "learn", it turns out we do and they don't.


SIGPLAN Notices Vol. 17, No. 9, September 1982, pages 7 - 13.

Nice reference, but I think that claim is opposite to the professor's one.

you are correct.

I’m not a heavy practitioner but as a robotics engineer being able to use existing algorithms to perform previously difficult tasks is exciting. I’m also hopeful that the coming decades will bring a lot more to robotics software as machine learning research continues.

One thing I’ve been learning is that the black box nature of machine learning algorithms is partially a myth. A lot of tools have been written to help explain models. However I’m a mere novice and student so that’s just something I’ve heard. Would love it if a skilled practitioner chimed in.

There has been a lot of research into understanding neural networks and making them less of a black box. If you are telling cat videos from dog videos on YouTube, it doesn't matter if you make a mistake every now and again. But if you want to build a self-driving car or make a medical diagnosis, you had better be able to explain why your network made a certain decision.

I've been hearing this quite often in the last few years, but I'm not sure what kind of explanation you mean.

The x-ray image was misdiagnosed because... What type of thing should come here?

... it didn't look like the other class. ... it didn't have this weird smudge thing on the top left in which case usually there should be a little hazier blob in the middle, except when the pointiness of the thing that is to the right of the brightest blabla...

You get the idea, my description is even exaggerating the nameability and describability of these structures. You'd get a long and complex description at best, because simple models don't work for pattern recognition. But even if you made sure to just use well understood features like edge thickness, angles, sizes of connected components etc, how would a boolean formula of a hundred such terms be helpful in court or wherever you want to use these explanations?

For example, there is the concept of visual attention: you plot which areas of the image the model pays attention to when making its decision.

Does the FDA require that you be able to explain how drugs work? Or do you just have to show their efficacy and safety in trials?
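Attention maps need access to the model's internals, so as a rough illustration here is a simpler, related technique instead: occlusion sensitivity, where you blank out one pixel at a time and record how much the score drops. The `toy_model` below is a hypothetical stand-in for a classifier, not a real network.

```python
def toy_model(image):
    """Hypothetical classifier stand-in: scores brightness in the top-left 2x2 patch."""
    return sum(image[r][c] for r in range(2) for c in range(2))

def occlusion_map(model, image):
    """Score drop when each pixel is blanked out; a bigger drop means more influence."""
    base = model(image)
    heat = []
    for r, row in enumerate(image):
        heat_row = []
        for c, _ in enumerate(row):
            occluded = [list(x) for x in image]  # copy so the input stays untouched
            occluded[r][c] = 0
            heat_row.append(base - model(occluded))
        heat.append(heat_row)
    return heat

image = [[1, 2, 0, 0],
         [3, 4, 0, 0],
         [0, 0, 5, 5],
         [0, 0, 5, 5]]

for row in occlusion_map(toy_model, image):
    print(row)
```

The bright bottom-right patch gets zero everywhere in the map, which shows the model ignores it even though a human eye would be drawn to it.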

One thing that is exciting to me is that machine learning is the next level of abstraction in our tools for thoughts, giving us ways to deal with fuzzy and ultra-high-dimensional problems.

Machine learning can be viewed through optimization, probability, and information-theoretic lenses, and each of them will give you an understanding of a different aspect. One recent "click" I had in my head came from debating with colleagues who are really strong in optimization: they routinely define values, sets, functions etc. as the end result of an optimisation problem. And this makes total sense.
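A small sketch of what "defining a value as the end result of an optimisation problem" can look like: the mean of a data set is the number that minimises the sum of squared distances to the data. The helper `argmin_1d` and the data are illustrative assumptions; ternary search is used only because it is short and works for convex one-dimensional losses.

```python
from statistics import mean

def argmin_1d(loss, lo, hi, iters=100):
    """Ternary search for the minimiser of a convex 1-D function on [lo, hi]."""
    for _ in range(iters):
        a = lo + (hi - lo) / 3
        b = hi - (hi - lo) / 3
        if loss(a) < loss(b):
            hi = b
        else:
            lo = a
    return (lo + hi) / 2

data = [2.0, 3.0, 7.0, 8.0, 10.0]

# "The mean" defined as the solution of an optimisation problem.
best = argmin_1d(lambda m: sum((x - m) ** 2 for x in data), min(data), max(data))
print(best, mean(data))  # the two agree up to numerical tolerance
```

Swapping the squared loss for an absolute loss gives a definition of the median in the same style.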

Going down to maths that most people might be familiar with from high school, think about how you define a line in space, or a plane. You write down "y=f(m)=a+mx" for a line (which you could read: "to find a point m times the length of x away from the anchor point of a line, start at the anchor and move m times along vector x") and "P={x:dot(w,x-x0)=0}" for a plane (read: "the set of all x for which the dot product is 0"). These are two different forms of describing the objects: one is expressed as a function, using the constraints placed on the object to move along it, and the other is simply a concise way to express all constraints. You can move from one form to the other for both objects, and finding these types of connections increases your understanding of these objects and how their constraints define their structures.
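The two forms can be checked against each other in a few lines. This is only a sketch in two dimensions, where the "plane" form describes a line; the helper names and the numbers are made up for illustration.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def line_point(a, x, m):
    """Parametric form: start at anchor a and move m times along direction x."""
    return [ai + m * xi for ai, xi in zip(a, x)]

def on_hyperplane(w, x0, p, tol=1e-9):
    """Implicit form: p belongs to the set when dot(w, p - x0) is zero."""
    return abs(dot(w, [pi - x0i for pi, x0i in zip(p, x0)])) < tol

a, x = [1.0, 2.0], [3.0, 1.0]  # anchor and direction of a 2-D line
w = [-x[1], x[0]]              # a normal to x, so dot(w, x) == 0

# Every point produced by the parametric form satisfies the implicit form.
print(all(on_hyperplane(w, a, line_point(a, x, m)) for m in (-2.0, 0.0, 0.5, 4.0)))  # True
print(on_hyperplane(w, a, [0.0, 0.0]))  # False: the origin is not on this line
```

Going from the parametric to the implicit form here only required finding a vector normal to the direction, which is the kind of connection between forms described above.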

Now, optimisation problems are similarly defined by their constraints, but they are much more flexible and connected to reality. We describe constraints in the hierarchy of convex (LP, QP, SOCP, SDP, conic) and nonconvex (aka "fucking difficult to deal with") constraints and then develop optimisers that attempt to fulfill these constraints as best as possible given some data. If it's solvable, then we have a way to reason explicitly about very complex properties of the data and the mathematical objects contained in it, which can be used to solve actual problems, like finding the Pareto frontier of a decision space, or laying out a UI nicely.

Machine learning takes this and pushes it to its extreme, finding ways to navigate extremely high-dimensional spaces and reason about the objects contained therein. Things like word vectors, latent representations and the "natural image manifold", finding ways to discover the "separation" of classes by building generative models and measuring distances in the latent space, visualisation techniques like curch plots, t-SNE... all of this extends our ability to reason about and understand data and ultimately our world.

That's one thing that excites me, at least.
