
> approximation of the target language is good enough for certain tasks but the entire process is inherently flawed

That is pretty much the definition of a "model" :)

I recently went through the "Tensorflow in Practice" specialization on Coursera and it was illuminating. The thing about ML models, whether CNNs for images, or word2vec+RNN, or whatever else, is that they really don't have any rigorous scientific basis for why they work. You're doing, say, Stochastic Gradient Descent to optimize the neuron weights across your dataset. Out the other side of the training, you have a mostly meaningless set of coefficients that happen to classify unseen data well.
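To make that concrete, here's a toy sketch (mine, not from the course) of SGD training a single logistic "neuron" on made-up data. The learned weights classify fine, but staring at them tells you nothing about why:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic, linearly separable data: labels come from a hidden linear rule.
true_w = np.array([1.5, -2.0, 0.5, 0.0, 3.0])
X = rng.normal(size=(200, 5))
y = (X @ true_w > 0).astype(float)

# Stochastic gradient descent on a single logistic "neuron".
w = np.zeros(5)
lr = 0.1
for epoch in range(50):
    for i in rng.permutation(len(X)):        # one sample at a time: "stochastic"
        p = 1.0 / (1.0 + np.exp(-X[i] @ w))  # sigmoid activation
        w += lr * (y[i] - p) * X[i]          # gradient step on the log-loss

# The trained weights classify well, but they don't "mean" anything on their own.
preds = (X @ w > 0).astype(float)
accuracy = (preds == y).mean()
```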

I dual-majored in CS and EE, and I leaned towards the "science" side of things, where things get modelled mathematically and analyzed, accepting that the model is likely incomplete but still useful. The thing that drives me nuts with ML is that there's no explanation of what the terms in the ML model actually mean (because the process that produced them doesn't actually investigate meaning, it just optimizes the terms). But... I've accepted that even though the models are pretty much semantically meaningless, they work.

> I've accepted that even though the models are pretty much semantically meaningless, they work.

Until they don't. Which can easily happen the first time you deploy a model.

My personal view is that (at this moment) ML is mostly correlation detection and pattern recognition, but has little to do with intelligence.

The point is that we don't have the mental capacity to understand this stuff. Nobody has any clue how to interpret millions of dimensions, or some non-linear manifold in there, or how to translate it into something humans are capable of understanding. These things might be done automatically by our brains at a subconscious level in a similar fashion (or not), but at a conscious level we are completely clueless and basically throw darts to see which ones land somewhere useful.

I think you object to the lack of "mathematical beauty", but my point is "who cares?". I'm not sure why reality should conform to some mental model we find "appealing" for whatever reason. Deep Learning is similar to experimental physics.


Explainable AI is an emerging field; I hear about this need especially in NLP and law. We expect to understand how a decision was reached, and we'll never accept a computer-generated decision without an explanation of each logical step. And just dumping the millions of weights of each neuron won't give us that, because we won't be able to reconstruct the same decision from those parameters alone.

We know that AI is a bunch of probabilities, weights and relations in n dimensions. Our rational brain can know that too, but can't feel it.

That's why you use interpretability tools like LIME.

Example of this would be here: https://github.com/Hellisotherpeople/Active-Explainable-Clas...
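For the flavor of it, here's a hand-rolled sketch of the core LIME idea (perturb around one instance, query the black box, fit a proximity-weighted linear surrogate). This is illustrative only, not the lime library's actual API, and black_box is a stand-in I made up:

```python
import numpy as np

rng = np.random.default_rng(1)

def black_box(X):
    """Stand-in for an opaque trained model: some nonlinear score."""
    return np.tanh(2.0 * X[:, 0] - X[:, 1] ** 2 + 0.1 * X[:, 2])

x0 = np.array([0.5, -1.0, 2.0])                  # the instance to explain
Z = x0 + rng.normal(scale=0.1, size=(500, 3))    # perturb around x0
yz = black_box(Z)

# Weight perturbed samples by proximity to x0 (RBF kernel) and fit a
# weighted linear surrogate; its coefficients are the local explanation.
weights = np.exp(-np.sum((Z - x0) ** 2, axis=1) / 0.05)
A = np.c_[np.ones(len(Z)), Z]                    # intercept + features
Aw = A * weights[:, None]
coef = np.linalg.solve(A.T @ Aw, Aw.T @ yz)      # weighted least squares

# coef[1:] approximates the model's local sensitivity to each feature at x0.
```

The point: you never open the model up; you only probe it locally and read off which features move the prediction.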

Wait... That's what dimensionality reduction is for. I can interpret 3 PCA dimensions pretty damn well, since I can figure out how much of the variance in my original dimensions each component explains.
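In scikit-learn terms (an illustrative sketch on made-up data), that reading looks like:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(2)

# 20-dimensional data whose variance really lives in 3 latent directions.
latent = rng.normal(size=(300, 3)) * np.array([5.0, 2.0, 1.0])
mixing = rng.normal(size=(3, 20))
X = latent @ mixing + rng.normal(scale=0.1, size=(300, 20))

pca = PCA(n_components=3)
X3 = pca.fit_transform(X)

ratios = pca.explained_variance_ratio_   # fraction of total variance per component
loadings = pca.components_               # rows: weights over the original features
```

Each row of `loadings` says how a component is built from the original features, which is what makes the 3 axes interpretable.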

Yeah, but your accuracy also drops. You might end up with an interpretable but underwhelming solution instead of a non-interpretable SOTA one.

As you hint at, the (more common) alternative to defining the world in advance is to rely on machine learning to figure things out (and possibly generate separate clusters for meanings that don't even map well to specific words but resolve the ambiguity). But even then you can run into problems. Even if your model can parse "cloud" correctly based on the context, good luck trying to parse a text about online storage of meteorological data.

The ontological approach described in the article doesn't really work all that well with real world data.

The raw ML approach works well enough but has a multitude of problems (e.g. learning biases, like "black" being a negative sentiment classifier when talking about people because of the texts the model was initially fed).

But given how hard it is to "solve" these problems, I'm not convinced ML alone will ever progress beyond the 80% "good enough" solution it is now, without being replaced with something completely different.

This is what makes me skeptical of all the tall tales about strong AI and "the singularity". While the specialised applications (e.g. deepfakes) are certainly impressive and a lot of the more generalised applications can go a long enough way to get a decent amount of funding despite unfixable flaws (e.g. sentiment analysis), getting from "here" to "there" seems to require more than just more incremental refinement.

Computational Linguistics courses have been teaching ontological "scientifically sound" approaches that yielded no real-world applications, while Google has been eating their lunch with dumb statistical models. The dumb models have since become infinitely more intricate and improved from "barely usable" to "good enough", but seem to be inching ever closer to an insurmountable wall, whereas the "scientific" models still seem to be chasing their own tail describing spherical cows in a vacuum.

Well, science includes the scientific method. The trial and error that is abundant in deep NN research is at the core of science. The theories and math have almost always arrived after stuff worked heuristically.

That's also one of the reasons deep NNs brought the spark back to ML. The field was so deep into proven models and math that the lack of a trial-and-error component slowed down progress.
