
Software 2.0 (2017) - overwhelm
https://medium.com/@karpathy/software-2-0-a64152b37c35
======
chmod775
A classical case of "when all you have is a hammer, everything looks like a
nail".

Or rather, you're blind to everything that doesn't require a hammer.

Even in most of the examples he listed (speech/voice recognition _software_,
translation, games, and databases), the vast majority of "code"/logic is not a
neural network.

You don't "train" your UI, neural nets won't directly consume audio or spit
out encoded audio, don't do HTTP, aren't an operating system and just having a
game AI is pretty far from having a finished game.

Neural nets approximate and guess, but for the vast majority of problems in
computing we want the exactness of code.

And that's precisely what makes neural networks great: they let us solve
problems where stating the exact steps to solve them is impossible.

~~~
yjftsjthsd-h
> You don't "train" your UI, neural nets won't directly consume audio or spit
> out encoded audio

I'm not an expert, but I thought pictures and audio were one of the few places
where you _could_ feed raw inputs into a neural net and get good results? Or
am I wrong and we instead feed in some pre-processed version?

~~~
deathanatos
My understanding is that they took in RGB picture data, yes, but that they
were required to be square (s.t. matrix transforms work on them, I think?).

But that's still different from taking in the actual JPEG data, which is sort
of what the parent gets at: something has to decode that, and that software
isn't a neural net.

(Further, when I worked w/ ML that dealt w/ image data, we had a host of non-
ML code written around it to support it, dealing with the various facets of
running in the cloud, where to get the data, where to store the results, who
to notify about the results, and a bunch of preprocessing on the image — such
as removing pointless borders that humans put around images.)
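
To make that concrete, here's a toy sketch (all names mine, and assuming
Pillow and NumPy are available) of one such non-ML glue step: cropping a
uniform border off an image before it ever reaches the model.

    import numpy as np
    from PIL import Image

    def strip_border(img: Image.Image, tol: int = 8) -> Image.Image:
        """Crop away edge rows/columns that match the corner pixel within tol."""
        a = np.asarray(img.convert("RGB")).astype(int)
        corner = a[0, 0]
        # True wherever a pixel differs noticeably from the corner color
        content = np.abs(a - corner).sum(axis=2) > tol
        rows, cols = content.any(axis=1), content.any(axis=0)
        if not rows.any():
            return img  # flat-color image; nothing to crop
        top, bottom = rows.argmax(), len(rows) - rows[::-1].argmax()
        left, right = cols.argmax(), len(cols) - cols[::-1].argmax()
        return img.crop((left, top, right, bottom))

None of this is a neural net, yet the net's accuracy depends on it running
correctly.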

~~~
recursivecaveat
Just FYI, non-square images should work fine. Arguably that's one of the
selling points of convolutional neural nets. The early big-name training sets
tended to be all square though, and if your design ends with a fully-connected
layer (more common back then as well) then you're pretty locked-in for aspect
ratio unless you retrain.
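
A minimal sketch of the difference (PyTorch assumed, sizes arbitrary): a
stack ending in global average pooling accepts any height/width, while a
flatten-into-fully-connected head bakes one input size into the weights.

    import torch
    import torch.nn as nn

    flexible = nn.Sequential(
        nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1),  # collapses any HxW to 1x1
        nn.Flatten(),
        nn.Linear(16, 10),
    )
    flexible(torch.randn(1, 3, 224, 224))  # works
    flexible(torch.randn(1, 3, 180, 320))  # non-square also works

    rigid = nn.Sequential(
        nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
        nn.Flatten(),  # produces 16*H*W features, so the size is baked in
        nn.Linear(16 * 224 * 224, 10),
    )
    rigid(torch.randn(1, 3, 224, 224))    # works
    # rigid(torch.randn(1, 3, 180, 320))  # shape mismatch: locked to 224x224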

------
scandox
> Across many applications areas, we’ll be left with a choice of using a 90%
> accurate model we understand, or 99% accurate model we don’t.

I remember a doctor telling me something like that about a condition my child
had. He said that "generally" it wasn't dangerous. It was interesting because
to him what mattered - naturally - was to work with the numbers and get the
best overall results, without knowing or understanding exactly why. It was
better for him, and a better use of his time, to just know that it was
"generally" OK. From my point of view I wasn't interested in "generally", only
in whether it was going to be dangerous in this specific case.

I think when the shit hits the fan for you personally, the 90% you understand
is going to be vastly better than the 99% you don't. In a world where
everything runs on the 99% model I imagine most people will have it good, but
sometimes insanely terrible things will happen. Maybe that's OK.

As someone (I forget who) pointed out, we already have a form of opaque AI in
the form of huge human bureaucracies that produce outcomes without anyone
understanding all the steps that produced them. So maybe we already partly
live in that world.

------
ooobit2
Reminds me of those videos from the 1950s that predicted what life would look
like in the year 2000.

He's right that cleaning datasets is an entire job in itself. But that should
have been a red flag for his conclusion even then. Scaling anything means
scaling costs. So whatever you want your neurochip to do that it isn't capable
of doing by design, you have to do yourself before or after it's done its
task(s). That's what's between the lines of his mention of "silent fails": if
you don't want bias in your output, and your design isn't capable of vetting
bias, you have to do the work of vetting bias _before_ you pass that dataset
off. That means you need an entire model defining bias, predicting impact,
and setting constraints on outputs.
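
For concreteness, a toy sketch (hypothetical, not from the article) of that
pre-pass: a check that refuses to hand a dataset off when some label is badly
underrepresented. Even this trivial version is, in effect, a little model of
what "biased" means.

    from collections import Counter

    def audit_labels(labels, min_share=0.05):
        """Reject the dataset if any class falls below min_share of examples."""
        counts = Counter(labels)
        total = sum(counts.values())
        skewed = {k: c / total for k, c in counts.items() if c / total < min_share}
        if skewed:
            raise ValueError(f"underrepresented classes, fix the data first: {skewed}")
        return counts

    audit_labels(["cat"] * 900 + ["dog"] * 100)   # passes: dog is 10%
    # audit_labels(["cat"] * 990 + ["dog"] * 10)  # raises: dog is only 1%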

I've said it before and will keep saying it 'til I'm blue in the face: the
complexity of a solution depends on the complexity of the problem it solves. I
get that simple things feel attractive because they require less effort. But
we don't really reduce the effort so much as shift it, from wholly solving the
problem to partly solving it and then endlessly working to keep it partly
solved.

------
Kednicma
The author's boastful tweet should really be examined. _Can_ a deep-learning
stack write better code than a person?

Although, really, the more important question is: _can_ we do studies on this
without comparing expert computers to undergraduates who are just learning?

~~~
choeger
Sure it can.

For algorithms that fit a particular form, you can always derive an optimal
solution automatically (think of dynamic programming; a sketch follows below).
The actual question is: when is a neural net a good idea? Personally, I think
it will always be great when the output goes to a human. Speech synthesis is a
really great example. The same holds for game AIs.
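
To illustrate that first point (example mine, not the commenter's): once a
problem exhibits optimal substructure, dynamic programming yields the exact
optimum mechanically, with no training involved.

    from functools import lru_cache

    @lru_cache(maxsize=None)
    def min_coins(amount, coins=(1, 5, 10, 25)):
        """Fewest coins summing to amount; provably optimal by induction."""
        if amount == 0:
            return 0
        return 1 + min(min_coins(amount - c, coins) for c in coins if c <= amount)

    print(min_coins(63))  # 6, i.e. 25 + 25 + 10 + 1 + 1 + 1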

But things that feed into a "software 1.0" system (anything that needs to be
understood by humans)? That will be difficult. Image recognition for self-
driving cars sounds great until you have the first deadly accident that you
simply cannot explain. Your system might even be really, really good overall,
but that won't reduce your liability in the one case where it doesn't work.

~~~
Kednicma
Interesting ideas. How long should it take to create one of these solutions?
NP-completeness doesn't care whether the thinker is human or neural-net, and
optimality can take exponentially long to find.

Also, what if there simply isn't an optimal solution? Matrix multiplication
comes to mind: we suspect it can be done in roughly quadratic time, given the
asymptotic number of entries that have to be touched, but every known
algorithm is slower than that (the naive one is cubic, and the best known
exponents are still well above 2), and possibly there _isn't_ a quadratic-time
algorithm.
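
For concreteness (sketch mine): the schoolbook algorithm below performs
Theta(n^3) scalar multiplications; asymptotically faster methods exist, but
nothing known reaches the conjectured quadratic bound.

    def matmul(A, B):
        """Schoolbook matrix product: three nested loops, Theta(n^3) work."""
        n, m, p = len(A), len(B), len(B[0])
        C = [[0] * p for _ in range(n)]
        for i in range(n):
            for k in range(m):
                a = A[i][k]
                for j in range(p):
                    C[i][j] += a * B[k][j]
        return C

    print(matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]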

------
jrsj
Even if you assume the benefits here are all real and not overly optimistic,
there is a significant downside. Software that is "better" by means of more
complexity via AI will be significantly harder for a human to understand. Even
if it's _generally_ more accurate, software that is a black box is inherently
frustrating to at least some of its users. If the advantage of AI-generated
software lies in solving problems through greater complexity than a human
would think of, then it can't achieve the kind of simplicity that makes it
easy to understand what software is actually doing.

------
darkhorse13
This has not aged well. I'm not saying deep learning isn't useful, but the
hype has finally caught up with reality.

~~~
p1esk
Take a look at GPT-3 doing a programming phone screen:
[https://twitter.com/lacker/status/1279136788326432771](https://twitter.com/lacker/status/1279136788326432771)

This is unedited output from a language model that was not trained to pass
programming phone screens. The only thing it was trained to do is predict the
next word in a sentence. Just 10 years ago this would have been pure sci-fi.
Today it's "meh". More importantly, there's no sign of it slowing down.
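
For anyone unfamiliar, the whole training objective fits in a loop like this
toy sketch (names mine, not OpenAI's API): predict a distribution over the
next token, append the pick, repeat.

    def generate(next_token_probs, prompt, n_new):
        """Autoregressive decoding: everything emerges from next-token prediction."""
        tokens = list(prompt)
        for _ in range(n_new):
            probs = next_token_probs(tokens)          # context -> distribution
            tokens.append(max(probs, key=probs.get))  # greedy pick
        return tokens

    # Stand-in "model" that always continues a fixed pattern:
    canned = {0: {1: 1.0}, 1: {2: 1.0}, 2: {2: 1.0}}
    print(generate(lambda ts: canned[ts[-1]], [0], 3))  # [0, 1, 2, 2]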

------
quantum_state
The article is trying to get a free lunch ... we all know where that leads ...

------
ishcheklein
The ideas discussed in his post might seem far from reality, and
controversial, but don't forget that he has been building Tesla's self-driving
ML models for a few years now, so he definitely has material to generalize
from. It's one of the most advanced ML systems in production, and without
reflecting on the process it would be hard to develop and hard to maintain.

Also, from my own experience building DVC: in any ML project you do indeed
have code, but there's no doubt that data is as important an element as code,
and we need to take it just as seriously - track it, review it, etc., etc.

~~~
mindfulplay
That is indeed the scariest part. Given the number of vehicular fatalities
this guy is personally responsible for at Tesla, I don't think this is a smart
or important approach.

The callousness with which people apply the Silicon Valley mindset to critical
infrastructure problems is scary: perhaps he should get off his self-driving
high horse.

~~~
ishcheklein
I think that's an orthogonal question, right? Even if there were some
mistakes, how many more deaths would there have been if they hadn't been
thinking about a rigorous process behind ML development?

~~~
mindfulplay
I would not use Silicon Valley technology to A/B test death rates by 'trying'
arbitrary 'cool' things inside a fast-moving metal torpedo. He can play with
his toys as he pleases, but using real cars seems absolutely bonkers.

